# AUSTRALIAN BEER FORECAST

## Task setup

Using only the following data (https://www.kaggle.com/sergiomora823/monthly-beer-production),
please provide a forecast of monthly Australian beer production for the year 1996.
Verbally summarize the forecast and give a comment on what you did,
why you did what you did and,
how you ended up with the final forecast. Use a Kaggle notebook.

## Python libraries

Below is the list of Python libraries imported for the analysis that follows

In [None]:
# Python modules used
import pandas as pd
import numpy as np
import datetime
import matplotlib.pylab as plt
from matplotlib.pylab import rcParams
rcParams['figure.figsize'] = 15, 6
import statsmodels.api as sm
from statsmodels.tsa.stattools import adfuller
from statsmodels.tsa.holtwinters import ExponentialSmoothing
from statsmodels.graphics.tsaplots import plot_acf, plot_pacf
from statsmodels.tsa.seasonal import seasonal_decompose
from statsmodels.tsa.arima.model import ARIMA
from scipy import stats
from sklearn.preprocessing import MinMaxScaler
from tensorflow.keras.preprocessing.sequence import TimeseriesGenerator
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras.layers import LSTM
import itertools
from sklearn.metrics import mean_squared_error
import warnings

## Data load

Data is loaded from a text (.csv) file

In [None]:
beer_prod_df = pd.read_csv('../input/monthly-beer-production/datasets_56102_107707_monthly-beer-production-in-austr.csv', header=0)

## Descriptive analysis

We have a single time series with monthly frequency, representing total Australian beer output in megalitres
over the period from January 1956 to August 1995

In [None]:
beer_prod_df

First, let's reformat the data to fit the Python requirements:

*Month string variable is converted into datetime variable

In [None]:
beer_prod_df['Month']=beer_prod_df['Month'].astype('datetime64[ns]')

*Output string variable is converted into float variable

In [None]:
beer_prod_df['Monthly beer production']=beer_prod_df['Monthly beer production'].astype('float')

*The 'Month' column is renamed according to its data type, as well as the output column 'Monthly beer production' to make it informative and succinct

In [None]:
beer_prod_df.columns = ['DATETIME', 'MEGALITRES']

*Set DATETIME as an index

In [None]:
beer_prod_df = beer_prod_df.set_index('DATETIME')

*Plotting the output time series shows:
    -> 3-4 distinct multi-year trend regimes, each stable (unchanging) for approximately 10 years,
    -> seasonality with a peak in the Australian summer (Dec-Mar) and a trough in the Australian winter (Jun-Sep), and
    -> a distinctive structural break in the winter of 1974.
As a working hypothesis, the trend is driven by income effects on demand, while the seasonality is driven by taste factors such as air temperature (the hotter the weather, the greater the thirst for beer).

In [None]:
beer_prod_df.plot(figsize=(15, 7))
plt.title('Beer Production')
plt.ylabel('megalitres')
plt.show()

*The data are visibly non-stationary: the series behaves like an AR(1) process with a unit root and does not oscillate around a fixed mean. This can be checked formally with the unit-root (augmented) Dickey-Fuller test

In [None]:
def plot_test_stationarity(timeseries):
    #Determine rolling statistics
    rolmean = timeseries.rolling(12).mean()
    rolstd = timeseries.rolling(12).std()
    
    #Plot rolling statistics:
    orig = plt.plot(timeseries, color='blue',label='Original')
    mean = plt.plot(rolmean, color='red', label='Rolling Mean')
    std = plt.plot(rolstd, color='black', label = 'Rolling Std')
    plt.legend(loc='best')
    plt.title('Rolling Mean & Standard Deviation')
    plt.show(block=False)
    
    #Perform Dickey-Fuller test:
    print('Results of Dickey-Fuller Test:')
    dftest = adfuller(timeseries, autolag='AIC')
    dfoutput = pd.Series(dftest[0:4], index=['Test Statistic','p-value','#Lags Used','Number of Observations Used'])
    for key,value in dftest[4].items():
        dfoutput['Critical Value (%s)'%key] = value
    print(dfoutput)

In [None]:
plot_test_stationarity(beer_prod_df['MEGALITRES'])

*The test (unit-root t) statistic is above (less negative than) the critical values at the 5% and even the 10% significance level, so the null hypothesis of a non-stationary AR(1) process (a unit root) cannot be rejected
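*The decision rule can be sketched as follows. The helper name `adf_rejections` and the numbers are hypothetical and purely illustrative; they are not results from the beer series. The only fact relied on is that the Dickey-Fuller test is left-tailed.

```python
def adf_rejections(test_statistic, critical_values):
    """Return, per significance level, whether the unit-root null is rejected.

    The test is left-tailed: the null is rejected only when the statistic
    is below (more negative than) the critical value.
    """
    return {level: test_statistic < crit for level, crit in critical_values.items()}


# Illustrative numbers only, not output from this notebook.
example_critical_values = {"1%": -3.44, "5%": -2.87, "10%": -2.57}

print(adf_rejections(-2.28, example_critical_values))  # no rejection at any level
print(adf_rejections(-4.98, example_critical_values))  # rejection at every level
```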

*Appropriate data transformations need to be applied to obtain a stationary dataset that allows statistical inference and prediction:
    -> taking the natural logarithm linearizes the exponential segments of the year-to-year trend, i.e. it reduces the trend curvature

In [None]:
plot_test_stationarity(np.log(beer_prod_df['MEGALITRES']))

*Even after the log transformation, the 12-month rolling mean still drifts, as shown by the upward-sloping red line in the plot above (the null hypothesis cannot be rejected at the 5% significance level).
        -> First differencing of the logged values can be applied to remove this drift in the mean

In [None]:
plot_test_stationarity(np.log(beer_prod_df['MEGALITRES']).diff().dropna())

*The test statistic from the Dickey-Fuller test on the diff-log data provides evidence to reject the null hypothesis of non-stationarity even at the 1% significance level
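*A minimal sketch of the diff-log transformation and its inverse, which is what any forecast made on the transformed scale needs in order to return to megalitres. The numbers are made up for illustration.

```python
import numpy as np

# Hypothetical production levels, for illustration only.
levels = np.array([100.0, 110.0, 99.0, 120.0])

log_levels = np.log(levels)
log_diff = np.diff(log_levels)  # approximate month-on-month growth rates

# Invert the transformation: cumulate the differences from the first log level,
# then exponentiate to return to the original units.
rebuilt = np.exp(log_levels[0] + np.concatenate(([0.0], np.cumsum(log_diff))))

print(log_diff)
print(rebuilt)
```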

# PREDICTION 1: EXTRAPOLATION

*The simplest forecasting method for stationary data is seasonal extrapolation. For a monthly forecast, this means taking the growth rate (log difference) observed in the same month two years earlier and adding it to the previous month's log level. In our case, the forecast values for the test period from January 1986 till December 1996 are built from the growth rates observed from January 1984 till December 1994, respectively.

In [None]:
index = pd.date_range("1956-01-01", periods=500, freq="MS")
beer_prod_pred_df=pd.DataFrame(index)
beer_prod_pred_df.columns = ['DATETIME']
beer_prod_pred_df=beer_prod_pred_df.set_index('DATETIME').join(beer_prod_df.iloc[:, :], how = 'left')
beer_prod_pred_df.columns=['MEGALITRES_OBS']
beer_prod_pred_df['LOG_MEGALITRES']=np.log(beer_prod_pred_df.iloc[:, :1])
beer_prod_pred_df['LOG_MEGALITRES_LAG1']=np.log(beer_prod_pred_df.iloc[:, :1].shift(1))
beer_prod_pred_df['LOG_MEGALITRES_DIFF1']=np.log(beer_prod_pred_df.iloc[:, :1])-np.log(beer_prod_pred_df.iloc[:, :1].shift(1))
beer_prod_pred_df['LOG_MEGALITRES_DIFF1_LAG12b']=(np.log(beer_prod_pred_df.iloc[:, :1])-np.log(beer_prod_pred_df.iloc[:, :1].shift(1))).shift(24)
start_date=datetime.date(1985, 12, 1)
end_date=datetime.date(1996, 12, 1)
beer_prod_pred0_df=beer_prod_pred_df.loc[start_date:end_date, :].copy()
beer_prod_pred0_df=beer_prod_pred0_df[['MEGALITRES_OBS', 'LOG_MEGALITRES_LAG1','LOG_MEGALITRES_DIFF1_LAG12b']]

In [None]:
beer_prod_pred0_df

*Run rolling forecast for the period from 1986-01-01 till 1996-12-31

In [None]:
warnings.filterwarnings("ignore")
beer_prod_pred2_df=pd.DataFrame()
for i in range(1, 133):
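    # Copy the current row so that adding a column does not touch the source frame
    beer_prod_pred1_df=beer_prod_pred0_df.iloc[i:i+1,:].copy()
    beer_prod_pred1_df['LOG_MEGALITRES_PRED']=beer_prod_pred1_df['LOG_MEGALITRES_LAG1']+beer_prod_pred1_df['LOG_MEGALITRES_DIFF1_LAG12b']
    # Feed the prediction forward as next month's lagged log level
    beer_prod_pred0_df.iloc[i+1:i+2,1:2]=beer_prod_pred1_df['LOG_MEGALITRES_PRED'].iloc[0]
    # DataFrame.append was removed in pandas 2.0; pd.concat is the replacement
    beer_prod_pred2_df=pd.concat([beer_prod_pred2_df, beer_prod_pred1_df])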
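*The recursion in the loop above can be sketched compactly with NumPy. The function name `seasonal_growth_forecast` and the synthetic series are assumptions made for illustration; the sketch mirrors the idea (previous log level plus the growth observed 24 months earlier), not the exact data frames used in this notebook.

```python
import numpy as np


def seasonal_growth_forecast(log_obs, start, horizon, lag=24):
    """Recursively forecast log levels from position `start`.

    Each step adds the observed log growth from `lag` periods earlier
    to the previous (forecast) log level.
    """
    log_obs = np.asarray(log_obs, dtype=float)
    growth = np.diff(log_obs, prepend=np.nan)  # growth[t] = log_obs[t] - log_obs[t-1]
    preds = np.empty(horizon)
    prev = log_obs[start - 1]  # last observed level before the forecast window
    for h in range(horizon):
        prev = prev + growth[start + h - lag]
        preds[h] = prev
    return preds


# Synthetic log series with a linear trend and an exactly periodic season.
t = np.arange(60)
log_series = 0.01 * t + np.sin(2 * np.pi * t / 12)

preds = seasonal_growth_forecast(log_series, start=48, horizon=12)
print(np.exp(preds))  # back-transform to the original units
```

Because the synthetic growth rates repeat exactly every 12 months, the sketch reproduces the last year of the series; on real data the lagged growth rates are only an approximation.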

In [None]:
beer_prod_pred2_df

In [None]:
beer_prod_pred3_df=beer_prod_pred2_df.copy()
beer_prod_pred3_df['MEGALITRES_EXTRAPRED']=np.exp(beer_prod_pred3_df['LOG_MEGALITRES_PRED'])

In [None]:
beer_prod_pred3_df

In [None]:
beer_prod_extrapred_df=beer_prod_pred3_df[['MEGALITRES_OBS', 'MEGALITRES_EXTRAPRED']]
beer_prod_extratest_df=beer_prod_extrapred_df.iloc[:-16,:]

In [None]:
beer_prod_extratest_df

In [None]:
plt.plot(beer_prod_extrapred_df['MEGALITRES_OBS'], label='observed')
plt.plot(beer_prod_extrapred_df['MEGALITRES_EXTRAPRED'], label='extrapred')
plt.title('Beer Production')
plt.ylabel('megalitres')
plt.legend(loc='best')

In [None]:
extra_rmse=np.sqrt(mean_squared_error(beer_prod_extratest_df['MEGALITRES_OBS'], beer_prod_extratest_df['MEGALITRES_EXTRAPRED']))

In [None]:
print(extra_rmse)
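*For reference, the RMSE used to compare all models is the square root of the mean squared forecast error. A minimal NumPy sketch, with a hypothetical helper name `rmse` and made-up numbers:

```python
import numpy as np


def rmse(observed, predicted):
    """Root mean squared error between two equally long sequences."""
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(np.sqrt(np.mean((observed - predicted) ** 2)))


# Errors of -3 and 4 give squared errors 9 and 16, i.e. a mean of 12.5.
print(rmse([150.0, 160.0], [153.0, 156.0]))
```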

# PREDICTION 2: Exponential smoothing with seasonality a la Holt (1957)-Winters (1960)

*Here, an exponential smoothing model with additive trend and additive seasonality is applied, based on the Holt (1957)-Winters (1960) seasonal method
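*To make the method concrete, here is a minimal sketch of the textbook additive Holt-Winters recursions. It is not the statsmodels implementation: the function name `holt_winters_additive` and the initialisation heuristic are assumptions, and the library's internal parameterisation may differ. `alpha`, `beta` and `gamma` play the roles of the level, trend and seasonal smoothing parameters.

```python
import numpy as np


def holt_winters_additive(y, m, alpha, beta, gamma, horizon):
    """Additive-trend, additive-season exponential smoothing forecast."""
    y = np.asarray(y, dtype=float)
    # Simple heuristic initialisation from the first two seasons (an assumption).
    level = y[:m].mean()
    trend = (y[m:2 * m].mean() - y[:m].mean()) / m
    season = list(y[:m] - level)  # season[i] holds the latest index for phase i

    for t in range(m, len(y)):
        prev_level, prev_trend, s_old = level, trend, season[t % m]
        level = alpha * (y[t] - s_old) + (1 - alpha) * (prev_level + prev_trend)
        trend = beta * (level - prev_level) + (1 - beta) * prev_trend
        season[t % m] = gamma * (y[t] - prev_level - prev_trend) + (1 - gamma) * s_old

    n = len(y)
    # h-step-ahead forecast: last level, h times the last trend, matching season.
    return np.array([level + h * trend + season[(n + h - 1) % m]
                     for h in range(1, horizon + 1)])


# Toy quarterly series: constant level plus a repeating seasonal pattern.
pattern = np.array([3.0, 1.0, -2.0, -2.0])
toy_series = 100 + np.tile(pattern, 5)
forecast = holt_winters_additive(toy_series, m=4, alpha=0.1, beta=0.9, gamma=0.3, horizon=4)
print(forecast)
```

On this trend-free toy series the recursions leave level and seasonal indices unchanged, so the forecast repeats the seasonal pattern.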

In [None]:
beer_prod_expotrain_df=beer_prod_df.iloc[:360,:]
beer_prod_expotest_df=beer_prod_df.iloc[360:,:]

In [None]:
beer_prod_expotrain_df

*Select the combination of smoothing parameters (level, trend, seasonal) that minimizes the RMSE over the test period

In [None]:
metrics_df=pd.DataFrame()
levels = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]
trends = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]
seasonal_components = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]
for l in levels:
    for t in trends:
        for s in seasonal_components:
            beer_prod_expofit_df=(ExponentialSmoothing(beer_prod_expotrain_df, trend='add', seasonal='add',
                      seasonal_periods=12).fit(smoothing_level=l, smoothing_trend=t, smoothing_seasonal=s)).forecast(116)
            beer_prod_expofit_df=beer_prod_expofit_df.to_frame()
            beer_prod_expofit_df=beer_prod_expofit_df.reset_index()
            beer_prod_expofit_df.columns=['DATETIME', 'MEGALITRES_EXPOPRED']
            beer_prod_expofit_df=beer_prod_expofit_df.set_index('DATETIME')
            beer_prod_expofit_df=beer_prod_expofit_df.join(beer_prod_expotest_df, how='right')
            beer_prod_expofit_df.columns=['MEGALITRES_EXPOPRED', 'MEGALITRES_OBSERVED']
            rmse=np.sqrt(mean_squared_error(beer_prod_expofit_df['MEGALITRES_OBSERVED'], beer_prod_expofit_df['MEGALITRES_EXPOPRED']))
#             print('level={}, trend={}, seasonal_component={}, rmse ={}'.format(l, t, s, rmse))
            metrics_dict0 = {'level':[l], 'trend':[t], 'seasonal_component':[s], 'rmse':[rmse]}
            metrics_dict0_df = pd.DataFrame(metrics_dict0)
            metrics_df=pd.concat([metrics_df, metrics_dict0_df])
print(metrics_df[metrics_df['rmse']==min(metrics_df['rmse'])])
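*The three nested loops amount to a grid search, which can be sketched generically with `itertools.product`. The names `grid_search` and `toy_score` are hypothetical; the toy objective merely stands in for "fit the model and compute the RMSE" and its minimum is placed by construction, so it says nothing about the real optimum.

```python
import itertools


def grid_search(score, grid):
    """Return (best_params, best_score) minimising `score` over the grid."""
    names = list(grid)
    best_params, best_score = None, float("inf")
    for values in itertools.product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        current = score(**params)
        if current < best_score:
            best_params, best_score = params, current
    return best_params, best_score


# Toy objective standing in for a model fit plus RMSE evaluation.
def toy_score(level, trend, seasonal):
    return (level - 0.1) ** 2 + (trend - 0.9) ** 2 + (seasonal - 0.3) ** 2


steps = [round(0.1 * k, 1) for k in range(1, 10)]  # 0.1, 0.2, ..., 0.9
best_params, best_score = grid_search(
    toy_score, {"level": steps, "trend": steps, "seasonal": steps}
)
print(best_params, best_score)
```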

*Apply the optimal parameter values to forecast beer production in the period from 1986-01-01 till 1996-12-31

In [None]:
l=0.1
t=0.9
s=0.3
beer_prod_expopred_df=(ExponentialSmoothing(beer_prod_expotrain_df, trend='add', seasonal='add', seasonal_periods=12).fit(smoothing_level=l, smoothing_trend=t, smoothing_seasonal=s)).forecast(132)
beer_prod_expopred_df=beer_prod_expopred_df.to_frame()
beer_prod_expopred_df=beer_prod_expopred_df.reset_index()
beer_prod_expopred_df.columns=['DATETIME', 'MEGALITRES_EXPOPRED']
beer_prod_expopred_df=beer_prod_expopred_df.set_index('DATETIME')
beer_prod_expopred_df=beer_prod_expopred_df.join(beer_prod_expotest_df, how='left')
beer_prod_expopred_df.columns=['MEGALITRES_EXPOPRED', 'MEGALITRES_OBSERVED']

In [None]:
beer_prod_expopred_df

In [None]:
plt.plot(beer_prod_expopred_df['MEGALITRES_OBSERVED'], label='observed')
plt.plot(beer_prod_expopred_df['MEGALITRES_EXPOPRED'], label='expopred')
plt.title('Beer Production')
plt.ylabel('megalitres')
plt.legend(loc='best')

In [None]:
expo_rmse=metrics_df['rmse'].min()

In [None]:
print(expo_rmse)

# PREDICTION 3: SARIMA model

*Here, a seasonal ARIMA (SARIMA) model, a classical time-series econometrics approach based on autocorrelation, is applied to forecast monthly beer production in Australia

In [None]:
sarima_traintest_df=beer_prod_df.copy()
sarima_traintest_df['LOG_MEGALITRES']=np.log(beer_prod_df['MEGALITRES'])
sarima_traintest_df=sarima_traintest_df[['LOG_MEGALITRES']]
sarima_train = sarima_traintest_df.iloc[:360,:]
sarima_test = sarima_traintest_df.iloc[360:,:]

*After specifying the train and test data subsets, the optimal lags and differencing orders will be identified for the seasonal and non-seasonal parts of the SARIMA model

In [None]:
p = d = q = range(0, 3)
pdq = list(itertools.product(p, d, q))
pdq

In [None]:
seasonal_pdq = [(x[0], x[1], x[2], 12) for x in list(itertools.product(p, d, q))]
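*It is worth checking the size of this search space before fitting, since every combination means one model fit. A small self-contained sketch of the same grid:

```python
import itertools

p = d = q = range(0, 3)
pdq = list(itertools.product(p, d, q))
seasonal_pdq = [(sp, sd, sq, 12) for sp, sd, sq in itertools.product(p, d, q)]

# Every non-seasonal order is paired with every seasonal order.
n_models = len(pdq) * len(seasonal_pdq)
print(len(pdq), len(seasonal_pdq), n_models)  # 27 27 729
```

With 729 candidate models, the search loop can take a long time; narrowing the ranges is a reasonable option when run time matters.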

In [None]:
metrics_df=pd.DataFrame()
for param in pdq:
    for param_seasonal in seasonal_pdq:
        mod = sm.tsa.statespace.SARIMAX(sarima_train, order=param, seasonal_order=param_seasonal, enforce_stationarity=False, enforce_invertibility=False)
        results = mod.fit()
        print('ARIMA{}x{}12- AIC:{}'.format(param, param_seasonal, results.aic))
        metrics_dict0 = {'Non-seasonal:':[param], 'Seasonal':[param_seasonal], 'AIC':[results.aic]}
        metrics_dict0_df = pd.DataFrame(metrics_dict0)
        metrics_df=pd.concat([metrics_df, metrics_dict0_df])
print(metrics_df[metrics_df['AIC']==min(metrics_df['AIC'])])

*The optimal combination of SARIMA parameters is applied to fit the model

In [None]:
mod = sm.tsa.statespace.SARIMAX(sarima_train,
                                order=(1, 1, 2),
                                seasonal_order=(1, 0, 1, 12),
                                enforce_stationarity=False,
                                enforce_invertibility=False)

results = mod.fit()

print(results.summary().tables[1])

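*Residual analysis checks whether the residuals are stationary around zero and approximately normally distributed, which would support the model specification and the reliability of the estimated coefficients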

In [None]:
residuals1 = pd.DataFrame(results.resid[1:])
residuals1.plot(figsize=(20,5))
plt.show()

In [None]:
residuals1.plot(kind='kde', figsize=(20,5))
plt.show()
print(residuals1.describe())

In [None]:
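# kstest against 'norm' assumes a standard normal N(0, 1), so standardize the residuals first
resid = residuals1.iloc[:, 0]
print("KS P-value = "+str(round(stats.kstest((resid - resid.mean()) / resid.std(), 'norm')[1], 10)))
print("D’Agostino and Pearson’s P-value = "+str(round(stats.normaltest(resid)[1], 6)))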

*Finally, a rolling forecast is run for the period from 1986-01-01 till 1996-12-31

In [None]:
beer_prod_sarimapred_df=pd.DataFrame()
for i in range(0, 132):
    print(i)
    t=360+i
    if t<476:
        sarima_train=sarima_traintest_df.iloc[:t,:]
        mod=sm.tsa.statespace.SARIMAX(sarima_train, order=(1, 1, 2), seasonal_order=(1, 0, 1, 12))
        mod_fit=mod.fit(disp=0)
        beer_prod_datetime = sarima_traintest_df.index[t]
        beer_prod_sarimapred_value = np.exp(mod_fit.forecast()[0])
        beer_prod_observed_value = np.exp(sarima_test.iloc[i:i+1, -1:]['LOG_MEGALITRES'][0])
    else:
        train_dict={'DATETIME': [beer_prod_datetime], 'LOG_MEGALITRES': [mod_fit.forecast()[0]]}
        train_dict_df=pd.DataFrame.from_dict(train_dict)
        sarima_train=sarima_train.reset_index()
        sarima_train=pd.concat([sarima_train, train_dict_df])
        sarima_train=sarima_train.set_index('DATETIME')
        mod=sm.tsa.statespace.SARIMAX(sarima_train, order=(1, 1, 2), seasonal_order=(1, 0, 1, 12))
        mod_fit=mod.fit(disp=0)
        if beer_prod_datetime.month==12:
            beer_prod_datetime=datetime.datetime(beer_prod_datetime.year+1, 1, beer_prod_datetime.day)
        else:
            beer_prod_datetime=datetime.datetime(beer_prod_datetime.year, beer_prod_datetime.month+1, beer_prod_datetime.day)
        beer_prod_sarimapred_value = np.exp(mod_fit.forecast()[0])
        beer_prod_observed_value = float("NaN")
    print(beer_prod_datetime)
    beer_prod_sarimapred_dict={'DATETIME': [beer_prod_datetime], 'MEGALITRES_SARIMAPRED': [beer_prod_sarimapred_value], 'MEGALITRES_OBSERVED': [beer_prod_observed_value]}
    beer_prod_sarimapred_df0=pd.DataFrame.from_dict(beer_prod_sarimapred_dict)
    beer_prod_sarimapred_df=pd.concat([beer_prod_sarimapred_df, beer_prod_sarimapred_df0])
beer_prod_sarimapred_df=beer_prod_sarimapred_df.set_index('DATETIME')
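*The in-sample part of the loop above is a walk-forward (expanding window) evaluation: refit on everything observed so far, then forecast one step ahead. A minimal sketch of that pattern, where `walk_forward` and the seasonal-naive stand-in model are assumptions used in place of the SARIMA fit:

```python
import numpy as np


def walk_forward(series, n_train, forecast_one_step):
    """One-step-ahead forecasts over series[n_train:], refitting at each step."""
    series = np.asarray(series, dtype=float)
    preds = []
    for t in range(n_train, len(series)):
        history = series[:t]  # everything observed before time t
        preds.append(forecast_one_step(history))
    return np.array(preds)


def seasonal_naive(history, m=12):
    """Stand-in model: repeat the value observed one season earlier."""
    return history[-m]


# Synthetic monthly series with an exact 12-month cycle.
months = np.arange(48)
toy_series = 10 + np.sin(2 * np.pi * months / 12)

wf_preds = walk_forward(toy_series, n_train=36, forecast_one_step=seasonal_naive)
print(wf_preds)
```

Any model can be plugged in through `forecast_one_step`, as long as it takes the history and returns a single forecast value.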

In [None]:
beer_prod_sarimapred_df

In [None]:
plt.plot(beer_prod_sarimapred_df['MEGALITRES_OBSERVED'], label='observed')
plt.plot(beer_prod_sarimapred_df['MEGALITRES_SARIMAPRED'], label='sarimapred')
plt.title('Beer Production')
plt.ylabel('megalitres')
plt.legend(loc='best')

In [None]:
sarima_rmse=np.sqrt(mean_squared_error(beer_prod_sarimapred_df.iloc[:116,-1:], beer_prod_sarimapred_df.iloc[:116,-2:-1]))

In [None]:
print(sarima_rmse)

# PREDICTION 4: LSTM model

*Finally, let's use an LSTM (Long Short-Term Memory) recurrent neural network architecture to predict beer production in Australia

In [None]:
lstm_traintest_df=beer_prod_df.copy()
lstm_traintest_df['LOG_MEGALITRES']=np.log(beer_prod_df['MEGALITRES'])
lstm_traintest_df=lstm_traintest_df[['LOG_MEGALITRES']]
lstm_train = lstm_traintest_df.iloc[:360,:]
lstm_test = lstm_traintest_df.iloc[360:,:]

*Before training the LSTM model, the data must be scaled to the [0, 1] range with a min-max scaler fitted on the training data only

In [None]:
scaler = MinMaxScaler()
scaler.fit(lstm_train)
lstm_train_normalized = scaler.transform(lstm_train)
lstm_test_normalized = scaler.transform(lstm_test)
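*What the scaler does can be sketched by hand. The helper names and numbers below are illustrative assumptions; the point is that the minimum and maximum come from the training data only, so test values can fall outside [0, 1].

```python
import numpy as np

# Hypothetical log-production values, for illustration only.
train_values = np.array([4.0, 5.0, 6.0])
test_values = np.array([5.5, 6.5])

train_min, train_max = train_values.min(), train_values.max()  # fitted on train only


def min_max_scale(x):
    return (x - train_min) / (train_max - train_min)


def min_max_unscale(z):
    return z * (train_max - train_min) + train_min


train_scaled = min_max_scale(train_values)
test_scaled = min_max_scale(test_values)
print(train_scaled)
print(test_scaled)  # the second value lies above 1
```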

*Next, define the generator of the input and output data over which the LSTM model should be trained and fitted

In [None]:
generator = TimeseriesGenerator(lstm_train_normalized, lstm_train_normalized, length=12, batch_size=1)
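*The generator pairs each window of 12 consecutive values with the value that follows it. A NumPy sketch of the same windowing, with the hypothetical helper `make_windows`, shows the resulting shapes:

```python
import numpy as np


def make_windows(values, length):
    """Build (samples, length, 1) inputs and (samples, 1) next-step targets."""
    values = np.asarray(values, dtype=float).reshape(-1, 1)
    inputs = np.stack([values[i:i + length] for i in range(len(values) - length)])
    targets = values[length:]
    return inputs, targets


# 20 dummy observations give 20 - 12 = 8 training samples.
window_inputs, window_targets = make_windows(np.arange(20), length=12)
print(window_inputs.shape, window_targets.shape)
```

The first sample holds values 0 to 11 and its target is 12, which matches the `input_shape=(12, 1)` expected by the LSTM layer.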

*Now, the model could be trained

In [None]:
lstm_model = Sequential()
lstm_model.add(LSTM(200, input_shape=(12, 1)))
lstm_model.add(Dense(1))
lstm_model.compile(optimizer='adam', loss='mean_squared_error')
lstm_model.summary()

*Use the generator to fit the model

In [None]:
# Model.fit accepts generators directly; fit_generator is deprecated
lstm_model.fit(generator, epochs=15)

*Plot the training loss per epoch to check the convergence of the optimization

In [None]:
losses_lstm = lstm_model.history.history['loss']
plt.figure(figsize=(20,5))
plt.xticks(np.arange(0,15,1))
plt.plot(range(len(losses_lstm)),losses_lstm);

*Run rolling forecast of the beer production in the period from 1986-01-01 till 1996-12-31

In [None]:
beer_prod_lstmpred = list()

batch = lstm_train_normalized[-12:]
current_batch = batch.reshape((1, 12, 1))

for i in range(132):   
    lstm_pred = lstm_model.predict(current_batch)[0]
    beer_prod_lstmpred.append(lstm_pred) 
    current_batch = np.append(current_batch[:,1:,:],[[lstm_pred]],axis=1)
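*The loop above is a recursive multi-step forecast: each prediction is appended to the input window and the oldest value is dropped. A model-free sketch of that mechanism, where `recursive_forecast` and the stand-in predictor are assumptions replacing the trained LSTM:

```python
import numpy as np


def recursive_forecast(last_window, predict_next, steps):
    """Roll a fixed-length window forward, feeding predictions back as inputs."""
    window = np.asarray(last_window, dtype=float).copy()
    preds = []
    for _ in range(steps):
        next_value = predict_next(window)
        preds.append(next_value)
        window = np.append(window[1:], next_value)  # drop oldest, add newest
    return np.array(preds)


# Stand-in predictor: repeat the value from one season (12 steps) earlier.
rolled = recursive_forecast(np.arange(12.0), lambda w: w[0], steps=24)
print(rolled)
```

Because later inputs consist entirely of earlier predictions, errors can accumulate over a long horizon such as the 132 months forecast here.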

In [None]:
beer_prod_lstmpred = scaler.inverse_transform(beer_prod_lstmpred)

In [None]:
beer_prod_lstmpred

In [None]:
beer_prod_lstmpred_df=pd.DataFrame(beer_prod_lstmpred, columns=['LOG_MEGALITRES_LSTMPRED'])

In [None]:
index = pd.date_range("1986-01-01", periods=132, freq="MS")
beer_prod_pred_df=pd.DataFrame(index)

In [None]:
beer_prod_lstmpred_df=beer_prod_lstmpred_df.set_index(index)
beer_prod_lstmpred_df=beer_prod_lstmpred_df.reset_index()
beer_prod_lstmpred_df.columns=['DATETIME', 'LOG_MEGALITRES_LSTMPRED']
beer_prod_lstmtest_df=beer_prod_lstmpred_df.copy()
beer_prod_lstmpred_df=beer_prod_lstmpred_df.set_index('DATETIME').join(lstm_test, how='left')
beer_prod_lstmtest_df=beer_prod_lstmtest_df.set_index('DATETIME').join(lstm_test, how='right')
beer_prod_lstmpred_df['MEGALITRES_LSTMPRED']=np.exp(beer_prod_lstmpred_df['LOG_MEGALITRES_LSTMPRED'])
beer_prod_lstmpred_df['MEGALITRES_OBSERVED']=np.exp(beer_prod_lstmpred_df['LOG_MEGALITRES'])
beer_prod_lstmtest_df['MEGALITRES_LSTMPRED']=np.exp(beer_prod_lstmtest_df['LOG_MEGALITRES_LSTMPRED'])
beer_prod_lstmtest_df['MEGALITRES_OBSERVED']=np.exp(beer_prod_lstmtest_df['LOG_MEGALITRES'])

In [None]:
beer_prod_lstmpred_df

In [None]:
beer_prod_lstmpred_df=beer_prod_lstmpred_df[['MEGALITRES_LSTMPRED', 'MEGALITRES_OBSERVED']]

In [None]:
beer_prod_lstmpred_df

In [None]:
plt.plot(beer_prod_lstmpred_df['MEGALITRES_OBSERVED'], label='observed')
plt.plot(beer_prod_lstmpred_df['MEGALITRES_LSTMPRED'], label='lstmpred')
plt.title('Beer Production')
plt.ylabel('megalitres')
plt.legend(loc='best')

In [None]:
beer_prod_lstmtest_df

In [None]:
lstm_rmse=np.sqrt(mean_squared_error(beer_prod_lstmtest_df['MEGALITRES_OBSERVED'], beer_prod_lstmtest_df['MEGALITRES_LSTMPRED']))

In [None]:
print(lstm_rmse)

# CONCLUSION

*Summarizing the RMSE metrics for the 4 forecasting methods leads to the conclusion that the SARIMA forecast performed best in the test period, only marginally outperforming the exponential smoothing model.

In [None]:
error_metrics_dbeer_prod_lstmpred_df=pd.DataFrame({'Model':['EXTRA', 'EXPO', 'SARIMA', 'LSTM'], 'RMSE': [extra_rmse, expo_rmse, sarima_rmse, lstm_rmse]})

In [None]:
error_metrics_dbeer_prod_lstmpred_df

In [None]:
beer_prod_allpred_df=beer_prod_extrapred_df.copy()
beer_prod_allpred_df=beer_prod_allpred_df.join(beer_prod_expopred_df.iloc[:, -2:-1], how='left')
beer_prod_allpred_df=beer_prod_allpred_df.join(beer_prod_sarimapred_df.iloc[:, -2:-1], how='left')
beer_prod_allpred_df=beer_prod_allpred_df.join(beer_prod_lstmpred_df.iloc[:, -2:-1], how='left')

In [None]:
beer_prod_allpred_df

*Here's a summary plot of all forecasts

In [None]:
plt.plot(beer_prod_allpred_df['MEGALITRES_OBS'], label='observed')
plt.plot(beer_prod_allpred_df['MEGALITRES_EXTRAPRED'], label='extrapred')
plt.plot(beer_prod_allpred_df['MEGALITRES_EXPOPRED'], label='expopred')
plt.plot(beer_prod_allpred_df['MEGALITRES_SARIMAPRED'], label='sarimapred')
plt.plot(beer_prod_allpred_df['MEGALITRES_LSTMPRED'], label='lstmpred')
plt.title('Beer Production')
plt.ylabel('megalitres')
plt.legend(loc='best')

*Extrapolation and exponential smoothing tend to overshoot the summer peak each December, whereas the LSTM forecasts tend to overshoot in the winter trough months.
The SARIMA forecast tracks the observed variation in beer production most closely, which is in line with the RMSE results.
Therefore, my final conclusion is that, for a monthly forecast of Australian beer production over the period from 1986-01-01 till 1996-12-31,
SARIMA modelling has the highest predictive power of the 4 methods considered. Still, if the forecast needs to be produced regularly and execution time is a factor, exponential smoothing is a faster second-best method with only a small loss of precision (mainly in the summer months).

In [None]:
beer_prod_allpred_df.to_csv('beer_prod_allpred.csv', index=True)