# Getting Started with ETS

Automatic forecasting tools address the need for predictions over large collections of univariate time series, a need that often arises in business practice and other contexts. Among these tools, the `ets` function from R's `forecast` package has been a reference for accuracy and quality for many years.

Unfortunately, baselines with comparable accuracy and computational efficiency were not yet available for Python. For this reason, we developed a new, highly efficient pure-Python implementation of these classic algorithms, which we showcase in this notebook.


In [None]:
import numpy as np
import pandas as pd
from IPython.display import display, Markdown
from tqdm.autonotebook import tqdm
import matplotlib.pyplot as plt
from statsforecast import StatsForecast
from statsforecast.models import AutoARIMA, ETS

## Loading the Six Group Merchants Data

In [None]:
# Change this path to point to your local copy of the data
data_path = "../../../data/raw/Time_Series_Merchants_Transactions_Anonymized.csv"
df_merchant_transactions = pd.read_csv(data_path)

In [None]:
df_merchant_transactions = df_merchant_transactions.drop(columns='Merchant Name')

In [None]:
# Replace the column names with standard month-end dates (26 months: 2020-08 to 2022-09)
stddates = pd.date_range(start='2020-08', end='2022-10', freq="M")
df_merchant_transactions.columns = stddates
df_merchant_transactions.head()

In [None]:
# Long-format frame for a single series: one id, one timestamp and one value per row
df = {
    'unique_id': [1.0] * len(stddates),
    'ds': stddates,
    'y': df_merchant_transactions.iloc[7, :].values,
}
Y_df = pd.DataFrame(df)
Y_df.head()
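
The cell above builds the long `unique_id` / `ds` / `y` layout for a single row of the table. When several merchants are to be forecast together, the same layout can be produced for every row at once. The following is a minimal sketch on a small synthetic table, since the real file is not reproduced here: the values, the three-month date range, and the `wide_to_long` helper are all made up for illustration, and the row position is assumed to serve as the series id.

```python
import pandas as pd


def wide_to_long(wide_df):
    """Turn a wide table (one row per series, one column per date)
    into the long unique_id / ds / y layout."""
    long_df = (
        wide_df
        .rename_axis('unique_id')   # assumption: the row position is the series id
        .reset_index()
        .melt(id_vars='unique_id', var_name='ds', value_name='y')
    )
    long_df['ds'] = pd.to_datetime(long_df['ds'])
    # Keep each series contiguous and in chronological order
    return long_df.sort_values(['unique_id', 'ds']).reset_index(drop=True)


# Synthetic example: two series observed over three month-ends
dates = pd.date_range(start='2022-01-31', periods=3, freq=pd.offsets.MonthEnd())
wide = pd.DataFrame([[10, 12, 11], [100, 90, 95]], columns=dates)
long_df = wide_to_long(wide)
print(long_df)
```

Each series then occupies consecutive rows, so a single frame can hold all merchants instead of one frame per row.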

## Fit AutoETS

**ETS:** The exponential smoothing (ETS) algorithm is especially suited for data with seasonality and trend. ETS computes its prediction as a weighted average over all observations in the input time series. In contrast to moving-average methods with constant weights, ETS weights decrease exponentially with the age of an observation, so the forecast still draws on the full history while giving the most weight to recent observations.
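
To make the weighting concrete, here is a minimal sketch of simple exponential smoothing, the most basic form of exponential smoothing, with no trend or seasonal component. It is not how `statsforecast` implements ETS: the function names, the smoothing value `alpha=0.5`, the toy series, and the choice to initialise the level with the first observation are all assumptions made for illustration.

```python
def ses_weights(alpha, n):
    """Weights placed on the last n observations, newest first:
    alpha * (1 - alpha) ** k for an observation that is k steps old."""
    return [alpha * (1 - alpha) ** k for k in range(n)]


def ses_forecast(y, alpha):
    """One-step-ahead forecast via the recursion
    level = alpha * y_t + (1 - alpha) * level."""
    level = y[0]  # assumption: start the level at the first observation
    for value in y[1:]:
        level = alpha * value + (1 - alpha) * level
    return level


print(ses_weights(alpha=0.5, n=4))                         # [0.5, 0.25, 0.125, 0.0625]
print(ses_forecast([10.0, 12.0, 11.0, 13.0], alpha=0.5))   # 12.0
```

Each weight is half the previous one here, which is the exponential decay described above. A larger `alpha` makes the forecast react faster to recent observations, while a smaller one smooths over more of the history.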

In [None]:
Y_train_df = Y_df[Y_df["ds"]<="2022-04-30"] # 21 train (2020-08 to 2022-04)
Y_test_df = Y_df[Y_df["ds"]>"2022-04-30"] # 5 test (2022-05 to 2022-09)

In [None]:
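season_length = 12  # monthly data with a yearly cycle
horizon = len(Y_test_df)  # forecast as many steps as there are test observations
models = [
    # model string = error, trend, season; 'Z' selects the component automatically,
    # 'M' forces a multiplicative trend
    ETS(season_length=season_length, model='ZMZ')
]
model = StatsForecast(
    df=Y_train_df,
    models=models,
    freq='M',
    n_jobs=-1,  # use all available cores
)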

Y_hat_df = model.forecast(horizon).reset_index()
Y_hat_df.head()

## Plot and Evaluate Predictions

We now plot the model's forecasts against the actual values of the test set.

In [None]:
fig, ax = plt.subplots(1, 1, figsize = (20, 7))
Y_hat_df = Y_test_df.merge(Y_hat_df, how='left', on=['unique_id', 'ds'])
plot_df = pd.concat([Y_train_df, Y_hat_df]).set_index('ds')

plot_df[['y', 'ETS']].plot(ax=ax, linewidth=2)

ax.set_title('Merchants Transactions Forecast', fontsize=22)
ax.set_ylabel('Monthly Transactions', fontsize=20)
ax.set_xlabel('Timestamp [t]', fontsize=20)
ax.legend(prop={'size': 15})
ax.grid()

Finally, we evaluate the accuracy of the predictions using the Mean Absolute Error (MAE):

$$
\qquad MAE = \frac{1}{Horizon} \sum_{\tau} |y_{\tau} - \hat{y}_{\tau}|\qquad
$$
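
As a small worked example with made-up numbers (not taken from the merchants data), take a horizon of 3 with actual values $y = (100, 120, 90)$ and forecasts $\hat{y} = (110, 115, 100)$:

$$
\qquad MAE = \frac{1}{3} \left( |100 - 110| + |120 - 115| + |90 - 100| \right) = \frac{10 + 5 + 10}{3} \approx 8.33 \qquad
$$

Because the errors are averaged in absolute value, the MAE is expressed in the same units as the series, here monthly transactions.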

In [None]:
def mae(y_hat, y_true):
    return np.mean(np.abs(y_hat-y_true))

y_true = Y_test_df['y'].values
ets_preds = Y_hat_df['ETS'].values

print('ETS   MAE: %0.3f' % mae(ets_preds, y_true))