### Prof. Pedram Jahangiry

To edit this notebook, first save a copy to your own Google Drive. Start by opening the notebook in Colab 👇

<a href="https://colab.research.google.com/github/PJalgotrader/Deep_forecasting-USU/blob/main/Lectures%20and%20codes/DF%20Spring%202023/Module%203-%20Exponential%20Smoothing/Module3-exponential_smoothing_ETS.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a> 



![logo](https://upload.wikimedia.org/wikipedia/commons/4/44/Huntsman-Wordmark-with-USU-Blue.gif#center) 


## 🔗 Links

[![linkedin](https://img.shields.io/badge/LinkedIn-0A66C2?style=for-the-badge&logo=linkedin&logoColor=white)](https://www.linkedin.com/in/pedram-jahangiry-cfa-5778015a)

[![Youtube](https://img.shields.io/badge/youtube_channel-1DA1F2?style=for-the-badge&logo=youtube&logoColor=white&color=FF0000)](https://www.youtube.com/channel/UCNDElcuuyX-2pSatVBDpJJQ)

[![Twitter URL](https://img.shields.io/twitter/url/https/twitter.com/PedramJahangiry.svg?style=social&label=Follow%20%40PedramJahangiry)](https://twitter.com/PedramJahangiry)


---


# Module 3: Exponential Smoothing Methods + ETS models 

Exponential smoothing was proposed in the late 1950s (Brown, 1959; Holt, 1957; Winters, 1960) and has motivated some of the most successful forecasting methods. A forecast generated by exponential smoothing is a weighted average of past observations, with the weights decaying exponentially as the observations get older. In other words, the more recent the observation, the higher the associated weight.
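
The decaying weights can be sketched in a few lines of plain Python. This is a minimal illustration, not PyCaret's implementation: the helpers `ses_weights` and `ses_forecast`, the value `alpha = 0.5`, and the use of the first observation as the initial level are all assumptions made for the example.

```python
def ses_weights(alpha, n):
    """Weights on the n most recent observations, newest first."""
    return [alpha * (1 - alpha) ** j for j in range(n)]


def ses_forecast(y, alpha, initial_level):
    """One-step-ahead forecast from the SES recursion."""
    level = initial_level
    for obs in y:
        # New level = weighted mix of the latest observation and the old level
        level = alpha * obs + (1 - alpha) * level
    return level


y = [112.0, 118.0, 132.0, 129.0, 121.0]  # first five airline observations
alpha = 0.5                               # hypothetical smoothing parameter

weights = ses_weights(alpha, len(y))
print(weights)  # [0.5, 0.25, 0.125, 0.0625, 0.03125]

# The recursion equals the weighted average plus the decayed initial level
forecast = ses_forecast(y, alpha, initial_level=y[0])
weighted = sum(w * obs for w, obs in zip(weights, reversed(y)))
weighted += (1 - alpha) ** len(y) * y[0]
print(forecast, weighted)  # 123.625 123.625
```

With `alpha = 0.5` each weight is half of the previous one, so the most recent observations dominate the forecast.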

In this module:

* First, we present the mechanics of the most important exponential smoothing methods.
* Then, we present the statistical models that underlie exponential smoothing methods. These models generate the same point forecasts as the methods discussed in the first part of the module, but they also generate prediction intervals.
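
As a hedged illustration of the second point, the sketch below turns an SES point forecast into a prediction interval using the ETS(A, N, N) model with normally distributed errors. The forecast variance σ²[1 + α²(h − 1)] is the standard textbook result for this model; the function name `ann_interval` and the values of `level`, `alpha`, and `sigma` are hypothetical and not taken from the fitted models in this notebook.

```python
from math import sqrt
from statistics import NormalDist


def ann_interval(level, alpha, sigma, h, coverage=0.90):
    """Prediction interval for ETS(A, N, N) at horizon h, assuming normal errors."""
    z = NormalDist().inv_cdf(0.5 + coverage / 2)            # two-sided quantile
    half_width = z * sigma * sqrt(1 + alpha ** 2 * (h - 1))  # grows with the horizon
    return level - half_width, level + half_width


# Hypothetical fitted values, for illustration only
level, alpha, sigma = 400.0, 0.9, 30.0

for h in (1, 6, 12):
    lower, upper = ann_interval(level, alpha, sigma, h)
    print(f"h={h:2d}: point={level:.1f}, 90% interval=({lower:.1f}, {upper:.1f})")
```

The point forecast stays flat, as in SES, while the interval widens as the horizon grows.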

Documentation: 

1. **PyCaret** (3.0): https://pycaret.readthedocs.io/en/latest/index.html
2. **sktime** : https://www.sktime.org/en/stable/api_reference/forecasting.html

# Installation

Follow the steps here: https://pycaret.gitbook.io/docs/get-started/installation


In [2]:
# Only needed if you want to run this notebook in Google Colab.
# For this module, the light version of PyCaret is enough; --pre lets pip install the pre-release (release candidate) build.

# !pip install --pre pycaret

In [1]:
# If you get a warning that you need to "RESTART RUNTIME", go ahead and press that button.

# Let's double-check the PyCaret version:
from pycaret.utils import version
version()

'3.0.0.rc4'

In [2]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import warnings
warnings.filterwarnings('ignore')

# Importing Dataset

In [5]:
#get_data('index')

In [3]:
from pycaret.datasets import get_data
airline = get_data('airline')

Period
1949-01    112.0
1949-02    118.0
1949-03    132.0
1949-04    129.0
1949-05    121.0
Freq: M, Name: Number of airline passengers, dtype: float64

In [11]:
# or alternatively, 
df = pd.read_csv("https://raw.githubusercontent.com/PJalgotrader/Deep_forecasting-USU/main/data/airline_passengers.csv", index_col="Month")
df.head()

Month,Passengers
1949-01,112
1949-02,118
1949-03,132
1949-04,129
1949-05,121


In [12]:
# If you load the data with pandas, first convert the index to datetime and then to a monthly period index.
# This avoids compatibility issues with other packages.
df.index

Index(['1949-01', '1949-02', '1949-03', '1949-04', '1949-05', '1949-06',
       '1949-07', '1949-08', '1949-09', '1949-10',
       ...
       '1960-03', '1960-04', '1960-05', '1960-06', '1960-07', '1960-08',
       '1960-09', '1960-10', '1960-11', '1960-12'],
      dtype='object', name='Month', length=144)

In [13]:
pd.to_datetime(df.index)

DatetimeIndex(['1949-01-01', '1949-02-01', '1949-03-01', '1949-04-01',
               '1949-05-01', '1949-06-01', '1949-07-01', '1949-08-01',
               '1949-09-01', '1949-10-01',
               ...
               '1960-03-01', '1960-04-01', '1960-05-01', '1960-06-01',
               '1960-07-01', '1960-08-01', '1960-09-01', '1960-10-01',
               '1960-11-01', '1960-12-01'],
              dtype='datetime64[ns]', name='Month', length=144, freq=None)

In [14]:
df.index = pd.to_datetime(df.index).to_period('M')
df.index

PeriodIndex(['1949-01', '1949-02', '1949-03', '1949-04', '1949-05', '1949-06',
             '1949-07', '1949-08', '1949-09', '1949-10',
             ...
             '1960-03', '1960-04', '1960-05', '1960-06', '1960-07', '1960-08',
             '1960-09', '1960-10', '1960-11', '1960-12'],
            dtype='period[M]', name='Month', length=144)

# Setting up the PyCaret Experiment

In [15]:
from pycaret.time_series import *

In [16]:
exp = TSForecastingExperiment()
exp.setup(data = df, target='Passengers' ,  fh = 12, coverage=0.90)

Unnamed: 0,Description,Value
0,session_id,3834
1,Target,Passengers
2,Approach,Univariate
3,Exogenous Variables,Not Present
4,Original data shape,"(144, 1)"
5,Transformed data shape,"(144, 1)"
6,Transformed train set shape,"(132, 1)"
7,Transformed test set shape,"(12, 1)"
8,Rows with missing values,0.0%
9,Fold Generator,ExpandingWindowSplitter


<pycaret.time_series.forecasting.oop.TSForecastingExperiment at 0x1fb5efa3fa0>

In [12]:
exp.check_stats()

Unnamed: 0,Test,Test Name,Data,Property,Setting,Value
0,Summary,Statistics,Transformed,Length,,144.0
1,Summary,Statistics,Transformed,# Missing Values,,0.0
2,Summary,Statistics,Transformed,Mean,,280.298611
3,Summary,Statistics,Transformed,Median,,265.5
4,Summary,Statistics,Transformed,Standard Deviation,,119.966317
5,Summary,Statistics,Transformed,Variance,,14391.917201
6,Summary,Statistics,Transformed,Kurtosis,,-0.364942
7,Summary,Statistics,Transformed,Skewness,,0.58316
8,Summary,Statistics,Transformed,# Distinct Values,,118.0
9,White Noise,Ljung-Box,Transformed,Test Statictic,"{'alpha': 0.05, 'K': 24}",1606.083817


In [13]:
exp.plot_model(plot='train_test_split')

In [14]:
exp.plot_model(plot='acf', data_kwargs={'nlags':30})

In [15]:
exp.plot_model(plot='pacf', data_kwargs={'nlags':30})

In [16]:
exp.plot_model(plot = 'decomp')


In [17]:
exp.plot_model(plot = 'decomp', data_kwargs = {'type' : 'multiplicative'})


In [18]:
exp.models()

ID,Name,Reference,Turbo
naive,Naive Forecaster,sktime.forecasting.naive.NaiveForecaster,True
grand_means,Grand Means Forecaster,sktime.forecasting.naive.NaiveForecaster,True
snaive,Seasonal Naive Forecaster,sktime.forecasting.naive.NaiveForecaster,True
polytrend,Polynomial Trend Forecaster,sktime.forecasting.trend.PolynomialTrendForeca...,True
arima,ARIMA,sktime.forecasting.arima.ARIMA,True
auto_arima,Auto ARIMA,sktime.forecasting.arima.AutoARIMA,True
exp_smooth,Exponential Smoothing,sktime.forecasting.exp_smoothing.ExponentialSm...,True
croston,Croston,sktime.forecasting.croston.Croston,True
ets,ETS,sktime.forecasting.ets.AutoETS,True
theta,Theta Forecaster,sktime.forecasting.theta.ThetaForecaster,True


---
---
## Exponential Smoothing Methods:

---
### SES method

In [19]:
ses = exp.create_model('exp_smooth', trend=None, seasonal=None, sp= None, cross_validation=False )

Unnamed: 0,MASE,RMSSE,MAE,RMSE,MAPE,SMAPE,R2
Test,2.5006,2.9849,76.1426,103.1245,0.1428,0.1616,-0.9198


In [20]:
ses.get_params()

{'damped_trend': False,
 'damping_trend': None,
 'initial_level': None,
 'initial_seasonal': None,
 'initial_trend': None,
 'initialization_method': 'estimated',
 'method': None,
 'minimize_kwargs': None,
 'optimized': True,
 'random_state': None,
 'remove_bias': False,
 'seasonal': None,
 'smoothing_level': None,
 'smoothing_seasonal': None,
 'smoothing_trend': None,
 'sp': None,
 'start_params': None,
 'trend': None,
 'use_boxcox': None,
 'use_brute': True}

---
Let's make some "in sample" predictions:

In [21]:
exp.plot_model(ses, plot='insample')

In [22]:
# In-sample performance metrics: start from the index of the training set (everything except the last 12 months)
df.index[:-12] # train set index 

PeriodIndex(['1949-01', '1949-02', '1949-03', '1949-04', '1949-05', '1949-06',
             '1949-07', '1949-08', '1949-09', '1949-10',
             ...
             '1959-03', '1959-04', '1959-05', '1959-06', '1959-07', '1959-08',
             '1959-09', '1959-10', '1959-11', '1959-12'],
            dtype='period[M]', name='Month', length=132)

In [23]:
y_pred = ses.predict(df.index[:-12])  # alternatively we could do: y - ses.predict_residuals(y) 
y_pred

Month,Passengers
1949-01,118.466667
1949-02,112.032333
1949-03,117.970162
1949-04,131.929851
1949-05,129.014649
...,...
1959-08,547.618697
1959-09,558.943093
1959-10,463.479715
1959-11,407.282399


In [24]:
from sklearn.metrics import r2_score, mean_absolute_percentage_error

In [25]:
r2_score(df['Passengers'][:-12], y_pred)

0.9133722405707403

In [26]:
mean_absolute_percentage_error(df['Passengers'][:-12], y_pred)

0.08966385456088362

---
Now, let's make some forecasts:

In [79]:
exp.plot_model(ses, plot='forecast')

In [28]:
exp.plot_model(ses, plot='forecast', data_kwargs={'fh':36})

In [29]:
# Again, we can construct the forecasts manually and report R2 ourselves, although there are built-in functions for this.
# Manually:
y_forecast = ses.predict(df.index[-12:])  # predict() targets the hold-out set by default; here we pass its index explicitly
y_forecast

Month,Passengers
1960-01,404.786132
1960-02,404.786132
1960-03,404.786132
1960-04,404.786132
1960-05,404.786132
1960-06,404.786132
1960-07,404.786132
1960-08,404.786132
1960-09,404.786132
1960-10,404.786132


In [30]:
r2_score(df['Passengers'][-12:], y_forecast)

-0.9197953372342429

In [31]:
mean_absolute_percentage_error(df['Passengers'][-12:], y_forecast)

0.14279017396066332

In [32]:
# using built-in function: 
exp.predict_model(ses)

Unnamed: 0,Model,MASE,RMSSE,MAE,RMSE,MAPE,SMAPE,R2
0,Exponential Smoothing,2.5006,2.9849,76.1426,103.1245,0.1428,0.1616,-0.9198


Unnamed: 0,y_pred
1960-01,404.7861
1960-02,404.7861
1960-03,404.7861
1960-04,404.7861
1960-05,404.7861
1960-06,404.7861
1960-07,404.7861
1960-08,404.7861
1960-09,404.7861
1960-10,404.7861


---
### Holt's linear trend method:

In [77]:
ht = exp.create_model('exp_smooth', trend='add', seasonal=None, cross_validation=False)

Unnamed: 0,MASE,RMSSE,MAE,RMSE,MAPE,SMAPE,R2
Test,2.2069,2.7073,67.2002,93.5338,0.1259,0.1402,-0.5793


In [88]:
exp.plot_model(ht, plot='forecast', data_kwargs={'fh':24, 'labels':['Holts linear trend']})

In [85]:
exp.predict_model(ht, fh=24)

Unnamed: 0,y_pred
1960-01,406.8585
1960-02,408.9205
1960-03,410.9826
1960-04,413.0446
1960-05,415.1066
1960-06,417.1687
1960-07,419.2307
1960-08,421.2927
1960-09,423.3547
1960-10,425.4168
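
The forecasts above increase by the same amount at every step, which is the signature of Holt's linear trend method: the h-step forecast is the last level plus h times the last trend. The recursion can be sketched as follows. The helper `holt_forecast`, the simple initialization (first observation as level, first difference as trend), and the smoothing parameters are assumptions for illustration; PyCaret estimates these quantities from the data instead.

```python
def holt_forecast(y, alpha, beta, horizon):
    """Holt's linear trend method with a simple (assumed) initialization."""
    level, trend = y[0], y[1] - y[0]
    for obs in y[1:]:
        prev_level = level
        # Update the level using the trend-adjusted previous level
        level = alpha * obs + (1 - alpha) * (prev_level + trend)
        # Update the trend from the change in level
        trend = beta * (level - prev_level) + (1 - beta) * trend
    # Forecasts lie on a straight line: last level + h * last trend
    return [level + h * trend for h in range(1, horizon + 1)]


toy_series = [10.0, 12.0, 14.0, 16.0, 18.0]  # perfectly linear toy data
forecasts = holt_forecast(toy_series, alpha=0.5, beta=0.5, horizon=3)
print(forecasts)  # [20.0, 22.0, 24.0]
```

On perfectly linear data the method simply extends the line, whatever the smoothing parameters are.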


---
### Holt-Winters method

In [67]:
hw_add = exp.create_model('exp_smooth', trend='add', seasonal='add', sp= 12, cross_validation=False)

Unnamed: 0,MASE,RMSSE,MAE,RMSE,MAPE,SMAPE,R2
Test,0.4394,0.4915,13.3805,16.9799,0.028,0.028,0.948


In [68]:
hw_mult = exp.create_model('exp_smooth', trend='add', seasonal='mul', sp=12, cross_validation=False)

Unnamed: 0,MASE,RMSSE,MAE,RMSE,MAPE,SMAPE,R2
Test,0.3382,0.4575,10.2997,15.8074,0.0221,0.0216,0.9549


In [69]:
hw_damped = exp.create_model('exp_smooth', damped_trend=True, trend='add', seasonal='mul', sp=12, cross_validation=False)

Unnamed: 0,MASE,RMSSE,MAE,RMSE,MAPE,SMAPE,R2
Test,0.399,0.5069,12.1481,17.5115,0.0254,0.0248,0.9446


In [35]:
hw_mult.get_params()

{'damped_trend': False,
 'damping_trend': None,
 'initial_level': None,
 'initial_seasonal': None,
 'initial_trend': None,
 'initialization_method': 'estimated',
 'method': None,
 'minimize_kwargs': None,
 'optimized': True,
 'random_state': None,
 'remove_bias': False,
 'seasonal': 'mul',
 'smoothing_level': None,
 'smoothing_seasonal': None,
 'smoothing_trend': None,
 'sp': 12,
 'start_params': None,
 'trend': 'add',
 'use_boxcox': None,
 'use_brute': True}

In [89]:
exp.plot_model([ses,hw_add, hw_mult], plot='insample', data_kwargs={'labels':["SES", "Holt-Winter-additive", "Holt-Winter-Multiplicative"]})

In [90]:
exp.plot_model([ses,hw_add, hw_mult], plot='forecast', data_kwargs={'labels':["SES", "Holt-Winter-additive", "Holt-Winter-Multiplicative"]})

In [94]:
exp.plot_model([ses,ht, hw_add, hw_mult], plot='forecast', data_kwargs={'fh':36, 'labels':["SES", "Holt's linear trend" ,"Holt-Winter-additive", "Holt-Winter-Multiplicative"]})

Let's see which model performs best so far.

In [96]:
exp.compare_models(include=[ses, ht, hw_add, hw_mult], cross_validation=False)

Unnamed: 0,Model,MASE,RMSSE,MAE,RMSE,MAPE,SMAPE,R2,TT (Sec)
3,Exponential Smoothing,0.3382,0.4575,10.2997,15.8074,0.0221,0.0216,0.9549,0.22
2,Exponential Smoothing,0.4394,0.4915,13.3805,16.9799,0.028,0.028,0.948,0.14
1,Exponential Smoothing,2.2069,2.7073,67.2002,93.5338,0.1259,0.1402,-0.5793,0.03
0,Exponential Smoothing,2.5006,2.9849,76.1426,103.1245,0.1428,0.1616,-0.9198,0.02


So, according to the table above, `hw_mult` is the best performer on the hold-out set. Later in the course, we will compare model performance on cross-validation sets, which is more reliable than a single hold-out split.
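
To give a feel for what cross-validation means for time series, here is a minimal sketch of expanding-window splits in plain Python. It mirrors the idea behind the `ExpandingWindowSplitter` reported by `setup()`, but the helper `expanding_window_splits`, the choice of three folds, and the exact placement of the windows are assumptions; PyCaret's actual splits may differ.

```python
def expanding_window_splits(n_obs, fh, n_folds):
    """Yield (train_indices, test_indices) with a growing training window."""
    for fold in range(n_folds):
        # Each fold ends fh observations later than the previous one
        train_end = n_obs - (n_folds - fold) * fh
        yield range(0, train_end), range(train_end, train_end + fh)


# 132 training observations, 12-month horizon, 3 folds (assumed)
splits = list(expanding_window_splits(n_obs=132, fh=12, n_folds=3))

for train, test in splits:
    print(f"train size={len(train)}, test={test[0]}..{test[-1]}")
# train size=96, test=96..107
# train size=108, test=108..119
# train size=120, test=120..131
```

Each fold trains only on data that precedes its test window, so the model is never evaluated on observations it has already seen.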

In [95]:
exp.plot_model(estimator=hw_mult, plot="diagnostics")

How to read the periodogram:
The periodogram shows how the variation in a time series is distributed across frequencies. A clear peak indicates a repeating (seasonal or cyclical) pattern at the corresponding frequency. For model residuals, we hope to see something close to white noise, whose periodogram is roughly flat, with no clear peaks or patterns.
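
The idea can be sketched with the standard library alone. The helper `periodogram` below is an illustrative discrete Fourier transform, not the routine PyCaret uses for its diagnostics plot, and the synthetic 12-month sine wave is an assumption chosen to mimic monthly seasonality.

```python
import cmath
import math


def periodogram(x):
    """Return (frequency, power) pairs for frequencies k/n, k = 1..n//2."""
    n = len(x)
    mean = sum(x) / n
    centered = [v - mean for v in x]
    result = []
    for k in range(1, n // 2 + 1):
        # Discrete Fourier coefficient at frequency k/n
        coef = sum(v * cmath.exp(-2j * math.pi * k * t / n)
                   for t, v in enumerate(centered))
        result.append((k / n, abs(coef) ** 2 / n))
    return result


# Synthetic monthly series with a 12-month cycle (144 observations)
seasonal = [math.sin(2 * math.pi * t / 12) for t in range(144)]

peak_freq, peak_power = max(periodogram(seasonal), key=lambda p: p[1])
print(f"peak frequency = {peak_freq:.4f} cycles per month")
print(f"implied period = {1 / peak_freq:.1f} months")
```

The peak sits at one cycle every 12 months. If the residuals of a model still show such a peak, some seasonality has been left unexplained.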

---
---
## ETS models

In [119]:
ets_ANN = exp.create_model('ets', error="add", trend=None, seasonal=None, cross_validation=False)  # ETS(A, N, N): the model behind SES; point forecasts match SES up to parameter estimation


Unnamed: 0,MASE,RMSSE,MAE,RMSE,MAPE,SMAPE,R2
Test,2.496,2.9807,76.0029,102.9795,0.1425,0.1612,-0.9144


In [118]:
ets_AAN = exp.create_model('ets', error="add", trend="add", seasonal=None, cross_validation=False)  # ETS(A, A, N): the model behind Holt's linear trend; point forecasts match up to parameter estimation

Unnamed: 0,MASE,RMSSE,MAE,RMSE,MAPE,SMAPE,R2
Test,2.1789,2.6833,66.3466,92.7039,0.1243,0.1382,-0.5514


In [120]:
ets_AAM = exp.create_model('ets', error="add", trend="add", seasonal="mul" ,cross_validation=False )


Unnamed: 0,MASE,RMSSE,MAE,RMSE,MAPE,SMAPE,R2
Test,0.3129,0.4475,9.5287,15.4598,0.0203,0.0199,0.9569


In [123]:
exp.compare_models([ets_ANN, ets_AAN, ets_AAM], cross_validation=False)

Unnamed: 0,Model,MASE,RMSSE,MAE,RMSE,MAPE,SMAPE,R2,TT (Sec)
2,ETS,0.3129,0.4475,9.5287,15.4598,0.0203,0.0199,0.9569,0.12
1,ETS,2.1789,2.6833,66.3466,92.7039,0.1243,0.1382,-0.5514,0.04
0,ETS,2.496,2.9807,76.0029,102.9795,0.1425,0.1612,-0.9144,0.02


In [121]:
exp.plot_model(ets_AAM, plot='insample')

In [124]:
exp.plot_model(ets_AAM, plot='forecast')

In [125]:
exp.plot_model(ets_AAM, plot='forecast', data_kwargs={'fh':48})

In [126]:
exp.plot_model([ets_ANN, ets_AAN, ets_AAM], plot='forecast', data_kwargs={'labels':["ETS(A, N, N)", "ETS(A, A, N)", "ETS(A, A, M)"], 'fh':36})

In [128]:
exp.plot_model(estimator=ets_AAM, plot="diagnostics")

---
## Predict Model

The `predict_model()` function generates forecasts from a trained model. When no forecast horizon is passed, it forecasts over the hold-out set defined in `setup()`.

Note: so far, our best model is the ETS(A, A, M) model (`ets_AAM`).


In [131]:
naive = exp.create_model('naive', cross_validation=False)

Unnamed: 0,MASE,RMSSE,MAE,RMSE,MAPE,SMAPE,R2
Test,2.4959,2.9807,76.0,102.9765,0.1425,0.1612,-0.9143


In [133]:
exp.compare_models(include=[naive, ses,hw_add, hw_mult,ets_ANN, ets_AAN, ets_AAM], cross_validation=False)

Unnamed: 0,Model,MASE,RMSSE,MAE,RMSE,MAPE,SMAPE,R2,TT (Sec)
6,ETS,0.3129,0.4475,9.5287,15.4598,0.0203,0.0199,0.9569,0.14
3,Exponential Smoothing,0.3382,0.4575,10.2997,15.8074,0.0221,0.0216,0.9549,0.16
2,Exponential Smoothing,0.4394,0.4915,13.3805,16.9799,0.028,0.028,0.948,0.15
5,ETS,2.1789,2.6833,66.3466,92.7039,0.1243,0.1382,-0.5514,0.04
0,Naive Forecaster,2.4959,2.9807,76.0,102.9765,0.1425,0.1612,-0.9143,0.02
4,ETS,2.496,2.9807,76.0029,102.9795,0.1425,0.1612,-0.9144,0.03
1,Exponential Smoothing,2.5006,2.9849,76.1426,103.1245,0.1428,0.1616,-0.9198,0.02


In [134]:
holdout_pred = exp.predict_model(ets_AAM)

Unnamed: 0,Model,MASE,RMSSE,MAE,RMSE,MAPE,SMAPE,R2
0,ETS,0.3129,0.4475,9.5287,15.4598,0.0203,0.0199,0.9569


## Finalize Model

Model finalization is the last step in the experiment. The `finalize_model()` function refits a given estimator on the complete dataset, including the test/hold-out sample, so that the model uses all available data before it is deployed in production or used to forecast new, unseen periods.

In [135]:
final_model = exp.finalize_model(ets_AAM)

In [136]:
final_model

---
### Final prediction on unseen data

The `predict_model()` function is also used to forecast unseen data: with a finalized model, the forecasts start right after the last observation in the dataset.

In [137]:
exp.plot_model(plot='train_test_split')

In [140]:
exp.plot_model(final_model, plot='forecast', data_kwargs={'fh':24})

In [141]:
unseen_predictions = exp.predict_model(final_model, fh=10)
unseen_predictions

Unnamed: 0,y_pred
1961-01,445.4229
1961-02,418.3921
1961-03,464.7036
1961-04,494.5817
1961-05,505.5179
1961-06,573.3778
1961-07,663.6585
1961-08,654.8065
1961-09,546.7023
1961-10,488.2774


## Save Model

This function saves the transformation pipeline and trained model object into the current working directory as a pickle file for later use.

In [142]:
exp.save_model(final_model, 'best_smoothing_model')

Transformation Pipeline and Model Successfully Saved


(ForecastingPipeline(steps=[('forecaster',
                             TransformedTargetForecaster(steps=[('model',
                                                                 ForecastingPipeline(steps=[('forecaster',
                                                                                             TransformedTargetForecaster(steps=[('model',
                                                                                                                                 AutoETS(seasonal='mul',
                                                                                                                                         sp=12,
                                                                                                                                         trend='add'))]))]))]))]),
 'best_smoothing_model.pkl')

## Load model

This function loads a previously saved pipeline.



In [143]:
my_model = load_model('best_smoothing_model')

Transformation Pipeline and Model Successfully Loaded


In [59]:
my_model

ForecastingPipeline(steps=[('forecaster',
                            TransformedTargetForecaster(steps=[('model',
                                                                ForecastingPipeline(steps=[('forecaster',
                                                                                            TransformedTargetForecaster(steps=[('model',
                                                                                                                                AutoETS(seasonal='mul',
                                                                                                                                        sp=12,
                                                                                                                                        trend='add'))]))]))]))])
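
For intuition, the save/load round trip can be sketched with Python's standard `pickle` module. This is only an illustration under stated assumptions: `model_stub` is a hypothetical stand-in dictionary rather than the fitted pipeline, and PyCaret's own serialization details may differ.

```python
import pickle
import tempfile
from pathlib import Path

# Hypothetical stand-in for a fitted model object
model_stub = {"model": "AutoETS", "trend": "add", "seasonal": "mul", "sp": 12}

with tempfile.TemporaryDirectory() as tmp_dir:
    path = Path(tmp_dir) / "best_smoothing_model.pkl"

    # Save: serialize the object to disk
    with path.open("wb") as f:
        pickle.dump(model_stub, f)

    # Load: restore an equivalent object from the file
    with path.open("rb") as f:
        restored = pickle.load(f)

print(restored == model_stub)  # True
```

Only load pickle files from sources you trust, since unpickling can execute arbitrary code.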

Done!