# Time Series Analysis
<ol>
    <li><a href="#HP">Hodrick-Prescott filter</a></li>
    <li><a href="#ETS">Error-Trend-Seasonal decomposition</a></li>
    <li><a href="#MA">Moving Average</a>
        <ol>
            <li><a href="#SMA">Simple Moving Average</a>: rolling window</li>
            <li><a href="#EWMA">Exponential Weighted MA</a></li>
            <li><a href="#DTEWMA">Double & Triple Exponential Smoothing</a></li>
            <li><a href="#Damped EWMA">Damped EWMA</a></li>
        </ol>
    </li>
</ol>

In [None]:
import pandas as pd
import statsmodels.api as sm
import matplotlib.pyplot as plt
import warnings

warnings.filterwarnings('ignore')
%matplotlib inline

We will use a dataset built into statsmodels.

In [None]:
df = sm.datasets.macrodata.load_pandas().data
df.index = pd.Index(sm.tsa.datetools.dates_from_range('1959Q1', '2009Q3'))

df.head(3)

We will focus on the `realgdp` column.

In [None]:
ax = df['realgdp'].plot()
ax.set(ylabel='Real GDP');


# <a id="HP">1) Hodrick-Prescott filter</a>
We use this filter to decompose a time series $y_t$ into two components: a trend component $\tau_t$ and a cyclical component $c_t$.

$y_t = \tau_t + c_t$

The components are determined by minimizing the following quadratic loss function, where $\lambda$ is a smoothing parameter:

$\min_{\\{ \tau_{t}\\} }\left(\sum_{t=1}^{T}c_{t}^{2}+\lambda\sum_{t=3}^{T}\left[\left(\tau_{t}-\tau_{t-1}\right)-\left(\tau_{t-1}-\tau_{t-2}\right)\right]^{2}\right)$

The $\lambda$ value penalizes variations in the growth rate of the trend component: the larger $\lambda$ is, the smoother the trend.
For quarterly data, the default value of 1600 is recommended. Use 6.25 for annual data and 129,600 for monthly data.
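
To make the minimization concrete, here is a minimal NumPy sketch that solves the first-order condition of the loss directly, $(I + \lambda D^\top D)\,\tau = y$, where $D$ is the second-difference operator. The function name `hp_filter_dense`, the helper `roughness`, and the synthetic series are assumptions made for illustration; this is a dense toy version, not the library implementation.

```python
import numpy as np

def hp_filter_dense(y, lamb=1600.0):
    """Return (cycle, trend) by solving (I + lamb * D'D) trend = y."""
    y = np.asarray(y, dtype=float)
    T = len(y)
    # D has shape (T-2, T); row i computes tau[i] - 2*tau[i+1] + tau[i+2],
    # i.e. the change in the growth rate of the trend.
    D = np.zeros((T - 2, T))
    for i in range(T - 2):
        D[i, i:i + 3] = [1.0, -2.0, 1.0]
    trend = np.linalg.solve(np.eye(T) + lamb * (D.T @ D), y)
    return y - trend, trend

def roughness(tau):
    """Sum of squared second differences (the penalty term without lambda)."""
    return float(np.sum(np.diff(tau, n=2) ** 2))

# Synthetic random walk with drift (illustrative data only)
rng = np.random.default_rng(0)
y = np.cumsum(rng.normal(loc=0.5, scale=1.0, size=80))

_, trend_low = hp_filter_dense(y, lamb=6.25)
_, trend_high = hp_filter_dense(y, lamb=1600.0)

# A larger lambda should give a smoother (less rough) trend
print(roughness(trend_low), roughness(trend_high))
```

A perfectly linear series has zero second differences, so the filter should return it unchanged as the trend, with a cycle of zero.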

In [None]:
from statsmodels.tsa.filters.hp_filter import hpfilter

gdp_cycle, gdp_trend = hpfilter(df['realgdp'], lamb=1600)

*Decomposition* means that adding the components back together recovers the original data.

In [None]:
import numpy as np
np.allclose(gdp_cycle + gdp_trend, df['realgdp'])  # tolerance-based: exact float equality can fail from rounding

In [None]:
df['realgdp'].plot(figsize=(12,6)).autoscale(tight=True)
gdp_trend.plot(label='trend')

plt.legend();

# <a id="ETS">2) ETS decomposition</a>
Statsmodels' `seasonal_decompose` attempts to isolate the individual components of a series, namely *error, trend, and seasonality* (ETS), so we can see how each component contributes to the overall behavior of the time series.

- We apply an **additive model** when the trend looks roughly linear and the seasonal and trend components appear constant over time.

- A **multiplicative model** is more appropriate when the series is increasing (or decreasing) at a non-linear rate.
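
The difference between the two models can be sketched on synthetic data. Everything below (the trend, the seasonal shapes, the variable names) is made up for illustration: in the additive series the seasonal swing stays the same size, while in the multiplicative series it grows with the level. Taking logs turns the multiplicative form into an additive one.

```python
import numpy as np

t = np.arange(48)                       # four years of monthly steps
trend = 100.0 + 2.0 * t                 # linear trend (illustrative)
seasonal_add = 10.0 * np.sin(2 * np.pi * t / 12)        # fixed-size swing
seasonal_mult = 1.0 + 0.1 * np.sin(2 * np.pi * t / 12)  # swing as a share of the level

y_add = trend + seasonal_add            # additive: y = trend + seasonal
y_mult = trend * seasonal_mult          # multiplicative: y = trend * seasonal

def swing(values):
    """Peak-to-trough size of the seasonal deviation within one year."""
    return float(values.max() - values.min())

# Seasonal deviation from the trend in the first and the last year
swing_add_first = swing((y_add - trend)[:12])
swing_add_last = swing((y_add - trend)[-12:])
swing_mult_first = swing((y_mult - trend)[:12])
swing_mult_last = swing((y_mult - trend)[-12:])

print(swing_add_first, swing_add_last)    # constant amplitude
print(swing_mult_first, swing_mult_last)  # amplitude grows with the level
```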

In [None]:
from statsmodels.tsa.seasonal import seasonal_decompose

ets = seasonal_decompose(df['realgdp'], model='add')
ets.plot();

# <a id="MA">3) Moving Average</a>

In [None]:
airline = pd.read_csv('/kaggle/input/international-airline-passengers/international-airline-passengers.csv',parse_dates=['Month'])
airline.dropna(inplace=True)
airline['Month'] = pd.to_datetime(airline['Month'])
airline.set_index('Month', inplace=True)
airline.rename(inplace=True, columns={"International airline passengers: monthly totals in thousands. Jan 49 ? Dec 60":"Thousands of Passengers"})
airline.head()

In [None]:
airline.index.freq = 'MS'

## <a id="SMA">3.1) Simple MA</a>
The simple moving average is just the mean over a rolling window.

In [None]:
airline['6-month-SMA'] = airline['Thousands of Passengers'].rolling(window=6).mean()
airline['12-month-SMA'] = airline['Thousands of Passengers'].rolling(window=12).mean()

airline.plot(figsize=(12,6));
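
As a small sanity check of what `rolling(...).mean()` computes, here is a sketch on a toy series (the values are made up). Note that the first `window - 1` entries are `NaN` unless `min_periods` is lowered.

```python
import pandas as pd

s = pd.Series([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])

# Mean over the current value and the two before it
sma3 = s.rolling(window=3).mean()

# Same window, but allow partial windows at the start of the series
sma3_partial = s.rolling(window=3, min_periods=1).mean()

print(pd.DataFrame({'value': s, 'sma3': sma3, 'sma3_partial': sma3_partial}))
```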

## <a id="EWMA">3.2) Exponential Weighted MA</a>
An SMA gives every observation in the window the same weight, no matter how old it is, which sometimes does not make much sense.
The basic SMA has some weaknesses:
* <u>Smaller windows</u> lead to <u>more noise</u> rather than signal.
* It will <u>always lag</u> by the size of the window.
* It will <u>never reach the full peak</u> or valley of the data, because of the averaging.
* It <u>does not inform</u> you about <u>possible future behavior</u>; all it really does is <u>describe trends</u> in your data.
* Extreme historical values can skew the SMA significantly.

The idea of EWMA is to give more weight to recent observations than to older ones.
The <u>amount of weight</u> applied to the most recent values depends on the <u>parameters</u> used in the EWMA and on the number of periods, given a <u>window size</u>.


___
The formula for EWMA is: $$y_t =   \frac{\sum\limits_{i=0}^t w_i x_{t-i}}{\sum\limits_{i=0}^t w_i}$$

Where $x_t$ is the input value (for our dataset, `Thousands of Passengers` at time $t$), $w_i$ is the applied weight (note that it changes with $i$, from $i=0$ to $t$), and $y_t$ is the output (for our dataset, the EWMA of `Thousands of Passengers` at time $t$).

Now the question is, <u>how do we define the weight term $w_i$?</u>

This depends on the <tt>adjust</tt> parameter you provide to the <tt>.ewm()</tt> method.

* When <tt>adjust=True</tt> (default) is used, weighted averages are calculated using weights equal to $w_i = (1 - \alpha)^i$ which gives

$$y_t = \frac{x_t + (1 - \alpha)x_{t-1} + (1 - \alpha)^2 x_{t-2} + ...
+ (1 - \alpha)^t x_{0}}{1 + (1 - \alpha) + (1 - \alpha)^2 + ...
+ (1 - \alpha)^t}$$

* When <tt>adjust=False</tt> is specified, moving averages are calculated recursively as:

$$\begin{split}y_0 &= x_0 \\
y_t &= (1 - \alpha) y_{t-1} + \alpha x_t\end{split}$$
___
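
The two definitions can be checked by hand against pandas. The sketch below implements both formulas directly; the function names `ewma_adjusted` and `ewma_recursive` and the toy values are assumptions for illustration.

```python
import numpy as np
import pandas as pd

def ewma_adjusted(x, alpha):
    """Weighted average with weights w_i = (1 - alpha)**i (adjust=True form)."""
    out = []
    for t in range(len(x)):
        weights = (1 - alpha) ** np.arange(t + 1)  # w_0, ..., w_t
        history = x[t::-1]                         # x_t, x_{t-1}, ..., x_0
        out.append(np.dot(weights, history) / weights.sum())
    return np.array(out)

def ewma_recursive(x, alpha):
    """Recursive form y_t = (1 - alpha) * y_{t-1} + alpha * x_t (adjust=False form)."""
    out = [x[0]]
    for value in x[1:]:
        out.append((1 - alpha) * out[-1] + alpha * value)
    return np.array(out)

x = np.array([3.0, 5.0, 4.0, 8.0, 7.0])  # toy input values
alpha = 0.4
s = pd.Series(x)

print(np.allclose(ewma_adjusted(x, alpha), s.ewm(alpha=alpha, adjust=True).mean()))
print(np.allclose(ewma_recursive(x, alpha), s.ewm(alpha=alpha, adjust=False).mean()))
```

The two variants differ most at the start of the series and converge as more observations accumulate.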

For the smoothing factor $\alpha$, one must have $0<\alpha \leq 1$. While it is possible to pass <em>alpha</em> directly, it’s often easier to think in terms of the <em>span</em>, <em>center of mass</em> (com) or <em>half-life</em> of an EW moment:

$$\begin{split}\alpha =
 \begin{cases}
     \frac{2}{s + 1},               & \text{for span}\ s \geq 1\\
     \frac{1}{1 + c},               & \text{for center of mass}\ c \geq 0\\
     1 - \exp\left(\frac{\ln 0.5}{h}\right), & \text{for half-life}\ h > 0
 \end{cases}\end{split}$$
 
* <strong>Span</strong> corresponds to what is commonly called an “N-day EW moving average”.
* <strong>Center of mass</strong> has a more physical interpretation and can be thought of in terms of span: $c=(s-1)/2$.
* <strong>Half-life</strong> is the period of time for the exponential weight to reduce to one half.
* <strong>Alpha</strong> specifies the smoothing factor directly.
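
The conversions above can be sketched as small helper functions and checked against pandas. The helper names and the toy series are assumptions for illustration.

```python
import numpy as np
import pandas as pd

def alpha_from_span(span):
    return 2.0 / (span + 1.0)

def alpha_from_com(com):
    return 1.0 / (1.0 + com)

def alpha_from_halflife(halflife):
    return 1.0 - np.exp(np.log(0.5) / halflife)

s = pd.Series([3.0, 5.0, 4.0, 8.0, 7.0, 6.0, 9.0])  # toy values

# span=12 and com=(12-1)/2=5.5 map to the same alpha
print(alpha_from_span(12), alpha_from_com(5.5))

# Passing span, com, or halflife should match passing the equivalent alpha
same_span = np.allclose(s.ewm(span=12).mean(),
                        s.ewm(alpha=alpha_from_span(12)).mean())
same_com = np.allclose(s.ewm(com=5.5).mean(),
                       s.ewm(alpha=alpha_from_com(5.5)).mean())
same_halflife = np.allclose(s.ewm(halflife=3).mean(),
                            s.ewm(alpha=alpha_from_halflife(3)).mean())
print(same_span, same_com, same_halflife)
```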

In [None]:
# We have to pass precisely one of the above into the .ewm() function

airline['EWMA span=12'] = airline['Thousands of Passengers'].ewm(span=12, adjust=False).mean()
airline['adjusted EWMA span=12'] = airline['Thousands of Passengers'].ewm(span=12, adjust=True).mean()

airline[['Thousands of Passengers', 'EWMA span=12', 'adjusted EWMA span=12']].plot(figsize=(12,6));

In [None]:
airline[['Thousands of Passengers','EWMA span=12','12-month-SMA']].plot(figsize=(12,6)).autoscale(axis='x',tight=True);

The above example employed <em>Simple Exponential Smoothing</em> with one smoothing factor <strong>α</strong>. Unfortunately, this technique does a poor job of forecasting when there is a trend in the data as seen above. In the next section we'll look at <em>Double</em> and <em>Triple Exponential Smoothing</em> with the Holt-Winters Methods.

## <a id="DTEWMA"> 3.3) Double & Triple Exponential Smoothing</a>

Previously, with <strong>Exponentially Weighted Moving Averages</strong> (EWMA), we applied <em>Simple Exponential Smoothing</em> using just *one smoothing factor* $\alpha$ (alpha). This failed to account for other contributing factors like <u>trend</u> and <u>seasonality.</u>

In this section we'll look at Double and Triple Exponential Smoothing with the **Holt-Winters Methods**.

-> *Recall: simple exponential smoothing*, where $y_t$ is the observed value at time $t$<br>
$$\begin{split}EMA_0 &= y_0 \\
EMA_t &= (1 - \alpha) EMA_{t-1} + \alpha y_t\end{split}$$

### 3.3.1) Double Exponential Smoothing
<strong>Holt’s linear trend method</strong>

Holt (1957) extended simple exponential smoothing to allow the forecasting of data with a trend. This method involves a forecast equation and two smoothing equations (one for the level and one for the trend):

$$\text{forecast equation} : \hat{y}_{t+h} = l_t + hb_t$$
$$\text{level equation} : l_t = \alpha y_t + (1-\alpha)(l_{t-1}+b_{t-1})$$
$$\text{trend equation} : b_t = \beta(l_t - l_{t-1}) + (1-\beta)b_{t-1}$$

- Where $l_t$ denotes an estimate of the level of the series at time $t$.
- $b_t$ denotes an estimate of the trend (slope) of the series at time $t$
- $\alpha$ is the smoothing parameter for the level, $0\le \alpha \le 1$
- $\beta$ is the smoothing parameter for the trend, $0\le \beta \le 1$

We can address different types of change (growth/decay) in the **<u>trend component</u>**.
- **Additive**: if a time series displays a straight-line sloped trend, you would use an additive adjustment.
- **Multiplicative**: if the time series displays an exponential (curved) trend, you would use a multiplicative adjustment.
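
The three equations of Holt's additive method can be sketched in a few lines of plain Python. The function name `holt_linear` and the initialization (level = first observation, trend = first difference) are assumptions for illustration; libraries typically estimate the initial states and the smoothing parameters instead.

```python
def holt_linear(y, alpha, beta, h):
    """Run Holt's additive-trend recursions and return h-step forecasts."""
    # Simple initialization (an assumption, not the only possible choice)
    level = y[0]
    trend = y[1] - y[0]
    for obs in y[1:]:
        prev_level = level
        # level equation: l_t = alpha*y_t + (1-alpha)*(l_{t-1} + b_{t-1})
        level = alpha * obs + (1 - alpha) * (prev_level + trend)
        # trend equation: b_t = beta*(l_t - l_{t-1}) + (1-beta)*b_{t-1}
        trend = beta * (level - prev_level) + (1 - beta) * trend
    # forecast equation: y_hat_{t+k} = l_t + k*b_t
    return [level + k * trend for k in range(1, h + 1)]

# On a perfectly linear series the forecasts should continue the line
forecasts = holt_linear([10.0, 12.0, 14.0, 16.0, 18.0], alpha=0.5, beta=0.5, h=3)
print(forecasts)
```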

In [None]:
airline.index.freq = 'MS'

from statsmodels.tsa.holtwinters import SimpleExpSmoothing, ExponentialSmoothing

airline['Double expo additive'] = ExponentialSmoothing(airline['Thousands of Passengers'], trend='add').fit().fittedvalues.shift(-1)
airline[['Thousands of Passengers','Double expo additive','EWMA span=12','6-month-SMA']].iloc[:24].plot(figsize=(12,6));

### 3.3.2) Triple Exponential Smoothing
<strong>Holt-Winters’ seasonal method</strong>

Holt (1957) and Winters (1960) extended Holt’s method to capture seasonality. The Holt-Winters seasonal method comprises the forecast equation and three smoothing equations: one for the level ($l_t$), one for the trend ($b_t$), and one for the seasonal component ($s_t$), with corresponding smoothing parameters $\alpha$, $\beta$ and $\gamma$. We use $m$ to denote the period of the seasonality, i.e. the number of observations per seasonal cycle (for example, $m=12$ for monthly data with a yearly pattern).

There are two variations to this method that differ in the nature of the **<u>seasonal component</u>**.
- The **additive** method is preferred when the seasonal variations are roughly constant through the series.
- The **multiplicative** method is preferred when the seasonal variations change in proportion to the level of the series.
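
For the additive variant, the recursions can be sketched as below. The equations used are the commonly stated textbook form, which this notebook does not spell out: $l_t = \alpha(y_t - s_{t-m}) + (1-\alpha)(l_{t-1}+b_{t-1})$, $b_t = \beta(l_t - l_{t-1}) + (1-\beta)b_{t-1}$, $s_t = \gamma(y_t - l_{t-1} - b_{t-1}) + (1-\gamma)s_{t-m}$. The function name and the crude initialization from the first two seasons are assumptions for illustration only.

```python
def holt_winters_additive(y, m, alpha, beta, gamma, h):
    """Additive Holt-Winters recursions; needs at least two full seasons."""
    # Crude initial states (an assumption; libraries estimate these)
    level = sum(y[:m]) / m
    trend = (sum(y[m:2 * m]) - sum(y[:m])) / m ** 2
    season = [y[i] - level for i in range(m)]   # season[t] holds s_t

    for t in range(m, len(y)):
        prev_level, prev_trend = level, trend
        s_old = season[t - m]                   # s_{t-m}
        level = alpha * (y[t] - s_old) + (1 - alpha) * (prev_level + prev_trend)
        trend = beta * (level - prev_level) + (1 - beta) * prev_trend
        season.append(gamma * (y[t] - prev_level - prev_trend) + (1 - gamma) * s_old)

    n = len(y)
    # Forecast: last level + k * trend + the most recent estimate of the matching season
    return [level + k * trend + season[n - m + (k - 1) % m] for k in range(1, h + 1)]

# A purely seasonal toy series with period 3 and no trend
toy = [1.0, 2.0, 3.0] * 3
hw_forecasts = holt_winters_additive(toy, m=3, alpha=0.5, beta=0.5, gamma=0.5, h=4)
print(hw_forecasts)
```

With this toy input the seasonal pattern is reproduced in the forecasts. On real data the result depends heavily on how the initial states and parameters are chosen.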

In [None]:
airline['Triple expo add. trend, add. seasonal'] = ExponentialSmoothing(airline['Thousands of Passengers'], trend='add', seasonal='add', seasonal_periods=12).fit().fittedvalues.shift(-1)
airline[['Thousands of Passengers','Double expo additive','EWMA span=12','Triple expo add. trend, add. seasonal']].iloc[:24].plot(figsize=(12,6));

## <a id="Damped EWMA">3.4) Damped EWMA</a>

**Holt's model : Double EWMA** <br>
Holt’s linear method produces forecasts with a constant trend (increasing or decreasing) indefinitely into the future. As a result, it tends to over-forecast, especially for longer forecast horizons.

Gardner & McKenzie (1985) introduced a parameter that *“dampens”* the trend to a flat line some time in the future. Methods that include a damped trend have proven to be very successful.

With damping parameter $0 < \phi < 1$:
$$\text{forecast equation} : \hat{y}_{t+h} = l_t + (\phi^1 + \phi^2 + ... + \phi^h) b_t$$
$$\text{level equation} : l_t = \alpha y_t + (1-\alpha)(l_{t-1}+\phi b_{t-1})$$
$$\text{trend equation} : b_t = \beta(l_t - l_{t-1}) + (1-\beta)\phi b_{t-1}$$

- If $\phi = 1$, the method is identical to Holt’s linear method.
- If $0 < \phi < 1$, the trend is dampened so that the forecast approaches a constant some time in the future. In fact, as $h\to\infty$, $\hat{y}_{t+h}$ converges to $l_t + b_t \frac{\phi}{1-\phi}$. In other words, the contribution of the trend levels off as the forecast horizon grows.
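
The limit follows from the geometric series $\phi + \phi^2 + \dots = \frac{\phi}{1-\phi}$. A short numeric sketch (the function name and the value $\phi = 0.9$ are chosen only for illustration):

```python
def damped_trend_multiplier(phi, h):
    """Return phi + phi**2 + ... + phi**h, the factor applied to b_t."""
    return sum(phi ** k for k in range(1, h + 1))

phi = 0.9
limit = phi / (1 - phi)   # value the multiplier approaches as h grows

for h in (1, 5, 20, 200):
    print(h, damped_trend_multiplier(phi, h))
print('limit', limit)

# With phi = 1 the multiplier is simply h, which recovers Holt's linear method
print(damped_trend_multiplier(1.0, 5))
```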

In [None]:
# Holt's linear method: forecast 10 years beyond the last observation
model = ExponentialSmoothing(airline['Thousands of Passengers'], trend='add').fit()
pred_10_yr = model.predict(start='1960-12-01', end='1970-12-01')

# Same model with a damped trend
model_damped = ExponentialSmoothing(airline['Thousands of Passengers'], trend='add', damped_trend=True).fit()
pred_10_yr_damped = model_damped.predict(start='1960-12-01', end='1970-12-01')

# Plot
plt.figure(figsize=(12,6))
airline['Thousands of Passengers'].iloc[-80:].plot(label='Original')
pred_10_yr.plot(label='Not damped')
pred_10_yr_damped.plot(label='Damped')
plt.legend(loc='lower right');