# 🎲 A Philosophical Change in Perspective

Until now, we have only discussed the ***ways*** to calculate a moving average. Now it is time to **forecast** points.

For that, we will build on what we have learnt so far *(EWMA)*: **instead of** using it to calculate a moving average, we will use it **to forecast**. That is gonna be awesome, so let's do that!

### Using new notations
Till now we have:

# $$ \text{EWMA} = \alpha x_t + (1 - \alpha)\bar x_{t - 1}$$

###### 

Just a little change as we are ***now forecasting***:

# $$ \hat y_t = \alpha y_t + (1 - \alpha)\hat y_{t - 1}$$

###### 

The ***official*** forecasting model:

# $$ \hat y_{t+1 | t} = \alpha y_t + (1 - \alpha)\hat y_{t | t - 1}$$

- $\hat y_{t+1 | t}:$ The "*future predicted*" value at point `t + 1`, ***given*** the data up to the current point `t`
- $\hat y_{t | t - 1}:$ The "*current predicted*" value at point `t`, ***given*** the data up to the previous point `t - 1`
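To make the recursion concrete, here is a minimal pure-Python sketch of it. The function name `one_step_forecasts`, the toy values, `alpha = 0.5`, and the choice to start with $\hat y_{1 | 0} = y_1$ are all assumptions made for illustration, not something fixed by the model above.

```python
def one_step_forecasts(y, alpha, initial_forecast):
    """Return the one-step-ahead forecasts for y_1..y_n, i.e. the
    forecast for point t + 1 made after seeing point t."""
    forecasts = []
    previous = initial_forecast  # plays the role of the very first forecast
    for value in y:
        # new forecast = alpha * current observation + (1 - alpha) * old forecast
        previous = alpha * value + (1 - alpha) * previous
        forecasts.append(previous)
    return forecasts


# Toy numbers, with the first forecast assumed equal to the first observation
y = [112, 118, 132, 129, 121]
forecasts = one_step_forecasts(y, alpha=0.5, initial_forecast=y[0])
print(forecasts)  # [112.0, 115.0, 123.5, 126.25, 123.625]
```

Each entry is the forecast for the *next* point, so the last value is the forecast for a point we have not observed yet.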

I know this ↑ one is not clear at this point in time `t` 😅 (it isn't to me either), but it will become clear as we move forward.

###### 

Now let's express our model in **component form**:

#### 1️⃣ Forecast Equation
# $$\hat y_{t + h | t} = l_t $$
Where, <br>
$h=1,2,3 ...$ 
<br>Meaning the forecast for point `t + h`, i.e. `h` steps into the future. Recall the `h` value from the previous notebooks: `1. Timeseries Basics → 4. Types of tasks → 2.1 Incremental Forecast`

#### 2️⃣ Smoothing Equation
# $$l_t = \alpha y_t + (1 - \alpha)l_{t - 1} $$

Here, <br>
we just plug the value of $l_t$ from equation 2️⃣ into equation 1️⃣. It might **not** make much sense now, but here we are setting up **THE LEVEL**; later we will add other components on top of it, and building this habit now will make things smoother later.
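Spelled out for `h = 1`, the substitution is a short worked check. It uses only the symbols defined above, plus the observation that the forecast equation at time `t - 1` gives $\hat y_{t | t - 1} = l_{t - 1}$:

$$
\begin{aligned}
\hat y_{t + 1 | t} &= l_t \\
&= \alpha y_t + (1 - \alpha) l_{t - 1} \\
&= \alpha y_t + (1 - \alpha)\hat y_{t | t - 1}
\end{aligned}
$$

The last line is exactly the ***official*** forecasting model from earlier, so the component form is the same model written in two pieces.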

> **NOTE / NOTICE** that the original indices are back when we **represent the equation** in component form, i.e. instead of writing $l_{t + 1 | t}$ we have just written $l_t$, as we did in the equation under "*Just a little change as we are now forecasting*" in this notebook.

# $l_t$ is the Level.
This is the first time that we are being introduced to the ***jargon*** of time series. It is simple for now and it will get more complex later. Don't worry, we will get there easily.

The term **level** first appeared at this path: `2. Exponential Smoothing and ETS → 1. Moving and Exp`, where we discussed that the level is the *constant* value around which the values of the time series fluctuate. 

> Thus, **the level** can be thought of as the moving average which represents the mean of the fluctuations in that period.

Here, in this simple SES, we will **only be able to predict** the mean level one step ahead in time, because the forecasts for `h` = 2, 3, 4... will be just the same value: $\hat y_{t + h | t} = l_t$ for every `h`.
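The flat forecast is easy to see in code. Below is a minimal sketch of the two component equations in pure Python. The helper names `ses_level` and `ses_forecast`, the toy values, `alpha = 0.5`, and the starting level (assumed equal to the first observation) are illustrative assumptions.

```python
def ses_level(y, alpha, initial_level):
    """Smoothing equation: fold the observations into a single level."""
    level = initial_level
    for value in y:
        level = alpha * value + (1 - alpha) * level
    return level


def ses_forecast(y, alpha, initial_level, horizon):
    """Forecast equation: every step h = 1..horizon gets the same final level."""
    level = ses_level(y, alpha, initial_level)
    return [level] * horizon


# Toy numbers, with the starting level assumed equal to the first observation
y = [112, 118, 132, 129, 121]
forecasts = ses_forecast(y, alpha=0.5, initial_level=y[0], horizon=3)
print(forecasts)  # [123.625, 123.625, 123.625]
```

No matter how large `horizon` is, the forecast never moves away from the last level, which is why SES is only really informative one step ahead.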

###### 

## 👨‍💻 Great Goin'
Let's now have a look at some code, and this time we will be using `statsmodels`.

#### A Bit of Skeleton With Statsmodels
Till now, Aayush, you have worked with `sklearn` and used `fit` and `predict`. Here, the flow changes a bit. Let's have a look.

```python
# Model initialization 
model = someModel(param_1=0.1, param_2="euclidean") # sklearn
model = someModel(data) # statsmodels

# Fitting
model.fit(data) # sklearn
model.fit(param_1=0.1, param_2="euclidean") # statsmodels
```
See? Here in `statsmodels` the flow is a bit flipped.

In [2]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

# This is the model that we are going to work with
from statsmodels.tsa.holtwinters import SimpleExpSmoothing

In [3]:
passengers = pd.read_csv("../data/airline_passengers.csv", index_col=0, parse_dates=True)

In [5]:
passengers.head()

| Month | Passengers |
|---|---|
| 1949-01-01 | 112 |
| 1949-02-01 | 118 |
| 1949-03-01 | 132 |
| 1949-04-01 | 129 |
| 1949-05-01 | 121 |


In [6]:
ses = SimpleExpSmoothing(passengers)

  self._init_dates(dates, freq)


So, you **see the warning**, right? It is telling us that the model doesn't know what frequency the data has. Is our data spaced monthly? Weekly? Yearly? Every 3 years? What?

Thus, *by default* it works the frequency out on its own from the index (monthly here). Rather than relying on that, we should tell it the frequency that the data really has.

In [7]:
# See that the frequency is NONE
passengers.index

DatetimeIndex(['1949-01-01', '1949-02-01', '1949-03-01', '1949-04-01',
               '1949-05-01', '1949-06-01', '1949-07-01', '1949-08-01',
               '1949-09-01', '1949-10-01',
               ...
               '1960-03-01', '1960-04-01', '1960-05-01', '1960-06-01',
               '1960-07-01', '1960-08-01', '1960-09-01', '1960-10-01',
               '1960-11-01', '1960-12-01'],
              dtype='datetime64[ns]', name='Month', length=144, freq=None)

In [15]:
# You can try setting "D", "Y" etc., but that will raise an error because pandas
# checks whether the index `really` has that frequency, so better to use the correct one
passengers.index.freq = "MS"

Here `MS` means **Month Start** and `M` means **Month End** (pandas 2.2 and later spell month end as `ME`), so keep that in mind. Or just refer to your [Time in Pandas](https://github.com/AayushSameerShah/Pandas_Book/tree/main/2.%20Pandas/6.%20TIME) repository!
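The frequency check can be tried on a tiny index without loading the CSV. This is a sketch on made-up toy data; the names `frame`, `monthly`, `inferred`, and `rejected` are just for illustration, and `asfreq` is shown as an alternative to assigning `index.freq` directly.

```python
import pandas as pd

# Toy monthly index built from plain strings, so it starts without a frequency
index = pd.DatetimeIndex(["1949-01-01", "1949-02-01", "1949-03-01", "1949-04-01"])
frame = pd.DataFrame({"Passengers": [112, 118, 132, 129]}, index=index)

print(frame.index.freq)  # None

# pandas can guess the frequency from the spacing of the dates
inferred = pd.infer_freq(frame.index)
print(inferred)  # MS

# Option 1: set the frequency on the existing index in place
frame.index.freq = "MS"

# Option 2: asfreq returns a new frame whose index carries the frequency
monthly = frame.asfreq("MS")

# A frequency that does not match the dates is rejected
try:
    frame.index.freq = "D"
    rejected = False
except ValueError:
    rejected = True
print(rejected)  # True
```

Either option gives `statsmodels` an index with a known frequency, so the model no longer has to guess.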

In [19]:
# Build the model again so that it picks up the `freq` change
ses = SimpleExpSmoothing(passengers,
                        initialization_method="legacy-heuristic")