# Forecasting

We want to forecast future observations based on past observations. This section covers:

-   Naive methods
-   Exponential Smoothing models
-   BATS and TBATS
-   ARIMA/SARIMA models
-   How to set up a one-step-ahead forecast

In [None]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import statsmodels.api as sm

We use a dataset of monthly totals (in thousands) of international
airline passengers between 1949 and 1960:

In [None]:
# Load the AirPassengers dataset
data = sm.datasets.get_rdataset("AirPassengers").data

# Convert to datetime and set as index
data['Month'] = pd.date_range(start='1949-01-01', periods=len(data), freq='MS')
data.set_index('Month', inplace=True)

# Convert passengers column to time series
air_passengers = data['value']

# Create training and validation sets
training = air_passengers['1949-01-01':'1956-12-01']
validation = air_passengers['1957-01-01':]

## 1. Naive Methods

Any forecasting method should be evaluated against a naive method. This
checks that the effort put into a more complex model is worth it in
terms of performance.

The simplest of all methods is the simple naive forecast: the forecast
for tomorrow is the value we observe today.
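As a minimal sketch of the simple naive forecast, the block below repeats the last training observation over the whole forecast horizon and computes the MAPE by hand. The arrays `train` and `valid` hold hypothetical values chosen for illustration; they are not the AirPassengers data.

```python
import numpy as np

# Hypothetical values standing in for the end of a training set and a validation set
train = np.array([112.0, 118.0, 132.0, 129.0])
valid = np.array([121.0, 135.0, 148.0])

# Simple naive forecast: repeat the last training observation over the horizon
simple_naive = np.full(len(valid), train[-1])

# MAPE by hand: mean of |actual - forecast| / |actual|, expressed in percent
mape = np.mean(np.abs(valid - simple_naive) / np.abs(valid)) * 100

print(simple_naive)
print(f"MAPE: {mape:.2f}%")
```

With the notebook's own data, the same idea would apply to `training` and `validation`, using the last value of `training` as the forecast.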

Another approach, the seasonal naive forecast, is a little more
"complex": the forecast for a given period is the value observed one
season earlier (a week, a month, or a year before, depending on the
seasonal period of the data).

Here is how to do a seasonal naive forecast:

In [None]:
from sklearn.metrics import mean_absolute_percentage_error

# Seasonal naive forecast: repeat last 12 months of training for the forecast horizon
season_length = 12
h = len(validation)

# Get the last season from training data
last_season = training[-season_length:]

# Repeat last season to match the forecast horizon
naive_forecast = np.tile(last_season.values, h // season_length + 1)[:h]

# Compute MAPE
mape = mean_absolute_percentage_error(validation, naive_forecast) * 100
print(f'MAPE: {mape:.2f}%')

This gives us a **MAPE of 19.5%**.

In [None]:
# Plot the full original series
plt.figure(figsize=(10, 6))
plt.plot(air_passengers, color='blue', label='Actual', linewidth=1)

# Create a datetime index for the forecast
forecast_index = validation.index
plt.plot(forecast_index, naive_forecast, color='red', label='Seasonal Naive Forecast', linewidth=2)

# Add labels and title
plt.xlabel('Year')
plt.ylabel('Passengers')
plt.title('Seasonal Naive Forecast')
plt.legend()
plt.grid(True)
plt.show()

What happened in the last year of the training data is repeated as the
forecast for the entire validation set.

## 2. Exponential Smoothing

In exponential smoothing we give a declining weight to observations: the
more recent an observation, the more importance it will have in our
forecast.
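To make the declining weights concrete, here is a minimal sketch of simple exponential smoothing (no trend, no seasonality) in plain Python. The function names, the smoothing parameter value `alpha = 0.5`, the input values, and the choice to initialise the level at the first observation are all assumptions made for illustration; `statsmodels` estimates these quantities itself.

```python
def simple_exponential_smoothing(y, alpha):
    """Return the one-step-ahead forecast after smoothing the series y."""
    level = y[0]  # assumption: initialise the level at the first observation
    for obs in y[1:]:
        # New level = weighted average of the latest observation and the old level
        level = alpha * obs + (1 - alpha) * level
    return level


def implied_weights(n, alpha):
    """Weight on the observation k steps back (k = 0 is the most recent)."""
    # Ignores the initial level, whose weight depends on the initialisation
    return [alpha * (1 - alpha) ** k for k in range(n)]


y = [10.0, 12.0, 14.0]  # hypothetical observations
forecast = simple_exponential_smoothing(y, alpha=0.5)
weights = implied_weights(3, alpha=0.5)

print(forecast)
print(weights)
```

A larger `alpha` puts more weight on recent observations and makes the forecast react faster; a smaller `alpha` smooths more heavily.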

Further components can also be added: a trend component
(**Holt method**), or a trend and a seasonal component (**Holt-Winters**).

### Holt-Winters method

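In R, this is done with `forecast::ets()`, which chooses the model form
(additive or multiplicative) automatically if it is not specified. In
Python, we use `ExponentialSmoothing` from `statsmodels`, where the trend
and seasonal types are set explicitly. The general form of the model is: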

$$
y_t = f(S_t, T_t, E_t)
$$

-   S: seasonal component
-   T: trend component
-   E: error (remainder)
-   **Additive model**: $S_t + T_t + E_t$
-   **Multiplicative model**: $S_t \cdot T_t \cdot E_t$
-   model: error type \| trend type \| season type (A = additive; M =
    multiplicative; Z = automatically selected, the default)
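The difference between the two forms can be sketched with synthetic components. In the block below, the trend, the seasonal patterns, and all numeric values are hypothetical, and the error term is left out so that the pattern stays visible. The point is only that an additive seasonal swing stays constant while a multiplicative one grows with the level of the series.

```python
import numpy as np

# Hypothetical components for 3 years of monthly data (illustrative values only)
n_years, period = 3, 12
t = np.arange(n_years * period)
trend = 100.0 + 2.0 * t                                     # T_t: linear trend
seasonal_add = 10.0 * np.sin(2 * np.pi * t / period)        # S_t in units of y
seasonal_mul = 1.0 + 0.1 * np.sin(2 * np.pi * t / period)   # S_t as a ratio around 1

# Error omitted: E_t = 0 in the additive form, E_t = 1 in the multiplicative form
y_add = trend + seasonal_add
y_mul = trend * seasonal_mul


def yearly_swing(y):
    """Peak-to-trough range of the detrended series within each year."""
    detrended = (y - trend).reshape(n_years, period)
    return detrended.max(axis=1) - detrended.min(axis=1)


swing_add = yearly_swing(y_add)
swing_mul = yearly_swing(y_mul)

print(swing_add)  # same swing every year
print(swing_mul)  # swing grows as the trend rises
```

When the seasonal swings of a series widen as its level rises, a multiplicative seasonal component (`seasonal='mul'` in `ExponentialSmoothing`) may be worth comparing against the additive one.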

In [None]:
from statsmodels.tsa.holtwinters import ExponentialSmoothing

# Fit a Holt-Winters model with additive trend and additive seasonality
ets_model = ExponentialSmoothing(
    training,
    trend='add',
    seasonal='add',
    seasonal_periods=12
).fit(optimized=True)

# Forecast
ets_forecast = ets_model.forecast(len(validation))

# Compute MAPE
ets_mape = mean_absolute_percentage_error(validation, ets_forecast) * 100
print(f'MAPE (ETS): {ets_mape:.2f}%')

In [None]:
# Plot the full original series
plt.figure(figsize=(10, 6))
plt.plot(air_passengers, color='blue', label='Actual', linewidth=1)

# Reuse the validation index as the datetime index for the forecast
forecast_index = validation.index
plt.plot(forecast_index, ets_forecast, color='red', label='ETS Forecast', linewidth=2)

# Add labels and title
plt.xlabel('Year')
plt.ylabel('Passengers')
plt.title('Exponential smoothing - additive model')
plt.legend()
plt.grid(True)
plt.show()