In [9]:
%load_ext autoreload
%autoreload 2

In [12]:
import sys
sys.path.append('..')
import warnings
warnings.filterwarnings('ignore')

## 5.2 Some simple forecast methods
Four simple methods are commonly used as baselines:
- **Mean method**: all forecasts of future values equal the average (or "mean") of the historical data. If the historical data are denoted by $y_1,\ldots,y_T$, then the forecast is $\hat{y}_{T+h|T} = \bar{y} = (y_1+\cdots+y_T)/T$.
- **Naive method**: all forecasts equal the value of the last observation, i.e. $\hat{y}_{T+h|T} = y_T$. This method works remarkably well for many economic and financial time series. Because a naive forecast is optimal when the data follow a random walk, these are also called *random walk forecasts*.
- **Seasonal naive method**: a similar method, useful for highly seasonal data. Each forecast equals the last observed value from the same season (e.g. the same month of the previous year). Formally, the forecast for time $T+h$ is $\hat{y}_{T+h|T} = y_{T+h-m(k+1)}$, where $m$ is the seasonal period and $k$ is the integer part of $(h-1)/m$ (i.e. the number of complete years in the forecast period prior to time $T+h$).
- **Drift method**: a variation on the naive method that allows the forecasts to increase or decrease over time, where the amount of change per period (called the **drift**) is set to the average change in the historical data. Thus the forecast for time $T+h$ is given by
$$
\hat{y}_{T+h|T} = y_T + \frac{h}{T-1}\sum_{t=2}^T(y_t-y_{t-1}) = y_T + h\left(\frac{y_T-y_1}{T-1}\right).
$$ 
This is equivalent to drawing a line between the first and last observations, and extrapolating it into the future.
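
The four formulas above can be sketched directly in NumPy. This is a minimal illustration, not the implementation used later in this notebook: the function names (`mean_forecast`, `naive_forecast`, `seasonal_naive_forecast`, `drift_forecast`) and the toy series `y` are assumptions made up for the example, with `h` the forecast horizon and `m` the seasonal period as defined above.

```python
import numpy as np

def mean_forecast(y, h):
    """Mean method: every forecast equals the historical average."""
    y = np.asarray(y, dtype=float)
    return np.full(h, y.mean())

def naive_forecast(y, h):
    """Naive method: every forecast equals the last observation y_T."""
    y = np.asarray(y, dtype=float)
    return np.full(h, y[-1])

def seasonal_naive_forecast(y, h, m):
    """Seasonal naive method: y_{T+h-m(k+1)}, with k = floor((h-1)/m)."""
    y = np.asarray(y, dtype=float)
    T = len(y)
    steps = np.arange(1, h + 1)          # forecast steps 1, ..., h
    k = (steps - 1) // m                 # complete seasonal cycles before each step
    idx = T + steps - m * (k + 1) - 1    # subtract 1 to move from 1-based to 0-based indexing
    return y[idx]

def drift_forecast(y, h):
    """Drift method: extend the line through the first and last observations."""
    y = np.asarray(y, dtype=float)
    T = len(y)
    slope = (y[-1] - y[0]) / (T - 1)     # average change per period
    return y[-1] + slope * np.arange(1, h + 1)

# Hypothetical toy series: two "years" of quarterly data (m = 4).
y = [10.0, 20.0, 30.0, 40.0, 12.0, 22.0, 32.0, 42.0]

print(mean_forecast(y, 3))               # three copies of the mean, 26.0
print(naive_forecast(y, 2))              # two copies of the last value, 42.0
print(seasonal_naive_forecast(y, 6, 4))  # last year's quarters, wrapping after 4 steps
print(drift_forecast(y, 2))              # 42 + h * (42 - 10) / 7 for h = 1, 2
```

With `h = 6` and `m = 4`, the seasonal naive forecast repeats the last four observations and then starts over, which is what the $k$ term in the formula encodes.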

In [1]:
import statsmodels.api as sm

In [3]:
import pandas as pd

In [13]:
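# Load the Australian production data and index it by date.
df = pd.read_csv('../data/tsibbledata/aus_production.csv')
aus_production = (df
    # Remove the space in each 'Quarter' string before parsing it as a date.
    .assign(Date=pd.to_datetime(df['Quarter'].str.replace(' ', '')))
    .set_index('Date')
)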

In [6]:
# Label-based slice on the DatetimeIndex; both endpoints are included.
bricks = aus_production.loc['1970-01-01':'2004-01-01']