In [None]:
import warnings
warnings.filterwarnings('ignore')

## QBUS3850 Lab 4 (ARIMA Models) Task

In this tutorial we will work on the Minneapolis traffic data available on Canvas. This data contains daily observations of traffic volume and was originally sourced from John Hogue who made it available at the [UCI machine learning repository](https://archive.ics.uci.edu/ml/datasets/Metro+Interstate+Traffic+Volume). Some post-processing of the data has been carried out.

Your task is to use all data up to 2018-07-31 to forecast the next two months (August and September 2018). You will use the following models:

- An AR(2)
- An MA(2)
- An ARIMA selected by auto_arima
- A seasonal ARIMA selected by auto_arima 
- A regression with holiday dummies and errors following a seasonal ARIMA selected by auto_arima
- A regression with holiday dummies and two pairs of (i.e. four) Fourier terms and errors following a seasonal ARIMA selected by auto_arima

After estimating each model, plot the forecasts together with some training data and the test data. Comment on how the forecasts reflect the properties of each model.

### Preliminary analysis

First import and plot the data

In [None]:
import pandas as pd
import matplotlib.pyplot as plt

# read csv and plot the data

Weekly effects are apparent. Because the data cover only a short period, the variance does not grow over time, so there is no need for a log transformation.

Although an intercept and the Holiday dummies are included in the dataset, Fourier terms are not. It is best to compute these now.

In [None]:
import numpy as np
from datetime import datetime
#Day of year
dat['DayOfYear'] = dat.loc[:,'date'].dt.dayofyear
#Need to compute days in year. These are not the same since 2016 is a leap year
dat['Year']= dat.loc[:,'date'].dt.year
dat['DaysInYear']=365.0
dat.loc[dat['Year']==2016,'DaysInYear']=366.0
# Now compute Fourier terms
dat['FourS1']=np.sin(2*np.pi*(dat['DayOfYear']/dat['DaysInYear']))
dat['FourC1']=np.cos(2*np.pi*(dat['DayOfYear']/dat['DaysInYear']))
dat['FourS2']=np.sin(2*2*np.pi*(dat['DayOfYear']/dat['DaysInYear']))
dat['FourC2']=np.cos(2*2*np.pi*(dat['DayOfYear']/dat['DaysInYear']))
          
#Split out training and test data
train = dat.loc[(dat['date']<='2018-07-31')]
test = dat.loc[(dat['date']>'2018-07-31')]
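
The four Fourier columns above follow one pattern, so they can also be generated in a loop. The following is a minimal sketch rather than part of the original lab: `fourier_terms` is a hypothetical helper, and the example inputs are made up for illustration.

```python
import numpy as np

def fourier_terms(day_of_year, days_in_year, n_pairs=2):
    """Return a dict of sine/cosine columns for harmonics 1..n_pairs.

    Hypothetical helper: the keys mirror the FourS1, FourC1, ... names used above.
    """
    frac = np.asarray(day_of_year, dtype=float) / np.asarray(days_in_year, dtype=float)
    terms = {}
    for k in range(1, n_pairs + 1):
        terms[f"FourS{k}"] = np.sin(2 * np.pi * k * frac)
        terms[f"FourC{k}"] = np.cos(2 * np.pi * k * frac)
    return terms

# Illustrative input: day 1 and the final day of a leap year (366 days)
terms = fourier_terms([1, 366], [366.0, 366.0], n_pairs=2)
print(sorted(terms))
```

Assuming `dat` already has the `DayOfYear` and `DaysInYear` columns, the result could be attached with `dat.assign(**fourier_terms(dat['DayOfYear'], dat['DaysInYear']))`.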

### Autoregressive Model (AR2)

Fit an AR(2) model

In [None]:
# fit AR(2) and print summary

In [None]:
fc_ar2 = fitar2.forecast(len(test))
# plot train, test, and forecast in the same graph


fig, ax = plt.subplots(1,figsize=(25,6))
end = train.loc[(train['date']>'2018-05-31')]
ax.plot(end['date'],end['Traffic'],color='black')
ax.plot(test['date'],test['Traffic'],color='gray')
ax.plot(test['date'],fc_ar2,color='orange')
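
When commenting on the AR(2) forecasts, it helps to recall how they are built: each forecast feeds the previous two values back into the AR equation, so for a stationary model the path decays towards the process mean. Below is a rough sketch of that recursion. The coefficients and starting values are made up for illustration and are not estimates from the traffic data.

```python
def ar2_forecast(last_two, const, phi1, phi2, horizon):
    """Iterate y_t = const + phi1*y_{t-1} + phi2*y_{t-2} with future shocks set to zero."""
    y_prev2, y_prev1 = last_two
    path = []
    for _ in range(horizon):
        y_next = const + phi1 * y_prev1 + phi2 * y_prev2
        path.append(y_next)
        y_prev2, y_prev1 = y_prev1, y_next
    return path

# Assumed, illustrative parameters (they satisfy the AR(2) stationarity conditions)
const, phi1, phi2 = 20.0, 0.5, 0.3
mean = const / (1 - phi1 - phi2)  # long-run mean implied by the parameters
path = ar2_forecast(last_two=(130.0, 140.0), const=const, phi1=phi1, phi2=phi2, horizon=60)
print(round(path[0], 2), round(path[-1], 2), round(mean, 2))
```

The early forecasts still depend on the last two observations, while the later ones are essentially the mean, which is why an AR(2) cannot reproduce the weekly pattern far into the test period.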

### MA(2)

Fit an MA(2) model

In [None]:
# TODO: fit MA(2) and print summary

In [None]:
fc_ma2 = fitma2.forecast(len(test))

# TODO: plot train, test, and forecast in the same graph
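
An MA(2) has an even shorter memory: its forecasts use only the last two residuals, so from the third step onwards the forecast is just the mean. The sketch below illustrates this with assumed parameter values and residuals; none of the numbers come from the fitted model.

```python
def ma2_forecast(mean, theta1, theta2, last_two_resid, horizon):
    """Forecast y_t = mean + e_t + theta1*e_{t-1} + theta2*e_{t-2} with future shocks set to zero."""
    e_prev2, e_prev1 = last_two_resid  # residuals at times T-1 and T
    path = []
    for h in range(1, horizon + 1):
        if h == 1:
            fc = mean + theta1 * e_prev1 + theta2 * e_prev2
        elif h == 2:
            fc = mean + theta2 * e_prev1
        else:
            fc = mean  # no observed residuals remain in the equation
        path.append(fc)
    return path

# Assumed, illustrative values
ma_path = ma2_forecast(mean=100.0, theta1=0.6, theta2=0.3,
                       last_two_resid=(-5.0, 10.0), horizon=5)
print(ma_path)
```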

We can plot results from the AR and MA together

In [None]:
# TODO: Plot AR and MA together 

### Auto arima without seasonality

In [None]:
import numpy as np
from statsforecast.models import auto_arima
from statsforecast.arima import auto_arima_f
train_arr = train.to_numpy(dtype='float')
out = auto_arima_f(train_arr[:,1],seasonal=False)
print(out['arma'])

Auto arima suggests an ARIMA(3,1,3)

In [None]:
# TODO: fit an arima model with the suggested order

In [None]:
fc_aans = fitaans.forecast(len(test))
fig, ax = plt.subplots(1,figsize=(25,6))
ax.plot(end['date'],end['Traffic'],color='black')
ax.plot(test['date'],test['Traffic'],color='gray')
ax.plot(test['date'],fc_aans,color='green')


### Auto arima with weekly seasonality

In [None]:
out = auto_arima_f(train_arr[:,1],seasonal=True, period=7)
print(out['arma'])

Recall that in the `arma` output
- the first element is the (non-seasonal) AR order
- the second element is the (non-seasonal) MA order
- the third element is the seasonal AR order
- the fourth element is the seasonal MA order
- the fifth element is the period
- the sixth element is the order of non-seasonal differencing
- the seventh element is the order of seasonal differencing.


Auto arima suggests a SARIMA (0,0,3)(2,1,1)[7]
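
The mapping from the `arma` output to the `order` and `seasonal_order` arguments can be written as a small function. This is a sketch assuming the seven-element layout listed above; `decode_arma` is a hypothetical helper, and the input tuple is the one that would correspond to the SARIMA(0,0,3)(2,1,1)[7] just mentioned.

```python
def decode_arma(arma):
    """Split a 7-element 'arma' tuple into (order, seasonal_order)."""
    p, q, P, Q, period, d, D = arma
    return (p, d, q), (P, D, Q, period)

# Tuple consistent with SARIMA(0,0,3)(2,1,1)[7]
order, seasonal_order = decode_arma((0, 3, 2, 1, 7, 0, 1))
print(order, seasonal_order)
```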

In [None]:
# TODO: fit an arima model with the suggested seasonal order

In [None]:
fc_aas = fitaas.forecast(len(test))
fig, ax = plt.subplots(1,figsize=(25,6))
ax.plot(end['date'],end['Traffic'],color='black')
ax.plot(test['date'],test['Traffic'],color='gray')
ax.plot(test['date'],fc_aas,color='orange')
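
The seasonal differencing order of 1 means the model works with $y_t - y_{t-7}$ rather than $y_t$ itself, which is what lets the forecasts carry the weekly pattern forward. As a small illustration on synthetic data (not the traffic series; the day-of-week levels and noise scale are made up), a lag-7 difference removes a fixed weekly pattern:

```python
import numpy as np

rng = np.random.default_rng(0)
weekly_pattern = np.array([5.0, 6.0, 6.5, 6.5, 7.0, 4.0, 3.0])  # made-up day-of-week levels
n_weeks = 20
noise = rng.normal(scale=0.2, size=7 * n_weeks)
y = np.tile(weekly_pattern, n_weeks) + noise

# Seasonal difference at lag 7: y_t - y_{t-7}
seasonal_diff = y[7:] - y[:-7]

# The differenced series varies much less because the weekly pattern cancels
print(round(y.std(), 2), round(seasonal_diff.std(), 2))
```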


### Auto arima with seasonality and holiday dummies 

In [None]:
out = auto_arima_f(train_arr[:,1],xreg=train_arr[:,2:3],seasonal=True, period=7)
print(out['arma'])

In [None]:
import statsmodels.api as sm  # needed here if not already imported in an earlier cell
fitaash = sm.tsa.arima.model.ARIMA(train['Traffic'],exog=train[['Int','Holiday']],order = (0,0,3),seasonal_order=(2,1,1,7)).fit()
fitaash.summary()

In [None]:
# TODO: forecast with the holiday dummies as exog, and plot against our previous forecast

3 September 2018 was Labor Day (a U.S. public holiday). Zoom in on that day.

In [None]:
fig, ax = plt.subplots(1,figsize=(25,6))
ax.plot(test['date'].iloc[30:36],fc_aas.iloc[30:36],color='orange')
ax.plot(test['date'].iloc[30:36],fc_aash.iloc[30:36],color='blue')
ax.plot(test['date'].iloc[30:36],test['Traffic'].iloc[30:36],color='gray')
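
The slice `iloc[30:36]` was chosen so that Labor Day falls inside the window. A quick check of the position, assuming the test set starts on 2018-08-01 and has one row per day with no gaps:

```python
from datetime import date

test_start = date(2018, 8, 1)   # assumed first day of the test set
labor_day = date(2018, 9, 3)

pos = (labor_day - test_start).days  # zero-based position within the test set
window = range(30, 36)               # positions selected by .iloc[30:36]
print(pos, pos in window)
```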

### Model with Fourier terms

In [None]:
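# Select the regressors by name: the positional slice 2:7 stops before the Fourier columns, which were appended last
xreg_f = train[['Holiday','FourS1','FourC1','FourS2','FourC2']].to_numpy(dtype='float')
out = auto_arima_f(train_arr[:,1],xreg=xreg_f,seasonal=True, period=7)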
print(out['arma'])

In [None]:
fitaashf = sm.tsa.arima.model.ARIMA(train['Traffic'],exog=train[['Int','Holiday','FourS1','FourC1','FourS2','FourC2']],order = (1,0,0),seasonal_order=(2,1,1,7)).fit()
fitaashf.summary()

In [None]:
future_x = test[['Int' , 'Holiday','FourS1','FourC1','FourS2','FourC2']].to_numpy(dtype='float')
fc_aashf = fitaashf.forecast(len(test),exog = future_x)
fig, ax = plt.subplots(1,figsize=(25,6))
ax.plot(end['date'],end['Traffic'],color='black')
ax.plot(test['date'],test['Traffic'],color='gray')
ax.plot(test['date'],fc_aas,color='orange')
ax.plot(test['date'],fc_aashf,color='green')