# Introduction to Time Series

## Autoregression

In a multiple regression model, we forecast the variable of interest using a linear combination of predictors. In an autoregression model, we forecast the variable of interest using a linear combination of past values of the variable. The term autoregression indicates that it is a regression of the variable against itself.

Thus, an autoregressive model of order $p$ can be written as

$$y_{t} = c + \phi_{1}y_{t-1} + \phi_{2}y_{t-2} + \dots + \phi_{p}y_{t-p} + \varepsilon_{t},$$

where $c$ is a constant, $\phi_1, \dots, \phi_p$ are the autoregressive coefficients, and $\varepsilon_t$ is white noise. This is like a multiple regression, but with lagged values of $y_t$ as predictors. We refer to this as an AR($p$) model.
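
To make the "regression against itself" idea concrete, here is a minimal sketch that simulates an AR(2) process and then recovers its parameters by ordinary least squares on the lagged values. The helper names `simulate_ar` and `fit_ar_ols`, the parameter values, and the Gaussian noise are all assumptions chosen for illustration; they are not part of the bike sharing analysis.

```python
import numpy as np

def simulate_ar(c, phi, n, sigma=1.0, burn_in=200, seed=0):
    """Simulate n values of y_t = c + phi_1*y_{t-1} + ... + phi_p*y_{t-p} + eps_t."""
    rng = np.random.default_rng(seed)
    p = len(phi)
    y = np.zeros(n + burn_in + p)
    eps = rng.normal(0.0, sigma, size=y.shape)  # Gaussian white noise (an assumption)
    for t in range(p, len(y)):
        # y[t-p:t][::-1] is (y_{t-1}, ..., y_{t-p}), lined up with (phi_1, ..., phi_p)
        y[t] = c + np.dot(phi, y[t - p:t][::-1]) + eps[t]
    return y[-n:]  # drop the burn-in so the zero start values have little effect

def fit_ar_ols(y, p):
    """Estimate (c, phi_1, ..., phi_p) by least squares on lagged values."""
    y = np.asarray(y, dtype=float)
    # Column k holds the lag-(k+1) values aligned with the targets y[p:]
    lags = np.column_stack([y[p - k - 1:len(y) - k - 1] for k in range(p)])
    X = np.column_stack([np.ones(len(y) - p), lags])
    coef, *_ = np.linalg.lstsq(X, y[p:], rcond=None)
    return coef

y = simulate_ar(c=2.0, phi=[0.6, -0.3], n=5000)
print(fit_ar_ols(y, p=2))  # estimates should land near c=2.0, phi_1=0.6, phi_2=-0.3
```

The design matrix is just the series shifted by one and two steps, which is why an AR($p$) model can be fitted with the same machinery as an ordinary multiple regression.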

In [None]:
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import statsmodels.api as sm

%matplotlib inline

bike_sharing_df = pd.read_csv('bike_sharing_day.csv', index_col=0)
bike_sharing_df.dteday = pd.to_datetime(bike_sharing_df.dteday)

bike_sharing_df.set_index('dteday', inplace=True)
bike_sharing_df.head()

In [None]:
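# AutoReg replaces the older AR class, which has been removed from statsmodels
from statsmodels.tsa.ar_model import AutoReg
from sklearn.metrics import r2_score

In [None]:
# Total count series; hold out the last 20 observations for testing
total_count = bike_sharing_df.cnt
train_n = len(total_count) - 20

In [None]:
# Split chronologically so the test set lies strictly after the training set
train, test = total_count[:train_n], total_count[train_n:]

In [None]:
# AutoReg needs an explicit lag order; this rule of thumb is the default
# maximum lag that the old AR class used when none was given
n_lags = int(round(12 * (len(train) / 100.) ** (1 / 4.)))
model = AutoReg(train, lags=n_lags)
model_fit = model.fit()

In [None]:
print('Lag: %s' % len(model_fit.ar_lags))
print('Coefficients: %s' % model_fit.params)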

### Make predictions

In [None]:
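# Forecast the held-out period; beyond the training sample each forecast
# is fed back in as a lagged value for the next step
predictions = model_fit.predict(start=len(train), end=len(train)+len(test)-1, dynamic=False)
# R^2 of the forecasts against the held-out values (1.0 is a perfect fit)
r2_score(test, predictions)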
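
As a rough illustration of what an out-of-sample AR forecast involves, the sketch below applies the AR($p$) equation recursively: each forecast is appended to the history and reused as a lag for the next step. It also spells out the $R^2$ formula that `r2_score` computes. The helpers `forecast_ar` and `r_squared` and the toy AR(1) numbers are hypothetical; this is not how statsmodels is implemented internally.

```python
import numpy as np

def forecast_ar(history, c, phi, steps):
    """Forecast `steps` values ahead, feeding each forecast back in as a lag."""
    phi = np.asarray(phi, dtype=float)
    p = len(phi)
    window = list(np.asarray(history, dtype=float)[-p:])  # last p values, oldest first
    forecasts = []
    for _ in range(steps):
        # Reverse the window so the most recent value pairs with phi_1
        y_hat = c + float(np.dot(phi, window[::-1]))
        forecasts.append(y_hat)
        window = window[1:] + [y_hat]  # slide the window forward by one step
    return np.array(forecasts)

def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    ss_res = np.sum((actual - predicted) ** 2)
    ss_tot = np.sum((actual - actual.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

# Toy AR(1) with c=1 and phi_1=0.5, starting from a last observed value of 4:
# the forecasts are 3.0, 2.5, 2.25, decaying toward the process mean of 2
print(forecast_ar([4.0], c=1.0, phi=[0.5], steps=3))
```

Because errors compound as forecasts are fed back in, multi-step forecasts drift toward the series mean, and $R^2$ on a held-out window can be low or even negative when the forecasts do worse than predicting the test mean.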

### Plot results

In [None]:
plt.figure(figsize=(20,5))
plt.plot(train, label='train')
plt.plot(test, label='actual')
plt.plot(predictions, label='predicted')
plt.legend()