# Time Series

## Introduction 

So far we have studied data that is static in time. This was important because many models assume that observations are independent of one another. With time-ordered data, however, this assumption often no longer holds. For example, the temperature observed today is not independent of the temperature yesterday, and today's stock prices are related to yesterday's stock prices. In this lesson we will explore how to deal with data that contains such relationships.



## Time Series Decomposition

One of the ways to overcome the issues caused by having a relationship between the observations is to decompose the data into components. Typically, we split the data into two types of components - **systematic** and **non-systematic.**


*   Systematic components show consistency or recurrence and can therefore be described and modeled.
*   Non-systematic components cannot be modeled.

We can typically decompose a time series into 4 components: 3 systematic components and 1 noise component:

1.  The level of the series (i.e. its average value);
2.  The trend of the series: the long-run increase or decrease in its values;
3.  The seasonality, or cyclical pattern, of the series (e.g. sales of summer clothing drop during the winter);
4.  The noise: the random variation in our data.

Written mathematically, an additive decomposition of the time series looks as follows:

$$y_{t} = T_{t} + S_{t} + R_{t}.$$

Thus, the value of $y$ at time $t$ consists of $T_{t}$, the trend (which here also absorbs the level), $S_{t}$, the seasonality, and $R_{t}$, the residual. The residual is what is left once we remove the trend and the seasonality, so it plays the role of the noise component. Residuals can be useful in more advanced time series analysis, but we will not pursue them any further here.
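To make the additive decomposition concrete, here is a minimal sketch that splits a series into trend, seasonal and residual parts by hand. Everything in it is an assumption made for illustration: the synthetic series, the period of 7, and the choice of a centered moving average as the trend estimate. It is not meant to reproduce what any particular library does internally.

```python
import numpy as np
import pandas as pd

# Synthetic series (assumed for illustration): linear trend + period-7 seasonality + noise.
rng = np.random.default_rng(0)
n_obs, period = 140, 7
t = np.arange(n_obs)
y = pd.Series(0.5 * t + 10 * np.sin(2 * np.pi * t / period) + rng.normal(0, 1, n_obs))

# Trend T_t: centered moving average over one full period (NaN at both edges).
trend_est = y.rolling(period, center=True).mean()

# Seasonal S_t: average detrended value at each position in the cycle, centered on zero.
position = t % period
detrended = y - trend_est
seasonal_means = detrended.groupby(position).mean()
seasonal_means -= seasonal_means.mean()
seasonal_est = pd.Series(seasonal_means.loc[position].to_numpy(), index=y.index)

# Residual R_t: whatever is left after removing trend and seasonality.
resid_est = y - trend_est - seasonal_est
```

By construction, `trend_est + seasonal_est + resid_est` adds back up to `y` wherever the trend estimate is defined.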





### Example: The Trend

The trend of a time series looks like this:

![alt text](https://otexts.com/fpp2/fpp_files/figure-html/elecequip-trend-1.png)

It simply shows the overall movement of the time series, thereby disregarding seasonality or major fluctuations (anomalies). 

### Example: Time Series Decomposition in Python

We will focus on the statsmodels library for modeling and plotting time series data in Python. statsmodels contains a function called seasonal_decompose that will allow us to plot the decomposed time series data.

For our example, we will use the occupancy dataset.

In [None]:
import pandas as pd

In [None]:
occupancy = pd.read_csv('https://raw.githubusercontent.com/loukjsmalbil/datasets_ws/master/occupancy.csv')
occupancy.head()

To plot this data, we must make sure our index is a time series with a known frequency, because time series analysis requires the observations to be equally spaced. In the code below, we change the type of the date column to datetime and set the index to the date column. Our frequency is 1 hour. Even though we can see that the difference between observations is 1 hour, pandas does not infer it here, so we need to specify it ourselves. To read more about frequencies, see the pandas documentation on offset aliases.

In [None]:
occupancy.date = pd.to_datetime(occupancy.date)
occupancy.index = pd.DatetimeIndex(occupancy.date, freq='H')
occupancy.index
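If a dataset is not equally spaced, setting a frequency directly will not work. The sketch below shows one possible way to detect and repair a gap. The timestamps and readings are hypothetical, and filling the gap by linear interpolation is an assumption that may not suit every dataset.

```python
import pandas as pd

# Hypothetical hourly readings with the 03:00 observation missing.
stamps = pd.to_datetime([
    "2015-02-02 00:00", "2015-02-02 01:00", "2015-02-02 02:00",
    "2015-02-02 04:00", "2015-02-02 05:00",
])
readings = pd.Series([400.0, 410.0, 405.0, 420.0, 415.0], index=stamps)

# Equal spacing means every gap between consecutive timestamps is identical.
gaps = readings.index.to_series().diff().dropna()
is_regular = gaps.nunique() == 1

# Rebuild a complete hourly index and fill the missing reading by linear interpolation.
step = pd.Timedelta(hours=1)
full_index = pd.date_range(stamps.min(), stamps.max(), freq=step)
regular = readings.reindex(full_index).interpolate()
```

After this step `regular` has one row per hour, so its index carries a well-defined frequency.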

Now we can plot the decomposed time series.

In [None]:
import statsmodels.api as sm

In [None]:
res = sm.tsa.seasonal_decompose(occupancy.CO2)
resplot = res.plot()

Let us make a plot that is easier to read.

In [None]:
from matplotlib import pyplot
from statsmodels.tsa.seasonal import seasonal_decompose
import matplotlib.pyplot as plt
%matplotlib inline

In [None]:
fig, (ax1,ax2,ax3, ax4) = plt.subplots(4,1, figsize=(15,8))
res.observed.plot(ax=ax1)
res.trend.plot(ax=ax2)
res.seasonal.plot(ax=ax3)
res.resid.plot(ax=ax4)

In our decomposition, we have the 'raw data' on top. The second subplot shows the overall trend our data follows. The third subplot is the seasonal component. As can be seen, there is no variation over the various months, indicating that seasonality does not play a major role in CO2 measurements. The last plot is simply the series that 'remains' after we subtract the seasonal component and the trend.

## Autoregression

**An autoregressive model is a model that uses previous observations in the time series to predict the next value in the series.** In previous lessons, we have looked at linear models where the response variable depends only on the predictor variables and the linear regression equation is of the form:

$$Y = \beta_{0} + \beta_{1} X_{1} + \dots + \beta_{n} X_{n}.$$

However, when working with time series data, our response variable depends not only on the predictor variables but also on its own past values. **A variable that is correlated with its own past values is called an autocorrelated variable.** More information about autocorrelation can be found [here](https://www.investopedia.com/terms/a/autocorrelation.asp). We can model such a relationship using an **autoregressive (AR) model**. In the simplest autoregressive model, the value of the variable depends only on its value in the previous time period:

$$y_{t} = \beta_{0} + \beta_{1} y_{t-1} + \epsilon_{t},$$

where $\beta_{0}$ is a constant (the intercept), $\beta_{1}$ is the factor by which the previous value of $y$, $y_{t-1}$, is multiplied, and $\epsilon_{t}$ is a random noise term.

Our model can also depend on more than one time period in the past. In that case, we say that the model has a higher **order**. For instance, a model of order 2 is

$$y_{t} = \beta_{0} + \beta_{1} y_{t-1} + \beta_{2} y_{t-2} + \epsilon_{t}.$$
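To see how a first-order autoregressive relationship behaves, the sketch below simulates such a series and then recovers the coefficients with an ordinary least-squares fit of $y_{t}$ on $y_{t-1}$. The coefficient values are hypothetical, and a random noise term is added at each step (an assumption of this sketch) so that the series does not simply settle at a constant.

```python
import numpy as np

rng = np.random.default_rng(42)
beta0_true, beta1_true = 2.0, 0.7   # hypothetical coefficients
n_steps = 2000

ar_series = np.empty(n_steps)
ar_series[0] = beta0_true / (1 - beta1_true)   # start at the long-run mean
for i in range(1, n_steps):
    # Each value depends on the previous value plus fresh random noise.
    ar_series[i] = beta0_true + beta1_true * ar_series[i - 1] + rng.normal()

# Regress y_t on y_{t-1}; for degree 1, polyfit returns the slope first, then the intercept.
beta1_hat, beta0_hat = np.polyfit(ar_series[:-1], ar_series[1:], deg=1)
```

With a series this long, `beta1_hat` and `beta0_hat` should land close to the values used in the simulation.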

## Checking for Autocorrelation

We can check for autocorrelation in our data using a lag plot. **This plot will plot $y_{t}$ against $y_{t-1}$. Pandas has a function called lag_plot for detecting these relationships.**

In [None]:
import pandas as pd
from pandas.plotting import lag_plot

In [None]:
lag_plot(occupancy.CO2, lag=1)        #try with 1, 2, 3, 53 

Points that cluster along the diagonal indicate a strong relationship between consecutive observations, which suggests an autoregressive relationship.
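The visual impression from a lag plot can also be summarized as a number: the correlation between the series and a lagged copy of itself. The sketch below uses two synthetic series (both assumptions for illustration) and the pandas `Series.autocorr` method.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)
noise = pd.Series(rng.normal(size=1000))   # no dependence between consecutive values
accumulated = noise.cumsum()               # each value builds on the previous one

# Correlation between y_t and y_{t-1}.
lag1_noise = noise.autocorr(lag=1)
lag1_accumulated = accumulated.autocorr(lag=1)
```

A value near 0 corresponds to a shapeless cloud in the lag plot, while a value near 1 corresponds to points hugging the diagonal.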

To create an **autoregressive model we use the autoregression tools in statsmodels.tsa.ar_model.** Typically, we denote an autoregressive model with the notation **AR(n) where n is the number of lag periods.** In the example below, we will create an autoregressive model with lag 1 to model the CO2 level. Why a lag of 1? Well, let's make a plot to show why.


In [None]:
from statsmodels.graphics.tsaplots import plot_pacf

In [None]:
# Partial Autocorrelation for the first 50 lags 
# y-axis (-1 to 1) shows autocorrelation-value vs. x-axis lag 
plot_pacf(occupancy.CO2, lags=50)
pyplot.show()


**Note that we split the data into train and test sets, and with time series data we always use the last few observations as the test set. We do this to ensure that the model gives good predictions even on data it has not seen. Since the data is ordered, we cannot select the test data at random.**

In [None]:
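from statsmodels.tsa.ar_model import AutoReg   # replaces the AR class removed from recent statsmodels releases

In [None]:
train, test = occupancy.CO2[:-10], occupancy.CO2[-10:]

In [None]:
test

In [None]:
train

In [None]:
# Fit on the training data only, so the last 10 observations stay unseen
model = AutoReg(train, lags=1)

In [None]:
model_fit = model.fit()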

In [None]:
predictions = model_fit.predict(start=len(train), end=len(train)+len(test)-1)

In [None]:
pd.DataFrame({'observed':test, 'predicted':predictions})
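Comparing observed and predicted values by eye only goes so far, so it can help to condense the comparison into an error measure. The sketch below computes the mean absolute error and the root mean squared error. The numbers are hypothetical and are not taken from the occupancy data.

```python
import numpy as np

# Hypothetical observed and predicted values, for illustration only.
observed_demo = np.array([500.0, 510.0, 520.0])
predicted_demo = np.array([495.0, 515.0, 510.0])

errors = observed_demo - predicted_demo
mae = np.mean(np.abs(errors))           # mean absolute error
rmse = np.sqrt(np.mean(errors ** 2))    # root mean squared error
```

The same two formulas could be applied to the test values and predictions from the cells above. The root mean squared error penalizes large misses more heavily than the mean absolute error does.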

## Stationarity 

**A time series is considered stationary if its statistical properties, such as its mean and variance, do not change over time but stay constant.** When we have a stationary time series, we can infer properties of the series that are *not dependent* upon the time of observation. Conversely, if we have seasonality or a clear trend over time, the time series is non-stationary.



To check whether we have a stationary time series, we can examine the decomposition plot visually, compute the mean and standard deviation over time, or use statistical tests. One possible test is the **Augmented Dickey-Fuller test.** This test has the following hypotheses:



*   $H_0$: Data is not stationary
*   $H_1$: Data is stationary



We test stationarity using the adfuller function in statsmodels. The example below demonstrates this with our CO2 data. The adfuller function returns multiple values. The second element of the returned tuple is the p-value of the test.

In [None]:
from statsmodels.tsa.stattools import adfuller

In [None]:
adfuller(occupancy.CO2)[1]

The p-value is greater than 0.05. Therefore, at the 5% significance level, we fail to reject the null hypothesis, so we cannot conclude that the data is stationary.
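As a complement to the formal test, the simpler check of computing the mean and standard deviation over time can be sketched as follows. The two synthetic series and the choice of four equal chunks are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(7)
steps = rng.normal(size=1000)
stationary_demo = steps               # white noise: constant mean and variance
drifting_demo = np.cumsum(steps)      # accumulated noise: the level wanders over time

def chunk_stats(values, n_chunks=4):
    """Return the mean and standard deviation of each consecutive chunk."""
    chunks = np.array_split(values, n_chunks)
    means = np.array([chunk.mean() for chunk in chunks])
    stds = np.array([chunk.std() for chunk in chunks])
    return means, stds

noise_means, noise_stds = chunk_stats(stationary_demo)
drift_means, drift_stds = chunk_stats(drifting_demo)

# How far apart the chunk means are; a large spread hints at non-stationarity.
mean_spread_noise = noise_means.max() - noise_means.min()
mean_spread_drift = drift_means.max() - drift_means.min()
```

For a stationary series the chunk means and standard deviations stay close to each other. Chunk statistics that differ widely are a warning sign, although they are not a substitute for a statistical test.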

### Mini-Quiz!

Which one of these time series is/are stationary?

![alt text](https://otexts.com/fpp2/fpp_files/figure-html/stationary-1.png)

Figure 8.1: Which of these series are stationary? (a) Google stock price for 200 consecutive days; (b) Daily change in the Google stock price for 200 consecutive days; (c) Annual number of strikes in the US; (d) Monthly sales of new one-family houses sold in the US; (e) Annual price of a dozen eggs in the US (constant dollars); (f) Monthly total of pigs slaughtered in Victoria, Australia; (g) Annual total of lynx trapped in the McKenzie River district of north-west Canada; (h) Monthly Australian beer production; (i) Monthly Australian electricity production. (Retrieved from [this resource](https://otexts.com/fpp2/stationarity.html))

Answer: Obvious seasonality rules out series (d), (h) and (i). Trends and changing levels rule out series (a), (c), (e), (f) and (i). Increasing variance also rules out (i). That leaves only (b) and (g) as stationary series.

## Random Walks

A random walk is a type of time series model where each observation is the sum of the previous observation and a random noise component. The formal notation for this is:

$$y_{t} = y_{t-1} + \epsilon_{t},$$

where $\epsilon_{t}$ denotes a random noise variable with $\epsilon_{t} \sim \mathcal{N}(0,1).$

Random walks are considered **non-stationary** because the variance of $y_{t}$ grows with $t$; their statistical properties are therefore **time dependent.**
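One way to see this time dependence is to simulate many random walks and measure how widely they have spread out after a given number of steps. The sketch below assumes that every walk starts at $y_{0} = 0$ and uses standard normal noise, as in the formula above. The number of walks and steps are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(3)
n_walks, n_walk_steps = 2000, 100

# Each row is one walk; cumulative sums of the noise give y_1, ..., y_100.
shocks = rng.normal(0.0, 1.0, size=(n_walks, n_walk_steps))
walks = shocks.cumsum(axis=1)

# Variance across walks at each time step (index k holds step k + 1).
var_by_step = walks.var(axis=0)
var_at_10 = var_by_step[9]
var_at_100 = var_by_step[99]
```

Because $y_{t}$ is a sum of $t$ independent noise terms with variance 1, its variance is $t$, so `var_at_100` should come out roughly ten times larger than `var_at_10`.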

## Moving Average

Moving average models are similar to autoregressive models. Moving average models also depend on a linear combination of past data. **However, unlike autoregressive models, these models depend on past white noise terms.** While the name is the same, moving average models are not the same as calculating the moving average of a time series.

Moving average models are typically denoted by **MA(q) where q is the number of past white noise terms summed by the model.** For example, a first order moving average model, MA(1), is given by:

$$y_{t} = \mu  + \theta_{1} \epsilon_{t-1} +  \epsilon_{t},$$

where $\mu$ denotes the mean of the time series, $\epsilon_{t}$ is the error term and $\theta_{1}$ the model parameter.

We may also have an **MA(2)** model, where we take into account two lags or time steps:

$$y_{t} = \mu + \theta_{1} \epsilon_{t-1} + \theta_{2}\epsilon_{t-2} + \epsilon_{t}.$$


Let us have a look at two MA models. The first model, an MA(1), is:

$$y_{t} = 20 + \epsilon_{t} + 0.8 \epsilon_{t-1}.$$

The second one, an MA(2), is:

$$y_{t} = \epsilon_{t} + \epsilon_{t-1} + 0.8 \epsilon_{t-2}.$$




![alt text](https://otexts.com/fpp2/fpp_files/figure-html/maq-1.png)
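The MA(1) example can also be simulated directly, which makes it possible to check two of its properties: the series fluctuates around its mean of 20, and only neighbouring observations are correlated. This is a minimal sketch assuming standard normal white noise. The helper `sample_autocorr` is a hypothetical name introduced here, not a library function.

```python
import numpy as np

rng = np.random.default_rng(5)
n_ma = 5000
mu, theta1 = 20.0, 0.8

# One extra noise draw so that every y_t has a previous noise term available.
eps = rng.normal(size=n_ma + 1)
ma1 = mu + eps[1:] + theta1 * eps[:-1]

def sample_autocorr(x, lag):
    """Sample autocorrelation of x at the given positive lag."""
    x = x - x.mean()
    return np.dot(x[lag:], x[:-lag]) / np.dot(x, x)

r1 = sample_autocorr(ma1, 1)
r2 = sample_autocorr(ma1, 2)

# Theoretical lag-1 autocorrelation of an MA(1) process.
theory_r1 = theta1 / (1 + theta1 ** 2)
```

For an MA(1) model the lag-1 autocorrelation is $\theta_{1} / (1 + \theta_{1}^{2})$ and the autocorrelation at every higher lag is zero, so `r1` should be close to `theory_r1` and `r2` close to 0.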

## ARMA Model


We can create a moving average model using an **ARMA (AutoRegressive Moving Average)** model from the statsmodels package. Such a model can have both an autoregressive component and a moving average component.

Mathematically, for an ARMA(1,1) model, we have the following:

$$y_{t} = \beta_{0} + \beta_{1} y_{t-1} + \theta_{1}\epsilon_{t-1} + \epsilon_{t}.$$

Here $\beta_{0} + \beta_{1} y_{t-1}$ represents the *AR* part of the equation and $\theta_{1}\epsilon_{t-1} + \epsilon_{t}$ the *MA* part. As such, an ARMA(1,1) model is the combination of an AR(1) and an MA(1) model.

However, here we will set the autoregressive order to zero to create only a moving average model. Let's use our CO2 data again for this example.
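Before fitting anything, it may help to see the ARMA(1,1) recursion written out as code. The function below is a hypothetical helper, not part of statsmodels. The coefficient values are made up, and the series is assumed to start at its long-run mean with no earlier shock.

```python
import numpy as np

def simulate_arma11(beta0, beta1, theta1, eps):
    """Apply y_t = beta0 + beta1*y_{t-1} + theta1*eps_{t-1} + eps_t to a noise sequence."""
    y = np.empty(len(eps))
    y_prev = beta0 / (1 - beta1)   # long-run mean; requires |beta1| < 1
    eps_prev = 0.0                 # assume no shock before the first step
    for i, e in enumerate(eps):
        y[i] = beta0 + beta1 * y_prev + theta1 * eps_prev + e
        y_prev, eps_prev = y[i], e
    return y

# Response to a single unit shock at the first step, then no further noise.
impulse = simulate_arma11(2.0, 0.5, 0.8, np.array([1.0, 0.0, 0.0, 0.0]))
# approximately [5.0, 5.3, 4.65, 4.325]

# A longer simulated series driven by standard normal noise.
rng = np.random.default_rng(11)
arma_demo = simulate_arma11(2.0, 0.5, 0.8, rng.normal(size=300))
```

In the impulse response, the shock enters through $\epsilon_{t}$, is carried into the next step by the MA term, and then fades gradually through the AR term as the series returns to its long-run mean of 4.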

In [None]:
from statsmodels.tsa.arima.model import ARIMA   # the separate ARMA class was removed from recent statsmodels releases

In [None]:
# ARIMA with order=(p, d, q) and d=0 is an ARMA(p, q) model
model = ARIMA(occupancy.CO2, order=(0, 0, 1))       # AR = 0, MA = 1
model_fit = model.fit()
predictions = model_fit.predict(start=len(occupancy.CO2)-3, end=len(occupancy.CO2)-1)

Let's look at the predictions:

In [None]:
pd.DataFrame({'observed':occupancy.CO2[-3:], 'predicted':predictions})

We can see that this model alone is not a great fit for this data since there is a big difference between observed and predicted.

## Combining Autoregression with Moving Average

As we have seen in the previous section, we can create a model with both an autoregressive component and a moving average component. This model is called an ARMA model and is denoted by ARMA(n, q) where n is the number of lag periods and q is the number of past white noise terms. Below is an example of an ARMA model with two lag terms and one white noise term.

In [None]:
# fit model
model = ARIMA(occupancy.CO2, order=(2, 0, 1))      # AR = 2, MA = 1
model_fit = model.fit()
# make prediction
predictions = model_fit.predict(start=len(occupancy.CO2)-3, end=len(occupancy.CO2)-1)

We expect our predictions to improve:



In [None]:
pd.DataFrame({'observed':occupancy.CO2[-3:], 'predicted':predictions})

## Summary

In this lesson we were introduced to a number of basic concepts in time series modeling. Time series models require different treatment than linear models. We learned about the different components of a time series and how to plot them. We learned about stationarity, random walks, and autoregressive and moving average models. This introduction should provide you with the tools to evaluate and model time ordered data.

Some useful resources:



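*   [Moving averages (Forecasting: Principles and Practice)](https://otexts.com/fpp2/moving-averages.html)
*   [Stationarity and differencing (Forecasting: Principles and Practice)](https://otexts.com/fpp2/stationarity.html)
*   [Video (YouTube)](https://www.youtube.com/watch?v=oY-j2Wof51c)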

