#What will you learn?
You will learn about the following topics:

Auto Correlation and Partial Auto Correlation - these help you choose the right model for forecasting.

Modeling using Auto Regression - how to build this model and use it for forecasting.

Moving Average Smoothing - time series data often contains a lot of noise. You will learn how to smooth out the noise and use the data for developing models.

ARMA and ARIMA - widely used models for Time Series Forecasting.

Exponential Smoothing - a smoothing technique that gives more weight to recent observations.

Finally, you will also learn about multivariate Time Series Modeling.

For the practical aspects, you will go through code snippets in Python.

You will also get an opportunity to try a few hands-on exercises to familiarize yourself with model building and forecasting.

#Auto Correlation Function
Checking the Auto Correlation is an important step in Time Series Model Building.

In this step, the data is correlated with its lag values to see how strongly the current value is related to the previous values.

The Auto Correlation Function (ACF) is used for quantifying this relationship.

In Time Series, the same variable is observed at different points in time, hence the correlation of the current value with the previous values is known as ***Auto Correlation or Serial Correlation***.

#ACF Explanation
Consider the above time series.

You can observe the data point at different time intervals.

If you shift the series by a lag of 1 time step and correlate the lagged values with the actual values, you get the auto correlation at lag 1.

This process can be repeated for multiple lags.
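
The lag-and-correlate procedure described above can be sketched with pandas. This is a minimal illustration rather than the exact estimator behind the ACF plots used later: the toy values and the helper name `lag_correlation` are assumptions made for the example, and library routines may normalize the correlation slightly differently.

```python
import pandas as pd

# Toy series, assumed only for illustration.
ts = pd.Series([30, 21, 29, 31, 40, 48, 53, 47, 37, 39, 31, 29])

def lag_correlation(series, lag):
    """Correlate the series with a copy of itself shifted by `lag` time steps."""
    lagged = series.shift(lag)      # lagged[t] holds the value observed at t - lag
    return series.corr(lagged)      # pairs that contain NaN are ignored

for lag in (1, 2, 3):
    print('lag %d: %.3f' % (lag, lag_correlation(ts, lag)))
```

pandas also offers `Series.autocorr(lag)`, which performs the same shift-and-correlate calculation in one call.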

#ACF Plot
The ACF plot is a very useful graph while performing Time Series Modeling.

It shows the correlation value for each lag.

For a stationary series, the ACF values typically decay towards zero as the lag increases, often at an exponential rate.

Based on the values in the ACF plot, you can decide the lag beyond which the correlations are too small to be worth modeling.

#Plotting ACF using Python
Getting the Time Series Data
```
import pandas_datareader as pdr
appleData = pdr.get_data_yahoo('AAPL')
```
The above data set has the stock prices of Apple Inc.

It is a table of Open, High, Low, Close and Adj Close stock prices for each trading date.

Plotting the ACF Values for various lags
```
from matplotlib import pyplot
from statsmodels.graphics.tsaplots import plot_acf
plot_acf(appleData.Close, lags=100)
pyplot.show()
```
To plot the ACF we are considering only the Close column in the dataset.

The ACF plot has the lag number on the x axis and the correlation value for that lag on the y axis.
#ACF Properties
The ACF value lies between -1 and +1.

The ACF is unitless.

It helps the analyst see how strongly the series is correlated with each of its lag values.

Examining the ACF is one of the first steps during Model Building.

Depending on the pattern in the ACF, read together with the PACF, you can decide what the order of the model should be.

#PACF
Checking the Partial Auto Correlation is another important step in the Time Series Modeling Process.

The partial auto correlation at any given lag k is the correlation obtained after cancelling the effect of correlations due to terms at shorter lags.

In simple terms, in partial auto correlation the effects due to intermediate terms are nullified to determine the correlation.

#Determining the PACF value
Suppose you have a time series that is represented by [yt , yt-1 , yt-2 , .... yt-s].

If you want to determine the Partial Auto Correlation between yt and yt-s, you have to nullify the effect of all the intermediate terms (yt-1 up to yt-s+1) to get the PACF value.
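
One common way to nullify the intermediate terms is to regress yt on yt-1 ... yt-k together and read off the coefficient of yt-k. The sketch below assumes this regression-based definition. The helper name `pacf_by_regression` and the simulated data are illustrative assumptions, and library routines may use a different estimator.

```python
import numpy as np

def pacf_by_regression(x, k):
    """Coefficient of y(t-k) when y(t) is regressed on y(t-1), ..., y(t-k)."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()                       # centre the series so no intercept is needed
    n = len(x)
    # Column j holds the series lagged by j+1 steps, aligned with the target x[k:].
    lagged = np.column_stack([x[k - j - 1:n - j - 1] for j in range(k)])
    coefs, *_ = np.linalg.lstsq(lagged, x[k:], rcond=None)
    return coefs[-1]                       # effect of lag k after accounting for lags 1..k-1

# Simulated AR(1) series y(t) = 0.7 * y(t-1) + noise, assumed for illustration.
rng = np.random.default_rng(0)
y = np.zeros(2000)
for t in range(1, len(y)):
    y[t] = 0.7 * y[t - 1] + rng.normal()

pacf_1 = pacf_by_regression(y, 1)   # should land near 0.7
pacf_2 = pacf_by_regression(y, 2)   # should land near 0
print(round(pacf_1, 2), round(pacf_2, 2))
```

Because the simulated series depends only on its immediate predecessor, the value at lag 2 is close to zero once lag 1 has been accounted for.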

#Significance of PACF
The PACF values, when plotted, can help in determining the order of the Auto Regressive Process.

<PACF Picture>

The number of significant correlations determines the order of the AR process.

The picture above shows a sample PACF plot.

The Top Left Plot has one significant correlation, meaning the order of the AR process is 1.

The Top Right Plot has two, meaning the order is 2, and so on.
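
A rough way to decide which correlations count as significant is to compare them with the approximate 95% band of +/- 1.96 / sqrt(N), where N is the number of observations. The sketch below applies that rule of thumb to the PACF values printed in the hands-on exercise later in this course (72 observations). The band formula is a common approximation and is an assumption here, not something taken from the plotting library.

```python
import math

# PACF values for lags 0..5 of a 72-point series (from the exercise output in this course).
pacf_values = [1.0, 0.72120588, -0.19433338, -0.28172395, -0.18857053, 0.02711725]
n_obs = 72

# Approximate 95% band for a series with no correlation: +/- 1.96 / sqrt(N).
band = 1.96 / math.sqrt(n_obs)

# Lag 0 is always 1, so it is skipped.
significant_lags = [lag for lag, value in enumerate(pacf_values)
                    if lag > 0 and abs(value) > band]
print('band = %.3f, significant lags = %s' % (band, significant_lags))
```

With a short series the band is wide, so borderline lags should be treated with caution rather than read as a definite model order.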

#PACF using Python
Getting the Time Series Data
```
import pandas_datareader as pdr
appleData = pdr.get_data_yahoo('AAPL')
```
Plotting the PACF Values for various lags
```
from matplotlib import pyplot
from statsmodels.graphics.tsaplots import plot_pacf
plot_pacf(appleData.Close, lags=100)
pyplot.show()
```

The PACF plot has the lag on the x axis and the correlation value on the y axis.

Reading the PACF plot tells us the number of significant lags, which determines the order of the AR Process.

From the plot in the previous card, it is evident that the order of AR will be 2 for this time series.

#Installing Statsmodels Package
The following exercise requires the ***statsmodels package***. Installation instructions are below.

In your playground, you have to do the following installation to try out your code.

Once you land in the online playground, go to the terminal and enter the following command: pip install --user statsmodels

If your installation is successful, you should get the following message

Installing collected packages: patsy, statsmodels

Successfully installed patsy-0.4.1 statsmodels-0.8.0

##Auto Regression
Regression is the process of predicting one variable from another.

In a Time Series, the data points are observations of the same variable collected at different times.

Here the past observations are used for predicting the future values.

This process of regressing on the past values to get the future values is known as Auto Regression.

#Auto Regression Equation
The outcome of the model fitting is the set of coef (coefficient) values, one for each lag term plus a constant.

Once the coef values are determined, they can be used to predict the future values based on the current and past data.
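
Assuming the usual form of an Auto Regression model of order p, y(t) = b0 + b1*y(t-1) + ... + bp*y(t-p) + error(t), the coef values can be estimated by ordinary least squares. The sketch below does this with NumPy and then uses the coefficients for a one-step prediction. The helper names `fit_ar` and `predict_next` and the noise-free toy series are assumptions for illustration, and a library fit may use a different estimator.

```python
import numpy as np

def fit_ar(series, p):
    """Least-squares estimate of [b0, b1, ..., bp]."""
    y = np.asarray(series, dtype=float)
    n = len(y)
    # Row for time t: [1, y(t-1), ..., y(t-p)]
    design = np.column_stack([np.ones(n - p)] + [y[p - j:n - j] for j in range(1, p + 1)])
    coefs, *_ = np.linalg.lstsq(design, y[p:], rcond=None)
    return coefs

def predict_next(series, coefs):
    """One-step-ahead prediction from the most recent p observations."""
    p = len(coefs) - 1
    recent = np.asarray(series[-p:], dtype=float)[::-1]   # y(t-1), y(t-2), ..., y(t-p)
    return coefs[0] + np.dot(coefs[1:], recent)

# Toy series generated exactly by y(t) = 1 + 0.5*y(t-1) + 0.2*y(t-2) (illustrative values).
toy = [0.0, 5.0]
for _ in range(20):
    toy.append(1 + 0.5 * toy[-1] + 0.2 * toy[-2])

coefs = fit_ar(toy, p=2)
print(np.round(coefs, 3))
print(predict_next(toy, coefs))
```

Because the toy series contains no noise, the fit recovers the generating coefficients. On real data the estimates only approximate the underlying relationship.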

#Assumptions for Model Fitting
The lag values should help in predicting the current and the future values.

The stronger the correlation between the current and the past values, the more robust the predictions from Auto Regression.

#Getting the Apple stock prices
```
import pandas_datareader as pdr
appleData = pdr.get_data_yahoo('AAPL')
```
Viewing the Time Series
```
import matplotlib.pyplot as plt
appleData.Close.plot()
plt.show()
```
Splitting the data. The last 10 observations are held out for testing.
```
# AR is the class available in the statsmodels version installed earlier (0.8.0).
# Recent statsmodels releases replace it with AutoReg in the same module.
from statsmodels.tsa.ar_model import AR
from sklearn.metrics import mean_squared_error

X = appleData.Close.values
train, test = X[:len(X)-10], X[len(X)-10:]
```
Viewing the correlogram with 250 lags
```
from statsmodels.graphics.tsaplots import plot_acf
plot_acf(appleData.Close, lags=250)
plt.show()
```
#Model Fitting
We are fitting the model and viewing the coefficients.
```
model = AR(train)
model_fit = model.fit()
print('Lag: %s' % model_fit.k_ar)
print('Coefficients: %s' % model_fit.params)
```
#Coef Values
Sample output. The exact values depend on the date range that was downloaded.
```
Lag: 25
Coefficients: [ 0.11762701  1.01419227 -0.03097132 -0.01778476  0.04704682 -0.0164039
  0.00131838  0.0240413  -0.05058969  0.05770135 -0.00680828 -0.06627654
  0.03677982  0.03879211  0.00913336 -0.05083633  0.00953855  0.02372498
 -0.03278349  0.02695213  0.03216126 -0.04961039 -0.05640927  0.04916215
  0.03326709 -0.02598511]
```
#Forecasting with AR
```
# predict the 10 held-out observations
predictions = model_fit.predict(start=len(train), end=len(train)+len(test)-1, dynamic=False)
for i in range(len(predictions)):
    print('predicted=%f, expected=%f' % (predictions[i], test[i]))
error = mean_squared_error(test, predictions)
print('Test MSE: %.3f' % error)

# plot results
plt.plot(test)
plt.plot(predictions, color='red')
plt.show()
```


#Moving Average Smoothing
Moving Average is a smoothing technique for reducing the noise in Time Series Data.

It is used to expose the underlying signal that is hidden in the noise.

The values are averaged over a window that slides along the time steps.
#Centered Moving Average
In a Centered Moving Average, the average at a given time step is calculated from the observations before and after it, together with the observation itself.

cma(t) = average(obs(t-1), obs(t), obs(t+1))
#Trailing Moving Average
Trailing Moving Average uses only the current and historical observations, so it can be used for determining the future observation.

trail_moving_average(t) = average(obs(t-2), obs(t-1), obs(t))

#Moving Average with Python
```
import pandas_datareader as pdr
from pandas import Series

appleData = pdr.get_data_yahoo('AAPL')
seriesData = Series(appleData.Close.values)
rolling = seriesData.rolling(window=3)   # trailing window of 3 observations
rolling_mean = rolling.mean()
rolling_mean.head(10)
```
In the above code, we are keeping a window size of 3 and moving across the time series to calculate the arithmetic mean.

Results below. The first two observations are NaN because a full window of 3 values is not yet available.
```
0          NaN
1          NaN
2    30.434285
3    30.445714
4    30.282380
5    30.168095
6    30.127143
7    29.990953
8    29.927619
9    29.895238
dtype: float64
```
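
The two windowing schemes, cma(t) and trail_moving_average(t), can be compared side by side with pandas. This is a small sketch on toy values chosen for illustration. It relies on the `center` option of `Series.rolling`, which places the window around the current time step instead of behind it.

```python
import pandas as pd

obs = pd.Series([1, 2, 3, 4, 5])   # toy values, assumed for illustration

trailing = obs.rolling(window=3).mean()               # average(obs(t-2), obs(t-1), obs(t))
centered = obs.rolling(window=3, center=True).mean()  # average(obs(t-1), obs(t), obs(t+1))

print(pd.DataFrame({'obs': obs, 'trailing': trailing, 'centered': centered}))
```

The trailing average is undefined for the first two time steps, while the centered average is undefined at both ends because it needs one future value.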

#Visualizing Moving Average Results

```
import matplotlib.pyplot as plt

seriesData.head(100).plot()
rolling_mean.head(100).plot(color='red')
plt.show()
```
The actual data is in blue and the rolling mean is in red.

You can see how the data is smoother now.


#Moving Average for Prediction
Moving Average can be used for prediction.

It is not a very effective way to predict, but it can still be used to get a naive estimate.

The assumption we make is that the trend and seasonal components have already been removed from the series.

It can be used in a Walk Forward manner: predict the next value, then add the actual observation to the history and move on to the next time step.

#Prediction using Python
```
import matplotlib.pyplot as plt
from numpy import mean
from sklearn.metrics import mean_squared_error

X = appleData.Close.values
window = 3
history = [X[i] for i in range(window)]
test = [X[i] for i in range(window, len(X))]
predictions = list()

# walk forward over time steps in test
for t in range(len(test)):
    length = len(history)
    # forecast = mean of the last `window` observations seen so far
    yhat = mean([history[i] for i in range(length-window, length)])
    obs = test[t]
    predictions.append(yhat)
    history.append(obs)
    # print('predicted=%f, expected=%f' % (yhat, obs))

error = mean_squared_error(test, predictions)
print('Test MSE: %.3f' % error)

# plot
plt.plot(test)
plt.plot(predictions, color='red')
plt.show()

# zoom plot
plt.plot(test[0:100])
plt.plot(predictions[0:100], color='red')
plt.show()
```
In the above code, we are forecasting using the Moving Average process. We are traversing along the time series to create the forecast values.

Finally, we are computing the Mean Squared Error.

Hands-on exercise: computing the ACF and PACF values of a sample time series.
```
from pandas import Series
from statsmodels.tsa.stattools import acf
from statsmodels.tsa.stattools import pacf

timeSeries = [30,21,29,31,40,48,53,47,37,39,31,29,17,9,20,24,27,35,41,38,
          27,31,27,26,21,13,21,18,33,35,40,36,22,24,21,20,17,14,17,19,
          26,29,40,31,20,24,18,26,17,9,17,21,28,32,46,33,23,28,22,27,
          18,8,17,21,31,34,44,38,31,30,26,32]
ts = Series(timeSeries)

# unbiased=True is the argument name in the statsmodels version installed earlier (0.8.0).
# Recent releases rename it to adjusted.
acf_corr = acf(ts, nlags=5, unbiased=True)
###End code(approx 2 lines)
print(acf_corr)

pacf_corr = pacf(ts, nlags=5)
###End code(approx 2 lines)
print(pacf_corr)

# Index k of each array holds the value at lag k, so index 0 is always 1.
acf_lag_0 = acf_corr[0]    # acf at lag 0 == 1
acf_lag_1 = acf_corr[1]    # acf at lag 1
pacf_lag_2 = pacf_corr[2]  # pacf at lag 2
print(acf_lag_0, pacf_lag_2)
```
Output:
```
[ 1.          0.72120588  0.42688471  0.09746397 -0.18409287 -0.31332648]
[ 1.          0.72120588 -0.19433338 -0.28172395 -0.18857053  0.02711725]
1.0 -0.19433337567125356
```



Hands-on exercise: fitting an Auto Regression model and forecasting the last 10 values.
```
timeSeries = [30,21,29,31,40,48,53,47,37,39,31,29,17,9,20,24,27,35,41,38,
          27,31,27,26,21,13,21,18,33,35,40,36,22,24,21,20,17,14,17,19,
          26,29,40,31,20,24,18,26,17,9,17,21,28,32,46,33,23,28,22,27,
          18,8,17,21,31,34,44,38,31,30,26,32]
# Note: the training slice starts at index 1, so the first observation is left out.
# The output shown below was produced with this split.
train, test = timeSeries[1:len(timeSeries)-10], timeSeries[len(timeSeries)-10:]

###Start code here
# AR is the class available in statsmodels 0.8.0; recent releases replace it with AutoReg.
from statsmodels.tsa.ar_model import AR
model = AR(train)
model_fit = model.fit()
###End code(approx 3 lines)
print('Lag: %s' % model_fit.k_ar)
print('Coefficients: %s' % model_fit.params)

from sklearn.metrics import mean_squared_error
###Start code here
predictions = model_fit.predict(start=len(train), end=len(train)+len(test)-1, dynamic=False)
###End code
for i in range(len(predictions)):
    print('predicted=%f, expected=%f' % (predictions[i], test[i]))
error = mean_squared_error(test, predictions)
print('Test MSE: %.3f' % error)
```
Output:
```
Lag: 11
Coefficients: [17.61119882  0.60959022 -0.46145298  0.34147052 -0.5535399   0.50364582
 -0.57076599  0.40686544 -0.48981335  0.36377344 -0.51010253  0.64166238]
predicted=13.925993, expected=17.000000
predicted=20.279285, expected=21.000000
predicted=26.556221, expected=31.000000
predicted=34.709756, expected=34.000000
predicted=39.324673, expected=44.000000
predicted=28.433625, expected=38.000000
predicted=27.219552, expected=31.000000
predicted=25.737600, expected=30.000000
predicted=23.488579, expected=26.000000
predicted=27.009755, expected=32.000000
Test MSE: 20.726
```


#Centered Moving Average Properties
Recall that the Centered Moving Average is cma(t) = average(obs(t-1), obs(t), obs(t+1)).

This method requires knowledge of the future values to calculate the average.

This technique is used for detrending and removing seasonal effects from Time Series Data.

Because it needs future values, it is not very useful for forecasting.


#ARMA
ARMA stands for Auto Regressive Moving Average.

The above equation represents a typical ARMA process.

In the equation, the Alpha coefficients represent the Auto Regression terms and the Beta coefficients represent the Moving Average terms.

The model is a combination of AR and MA, represented using the above equation.
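
Assuming the usual form of an ARMA model, y(t) = c + alpha_1*y(t-1) + ... + alpha_p*y(t-p) + e(t) + beta_1*e(t-1) + ... + beta_q*e(t-q), the sketch below generates an ARMA(1,1) series from a given noise sequence. The function name `simulate_arma11`, the parameter values and the zero starting values are assumptions made for illustration.

```python
import numpy as np

def simulate_arma11(noise, alpha, beta, c=0.0):
    """y(t) = c + alpha*y(t-1) + e(t) + beta*e(t-1), with y and e taken as 0 before t=0."""
    y = np.zeros(len(noise))
    prev_y, prev_e = 0.0, 0.0
    for t, e in enumerate(noise):
        y[t] = c + alpha * prev_y + e + beta * prev_e   # AR term + current shock + MA term
        prev_y, prev_e = y[t], e
    return y

# A single shock at t=0 shows how the AR and MA terms carry it forward.
impulse = simulate_arma11([1.0, 0.0, 0.0, 0.0], alpha=0.5, beta=0.5)
print(impulse)

# A longer series driven by random noise (seeded for repeatability).
rng = np.random.default_rng(1)
series = simulate_arma11(rng.normal(size=200), alpha=0.6, beta=0.3)
print(series[:5])
```

In the impulse example the MA term affects only the step right after the shock, while the AR term keeps passing a shrinking share of it to every later step.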

Statisticians Box and Jenkins suggested an approach for identifying, estimating and checking time series models.

The steps are

Identification

Estimation

Diagnostic Checking

In the following cards, you will learn these steps in detail.

#Identification
This is the first step in the process. It involves

Checking if the Time Series is stationary or not

Getting the parameters for the ARMA model

#Stationarity Check
The tools used for the stationarity check are

Differencing

Unit Root Method

#Configuring AR and MA
The ACF and PACF plots help in getting the p and q values for the model.

Some Observable Patterns

If the ACF trails off gradually and the PACF shows a hard cut-off after a specific lag value, the process is AR. The value of p is the lag at which the PACF cuts off.

If the PACF trails off gradually and the ACF shows a hard cut-off after a specific lag value, the process is MA. The value of q is the lag at which the ACF cuts off.

#Estimation
Estimation involves all the steps that fit the model coefficients so that the loss from errors is minimized.

#Diagnostic Checking
Diagnostic checking involves investigation of

Over-fitting

Residual Errors

#Model Representation
The standard representation of an ARIMA model is ARIMA(p,d,q)

p: the number of lag observations included in the model, also called the lag order

d: the number of times the raw observations are differenced, also called the degree of differencing

q: the order of the moving average, that is, the size of the moving average window

#ARIMA Process
The data is differenced d times to make it stationary.

A linear regression model is then built with the specified number and type of terms (AR and MA).

One assumption is that the process that generated the observations is also an ARIMA process.
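
The differencing step controlled by d can be sketched with NumPy. The helper names `difference` and `undo_first_difference` and the toy values are assumptions for illustration. An ARIMA implementation performs the equivalent transformation internally, so this is only meant to show what the degree of differencing does to the data.

```python
import numpy as np

def difference(values, d=1):
    """Apply first-order differencing d times."""
    return np.diff(np.asarray(values, dtype=float), n=d)

def undo_first_difference(first_value, diffs):
    """Rebuild the original series from its first value and its first differences."""
    return np.concatenate([[first_value], first_value + np.cumsum(diffs)])

trend = [3, 5, 8, 12, 17]                 # toy series with an upward trend
d1 = difference(trend, d=1)               # [2, 3, 4, 5]
d2 = difference(trend, d=2)               # [1, 1, 1]
print(d1, d2)
print(undo_first_difference(trend[0], d1))
```

Each round of differencing shortens the series by one value, and here the second round leaves a constant series with no trend left to model.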


```
import matplotlib.pyplot as plt
import pandas_datareader as pdr
from pandas import DataFrame
from pandas.plotting import autocorrelation_plot
from sklearn.metrics import mean_squared_error
# This import path matches the statsmodels version installed earlier (0.8.0).
# Recent releases move ARIMA to statsmodels.tsa.arima.model, where fit() takes no disp argument.
from statsmodels.tsa.arima_model import ARIMA

appleData = pdr.get_data_yahoo('AAPL')
appleCloseTs = appleData.Close
```
Viewing the autocorrelation plot
```
autocorrelation_plot(appleCloseTs)
plt.show()
```
Fitting the model and inspecting the residual errors
```
model = ARIMA(appleCloseTs, order=(5,1,0))
model_fit = model.fit(disp=0)
print(model_fit.summary())

# plot residual errors
residuals = DataFrame(model_fit.resid)
residuals.plot()
plt.show()
residuals.plot(kind='kde')
plt.show()
print(residuals.describe())
```
Walk forward forecasting
```
X = appleCloseTs.values
size = int(len(X) * 0.66)
train, test = X[0:size], X[size:len(X)]
history = [x for x in train]
predictions = list()
for t in range(len(test)):
    # refit on everything seen so far, then forecast one step ahead
    model = ARIMA(history, order=(5,1,0))
    model_fit = model.fit(disp=0)
    output = model_fit.forecast()
    yhat = output[0]
    predictions.append(yhat)
    obs = test[t]
    history.append(obs)
    #print('predicted=%f, expected=%f' % (yhat, obs))
```
Here we are splitting the data into training (the first 66%) and testing (the rest), then fitting an ARIMA model and forecasting one step at a time.

We are fitting an ARIMA model with 5 lags (p=5), one round of differencing (d=1) and a moving average order of zero (q=0).



Hands-on exercise: walk forward forecasting with ARIMA on a sample time series.
```
from pandas import Series
timeSeries  = [30,21,29,31,40,48,53,47,37,39,31,29,17,9,20,24,27,35,41,38,
          27,31,27,26,21,13,21,18,33,35,40,36,22,24,21,20,17,14,17,19,
          26,29,40,31,20,24,18,26,17,9,17,21,28,32,46,33,23,28,22,27,
          18,8,17,21,31,34,44,38,31,30,26,32]
ts = Series(timeSeries)
X = ts.values

###Start code here
# Import path for statsmodels 0.8.0; recent releases use statsmodels.tsa.arima.model instead.
from statsmodels.tsa.arima_model import ARIMA
X = X.astype('float64')
size = int(len(X) * 0.80)          # first 80% for training
train, test = X[0:size], X[size:len(X)]
history = [x for x in train]
predictions = list()
for t in range(len(test)):
    model = ARIMA(history, order=(5,1,0))
    model_fit = model.fit(disp=0)
    output = model_fit.forecast()
##End code
    yhat = output[0]
    predictions.append(yhat)
    obs = test[t]
    history.append(obs)
    print('predicted=%f, expected=%f' % (yhat, obs))

from sklearn.metrics import mean_squared_error
error = mean_squared_error(test, predictions)
print("MSE = ", error)
```

#What is VAR?
Vector Auto Regression (VAR) is a multivariate generalization of the univariate auto regressive time series model.

VAR models are multivariate linear time series models.

They are designed to capture the joint dynamics of multiple time series.
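
To make the idea concrete, a VAR of order 1 can be written as y(t) = c + A*y(t-1) + error(t), where y(t) is a vector holding all the series at time t and A is a matrix of coefficients. The sketch below assumes that form and estimates c and A by least squares with NumPy. The helper name `fit_var1`, the matrix `A_true` and the simulated data are illustrative assumptions, not the estimator a library uses.

```python
import numpy as np

def fit_var1(data):
    """Least-squares fit of y(t) = c + A @ y(t-1) + error for a (T, k) array."""
    data = np.asarray(data, dtype=float)
    past, present = data[:-1], data[1:]
    design = np.column_stack([np.ones(len(past)), past])   # intercept column + lagged values
    coefs, *_ = np.linalg.lstsq(design, present, rcond=None)
    return coefs[0], coefs[1:].T                            # c with shape (k,), A with shape (k, k)

# Simulated two-variable system; A_true is an illustrative choice.
A_true = np.array([[0.5, 0.1],
                   [0.2, 0.3]])
rng = np.random.default_rng(0)
y = np.zeros((2000, 2))
for t in range(1, len(y)):
    y[t] = A_true @ y[t - 1] + rng.normal(size=2)

c_hat, A_hat = fit_var1(y)
print(np.round(A_hat, 2))

# One-step forecast for both series at once.
print(c_hat + A_hat @ y[-1])
```

The off-diagonal entries of A are what distinguish VAR from fitting separate AR models: they let the past of one series influence the forecast of another.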

```
import numpy as np
import pandas
import matplotlib.pyplot as plt
import statsmodels.api as sm
from statsmodels.tsa.api import VAR
from statsmodels.tsa.base.datetools import dates_from_str

# load the macroeconomic data set shipped with statsmodels
mdata = sm.datasets.macrodata.load_pandas().data

# build a quarterly date index from the year and quarter columns
dates = mdata[['year', 'quarter']].astype(int).astype(str)
quarterly = dates["year"] + "Q" + dates["quarter"]
quarterly = dates_from_str(quarterly)

mdata = mdata[['realgdp','realcons','realinv']]
mdata.index = pandas.DatetimeIndex(quarterly)

# log differences turn the levels into growth rates
data = np.log(mdata).diff().dropna()

# fit a VAR model with 2 lags
model = VAR(data)
results = model.fit(2)
results.summary()
results.plot()
plt.show()

# forecast 5 steps ahead from the last lag_order observations
lag_order = results.k_ar
results.forecast(data.values[-lag_order:], 5)
```
Forecast output:
```
array([[ 0.00502587,  0.0053712 ,  0.0051154 ],
       [ 0.00593683,  0.00784779, -0.00302473],
       [ 0.00662889,  0.00764349,  0.00393308],
       [ 0.00731516,  0.00797044,  0.00657495],
       [ 0.00732726,  0.00808811,  0.00649793]])
```
Plotting the forecast for 10 steps
```
results.plot_forecast(10)
plt.show()
```


#Weighted Smoothing
In time series forecasting, not all the past values contribute equally when predicting the future value.

The value at lag 1 contributes more than the older values. Hence it is a good approach to assign weights that decay with each additional time step of lag, and to use the weighted values to predict the future value.

#Weighted Smoothing Limitation
One limitation of the approach suggested in the previous card is that if the sum of the weights is greater than 1, the prediction is inflated relative to the observed values and the accuracy might not be good.

To overcome this limitation, we go for an approach where the sum of the weights is 1.
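
One set of decaying weights whose sum approaches 1 is the geometric sequence alpha*(1-alpha)^k, where k is the number of time steps back. These are the weights that exponential smoothing applies implicitly. The short sketch below prints them for an illustrative alpha. The values of `alpha` and `n_lags` are assumptions chosen for the example.

```python
alpha = 0.3          # smoothing factor, illustrative value
n_lags = 10

# Weight given to the observation k steps back: alpha * (1 - alpha)**k
weights = [alpha * (1 - alpha) ** k for k in range(n_lags)]

for k, w in enumerate(weights[:4]):
    print('steps back %d: weight %.4f' % (k, w))

# The partial sum equals 1 - (1 - alpha)**n_lags, which approaches 1 as n_lags grows.
print('sum of weights: %.4f' % sum(weights))
```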

```
from pandas import Series

timeSeries = [30,21,29,31,40,48,53,47,37,39,31,29,17,9,20,24,27,35,41,38,
          27,31,27,26,21,13,21,18,33,35,40,36,22,24,21,20,17,14,17,19,
          26,29,40,31,20,24,18,26,17,9,17,21,28,32,46,33,23,28,22,27,
          18,8,17,21,31,34,44,38,31,30,26,32]
ts = Series(timeSeries)
```
The below Python function returns the values of a time series after smoothing.
```
def exp_smth(ts, alpha):
    result = [ts[0]] # first value is same as series
    for n in range(1, len(ts)):
        # alpha weights the current observation, (1 - alpha) the previous smoothed value
        result.append(alpha * ts[n] + (1 - alpha) * result[n-1])
    return result

# >>> exp_smth(ts, 0.1)
# >>> exp_smth(ts, 0.9)
```
#Significance of Exponential Smoothing
Helps in smoothing out the noise from the time series.

For different alpha values, the amount of smoothing differs.

Higher alpha values give more weight to the most recent observation, so the result follows the original time series closely with only a little smoothing. Lower alpha values smooth more heavily.

Forecasts only one value at a time.

#Extension
The process described can be extended to predict multiple values.

Depending on whether the trend and seasonality of the time series are additive or multiplicative, the process of smoothing differs.

All these processes are collectively called the Holt-Winters Method.
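
As one example from this family, the sketch below implements Holt's linear trend method (double exponential smoothing with an additive trend), which smooths a level and a trend and can forecast several steps ahead. It is not the full seasonal Holt-Winters method. The function name `holt_forecast`, the way the level and trend are initialized, and the toy data are assumptions made for illustration.

```python
def holt_forecast(series, alpha, beta, horizon):
    """Forecast `horizon` steps ahead using a smoothed level and a smoothed additive trend."""
    level = series[0]
    trend = series[1] - series[0]          # simple initial trend estimate
    for y in series[1:]:
        prev_level = level
        # level: blend the new observation with the previous one-step forecast
        level = alpha * y + (1 - alpha) * (prev_level + trend)
        # trend: blend the latest change in level with the previous trend
        trend = beta * (level - prev_level) + (1 - beta) * trend
    return [level + h * trend for h in range(1, horizon + 1)]

linear = [1, 3, 5, 7, 9]                   # toy series rising by 2 per step
print(holt_forecast(linear, alpha=0.5, beta=0.5, horizon=3))
```

On a perfectly linear series the method simply continues the line. On noisy data, alpha and beta control how quickly the level and the trend react to new observations.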