## Vector Autoregressions (VAR)

This notebook demonstrates how to estimate a vector autoregression (VAR). These models are widely used in both finance and macroeconomics. They are flexible, inexpensive to estimate, and designed to help you understand the time series properties of your variables.

In [None]:
import Haver
import matplotlib.pyplot as plt
%matplotlib inline
import numpy as np
import pandas as pd
import statsmodels.api as sm
from statsmodels.tsa.api import VAR
from statsmodels.tsa.base.datetools import dates_from_str

As a reminder, make sure your Haver path is set to the Haver database you are working with.

In [None]:
Haver.path()

In [None]:
# Raw string so the backslashes in the Windows path are not read as escape sequences
Haver.path(r'c:\DLX\dat')

In [None]:
Haver.path()

Let's query the US1PLUS Haver database for the U.S. CPI and PPI indices. Remember to set dates=True so that the result comes back with a proper date index in pandas.

In [None]:
df=Haver.data(['pcu','pa'], 'us1plus', dates=True)



In [None]:
df=df.dropna()

In [None]:
df.head()

In [None]:
df.tail()

It is always a good idea to quickly take a look at your data.

In [None]:
fig = plt.figure(figsize=(20,12))
# df still holds the price levels here, so label the chart accordingly
plt.plot(df)
plt.xlabel("Date")
plt.ylabel("Index Level")
plt.title('U.S. CPI and PPI Price Indices')
plt.legend(['CPI','PPI'], fontsize="20")
plt.show()

We are often interested in working with inflation rather than the price level. We can take the log and then difference the data year over year (12 periods for monthly data) to end up with the inflation measure we are used to working with.

In [None]:
ddf = np.log(df).diff(12).dropna()
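
The log difference is an approximation to the percent change, and it comes out in decimal units (0.04 rather than 4). A minimal sketch of the comparison, using two made-up index levels rather than Haver data:

```python
import math

# Hypothetical index levels twelve months apart (illustrative values only).
level_prior = 250.0
level_now = 260.0

# Exact year-over-year percent change.
pct_change = (level_now / level_prior - 1.0) * 100.0

# Log-difference version, scaled by 100 to express it in percent.
log_diff = (math.log(level_now) - math.log(level_prior)) * 100.0

print(f"percent change: {pct_change:.3f}")
print(f"log difference: {log_diff:.3f}")
```

For a 4 percent rise the log difference is about 3.92, so the two measures are close at typical inflation rates and drift apart as the changes get larger.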

In [None]:
fig = plt.figure(figsize=(20,12))
# ddf holds 12-period log differences in decimal units (0.02 is roughly 2%)
plt.plot(ddf)
plt.xlabel("Date")
plt.ylabel("YOY Log Change")
plt.title('Annual Inflation Rates for U.S. CPI and PPI')
plt.legend(['CPI','PPI'], fontsize="20")
plt.show()

Estimating a vector autoregression is straightforward in Python. The VAR class creates the model, and its fit method performs the estimation. The argument to fit is the number of lags.

In [None]:
model = VAR(ddf)
results = model.fit(2)
results.summary()
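
To see what the estimation involves, it helps to note that each equation of a VAR is an ordinary least-squares regression of one variable on a constant and the lags of every variable. The sketch below is not the statsmodels implementation; it simulates a two-variable VAR(1) with made-up coefficients (`c_true` and `A_true` are assumptions for illustration) and recovers them with NumPy.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical "true" VAR(1): y_t = c + A @ y_{t-1} + e_t
c_true = np.array([0.1, 0.2])
A_true = np.array([[0.5, 0.1],
                   [0.2, 0.3]])

n_obs = 5000
y = np.zeros((n_obs, 2))
for t in range(1, n_obs):
    y[t] = c_true + A_true @ y[t - 1] + rng.normal(scale=0.1, size=2)

# Regressors: a constant and the first lag of both variables.
X = np.column_stack([np.ones(n_obs - 1), y[:-1]])
Y = y[1:]

# One least-squares solve estimates both equations at once;
# column j of `coef` holds the coefficients of equation j.
coef, *_ = np.linalg.lstsq(X, Y, rcond=None)
c_hat = coef[0]        # estimated intercepts
A_hat = coef[1:].T     # estimated lag matrix, rows = equations

print(np.round(c_hat, 2))
print(np.round(A_hat, 2))
```

With enough observations the estimates should sit close to the values used in the simulation.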

In [None]:
x=results.plot()

In [None]:
x=results.plot_acorr()

Lag selection in a VAR can be automated using an information criterion such as AIC or BIC. You choose the maximum number of lags to allow and the criterion to minimize. Remember that VARs use up degrees of freedom quickly: each extra lag adds one coefficient per variable to every equation, so be conscious of your lag length.

In [None]:
results = model.fit(maxlags=20, ic='aic')

In [None]:
results.summary()
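
The selection logic can be sketched by hand: fit each candidate lag length on the same sample, then pick the one with the lowest criterion. The example below uses one common form of the multivariate AIC (log determinant of the residual covariance plus a penalty on the number of coefficients) on simulated VAR(2) data. The helper `var_aic` and the matrices `A1` and `A2` are hypothetical names introduced for illustration.

```python
import numpy as np

def var_aic(y, p, max_lags):
    """AIC of a VAR(p) with a constant, fitted by OLS on a common sample."""
    n_obs, k = y.shape
    Y = y[max_lags:]  # same target rows for every candidate p
    lags = [y[max_lags - j : n_obs - j] for j in range(1, p + 1)]
    X = np.column_stack([np.ones(len(Y))] + lags)
    coef, *_ = np.linalg.lstsq(X, Y, rcond=None)
    resid = Y - X @ coef
    sigma = resid.T @ resid / len(Y)   # residual covariance (no dof correction)
    n_params = k * (k * p + 1)         # coefficients across all equations
    return np.log(np.linalg.det(sigma)) + 2.0 * n_params / len(Y)

# Simulate a two-variable VAR(2) with made-up coefficients.
rng = np.random.default_rng(1)
A1 = np.array([[0.4, 0.1], [0.0, 0.3]])
A2 = np.array([[0.3, 0.0], [0.1, 0.3]])
n_obs = 2000
y = np.zeros((n_obs, 2))
for t in range(2, n_obs):
    y[t] = A1 @ y[t - 1] + A2 @ y[t - 2] + rng.normal(size=2)

max_lags = 6
aic = {p: var_aic(y, p, max_lags) for p in range(1, max_lags + 1)}
best_lag = min(aic, key=aic.get)
print(best_lag)
```

Holding the estimation sample fixed across lag lengths keeps the criteria comparable. Note that the penalty grows with the square of the number of variables, which is why long lags become expensive quickly.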

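Vector autoregressions are widely used for forecasting. They have often performed as well as structural equation models and other types of forecasting models. The forecast method takes the last k_ar observations as starting values and the number of steps to forecast.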

In [None]:
lag_order= results.k_ar
forecast=results.forecast(ddf.values[-lag_order:], 12)
forecast= pd.DataFrame(forecast, columns=['CPI', 'PPI'])
forecast

In [None]:
plt.plot(forecast)

In [None]:
x=results.plot_forecast(12)
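
Behind a point forecast is a simple recursion: each forecast is fed back in as the lagged value for the next step. A minimal sketch for a VAR(1), with made-up coefficients and starting values (`forecast_var1` is a hypothetical helper, not a statsmodels function):

```python
import numpy as np

# Hypothetical VAR(1) coefficients and last observed values.
c = np.array([0.01, 0.02])
A = np.array([[0.5, 0.1],
              [0.2, 0.3]])
y_last = np.array([0.03, 0.05])

def forecast_var1(c, A, y_last, steps):
    """Iterate y_{t+1} = c + A @ y_t forward, with future shocks set to zero."""
    path = []
    y_prev = y_last
    for _ in range(steps):
        y_next = c + A @ y_prev
        path.append(y_next)
        y_prev = y_next
    return np.array(path)

path = forecast_var1(c, A, y_last, 12)

# For a stable VAR the forecasts converge to the unconditional mean.
long_run_mean = np.linalg.solve(np.eye(2) - A, c)

print(path[0])
print(path[-1], long_run_mean)
```

This is why long-horizon VAR forecasts flatten out: with no future shocks, the recursion pulls every variable back toward its long-run mean.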

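Impulse response functions allow you to shock the error term of one equation and see how that shock affects the other variables in the system. They are computed from the estimated coefficients of the underlying VAR model. With orth=False, as in the plots below, the shock is one unit; orthogonalized responses (orth=True) correspond to a one-standard-deviation shock.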

In [None]:
irf = results.irf(10)

In [None]:
x=irf.plot(orth=False)

In [None]:
x=irf.plot(impulse='pa')

You can also calculate the cumulative effect of the shock over a given number of periods.

In [None]:
x=irf.plot_cum_effects(orth=False)
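
For intuition, the responses of a VAR(1) can be written down directly: the response at horizon h to a unit shock is the coefficient matrix raised to the power h, and the cumulative effect is the running sum of those matrices. The sketch below uses made-up values for `A` and `sigma_u`, and the Cholesky step shows one standard way to orthogonalize the shocks.

```python
import numpy as np

# Hypothetical VAR(1) lag matrix and residual covariance.
A = np.array([[0.5, 0.1],
              [0.2, 0.3]])
sigma_u = np.array([[1.0, 0.5],
                    [0.5, 2.0]])

horizons = 10

# phi[h][i, j]: response of variable i at horizon h
# to a one-unit shock in variable j at horizon 0.
phi = np.array([np.linalg.matrix_power(A, h) for h in range(horizons + 1)])

# Orthogonalized responses: scale and rotate by the Cholesky factor,
# so that each shock is one standard deviation of an uncorrelated innovation.
P = np.linalg.cholesky(sigma_u)   # lower triangular, P @ P.T == sigma_u
theta = phi @ P

# Cumulative effects are running sums over horizons.
cumulative = phi.cumsum(axis=0)

print(phi[1])
print(cumulative[-1])
```

Because the Cholesky factor is lower triangular, the orthogonalized responses depend on the order of the variables in the model.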

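Granger causality tests assess whether lags of one variable have any incremental explanatory power for the other variable(s). Don't let the name fool you: this is not a test of causation. In test_causality, the first argument is the caused variable and the second lists the candidate causing variables.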

In [None]:
granger=results.test_causality('pcu', ['pa'], kind='f')

In [None]:
print(granger)

In [None]:
granger=results.test_causality('pa', ['pcu'], kind='f')

In [None]:
print(granger)
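
The idea behind the test can be sketched as a comparison of two regressions: one with only the variable's own lags (restricted) and one that adds lags of the other variable (unrestricted). The example below is a single-equation textbook F-test on simulated data, not the system-wide test that statsmodels runs. `granger_f` and `rss` are hypothetical helpers, and in the simulation x feeds into y but not the reverse.

```python
import numpy as np

def rss(X, y):
    """Residual sum of squares from an OLS fit of y on X."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    return float(resid @ resid)

def granger_f(caused, causing, p):
    """F-statistic for H0: the p lags of `causing` add nothing for `caused`."""
    n = len(caused)
    target = caused[p:]
    own = [caused[p - j : n - j] for j in range(1, p + 1)]
    other = [causing[p - j : n - j] for j in range(1, p + 1)]
    const = np.ones(n - p)
    X_restricted = np.column_stack([const] + own)
    X_unrestricted = np.column_stack([const] + own + other)
    rss_r = rss(X_restricted, target)
    rss_u = rss(X_unrestricted, target)
    dof = len(target) - X_unrestricted.shape[1]
    return ((rss_r - rss_u) / p) / (rss_u / dof)

# Simulated data: lagged x enters the y equation, but not the reverse.
rng = np.random.default_rng(2)
n_obs = 500
x = np.zeros(n_obs)
y = np.zeros(n_obs)
for t in range(1, n_obs):
    x[t] = 0.5 * x[t - 1] + rng.normal()
    y[t] = 0.3 * y[t - 1] + 0.5 * x[t - 1] + rng.normal()

f_x_to_y = granger_f(y, x, p=1)
f_y_to_x = granger_f(x, y, p=1)
print(f_x_to_y, f_y_to_x)
```

A large F-statistic means the extra lags reduce the residual variance by more than chance would suggest, which is a statement about predictive content only.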