Vector Autoregressions
## Vector Autoregression (VAR) Estimation

**Functions**

`tsa.VAR`

### Exercise 85
Download data on 10-year interest rates, 1-year interest rates and the
GDP deflator from FRED.

In [1]:
import pandas as pd

gs1 = pd.read_csv("./data/GS1.csv", parse_dates=True)
gs1 = gs1.set_index("DATE")
gs10 = pd.read_csv("./data/GS10.csv", parse_dates=True)
gs10 = gs10.set_index("DATE")
defl = pd.read_csv("./data/GDPDEF.csv", parse_dates=True)
defl = defl.set_index("DATE")

data = pd.concat([gs1, gs10, defl], axis=1)
data.columns = ["gs1", "gs10", "defl"]
data.head(6)

DATE,gs1,gs10,defl
1953-04-01,2.36,2.83,14.409
1953-05-01,2.48,3.05,
1953-06-01,2.45,3.11,
1953-07-01,2.38,2.93,14.47
1953-08-01,2.28,2.95,
1953-09-01,2.2,2.87,


#### Explanation

The data were downloaded from FRED and saved as CSV files. The
series are imported and merged into a single DataFrame. The output shows that
the deflator is quarterly while the interest rates are monthly, so the
deflator is missing in two of every three rows.
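
To see where the gaps come from, here is a minimal sketch with made-up numbers (the series names and values are illustrative assumptions, not FRED data). A column-wise `pd.concat` aligns on the union of the indexes, so the quarterly series is `NaN` in months where it has no observation.

```python
import pandas as pd

# Hypothetical monthly and quarterly series (illustrative values only)
monthly = pd.Series(
    [1.0, 2.0, 3.0, 4.0],
    index=pd.to_datetime(["2000-01-01", "2000-02-01", "2000-03-01", "2000-04-01"]),
    name="monthly",
)
quarterly = pd.Series(
    [10.0, 20.0],
    index=pd.to_datetime(["2000-01-01", "2000-04-01"]),
    name="quarterly",
)

# Column-wise concat uses the union of the two indexes (an outer join)
merged = pd.concat([monthly, quarterly], axis=1)
n_missing = int(merged["quarterly"].isna().sum())
print(merged)
print(f"Missing quarterly values: {n_missing}")
```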

In [2]:
data.index = pd.to_datetime(data.index)
data = data.resample("QE").mean().dropna()
data.head()

DATE,gs1,gs10,defl
1953-06-30,2.43,2.996667,14.409
1953-09-30,2.286667,2.916667,14.47
1953-12-31,1.706667,2.643333,14.497
1954-03-31,1.226667,2.44,14.543
1954-06-30,0.876667,2.346667,14.556


#### Explanation
We can use `resample` to convert all of the series to quarterly to match the
deflator. The mean is a reasonable way to aggregate the interest rates.
Because `mean` skips `NaN` by default, the quarterly mean of the deflator is
simply the single observation available in each quarter. The final `dropna`
removes any quarter in which one of the series has no observations.

### Exercise 86
Transform the GDP deflator into its log growth rate (i.e., $\Delta\ln\left(DEFL_t\right)$, where $DEFL_t$ is the deflator).

In [3]:
import numpy as np

log_defl = np.log(data.defl)
data["deflg"] = log_defl - log_defl.shift(1)
data = data.dropna()
data.head()

DATE,gs1,gs10,defl,deflg
1953-09-30,2.286667,2.916667,14.47,0.004225
1953-12-31,1.706667,2.643333,14.497,0.001864
1954-03-31,1.226667,2.44,14.543,0.003168
1954-06-30,0.876667,2.346667,14.556,0.000894
1954-09-30,0.916667,2.346667,14.575,0.001304


#### Explanation

Here we use `np.log` and `shift` to implement the log difference.
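
As a small check on a made-up index (the values are illustrative assumptions), the sketch below shows that subtracting the shifted log series matches the `diff` shorthand, and that the log difference is close to, but not the same as, the simple growth rate from `pct_change`.

```python
import numpy as np
import pandas as pd

# Hypothetical price index growing 1% per period
index_level = pd.Series([100.0, 101.0, 102.01])

log_level = np.log(index_level)
via_shift = log_level - log_level.shift(1)  # the approach used above
via_diff = log_level.diff()                 # equivalent shorthand
simple_growth = index_level.pct_change()    # simple (not log) growth rate

print(
    pd.DataFrame(
        {"shift": via_shift, "diff": via_diff, "pct_change": simple_growth}
    )
)
```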

### Exercise 87
Estimate a first-order VAR on the spread between the 10-year and 1-year
rates (spread), the 1-year rate, and the growth rate of the GDP deflator.

In [4]:
import statsmodels.tsa.api as tsa

data["spread"] = data.gs10 - data.gs1
# Save for later
data.to_hdf("./data/var-data.h5", key="var_data")

#### Explanation

The spread is constructed as the difference between the 10-year and 1-year
rates, and the data are saved for use in other exercises.

In [5]:
mod = tsa.VAR(data[["spread", "gs1", "deflg"]])
res = mod.fit(1, trend="c", ic=None)

res.summary()

  Summary of Regression Results   
Model:                         VAR
Method:                        OLS
Date:           Wed, 27, Aug, 2025
Time:                     15:13:40
--------------------------------------------------------------------
No. of Equations:         3.00000    BIC:                   -15.0871
Nobs:                     265.000    HQIC:                  -15.1840
Log likelihood:           904.460    FPE:                2.38434e-07
AIC:                     -15.2492    Det(Omega_mle):     2.27955e-07
--------------------------------------------------------------------
Results for equation spread
               coefficient       std. error           t-stat            prob
----------------------------------------------------------------------------
const             0.055526         0.072753            0.763           0.445
L1.spread         0.926595         0.029050           31.896           0.000
L1.gs1            0.017248         0.011502            1.500           0.13

#### Explanation

A VAR model is specified using `tsa.VAR`. The only required input is the data.
We do not add an intercept column to the data since the constant is supplied
through the `trend` argument of `fit`, where "c" indicates a constant. We set
the maximum lag to 1 and `ic` to `None` to force a VAR(1) to be estimated. If we
do not set `ic` to `None`, statsmodels performs a lag length search over
lags 0, 1, ..., `maxlags` (0 or 1 in this specification).
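
To make the estimation step concrete, here is a minimal sketch of a VAR(1) with a constant fit by least squares on simulated data. It is not statsmodels' implementation; the true coefficients, sample size, and variable names are assumptions chosen for illustration. The coefficient array is arranged like `res.params`: the first row is the constant, the remaining rows are lags, and each column is one equation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data-generating process: y_t = c + Phi @ y_{t-1} + e_t
phi_true = np.array([[0.5, 0.1], [0.0, 0.3]])
const_true = np.array([1.0, -0.5])
T = 5000

y = np.zeros((T, 2))
for t in range(1, T):
    y[t] = const_true + phi_true @ y[t - 1] + rng.standard_normal(2)

# Regress y_t on a constant and y_{t-1}; all equations share the same regressors
X = np.column_stack([np.ones(T - 1), y[:-1]])
Y = y[1:]
B, *_ = np.linalg.lstsq(X, Y, rcond=None)

const_hat = B[0]   # first row: constants, one per equation
phi_hat = B[1:].T  # transpose so that rows index equations, as in Phi
print(const_hat)
print(phi_hat)
```

With a sample this long, the estimates should land close to the values used in the simulation.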

### Exercise 88
What are the _own_ effects?


In [6]:
res.params

,spread,gs1,deflg
const,0.055526,0.026191,0.00121
L1.spread,0.926595,0.006413,-0.000299
L1.gs1,0.017248,0.944286,0.000158
L1.deflg,-8.881401,30.073181,0.78296


In [7]:
own_effects = {}
for var in res.params:
    own_effects[var] = res.params.loc[f"L1.{var}", var]
pd.DataFrame(pd.Series(own_effects, name="Own Effect"))

,Own Effect
spread,0.926595
gs1,0.944286
deflg,0.78296


#### Explanation

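The parameters are a `DataFrame` where the columns are the leads (the
left-hand-side variable of each equation) and the rows are lag or trend terms.
The own effect of a variable is the coefficient on its own first lag, found in
row `L1.{var}` and column `var`.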
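
An alternative to the loop, sketched below using the coefficients printed above, is to drop the constant row and read the own effects off the diagonal of the remaining block. This assumes the lag rows appear in the same order as the columns, which holds for this VAR(1).

```python
import numpy as np
import pandas as pd

# Coefficients copied from the res.params output above
params = pd.DataFrame(
    [
        [0.055526, 0.026191, 0.00121],
        [0.926595, 0.006413, -0.000299],
        [0.017248, 0.944286, 0.000158],
        [-8.881401, 30.073181, 0.78296],
    ],
    index=["const", "L1.spread", "L1.gs1", "L1.deflg"],
    columns=["spread", "gs1", "deflg"],
)

# Drop the constant; the diagonal of the remaining 3 x 3 block holds the own effects
lag_block = params.iloc[1:].to_numpy()
own = pd.Series(np.diag(lag_block), index=params.columns, name="Own Effect")
print(own)
```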

### Exercise 89
What are the cross effects between the variables?

In [8]:
other_effects = {}
for var in res.params:
    for other in res.params:
        if other == var:
            continue
        other_effects[(var, other)] = res.params.loc[f"L1.{other}", var]

s = pd.Series(other_effects, name="effect")
s.index = s.index.set_names(["lead", "lag"])
pd.DataFrame(s)

lead,lag,effect
spread,gs1,0.017248
spread,deflg,-8.881401
gs1,spread,0.006413
gs1,deflg,30.073181
deflg,spread,-0.000299
deflg,gs1,0.000158


#### Explanation

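These effects are hard to interpret since the series are measured in different
units and have not been standardized to have the same variance. For example,
the large coefficients on `L1.deflg` (-8.88 and 30.07) mainly reflect that
quarterly deflator growth takes values far smaller than the interest rates.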
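
The dependence on units can be illustrated with a single simulated regression (all numbers and names below are assumptions for illustration). Expressing the regressor in percent rather than as a decimal divides the slope by 100, while standardizing both series yields a unit-free slope.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical regressor measured as a decimal (e.g. 0.004 for 0.4%)
x = 0.005 * rng.standard_normal(500)
y = 30.0 * x + 0.1 * rng.standard_normal(500)


def slope(regressor, target):
    """OLS slope from a regression of target on a constant and regressor."""
    X = np.column_stack([np.ones_like(regressor), regressor])
    beta, *_ = np.linalg.lstsq(X, target, rcond=None)
    return beta[1]


b_decimal = slope(x, y)                       # x in decimal units
b_percent = slope(100 * x, y)                 # same x expressed in percent
b_standard = slope(x / x.std(), y / y.std())  # both series standardized

print(b_decimal, b_percent, b_standard)
```

The standardized slope equals the raw slope multiplied by the ratio of the regressor's standard deviation to the target's.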

### Exercise 90
How could you get a sense of the persistence of this system?

In [9]:
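# Exclude the constant. Rows are lags and columns are equations, so phi is the
# transpose of the usual VAR(1) coefficient matrix; eigenvalues are unaffected.
phi = res.params.iloc[1:]
evals = np.linalg.eigvals(phi)
print(f"The maximum eigenval is {np.max(np.abs(evals))}")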

The maximum eigenval is 0.9565826582275834


#### Explanation

The largest eigenvalue modulus of the VAR(1) coefficient matrix measures the
persistence of the model. A stationary VAR requires this value to be below 1.
Here it is about 0.957, so the system is stationary but highly persistent.
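
One rough way to translate the eigenvalue into time units is a half-life, sketched below with the rounded coefficients printed earlier. Treating the largest modulus as if it alone governed the decay of a shock is a simplifying assumption, so the result is only indicative.

```python
import numpy as np

# Lag coefficients copied from the res.params output above
# (rows: L1.spread, L1.gs1, L1.deflg; columns: spread, gs1, deflg equations)
lag_block = np.array(
    [
        [0.926595, 0.006413, -0.000299],
        [0.017248, 0.944286, 0.000158],
        [-8.881401, 30.073181, 0.78296],
    ]
)

# A matrix and its transpose share eigenvalues, so the row/column layout is irrelevant
moduli = np.abs(np.linalg.eigvals(lag_block))
moduli_t = np.abs(np.linalg.eigvals(lag_block.T))
max_modulus = moduli.max()

# Rough half-life: the number of quarters h that solves max_modulus ** h == 0.5
half_life = np.log(0.5) / np.log(max_modulus)
print(f"Largest modulus: {max_modulus:.4f}, half-life: {half_life:.1f} quarters")
```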

In [10]:
std_data = data / data.std()
mod = tsa.VAR(std_data[["spread", "gs1", "deflg"]])
res = mod.fit(1, trend="c", ic=None)

res.params.iloc[1:]

,spread,gs1,deflg
L1.spread,0.926595,0.002092,-0.056719
L1.gs1,0.052879,0.944286,0.092085
L1.deflg,-0.046769,0.051655,0.78296


In [11]:
phi = res.params.iloc[1:]
evals = np.linalg.eigvals(phi)
print(f"The maximum eigenval is {np.max(np.abs(evals))}")

The maximum eigenval is 0.9565826582275723


#### Explanation

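We repeat the exercise using data standardized by their standard deviations.
While the cross coefficients change, the own effects and the eigenvalues are
unaffected: standardizing rescales each coefficient by a ratio of standard
deviations, which is a similarity transformation of the coefficient matrix.
The coefficients are now directly interpretable as the effect of a 1 standard
deviation change in each variable.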
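
The invariance of the eigenvalues can be checked directly without re-estimating the model. In the sketch below, the lag coefficients are the rounded values from the unstandardized fit, and the standard deviations in `scale` are hypothetical values chosen for illustration, not the sample standard deviations.

```python
import numpy as np

# Lag coefficients from the unstandardized model
# (rows: L1.spread, L1.gs1, L1.deflg; columns: spread, gs1, deflg equations)
lag_block = np.array(
    [
        [0.926595, 0.006413, -0.000299],
        [0.017248, 0.944286, 0.000158],
        [-8.881401, 30.073181, 0.78296],
    ]
)

# Hypothetical standard deviations for spread, gs1 and deflg (illustrative only)
scale = np.array([1.0, 3.0, 0.005])

# In this layout the entry (lag i, equation j) becomes coef * scale[i] / scale[j]
D = np.diag(scale)
lag_block_std = D @ lag_block @ np.linalg.inv(D)

eig_raw = np.sort(np.abs(np.linalg.eigvals(lag_block)))
eig_std = np.sort(np.abs(np.linalg.eigvals(lag_block_std)))
print(eig_raw)
print(eig_std)
```

The diagonal entries are multiplied by `scale[i] / scale[i] = 1`, which is why the own effects do not change.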