[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/danpele/Time-Series-Analysis/blob/main/chapter5_seminar_notebook.ipynb)

---

# Chapter 5 Seminar: VAR Models & Granger Causality - Practice Exercises

**Course:** Time Series Analysis and Forecasting  
**Program:** Bachelor program, Faculty of Cybernetics, Statistics and Economic Informatics, Bucharest University of Economic Studies, Romania  
**Academic Year:** 2025-2026

---

## Seminar Objectives

1. Understand Vector Autoregression (VAR) model structure
2. Estimate and interpret VAR models in Python
3. Conduct Granger causality tests
4. Compute and interpret Impulse Response Functions (IRF)
5. Analyze Forecast Error Variance Decomposition (FEVD)
6. Apply VAR to a macroeconomic example (GDP growth and unemployment, simulated in this notebook)

## Setup

In [None]:
# Core libraries
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import warnings
warnings.filterwarnings('ignore')

# VAR modeling
from statsmodels.tsa.api import VAR
from statsmodels.tsa.stattools import adfuller, grangercausalitytests
from statsmodels.tsa.vector_ar.vecm import coint_johansen
from statsmodels.graphics.tsaplots import plot_acf, plot_pacf
from scipy import stats

# Plotting style
plt.rcParams['figure.figsize'] = (12, 5)
plt.rcParams['font.size'] = 11
plt.rcParams['axes.facecolor'] = 'none'
plt.rcParams['figure.facecolor'] = 'none'
plt.rcParams['axes.grid'] = False
plt.rcParams['axes.spines.top'] = False
plt.rcParams['axes.spines.right'] = False

COLORS = {'blue': '#1A3A6E', 'red': '#DC3545', 'green': '#2E7D32', 'orange': '#E67E22'}

print("Setup complete!")

---
# Part 1: Multiple Choice Quiz (20 Questions)

Answer the following questions about VAR models and Granger causality. Run each cell after entering your answer to check if you're correct.

### Quiz 1: VAR Definition

**Question:** In a VAR(2) model with 3 variables, how many coefficient matrices $\mathbf{A}_i$ are there?

- A) 2
- B) 3
- C) 6
- D) 9

In [None]:
# Enter your answer: 'A', 'B', 'C', or 'D'
quiz1_answer = ''  # <-- Enter your answer here

# Check answer
if quiz1_answer.upper() == 'A':
    print("✓ CORRECT!")
    print("VAR(p) has p coefficient matrices, one for each lag.")
    print("VAR(2) has A₁ and A₂, each of size K×K where K=3.")
elif quiz1_answer:
    print("✗ Incorrect. Try again!")
    print("Hint: The lag order p determines the number of coefficient matrices.")

### Quiz 2: Number of Parameters

**Question:** A VAR(2) with K=3 variables and a constant in each equation has how many parameters per equation?

- A) 3
- B) 6
- C) 7
- D) 9

In [None]:
# Enter your answer: 'A', 'B', 'C', or 'D'
quiz2_answer = ''  # <-- Enter your answer here

# Check answer
if quiz2_answer.upper() == 'C':
    print("✓ CORRECT!")
    print("Per equation: 1 (constant) + K × p = 1 + 3 × 2 = 7 parameters")
    print("Total system: K × (1 + Kp) = 3 × 7 = 21 parameters")
elif quiz2_answer:
    print("✗ Incorrect. Try again!")
    print("Hint: Each equation has a constant plus K coefficients for each of p lags.")
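
The counting rule from Quizzes 1 and 2 can be checked with a few lines of code. This is a minimal sketch; the helper name `var_param_count` is hypothetical and not part of any library, and the count covers mean parameters only (not the error covariance matrix).

```python
def var_param_count(K, p, include_constant=True):
    """Return (parameters per equation, parameters in the whole system)."""
    per_equation = K * p + (1 if include_constant else 0)
    return per_equation, K * per_equation

per_eq, total = var_param_count(K=3, p=2)
print(f"VAR(2), K=3: {per_eq} per equation, {total} in total")

# How quickly the count grows with K for a fixed lag order p = 4
for K in (2, 4, 6, 8):
    per_eq_k, total_k = var_param_count(K, p=4)
    print(f"K={K}: {per_eq_k:>3} per equation, {total_k:>4} total")
```

The total grows with K², which is why lag order and system size have to be chosen carefully in small samples.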

### Quiz 3: Granger Causality Definition

**Question:** "X Granger-causes Y" means:

- A) X is the true economic cause of Y
- B) Past values of X help predict future Y beyond Y's own past
- C) X and Y are contemporaneously correlated
- D) X always moves before Y

In [None]:
# Enter your answer: 'A', 'B', 'C', or 'D'
quiz3_answer = ''  # <-- Enter your answer here

# Check answer
if quiz3_answer.upper() == 'B':
    print("✓ CORRECT!")
    print("Granger causality is about PREDICTIVE content, not true causation.")
    print("X Granger-causes Y if lagged X improves forecasts of Y,")
    print("beyond what Y's own history provides.")
elif quiz3_answer:
    print("✗ Incorrect. Try again!")
    print("Hint: Granger causality is a statistical concept about prediction.")

### Quiz 4: Granger Causality Test

**Question:** To test if $Y_2$ Granger-causes $Y_1$ in a VAR(p), we test:

- A) All coefficients in the $Y_1$ equation equal zero
- B) Coefficients on lagged $Y_2$ in the $Y_1$ equation equal zero
- C) Coefficients on lagged $Y_1$ in the $Y_2$ equation equal zero
- D) The error covariance equals zero

In [None]:
# Enter your answer: 'A', 'B', 'C', or 'D'
quiz4_answer = ''  # <-- Enter your answer here

# Check answer
if quiz4_answer.upper() == 'B':
    print("✓ CORRECT!")
    print("H₀: all coefficients on lagged Y₂ in the Y₁ equation = 0")
    print("If we reject H₀, then Y₂ Granger-causes Y₁.")
    print("This is an F-test with p restrictions.")
elif quiz4_answer:
    print("✗ Incorrect. Try again!")
    print("Hint: We test whether Y₂'s lags have predictive power for Y₁.")
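
To make the restriction test concrete, the sketch below computes the F-statistic by hand for a single lag, comparing a regression of Y₁ on its own lag with one that also includes lagged Y₂. The data-generating matrix `A_demo`, the seed, and the helper `ssr` are assumptions chosen for illustration; the statistic should correspond to the `ssr_ftest` entry reported by `grangercausalitytests`.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
T, p = 300, 1
A_demo = np.array([[0.5, 0.4],
                   [0.0, 0.5]])   # illustrative: Y2 feeds into Y1, not the reverse
y = np.zeros((T, 2))
for t in range(1, T):
    y[t] = A_demo @ y[t - 1] + rng.standard_normal(2)

def ssr(X, target):
    """Sum of squared OLS residuals from regressing target on X."""
    beta, *_ = np.linalg.lstsq(X, target, rcond=None)
    resid = target - X @ beta
    return resid @ resid

target = y[p:, 0]                                   # Y1 at time t
const = np.ones(T - p)
X_restricted = np.column_stack([const, y[:-p, 0]])               # own lag only
X_unrestricted = np.column_stack([const, y[:-p, 0], y[:-p, 1]])  # plus lagged Y2

ssr_r = ssr(X_restricted, target)
ssr_u = ssr(X_unrestricted, target)
n_obs, n_params = X_unrestricted.shape
df_resid = n_obs - n_params

# F = [(SSR_r - SSR_u) / p] / [SSR_u / df_resid], with p restrictions
F_stat = ((ssr_r - ssr_u) / p) / (ssr_u / df_resid)
p_value = stats.f.sf(F_stat, p, df_resid)
print(f"F = {F_stat:.2f}, p-value = {p_value:.4g}")
```

A small p-value rejects H₀, meaning lagged Y₂ improves the prediction of Y₁ beyond Y₁'s own past.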

### Quiz 5: VAR Stability

**Question:** A VAR(1) model is stable (stationary) if:

- A) All diagonal elements of $\mathbf{A}_1$ are less than 1
- B) The determinant of $\mathbf{A}_1$ is less than 1
- C) All eigenvalues of $\mathbf{A}_1$ are less than 1 in absolute value
- D) The trace of $\mathbf{A}_1$ equals zero

In [None]:
# Enter your answer: 'A', 'B', 'C', or 'D'
quiz5_answer = ''  # <-- Enter your answer here

# Check answer
if quiz5_answer.upper() == 'C':
    print("✓ CORRECT!")
    print("Stability requires all eigenvalues λᵢ to satisfy |λᵢ| < 1.")
    print("This ensures shocks die out over time.")
    print("Geometrically: all eigenvalues inside the unit circle.")
elif quiz5_answer:
    print("✗ Incorrect. Try again!")
    print("Hint: Stability is determined by the eigenvalues of the coefficient matrix.")
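
The example below illustrates why options A and B are not sufficient. The matrix `A_unstable` is an assumed counterexample: its diagonal entries and its determinant are below 1, yet one eigenvalue lies outside the unit circle. The helper `is_stable` is a hypothetical name.

```python
import numpy as np

def is_stable(A1):
    """True if every eigenvalue of the VAR(1) matrix lies inside the unit circle."""
    return bool(np.all(np.abs(np.linalg.eigvals(A1)) < 1))

A_stable = np.array([[0.7, 0.2],
                     [0.1, 0.5]])
# Diagonal entries < 1 and det = 0.56 < 1, but the eigenvalues are 1.4 and 0.4
A_unstable = np.array([[0.9, 0.5],
                       [0.5, 0.9]])

for name, mat in [("A_stable", A_stable), ("A_unstable", A_unstable)]:
    moduli = np.abs(np.linalg.eigvals(mat))
    print(f"{name}: det={np.linalg.det(mat):.2f}, moduli={np.round(moduli, 3)}, "
          f"stable={is_stable(mat)}")
```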

### Quiz 6: Impulse Response Functions

**Question:** An Impulse Response Function (IRF) shows:

- A) The correlation between two variables
- B) The effect of a shock to one variable on all variables over time
- C) The forecast accuracy of the model
- D) The optimal lag order

In [None]:
# Enter your answer: 'A', 'B', 'C', or 'D'
quiz6_answer = ''  # <-- Enter your answer here

# Check answer
if quiz6_answer.upper() == 'B':
    print("✓ CORRECT!")
    print("IRF_{ij}(h) = response of variable i at horizon h")
    print("to a one-unit shock in variable j at time 0.")
    print("It traces how shocks propagate through the system.")
elif quiz6_answer:
    print("✗ Incorrect. Try again!")
    print("Hint: 'Impulse' refers to a shock, 'Response' is how variables react.")
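
For a VAR(1), the (non-orthogonalized) impulse response at horizon h is simply the matrix power A₁ʰ. The sketch below builds the responses by hand with NumPy; the coefficient matrix is the one used later in Exercise 1, and the array name `irf_manual` is an assumption for illustration.

```python
import numpy as np

A1 = np.array([[0.7, 0.2],
               [0.1, 0.5]])
n_horizons = 21

# irf_manual[h, i, j]: response of variable i at horizon h
# to a one-unit shock in variable j at time 0
irf_manual = np.stack([np.linalg.matrix_power(A1, h) for h in range(n_horizons)])

for h in (0, 1, 5, 20):
    print(f"h={h:>2}:\n{np.round(irf_manual[h], 4)}")
```

At h=0 the response is the identity matrix (each shock moves only its own variable), at h=1 it equals A₁, and in a stable system the entries shrink toward zero as h grows.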

### Quiz 7: Lag Order Selection

**Question:** Which criterion typically selects the most parsimonious VAR model?

- A) AIC (Akaike Information Criterion)
- B) BIC (Bayesian Information Criterion)
- C) FPE (Final Prediction Error)
- D) Log-likelihood

In [None]:
# Enter your answer: 'A', 'B', 'C', or 'D'
quiz7_answer = ''  # <-- Enter your answer here

# Check answer
if quiz7_answer.upper() == 'B':
    print("✓ CORRECT!")
    print("BIC penalty: k·log(n) vs AIC penalty: 2k")
    print("For n ≥ 8: log(n) > 2 (natural log), so BIC penalizes more.")
    print("BIC is consistent; AIC tends to overfit in large samples.")
elif quiz7_answer:
    print("✗ Incorrect. Try again!")
    print("Hint: Compare the penalty terms: 2k vs k·log(n).")
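
The two penalty terms can be compared numerically. This is a simplified sketch using the textbook forms 2k and k·log(n); software packages may normalize the criteria differently, and the helper `penalties` is a hypothetical name.

```python
import numpy as np

def penalties(k, n):
    """Return (AIC penalty, BIC penalty) for k parameters and n observations."""
    return 2 * k, k * np.log(n)

k = 7  # parameters per equation in the VAR(2), K=3 example
for n in (5, 7, 8, 50, 200):
    aic_pen, bic_pen = penalties(k, n)
    heavier = "BIC" if bic_pen > aic_pen else "AIC"
    print(f"n={n:>4}: AIC penalty={aic_pen:>5.1f}, "
          f"BIC penalty={bic_pen:>5.1f} -> heavier: {heavier}")
```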

### Quiz 8: FEVD Interpretation

**Question:** Forecast Error Variance Decomposition (FEVD) tells us:

- A) The correlation between variables
- B) What proportion of forecast error variance comes from each shock
- C) The optimal forecast horizon
- D) Whether residuals are white noise

In [None]:
# Enter your answer: 'A', 'B', 'C', or 'D'
quiz8_answer = ''  # <-- Enter your answer here

# Check answer
if quiz8_answer.upper() == 'B':
    print("✓ CORRECT!")
    print("FEVD decomposes the h-step forecast error variance.")
    print("It shows what % of uncertainty comes from each shock.")
    print("Useful for understanding relative importance of shocks.")
elif quiz8_answer:
    print("✗ Incorrect. Try again!")
    print("Hint: FEVD partitions forecast uncertainty by source.")

### Quiz 9: Cholesky Ordering

**Question:** Cholesky ordering in IRF analysis assumes:

- A) All variables are equally important
- B) Variables ordered first affect later variables contemporaneously, not vice versa
- C) Shocks are perfectly correlated
- D) No lag structure exists

In [None]:
# Enter your answer: 'A', 'B', 'C', or 'D'
quiz9_answer = ''  # <-- Enter your answer here

# Check answer
if quiz9_answer.upper() == 'B':
    print("✓ CORRECT!")
    print("Cholesky imposes a recursive (triangular) structure.")
    print("First variable: affects all contemporaneously")
    print("Last variable: affected by all, affects none same-period")
    print("Economic justification required for ordering!")
elif quiz9_answer:
    print("✗ Incorrect. Try again!")
    print("Hint: Cholesky creates a causal ordering for contemporaneous effects.")
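
The recursive structure can be seen directly in the Cholesky factor of the residual covariance matrix. In the sketch below, `Sigma` is a made-up covariance matrix, and the reversed ordering is obtained by permuting, factoring, and permuting back.

```python
import numpy as np

Sigma = np.array([[1.0, 0.5],
                  [0.5, 2.0]])   # hypothetical residual covariance

# Ordering (Y1, Y2): lower-triangular impact matrix with Sigma = P @ P.T
P = np.linalg.cholesky(Sigma)

# Ordering (Y2, Y1): permute, factor, then map back to the original order
perm = [1, 0]
P_rev = np.linalg.cholesky(Sigma[np.ix_(perm, perm)])[np.ix_(perm, perm)]

print("Impact matrix, Y1 ordered first:\n", np.round(P, 3))
print("Impact matrix, Y2 ordered first:\n", np.round(P_rev, 3))
```

With Y₁ first, the zero in position (1, 2) means Y₁ does not react to the Y₂ shock within the period. Reversing the ordering moves the zero to position (2, 1), so the same covariance matrix yields different orthogonalized shocks.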

### Quiz 10: Cointegration and VAR

**Question:** If variables are I(1) and cointegrated, you should use:

- A) VAR in levels
- B) VAR in first differences
- C) Vector Error Correction Model (VECM)
- D) Separate univariate models

In [None]:
# Enter your answer: 'A', 'B', 'C', or 'D'
quiz10_answer = ''  # <-- Enter your answer here

# Check answer
if quiz10_answer.upper() == 'C':
    print("✓ CORRECT!")
    print("VECM = VAR in differences + error correction term")
    print("It captures both short-run dynamics and long-run equilibrium.")
    print("VAR in differences loses long-run information;")
    print("VAR in levels may be inefficient.")
elif quiz10_answer:
    print("✗ Incorrect. Try again!")
    print("Hint: Cointegration implies a long-run relationship that should be modeled.")

### Quiz 11: VAR Residuals

**Question:** In a well-specified VAR, residuals should be:

- A) Autocorrelated but homoskedastic
- B) Serially uncorrelated (white noise)
- C) Perfectly normally distributed
- D) Zero for all observations

In [None]:
# Enter your answer: 'A', 'B', 'C', or 'D'
quiz11_answer = ''  # <-- Enter your answer here

# Check answer
if quiz11_answer.upper() == 'B':
    print("✓ CORRECT!")
    print("Residuals should have no autocorrelation (white noise).")
    print("Cross-equation correlation is allowed (captured in Σ).")
    print("Use Portmanteau or LM tests to check.")
elif quiz11_answer:
    print("✗ Incorrect. Try again!")
    print("Hint: The model should capture all serial dependence.")

### Quiz 12: Structural VAR (SVAR)

**Question:** The main difference between SVAR and reduced-form VAR is:

- A) SVAR uses more lags
- B) SVAR allows contemporaneous effects with economic interpretation
- C) SVAR requires more data
- D) SVAR cannot be used for forecasting

In [None]:
# Enter your answer: 'A', 'B', 'C', or 'D'
quiz12_answer = ''  # <-- Enter your answer here

# Check answer
if quiz12_answer.upper() == 'B':
    print("✓ CORRECT!")
    print("Reduced-form: shocks are correlated, no structural interpretation")
    print("SVAR: imposes identifying restrictions to recover")
    print("orthogonal structural shocks (e.g., 'monetary policy shock')")
elif quiz12_answer:
    print("✗ Incorrect. Try again!")
    print("Hint: 'Structural' refers to economic structure and identification.")

### Quiz 13: Granger Causality vs True Causality

**Question:** Finding that X Granger-causes Y means:

- A) X definitely causes Y in an economic sense
- B) X has predictive power for Y, but may not be true causation
- C) Y causes X
- D) X and Y are unrelated

In [None]:
# Enter your answer: 'A', 'B', 'C', or 'D'
quiz13_answer = ''  # <-- Enter your answer here

# Check answer
if quiz13_answer.upper() == 'B':
    print("✓ CORRECT!")
    print("Granger causality ≠ true causation!")
    print("Limitations: omitted variables, anticipation effects, timing.")
    print("Example: stock prices 'Granger-cause' earnings (anticipation).")
elif quiz13_answer:
    print("✗ Incorrect. Try again!")
    print("Hint: Granger causality is about prediction, not mechanism.")

### Quiz 14: VAR Estimation

**Question:** VAR models can be estimated by:

- A) OLS on each equation separately
- B) Only by maximum likelihood
- C) Only by Bayesian methods
- D) Weighted least squares only

In [None]:
# Enter your answer: 'A', 'B', 'C', or 'D'
quiz14_answer = ''  # <-- Enter your answer here

# Check answer
if quiz14_answer.upper() == 'A':
    print("✓ CORRECT!")
    print("With the same regressors in each equation:")
    print("OLS = GLS = MLE (under Gaussian errors)")
    print("This is efficient and easy to implement.")
elif quiz14_answer:
    print("✗ Incorrect. Try again!")
    print("Hint: Same regressors in each equation leads to a nice property.")

### Quiz 15: IRF Convergence

**Question:** In a stable VAR, impulse responses as h → ∞:

- A) Explode to infinity
- B) Converge to zero
- C) Oscillate forever
- D) Stay constant

In [None]:
# Enter your answer: 'A', 'B', 'C', or 'D'
quiz15_answer = ''  # <-- Enter your answer here

# Check answer
if quiz15_answer.upper() == 'B':
    print("✓ CORRECT!")
    print("Stability means |λᵢ| < 1 for all eigenvalues.")
    print("This ensures Aʰ → 0 as h → ∞.")
    print("Shocks have transitory effects that eventually die out.")
elif quiz15_answer:
    print("✗ Incorrect. Try again!")
    print("Hint: Stability means shocks don't have permanent effects.")

### Quiz 16: Bidirectional Causality

**Question:** If both "X Granger-causes Y" and "Y Granger-causes X", this is called:

- A) No causality
- B) Unidirectional causality
- C) Bidirectional (feedback) causality
- D) Instantaneous causality

In [None]:
# Enter your answer: 'A', 'B', 'C', or 'D'
quiz16_answer = ''  # <-- Enter your answer here

# Check answer
if quiz16_answer.upper() == 'C':
    print("✓ CORRECT!")
    print("Bidirectional: X ↔ Y (feedback)")
    print("Each variable helps predict the other.")
    print("Common in financial markets: prices ↔ volume")
elif quiz16_answer:
    print("✗ Incorrect. Try again!")
    print("Hint: Both directions show predictive power.")

### Quiz 17: VAR Companion Form

**Question:** The companion matrix of a VAR(p) is used to:

- A) Reduce the number of parameters
- B) Convert VAR(p) to VAR(1) form for analysis
- C) Estimate the model
- D) Test for Granger causality

In [None]:
# Enter your answer: 'A', 'B', 'C', or 'D'
quiz17_answer = ''  # <-- Enter your answer here

# Check answer
if quiz17_answer.upper() == 'B':
    print("✓ CORRECT!")
    print("Any VAR(p) can be written as a VAR(1) in companion form.")
    print("This makes stability analysis easier.")
    print("Eigenvalues of companion matrix determine stability.")
elif quiz17_answer:
    print("✗ Incorrect. Try again!")
    print("Hint: The companion form is a rewriting, not a simplification.")
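
A short sketch of the companion form for a VAR(2): the coefficient matrices are stacked in the top block row and an identity block shifts the lags down. The helper `companion_matrix` and the example matrices `A1`, `A2` are hypothetical.

```python
import numpy as np

def companion_matrix(coef_matrices):
    """Stack VAR(p) matrices A_1..A_p into the (Kp x Kp) companion matrix."""
    p = len(coef_matrices)
    K = coef_matrices[0].shape[0]
    top = np.hstack(coef_matrices)                     # [A_1 A_2 ... A_p]
    if p == 1:
        return top
    bottom = np.hstack([np.eye(K * (p - 1)),           # identity shifts lags down
                        np.zeros((K * (p - 1), K))])
    return np.vstack([top, bottom])

A1 = np.array([[0.5, 0.1],
               [0.0, 0.4]])
A2 = np.array([[0.2, 0.0],
               [0.1, 0.2]])

F = companion_matrix([A1, A2])
moduli = np.abs(np.linalg.eigvals(F))
print("Companion matrix:\n", F)
print("Eigenvalue moduli:", np.round(moduli, 3))
print("Stable:", bool(np.all(moduli < 1)))
```

The VAR(2) is stable exactly when all eigenvalues of the 4×4 matrix `F` lie inside the unit circle, which is the same check used for a VAR(1).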

### Quiz 18: FEVD at Short vs Long Horizons

**Question:** In FEVD, as the horizon h increases:

- A) Own shocks always dominate
- B) The proportions converge to long-run values
- C) All shocks contribute equally
- D) FEVD becomes undefined

In [None]:
# Enter your answer: 'A', 'B', 'C', or 'D'
quiz18_answer = ''  # <-- Enter your answer here

# Check answer
if quiz18_answer.upper() == 'B':
    print("✓ CORRECT!")
    print("FEVD proportions stabilize at long horizons.")
    print("Short-run: own shocks often dominate.")
    print("Long-run: shows ultimate importance of each shock.")
elif quiz18_answer:
    print("✗ Incorrect. Try again!")
    print("Hint: Think about what happens as we forecast further ahead.")
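
The convergence can be reproduced by hand for a VAR(1). The sketch below accumulates squared orthogonalized responses and normalizes each row; `Sigma` is a made-up residual covariance and `fevd_var1` is a hypothetical helper, not a library function.

```python
import numpy as np

A1 = np.array([[0.7, 0.2],
               [0.1, 0.5]])
Sigma = np.array([[1.0, 0.3],
                  [0.3, 1.0]])   # hypothetical residual covariance
P = np.linalg.cholesky(Sigma)    # Cholesky ordering: variable 1 first

def fevd_var1(A1, P, horizon):
    """FEVD shares [variable, shock] for the h-step forecast of a VAR(1)."""
    K = A1.shape[0]
    contrib = np.zeros((K, K))
    for s in range(horizon):
        theta = np.linalg.matrix_power(A1, s) @ P   # orthogonalized response at lag s
        contrib += theta ** 2
    # Each row sums to the forecast error variance of that variable
    return contrib / contrib.sum(axis=1, keepdims=True)

for h in (1, 4, 12, 50):
    print(f"h={h:>2}:\n{np.round(fevd_var1(A1, P, h) * 100, 1)}")
```

At h=1 the first variable's variance is attributed entirely to its own shock, a direct consequence of the ordering, and the shares settle at their long-run values as h increases.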

### Quiz 19: Instantaneous Causality

**Question:** Instantaneous causality tests whether:

- A) Lagged X predicts Y
- B) Shocks to X and Y are correlated within the same period
- C) X and Y have the same trend
- D) The VAR is stable

In [None]:
# Enter your answer: 'A', 'B', 'C', or 'D'
quiz19_answer = ''  # <-- Enter your answer here

# Check answer
if quiz19_answer.upper() == 'B':
    print("✓ CORRECT!")
    print("Instantaneous causality: contemporaneous correlation of residuals.")
    print("Tests if Cov(ε₁ₜ, ε₂ₜ) = 0")
    print("Different from Granger causality (which uses lags).")
elif quiz19_answer:
    print("✗ Incorrect. Try again!")
    print("Hint: 'Instantaneous' means same time period, not lagged.")

### Quiz 20: Practical VAR

**Question:** Before estimating a VAR, you should always check:

- A) That all variables are I(2)
- B) The stationarity of each variable
- C) That variables are perfectly correlated
- D) That the sample size is exactly 100

In [None]:
# Enter your answer: 'A', 'B', 'C', or 'D'
quiz20_answer = ''  # <-- Enter your answer here

# Check answer
if quiz20_answer.upper() == 'B':
    print("✓ CORRECT!")
    print("Standard VAR requires stationary variables.")
    print("If I(1): either difference or use VECM if cointegrated.")
    print("ADF/KPSS tests should be run first.")
elif quiz20_answer:
    print("✗ Incorrect. Try again!")
    print("Hint: VAR assumes stationarity for valid inference.")

---
# Part 2: Hands-On Exercises

Now let's apply VAR models in practice, first to a simulated VAR(1) process and then to a stylized macroeconomic dataset.

## Exercise 1: Simulating a VAR(1) Process

### Task
Generate and visualize a bivariate VAR(1) process to understand its dynamics.

In [None]:
# VAR(1) simulation
np.random.seed(42)

# Coefficient matrix (stable: all eigenvalues inside the unit circle)
A = np.array([[0.7, 0.2],
              [0.1, 0.5]])

# Check stability
eigenvalues = np.linalg.eigvals(A)
print("Coefficient matrix A:")
print(A)
print(f"\nEigenvalues: {eigenvalues}")
print(f"Moduli: {np.abs(eigenvalues)}")
print(f"Stable: {all(np.abs(eigenvalues) < 1)}")

In [None]:
# Simulate VAR(1)
T = 200
K = 2
c = np.array([1, 0.5])  # Constants

# Initialize
Y = np.zeros((T, K))
epsilon = np.random.randn(T, K)  # Shocks

# Generate
for t in range(1, T):
    Y[t] = c + A @ Y[t-1] + epsilon[t]

# Create DataFrame
var_data = pd.DataFrame(Y, columns=['Y1', 'Y2'])

# Plot
fig, axes = plt.subplots(2, 1, figsize=(12, 8), sharex=True)

axes[0].plot(var_data['Y1'], color=COLORS['blue'], linewidth=1)
axes[0].set_title('Simulated VAR(1): Y₁', fontweight='bold')
axes[0].axhline(y=var_data['Y1'].mean(), color='red', linestyle='--', alpha=0.5)

axes[1].plot(var_data['Y2'], color=COLORS['green'], linewidth=1)
axes[1].set_title('Simulated VAR(1): Y₂', fontweight='bold')
axes[1].axhline(y=var_data['Y2'].mean(), color='red', linestyle='--', alpha=0.5)
axes[1].set_xlabel('Time')

plt.tight_layout()
plt.show()

print(f"\nSample means: Y₁={var_data['Y1'].mean():.2f}, Y₂={var_data['Y2'].mean():.2f}")
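
The sample means can be compared with the theoretical unconditional mean of a stable VAR(1), μ = (I − A)⁻¹c. The sketch below uses the same A and c as the simulation; the variable name `mu` is an assumption.

```python
import numpy as np

A = np.array([[0.7, 0.2],
              [0.1, 0.5]])
c = np.array([1.0, 0.5])

# Solve (I - A) mu = c rather than inverting the matrix explicitly
mu = np.linalg.solve(np.eye(2) - A, c)
print(f"Theoretical means: Y1={mu[0]:.3f}, Y2={mu[1]:.3f}")
```

Because the simulation starts at zero rather than at μ, the first observations are a transient toward the mean. Discarding an initial burn-in period is a common way to reduce this effect.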

## Exercise 2: Estimating a VAR Model

### Task
Estimate a VAR model and interpret the results.

In [None]:
# Estimate VAR
model = VAR(var_data)

# Select lag order
lag_order = model.select_order(maxlags=8)
print("Lag Order Selection:")
print(lag_order.summary())

In [None]:
# Fit VAR(1)
results = model.fit(1)
print(results.summary())

In [None]:
# Compare estimated vs true coefficients
print("Coefficient Comparison:")
print("="*50)
print("\nTrue A matrix:")
print(A)
print("\nEstimated A matrix:")
# Extract coefficient matrices (excluding constants)
A_hat = results.coefs[0]  # First lag coefficients
print(A_hat)
print(f"\nMax absolute error: {np.max(np.abs(A - A_hat)):.4f}")

## Exercise 3: Granger Causality Testing

### Task
Test for Granger causality in both directions.

In [None]:
# Granger causality tests
print("Granger Causality Tests")
print("="*60)

# Test: Does Y2 Granger-cause Y1?
print("\nH₀: Y₂ does NOT Granger-cause Y₁")
print("-"*40)
gc_test1 = grangercausalitytests(var_data[['Y1', 'Y2']], maxlag=4, verbose=True)

In [None]:
# Test: Does Y1 Granger-cause Y2?
print("\nH₀: Y₁ does NOT Granger-cause Y₂")
print("-"*40)
gc_test2 = grangercausalitytests(var_data[['Y2', 'Y1']], maxlag=4, verbose=True)

In [None]:
# Summary
print("\nGranger Causality Summary:")
print("="*50)
print("Based on p-values at lag 1:")
# Get p-values at lag 1
p_y2_to_y1 = gc_test1[1][0]['ssr_ftest'][1]
p_y1_to_y2 = gc_test2[1][0]['ssr_ftest'][1]

print(f"Y₂ → Y₁: p-value = {p_y2_to_y1:.4f} {'✓ Significant' if p_y2_to_y1 < 0.05 else '✗ Not significant'}")
print(f"Y₁ → Y₂: p-value = {p_y1_to_y2:.4f} {'✓ Significant' if p_y1_to_y2 < 0.05 else '✗ Not significant'}")

print("\nNote: True A matrix has a₁₂=0.2 (Y₂→Y₁) and a₂₁=0.1 (Y₁→Y₂)")
print("Both directions are present in the data-generating process, but the weaker")
print("Y₁→Y₂ link (0.1) may not be significant in a sample of T=200.")

## Exercise 4: Impulse Response Functions

### Task
Compute and interpret IRFs.

In [None]:
# Compute IRFs
irf = results.irf(20)

# Plot
fig, axes = plt.subplots(2, 2, figsize=(12, 10))

# IRF: Y1 shock
axes[0, 0].plot(irf.irfs[:, 0, 0], color=COLORS['blue'], linewidth=2)
axes[0, 0].axhline(y=0, color='black', linestyle='-', alpha=0.3)
axes[0, 0].fill_between(range(21), irf.irfs[:, 0, 0] - 1.96*irf.stderr()[:, 0, 0],
                        irf.irfs[:, 0, 0] + 1.96*irf.stderr()[:, 0, 0], alpha=0.2)
axes[0, 0].set_title('Response of Y₁ to Y₁ shock', fontweight='bold')

axes[0, 1].plot(irf.irfs[:, 1, 0], color=COLORS['blue'], linewidth=2)
axes[0, 1].axhline(y=0, color='black', linestyle='-', alpha=0.3)
axes[0, 1].fill_between(range(21), irf.irfs[:, 1, 0] - 1.96*irf.stderr()[:, 1, 0],
                        irf.irfs[:, 1, 0] + 1.96*irf.stderr()[:, 1, 0], alpha=0.2)
axes[0, 1].set_title('Response of Y₂ to Y₁ shock', fontweight='bold')

# IRF: Y2 shock
axes[1, 0].plot(irf.irfs[:, 0, 1], color=COLORS['green'], linewidth=2)
axes[1, 0].axhline(y=0, color='black', linestyle='-', alpha=0.3)
axes[1, 0].fill_between(range(21), irf.irfs[:, 0, 1] - 1.96*irf.stderr()[:, 0, 1],
                        irf.irfs[:, 0, 1] + 1.96*irf.stderr()[:, 0, 1], alpha=0.2)
axes[1, 0].set_title('Response of Y₁ to Y₂ shock', fontweight='bold')
axes[1, 0].set_xlabel('Horizon')

axes[1, 1].plot(irf.irfs[:, 1, 1], color=COLORS['green'], linewidth=2)
axes[1, 1].axhline(y=0, color='black', linestyle='-', alpha=0.3)
axes[1, 1].fill_between(range(21), irf.irfs[:, 1, 1] - 1.96*irf.stderr()[:, 1, 1],
                        irf.irfs[:, 1, 1] + 1.96*irf.stderr()[:, 1, 1], alpha=0.2)
axes[1, 1].set_title('Response of Y₂ to Y₂ shock', fontweight='bold')
axes[1, 1].set_xlabel('Horizon')

plt.tight_layout()
plt.show()

print("\nIRF Interpretation:")
print("- Own shocks: immediate impact, then decay")
print("- Cross shocks: show spillover effects between variables")
print("- All responses → 0 as h → ∞ (stable VAR)")

## Exercise 5: Forecast Error Variance Decomposition

### Task
Compute and interpret FEVD.

In [None]:
# Compute FEVD
fevd = results.fevd(20)

# Plot
fig, axes = plt.subplots(1, 2, figsize=(12, 5))

# fevd.decomp is indexed [variable, horizon, shock]; index 0 is the 1-step horizon
horizons = range(1, fevd.decomp.shape[1] + 1)

# FEVD for Y1
axes[0].stackplot(horizons,
                  fevd.decomp[0, :, 0]*100,
                  fevd.decomp[0, :, 1]*100,
                  labels=['Y₁ shock', 'Y₂ shock'],
                  colors=[COLORS['blue'], COLORS['green']], alpha=0.7)
axes[0].set_title('FEVD of Y₁', fontweight='bold')
axes[0].set_xlabel('Horizon')
axes[0].set_ylabel('Percent')
axes[0].legend(loc='center right')
axes[0].set_ylim(0, 100)

# FEVD for Y2
axes[1].stackplot(horizons,
                  fevd.decomp[1, :, 0]*100,
                  fevd.decomp[1, :, 1]*100,
                  labels=['Y₁ shock', 'Y₂ shock'],
                  colors=[COLORS['blue'], COLORS['green']], alpha=0.7)
axes[1].set_title('FEVD of Y₂', fontweight='bold')
axes[1].set_xlabel('Horizon')
axes[1].set_ylabel('Percent')
axes[1].legend(loc='center right')
axes[1].set_ylim(0, 100)

plt.tight_layout()
plt.show()

# Print table
print("\nFEVD at selected horizons (%)")
print("="*60)
print(f"{'Horizon':<10} {'Y₁ by Y₁':>12} {'Y₁ by Y₂':>12} {'Y₂ by Y₁':>12} {'Y₂ by Y₂':>12}")
for h in [1, 5, 10, 20]:
    i = h - 1  # horizon h is stored at index h-1
    print(f"{h:<10} {fevd.decomp[0, i, 0]*100:>12.1f} {fevd.decomp[0, i, 1]*100:>12.1f} "
          f"{fevd.decomp[1, i, 0]*100:>12.1f} {fevd.decomp[1, i, 1]*100:>12.1f}")

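## Exercise 6: Macroeconomic Application - GDP and Unemployment (Simulated Data)

### Task
Apply VAR to analyze the relationship between GDP growth and unemployment, using simulated quarterly series that mimic an Okun's Law relationship.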

In [None]:
# Generate synthetic macroeconomic data (no external dataset is loaded)
np.random.seed(123)
n = 120  # 30 years quarterly

# Stylized Okun's Law: unemployment responds negatively to lagged GDP growth
gdp_growth = np.zeros(n)
unemployment = np.zeros(n)
unemployment[0] = 5  # Initial unemployment rate

for t in range(1, n):
    # GDP growth: AR(1) + shock (long-run mean 2 / (1 - 0.3) ≈ 2.86)
    gdp_growth[t] = 0.3 * gdp_growth[t-1] + np.random.randn() * 1.5 + 2
    # Unemployment: responds to GDP with lag.
    # Intercept 1.86 puts the long-run mean near 5%: (1.86 - 0.3 * 2.86) / (1 - 0.8) ≈ 5.0,
    # so the clip to [2, 12] applied below rarely binds.
    unemployment[t] = 0.8 * unemployment[t-1] - 0.3 * gdp_growth[t-1] + np.random.randn() * 0.3 + 1.86

# Create DataFrame
macro_data = pd.DataFrame({
    'GDP_Growth': gdp_growth,
    'Unemployment': np.clip(unemployment, 2, 12)  # Keep realistic range
}, index=pd.date_range('1994Q1', periods=n, freq='QE'))

print(f"Macro Data: {len(macro_data)} quarterly observations")
print(macro_data.describe())

In [None]:
# Plot the data
fig, axes = plt.subplots(2, 1, figsize=(12, 8), sharex=True)

axes[0].plot(macro_data.index, macro_data['GDP_Growth'], color=COLORS['blue'], linewidth=1)
axes[0].axhline(y=0, color='red', linestyle='--', alpha=0.5)
axes[0].set_title('GDP Growth Rate (%)', fontweight='bold')
axes[0].set_ylabel('%')

axes[1].plot(macro_data.index, macro_data['Unemployment'], color=COLORS['orange'], linewidth=1)
axes[1].set_title('Unemployment Rate (%)', fontweight='bold')
axes[1].set_xlabel('Date')
axes[1].set_ylabel('%')

plt.tight_layout()
plt.show()

In [None]:
# Stationarity tests
print("ADF Tests for Stationarity")
print("="*50)

for col in macro_data.columns:
    result = adfuller(macro_data[col], autolag='AIC')
    status = "STATIONARY" if result[1] < 0.05 else "NON-STATIONARY"
    print(f"{col:<15} ADF={result[0]:>8.3f}  p={result[1]:.4f}  → {status}")

In [None]:
# Fit VAR
macro_model = VAR(macro_data)
lag_order = macro_model.select_order(maxlags=8)
print("Lag Order Selection:")
print(lag_order.summary())

# Use BIC-selected lag (at least 1: BIC can select 0, which would leave no dynamics to analyze)
optimal_lag = max(lag_order.bic, 1)
print(f"\nUsing lag order: {optimal_lag}")

In [None]:
# Estimate VAR
macro_results = macro_model.fit(optimal_lag)
print(macro_results.summary())

In [None]:
# Granger Causality
print("\nGranger Causality Tests")
print("="*60)

print("\n1. Does Unemployment Granger-cause GDP Growth?")
gc1 = grangercausalitytests(macro_data[['GDP_Growth', 'Unemployment']], maxlag=4, verbose=False)
for lag in [1, 2, 3, 4]:
    p = gc1[lag][0]['ssr_ftest'][1]
    sig = "*" if p < 0.05 else ""
    print(f"  Lag {lag}: p-value = {p:.4f} {sig}")

print("\n2. Does GDP Growth Granger-cause Unemployment?")
gc2 = grangercausalitytests(macro_data[['Unemployment', 'GDP_Growth']], maxlag=4, verbose=False)
for lag in [1, 2, 3, 4]:
    p = gc2[lag][0]['ssr_ftest'][1]
    sig = "*" if p < 0.05 else ""
    print(f"  Lag {lag}: p-value = {p:.4f} {sig}")

In [None]:
# IRFs
macro_irf = macro_results.irf(16)

fig, axes = plt.subplots(2, 2, figsize=(12, 10))

# GDP shock effects
axes[0, 0].plot(macro_irf.irfs[:, 0, 0], color=COLORS['blue'], linewidth=2)
axes[0, 0].fill_between(range(17), 
                        macro_irf.irfs[:, 0, 0] - 1.96*macro_irf.stderr()[:, 0, 0],
                        macro_irf.irfs[:, 0, 0] + 1.96*macro_irf.stderr()[:, 0, 0], alpha=0.2)
axes[0, 0].axhline(y=0, color='black', linestyle='-', alpha=0.3)
axes[0, 0].set_title('GDP → GDP', fontweight='bold')

axes[0, 1].plot(macro_irf.irfs[:, 1, 0], color=COLORS['blue'], linewidth=2)
axes[0, 1].fill_between(range(17), 
                        macro_irf.irfs[:, 1, 0] - 1.96*macro_irf.stderr()[:, 1, 0],
                        macro_irf.irfs[:, 1, 0] + 1.96*macro_irf.stderr()[:, 1, 0], alpha=0.2)
axes[0, 1].axhline(y=0, color='black', linestyle='-', alpha=0.3)
axes[0, 1].set_title('GDP → Unemployment (Okun\'s Law)', fontweight='bold')

# Unemployment shock effects
axes[1, 0].plot(macro_irf.irfs[:, 0, 1], color=COLORS['orange'], linewidth=2)
axes[1, 0].fill_between(range(17), 
                        macro_irf.irfs[:, 0, 1] - 1.96*macro_irf.stderr()[:, 0, 1],
                        macro_irf.irfs[:, 0, 1] + 1.96*macro_irf.stderr()[:, 0, 1], alpha=0.2)
axes[1, 0].axhline(y=0, color='black', linestyle='-', alpha=0.3)
axes[1, 0].set_title('Unemployment → GDP', fontweight='bold')
axes[1, 0].set_xlabel('Quarters')

axes[1, 1].plot(macro_irf.irfs[:, 1, 1], color=COLORS['orange'], linewidth=2)
axes[1, 1].fill_between(range(17), 
                        macro_irf.irfs[:, 1, 1] - 1.96*macro_irf.stderr()[:, 1, 1],
                        macro_irf.irfs[:, 1, 1] + 1.96*macro_irf.stderr()[:, 1, 1], alpha=0.2)
axes[1, 1].axhline(y=0, color='black', linestyle='-', alpha=0.3)
axes[1, 1].set_title('Unemployment → Unemployment', fontweight='bold')
axes[1, 1].set_xlabel('Quarters')

plt.tight_layout()
plt.show()

print("\nIRF Interpretation:")
print("- GDP shock → Unemployment falls (Okun's Law)")
print("- Unemployment is persistent (slow decay)")
print("- Effects die out over 8-12 quarters")

## Exercise 7: Forecasting with VAR

### Task
Generate multi-step forecasts with confidence intervals.

In [None]:
# Generate forecasts
forecast_steps = 8  # 2 years
forecast = macro_results.forecast(macro_data.values[-optimal_lag:], steps=forecast_steps)

# Create forecast dates
forecast_dates = pd.date_range(start=macro_data.index[-1] + pd.DateOffset(months=3), 
                               periods=forecast_steps, freq='QE')

# Get forecast intervals
forecast_interval = macro_results.forecast_interval(macro_data.values[-optimal_lag:], 
                                                     steps=forecast_steps, alpha=0.05)

# Plot
fig, axes = plt.subplots(2, 1, figsize=(12, 8), sharex=True)

# GDP Growth
axes[0].plot(macro_data.index[-20:], macro_data['GDP_Growth'].values[-20:], 
             color=COLORS['blue'], linewidth=1.5, label='Historical')
axes[0].plot(forecast_dates, forecast[:, 0], 
             color=COLORS['red'], linewidth=2, linestyle='--', label='Forecast')
axes[0].fill_between(forecast_dates, forecast_interval[1][:, 0], forecast_interval[2][:, 0],
                     color=COLORS['red'], alpha=0.2, label='95% CI')
axes[0].axvline(x=macro_data.index[-1], color='black', linestyle='-', alpha=0.3)
axes[0].set_title('GDP Growth Forecast', fontweight='bold')
axes[0].legend(loc='upper right')

# Unemployment
axes[1].plot(macro_data.index[-20:], macro_data['Unemployment'].values[-20:], 
             color=COLORS['orange'], linewidth=1.5, label='Historical')
axes[1].plot(forecast_dates, forecast[:, 1], 
             color=COLORS['red'], linewidth=2, linestyle='--', label='Forecast')
axes[1].fill_between(forecast_dates, forecast_interval[1][:, 1], forecast_interval[2][:, 1],
                     color=COLORS['red'], alpha=0.2, label='95% CI')
axes[1].axvline(x=macro_data.index[-1], color='black', linestyle='-', alpha=0.3)
axes[1].set_title('Unemployment Forecast', fontweight='bold')
axes[1].legend(loc='upper right')

plt.tight_layout()
plt.show()

# Print forecast table
print("\nForecast Summary:")
print("="*70)
print(f"{'Date':<12} {'GDP Growth':>15} {'95% CI':>20} {'Unemp':>10} {'95% CI':>20}")
for i in range(forecast_steps):
    print(f"{str(forecast_dates[i].date()):<12} {forecast[i, 0]:>15.2f} "
          f"[{forecast_interval[1][i, 0]:>6.2f}, {forecast_interval[2][i, 0]:>6.2f}] "
          f"{forecast[i, 1]:>10.2f} [{forecast_interval[1][i, 1]:>6.2f}, {forecast_interval[2][i, 1]:>6.2f}]")

## Summary

### What We Practiced

1. **VAR Model Structure**: Understanding coefficient matrices and parameter counting
2. **Estimation**: Fitting VAR models and selecting lag order
3. **Granger Causality**: Testing predictive relationships between variables
4. **Impulse Response Functions**: Tracing shock propagation
5. **FEVD**: Decomposing forecast variance by shock source
6. **Forecasting**: Multi-step predictions with uncertainty

### Key Takeaways

- VAR models capture interdependencies between multiple time series
- Granger causality tests predictive content, NOT true causation
- Stability requires eigenvalues inside the unit circle
- Orthogonalized IRFs and FEVD depend on variable ordering (Cholesky identification); the non-orthogonalized IRFs plotted above do not
- Always check stationarity before estimating VAR
- Use BIC for parsimonious model selection