[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/danpele/Time-Series-Analysis/blob/main/EN/Course_Notebooks/chapter7_lecture_notebook.ipynb)

---

# Chapter 7: Cointegration and VECM

**Course:** Time Series Analysis and Forecasting  
**Program:** Bachelor program, Faculty of Cybernetics, Statistics and Economic Informatics, Bucharest University of Economic Studies, Romania  
**Academic Year:** 2025-2026

---

## Learning Objectives

By the end of this notebook, you will be able to:
1. Understand the concept of cointegration and its economic interpretation
2. Identify and avoid spurious regression problems
3. Apply Engle-Granger and Johansen cointegration tests
4. Estimate and interpret Vector Error Correction Models (VECM)
5. Analyze adjustment coefficients and weak exogeneity
6. Apply cointegration analysis to real financial data

## Setup and Imports

In [None]:
# Core libraries
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import warnings
warnings.filterwarnings('ignore')

# Statistical tests and models
from statsmodels.tsa.stattools import adfuller, coint
from statsmodels.tsa.vector_ar.vecm import coint_johansen, VECM
from statsmodels.tsa.api import VAR
from statsmodels.regression.linear_model import OLS
from statsmodels.tools.tools import add_constant
from scipy import stats

# For fetching real data
try:
    import pandas_datareader.data as web
    HAS_PDR = True
except ImportError:
    HAS_PDR = False
    print("Note: pandas_datareader not installed. Install with: pip install pandas-datareader")

# Plotting style - clean, professional, transparent
plt.rcParams['figure.figsize'] = (12, 5)
plt.rcParams['font.size'] = 11
plt.rcParams['axes.facecolor'] = 'none'
plt.rcParams['figure.facecolor'] = 'none'
plt.rcParams['savefig.facecolor'] = 'none'
plt.rcParams['savefig.transparent'] = True
plt.rcParams['axes.grid'] = False
plt.rcParams['axes.spines.top'] = False
plt.rcParams['axes.spines.right'] = False
plt.rcParams['legend.frameon'] = False
plt.rcParams['legend.loc'] = 'upper center'

# Colors (IDA color scheme)
COLORS = {
    'blue': '#1A3A6E',
    'red': '#DC3545',
    'green': '#2E7D32',
    'orange': '#E67E22',
    'gray': '#666666'
}

print("All libraries loaded successfully!")

## 1. The Problem: Non-Stationary Time Series

Many economic and financial variables are **non-stationary** (I(1)):
- Stock prices, GDP, consumption, investment
- Exchange rates, interest rates
- Price indices, wages

**The challenge:**
- Standard regression with I(1) variables can produce **spurious results**
- Simply differencing loses **long-run information**

**The solution: Cointegration**

## 2. Spurious Regression Problem

**Granger & Newbold (1974):** Regressing one random walk on another, independent random walk often gives:
- A high R², even though the series are unrelated
- Apparently significant t-statistics
- A very low Durbin-Watson statistic (DW close to 0)

**Rule of thumb:** If R² > DW, suspect spurious regression!
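
To get a feel for how often this happens, the sketch below repeats the experiment many times using only NumPy. The function name `spurious_rejection_rate`, the sample size, and the number of replications are illustrative choices, and the 1.96 cutoff is the usual two-sided 5% normal critical value.

```python
import numpy as np

def spurious_rejection_rate(n_obs=200, n_sims=500, seed=0):
    """Share of simulations in which two independent random walks look 'related'."""
    rng = np.random.default_rng(seed)
    rejections = 0
    r2_exceeds_dw = 0
    for _ in range(n_sims):
        # Two independent random walks
        y = np.cumsum(rng.standard_normal(n_obs))
        x = np.cumsum(rng.standard_normal(n_obs))

        # OLS of y on a constant and x
        X = np.column_stack([np.ones(n_obs), x])
        coef, *_ = np.linalg.lstsq(X, y, rcond=None)
        resid = y - X @ coef

        # Conventional t-statistic of the slope
        sigma2 = resid @ resid / (n_obs - 2)
        cov = sigma2 * np.linalg.inv(X.T @ X)
        t_slope = coef[1] / np.sqrt(cov[1, 1])

        # R-squared and Durbin-Watson statistic
        r2 = 1 - resid @ resid / np.sum((y - y.mean()) ** 2)
        dw = np.sum(np.diff(resid) ** 2) / (resid @ resid)

        rejections += abs(t_slope) > 1.96
        r2_exceeds_dw += r2 > dw
    return rejections / n_sims, r2_exceeds_dw / n_sims

rejection_rate, r2_gt_dw_rate = spurious_rejection_rate()
print(f"Share with |t| > 1.96: {rejection_rate:.2f}")
print(f"Share with R² > DW:    {r2_gt_dw_rate:.2f}")
```

A correctly sized test would reject in about 5% of the replications. A much larger share here indicates that the usual t-test is unreliable for unrelated I(1) series.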

In [None]:
# Demonstrate spurious regression
np.random.seed(42)
n = 200

# Two INDEPENDENT random walks
y1 = np.cumsum(np.random.randn(n))  # Random walk 1
y2 = np.cumsum(np.random.randn(n))  # Random walk 2 (independent!)

# Run OLS regression
X = add_constant(y2)
model = OLS(y1, X).fit()

# Calculate Durbin-Watson
from statsmodels.stats.stattools import durbin_watson
dw = durbin_watson(model.resid)

print("Spurious Regression Example")
print("="*60)
print(f"R-squared: {model.rsquared:.4f}")
print(f"Coefficient on Y2: {model.params[1]:.4f} (t-stat: {model.tvalues[1]:.2f})")
print(f"P-value: {model.pvalues[1]:.6f}")
print(f"Durbin-Watson: {dw:.4f}")
print()
print(f"R² > DW? {model.rsquared:.4f} > {dw:.4f} = {model.rsquared > dw}")
print("\n⚠️ WARNING: These series are COMPLETELY INDEPENDENT!")
print("   High R² and significant coefficient are SPURIOUS!")

In [None]:
# Visualize spurious regression
fig, axes = plt.subplots(1, 3, figsize=(15, 4))

# Plot both series
axes[0].plot(y1, color=COLORS['blue'], label='Y1 (Random Walk)', linewidth=1)
axes[0].plot(y2, color=COLORS['orange'], label='Y2 (Random Walk)', linewidth=1)
axes[0].set_title('Two Independent Random Walks', fontweight='bold')
axes[0].set_xlabel('Time')
axes[0].legend(loc='upper center', bbox_to_anchor=(0.5, -0.15), ncol=2)

# Scatter plot
axes[1].scatter(y2, y1, alpha=0.5, s=20, color=COLORS['blue'])
axes[1].plot(y2, model.fittedvalues, color=COLORS['red'], linewidth=2, label=f'OLS fit (R²={model.rsquared:.3f})')
axes[1].set_xlabel('Y2')
axes[1].set_ylabel('Y1')
axes[1].set_title('Spurious Regression', fontweight='bold', color=COLORS['red'])
axes[1].legend(loc='upper center', bbox_to_anchor=(0.5, -0.15))

# Residuals (clearly non-stationary!)
axes[2].plot(model.resid, color=COLORS['green'], linewidth=1)
axes[2].axhline(y=0, color='black', linestyle='--', alpha=0.5)
axes[2].set_title('Residuals (Non-Stationary!)', fontweight='bold', color=COLORS['red'])
axes[2].set_xlabel('Time')

plt.tight_layout()
plt.show()

## 3. Cointegration: The Key Concept

**Definition (Engle & Granger, 1987):**

Variables $Y_{1t}, Y_{2t}, \ldots, Y_{kt}$ are **cointegrated** if:
1. All variables are I(1) (non-stationary with unit root)
2. There exists a linear combination $\beta_1 Y_{1t} + \beta_2 Y_{2t} + \cdots + \beta_k Y_{kt}$ that is I(0) (stationary)

**Intuition:** The variables share a **common stochastic trend** and move together in the long run.

**Economic interpretation:** Cointegration represents a **long-run equilibrium relationship**.
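
As a worked illustration (a sketch under assumed notation, not a general proof), suppose both series load on a single random-walk trend $\tau_t$, with stationary noise terms $u_{1t}, u_{2t}$ and an assumed loading $b \neq 0$:

$$Y_{1t} = \tau_t + u_{1t}, \qquad Y_{2t} = b\,\tau_t + u_{2t}, \qquad \tau_t = \tau_{t-1} + \eta_t$$

Each series is I(1) because it inherits $\tau_t$, but the combination

$$Y_{1t} - \frac{1}{b}\,Y_{2t} = u_{1t} - \frac{1}{b}\,u_{2t}$$

contains no trend term and is therefore I(0). The cointegrating vector is $(1, -1/b)$, which is unique only up to scale.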

In [None]:
# Create cointegrated series
np.random.seed(123)
n = 200

# Common stochastic trend (random walk)
common_trend = np.cumsum(np.random.randn(n))

# Two cointegrated series sharing the common trend
beta = 0.8  # Loading of Y2 on the common trend; the cointegrating vector is (1, -1/beta)
y1 = common_trend + np.random.randn(n) * 0.5
y2 = beta * common_trend + np.random.randn(n) * 0.5

# The spread (cointegrating relationship) should be stationary
spread = y1 - (1/beta) * y2

print("Cointegrated Series Example")
print("="*60)
print(f"Y1 and Y2 share a common trend; cointegrating vector is (1, -{1/beta:.2f})")
print(f"\nSpread (Y1 - {1/beta:.2f}*Y2) should be stationary:")

# ADF test on spread
adf_spread = adfuller(spread)
print(f"  ADF statistic: {adf_spread[0]:.4f}")
print(f"  P-value: {adf_spread[1]:.4f}")
print(f"  Stationary? {'Yes ✓' if adf_spread[1] < 0.05 else 'No'}")

In [None]:
# Visualize cointegrated series
fig, axes = plt.subplots(1, 3, figsize=(15, 4))

# Both series
axes[0].plot(y1, color=COLORS['blue'], label='Y1', linewidth=1)
axes[0].plot(y2 * (1/beta), color=COLORS['orange'], label=f'Y2 × {1/beta:.1f}', linewidth=1, alpha=0.7)
axes[0].set_title('Cointegrated Series (Move Together)', fontweight='bold')
axes[0].set_xlabel('Time')
axes[0].legend(loc='upper center', bbox_to_anchor=(0.5, -0.15), ncol=2)

# Scatter plot
axes[1].scatter(y2, y1, alpha=0.5, s=20, color=COLORS['blue'])
z = np.polyfit(y2, y1, 1)
axes[1].plot(y2, np.poly1d(z)(y2), color=COLORS['red'], linewidth=2, label='Estimated long-run relation (OLS)')
axes[1].set_xlabel('Y2')
axes[1].set_ylabel('Y1')
axes[1].set_title('Long-Run Equilibrium', fontweight='bold')
axes[1].legend(loc='upper center', bbox_to_anchor=(0.5, -0.15))

# Spread (stationary!)
axes[2].plot(spread, color=COLORS['green'], linewidth=1)
axes[2].axhline(y=np.mean(spread), color='red', linestyle='--', alpha=0.7)
axes[2].fill_between(range(n), np.mean(spread) - 2*np.std(spread), 
                     np.mean(spread) + 2*np.std(spread), alpha=0.2, color='green')
axes[2].set_title('Spread: Stationary I(0)', fontweight='bold', color=COLORS['green'])
axes[2].set_xlabel('Time')

plt.tight_layout()
plt.show()

## 4. Engle-Granger Two-Step Method

**Step 1:** Estimate the cointegrating regression
$$Y_t = \alpha + \beta X_t + e_t$$

**Step 2:** Test if residuals $\hat{e}_t$ are stationary using ADF test
- $H_0$: Residuals have unit root (no cointegration)
- $H_1$: Residuals are stationary (cointegration exists)

**Important:** Use Engle-Granger critical values, not standard ADF critical values!
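
The simulation below is a rough sketch of why the critical values differ. It computes a Dickey-Fuller t-statistic (no lags, a simplification) on the OLS residuals of two independent random walks and reports the simulated 5% quantile. The helper names `df_tstat` and `simulate_eg_quantile` are illustrative. For comparison, the standard Dickey-Fuller 5% critical value with a constant is roughly -2.86.

```python
import numpy as np

def df_tstat(e):
    """Dickey-Fuller t-statistic for e (no constant, no lagged differences)."""
    de = np.diff(e)
    lag = e[:-1]
    rho = (lag @ de) / (lag @ lag)
    resid = de - rho * lag
    se = np.sqrt((resid @ resid) / (len(de) - 1) / (lag @ lag))
    return rho / se

def simulate_eg_quantile(n_obs=200, n_sims=2000, seed=1, q=0.05):
    """Simulated quantile of the residual-based statistic under no cointegration."""
    rng = np.random.default_rng(seed)
    stats = np.empty(n_sims)
    for s in range(n_sims):
        # Independent random walks, so H0 (no cointegration) is true
        y = np.cumsum(rng.standard_normal(n_obs))
        x = np.cumsum(rng.standard_normal(n_obs))
        # Step 1: cointegrating regression with a constant
        X = np.column_stack([np.ones(n_obs), x])
        coef, *_ = np.linalg.lstsq(X, y, rcond=None)
        # Step 2: unit root statistic on the estimated residuals
        stats[s] = df_tstat(y - X @ coef)
    return np.quantile(stats, q)

eg_q05 = simulate_eg_quantile()
print(f"Simulated 5% quantile of the residual-based statistic: {eg_q05:.2f}")
```

Because OLS picks the coefficient that makes the residuals look as stationary as possible, the simulated quantile lies further into the left tail than the standard value. Using the standard table would therefore reject "no cointegration" too often.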

In [None]:
# Engle-Granger test using statsmodels
# Create data array for testing
data_coint = pd.DataFrame({'Y1': y1, 'Y2': y2})

print("Engle-Granger Cointegration Test")
print("="*60)

# Using statsmodels coint function
coint_stat, pvalue, crit_values = coint(y1, y2)

print(f"\nTest Statistic: {coint_stat:.4f}")
print(f"P-value: {pvalue:.4f}")
print(f"\nCritical Values:")
print(f"  1%: {crit_values[0]:.4f}")
print(f"  5%: {crit_values[1]:.4f}")
print(f"  10%: {crit_values[2]:.4f}")
print(f"\nConclusion: {'Reject H0 - evidence of cointegration' if pvalue < 0.05 else 'Cannot reject H0 - no evidence of cointegration'}")

In [None]:
# Manual Engle-Granger procedure
print("Manual Engle-Granger Procedure")
print("="*60)

# Step 1: Estimate cointegrating regression
X_coint = add_constant(y2)
coint_reg = OLS(y1, X_coint).fit()

print("\nStep 1: Cointegrating Regression")
print(f"  Y1 = {coint_reg.params[0]:.4f} + {coint_reg.params[1]:.4f} × Y2")
print(f"  R-squared: {coint_reg.rsquared:.4f}")

# Step 2: Test residuals for stationarity
residuals = coint_reg.resid
adf_result = adfuller(residuals, regression='c')

print("\nStep 2: ADF Test on Residuals")
print(f"  ADF Statistic: {adf_result[0]:.4f}")
print(f"  Standard ADF p-value (too optimistic for estimated residuals): {adf_result[1]:.4f}")
print(f"  Standard ADF critical values (for comparison only):")
for key, value in adf_result[4].items():
    print(f"    {key}: {value:.4f}")

# Note about critical values
print("\n⚠️ Note: For cointegration tests, use Engle-Granger critical values")
print("   (more negative than standard ADF because residuals are estimated)")

## 5. Johansen Cointegration Test

**Advantages over Engle-Granger:**
- Tests for **multiple** cointegrating relationships
- Maximum likelihood estimation (more efficient)
- No need to choose dependent variable

**Key concept:** Test the rank of matrix $\Pi$ in the VECM:
- rank($\Pi$) = 0: No cointegration
- 0 < rank($\Pi$) = r < k: r cointegrating vectors
- rank($\Pi$) = k: All variables are I(0)
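
The reduced-rank idea can be checked numerically. In the sketch below the values of `alpha` and `beta` are hypothetical, chosen only for illustration. It builds $\Pi = \alpha\beta'$ for $k = 2$, $r = 1$ and looks at the implied VAR(1) matrix $A_1 = I + \Pi$.

```python
import numpy as np

# Hypothetical values chosen for illustration only
alpha = np.array([[-0.2], [0.1]])    # adjustment coefficients (k x r)
beta = np.array([[1.0], [-1.25]])    # cointegrating vector (k x r)

# Pi = alpha beta' is k x k but has rank r < k
Pi = alpha @ beta.T
rank_Pi = np.linalg.matrix_rank(Pi)

# VAR(1) in levels implied by a VECM without lagged differences: A1 = I + Pi
A1 = np.eye(2) + Pi
eigenvalues = np.sort(np.linalg.eigvals(A1).real)

print("Pi =\n", Pi)
print("rank(Pi) =", rank_Pi)
print("Eigenvalues of A1:", eigenvalues.round(3))
```

With one cointegrating vector in a two-variable system, `A1` has one unit eigenvalue (the shared stochastic trend) and one eigenvalue inside the unit circle (the stationary error correction dynamics).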

In [None]:
# Johansen cointegration test
print("Johansen Cointegration Test")
print("="*60)

# Prepare data
data_johansen = np.column_stack([y1, y2])

# Run Johansen test
johansen_result = coint_johansen(data_johansen, det_order=0, k_ar_diff=1)

print("\nTrace Test:")
print(f"{'Rank':>6} {'Trace Stat':>12} {'Crit 90%':>10} {'Crit 95%':>10} {'Crit 99%':>10}")
print("-"*55)
for i in range(2):
    sig = " **" if johansen_result.lr1[i] > johansen_result.cvt[i, 1] else ""
    print(f"r = {i:>2} {johansen_result.lr1[i]:>12.2f} {johansen_result.cvt[i, 0]:>10.2f} "
          f"{johansen_result.cvt[i, 1]:>10.2f} {johansen_result.cvt[i, 2]:>10.2f}{sig}")

print("\nMax Eigenvalue Test:")
print(f"{'Rank':>6} {'Max Eig':>12} {'Crit 90%':>10} {'Crit 95%':>10} {'Crit 99%':>10}")
print("-"*55)
for i in range(2):
    sig = " **" if johansen_result.lr2[i] > johansen_result.cvm[i, 1] else ""
    print(f"r = {i:>2} {johansen_result.lr2[i]:>12.2f} {johansen_result.cvm[i, 0]:>10.2f} "
          f"{johansen_result.cvm[i, 1]:>10.2f} {johansen_result.cvm[i, 2]:>10.2f}{sig}")

print("\nEigenvalues:", johansen_result.eig.round(4))
print("\nCointegrating vector (β):")
print(johansen_result.evec[:, 0].round(4))

## 6. Vector Error Correction Model (VECM)

When variables are cointegrated, we use VECM instead of VAR in differences:

$$\Delta \mathbf{Y}_t = \mathbf{c} + \boldsymbol{\alpha}\boldsymbol{\beta}'\mathbf{Y}_{t-1} + \sum_{j=1}^{p-1} \boldsymbol{\Gamma}_j \Delta \mathbf{Y}_{t-j} + \boldsymbol{\varepsilon}_t$$

where:
- $\boldsymbol{\beta}$ = cointegrating vectors (define equilibrium)
- $\boldsymbol{\alpha}$ = adjustment coefficients (speed of adjustment)
- $\boldsymbol{\beta}'\mathbf{Y}_{t-1}$ = error correction term (deviation from equilibrium)
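
To make the roles of $\boldsymbol{\alpha}$ and $\boldsymbol{\beta}$ concrete, the sketch below simulates the simplest case of the equation above (no constant, no lagged differences) and recovers $\boldsymbol{\alpha}$ by OLS. The parameter values are assumptions for illustration, and $\boldsymbol{\beta}$ is treated as known, which is not the case in practice.

```python
import numpy as np

rng = np.random.default_rng(0)
n_obs = 2000
alpha_true = np.array([-0.2, 0.1])   # assumed adjustment coefficients
beta_vec = np.array([1.0, -1.0])     # assumed cointegrating vector

# Simulate dY_t = alpha * (beta' Y_{t-1}) + eps_t
Y = np.zeros((n_obs, 2))
for t in range(1, n_obs):
    ect_prev = beta_vec @ Y[t - 1]   # error correction term at t-1
    Y[t] = Y[t - 1] + alpha_true * ect_prev + rng.standard_normal(2)

# Regress each dY on the lagged error correction term (no intercept)
dY = np.diff(Y, axis=0)
ect_lag = Y[:-1] @ beta_vec
alpha_hat = (ect_lag @ dY) / (ect_lag @ ect_lag)

print("True alpha:     ", alpha_true)
print("Estimated alpha:", alpha_hat.round(3))
```

When the error correction term is positive (Y1 above Y2), the negative first coefficient pulls Y1 down and the positive second coefficient pushes Y2 up, so both variables help close the gap.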

In [None]:
# Estimate VECM
print("Vector Error Correction Model (VECM)")
print("="*60)

# Prepare data as DataFrame
data_vecm = pd.DataFrame({'Y1': y1, 'Y2': y2})

# Fit VECM with 1 cointegrating relationship
vecm_model = VECM(data_vecm, k_ar_diff=1, coint_rank=1, deterministic='ci')
vecm_results = vecm_model.fit()

print(vecm_results.summary())

In [None]:
# Extract and interpret VECM parameters
print("VECM Parameter Interpretation")
print("="*60)

# Get alpha (adjustment coefficients)
alpha = vecm_results.alpha
print(f"\nAdjustment Coefficients (α):")
print(f"  α₁ (Y1): {alpha[0, 0]:.4f}")
print(f"  α₂ (Y2): {alpha[1, 0]:.4f}")

# Interpretation
print(f"\nInterpretation:")
print(f"  Y1 changes by {alpha[0, 0]:+.4f} per unit of last period's disequilibrium")
print(f"  Y2 changes by {alpha[1, 0]:+.4f} per unit of last period's disequilibrium")

# Weak exogeneity: test H0: α = 0 with the estimated p-values
# rather than an ad hoc cutoff on the size of α
print(f"\nWeak Exogeneity (H0: α = 0):")
pvals_alpha = vecm_results.pvalues_alpha
for name, p in zip(['Y1', 'Y2'], pvals_alpha[:, 0]):
    verdict = "cannot reject weak exogeneity" if p > 0.05 else "adjusts to disequilibrium"
    print(f"  {name}: p-value = {p:.4f} → {verdict}")

## 7. Real-World Example: Interest Rates

**Expectations Hypothesis of Term Structure:**
- Short-term and long-term interest rates should be cointegrated
- The term spread (long rate minus short rate) should be stationary

In [None]:
# Fetch real interest rate data
if HAS_PDR:
    try:
        # 3-month Treasury Bill and 10-year Treasury yield
        short_rate = web.DataReader('TB3MS', 'fred', '1990-01-01', '2024-12-31')
        long_rate = web.DataReader('GS10', 'fred', '1990-01-01', '2024-12-31')
        
        rates_data = pd.DataFrame({
            'ShortRate': short_rate['TB3MS'],
            'LongRate': long_rate['GS10']
        }).dropna()
        
        print(f"✓ Real Interest Rate Data: {len(rates_data)} monthly observations")
        DATA_SOURCE = "FRED"
    except Exception as e:
        print(f"Could not fetch data: {e}")
        HAS_PDR = False

if not HAS_PDR:
    # Simulated cointegrated interest rates
    np.random.seed(456)
    n = 400
    
    # Common trend
    trend = np.cumsum(np.random.randn(n) * 0.1) + 5
    
    # Short and long rates
    short_rate = trend + np.random.randn(n) * 0.3
    long_rate = trend + 1.5 + np.random.randn(n) * 0.2  # Higher with term premium
    
    rates_data = pd.DataFrame({
        'ShortRate': short_rate,
        'LongRate': long_rate
    }, index=pd.date_range('1990-01', periods=n, freq='ME'))
    
    DATA_SOURCE = "Simulated"
    print(f"Using simulated data: {len(rates_data)} observations")

print(f"\nData Source: {DATA_SOURCE}")
print(rates_data.describe().round(2))

In [None]:
# Plot interest rates
fig, axes = plt.subplots(1, 2, figsize=(14, 5))

# Both rates
axes[0].plot(rates_data.index, rates_data['ShortRate'], color=COLORS['blue'], 
             label='3-Month T-Bill', linewidth=1)
axes[0].plot(rates_data.index, rates_data['LongRate'], color=COLORS['orange'], 
             label='10-Year Treasury', linewidth=1)
axes[0].set_title(f'US Interest Rates ({DATA_SOURCE})', fontweight='bold')
axes[0].set_ylabel('Interest Rate (%)')
axes[0].legend(loc='upper center', bbox_to_anchor=(0.5, -0.12), ncol=2)

# Term spread
spread = rates_data['LongRate'] - rates_data['ShortRate']
axes[1].plot(spread.index, spread, color=COLORS['green'], linewidth=1)
axes[1].axhline(y=spread.mean(), color='red', linestyle='--', alpha=0.7, label=f'Mean: {spread.mean():.2f}%')
axes[1].axhline(y=0, color='black', linestyle='-', alpha=0.3)
axes[1].fill_between(spread.index, 0, spread, where=spread < 0, alpha=0.3, color='red', label='Inverted')
axes[1].set_title('Term Spread (Long - Short)', fontweight='bold')
axes[1].set_ylabel('Spread (%)')
axes[1].legend(loc='upper center', bbox_to_anchor=(0.5, -0.12), ncol=2)

plt.tight_layout()
plt.subplots_adjust(bottom=0.2)
plt.show()

In [None]:
# Cointegration analysis of interest rates
print("Cointegration Analysis: Interest Rates")
print("="*60)

# Unit root tests
print("\n1. Unit Root Tests (ADF):")
for col in ['ShortRate', 'LongRate']:
    adf = adfuller(rates_data[col].dropna())
    status = "Stationary" if adf[1] < 0.05 else "Unit root not rejected"
    print(f"   {col:>12}: ADF = {adf[0]:>7.3f}, p-value = {adf[1]:.4f} → {status}")

# Test spread
adf_spread = adfuller(spread.dropna())
print(f"   {'Spread':>12}: ADF = {adf_spread[0]:>7.3f}, p-value = {adf_spread[1]:.4f}")

# Engle-Granger test
print("\n2. Engle-Granger Cointegration Test:")
eg_stat, eg_pval, eg_crit = coint(rates_data['ShortRate'], rates_data['LongRate'])
print(f"   Test statistic: {eg_stat:.4f}")
print(f"   P-value: {eg_pval:.4f}")
print(f"   Conclusion: {'Cointegrated' if eg_pval < 0.05 else 'Not cointegrated'} at 5% level")

# Johansen test
print("\n3. Johansen Cointegration Test:")
johansen = coint_johansen(rates_data.values, det_order=0, k_ar_diff=1)
print(f"   Trace stat (r=0): {johansen.lr1[0]:.2f} vs 95% CV: {johansen.cvt[0, 1]:.2f}")
print(f"   Trace stat (r≤1): {johansen.lr1[1]:.2f} vs 95% CV: {johansen.cvt[1, 1]:.2f}")
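# Sequential trace test: stop at the first rank that is not rejected
if johansen.lr1[0] <= johansen.cvt[0, 1]:
    conclusion = 'No cointegration'
elif johansen.lr1[1] <= johansen.cvt[1, 1]:
    conclusion = '1 cointegrating relationship'
else:
    conclusion = 'Full rank (both series appear stationary)'
print(f"   Conclusion: {conclusion}")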

In [None]:
# Estimate VECM for interest rates
print("VECM Estimation: Interest Rates")
print("="*60)

vecm_rates = VECM(rates_data, k_ar_diff=1, coint_rank=1, deterministic='ci')
vecm_rates_results = vecm_rates.fit()

print(vecm_rates_results.summary())

In [None]:
# Economic interpretation
print("\nEconomic Interpretation")
print("="*60)

alpha_rates = vecm_rates_results.alpha
beta_rates = vecm_rates_results.beta

print(f"\nCointegrating vector (β): [{beta_rates[0, 0]:.4f}, {beta_rates[1, 0]:.4f}]")
print(f"  → Long-run: ShortRate = {-beta_rates[1, 0]/beta_rates[0, 0]:.4f} × LongRate + const")

print(f"\nAdjustment coefficients (α):")
print(f"  ShortRate α: {alpha_rates[0, 0]:.4f}")
print(f"  LongRate α:  {alpha_rates[1, 0]:.4f}")

print(f"\nInterpretation:")
if abs(alpha_rates[0, 0]) > abs(alpha_rates[1, 0]):
    print(f"  → Short rate adjusts MORE to disequilibrium")
    print(f"  → Long rate adjusts less (check the α p-values before calling it weakly exogenous)")
    print(f"  → One possible reading: short rates move to close the gap with long rates")
else:
    print(f"  → Long rate adjusts MORE to disequilibrium")

## 8. VECM Forecasting

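Because a VECM retains the error correction term, its long-horizon forecasts respect the equilibrium relationship. A VAR in differences ignores this relationship, so it can forecast cointegrated series less well at long horizons.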
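
The mechanics can be sketched with a stripped-down VECM (no constant, no lagged differences). All numbers below, including the last observation `y_last`, are hypothetical. Iterating the model forward shows the forecast gap $\boldsymbol{\beta}'\mathbf{Y}$ shrinking by the factor $1 + \boldsymbol{\beta}'\boldsymbol{\alpha}$ each period.

```python
import numpy as np

# Hypothetical parameters for illustration
alpha = np.array([-0.2, 0.1])        # adjustment coefficients
beta_vec = np.array([1.0, -1.0])     # cointegrating vector
A1 = np.eye(2) + np.outer(alpha, beta_vec)

y_last = np.array([3.0, 5.0])        # hypothetical last observation (gap = -2)
steps = 24

# Iterate the point forecast Y_{T+h} = A1 @ Y_{T+h-1}
path = np.empty((steps, 2))
y = y_last.copy()
for h in range(steps):
    y = A1 @ y
    path[h] = y

gap = path @ beta_vec                # forecast deviation from equilibrium
decay = 1 + beta_vec @ alpha         # per-period decay factor of the gap
half_life = np.log(0.5) / np.log(decay)

print(f"Decay factor: {decay:.2f}, half-life: {half_life:.2f} periods")
print(f"Gap after 1 step: {gap[0]:.3f}, after {steps} steps: {gap[-1]:.5f}")
```

The forecast levels settle at a common value instead of returning to their starting points, while the gap between them closes geometrically.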

In [None]:
# VECM forecasting
forecast_steps = 24

# Generate forecasts
forecast = vecm_rates_results.predict(steps=forecast_steps)

# Create forecast dates
if hasattr(rates_data.index[-1], 'to_timestamp'):
    last_date = rates_data.index[-1].to_timestamp()
else:
    last_date = rates_data.index[-1]
forecast_dates = pd.date_range(start=last_date + pd.DateOffset(months=1), 
                               periods=forecast_steps, freq='ME')

# Plot forecasts
fig, axes = plt.subplots(1, 2, figsize=(14, 5))

for i, (col, color) in enumerate(zip(['ShortRate', 'LongRate'], [COLORS['blue'], COLORS['orange']])):
    # Historical
    axes[i].plot(rates_data.index[-60:], rates_data[col].values[-60:], 
                 color=color, linewidth=1.5, label='Historical')
    # Forecast
    axes[i].plot(forecast_dates, forecast[:, i], 
                 color=COLORS['red'], linewidth=2, linestyle='--', label='VECM Forecast')
    axes[i].axvline(x=rates_data.index[-1], color='black', linestyle='-', alpha=0.3)
    axes[i].set_title(f'{col} Forecast', fontweight='bold')
    axes[i].set_ylabel('Rate (%)')
    axes[i].legend(loc='upper center', bbox_to_anchor=(0.5, -0.12), ncol=2)

plt.tight_layout()
plt.subplots_adjust(bottom=0.2)
plt.show()

## 9. Impulse Response Functions in VECM

In a cointegrated system, shocks can have **permanent effects** on the levels of the variables, while deviations from the long-run equilibrium die out.
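
A small numerical sketch of this property, again with hypothetical parameters and a VECM without lagged differences: the non-orthogonalized impulse responses at horizon $h$ are the powers $A_1^h$ with $A_1 = I + \boldsymbol{\alpha}\boldsymbol{\beta}'$.

```python
import numpy as np

# Hypothetical parameters for illustration
alpha = np.array([-0.2, 0.1])
beta_vec = np.array([1.0, -1.0])
A1 = np.eye(2) + np.outer(alpha, beta_vec)

# irfs[h, i, j]: response of variable i at horizon h to a unit shock in variable j
horizons = 200
irfs = np.empty((horizons + 1, 2, 2))
irfs[0] = np.eye(2)
for h in range(1, horizons + 1):
    irfs[h] = A1 @ irfs[h - 1]

long_run = irfs[-1]                     # approximate long-run impact on levels
spread_response = beta_vec @ long_run   # long-run response of beta'Y

print("Long-run impact on levels:\n", long_run.round(4))
print("Long-run response of the equilibrium error:", spread_response.round(8))
```

The level responses converge to nonzero values, so the shock is permanent. The response of $\boldsymbol{\beta}'\mathbf{Y}$ converges to zero, so the equilibrium error is transitory.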

In [None]:
# IRF from VECM
irf = vecm_rates_results.irf(periods=40)

fig, axes = plt.subplots(2, 2, figsize=(12, 8))

titles = [['Short → Short', 'Long → Short'], ['Short → Long', 'Long → Long']]
colors_irf = [[COLORS['blue'], COLORS['orange']], [COLORS['blue'], COLORS['orange']]]

for i in range(2):
    for j in range(2):
        axes[i, j].plot(irf.irfs[:, i, j], color=colors_irf[i][j], linewidth=2)
        axes[i, j].axhline(y=0, color='black', linestyle='-', alpha=0.3)
        axes[i, j].set_title(titles[i][j], fontweight='bold')
        if i == 1:
            axes[i, j].set_xlabel('Horizon')
        if j == 0:
            axes[i, j].set_ylabel('Response')

plt.suptitle('VECM Impulse Response Functions', fontweight='bold', y=1.02)
plt.tight_layout()
plt.show()

print("\nNote: Unlike stationary VAR, VECM IRFs don't decay to zero.")
print("      Shocks have permanent effects but equilibrium is restored.")

## Summary

### Key Takeaways

1. **Spurious regression** occurs when regressing I(1) series that are not cointegrated
   - High R², significant coefficients, but meaningless!
   - Rule: If R² > DW, suspect spurious regression

2. **Cointegration** means I(1) variables share a common trend
   - Linear combination is stationary (I(0))
   - Represents long-run equilibrium relationship

3. **Testing for cointegration:**
   - Engle-Granger: Simple, but identifies only one vector and requires choosing a dependent variable
   - Johansen: System-based maximum likelihood, handles multiple vectors

4. **VECM** is the appropriate model for cointegrated variables
   - $\beta$ = cointegrating vectors (equilibrium)
   - $\alpha$ = adjustment speeds
   - Preserves long-run information lost by differencing

5. **Weak exogeneity** ($\alpha = 0$): Variable doesn't respond to disequilibrium

### Practical Workflow
1. Test for unit roots (ADF/KPSS)
2. If all series are I(1), test for cointegration (Engle-Granger or Johansen)
3. If cointegrated, estimate VECM
4. Interpret adjustment coefficients
5. Check diagnostics
6. IRF, FEVD, forecasting