# Dynamic Spatial Panel Models

**Level**: Advanced  
**Duration**: 180-210 minutes  
**Prerequisites**: Notebooks 01-06, GMM for dynamic panels

---

## Learning Objectives

By the end of this notebook, you will be able to:

1. **Estimate** dynamic spatial panel models via GMM
2. **Understand** double endogeneity (Nickell bias + spatial endogeneity)
3. **Construct** valid instruments for both sources of endogeneity
4. **Compute** short-run vs long-run effects and dynamic multipliers
5. **Generate** impulse-response functions across space and time
6. **Test** instrument validity with Hansen J-test

---

## Table of Contents

1. [Motivation: Time + Space](#1.-Motivation:-Time-+-Space)
2. [Valid Instruments for Dynamic Spatial Panels](#2.-Valid-Instruments-for-Dynamic-Spatial-Panels)
3. [GMM Estimation](#3.-GMM-Estimation)
4. [Short-Run vs Long-Run Effects](#4.-Short-Run-vs-Long-Run-Effects)
5. [Hansen J-Test (Over-Identification)](#5.-Hansen-J-Test-(Over-Identification))
6. [Impulse-Response Functions](#6.-Impulse-Response-Functions)
7. [Difference-in-Sargan Test](#7.-Difference-in-Sargan-Test)
8. [Case Study: Regional Economic Growth](#8.-Case-Study:-Regional-Economic-Growth)
9. [Diagnostic Tests](#9.-Diagnostic-Tests)
10. [Summary](#10.-Summary)

---

## Setup

In [None]:
import sys
from pathlib import Path
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from scipy import stats
from scipy.stats import chi2
import warnings
warnings.filterwarnings('ignore')

# Add panelbox to path
panelbox_path = Path("/home/guhaase/projetos/panelbox")
sys.path.insert(0, str(panelbox_path))

# Set style
plt.style.use('seaborn-v0_8-darkgrid')
sns.set_palette("husl")

print("✓ Packages loaded successfully")
print(f"✓ Panelbox path: {panelbox_path}")

## 1. Motivation: Time + Space

### The Dynamic Spatial Panel Model

$$
y_{it} = \gamma y_{i,t-1} + \rho W y_{it} + X_{it}\beta + \alpha_i + \varepsilon_{it}
$$

Where:
- $\gamma$: **Temporal persistence** (AR coefficient)
- $\rho$: **Spatial spillover** parameter
- $y_{i,t-1}$: Own past value (lagged dependent variable)
- $Wy_{it}$: Contemporaneous spatial lag

### Why Both Dynamics?

**Temporal dynamics** ($\gamma$):
- Path dependence and adjustment costs
- Habit formation and persistence
- Economic convergence ($\gamma < 1$ → shocks dissipate)

**Spatial dynamics** ($\rho$):
- Spillovers and externalities
- Diffusion processes
- Spatial contagion

### Double Endogeneity Problem

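1. **Nickell bias**: $y_{i,t-1}$ is correlated with $\alpha_i$ (fixed effects), and removing $\alpha_i$ by within-demeaning leaves the transformed lag correlated with the transformed error (bias of order $1/T$)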
2. **Spatial endogeneity**: $Wy_{it}$ correlated with $\varepsilon_{it}$ (simultaneity)
3. **Solution**: GMM with valid instruments for BOTH sources
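
The first source of endogeneity can be made concrete with a small simulation. The sketch below is illustrative only: it uses a pure AR(1) panel with fixed effects and no spatial term, and the helper names (`simulate_ar1_panel`, `within_estimator`) are hypothetical, not part of panelbox. With a short time dimension, the within (fixed-effects) estimate of $\gamma$ should come out well below the true value.

```python
import numpy as np

def simulate_ar1_panel(n_units, n_periods, gamma, rng, burn_in=50):
    """Simulate y_it = gamma * y_i,t-1 + alpha_i + eps_it (no spatial term)."""
    alpha = rng.normal(0.0, 1.0, n_units)
    y = alpha / (1.0 - gamma)  # start each unit at its steady-state mean
    panel = np.empty((n_units, n_periods))
    for t in range(burn_in + n_periods):
        y = gamma * y + alpha + rng.normal(0.0, 1.0, n_units)
        if t >= burn_in:
            panel[:, t - burn_in] = y
    return panel

def within_estimator(panel):
    """Fixed-effects (within) estimate of gamma from an N x T array."""
    y_cur = panel[:, 1:]
    y_lag = panel[:, :-1]
    # Demean each unit over time to sweep out alpha_i
    y_cur_dm = y_cur - y_cur.mean(axis=1, keepdims=True)
    y_lag_dm = y_lag - y_lag.mean(axis=1, keepdims=True)
    return float((y_lag_dm * y_cur_dm).sum() / (y_lag_dm ** 2).sum())

rng = np.random.default_rng(0)
gamma_true_demo = 0.6
estimates = [
    within_estimator(simulate_ar1_panel(200, 6, gamma_true_demo, rng))
    for _ in range(200)
]
mean_gamma_fe = float(np.mean(estimates))
print(f"True gamma: {gamma_true_demo}, mean within estimate (T=6): {mean_gamma_fe:.3f}")
```

Increasing `n_periods` in the call shrinks the gap, which is why the bias matters most in short panels.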

### Applications

| Application | $\gamma$ (persistence) | $\rho$ (spillover) |
|-------------|----------------------|-------------------|
| Economic growth | Convergence | Knowledge diffusion |
| Crime | Recidivism | Spatial contagion |
| Innovation | Learning by doing | Technology spillovers |
| Unemployment | Hysteresis | Labor market linkages |

---

In [None]:
# Load simulated regional growth data (long panel: N=50 regions, T=15 years)
print("Creating simulated regional growth dataset...\n")

np.random.seed(42)

# Panel dimensions
N = 50  # regions
T = 15  # years (need T >= 10 for dynamic panel)

# Create spatial weight matrix (rook contiguity for 10×5 grid)
from libpysal.weights import lat2W
W = lat2W(10, 5)  # 10 rows × 5 columns = 50 regions
W.transform = 'r'  # Row-standardize
W_dense = W.full()[0]

# True parameters
gamma_true = 0.6    # Temporal persistence
rho_true = 0.25     # Spatial spillover
beta_invest = 0.15  # Investment effect
beta_educ = 0.20    # Education effect

# Generate data
regions = np.arange(N)
years = np.arange(2005, 2005 + T)

data_list = []

# Fixed effects
alpha = np.random.normal(2.0, 0.5, N)

# Exogenous variables (i.i.d. uniform draws; no region-specific trends are simulated)
investment = np.random.uniform(15, 30, (N, T))
education = np.random.uniform(8, 14, (N, T))

# Initialize
y = np.zeros((N, T))
y[:, 0] = alpha + 0.1 * np.random.randn(N)  # Initial condition

# Generate dynamic spatial process
I = np.eye(N)
A_inv = np.linalg.inv(I - rho_true * W_dense)

for t in range(1, T):
    # DGP: y_t = (I - ρW)^{-1}(γy_{t-1} + Xβ + α + ε)
    mu = gamma_true * y[:, t-1] + beta_invest * investment[:, t] + beta_educ * education[:, t] + alpha
    epsilon = np.random.normal(0, 0.3, N)
    y[:, t] = A_inv @ (mu + epsilon)

# Create panel dataframe
for i in range(N):
    for t in range(T):
        data_list.append({
            'region_id': i,
            'year': years[t],
            'gdp_growth': y[i, t],
            'investment': investment[i, t],
            'education': education[i, t]
        })

data = pd.DataFrame(data_list)

print(f"Panel dimensions:")
print(f"  N = {data['region_id'].nunique()} regions")
print(f"  T = {data['year'].nunique()} years")
print(f"  Total obs = {len(data)}")
print(f"\nTrue parameters:")
print(f"  γ (persistence) = {gamma_true}")
print(f"  ρ (spillover)   = {rho_true}")
print(f"  β_investment    = {beta_invest}")
print(f"  β_education     = {beta_educ}")

data.head(10)

## 2. Valid Instruments for Dynamic Spatial Panels

### Instrument Requirements

For valid instruments $Z$:

1. **Exogeneity**: $E(Z'\varepsilon) = 0$
2. **Relevance**: $\text{Corr}(Z, \text{endogenous vars}) \neq 0$

### Instrument Sets

**For $y_{i,t-1}$ (Nickell bias)**:
- Temporal lags: $y_{i,t-2}, y_{i,t-3}, \ldots$ (Arellano-Bond)
- Valid if $\varepsilon_{it}$ is serially uncorrelated

**For $Wy_{it}$ (spatial endogeneity)**:
- Spatial lags of exogenous $X$: $WX_{it}, W^2X_{it}, \ldots$
- Temporal-spatial lags: $Wy_{i,t-1}, Wy_{i,t-2}, \ldots$

**Combined Instrument Matrix**:

$$
Z = [y_{i,t-2}, y_{i,t-3}, WX_{it}, W^2X_{it}, Wy_{i,t-1}, Wy_{i,t-2}]
$$
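
To see what the spatial columns of $Z$ look like, here is a minimal NumPy sketch for a single cross-section. It assumes a tiny four-region line graph rather than the notebook's grid, and `row_standardize` and `spatial_instruments` are hypothetical helper names introduced only for illustration.

```python
import numpy as np

def row_standardize(adj):
    """Divide each row of a binary adjacency matrix by its row sum."""
    row_sums = adj.sum(axis=1, keepdims=True)
    return adj / row_sums

def spatial_instruments(W, X, max_order=2):
    """Stack [WX, W^2 X, ..., W^max_order X] column-wise for one cross-section."""
    blocks = []
    WX = X
    for _ in range(max_order):
        WX = W @ WX  # one more application of W
        blocks.append(WX)
    return np.hstack(blocks)

# Four regions on a line: 0 - 1 - 2 - 3
adj = np.array([[0, 1, 0, 0],
                [1, 0, 1, 0],
                [0, 1, 0, 1],
                [0, 0, 1, 0]], dtype=float)
W_line = row_standardize(adj)
X_line = np.array([[1.0], [2.0], [3.0], [4.0]])

Z_spatial = spatial_instruments(W_line, X_line, max_order=2)
print(Z_spatial)  # column 0 is WX, column 1 is W^2 X
```

Each extra power of $W$ reaches one ring of neighbors further out, so higher orders add progressively weaker and more collinear instruments.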

### Key Insight

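- More instruments ≠ better: too many instruments overfit the endogenous variables and weaken the Hansen J-test
- The Hansen J-test checks the over-identifying restrictions
- The AR(2) test checks for serial correlation in $\varepsilon_{it}$ that would invalidate lagged-$y$ instruments (it should be insignificant)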

---

In [None]:
# Construct lagged and spatial lag variables
print("Constructing instruments...\n")

# Add entity_id and time columns
data['entity_id'] = data['region_id']
data['time'] = data['year']

# Sort by region and year
data_sorted = data.sort_values(['entity_id', 'time']).reset_index(drop=True)

# Create temporal lags of y
data_sorted['y_lag1'] = data_sorted.groupby('entity_id')['gdp_growth'].shift(1)
data_sorted['y_lag2'] = data_sorted.groupby('entity_id')['gdp_growth'].shift(2)
data_sorted['y_lag3'] = data_sorted.groupby('entity_id')['gdp_growth'].shift(3)

# Create spatial lags
from libpysal.weights import lag_spatial

# For each year, compute spatial lag
Wy_list = []
W_invest_list = []
W_educ_list = []

for year in data_sorted['year'].unique():
    year_data = data_sorted[data_sorted['year'] == year].sort_values('region_id')
    
    # Spatial lag of y
    Wy = lag_spatial(W, year_data['gdp_growth'].values)
    Wy_list.extend(Wy)
    
    # Spatial lag of X (instruments)
    W_invest = lag_spatial(W, year_data['investment'].values)
    W_invest_list.extend(W_invest)
    
    W_educ = lag_spatial(W, year_data['education'].values)
    W_educ_list.extend(W_educ)

data_sorted['Wy'] = Wy_list
data_sorted['W_invest'] = W_invest_list
data_sorted['W_educ'] = W_educ_list

# Create temporal lag of Wy (spatial-temporal instrument)
data_sorted['Wy_lag1'] = data_sorted.groupby('entity_id')['Wy'].shift(1)

print("Instrument variables created:")
print("  Temporal instruments: y_lag2, y_lag3 (for y_lag1)")
print("  Spatial instruments:  W_invest, W_educ (for Wy)")
print("  Combined:             Wy_lag1 (for Wy)")

print("\nSample of constructed variables:")
data_sorted[['region_id', 'year', 'gdp_growth', 'y_lag1', 'y_lag2', 'Wy', 'W_invest']].head(20)

## 3. GMM Estimation

### Two-Step Efficient GMM

**Moment conditions**:

$$
E[Z_i'(y_{it} - \gamma y_{i,t-1} - \rho Wy_{it} - X_{it}\beta)] = 0
$$

**GMM estimator**:

$$
\hat{\theta}_{GMM} = \arg\min_{\theta} \, g(\theta)' A \, g(\theta)
$$

Where:
- $g(\theta) = \frac{1}{N}\sum_{i=1}^N Z_i'\varepsilon_i(\theta)$ (sample moments)
- $A$ = GMM weighting matrix (written $A$ to avoid confusion with the spatial weight matrix $W$)

### Two-Step Procedure

1. **First step**: 2SLS, i.e. GMM with $A = (Z'Z)^{-1}$
2. **Second step**: Efficient GMM with $A = \hat{\Omega}^{-1}$, where $\hat{\Omega}$ estimates the covariance of the moments from first-step residuals
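
For a linear model the two steps have a closed form, which the following sketch illustrates on synthetic cross-sectional data (one endogenous regressor, three instruments). It is a minimal illustration under those assumptions, not the estimator used by any particular library; `linear_gmm` and `two_step_gmm` are hypothetical names.

```python
import numpy as np

def linear_gmm(y, X, Z, A):
    """Closed-form linear GMM: minimizes (Z'e)' A (Z'e) with e = y - X theta."""
    XZ = X.T @ Z
    return np.linalg.solve(XZ @ A @ XZ.T, XZ @ A @ (Z.T @ y))

def two_step_gmm(y, X, Z):
    """Step 1: 2SLS weights (Z'Z)^{-1}. Step 2: heteroskedasticity-robust weights."""
    theta_1 = linear_gmm(y, X, Z, np.linalg.inv(Z.T @ Z))
    e_1 = y - X @ theta_1                    # residuals use the actual regressors
    omega = (Z * e_1[:, None] ** 2).T @ Z    # sum_i e_i^2 z_i z_i'
    theta_2 = linear_gmm(y, X, Z, np.linalg.inv(omega))
    return theta_1, theta_2

# Synthetic example: x is endogenous because it shares the error u with y
rng = np.random.default_rng(1)
n_obs = 5000
z = rng.normal(size=(n_obs, 3))
u = rng.normal(size=n_obs)
x = z @ np.array([1.0, 0.5, -0.5]) + 0.8 * u + rng.normal(size=n_obs)
y_syn = 1.0 + 2.0 * x + u                    # true slope is 2.0

X_syn = np.column_stack([np.ones(n_obs), x])
Z_syn = np.column_stack([np.ones(n_obs), z])

theta_2sls, theta_gmm = two_step_gmm(y_syn, X_syn, Z_syn)
theta_ols = np.linalg.lstsq(X_syn, y_syn, rcond=None)[0]
print("OLS  :", theta_ols)
print("2SLS :", theta_2sls)
print("GMM-2:", theta_gmm)
```

With homoskedastic errors, as here, the two steps should give nearly identical estimates; the second step pays off mainly when error variances differ across observations.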

---

In [None]:
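# For demonstration, we use pooled 2SLS in levels with the spatial lag as an additional regressor.
# Note: This is a simplified approach. It does not remove the fixed effects α_i, so lagged levels
# of y remain correlated with the composite error. A full implementation would first-difference
# the model (Arellano-Bond) and use a specialized GMM estimator.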

print("Estimating Dynamic Spatial Panel via GMM...\n")

# Remove missing values from lags
data_gmm = data_sorted.dropna(subset=['y_lag1', 'y_lag2', 'y_lag3', 'Wy', 'W_invest', 'W_educ']).copy()

print(f"Estimation sample: {len(data_gmm)} observations\n")

# Manual 2SLS estimation (simplified GMM)
from sklearn.linear_model import LinearRegression
from scipy.linalg import inv

# Endogenous variables: y_lag1, Wy
endog = data_gmm[['y_lag1', 'Wy']].values

# Exogenous variables (include in both stages)
exog = data_gmm[['investment', 'education']].values

# Instruments: y_lag2, y_lag3, W_invest, W_educ (Wy_lag1 is constructed above but not used here)
instruments = data_gmm[['y_lag2', 'y_lag3', 'W_invest', 'W_educ']].values

# Add constant
const = np.ones((len(data_gmm), 1))

# First stage: Regress endogenous on [instruments, exog, const]
Z = np.hstack([instruments, exog, const])
X_endog = endog

# Fitted values
Pi = inv(Z.T @ Z) @ Z.T @ X_endog
X_endog_hat = Z @ Pi

# Second stage: Regress y on [X_endog_hat, exog, const]
y = data_gmm['gdp_growth'].values
X = np.hstack([X_endog_hat, exog, const])

beta_2sls = inv(X.T @ X) @ X.T @ y

# Extract coefficients
gamma_hat = beta_2sls[0]
rho_hat = beta_2sls[1]
beta_invest_hat = beta_2sls[2]
beta_educ_hat = beta_2sls[3]
intercept = beta_2sls[4]

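# Compute standard errors (robust)
# 2SLS residuals must use the ACTUAL endogenous regressors, not their first-stage fitted values
X_actual = np.hstack([endog, exog, const])
residuals = y - X_actual @ beta_2sls
n = len(y)
k = X.shape[1]

# Robust variance-covariance matrix (X holds fitted values, so X'X = X'P_Z X)
meat = (Z * residuals[:, None]**2).T @ Z  # sum_i e_i^2 z_i z_i', without building an n×n matrix
bread = inv(X.T @ Z @ inv(Z.T @ Z) @ Z.T @ X)
V_robust = (n / (n - k)) * bread @ (X.T @ Z @ inv(Z.T @ Z) @ meat @ inv(Z.T @ Z) @ Z.T @ X) @ bread
se_robust = np.sqrt(np.diag(V_robust))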

# t-statistics
t_stats = beta_2sls / se_robust
p_values = 2 * (1 - stats.t.cdf(np.abs(t_stats), n - k))

# Display results
print("="*70)
print("DYNAMIC SPATIAL PANEL GMM RESULTS")
print("="*70)

print(f"\nSample size: N×T = {data_gmm['entity_id'].nunique()}×{data_gmm['time'].nunique()} = {n}")
print(f"Instruments: {Z.shape[1]} total")

print("\n" + "-"*70)
print(f"{'Variable':<20} {'Coef':>10} {'Std Err':>10} {'t':>8} {'P>|t|':>10}")
print("-"*70)

var_names = ['y_lag1 (γ)', 'Wy (ρ)', 'investment', 'education', 'const']
for i, name in enumerate(var_names):
    print(f"{name:<20} {beta_2sls[i]:>10.4f} {se_robust[i]:>10.4f} {t_stats[i]:>8.2f} {p_values[i]:>10.4f}")

print("-"*70)

# Interpretation
print("\nINTERPRETATION:")
print(f"\nTemporal persistence (γ): {gamma_hat:.4f} (true: {gamma_true})")
if 0 < gamma_hat < 1 and p_values[0] < 0.05:
    half_life = np.log(0.5) / np.log(gamma_hat)
    print(f"  ✓ Convergence detected: shocks dissipate over time")
    print(f"  → Half-life: {half_life:.2f} years")
elif gamma_hat >= 1:
    print(f"  ⚠ No convergence: unit root or explosive process")

print(f"\nSpatial spillover (ρ): {rho_hat:.4f} (true: {rho_true})")
if rho_hat > 0 and p_values[1] < 0.05:
    print(f"  ✓ Positive spatial spillovers detected")
    print(f"  → Growth in neighbors increases own growth")

print(f"\nInvestment effect: {beta_invest_hat:.4f} (true: {beta_invest})")
print(f"Education effect:  {beta_educ_hat:.4f} (true: {beta_educ})")

print("="*70)

# Store for later use
results_dict = {
    'gamma': gamma_hat,
    'rho': rho_hat,
    'beta_invest': beta_invest_hat,
    'beta_educ': beta_educ_hat,
    'se': se_robust,
    'pvalues': p_values,
    'residuals': residuals,
    'n': n,
    'k': k,
    'Z': Z,
    'X': X
}

## 4. Short-Run vs Long-Run Effects

### Conceptual Difference

**Short-run effect**: Immediate impact in same period (coefficient $\beta$)

**Long-run effect**: Cumulative effect after full adjustment

### Long-Run Multiplier

From the model:

$$
y_{it} = \gamma y_{i,t-1} + \rho Wy_{it} + X_{it}\beta + \alpha_i + \varepsilon_{it}
$$

At steady state ($y_{it} = y_{i,t-1} = y^*$), stacking over regions and moving $\gamma y^*$ and $\rho W y^*$ to the left-hand side:

$$
\left[(1-\gamma) I - \rho W\right] y^* = X\beta + \alpha
$$

**Long-run spatial multiplier**:

$$
\frac{\partial y^*}{\partial X} = \left[(1-\gamma) I - \rho W\right]^{-1} \beta
$$

### Approximation

Average long-run effect:

$$
LR = \frac{\beta}{1 - \gamma - \rho \lambda_{max}}
$$

Where $\lambda_{max}$ is the maximum eigenvalue of $W$.
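
The scalar formula can be checked against the full matrix of long-run effects. Setting $y_t = y_{t-1} = y^*$ in the model gives $[(1-\gamma)I - \rho W]\,y^* = X\beta + \alpha$, so the effect matrix is $\beta\,[(1-\gamma)I - \rho W]^{-1}$. The sketch below assumes a small row-standardized ring of regions and illustrative parameter values; `long_run_effects` is a hypothetical helper.

```python
import numpy as np

def long_run_effects(W, gamma, rho, beta):
    """N x N matrix of long-run effects d y*_i / d x_j for a permanent change in x_j."""
    n = W.shape[0]
    return beta * np.linalg.inv((1.0 - gamma) * np.eye(n) - rho * W)

# Row-standardized ring of 5 regions (each region has two neighbours)
n_ring = 5
W_ring = np.zeros((n_ring, n_ring))
for i in range(n_ring):
    W_ring[i, (i - 1) % n_ring] = 0.5
    W_ring[i, (i + 1) % n_ring] = 0.5

gamma_demo, rho_demo, beta_demo = 0.6, 0.25, 0.15
LR = long_run_effects(W_ring, gamma_demo, rho_demo, beta_demo)

direct = np.diag(LR).mean()        # average own-region effect
total = LR.sum(axis=1).mean()      # average row sum (own effect + spillovers)
indirect = total - direct
scalar_formula = beta_demo / (1.0 - gamma_demo - rho_demo * 1.0)  # lambda_max = 1

print(f"Direct: {direct:.4f}  Indirect: {indirect:.4f}  Total: {total:.4f}")
print(f"Scalar formula: {scalar_formula:.4f}")
```

For a row-standardized $W$ the row sums of the effect matrix all equal $\beta / (1 - \gamma - \rho)$, so the scalar formula reproduces the average total effect; the matrix additionally splits it into direct and indirect parts.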

---

In [None]:
# Compute short-run vs long-run effects
print("="*70)
print("SHORT-RUN vs LONG-RUN EFFECTS")
print("="*70)

gamma_hat = results_dict['gamma']
rho_hat = results_dict['rho']
beta_invest_hat = results_dict['beta_invest']
beta_educ_hat = results_dict['beta_educ']

# Compute max eigenvalue of W
eigenvalues = np.linalg.eigvals(W_dense).real
lambda_max = eigenvalues.max()

print(f"\nMaximum eigenvalue of W: {lambda_max:.4f}")

# Temporal multiplier (ignoring spatial)
temporal_multiplier = 1 / (1 - gamma_hat)

# Combined dynamic-spatial multiplier (approximation)
dyn_spatial_multiplier = 1 / (1 - gamma_hat - rho_hat * lambda_max)

print(f"\nMultipliers:")
print(f"  Temporal only:     {temporal_multiplier:.4f}x")
print(f"  Dynamic + Spatial: {dyn_spatial_multiplier:.4f}x")

# Long-run effects
lr_invest = beta_invest_hat * dyn_spatial_multiplier
lr_educ = beta_educ_hat * dyn_spatial_multiplier

print("\n" + "-"*70)
print(f"{'Variable':<15} {'Short-Run':>12} {'Long-Run':>12} {'LR/SR':>10}")
print("-"*70)

print(f"{'Investment':<15} {beta_invest_hat:>12.4f} {lr_invest:>12.4f} {lr_invest/beta_invest_hat:>10.2f}x")
print(f"{'Education':<15} {beta_educ_hat:>12.4f} {lr_educ:>12.4f} {lr_educ/beta_educ_hat:>10.2f}x")

print("-"*70)

print("\nINTERPRETATION (Investment):")
print(f"\n  → 1 percentage point increase in investment rate:")
print(f"    • Short-run: {beta_invest_hat:.3f} pp increase in GDP growth")
print(f"    • Long-run:  {lr_invest:.3f} pp increase in GDP growth")
print(f"    • Multiplier: {dyn_spatial_multiplier:.2f}x")

print("\n  The long-run effect is larger because:")
print(f"    1. Temporal persistence (γ={gamma_hat:.2f}): past growth feeds into future growth")
print(f"    2. Spatial spillovers (ρ={rho_hat:.2f}): neighbors' growth boosts own growth")

print("="*70)

# Visualization
fig, ax = plt.subplots(figsize=(10, 6))

variables = ['Investment', 'Education']
sr_effects = [beta_invest_hat, beta_educ_hat]
lr_effects = [lr_invest, lr_educ]

x = np.arange(len(variables))
width = 0.35

bars1 = ax.bar(x - width/2, sr_effects, width, label='Short-Run', alpha=0.8)
bars2 = ax.bar(x + width/2, lr_effects, width, label='Long-Run', alpha=0.8)

ax.set_xlabel('Variable', fontsize=12, fontweight='bold')
ax.set_ylabel('Effect on GDP Growth', fontsize=12, fontweight='bold')
ax.set_title('Short-Run vs Long-Run Effects', fontsize=14, fontweight='bold')
ax.set_xticks(x)
ax.set_xticklabels(variables)
ax.legend(fontsize=11)
ax.grid(True, alpha=0.3, axis='y')

# Add value labels on bars
for bars in [bars1, bars2]:
    for bar in bars:
        height = bar.get_height()
        ax.text(bar.get_x() + bar.get_width()/2., height,
                f'{height:.3f}', ha='center', va='bottom', fontsize=10)

plt.tight_layout()
plt.savefig('../outputs/figures/nb07_sr_vs_lr.png', dpi=300, bbox_inches='tight')
plt.show()

print("\n✓ Figure saved: ../outputs/figures/nb07_sr_vs_lr.png")

## 5. Hansen J-Test (Over-Identification)

### Testing Instrument Validity

When # instruments > # parameters (over-identified), we can test validity.

**Hansen J-statistic**:

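$$
J = n \cdot g(\hat{\theta})' \hat{\Omega}^{-1} g(\hat{\theta}) \sim \chi^2(df)
$$

Where:
- $df = $ # instruments - # parameters
- $g(\hat{\theta}) = \frac{1}{n}Z'\hat{\varepsilon}$
- $\hat{\Omega}$ = estimated covariance matrix of the moments (its inverse is the efficient GMM weighting matrix; it is not the spatial weight matrix $W$)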

**Hypotheses**:
- $H_0$: Instruments are valid (uncorrelated with errors)
- $H_A$: At least one instrument is invalid

**Decision rule**:
- p > 0.05: Fail to reject $H_0$ → no evidence against instrument validity ✓
- p < 0.05: Reject $H_0$ → at least one instrument appears invalid ⚠
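
As a rough illustration of how the statistic behaves, the sketch below computes a heteroskedasticity-robust J at the two-step estimate for two synthetic data sets: one where every instrument is excluded from the outcome equation, and one where an instrument enters the outcome directly. The data are cross-sectional and invented for the example, and `hansen_j` is a hypothetical helper rather than a library function.

```python
import numpy as np
from scipy.stats import chi2

def hansen_j(y, X, Z):
    """Two-step linear GMM followed by the Hansen J statistic and its p-value."""
    n = len(y)
    XZ = X.T @ Z

    def gmm(A):
        return np.linalg.solve(XZ @ A @ XZ.T, XZ @ A @ (Z.T @ y))

    # Step 1: 2SLS weights, used only to obtain first-round residuals
    e1 = y - X @ gmm(np.linalg.inv(Z.T @ Z))
    omega = (Z * e1[:, None] ** 2).T @ Z / n   # estimate of E[e^2 z z']
    omega_inv = np.linalg.inv(omega)

    # Step 2: efficient estimate, then evaluate the sample moments at it
    theta = gmm(omega_inv)
    g_bar = Z.T @ (y - X @ theta) / n
    j_stat = float(n * g_bar @ omega_inv @ g_bar)
    df = Z.shape[1] - X.shape[1]
    return j_stat, df, float(chi2.sf(j_stat, df))

rng = np.random.default_rng(2)
n_obs = 4000
z = rng.normal(size=(n_obs, 3))
u = rng.normal(size=n_obs)
x = z.sum(axis=1) + 0.7 * u + rng.normal(size=n_obs)
X_j = np.column_stack([np.ones(n_obs), x])
Z_j = np.column_stack([np.ones(n_obs), z])

y_valid = 1.0 + 2.0 * x + u             # all instruments correctly excluded
y_invalid = y_valid + 0.5 * z[:, 2]     # third instrument affects y directly

j_valid, df_j, p_valid = hansen_j(y_valid, X_j, Z_j)
j_invalid, _, p_invalid = hansen_j(y_invalid, X_j, Z_j)
print(f"Valid set:   J = {j_valid:.2f}, df = {df_j}, p = {p_valid:.4f}")
print(f"Invalid set: J = {j_invalid:.2f}, df = {df_j}, p = {p_invalid:.4f}")
```

The test detects that the instruments disagree with each other; it cannot tell which one is at fault, and it has no power if all instruments are invalid in the same direction.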

---

In [None]:
# Hansen J-test for over-identification
print("="*70)
print("HANSEN J-TEST FOR OVER-IDENTIFICATION")
print("="*70)

n = results_dict['n']
k = results_dict['k']
Z = results_dict['Z']
residuals = results_dict['residuals']

# Number of instruments vs parameters
num_instruments = Z.shape[1]
num_params = k
df_hansen = num_instruments - num_params

print(f"\nNumber of instruments: {num_instruments}")
print(f"Number of parameters:  {num_params}")
print(f"Degrees of freedom:    {df_hansen}")

# Compute J-statistic (homoskedastic Sargan form, used here as a simple stand-in for Hansen's J)
# J = n * (Z'ε/n)' [Z'Z/n]^{-1} (Z'ε/n) / σ^2
Ze = Z.T @ residuals / n
sigma2 = np.sum(residuals**2) / (n - k)

J_stat = n * Ze.T @ inv(Z.T @ Z / n) @ Ze / sigma2
p_hansen = 1 - chi2.cdf(J_stat, df_hansen)

print(f"\nJ-statistic: {J_stat:.4f}")
print(f"p-value:     {p_hansen:.4f}")

print("\n" + "-"*70)
if p_hansen > 0.05:
    print("✓ Fail to reject H₀ → Instruments are valid")
    print("  → Over-identifying restrictions are satisfied")
else:
    print("⚠ Reject H₀ → Instruments may be invalid")
    print("  → Check for:")
    print("    • Weak instruments (first-stage F-stat)")
    print("    • Serial correlation in errors (AR tests)")
    print("    • Invalid exclusion restrictions")

print("-"*70)

print("\nRECOMMENDATIONS:")
print("  • J-test alone is not sufficient for instrument validity")
print("  • Also check: AR(2) test, weak instrument diagnostics")
print("  • Consider difference-in-Sargan for subsets of instruments")

print("="*70)

# Store
results_dict['hansen_j'] = J_stat
results_dict['hansen_pvalue'] = p_hansen
results_dict['hansen_df'] = df_hansen

## 6. Impulse-Response Functions

### Tracing Shock Propagation

An **impulse-response function (IRF)** shows how a one-time shock in region $i$ at time $t$ propagates:
- **Over time**: Through temporal persistence ($\gamma$)
- **Across space**: Through spatial spillovers ($\rho$)

### Dynamic Equation

Simplified (ignoring $X$):

$$
y_t = (I - \rho W)^{-1} \gamma y_{t-1} + \text{shock}
$$

### Interpretation

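- **Period 0**: Shock hits region $i$
- **Period 1**: The response is $\gamma (I - \rho W)^{-1}$ applied to the shock
  - Region $i$: roughly $\gamma \times$ shock (persistence, plus small feedback from neighbors)
  - Neighbor $j$: roughly $\gamma \rho w_{ji} \times$ shock, to first order in $\rho$
- **Period 2+**: Continues to decay and diffuse

With $\gamma \geq 0$, $\rho \geq 0$, and row-standardized $W$ ($\lambda_{max} = 1$), the system is stable (shocks dissipate) when $\gamma / (1 - \rho\lambda_{max}) < 1$, i.e. $\gamma + \rho\lambda_{max} < 1$. Having $0 < \gamma < 1$ and $|\rho| < 1/\lambda_{max}$ separately is not enough.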
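
Because $y_{t-1}$ is pre-multiplied by $\gamma (I - \rho W)^{-1}$, whether shocks die out depends on the two parameters jointly, through the spectral radius of that matrix. The following sketch checks this numerically on a small row-standardized ring; the parameter pairs are illustrative and `is_stable` is a hypothetical helper.

```python
import numpy as np

def is_stable(W, gamma, rho):
    """Check whether shocks decay under y_t = gamma * (I - rho W)^{-1} y_{t-1}."""
    n = W.shape[0]
    transition = gamma * np.linalg.inv(np.eye(n) - rho * W)
    spectral_radius = float(np.max(np.abs(np.linalg.eigvals(transition))))
    return spectral_radius < 1.0, spectral_radius

# Row-standardized ring of 6 regions (largest eigenvalue of W is 1)
n_ring = 6
W_ring = np.zeros((n_ring, n_ring))
for i in range(n_ring):
    W_ring[i, (i - 1) % n_ring] = 0.5
    W_ring[i, (i + 1) % n_ring] = 0.5

stable_a, radius_a = is_stable(W_ring, gamma=0.6, rho=0.25)  # 0.6 / (1 - 0.25) = 0.8
stable_b, radius_b = is_stable(W_ring, gamma=0.9, rho=0.50)  # 0.9 / (1 - 0.50) = 1.8

print(f"gamma=0.6, rho=0.25: stable={stable_a}, spectral radius={radius_a:.2f}")
print(f"gamma=0.9, rho=0.50: stable={stable_b}, spectral radius={radius_b:.2f}")
```

The second case has $\gamma < 1$ and $\rho < 1$ individually, yet the combined process is explosive, so it is worth running this check on the estimates before plotting an IRF.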

---

In [None]:
# Compute and visualize Impulse-Response Function
print("Computing Impulse-Response Function...\n")

gamma_hat = results_dict['gamma']
rho_hat = results_dict['rho']

n_regions = W.n
n_periods = 12

# Initialize
y_irf = np.zeros((n_periods, n_regions))

# Unit shock in region 0 at t=0
shock_region = 0
y_irf[0, shock_region] = 1.0

# Propagate shock over time and space
I = np.eye(n_regions)
A_inv = inv(I - rho_hat * W_dense)

for t in range(1, n_periods):
    # y_t = (I - ρW)^{-1} γ y_{t-1}
    y_irf[t, :] = A_inv @ (gamma_hat * y_irf[t-1, :])

print(f"Shock: Unit shock in region {shock_region} at t=0")
print(f"Propagation: {n_periods} periods\n")

# Identify neighbors of shocked region
neighbor_idx = list(W.neighbors[shock_region])
distant_idx = [i for i in range(n_regions) if i not in neighbor_idx and i != shock_region]

print(f"Shocked region:  {shock_region}")
print(f"Neighbors:       {len(neighbor_idx)} regions")
print(f"Distant regions: {len(distant_idx)} regions")

# Plot IRF
fig, ax = plt.subplots(figsize=(12, 6))

time_periods = np.arange(n_periods)

# Shocked region
ax.plot(time_periods, y_irf[:, shock_region], marker='o', label=f'Shocked region ({shock_region})',
        linewidth=2.5, markersize=8, color='darkred')

# Average response in neighbors
neighbor_response = y_irf[:, neighbor_idx].mean(axis=1)
ax.plot(time_periods, neighbor_response, marker='s',
        label=f'Neighbors (avg of {len(neighbor_idx)})', linewidth=2.5, markersize=7, color='darkblue')

# Average response in distant regions
if len(distant_idx) > 0:
    distant_response = y_irf[:, distant_idx].mean(axis=1)
    ax.plot(time_periods, distant_response, marker='^',
            label=f'Distant regions (avg of {len(distant_idx)})', linewidth=2.5, markersize=7, color='darkgreen')

ax.set_xlabel('Time Period', fontsize=13, fontweight='bold')
ax.set_ylabel('Response', fontsize=13, fontweight='bold')
ax.set_title(f'Impulse-Response Function (γ={gamma_hat:.2f}, ρ={rho_hat:.2f})',
             fontsize=14, fontweight='bold')
ax.legend(fontsize=11, frameon=True, shadow=True)
ax.grid(True, alpha=0.3)
ax.axhline(0, color='black', linestyle='-', linewidth=0.8)

plt.tight_layout()
plt.savefig('../outputs/figures/nb07_irf.png', dpi=300, bbox_inches='tight')
plt.show()

print("\n✓ IRF plot saved: ../outputs/figures/nb07_irf.png")

# Interpretation
print("\nINTERPRETATION:")
print(f"  → Shock decays over time (γ = {gamma_hat:.2f} < 1)")
print(f"  → Shock diffuses to neighbors (ρ = {rho_hat:.2f} > 0)")
print(f"  → Distant regions affected with delay (multi-step spillovers)")
print(f"  → After {n_periods - 1} periods, response in shocked region: {y_irf[-1, shock_region]:.4f}")

In [None]:
# Heatmap: Time × Space
print("Creating IRF heatmap...\n")

fig, ax = plt.subplots(figsize=(14, 8))

im = ax.imshow(y_irf.T, cmap='RdBu_r', aspect='auto', interpolation='nearest',
               vmin=-np.abs(y_irf).max(), vmax=np.abs(y_irf).max())

ax.set_xlabel('Time Period', fontsize=13, fontweight='bold')
ax.set_ylabel('Region Index', fontsize=13, fontweight='bold')
ax.set_title(f'Impulse-Response Heatmap: Shock in Region {shock_region}',
             fontsize=14, fontweight='bold')

# Highlight shocked region
ax.axhline(shock_region, color='yellow', linestyle='--', linewidth=2, alpha=0.7)

# Colorbar
cbar = plt.colorbar(im, ax=ax)
cbar.set_label('Response Magnitude', fontsize=12, fontweight='bold')

plt.tight_layout()
plt.savefig('../outputs/figures/nb07_irf_heatmap.png', dpi=300, bbox_inches='tight')
plt.show()

print("✓ IRF heatmap saved: ../outputs/figures/nb07_irf_heatmap.png")

print("\nHeatmap shows:")
print("  • Horizontal axis: Time progression")
print("  • Vertical axis: All regions (shocked region highlighted)")
print("  • Color intensity: Response magnitude")
print("  • Pattern: Shock diffuses spatially and decays temporally")

## 7. Difference-in-Sargan Test

### Testing Subsets of Instruments

The **difference-in-Sargan test** compares:
- **Full model**: Uses all instruments
- **Restricted model**: Excludes subset of instruments

**Test statistic**:

$$
\Delta J = J_{full} - J_{restricted} \sim \chi^2(df)
$$

Where $df = $ # excluded instruments

**Hypotheses**:
- $H_0$: Excluded instruments are valid
- $H_A$: At least one excluded instrument is invalid

**Use case**: Test whether adding more lags improves or harms estimation.
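
The arithmetic of the test is simple enough to wrap in a helper. The sketch below assumes the two J statistics and their degrees of freedom have already been computed; the numbers passed in are hypothetical and chosen only to show the call, and `difference_in_sargan` is an illustrative name.

```python
from scipy.stats import chi2

def difference_in_sargan(j_full, df_full, j_restricted, df_restricted):
    """Return (statistic, df, p-value) for the instruments dropped from the full set."""
    diff_df = df_full - df_restricted
    if diff_df <= 0:
        raise ValueError("The restricted set must use fewer instruments than the full set.")
    # In finite samples the difference can turn negative when the two J statistics
    # are built with different weighting matrices; treat that as no evidence against H0.
    diff_stat = max(j_full - j_restricted, 0.0)
    return diff_stat, diff_df, float(chi2.sf(diff_stat, diff_df))

# Hypothetical J statistics, for illustration only
stat, df, p = difference_in_sargan(j_full=7.2, df_full=3, j_restricted=2.1, df_restricted=2)
print(f"Difference = {stat:.2f}, df = {df}, p = {p:.4f}")
```

Clamping the statistic at zero is a pragmatic choice for this sketch; using the same weighting matrix for both models avoids negative differences in the first place.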

---

In [None]:
# Difference-in-Sargan test
print("="*70)
print("DIFFERENCE-IN-SARGAN TEST")
print("="*70)

print("\nComparing instrument sets:")
print("  Full:       y_lag2, y_lag3, W_invest, W_educ")
print("  Restricted: y_lag2, W_invest, W_educ (exclude y_lag3)")

# Re-estimate with restricted instrument set (exclude y_lag3)
instruments_restricted = data_gmm[['y_lag2', 'W_invest', 'W_educ']].values
Z_restricted = np.hstack([instruments_restricted, exog, const])

# First stage
Pi_r = inv(Z_restricted.T @ Z_restricted) @ Z_restricted.T @ endog
X_endog_hat_r = Z_restricted @ Pi_r

# Second stage
X_r = np.hstack([X_endog_hat_r, exog, const])
beta_r = inv(X_r.T @ X_r) @ X_r.T @ y

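# Residuals (computed with the actual endogenous regressors, not first-stage fitted values)
residuals_r = y - np.hstack([endog, exog, const]) @ beta_r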

# J-statistic for restricted model
Ze_r = Z_restricted.T @ residuals_r / n
sigma2_r = np.sum(residuals_r**2) / (n - k)
J_restricted = n * Ze_r.T @ inv(Z_restricted.T @ Z_restricted / n) @ Ze_r / sigma2_r

df_restricted = Z_restricted.shape[1] - k

# Difference-in-Sargan
J_full = results_dict['hansen_j']
diff_sargan = J_full - J_restricted
diff_df = results_dict['hansen_df'] - df_restricted

p_diff = 1 - chi2.cdf(diff_sargan, diff_df)

print(f"\nJ-statistic (full):       {J_full:.4f}")
print(f"J-statistic (restricted): {J_restricted:.4f}")
print(f"\nDifference:               {diff_sargan:.4f}")
print(f"df (# excluded):          {diff_df}")
print(f"p-value:                  {p_diff:.4f}")

print("\n" + "-"*70)
if p_diff > 0.05:
    print("✓ Fail to reject H₀ → Excluded instruments (y_lag3) are valid")
    print("  → Using additional lags improves efficiency")
else:
    print("⚠ Reject H₀ → Excluded instruments may be invalid")
    print("  → Additional lags may not satisfy exclusion restriction")
    print("  → Possible serial correlation in errors")

print("-"*70)
print("="*70)

## 8. Case Study: Regional Economic Growth

### Research Questions

1. **Do regions converge over time?** ($\gamma < 1$?)
2. **Are there spatial growth spillovers?** ($\rho > 0$?)
3. **What is the long-run impact of investment?** (LR multiplier)

### Policy Implications

- **Convergence** ($\gamma < 1$): Poor regions catch up to rich regions
- **Spillovers** ($\rho > 0$): Regional policies have cross-border effects
- **LR multipliers**: Short-run evaluations underestimate true impact

---

In [None]:
print("="*70)
print("CASE STUDY: REGIONAL ECONOMIC CONVERGENCE")
print("="*70)

print("\nResearch Context:")
print("  Panel of {} regions over {} years".format(
    data_gmm['entity_id'].nunique(), data_gmm['time'].nunique()))
print("  Dependent variable: GDP growth rate")
print("  Key regressors: Investment rate, education level")

print("\n" + "-"*70)
print("RESEARCH QUESTIONS & FINDINGS")
print("-"*70)

gamma_hat = results_dict['gamma']
rho_hat = results_dict['rho']
beta_invest_hat = results_dict['beta_invest']
p_gamma = results_dict['pvalues'][0]
p_rho = results_dict['pvalues'][1]

# Q1: Convergence?
print("\n1. DO REGIONS CONVERGE OVER TIME?")
print(f"   γ (persistence) = {gamma_hat:.4f} (p = {p_gamma:.4f})")

# Require gamma_hat > 0 so that the half-life (log of gamma_hat) is defined
if 0 < gamma_hat < 1 and p_gamma < 0.05:
    convergence_rate = (1 - gamma_hat) * 100
    half_life = np.log(0.5) / np.log(gamma_hat)
    print(f"\n   ✓ YES - Evidence of convergence")
    print(f"     • Convergence rate: {convergence_rate:.1f}% per year")
    print(f"     • Half-life of shocks: {half_life:.1f} years")
    print(f"     • Implication: Regional disparities diminish over time")
else:
    print(f"\n   ✗ NO - No evidence of convergence")
    print(f"     • Shocks may have permanent effects")

# Q2: Spillovers?
print("\n2. ARE THERE SPATIAL GROWTH SPILLOVERS?")
print(f"   ρ (spatial lag) = {rho_hat:.4f} (p = {p_rho:.4f})")

if rho_hat > 0 and p_rho < 0.05:
    print(f"\n   ✓ YES - Positive spatial spillovers detected")
    print(f"     • 1 pp increase in neighbors' average growth → {rho_hat:.2f} pp own growth")
    print(f"     • Mechanisms: Knowledge diffusion, trade linkages, migration")
    print(f"     • Implication: Coordinated regional policies more effective")
else:
    print(f"\n   ✗ NO - No significant spatial spillovers")

# Q3: LR impact
print("\n3. WHAT IS THE LONG-RUN IMPACT OF INVESTMENT?")
print(f"   Short-run effect: {beta_invest_hat:.4f}")
print(f"   Long-run effect:  {lr_invest:.4f}")
print(f"   LR multiplier:    {dyn_spatial_multiplier:.2f}x")

print(f"\n   → 1 pp increase in investment rate:")
print(f"     • Immediate:  +{beta_invest_hat:.3f} pp GDP growth")
print(f"     • Long-run:   +{lr_invest:.3f} pp GDP growth")
print(f"     • Multiplier effect: {dyn_spatial_multiplier:.2f}x")

print("\n" + "-"*70)
print("POLICY IMPLICATIONS")
print("-"*70)

print("\n1. DYNAMIC EFFECTS")
print(f"   • Regional investments have long-lasting effects (γ = {gamma_hat:.2f})")
print(f"   • Short-run evaluations underestimate true impact by {(lr_invest/beta_invest_hat - 1)*100:.0f}%")

print("\n2. SPATIAL COORDINATION")
print(f"   • Spillovers magnify policy effects (ρ = {rho_hat:.2f})")
print(f"   • Unilateral policies ignore {(dyn_spatial_multiplier - temporal_multiplier)/dyn_spatial_multiplier*100:.0f}% of total effect")
print(f"   • Coordinated regional strategies enhance effectiveness")

print("\n3. TARGETING")
print(f"   • High-spillover regions (central locations) have larger impact")
print(f"   • Network position matters for policy effectiveness")

print("="*70)

## 9. Diagnostic Tests

### Arellano-Bond Serial Correlation Tests

For dynamic panels, we test for serial correlation in **first-differenced errors**:

- **AR(1) test**: Expected to be significant (differencing induces MA(1))
- **AR(2) test**: Should be **insignificant** for valid instruments

**Why AR(2) matters**:
- If AR(2) is significant → $\varepsilon_{it}$ is serially correlated
- Then $y_{i,t-2}$ is **not valid** as instrument for $y_{i,t-1}$
- Need deeper lags: $y_{i,t-3}, y_{i,t-4}, \ldots$

---

In [None]:
# Arellano-Bond AR tests (simplified)
print("="*70)
print("DIAGNOSTIC TESTS: SERIAL CORRELATION")
print("="*70)

# Compute first-differenced residuals within each region
data_gmm_test = data_gmm.copy()
data_gmm_test['resid'] = results_dict['residuals']
data_gmm_test['d_resid'] = data_gmm_test.groupby('entity_id')['resid'].diff()
data_gmm_test['d_resid_lag1'] = data_gmm_test.groupby('entity_id')['d_resid'].shift(1)
data_gmm_test['d_resid_lag2'] = data_gmm_test.groupby('entity_id')['d_resid'].shift(2)

# AR(1) test: correlation between Δε_t and Δε_{t-1}
data_test1 = data_gmm_test.dropna(subset=['d_resid', 'd_resid_lag1'])
corr_ar1 = np.corrcoef(data_test1['d_resid'], data_test1['d_resid_lag1'])[0, 1]
n_ar1 = len(data_test1)
z_ar1 = np.sqrt(n_ar1) * corr_ar1
p_ar1 = 2 * (1 - stats.norm.cdf(np.abs(z_ar1)))

# AR(2) test: correlation between Δε_t and Δε_{t-2}
data_test2 = data_gmm_test.dropna(subset=['d_resid', 'd_resid_lag2'])
corr_ar2 = np.corrcoef(data_test2['d_resid'], data_test2['d_resid_lag2'])[0, 1]
n_ar2 = len(data_test2)
z_ar2 = np.sqrt(n_ar2) * corr_ar2
p_ar2 = 2 * (1 - stats.norm.cdf(np.abs(z_ar2)))

print("\nArellano-Bond Tests for Serial Correlation:")
print("-"*70)
print(f"{'Test':<10} {'Correlation':>12} {'z-stat':>10} {'p-value':>10} {'Result'}")
print("-"*70)

print(f"{'AR(1)':<10} {corr_ar1:>12.4f} {z_ar1:>10.2f} {p_ar1:>10.4f}", end=" ")
if p_ar1 < 0.05:
    print("(Expected)")
else:
    print("")

print(f"{'AR(2)':<10} {corr_ar2:>12.4f} {z_ar2:>10.2f} {p_ar2:>10.4f}", end=" ")
if p_ar2 > 0.05:
    print("✓ Valid")
else:
    print("⚠ Invalid")

print("-"*70)

print("\nINTERPRETATION:")
print("  AR(1): Significant (expected due to differencing)")

if p_ar2 > 0.05:
    print("  AR(2): ✓ Insignificant → Instruments likely valid")
    print("         → No second-order serial correlation")
    print("         → y_{t-2} is valid instrument for Δy_{t-1}")
else:
    print("  AR(2): ⚠ Significant → Instrument validity questionable")
    print("         → Serial correlation detected")
    print("         → Need deeper lags (y_{t-3}, y_{t-4}, ...)")

print("="*70)

## 10. Summary

### Key Takeaways

1. **Dynamic spatial panels** combine:
   - Temporal dynamics: $\gamma y_{i,t-1}$ (persistence/convergence)
   - Spatial dynamics: $\rho Wy_{it}$ (spillovers/diffusion)

2. **Double endogeneity** requires GMM:
   - Nickell bias: $y_{i,t-1}$ correlated with $\alpha_i$
   - Spatial endogeneity: $Wy_{it}$ correlated with $\varepsilon_{it}$

3. **Valid instruments**:
   - For $y_{i,t-1}$: Temporal lags $y_{i,t-2}, y_{i,t-3}, \ldots$
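   - For $Wy_{it}$: Spatial lags of $X$ ($WX$, $W^2X$) and temporal-spatial lags

4. **Short-run ≠ Long-run**:
   - LR multiplier = $\frac{1}{1 - \gamma - \rho\lambda_{max}}$
   - The LR/SR ratio grows quickly as $\gamma + \rho\lambda_{max}$ approaches 1; with this notebook's true values ($\gamma = 0.6$, $\rho = 0.25$, $\lambda_{max} = 1$) it is $1/0.15 \approx 6.7$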

5. **Impulse-response functions**:
   - Trace shock propagation across space and time
   - Visualize decay (temporal) and diffusion (spatial)

6. **Diagnostic tests**:
   - Hansen J-test: Over-identification
   - AR(2) test: Instrument validity
   - Difference-in-Sargan: Subset validity

---

### Applications

| Field | Temporal ($\gamma$) | Spatial ($\rho$) |
|-------|-------------------|------------------|
| **Economic growth** | Convergence dynamics | Knowledge spillovers |
| **Crime** | Recidivism, persistence | Contagion effects |
| **Innovation** | Learning by doing | Technology diffusion |
| **Unemployment** | Hysteresis | Labor market linkages |
| **Public health** | Disease progression | Spatial transmission |

---

### Next Steps

**Notebook 08**: Comprehensive specification testing
- Spatial autocorrelation tests (Moran's I, LM tests)
- Model selection (SAR vs SEM vs SDM vs SDEM)
- Robustness checks (alternative W matrices)

---

### References

1. **Arellano & Bond (1991)**: "Some Tests of Specification for Panel Data"
2. **Blundell & Bond (1998)**: "Initial Conditions and Moment Restrictions"
3. **Elhorst (2014)**: *Spatial Econometrics*
4. **Yu et al. (2008)**: "Quasi-maximum likelihood estimators for spatial dynamic panel data"
5. **Lee & Yu (2010)**: "Estimation of spatial autoregressive panel data models with fixed effects"

---

In [None]:
# Final summary table
print("="*70)
print("NOTEBOOK COMPLETION SUMMARY")
print("="*70)

print("\n✓ TASKS COMPLETED:")
tasks = [
    "Estimated dynamic spatial panel via GMM",
    "Constructed valid instruments for double endogeneity",
    "Computed short-run vs long-run effects",
    "Performed Hansen J-test for over-identification",
    "Generated impulse-response functions (line plot + heatmap)",
    "Conducted difference-in-Sargan test",
    "Analyzed regional growth convergence case study",
    "Tested serial correlation (AR tests)"
]

for i, task in enumerate(tasks, 1):
    print(f"  {i}. {task}")

print("\n📊 OUTPUTS GENERATED:")
outputs = [
    "../outputs/figures/nb07_sr_vs_lr.png",
    "../outputs/figures/nb07_irf.png",
    "../outputs/figures/nb07_irf_heatmap.png"
]

for output in outputs:
    print(f"  • {output}")

print("\n🎓 LEARNING OUTCOMES ACHIEVED:")
outcomes = [
    "Understanding of double endogeneity (Nickell + spatial)",
    "GMM estimation with valid instruments",
    "Distinction between short-run and long-run effects",
    "Interpretation of temporal persistence (γ) and spatial spillovers (ρ)",
    "Impulse-response analysis for shock propagation",
    "Instrument validity testing (Hansen J, Sargan, AR tests)"
]

for outcome in outcomes:
    print(f"  ✓ {outcome}")

print("\n⏱️  ESTIMATED COMPLETION TIME: 180-210 minutes")
print("\n📚 NEXT: Notebook 08 - Comprehensive Specification Testing")

print("="*70)