# 02 — System GMM & Efficiency Gains

**Duration:** ~90 minutes  
**Level:** Intermediate  
**Prerequisites:** Notebook 01 (Difference GMM Fundamentals)

## Learning Objectives

1. Understand the **weak instruments problem** in Difference GMM
2. Explain the **Blundell-Bond System GMM** approach
3. Compare efficiency between Difference and System GMM
4. Interpret the **Difference-in-Hansen** test for level instruments
5. Know when to prefer System GMM over Difference GMM

## Outline

1. [The Weak Instruments Problem](#1-weak-instruments)
2. [System GMM: Adding Level Equations](#2-system-gmm)
3. [Estimation with PanelBox](#3-estimation)
4. [Efficiency Comparison](#4-efficiency)
5. [Difference-in-Hansen Test](#5-diff-hansen)
6. [When to Use System GMM](#6-guidelines)
7. [Exercises](#7-exercises)

In [None]:
# Setup
import sys
import warnings
from pathlib import Path

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

project_root = Path("../../..").resolve()
if str(project_root) not in sys.path:
    sys.path.insert(0, str(project_root))

from panelbox.gmm import DifferenceGMM, SystemGMM

sys.path.insert(0, str(Path("..").resolve()))
from utils.visualization import apply_tutorial_style, plot_coefficient_comparison

apply_tutorial_style()
warnings.filterwarnings('ignore', category=UserWarning)
print("Setup complete.")

## 1. The Weak Instruments Problem

Difference GMM uses lagged **levels** as instruments for first-differenced equations. When the underlying series is **highly persistent** ($\rho \to 1$), lagged levels become **weak instruments** for differences:

$$\text{Corr}(y_{i,t-2}, \Delta y_{i,t-1}) \to 0 \quad \text{as } \rho \to 1$$

This leads to:
- Large standard errors
- Coefficient estimates biased downward in finite samples, toward the within-group (fixed effects) estimate
- Poor finite-sample properties
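
The vanishing correlation can be checked with a short simulation. The sketch below is illustrative only: it draws a stationary AR(1) panel with fixed effects and reports the pooled correlation between $y_{i,t-2}$ and $\Delta y_{i,t-1}$ for several values of $\rho$. The helper names, sample sizes, and unit variances are assumptions made for this example; they are not part of PanelBox or the tutorial data.

```python
import numpy as np


def simulate_ar1_panel(rho, n_entities=2000, n_periods=8, burn_in=200, seed=0):
    """Simulate y_it = rho * y_i,t-1 + mu_i + eps_it (hypothetical DGP, unit variances)."""
    rng = np.random.default_rng(seed)
    mu = rng.normal(size=n_entities)          # fixed effects
    y = mu / (1.0 - rho)                      # start at each entity's long-run mean
    columns = []
    for t in range(burn_in + n_periods):
        y = rho * y + mu + rng.normal(size=n_entities)
        if t >= burn_in:                      # keep only post-burn-in periods
            columns.append(y)
    return np.column_stack(columns)           # shape: (n_entities, n_periods)


def instrument_correlation(rho):
    """Pooled correlation between the instrument y_{i,t-2} and the regressor dy_{i,t-1}."""
    y = simulate_ar1_panel(rho)
    y_lag2 = y[:, :-2]                        # y_{i,t-2}
    dy_lag1 = y[:, 1:-1] - y[:, :-2]          # y_{i,t-1} - y_{i,t-2}
    return np.corrcoef(y_lag2.ravel(), dy_lag1.ravel())[0, 1]


for rho in (0.3, 0.5, 0.8, 0.95):
    print(f"rho = {rho:.2f}: corr = {instrument_correlation(rho):+.3f}")
```

Under these assumptions the correlation should shrink toward zero as $\rho$ approaches 1, which is what makes lagged levels uninformative about the differenced regressor.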

Let's demonstrate this with data that has near-unit-root persistence.

In [None]:
# Load weak instruments data (rho = 0.95, near unit root)
weak_data = pd.read_csv("../data/weak_instruments.csv")
print(f"Shape: {weak_data.shape}")
print(f"Entities: {weak_data['entity'].nunique()}, Periods: {weak_data['time'].nunique()}")
weak_data.describe().round(4)

In [None]:
# Difference GMM on weak instruments data
model_diff_weak = DifferenceGMM(
    data=weak_data,
    dep_var='y',
    lags=1,
    id_var='entity',
    time_var='time',
    exog_vars=['x'],
    time_dummies=False,
    collapse=True,
    two_step=True,
    robust=True
)
results_diff_weak = model_diff_weak.fit()

print("Difference GMM on Persistent Series (rho_true = 0.95):")
print(f"  rho_hat = {results_diff_weak.params.iloc[0]:.4f}")
print(f"  SE      = {results_diff_weak.std_errors.iloc[0]:.4f}")
print(f"  Hansen p = {results_diff_weak.hansen_j.pvalue:.4f}")
print("\nNote the large SE: this is the weak instruments problem!")

## 2. System GMM: Adding Level Equations

Blundell & Bond (1998) proposed **System GMM**, which stacks two sets of equations:

### Difference Equations (Arellano-Bond)
$$\Delta y_{it} = \rho \, \Delta y_{i,t-1} + \Delta \mathbf{x}'_{it} \boldsymbol{\beta} + \Delta \varepsilon_{it}$$
**Instruments:** Lags of levels ($y_{i,t-2}, y_{i,t-3}, \ldots$)

### Level Equations (additional)
$$y_{it} = \rho \, y_{i,t-1} + \mathbf{x}'_{it} \boldsymbol{\beta} + \mu_i + \varepsilon_{it}$$
**Instruments:** Lags of differences ($\Delta y_{i,t-1}$)
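
Written as moment conditions, the two instrument sets amount to the following. This is a standard statement of the Arellano-Bond and Blundell-Bond conditions, assuming serially uncorrelated $\varepsilon_{it}$:

$$E[y_{i,t-s} \, \Delta \varepsilon_{it}] = 0 \quad \text{for } s \geq 2 \qquad \text{(difference equations)}$$

$$E[\Delta y_{i,t-1} \, (\mu_i + \varepsilon_{it})] = 0 \qquad \text{(level equations)}$$

The first set needs only serially uncorrelated errors. The second also needs $\Delta y_{i,t-1}$ to be uncorrelated with the fixed effect $\mu_i$, because $\mu_i$ remains in the error term of the level equation.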

### Key Additional Assumption
$$E[\Delta y_{i,1} \cdot \mu_i] = 0$$

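This **stationarity condition** (often called mean stationarity) requires that the initial deviation of $y_{i,0}$ from its long-run mean is uncorrelated with the fixed effect $\mu_i$. Combined with the model, it implies that first differences are uncorrelated with $\mu_i$ in every period, which is what makes lagged differences valid instruments for the level equations. It is plausible when: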
- The process started long before the first observation
- The panel does not begin at a structural break or policy change
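
A small simulation can illustrate what the condition rules out. The sketch below uses a hypothetical AR(1) process without covariates and compares the sample analogue of $E[\Delta y_{i,1} \cdot \mu_i]$ in two cases: $y_{i,0}$ drawn from the stationary distribution, and every entity starting at zero. The function name, $\rho = 0.8$, and unit variances are assumptions chosen for illustration.

```python
import numpy as np


def initial_condition_moment(rho, stationary_start, n_entities=50_000, seed=0):
    """Sample analogue of E[dy_i1 * mu_i] for y_it = rho * y_i,t-1 + mu_i + eps_it."""
    rng = np.random.default_rng(seed)
    mu = rng.normal(size=n_entities)                      # fixed effects
    if stationary_start:
        # y_i0 = long-run mean + a stationary deviation unrelated to mu_i
        deviation = rng.normal(scale=1.0 / np.sqrt(1.0 - rho**2), size=n_entities)
        y0 = mu / (1.0 - rho) + deviation
    else:
        # the process starts at the first observation: every entity begins at zero
        y0 = np.zeros(n_entities)
    y1 = rho * y0 + mu + rng.normal(size=n_entities)
    return float(np.mean((y1 - y0) * mu))


print(f"stationary start: {initial_condition_moment(0.8, True):+.3f}")
print(f"start at zero:    {initial_condition_moment(0.8, False):+.3f}")
```

In this setup the first moment should be close to zero, while the second should be close to $\text{Var}(\mu_i) = 1$. In the second case lagged differences would not be valid instruments for the level equations.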

## 3. Estimation with PanelBox

In [None]:
# System GMM on the same weak instruments data
model_sys_weak = SystemGMM(
    data=weak_data,
    dep_var='y',
    lags=1,
    id_var='entity',
    time_var='time',
    exog_vars=['x'],
    time_dummies=False,
    collapse=True,
    two_step=True,
    robust=True,
    level_instruments={'max_lags': 1}
)
results_sys_weak = model_sys_weak.fit()
print(results_sys_weak.summary())

In [None]:
# Compare Difference vs System GMM
print("\nComparison: Difference GMM vs System GMM")
print("=" * 60)
print(f"{'':20s} {'Diff GMM':>15s} {'System GMM':>15s}")
print("-" * 60)
print(f"{'rho estimate':20s} {results_diff_weak.params.iloc[0]:>15.4f} {results_sys_weak.params.iloc[0]:>15.4f}")
print(f"{'SE(rho)':20s} {results_diff_weak.std_errors.iloc[0]:>15.4f} {results_sys_weak.std_errors.iloc[0]:>15.4f}")
print(f"{'Hansen J p-value':20s} {results_diff_weak.hansen_j.pvalue:>15.4f} {results_sys_weak.hansen_j.pvalue:>15.4f}")
print(f"{'AR(2) p-value':20s} {results_diff_weak.ar2_test.pvalue:>15.4f} {results_sys_weak.ar2_test.pvalue:>15.4f}")
print(f"{'Instruments':20s} {results_diff_weak.n_instruments:>15d} {results_sys_weak.n_instruments:>15d}")
print(f"{'Observations':20s} {results_diff_weak.nobs:>15d} {results_sys_weak.nobs:>15d}")

# Efficiency gain
se_diff = results_diff_weak.std_errors.iloc[0]
se_sys = results_sys_weak.std_errors.iloc[0]
if se_diff > 0 and se_sys > 0:
    gain = (se_diff - se_sys) / se_diff * 100
    print(f"\nEfficiency gain: {gain:.1f}% reduction in SE")
print("True rho = 0.95")

## 4. Efficiency Comparison

Let's compare both estimators on the employment data (which has moderate persistence).

In [None]:
# Load employment data
abdata = pd.read_csv("../data/abdata.csv")

# Difference GMM
model_diff_ab = DifferenceGMM(
    data=abdata, dep_var='n', lags=1, id_var='firm', time_var='year',
    exog_vars=['w', 'k'], time_dummies=True, collapse=True, two_step=True, robust=True
)
results_diff_ab = model_diff_ab.fit()

# System GMM
model_sys_ab = SystemGMM(
    data=abdata, dep_var='n', lags=1, id_var='firm', time_var='year',
    exog_vars=['w', 'k'], time_dummies=True, collapse=True, two_step=True, robust=True,
    level_instruments={'max_lags': 1}
)
results_sys_ab = model_sys_ab.fit()

# Build comparison
comparison_vars = ['L1.n', 'w', 'k']
comp_rows = []
for var in comparison_vars:
    if var in results_diff_ab.params.index and var in results_sys_ab.params.index:
        comp_rows.append({
            'Variable': var,
            'Diff GMM Coef': results_diff_ab.params[var],
            'Diff GMM SE': results_diff_ab.std_errors[var],
            'Sys GMM Coef': results_sys_ab.params[var],
            'Sys GMM SE': results_sys_ab.std_errors[var],
        })

comp_df = pd.DataFrame(comp_rows)
print("Employment Model: Difference vs System GMM")
print("=" * 70)
print(comp_df.to_string(index=False, float_format='{:.4f}'.format))

In [None]:
# Visualize coefficient comparison
rho_diff = results_diff_ab.params.get('L1.n', np.nan)
se_rho_diff = results_diff_ab.std_errors.get('L1.n', np.nan)
rho_sys = results_sys_ab.params.get('L1.n', np.nan)
se_rho_sys = results_sys_ab.std_errors.get('L1.n', np.nan)

estimates = {
    'Difference GMM': (rho_diff, se_rho_diff),
    'System GMM': (rho_sys, se_rho_sys),
}

fig, ax = plt.subplots(figsize=(8, 3))
plot_coefficient_comparison(
    estimates,
    param_name='Employment persistence (rho)',
    title='Difference GMM vs System GMM: Employment Equation',
    ax=ax
)
fig.tight_layout()
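fig_path = Path('../outputs/figures/02_diff_vs_sys_comparison.png')
fig_path.parent.mkdir(parents=True, exist_ok=True)  # create the output folder if it is missing
fig.savefig(fig_path, dpi=150, bbox_inches='tight')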
plt.show()

## 5. Difference-in-Hansen Test

The **Difference-in-Hansen** test checks whether the additional level instruments are valid:

$$C = J_{\text{system}} - J_{\text{difference}} \sim \chi^2(q)$$

where $q$ is the number of additional moment conditions.

- **H0**: Level instruments are valid
- **Reject** ($p < 0.10$): the level instruments are suspect; use Difference GMM instead
- **Fail to reject** ($p \geq 0.10$): no evidence against the level instruments; System GMM is appropriate
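
The arithmetic behind the test is simple. The sketch below computes $C$ and its p-value from two Hansen $J$ statistics and their degrees of freedom. The function name and the numeric inputs are hypothetical and are not taken from any dataset in this notebook; PanelBox may compute the statistic differently internally.

```python
from scipy.stats import chi2


def difference_in_hansen(j_system, df_system, j_difference, df_difference):
    """Return (C, q, p-value) for C = J_system - J_difference ~ chi2(q) under H0."""
    q = df_system - df_difference             # number of additional moment conditions
    if q <= 0:
        raise ValueError("System GMM must add at least one moment condition.")
    c_stat = j_system - j_difference
    p_value = chi2.sf(c_stat, q)              # upper-tail probability
    return c_stat, q, p_value


# Hypothetical inputs, for illustration only
c_stat, q, p_value = difference_in_hansen(
    j_system=12.4, df_system=9, j_difference=7.1, df_difference=6
)
print(f"C = {c_stat:.2f}, q = {q}, p = {p_value:.3f}")
print("reject H0" if p_value < 0.10 else "fail to reject H0")
```

With these made-up inputs the p-value falls above 0.10, so the level instruments would not be rejected.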

In [None]:
# Check Difference-in-Hansen test
if results_sys_ab.diff_hansen is not None:
    dh = results_sys_ab.diff_hansen
    print("Difference-in-Hansen Test:")
    print(f"  Statistic: {dh.statistic:.4f}")
    print(f"  P-value:   {dh.pvalue:.4f}")
    print(f"  Conclusion: {dh.conclusion}")
    if dh.pvalue >= 0.10:
        print("  => No evidence against the level instruments. System GMM is appropriate.")
    else:
        print("  => Level instruments may be invalid. Consider Difference GMM.")
else:
    print("Difference-in-Hansen test not available for this specification.")
    print("This can happen when instrument columns are filtered due to sparse coverage.")

## 6. When to Use System GMM

| Criterion | Prefer Difference GMM | Prefer System GMM |
|-----------|----------------------|--------------------|
| Persistence | $\rho < 0.8$ | $\rho > 0.8$ |
| Panel length | Moderate T | Small T |
| Initial conditions | Suspect | Plausible stationarity |
| Approach | Conservative | More efficient |
| Diff-in-Hansen | Rejects | Does not reject |
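
The table can be condensed into a rough decision rule. The helper below is a hypothetical sketch, not a PanelBox function: it uses the 0.8 persistence threshold and the 0.10 significance level from this notebook, ignores the panel-length row, and takes a preliminary estimate of $\rho$ as input.

```python
def suggest_estimator(rho_hat, diff_hansen_p, stationarity_plausible=True,
                      rho_threshold=0.8, alpha=0.10):
    """Rough reading of the guideline table; a heuristic, not a formal procedure."""
    if not stationarity_plausible:
        return "Difference GMM"       # initial conditions are suspect
    if diff_hansen_p is not None and diff_hansen_p < alpha:
        return "Difference GMM"       # Diff-in-Hansen rejects the level instruments
    if rho_hat > rho_threshold:
        return "System GMM"           # persistent series: weak instruments are likely
    return "Difference GMM"           # conservative default for low persistence


# Hypothetical inputs, for illustration only
print(suggest_estimator(rho_hat=0.95, diff_hansen_p=0.45))
print(suggest_estimator(rho_hat=0.95, diff_hansen_p=0.03))
```

A rule like this is only a starting point; the diagnostics and the economics of the application should drive the final choice.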

### Bond's Rule of Thumb

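> If the System GMM estimate of $\rho$ is close to the Difference GMM estimate, both are likely valid. If they differ substantially, either Difference GMM suffers from weak instruments or the stationarity assumption fails; investigate both.

Bond, Hoeffler & Temple (2001) also recommend checking that the estimate of $\rho$ lies between the fixed-effects estimate (biased downward) and the pooled OLS estimate (biased upward).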
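
One way to make "close" operational is to express the gap between the two estimates in units of their standard errors. The helper below is a hypothetical screening heuristic, not an established test: it treats the two estimates as independent, which they are not, and the threshold of two combined standard errors is an assumption.

```python
import math


def estimates_agree(rho_diff, se_diff, rho_sys, se_sys, n_se=2.0):
    """Return (agree, gap_in_se) for two estimates of rho.

    Treats the estimates as independent, which they are not (they share data
    and instruments), so this is only a rough screening heuristic.
    """
    combined_se = math.sqrt(se_diff**2 + se_sys**2)
    if combined_se == 0:
        raise ValueError("At least one standard error must be positive.")
    gap_in_se = abs(rho_sys - rho_diff) / combined_se
    return gap_in_se <= n_se, gap_in_se


# Hypothetical numbers, for illustration only
agree, gap_in_se = estimates_agree(rho_diff=0.55, se_diff=0.10, rho_sys=0.93, se_sys=0.04)
label = "similar" if agree else "investigate"
print(f"gap = {gap_in_se:.2f} combined SEs -> {label}")
```

With these made-up numbers the gap exceeds the threshold, so the comparison would call for a closer look at instrument strength and the initial conditions.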

In [None]:
# Apply to growth data (highly persistent series)
growth = pd.read_csv("../data/growth.csv")

# Difference GMM
model_diff_g = DifferenceGMM(
    data=growth, dep_var='lgdp', lags=1, id_var='country', time_var='year',
    exog_vars=['inv', 'school', 'popgrowth'], time_dummies=False,
    collapse=True, two_step=True, robust=True
)
results_diff_g = model_diff_g.fit()

# System GMM
model_sys_g = SystemGMM(
    data=growth, dep_var='lgdp', lags=1, id_var='country', time_var='year',
    exog_vars=['inv', 'school', 'popgrowth'], time_dummies=False,
    collapse=True, two_step=True, robust=True,
    level_instruments={'max_lags': 1}
)
results_sys_g = model_sys_g.fit()

print("Growth Model: Difference vs System GMM")
print("=" * 60)
print(f"{'':20s} {'Diff GMM':>15s} {'System GMM':>15s}")
print("-" * 60)
for var in ['L1.lgdp', 'inv', 'school', 'popgrowth']:
    d_val = results_diff_g.params.get(var, np.nan)
    s_val = results_sys_g.params.get(var, np.nan)
    print(f"{var:20s} {d_val:>15.4f} {s_val:>15.4f}")
print("-" * 60)
print(f"{'Hansen J p':20s} {results_diff_g.hansen_j.pvalue:>15.4f} {results_sys_g.hansen_j.pvalue:>15.4f}")
print(f"{'AR(2) p':20s} {results_diff_g.ar2_test.pvalue:>15.4f} {results_sys_g.ar2_test.pvalue:>15.4f}")

## 7. Exercises

### Exercise 1: Varying Persistence
Using the Nickell bias data, estimate both Difference and System GMM for $\rho = 0.3, 0.5, 0.8$ with $T = 5$. When does System GMM provide the largest efficiency gain?

### Exercise 2: Level Instrument Depth
Estimate the employment model with `level_instruments={'max_lags': k}` for $k = 1, 2, 3$. How does the instrument count change? Does a deeper lag structure improve or hurt the estimates?

### Exercise 3: Stationarity Violation
Generate data where the panel starts at $t=0$ with $y_{i,0} = 5 \mu_i$ (strong correlation with fixed effects). Estimate System GMM and check if the Difference-in-Hansen test detects the violation.

In [None]:
# Space for exercises
# YOUR CODE HERE


## Summary

1. **Weak instruments** arise when series are persistent ($\rho \to 1$), degrading Difference GMM
2. **System GMM** adds level equations with lagged differences as instruments
3. The efficiency gain is largest for **persistent series** (typically 20-50% lower SE)
4. The **Difference-in-Hansen** test checks the validity of the level instruments
5. System GMM requires the **stationarity assumption** on initial conditions

### Next Notebook
In **Notebook 03**, we'll dive deeper into **instrument specification**: GMM-style vs IV-style, proliferation, collapse, and lag selection.

---
**References:**
- Blundell, R., & Bond, S. (1998). Initial conditions and moment restrictions in dynamic panel data models. *Journal of Econometrics*, 87(1), 115-143.
- Bond, S., Hoeffler, A., & Temple, J. (2001). GMM estimation of empirical growth models. *Economics Papers* 2001-W21.