# Tutorial 6: Advanced Structural Models

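This tutorial covers **advanced structural models**: a dynamic consumption-savings model and a dynamic discrete choice model. In these settings the likelihood is often costly or intractable, which is where moment-based estimators (GMM and SMM) are most useful.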

## What You'll Learn

1. Consumption-savings models (Euler equations)
2. Dynamic discrete choice models (Rust-style)
3. Tips for estimating complex structural models

## Prerequisites

- Completed Tutorials 1-5
- Basic understanding of dynamic programming

In [None]:
import numpy as np
import matplotlib.pyplot as plt

from momentest import (
    gmm_estimate,
    smm_estimate,
    consumption_savings,
    dynamic_discrete_choice,
    table_estimates,
    confidence_interval,
    plot_objective_landscape,
    plot_moment_comparison,
)

np.random.seed(42)

## Part 1: Consumption-Savings Model

### The Model

A household maximizes lifetime utility:

$$\max_{C_1, C_2} U(C_1) + \beta E[U(C_2)]$$

subject to:
- $C_1 + S = Y_1$ (period 1 budget)
- $C_2 = Y_2 + (1+r)S$ (period 2 budget)

With CRRA utility $U(C) = \frac{C^{1-\gamma}}{1-\gamma}$, the **Euler equation** is:

$$C_1^{-\gamma} = \beta(1+r) E[C_2^{-\gamma}]$$
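
The step from the maximization problem to the Euler equation can be written out as follows. This is a sketch that assumes the interest rate $r$ is known in period 1 and that the solution is interior. Substituting both budget constraints leaves a problem in savings $S$ alone:

$$\max_{S} \; U(Y_1 - S) + \beta E[U(Y_2 + (1+r)S)]$$

The first-order condition with respect to $S$ is:

$$-U'(C_1) + \beta(1+r) E[U'(C_2)] = 0$$

With CRRA utility, $U'(C) = C^{-\gamma}$, which gives the Euler equation above. Dividing both sides by $C_1^{-\gamma}$ (known in period 1) yields the form used for estimation:

$$E\left[\beta(1+r)\left(\frac{C_2}{C_1}\right)^{-\gamma}\right] = 1$$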

### Parameters

- $\beta$: Discount factor (patience)
- $\gamma$: Relative risk aversion

In [None]:
# Generate data from consumption-savings model
dgp = consumption_savings(n=1000, seed=42, beta=0.95, gamma=2.0)

# Print info
dgp.info()

In [None]:
# Look at the data
print("Data variables:")
for key, val in dgp.data.items():
    if isinstance(val, np.ndarray):
        print(f"  {key}: shape={val.shape}, mean={val.mean():.2f}")

### GMM Estimation via Euler Equation

The Euler equation gives us moment conditions:

$$E[\beta(1+r)(C_2/C_1)^{-\gamma} - 1] = 0$$

Interacting the Euler residual with three instruments (a constant, $Y_1$, and $Y_2$) gives three moment conditions for the two parameters.
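
To make the moment conditions concrete, here is a minimal NumPy sketch of what such a moment function might compute. It is not the code behind `dgp.moment_function`: the function name `euler_moments` and the argument names `c1`, `c2`, `y1`, `y2`, `r` are hypothetical, and the toy data are built so that the Euler equation holds exactly.

```python
import numpy as np

def euler_moments(theta, c1, c2, y1, y2, r):
    """Sample moments: Euler residual times (1, Y1, Y2). Hypothetical helper."""
    beta, gamma = theta
    resid = beta * (1.0 + r) * (c2 / c1) ** (-gamma) - 1.0
    Z = np.column_stack([np.ones_like(y1), y1, y2])  # instruments
    return (Z * resid[:, None]).mean(axis=0)         # one value per instrument

# Toy data (assumption): consumption growth satisfies the Euler equation exactly
rng = np.random.default_rng(0)
n, beta0, gamma0, r = 500, 0.95, 2.0, 0.05
y1 = rng.uniform(1.0, 2.0, n)
y2 = rng.uniform(1.0, 2.0, n)
c1 = rng.uniform(0.5, 1.5, n)
c2 = c1 * (beta0 * (1.0 + r)) ** (1.0 / gamma0)

g_true = euler_moments((beta0, gamma0), c1, c2, y1, y2, r)   # all close to zero
g_wrong = euler_moments((0.90, gamma0), c1, c2, y1, y2, r)   # nonzero at a wrong beta
print(g_true, g_wrong)
```

At the true parameters all three moments vanish, while a wrong $\beta$ pushes them away from zero; GMM picks the parameters that bring the weighted moments as close to zero as possible.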

In [None]:
# Use the DGP's moment function
moment_func = dgp.moment_function

# Estimate
result_cs = gmm_estimate(
    data=dgp.data,
    moment_func=moment_func,
    bounds=[(0.8, 0.999), (0.5, 5.0)],  # (beta, gamma)
    k=3,  # 3 moment conditions
    weighting="optimal",
    n_global=100,
)

print("\n" + "="*60)
print("Consumption-Savings Model Estimation")
print("="*60)
print(f"{'Parameter':<15} {'True':>10} {'Estimate':>10} {'SE':>10}")
print("-"*45)
print(f"{'β (discount)':<15} {dgp.true_theta[0]:>10.4f} {result_cs.theta[0]:>10.4f} {result_cs.se[0]:>10.4f}")
print(f"{'γ (risk avers)':<15} {dgp.true_theta[1]:>10.4f} {result_cs.theta[1]:>10.4f} {result_cs.se[1]:>10.4f}")
print("="*60)

In [None]:
# Visualize the objective landscape
from momentest import GMMEngine

engine_cs = GMMEngine(
    data=dgp.data,
    k=3,
    p=2,
    moment_func=moment_func,
)

_, S_cs = engine_cs.moments(result_cs.theta)
W_cs = np.linalg.inv(S_cs)

fig = plot_objective_landscape(
    engine=engine_cs,
    theta_hat=result_cs.theta,
    data_moments=np.zeros(3),
    W=W_cs,
    param_indices=(0, 1),
    param_names=["β", "γ"],
    n_points=30,
    scale=0.1,
    plot_type="contour",
)
plt.title("Consumption-Savings: Objective Landscape")
plt.show()

### Interpretation

- **β ≈ 0.95**: Households discount the future at roughly 5% per period ($1/\beta - 1 \approx 5.3\%$)
- **γ ≈ 2**: Moderate risk aversion (γ=1 is log utility, γ>3 is high risk aversion)

The Euler equation approach is powerful because:
1. No need to solve the full dynamic program
2. Works with observational data on consumption
3. Robust to misspecification of parts of the model it does not use, such as the income process

## Part 2: Dynamic Discrete Choice (DDC)

### The Model (Simplified Rust 1987)

An agent makes binary decisions over time:

- **State**: $x_t \in \{0, 1, ..., X_{max}\}$ (e.g., mileage)
- **Action**: $a_t \in \{0, 1\}$ (0=keep, 1=replace)

**Flow utility**:
- Keep: $u(x, a=0) = \theta_0 + \theta_1 x + \varepsilon_0$
- Replace: $u(x, a=1) = -RC + \varepsilon_1$

**Transition**:
- Replace: $x' = 0$
- Keep: $x' = \min(x+1, X_{max})$

### Parameters

- $\theta_0$: Constant in keep utility
- $\theta_1$: Cost of higher state (e.g., maintenance cost increases with mileage)
- $\beta$ and $RC$: Discount factor and replacement cost, treated as known and not estimated
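
To see how these parameters translate into choice probabilities, here is a minimal value-function-iteration sketch. It rests on several assumptions that the tutorial does not state: the shocks are i.i.d. type-1 extreme value (so choice probabilities are logit), $X_{max} = 10$, and the state term enters as a cost ($-\theta_1 x$), following the verbal description of $\theta_1$. The function `solve_replacement_model` is hypothetical and is not how `dynamic_discrete_choice` is necessarily implemented.

```python
import numpy as np

def solve_replacement_model(theta_0, theta_1, RC, beta, x_max=10,
                            tol=1e-10, max_iter=5000):
    """Iterate on the Bellman equation under logit shocks (assumption)."""
    x = np.arange(x_max + 1)
    next_if_keep = np.minimum(x + 1, x_max)   # keep: x' = min(x + 1, x_max)
    V = np.zeros(x_max + 1)
    for _ in range(max_iter):
        # Assumed sign convention: theta_1 * x is a cost of keeping
        v_keep = theta_0 - theta_1 * x + beta * V[next_if_keep]
        v_replace = -RC + beta * V[0]          # replace: x' = 0
        # Expected max under EV1 shocks; Euler's constant is dropped because
        # it shifts V uniformly and leaves choice probabilities unchanged
        V_new = np.logaddexp(v_keep, v_replace)
        converged = np.max(np.abs(V_new - V)) < tol
        V = V_new
        if converged:
            break
    v_keep = theta_0 - theta_1 * x + beta * V[next_if_keep]
    v_replace = -RC + beta * V[0]
    p_replace = 1.0 / (1.0 + np.exp(v_keep - v_replace))
    return V, p_replace

V, p_replace = solve_replacement_model(theta_0=-1.0, theta_1=0.5, RC=5.0, beta=0.9)
print(np.round(p_replace, 3))
```

Under these assumptions the replacement probability rises with the state, which is the pattern the data cells below look for.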

In [None]:
# Generate data from DDC model
dgp_ddc = dynamic_discrete_choice(
    n=500,      # 500 agents
    T=20,       # 20 time periods
    seed=42,
    theta_0=-1.0,  # Constant in keep utility
    theta_1=0.5,   # Cost of higher state
    beta=0.9,      # Discount factor (known)
    RC=5.0,        # Replacement cost (known)
)

dgp_ddc.info()

In [None]:
# Look at the data
states = dgp_ddc.data['state']
actions = dgp_ddc.data['action']

print(f"Total observations: {len(states)}")
print(f"Replacement rate: {actions.mean():.1%}")
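print("\nState distribution:")
n_states = int(states.max()) + 1  # infer the state grid from the data instead of hard-coding 11
for x in range(n_states):
    count = np.sum(states == x)
    # Report NaN (not 0%) for states that never occur in the sample
    replace_rate = np.mean(actions[states == x]) if count > 0 else np.nan
    print(f"  x={x:2d}: {count:5d} obs, replace rate = {replace_rate:.1%}")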

In [None]:
# Visualize replacement probability by state
states_unique = np.unique(states)  # only states that occur, so no empty slices
replace_probs = [np.mean(actions[states == x]) for x in states_unique]

plt.figure(figsize=(10, 5))
plt.bar(states_unique, replace_probs, color='steelblue', edgecolor='black')
plt.xlabel('State (x)')
plt.ylabel('Replacement Probability')
plt.title('Replacement Probability by State')
plt.xticks(states_unique)
plt.grid(True, alpha=0.3, axis='y')
plt.show()

print("As state increases, replacement becomes more likely.")
print("This is because the cost of keeping (θ₁ * x) increases.")

### GMM Estimation via Conditional Choice Probabilities

The moment conditions come from the CCP approach:

$$E[a - P(a=1|x; \theta)] = 0$$

where $P(a=1|x; \theta)$ is the model-implied choice probability.
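
As an illustration, the sketch below interacts the prediction error $a - P(a=1|x;\theta)$ with the instruments $(1, x, x^2)$, giving three moments. The function `ccp_moments` and the logistic `p_true` are hypothetical stand-ins: in the actual model the probabilities come from solving the dynamic program at each candidate $\theta$.

```python
import numpy as np

def ccp_moments(actions, states, p_replace):
    """Mean of (a - P(a=1|x)) times (1, x, x^2). Hypothetical helper."""
    err = actions - p_replace[states]          # prediction error per observation
    Z = np.column_stack([np.ones(len(states)), states, states ** 2])
    return (Z * err[:, None]).mean(axis=0)

# Toy data (assumption): replacement probability is logistic in the state
rng = np.random.default_rng(0)
x_grid = np.arange(11)
p_true = 1.0 / (1.0 + np.exp(3.0 - 0.5 * x_grid))
states = rng.integers(0, 11, size=100_000)
actions = (rng.random(states.size) < p_true[states]).astype(float)

g_true = ccp_moments(actions, states, p_true)                        # near zero
g_wrong = ccp_moments(actions, states, np.clip(p_true + 0.1, 0, 1))  # biased
print(g_true, g_wrong)
```

With the correct probabilities the moments are close to zero up to sampling noise; overstating the probabilities by 0.1 moves the first moment to about -0.1.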

In [None]:
# Use the DGP's moment function
moment_func_ddc = dgp_ddc.moment_function

# Estimate
result_ddc = gmm_estimate(
    data=dgp_ddc.data,
    moment_func=moment_func_ddc,
    bounds=[(-5.0, 2.0), (-1.0, 2.0)],  # (theta_0, theta_1)
    k=3,  # 3 moment conditions
    weighting="optimal",
    n_global=100,
)

print("\n" + "="*60)
print("Dynamic Discrete Choice Model Estimation")
print("="*60)
print(f"{'Parameter':<15} {'True':>10} {'Estimate':>10} {'SE':>10}")
print("-"*45)
print(f"{'θ₀ (constant)':<15} {dgp_ddc.true_theta[0]:>10.4f} {result_ddc.theta[0]:>10.4f} {result_ddc.se[0]:>10.4f}")
print(f"{'θ₁ (state cost)':<15} {dgp_ddc.true_theta[1]:>10.4f} {result_ddc.theta[1]:>10.4f} {result_ddc.se[1]:>10.4f}")
print("="*60)

### Interpretation

- **θ₀ < 0**: Keeping has negative baseline utility (maintenance cost)
- **θ₁ > 0**: Higher state increases the cost of keeping

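Ignoring the taste shocks $\varepsilon_0$ and $\varepsilon_1$, the agent replaces when:
$$\theta_0 + \theta_1 x + \beta V(x+1) < -RC + \beta V(0)$$

Replacement becomes more attractive as $x$ rises whenever the value of keeping (the left side) falls with the state, which is the case when the state term acts as a maintenance cost.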

## Part 3: Tips for Complex Structural Models

### 1. Start Simple

- Begin with a simplified version of your model
- Verify estimation works on simulated data
- Gradually add complexity

In [None]:
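# Example: Start with known parameters, verify recovery
print("Verification: Can we recover known parameters?")
print(f"  True θ₀ = {dgp_ddc.true_theta[0]}, Estimated = {result_ddc.theta[0]:.3f}")
print(f"  True θ₁ = {dgp_ddc.true_theta[1]}, Estimated = {result_ddc.theta[1]:.3f}")
# Only report success if every estimate lies within 2 standard errors of the truth
gap = np.abs(np.asarray(result_ddc.theta) - np.asarray(dgp_ddc.true_theta))
within_2se = gap <= 2 * np.asarray(result_ddc.se)
print("  ✓ Parameters recovered within 2 SE" if within_2se.all()
      else "  ✗ At least one estimate is more than 2 SE from the truth")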

### 2. Check Identification

- Visualize the objective landscape
- Check the Jacobian matrix
- Try different starting points
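
The Jacobian check can be sketched with plain NumPy. The helper `numerical_jacobian` and the toy linear moment function below are hypothetical; in practice `g` would be your own moment function evaluated at the estimate. Local identification requires the Jacobian of the moments with respect to the parameters to have full column rank, and a very large condition number signals weak identification.

```python
import numpy as np

def numerical_jacobian(g, theta, eps=1e-6):
    """Central-difference Jacobian of the moment vector g at theta."""
    theta = np.asarray(theta, dtype=float)
    k = np.asarray(g(theta)).size
    J = np.empty((k, theta.size))
    for j in range(theta.size):
        step = np.zeros_like(theta)
        step[j] = eps
        J[:, j] = (np.asarray(g(theta + step)) - np.asarray(g(theta - step))) / (2 * eps)
    return J

# Toy moment function (assumption): 3 linear moments in 2 parameters
A = np.array([[1.0, 0.5], [0.0, 2.0], [1.0, 1.0]])

def toy_moments(theta):
    return A @ theta - np.array([1.0, 2.0, 3.0])

J = numerical_jacobian(toy_moments, [0.3, 0.7])
sing_vals = np.linalg.svd(J, compute_uv=False)
rank = np.linalg.matrix_rank(J)
cond = sing_vals[0] / sing_vals[-1]
print(f"rank = {rank}, condition number = {cond:.2f}")
```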

In [None]:
# Multiple starting points
print("Robustness check: Multiple starting points")
estimates = []
for seed in [1, 42, 123, 456, 789]:
    res = gmm_estimate(
        data=dgp_ddc.data,
        moment_func=moment_func_ddc,
        bounds=[(-5.0, 2.0), (-1.0, 2.0)],
        k=3,
        weighting="optimal",
        seed=seed,
        n_global=50,
    )
    estimates.append(res.theta)
    print(f"  Seed {seed:3d}: θ₀={res.theta[0]:.3f}, θ₁={res.theta[1]:.3f}")

estimates = np.array(estimates)
print(f"\n  Std across seeds: θ₀={estimates[:, 0].std():.4f}, θ₁={estimates[:, 1].std():.4f}")
print("  Low std → the optimizer reaches the same minimum from every start")

### 3. Use Enough Simulations (for SMM)

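- More simulations → lower simulation noise in the moments and in the estimates
- Rule of thumb: n_sim ≥ 10 × sample size, which keeps the variance inflation from simulation noise near 10%
- Trade-off: computation time grows with the number of simulations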
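
The rule of thumb follows from a standard asymptotic result: with $S$ independent simulated samples, each as large as the data, the SMM variance equals the corresponding GMM variance times $(1 + 1/S)$. The short sketch below tabulates the implied inflation of standard errors. The name `smm_se_inflation` is hypothetical, and how $S$ maps to a simulation-count argument in `smm_estimate` is an assumption to check against the library.

```python
import math

def smm_se_inflation(S):
    """Factor by which simulation noise inflates standard errors."""
    if S <= 0:
        raise ValueError("S must be positive")
    return math.sqrt(1.0 + 1.0 / S)

for S in (1, 5, 10, 50):
    print(f"S = {S:3d}: variance x {1 + 1 / S:.3f}, SE x {smm_se_inflation(S):.3f}")
```

Going beyond $S = 10$ buys little extra precision, which is why the computational cost usually decides the choice.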

### 4. Choose Informative Moments

- Moments should vary with parameters
- Different moments identify different parameters
- More moments → more information (but with diminishing returns and more finite-sample bias)

In [None]:
# Example: Which moments identify which parameters?
from momentest import plot_identification, GMMEngine

engine_ddc = GMMEngine(
    data=dgp_ddc.data,
    k=3,
    p=2,
    moment_func=moment_func_ddc,
)

fig = plot_identification(
    engine=engine_ddc,
    theta_hat=result_ddc.theta,
    param_names=["θ₀", "θ₁"],
    moment_names=["E[error]", "E[error×x]", "E[error×x²]"],
)
plt.suptitle("DDC: Moment-Parameter Identification", y=1.02)
plt.show()

### 5. Bootstrap for Inference

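- Asymptotic standard errors can be unreliable in small samples or when the objective is highly nonlinear
- The bootstrap approximates the estimator's finite-sample distribution by re-estimating on resampled data
- With panel data such as the DDC sample, resample whole agents rather than single observations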
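
A minimal sketch of a bootstrap that resamples whole agents (rows of a panel) is shown below. The helper `bootstrap_se` and the toy panel are hypothetical, and the "estimator" is just the overall replacement rate so that the result can be compared with an analytic standard error. For a structural model you would re-run the full estimation inside the loop, which is far more expensive.

```python
import numpy as np

def bootstrap_se(estimator, panel, n_boot=500, seed=0):
    """Bootstrap SE, resampling rows (agents) with replacement."""
    rng = np.random.default_rng(seed)
    n_agents = panel.shape[0]
    draws = []
    for _ in range(n_boot):
        idx = rng.integers(0, n_agents, size=n_agents)  # draw agents, keep their full history
        draws.append(estimator(panel[idx]))
    return np.std(np.asarray(draws), axis=0, ddof=1)

# Toy panel (assumption): 500 agents, 20 periods, 15% replacement rate
rng = np.random.default_rng(42)
toy_actions = (rng.random((500, 20)) < 0.15).astype(float)

se_boot = bootstrap_se(lambda p: p.mean(), toy_actions)
# Analytic SE that also treats agents as the independent units
se_analytic = toy_actions.mean(axis=1).std(ddof=1) / np.sqrt(toy_actions.shape[0])
print(f"bootstrap SE = {se_boot:.5f}, analytic SE = {se_analytic:.5f}")
```

The two standard errors should agree closely in this toy case; the bootstrap earns its keep when no simple analytic formula is available.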

## Part 4: Summary of Model Types

| Model Type | Method | Key Challenge |
|------------|--------|---------------|
| Linear IV | GMM | Weak instruments |
| Euler equations | GMM | Measurement error |
| Discrete choice | GMM/SMM | Solving value function |
| Dynamic models | SMM | Computational cost |
| Latent variables | SMM | Identification |

## Exercises

### Exercise 1: Vary Risk Aversion
Generate consumption-savings data with different γ values. How does estimation precision change?

### Exercise 2: Longer Horizon DDC
Increase T in the DDC model. Does estimation improve?

### Exercise 3: Misspecified Model
Estimate the DDC model with wrong β or RC. What happens?

In [None]:
# Exercise 1 starter code
print("Effect of risk aversion on estimation precision:")
print(f"{'γ (true)':>10} {'γ (est)':>10} {'SE(γ)':>10}")
print("-"*35)

for gamma_true in [1.0, 2.0, 3.0, 4.0]:
    dgp_ex = consumption_savings(n=1000, seed=42, beta=0.95, gamma=gamma_true)
    res_ex = gmm_estimate(
        data=dgp_ex.data,
        moment_func=dgp_ex.moment_function,
        bounds=[(0.8, 0.999), (0.5, 5.0)],
        k=3,
        weighting="optimal",
        n_global=50,
    )
    print(f"{gamma_true:>10.1f} {res_ex.theta[1]:>10.3f} {res_ex.se[1]:>10.3f}")

## Summary

In this tutorial, you learned:

1. **Consumption-savings models**: Euler equation approach
2. **Dynamic discrete choice**: CCP-based estimation
3. **Best practices**: Start simple, check identification, use bootstrap

### Key Takeaways

- GMM/SMM are powerful for structural estimation
- Complex models require careful diagnostics
- Always verify on simulated data first
- Bootstrap standard errors are a valuable check when asymptotic approximations are in doubt

### Further Reading

- Hansen (1982): GMM theory
- Rust (1987): Dynamic discrete choice
- Hotz & Miller (1993): CCP estimation
- Gourieroux & Monfort (1996): Simulation-based methods