# Tutorial 1: Hansen-Singleton (1982) GMM Estimation
## STARTER FILE

**📝 Student Instructions:**

This is the starter file for Tutorial 1. You will implement key components of the GMM estimation procedure.

### ✅ PROVIDED (Complete - No implementation needed):
1. **Data preparation** - All data loading and transformation code
2. **One-step GMM** - Fully implemented in `estimate_gmm()` function
3. **Helper functions** - All utility functions (HAC covariance, numerical jacobian, etc.)
4. **SpecificationRunner** class - Complete framework for running estimations
5. **Question 1 example** - Full working example with specifications defined

### 🔨 TO IMPLEMENT (Your tasks):

#### **Task 1: Two-Step GMM** (in `estimate_gmm()` function)
- Implement the two-step GMM estimator
- Follow the TODO instructions in the code
- Test by running with `gmm_method='two-step'`

#### **Task 2: Iterated GMM** (in `estimate_gmm()` function)
- Implement the iterated GMM estimator
- Follow the TODO instructions in the code
- Test by running with `gmm_method='iterated'`

#### **Task 3: Question 2 Specifications** (Replicate Table 3)
- **Panel A**: Define specifications for NDS consumption with EW+VW returns (NLAG=1,2,4)
- **Panel B**: Define specifications for NDS consumption with VW+RF returns (NLAG=1,2,4)
- Follow the pattern from Question 1 example

#### **Task 4: Question 3** (Parameter Stability Test)
- Implement Wald test for parameter stability across time periods
- Follow the pseudocode structure provided
- Test H0: parameters are equal across 3 periods

#### **Task 5: Question 4** (CARA Utility)
- Repeat Question 2 using CARA utility instead of CRRA
- **Panel A**: CARA with EW+VW returns
- **Panel B**: CARA with VW+RF returns

### 🚨 Important Notes:
- **Assertions**: The code contains `assert False` statements where you need to implement. Remove them when done.
- **Testing**: Run code frequently to catch errors early
- **One-step GMM**: Use this as reference for implementing two-step and iterated
- **Question 1**: Study the example carefully - it shows the complete pattern

### 📚 Resources:
- Hansen-Singleton (1982) paper (provided)
- Lecture slides on GMM estimation
- One-step GMM implementation (use as template)

---


## 0. Quick instructions
1. If you have real data, set `DATA_PATH` and map your column names in the **Config** cell.
2. Returns must be **gross** (e.g., `1 + R`), not net.
3. Consumption should be **real per-capita**. For CRRA we use the **ratio** `c_{t+1}/c_t`; for CARA we use **levels** to form differences `c_{t+1}-c_t`.
4. Instruments typically include a constant and lags of consumption growth and returns.
5. The three regimes for HS-style comparisons are: pre-1959Q2, 1959Q2–1978Q4, post-1978Q4. You can change these.
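
Points 2 and 3 above can be illustrated with a few toy numbers. The following is a minimal sketch, assuming a hypothetical DataFrame `toy` with made-up column names (`ret_net`, `cons`) that are not part of the tutorial data:

```python
import pandas as pd

# Hypothetical toy data: net quarterly returns and real per-capita consumption levels
toy = pd.DataFrame({
    "ret_net": [0.02, -0.01, 0.03],
    "cons": [100.0, 101.0, 103.02],
})

# Gross return = 1 + net return
toy["ret_gross"] = 1.0 + toy["ret_net"]

# CRRA uses the growth ratio c_{t+1}/c_t; CARA uses the difference c_{t+1} - c_t
toy["cons_ratio"] = toy["cons"].shift(-1) / toy["cons"]
toy["cons_diff"] = toy["cons"].shift(-1) - toy["cons"]

print(toy)
```

The last row of `cons_ratio` and `cons_diff` is missing because no `t+1` observation exists for it, which is why the estimation data later drops incomplete rows.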

> **Tip:** If you see numerical issues, adjust starting values, tighten bounds, or reduce the instrument set.

In [None]:
!pip install numpy pandas scipy matplotlib statsmodels tabulate openpyxl --quiet

In [None]:
import numpy as np
import pandas as pd
import numpy.linalg as la
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
from pathlib import Path
from dataclasses import dataclass
from typing import List, Optional, Dict, Tuple
import scipy.optimize as opt

try:
    from statsmodels.tsa.stattools import adfuller
    HAS_SM = True
except ImportError:
    HAS_SM = False

pd.set_option('display.float_format', lambda x: f"{x:,.6f}")
from tabulate import tabulate

## 1. Data preparation

We construct quarterly real per-capita consumption and real equity returns from FRED and CRSP monthly data.

**Data sources:**

(1) FRED Monthly Data:
- `PCEND`: Personal Consumption Expenditures: Nondurable Goods (billions of dollars, SAAR)
- `DNDGRG3M086SBEA` (PPCEND): Chain-type price index for nondurable goods
- `PCES`: Personal Consumption Expenditures: Services (billions of dollars, SAAR)
- `DSERRG3M086SBEA` (PPCES): Chain-type price index for services
- `POPTHM`: Population (thousands of persons)
- `CNP16OV`: Civilian Noninstitutional Population (thousands)
- `CPIAUCSL`: Consumer Price Index for All Urban Consumers
- `A796RX0Q048SBEA`: Real Personal Consumption Expenditures per Capita: Nondurable Goods (chained 2017 dollars, SAAR)
- `A797RX0Q048SBEA`: Real Personal Consumption Expenditures per Capita: Services (chained 2017 dollars, SAAR)
- `GS1`: 1-Year Treasury Constant Maturity Rate (percent per annum)

(2) CRSP Monthly Data:
- `vwretd`: Value-Weighted Return (includes dividends)
- `ewretd`: Equal-Weighted Return (includes dividends)
- `sprtrn`: Return on the S&P 500 Index

**Pipeline:**
1. Load monthly FRED and CRSP data
2. Transform to real per-capita consumption and deflate returns by CPI
3. Aggregate to quarterly frequency
4. Construct leads/lags for GMM estimation
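
Step 3 of the pipeline treats consumption and returns differently: consumption is averaged within the quarter, while gross returns are compounded. A minimal sketch on hypothetical monthly numbers (the series `cons_m` and `ret_m` are illustrative, and grouping by calendar quarter is used here instead of the notebook's `resample` call):

```python
import numpy as np
import pandas as pd

# Hypothetical six months of data (two quarters)
idx = pd.period_range("2000-01", periods=6, freq="M")
cons_m = pd.Series([100.0, 101.0, 102.0, 103.0, 104.0, 105.0], index=idx)
ret_m = pd.Series([1.01, 0.99, 1.02, 1.00, 1.01, 1.03], index=idx)  # gross returns

# Map each month to its calendar quarter
quarter = idx.asfreq("Q")

cons_q = cons_m.groupby(quarter).mean()  # consumption: quarterly average
ret_q = ret_m.groupby(quarter).prod()    # gross returns: compound within the quarter

print(cons_q)
print(ret_q)
```

Compounding requires gross returns; taking the product of net returns would not give the quarterly return.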

### 1.1. Configuration

In [None]:
# Data directory and column names
DATA_DIR = Path('data')  # Relative to notebook location
COL_DATE = 'date_q'
COL_CONS_NDS, COL_CONS_NDS_SVC = 'cons_nds_pc', 'cons_nds_svc_pc'
COL_RET_EW, COL_RET_VW, COL_RET_SP, COL_RET_RF = 'ret_ew_gross', 'ret_vw_gross', 'ret_sp_gross', 'rf_gross'

### 1.2 Load raw data

In [None]:
# Helper function to read FRED CSVs
def read_fred(series: str) -> pd.DataFrame:
    df = pd.read_csv(DATA_DIR / f"{series}.csv", parse_dates=['observation_date'])
    return df.rename(columns={'observation_date': 'date', df.columns[1]: series})

# Load all FRED series
fred_specs = [
    ('PCEND', 'PCEND'),                      # Consumption: Nondurable Goods (nominal)
    ('DNDGRG3M086SBEA', 'PPCEND'),           # Price index: Nondurable Goods
    ('PCES', 'PCES'),                        # Consumption: Services (nominal)
    ('DSERRG3M086SBEA', 'PPCES'),            # Price index: Services
    ('POPTHM', 'POPTHM'),                    # Population (thousands)
    ('CNP16OV', 'CNP16OV'),                  # Civilian population (thousands)
    ('CPIAUCSL', 'CPI'),                     # Consumer Price Index
    ('GS1', 'GS1'),                          # 1-Year Treasury rate (%)
    ('A796RX0Q048SBEA', 'CONS_ND_REAL_PC'),  # Real ND consumption per capita (quarterly)
    ('A797RX0Q048SBEA', 'CONS_SVC_REAL_PC'), # Real services consumption per capita (quarterly)
]

fred = read_fred(fred_specs[0][0]).rename(columns={fred_specs[0][0]: fred_specs[0][1]})
for key, alias in fred_specs[1:]:
    try:
        fred = fred.merge(read_fred(key).rename(columns={key: alias}), on='date', how='outer')
    except FileNotFoundError:
        print(f"Warning: {key}.csv not found, skipping...")

fred = fred.sort_values('date')
fred[fred.columns[1:]] = fred[fred.columns[1:]].apply(pd.to_numeric, errors='coerce')
fred['month'] = fred['date'].dt.to_period('M')

# Load CRSP returns
crsp = pd.read_csv(DATA_DIR / 'CRSP.csv', parse_dates=['MthCalDt'])
crsp['month'] = crsp['MthCalDt'].dt.to_period('M')
crsp[['vwretd', 'ewretd', 'sprtrn']] = crsp[['vwretd', 'ewretd', 'sprtrn']].apply(pd.to_numeric, errors='coerce')
crsp_monthly = crsp.set_index('month')[['vwretd', 'ewretd', 'sprtrn']].sort_index()

### 1.3 Transform to real per-capita and deflate returns

Real personal consumption expenditure on nondurables, per capita (deflated by the nondurables price index, not the CPI):
$$
    c_t = \frac{\text{Nondurables Consumption}_t \times 10^9}{\text{Nondurables Price Index}_t/100} \times \frac{1}{\text{Population}_t \times 10^3}
$$
Real personal consumption expenditure on nondurables and services, per capita:
$$
    c_t^* = \left(\frac{\text{Nondurables Consumption}_t \times 10^9}{\text{Nondurables Price Index}_t/100}+\frac{\text{Services Consumption}_t \times 10^9}{\text{Services Price Index}_t/100}\right) \times \frac{1}{\text{Population}_t \times 10^3}
$$
Real Value-weighted Return (VWRETD)
$$
    r_t = \text{Real Value-weighted Return}_t = \frac{1+\text{Value-weighted Return}_t}{\text{CPI}_{t}/\text{CPI}_{t-1}}
$$
Real Equal-weighted Return (EWRETD)
$$
    r_t^* = \text{Real Equal-weighted Return}_t = \frac{1+\text{Equal-weighted Return}_t}{\text{CPI}_t/\text{CPI}_{t-1}}
$$
Real S&P Return (SPRTRN)
$$
    r_t^{**} = \text{Real S\&P Return}_t = \frac{1+\text{S\&P Return}_t}{\text{CPI}_t/\text{CPI}_{t-1}}
$$
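
As a quick check on units, the transformations above can be worked through by hand for a single month. The numbers below are hypothetical and chosen only to make the arithmetic easy to follow:

```python
# Hypothetical one-month inputs, in the units of the raw series
pcend_bil = 3000.0          # nominal nondurables consumption, billions of dollars
price_index = 120.0         # nondurables price index (base = 100)
pop_thousands = 330_000.0   # population, thousands of persons

# Real per-capita consumption: dollars / (index / 100) / persons
cons_real_pc = (pcend_bil * 1e9 / (price_index / 100.0)) / (pop_thousands * 1e3)

# Real gross return: nominal gross return divided by gross CPI inflation
net_return = 0.02
cpi_prev, cpi_now = 250.0, 251.0
real_gross_return = (1.0 + net_return) / (cpi_now / cpi_prev)

print(f"real per-capita consumption: {cons_real_pc:,.2f} dollars")
print(f"real gross return: {real_gross_return:.6f}")
```

The scale factors (`1e9` for billions, `1e3` for thousands) matter only for the level of consumption; they cancel in the growth ratio used under CRRA but not in the differences used under CARA.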

In [None]:
# Compute real per-capita consumption (monthly)
fred = fred.set_index('month')
fred['cons_nds_pc'] = (fred['PCEND'] * 1e9 / (fred['PPCEND'] / 100)) / (fred['POPTHM'] * 1e3)
fred['cons_nds_svc_pc'] = fred['cons_nds_pc'] + (fred['PCES'] * 1e9 / (fred['PPCES'] / 100)) / (fred['POPTHM'] * 1e3)

# Merge FRED and CRSP, compute real returns
monthly = fred[['cons_nds_pc', 'cons_nds_svc_pc', 'CPI', 'GS1']].join(crsp_monthly, how='left').sort_index()
monthly = monthly.loc['1947-01':]
monthly.index = monthly.index.to_timestamp(how='end')

# Inflation and real returns
monthly['inf'] = monthly['CPI'] / monthly['CPI'].shift(1)
monthly['vwret_real_gross'] = (1.0 + monthly['vwretd']) / monthly['inf']
monthly['ewret_real_gross'] = (1.0 + monthly['ewretd']) / monthly['inf']
monthly['spret_real_gross'] = (1.0 + monthly['sprtrn']) / monthly['inf']

# Risk-free rate: convert annual % to monthly gross return, then deflate
# GS1 is an annual rate, so monthly rate ≈ GS1/12; gross monthly return = 1 + GS1/(100*12)
monthly['rf_gross'] = (1.0 + monthly['GS1'] / (100 * 12)) / monthly['inf']

### 1.4 Aggregate to quarterly frequency

In [None]:
# Resample to quarterly: average consumption, compound returns
quarterly = pd.DataFrame({
    'cons_nds_pc': monthly['cons_nds_pc'].resample('QE').mean(),
    'cons_nds_svc_pc': monthly['cons_nds_svc_pc'].resample('QE').mean(),
    'ret_vw_gross': monthly['vwret_real_gross'].resample('QE').prod(),
    'ret_ew_gross': monthly['ewret_real_gross'].resample('QE').prod(),
    'ret_sp_gross': monthly['spret_real_gross'].resample('QE').prod(),  # Compound monthly S&P 500 returns within the quarter
    'rf_gross': monthly['rf_gross'].resample('QE').prod(),  # Compound quarterly risk-free rate
})

# Add net returns
quarterly['ret_vw'] = quarterly['ret_vw_gross'] - 1.0
quarterly['ret_ew'] = quarterly['ret_ew_gross'] - 1.0
quarterly['ret_sp'] = quarterly['ret_sp_gross'] - 1.0
quarterly['rf'] = quarterly['rf_gross'] - 1.0

# Create plotting dataframe
plot_df = quarterly.loc['1947-03-31':].dropna(
    subset=['cons_nds_pc', 'cons_nds_svc_pc', 'ret_vw_gross', 'ret_ew_gross']
).copy()
plot_df.index.name = 'date_q'

# Create ratio variables for plotting
for c in [COL_CONS_NDS, COL_CONS_NDS_SVC]:
    plot_df[c + '_ratio'] = plot_df[c].shift(-1) / plot_df[c]


### 1.5 Prepare GMM estimation data with leads and lags

In [None]:
# Extract core variables and construct leads/lags for instruments
data = plot_df[[COL_CONS_NDS, COL_CONS_NDS_SVC, COL_RET_EW, COL_RET_VW, COL_RET_SP, COL_RET_RF]].copy()
data[COL_DATE] = plot_df.index.to_period('Q').astype(str)
for c in [COL_CONS_NDS, COL_CONS_NDS_SVC]:
    data[c + '_f1'] = data[c].shift(-1)
    data[c + '_ratio'] = data[c + '_f1'] / data[c]

# Create lagged instruments (including risk-free rate)
for c in [COL_RET_EW, COL_RET_VW, COL_RET_SP, COL_RET_RF, COL_CONS_NDS + '_ratio', COL_CONS_NDS_SVC + '_ratio']:
    for lag in range(1, 7):
        data[c + f'_l{lag}'] = data[c].shift(lag)

data = data.dropna().reset_index(drop=True)
data['t_idx'] = pd.PeriodIndex(data[COL_DATE], freq='Q').to_timestamp()
data['const'] = 1.0  # Add constant
data.head()

## 2. Plots and unit-root diagnostics
Plot the post-WWII (from 1960 onward) quarterly series and compute ADF statistics to assess stationarity.
- Real consumption expenditure on nondurables per capita $c_{t}$ and its growth ratio $c_{t+1} / c_{t}$
- Real consumption expenditure on nondurables and services per capita $c_{t}^{*}$ and its growth ratio $c_{t+1}^{*} / c_{t}^{*}$
- Value-weighted aggregate stock returns $r_{t+1}$
- Equally weighted aggregate stock returns $r_{t+1}^{*}$

In [None]:
# Plot 3x2 panel of time series
fig, axes = plt.subplots(3, 2, figsize=(12, 9), sharex=True)
plot_specs = [
    ('cons_nds_pc', 'Real nondurables per capita $c_t$', 'Real dollars (per capita)', 'tab:blue'),
    ('cons_nds_pc_ratio', r'Growth ratio $c_{t+1}/c_t$ (nondurables)', '', 'tab:orange'),
    ('cons_nds_svc_pc', r'Real nondurables + services per capita $c_t^{*}$', 'Real dollars (per capita)', 'tab:green'),
    ('cons_nds_svc_pc_ratio', r'Growth ratio $c_{t+1}^{*}/c_t^{*}$', '', 'tab:red'),
    ('ret_vw', r'Value-weighted return $r_{t+1}$', 'Net return', 'tab:purple'),
    ('ret_ew', r'Equal-weighted return $r_{t+1}^{*}$', '', 'tab:brown'),
]

for ax, (col, title, ylabel, color) in zip(axes.ravel(), plot_specs):
    ax.plot(plot_df.index, plot_df[col], color=color)
    ax.set_title(title)
    if ylabel:
        ax.set_ylabel(ylabel)
    ax.grid(True, linestyle=':', linewidth=0.6, alpha=0.7)
    ax.xaxis.set_major_locator(mdates.YearLocator(base=10))
    ax.xaxis.set_major_formatter(mdates.DateFormatter('%Y'))

for ax in axes[2, :]:
    ax.set_xlabel('Quarter')
fig.tight_layout()
plt.show()

# ADF unit-root tests
if HAS_SM:
    adf_results = []
    for col, title, _, _ in plot_specs:
        series = plot_df[col].dropna()
        if len(series) >= 25:
            stat, pvalue, lags, nobs, *_ = adfuller(series, autolag='AIC')
            adf_results.append({'series': title, 'test_stat': stat, 'p_value': pvalue, 'lags': lags, 'nobs': nobs})
    display(pd.DataFrame(adf_results))
else:
    print('statsmodels not installed; ADF tests skipped.')

## 3. Generalized Method of Moments (GMM)

GMM is a general estimation framework based on population moment conditions. Suppose we have a model with parameter vector $\theta \in \mathbb{R}^p$ and moment conditions:

$$
\mathbb{E}[g(y_t, \theta_0)] = 0
$$

where $g: \mathbb{R}^d \times \mathbb{R}^p \to \mathbb{R}^L$ is a vector of $L$ moment functions and $\theta_0$ is the true parameter.

### 3.1 GMM estimation procedure

**Step 1: Sample moments**

From observed data $\{y_t\}_{t=1}^T$, construct the sample moment average:
$$
\bar{g}_T(\theta) = \frac{1}{T} \sum_{t=1}^T g(y_t, \theta)
$$

**Step 2: GMM criterion**

Choose $\theta$ to minimize the quadratic form:
$$
Q_T(\theta) = \bar{g}_T(\theta)^\top W \bar{g}_T(\theta)
$$

where $W$ is a positive definite weighting matrix.
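
Steps 1 and 2 can be sketched on a toy problem. The example below assumes a hypothetical model in which $y_t$ has mean $\theta$ and a known variance of 2, giving two moment functions for one parameter; the names `gbar` and `Q` are illustrative and not part of the tutorial code:

```python
import numpy as np

y = np.array([1.0, 2.0, 3.0, 4.0, 5.0])  # toy data

def gbar(theta: float) -> np.ndarray:
    """Sample average of g(y_t, theta) = (y_t - theta, y_t^2 - theta^2 - 2)."""
    g = np.column_stack([y - theta, y**2 - theta**2 - 2.0])  # shape (T, 2)
    return g.mean(axis=0)

def Q(theta: float, W: np.ndarray) -> float:
    """GMM criterion gbar' W gbar."""
    g = gbar(theta)
    return float(g @ W @ g)

W = np.eye(2)                        # identity weighting
grid = np.linspace(2.0, 4.0, 201)    # coarse grid search instead of an optimizer
theta_hat = grid[np.argmin([Q(th, W) for th in grid])]

print(theta_hat, Q(theta_hat, W))
```

A grid search is used only to keep the sketch transparent; the notebook's own code minimizes the criterion with `scipy.optimize.minimize`.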

**Step 3: Two-step efficient GMM**

1. **First step**: Use $W = I$ (identity matrix) to get initial estimate $\hat{\theta}_1$
2. **Second step**: Use the optimal weighting matrix $W = \hat{S}^{-1}$, where $\hat{S}$ is a consistent estimate (evaluated at $\hat{\theta}_1$) of the long-run covariance of the moments:
   $$
   S = \lim_{T \to \infty} \text{Var}(\sqrt{T} \bar{g}_T(\theta_0))
   $$
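
To see why a long-run covariance is used rather than the plain sample variance, the sketch below simulates a hypothetical autocorrelated scalar "moment" series and compares the two. The AR(1) coefficient, the sample size, and the lag truncation `L = 9` are illustrative assumptions, and the Bartlett weights mirror those in the helper functions provided later:

```python
import numpy as np

rng = np.random.default_rng(0)
T, rho = 5000, 0.5

# Simulate an AR(1) series: g_t = rho * g_{t-1} + e_t
e = rng.standard_normal(T)
g = np.zeros(T)
for t in range(1, T):
    g[t] = rho * g[t - 1] + e[t]

gc = g - g.mean()
gamma0 = gc @ gc / T               # sample variance, ignores autocorrelation

L = 9                              # lag truncation, chosen by hand here
S = gamma0
for ell in range(1, L + 1):
    w = 1.0 - ell / (L + 1.0)      # Bartlett kernel weight
    S += 2.0 * w * (gc[ell:] @ gc[:-ell]) / T

print(f"sample variance: {gamma0:.3f}, HAC long-run variance: {S:.3f}")
```

For this process the theoretical long-run variance is $1/(1-\rho)^2 = 4$, well above the unconditional variance $1/(1-\rho^2) \approx 1.33$, so ignoring autocorrelation would understate the sampling variability of $\bar{g}_T$.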

**Step 4: Asymptotic inference**

Under regularity conditions, the efficient GMM estimator is asymptotically normal:
$$
\sqrt{T}(\hat{\theta} - \theta_0) \xrightarrow{d} N(0, V)
$$

where $D = \mathbb{E}[\nabla_\theta g(y_t, \theta_0)]$ is the Jacobian of the moments and, with the efficient weighting matrix $W = S^{-1}$, $V = (D^\top S^{-1} D)^{-1}$.
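
For a general weighting matrix, the asymptotic variance takes the sandwich form $(D^\top W D)^{-1} D^\top W S W D (D^\top W D)^{-1}$, which collapses to $(D^\top S^{-1} D)^{-1}$ when $W = S^{-1}$. The sketch below checks this numerically; the matrices `D` and `S` are hypothetical values chosen for illustration, and `sandwich` is not a function from the tutorial code:

```python
import numpy as np

# Hypothetical Jacobian (3 moments, 2 parameters) and positive definite S
D = np.array([[1.0, 0.0], [0.5, 1.0], [0.2, 0.3]])
A = np.array([[1.0, 0.2, 0.0], [0.0, 1.0, 0.3], [0.1, 0.0, 1.0]])
S = A @ A.T

def sandwich(D: np.ndarray, W: np.ndarray, S: np.ndarray) -> np.ndarray:
    """Asymptotic variance of GMM for an arbitrary weighting matrix W."""
    bread = np.linalg.inv(D.T @ W @ D)
    return bread @ (D.T @ W @ S @ W @ D) @ bread

V_identity = sandwich(D, np.eye(3), S)          # one-step, W = I
V_efficient = sandwich(D, np.linalg.inv(S), S)  # efficient, W = S^{-1}
V_short = np.linalg.inv(D.T @ np.linalg.inv(S) @ D)

print(np.allclose(V_efficient, V_short))
print(np.linalg.eigvalsh(V_identity - V_efficient))
```

The difference `V_identity - V_efficient` is positive semi-definite, which is the sense in which $W = S^{-1}$ is optimal. Standard errors then follow as the square roots of the diagonal of $V/T$.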

**Step 5: Overidentification test (J-test)**

When $L > p$ (more moments than parameters), we can test model specification:
$$
J = T \bar{g}_T(\hat{\theta})^\top \hat{S}^{-1} \bar{g}_T(\hat{\theta}) \xrightarrow{d} \chi^2_{L-p}
$$
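
Given a J statistic, the p-value comes from the upper tail of the chi-squared distribution with $L - p$ degrees of freedom. A minimal sketch with hypothetical numbers (the value of `J_stat` and the moment and parameter counts are made up for illustration):

```python
from scipy.stats import chi2

J_stat = 7.5       # hypothetical J statistic
n_moments = 6      # L
n_params = 2       # p

dof = n_moments - n_params
p_value = chi2.sf(J_stat, dof)   # P(chi2_dof > J_stat)
reject_5pct = p_value < 0.05

print(f"dof={dof}, p-value={p_value:.4f}, reject at 5%: {reject_5pct}")
```

A small p-value indicates that the overidentifying restrictions are rejected. When $L = p$ the model is exactly identified and the test is not available.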

### 3.2 General GMM implementation

We implement a general-purpose GMM estimator that can be applied to any model with moment conditions.

In [None]:
def nw_lags(T: int) -> int:
    """Rule-of-thumb Newey-West lag truncation: floor(4 * (T/100)^(2/9))."""
    return max(0, int(4 * (T/100.0)**(2.0/9.0)))

def hac_covariance(G: np.ndarray, L: Optional[int] = None) -> np.ndarray:
    """
    Newey-West HAC estimator for long-run covariance.
    
    Parameters:
    -----------
    G : np.ndarray, shape (T, n_moments)
        Time series of moment conditions (T observations, n_moments moments)
    L : int, optional
        Number of lags for Newey-West (not the number of moments).
        If None, uses automatic selection.
    
    Returns:
    --------
    S : np.ndarray, shape (n_moments, n_moments)
        HAC covariance matrix for sqrt(T) * gbar
    """
    T, Lm = G.shape
    if L is None:
        L = nw_lags(T)
    
    # Center moments
    Gc = G - G.mean(axis=0, keepdims=True)
    
    # Gamma_0 (contemporaneous covariance)
    S = (Gc.T @ Gc) / T
    
    # Add weighted autocovariances
    for ell in range(1, L+1):
        w = 1.0 - ell / (L + 1.0)  # Bartlett kernel
        Gamma_ell = (Gc[ell:].T @ Gc[:-ell]) / T
        S = S + w * (Gamma_ell + Gamma_ell.T)
    
    return S

def numerical_jacobian(moment_fn, theta, eps=1e-6):
    """
    Compute numerical Jacobian of moment function.
    
    Parameters:
    -----------
    moment_fn : callable
        Function that takes theta and returns gbar (L-vector)
    theta : np.ndarray
        Parameter vector (p-dimensional)
    eps : float
        Step size for numerical differentiation
    
    Returns:
    --------
    D : np.ndarray, shape (L, p)
        Jacobian matrix d(gbar)/d(theta)
    """
    g0 = moment_fn(theta)
    p = len(theta)
    L = g0.size
    D = np.zeros((L, p))
    
    for j in range(p):
        theta_plus = np.array(theta, dtype=float)
        theta_plus[j] += eps
        gj = moment_fn(theta_plus)
        D[:, j] = (gj - g0) / eps
    
    return D

def gmm_criterion(theta, moment_fn, W):
    """GMM objective function Q_T(theta) = gbar' * W * gbar."""
    g = moment_fn(theta)
    return float(g.T @ W @ g)

def gmm_estimate(moment_fn, theta0, bounds=None, W=None, n_step=1, hac_lags=None):
    """
    General GMM estimator (one-step implemented; two-step and iterated left as TODOs).
    
    Parameters:
    -----------
    moment_fn : callable
        Function that takes theta and returns gbar (sample moments)
    theta0 : array-like
        Initial parameter guess
    bounds : list of tuples, optional
        Parameter bounds for optimization
    W : np.ndarray, optional
        Initial weighting matrix. If None, uses identity.
    n_step : int
        Number of steps: 1 (one-step), 2 (two-step), -1 (iterated)
    hac_lags : int, optional
        Lags for HAC covariance estimation
    
    Returns:
    --------
    theta_hat : np.ndarray
        GMM parameter estimate
    V : np.ndarray
        Asymptotic covariance matrix
    W : np.ndarray
        Final weighting matrix used
    S : np.ndarray
        HAC covariance of moments
    """
    # First step: identity weighting
    g0 = moment_fn(theta0)
    L = g0.size
    if W is None:
        W = np.eye(L)
    
    res = opt.minimize(gmm_criterion, np.array(theta0, dtype=float), 
                      args=(moment_fn, W), method='L-BFGS-B', bounds=bounds)
    theta1 = res.x
    
    if n_step == 1:
        # Single-step GMM (not efficient)
        D = numerical_jacobian(moment_fn, theta1)
        M = D.T @ W @ D
        Minv = la.pinv(M)
        V = Minv  # Simplified, not accounting for S
        return theta1, V, W, None

    elif n_step == 2:
        # Second step: efficient weighting with HAC.
        # Computing S requires the moment time series G_t (a moment_matrix_fn
        # returning the (T, L) matrix), which this function does not receive.
        assert False, "Two-step GMM not implemented!"

    elif n_step == -1:
        assert False, "Iterated GMM not implemented!"

    else:
        return theta1, None, W, None


### 3.3 Hansen-Singleton Euler equation model

We now apply GMM to test consumption-based asset pricing. The representative agent's Euler equation implies:

$$
1 = \mathbb{E}_t\left[ M_{t+1} \cdot R_{t+1,k} \right]
$$

where $M_{t+1} = \beta \frac{u'(c_{t+1})}{u'(c_t)}$ is the stochastic discount factor (SDF) and $R_{t+1,k}$ is the gross return on asset $k$.

**CRRA utility:** $u(c) = \frac{c^{1+\alpha}}{1+\alpha}$

The marginal utility is $u'(c) = c^\alpha$, so:
$$
M_{t+1} = \beta \left(\frac{c_{t+1}}{c_t}\right)^\alpha
$$

Parameters: $\theta = (\alpha, \beta)$, where $\beta$ is the discount factor and, under this parameterization, the coefficient of relative risk aversion is $-\alpha$ (so a risk-averse agent has $\alpha < 0$).

**CARA utility:** $u(c) = -e^{-\gamma c}$

The marginal utility is $u'(c) = \gamma e^{-\gamma c}$, so:
$$
M_{t+1} = \beta e^{-\gamma(c_{t+1} - c_t)}
$$

Parameters: $\theta = (\gamma, \beta)$ where $\gamma$ is absolute risk aversion.
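
The two discount factors can be computed directly from the formulas above. This is a minimal sketch; the function names `sdf_crra` and `sdf_cara` and the toy inputs are hypothetical and not part of the tutorial code:

```python
import numpy as np

def sdf_crra(cons_ratio, alpha: float, beta: float) -> np.ndarray:
    """M_{t+1} = beta * (c_{t+1}/c_t)**alpha."""
    return beta * np.asarray(cons_ratio, dtype=float) ** alpha

def sdf_cara(cons_diff, gamma: float, beta: float) -> np.ndarray:
    """M_{t+1} = beta * exp(-gamma * (c_{t+1} - c_t))."""
    return beta * np.exp(-gamma * np.asarray(cons_diff, dtype=float))

ratio = np.array([1.00, 1.01, 0.99])   # toy consumption growth ratios
diff = np.array([0.0, 0.5, -0.5])      # toy consumption differences

m_crra = sdf_crra(ratio, alpha=-2.0, beta=0.99)
m_cara = sdf_cara(diff, gamma=0.1, beta=0.99)

print(m_crra)
print(m_cara)
```

With $\alpha < 0$ or $\gamma > 0$, the discount factor falls when consumption grows, so payoffs in good states are valued less. With flat consumption both reduce to $\beta$.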

**Moment conditions with instruments**

Using instruments $x_t$ (information known at time $t$), the unconditional moment conditions are:
$$
\mathbb{E}[x_t \cdot (M_{t+1} R_{t+1,k} - 1)] = 0
$$

With $K$ assets and $L_x$ instruments, we have $L = K \times L_x$ moment conditions.

For example, with CRRA utility, NDS (consumption of nondurables and services), EWR (equally weighted gross return), and one lag of each as instruments, the moment conditions become:
$$
\mathbb{E}\left[\begin{pmatrix} 1\\ \frac{c_t^{*}}{c_{t-1}^{*}}\\ r_{t}^{*} \end{pmatrix}
    \cdot \left(r_{t+1}^{*}\beta\left(\frac{c_{t+1}^{*}}{c_t^{*}}\right)^{\alpha}-1\right)\right] = 0
$$

**Sample moments**

For each time $t$, the moment vector is:
$$
g_t(\theta) = \text{vec}(x_t \otimes (M_{t+1}(\theta) R_{t+1} - \mathbf{1}_K)^\top)
$$

The sample average is:
$$
\bar{g}_T(\theta) = \frac{1}{T} \sum_{t=1}^T g_t(\theta)
$$
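
One way to build the $T \times (K L_x)$ matrix of per-period moments is with broadcasting. The sketch below is an illustration under assumed inputs: `moment_matrix` is a hypothetical helper, the data are simulated, and each row is ordered as `np.kron(x_t, u_t)`; the ordering of moments within a row is a convention and does not affect the GMM criterion as long as it is used consistently.

```python
import numpy as np

def moment_matrix(M: np.ndarray, R: np.ndarray, X: np.ndarray) -> np.ndarray:
    """
    M : (T,)    SDF values M_{t+1}
    R : (T, K)  gross returns R_{t+1,k}
    X : (T, Lx) instruments x_t
    Returns G : (T, K*Lx), with row t equal to kron(x_t, M_{t+1} R_{t+1} - 1).
    """
    U = M[:, None] * R - 1.0             # pricing errors, shape (T, K)
    G = X[:, :, None] * U[:, None, :]    # outer product per period, (T, Lx, K)
    return G.reshape(len(M), -1)

# Simulated toy inputs
rng = np.random.default_rng(0)
T, K, Lx = 4, 2, 3
M = 0.99 * np.ones(T)
R = 1.0 + 0.05 * rng.standard_normal((T, K))
X = np.column_stack([np.ones(T), rng.standard_normal((T, Lx - 1))])

G = moment_matrix(M, R, X)
gbar = G.mean(axis=0)                    # sample moments, length K*Lx

print(G.shape, gbar.shape)
```

Keeping the full matrix `G`, rather than only its column means, is what makes a HAC estimate of the long-run covariance possible.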

### 3.4 Implementation for Hansen-Singleton model

**Interpretation of estimates**

- **CRRA** ($\alpha$): With $u(c) = c^{1+\alpha}/(1+\alpha)$, the coefficient of relative risk aversion is $-\alpha$. If $\alpha > 0$, the utility function is convex (risk-loving), which is economically implausible.
- **CARA** ($\gamma$): Coefficient of absolute risk aversion. Higher $\gamma$ means more risk-averse.
- **$\beta$**: Subjective discount factor. Should be close to 1 (e.g., 0.96 corresponds to ~4% time preference).

Hansen & Singleton (1982) found:
- 10 out of 12 specifications rejected at 5% level
- Remaining specifications with $\alpha > 0$ (convex utility) are economically implausible
- These results are closely related to what was later termed the "equity premium puzzle": consumption-based models struggle to match asset returns

### 3.5 Unified configuration system

We now create a comprehensive configuration system that allows specifying all aspects of GMM estimation through a single object.

In [None]:
@dataclass
class GMMConfig:
    """Configuration for Hansen-Singleton GMM estimation."""
    util: str  # 'CRRA' or 'CARA'
    returns_cols: List[str]
    instruments: List[str]
    cons_col: str
    cons_ratio_col: str
    cons_f1_col: str
    beta_bounds: tuple = (1e-5, 0.9999)
    alpha_bounds: tuple = (-8.0, 4.0)    # CRRA
    gamma_bounds: tuple = (1e-5, 50.0)   # CARA
    hac_lags: Optional[int] = None

@dataclass
class EstimationSpec:
    """
    Complete specification for a GMM estimation.
    
    This class encapsulates all aspects of what to estimate:
    - Model choice (utility function)
    - Data selection (consumption, returns)
    - Instruments
    - Sample period
    - Parameter restrictions
    - Estimation method
    """
    
    # Identification
    name: str
    
    # Model specification
    utility: str  # 'CRRA' or 'CARA'
    
    # Data selection
    consumption_var: str  # Column name for consumption
    returns: List[str]    # List of return column names
    
    # Instruments
    instruments: List[str]  # List of instrument column names
    
    # Sample period
    sample_start: Optional[str] = None  # YYYY-MM-DD or None for full sample
    sample_end: Optional[str] = None
    
    # Parameter restrictions (e.g., {'beta': 0.96} to fix beta)
    restrictions: Optional[Dict[str, float]] = None
    
    # Estimation options
    gmm_method: str = 'one-step'  # 'one-step', 'two-step', or 'iterated'
    max_iter: int = 10            # Maximum iterations for iterated GMM
    iter_tol: float = 1e-6        # Convergence tolerance for iterated GMM
    hac_lags: Optional[int] = None  # None for automatic selection
    
    # Optimization
    initial_values: Optional[Dict[str, float]] = None
    bounds: Optional[Dict[str, Tuple[float, float]]] = None
    
    def __post_init__(self):
        """Set default bounds if not provided."""
        if self.bounds is None:
            if self.utility.upper() == 'CRRA':
                self.bounds = {'alpha': (-8.0, 4.0), 'beta': (0.01, 0.9999)}
            else:  # CARA
                self.bounds = {'gamma': (1e-5, 50.0), 'beta': (0.01, 0.9999)}
        
        # Set default initial values if not provided
        if self.initial_values is None:
            if self.utility.upper() == 'CRRA':
                self.initial_values = {'alpha': -1.0, 'beta': 0.95}
            else:  # CARA
                self.initial_values = {'gamma': 1.0, 'beta': 0.95}
        
        # Initialize restrictions dict if None
        if self.restrictions is None:
            self.restrictions = {}
    
    def get_param_names(self) -> List[str]:
        """Get list of parameter names for this specification."""
        if self.utility.upper() == 'CRRA':
            base = ['alpha', 'beta']
        else:
            base = ['gamma', 'beta']
        
        # Remove restricted parameters
        return [p for p in base if p not in self.restrictions]
    
    def get_free_params(self) -> List[str]:
        """Get list of free (non-restricted) parameters."""
        return self.get_param_names()
    
    def specification_string(self) -> str:
        """Generate human-readable specification description."""
        lines = [
            f"Specification: {self.name}",
            f"Utility: {self.utility}",
            f"Consumption: {self.consumption_var}",
            f"Returns: {', '.join(self.returns)}",
            f"Instruments: {len(self.instruments)} ({', '.join(self.instruments[:3])}{'...' if len(self.instruments) > 3 else ''})",
            f"Sample: {self.sample_start or 'start'} to {self.sample_end or 'end'}",
            f"Method: {self.gmm_method}",
        ]
        if self.restrictions:
            lines.append(f"Restrictions: {self.restrictions}")
        return '\n'.join(lines)
    
    def to_dict(self) -> Dict:
        """Convert to dictionary for serialization."""
        return {
            'name': self.name,
            'utility': self.utility,
            'consumption_var': self.consumption_var,
            'returns': self.returns,
            'instruments': self.instruments,
            'sample_start': self.sample_start,
            'sample_end': self.sample_end,
            'restrictions': self.restrictions,
            'gmm_method': self.gmm_method,
            'max_iter': self.max_iter,
            'iter_tol': self.iter_tol,
            'hac_lags': self.hac_lags,
        }

In [None]:
@dataclass
class EstimationResult:
    """
    Results from GMM estimation.
    
    This class contains all estimation results and provides methods
    for displaying and exporting them.
    """
    
    # Original specification
    spec: EstimationSpec
    
    # Parameter estimates
    theta: np.ndarray          # Full parameter vector (including restricted)
    theta_free: np.ndarray     # Free parameters only
    se: np.ndarray             # Standard errors (free parameters)
    t_stats: np.ndarray        # t-statistics (free parameters)
    param_names: List[str]     # All parameter names
    free_param_names: List[str]  # Free parameter names
    
    # Covariance matrices
    V: np.ndarray              # Asymptotic variance of free parameters
    S: np.ndarray              # HAC covariance of moments
    W: np.ndarray              # Weighting matrix
    
    # Diagnostics
    J_stat: float              # J-test statistic
    J_pvalue: float            # J-test p-value
    J_dof: int                 # Degrees of freedom
    
    # Sample information
    n_obs: int                 # Number of observations
    n_moments: int             # Number of moment conditions
    n_params: int              # Number of free parameters
    
    # Convergence
    converged: bool
    
    def get_param_dict(self) -> Dict[str, float]:
        """Get parameter estimates as dictionary."""
        return dict(zip(self.param_names, self.theta))
    
    def get_se_dict(self) -> Dict[str, float]:
        """Get standard errors as dictionary (free params only)."""
        return dict(zip(self.free_param_names, self.se))
    
    def summary(self, decimals: int = 4) -> pd.DataFrame:
        """
        Generate summary table of results.
        
        Returns DataFrame with columns: parameter, estimate, std_error, t_stat
        """
        rows = []
        param_dict = self.get_param_dict()
        
        for i, pname in enumerate(self.free_param_names):
            rows.append({
                'Parameter': pname,
                'Estimate': round(self.theta_free[i], decimals),
                'Std. Error': round(self.se[i], decimals),
                't-stat': round(self.t_stats[i], decimals)
            })
        
        # Add restricted parameters (no SE)
        for pname in self.spec.restrictions:
            rows.append({
                'Parameter': pname,
                'Estimate': round(self.spec.restrictions[pname], decimals),
                'Std. Error': '--',
                't-stat': '--'
            })
        
        df = pd.DataFrame(rows)
        return df
    
    def summary_string(self) -> str:
        """Generate formatted string summary using tabulate."""
        from tabulate import tabulate

        lines = [
            "=" * 70,
            self.spec.specification_string(),
            "=" * 70,
            "\nParameter Estimates:",
        ]
        
        # Create table data
        table_data = []
        for i, pname in enumerate(self.free_param_names):
            table_data.append([
                pname,
                f"{self.theta_free[i]:.4f}",
                f"{self.se[i]:.4f}",
                f"{self.t_stats[i]:.4f}"
            ])
        
        # Add restricted parameters
        for pname in self.spec.restrictions:
            table_data.append([
                pname,
                f"{self.spec.restrictions[pname]:.4f}",
                "(fixed)",
                "--"
            ])
        
        # Format with tabulate
        headers = ["Parameter", "Estimate", "Std. Error", "t-stat"]
        table_str = tabulate(table_data, headers=headers, tablefmt="grid")
        lines.append(table_str)
        
        # Add diagnostics
        lines.extend([
            "",
            f"Observations: {self.n_obs}",
            f"Moments: {self.n_moments}",
            f"Parameters: {self.n_params}",
            f"",
            f"J-statistic: {self.J_stat:.4f}",
            f"J p-value: {self.J_pvalue:.4f}",
            f"Degrees of freedom: {self.J_dof}",
            f"Converged: {self.converged}",
            "=" * 70,
        ])
        
        return '\n'.join(lines)
    
    def to_latex(self, caption: str = None, label: str = None) -> str:
        """
        Export results as LaTeX table.
        
        Parameters:
        -----------
        caption : str, optional
            Table caption
        label : str, optional
            Table label for cross-referencing
        """
        df = self.summary()
        
        # Format for LaTeX
        latex_lines = [
            "\\begin{table}[htbp]",
            "\\centering",
        ]
        
        if caption:
            latex_lines.append(f"\\caption{{{caption}}}")
        if label:
            latex_lines.append(f"\\label{{{label}}}")
        
        # Convert DataFrame to latex
        latex_table = df.to_latex(
            index=False,
            float_format="%.4f",
            na_rep="--",
            column_format="lrrr"
        )
        
        latex_lines.append(latex_table)
        
        # Add notes
        latex_lines.extend([
            f"\\multicolumn{{4}}{{l}}{{\\textit{{Notes:}} $N={self.n_obs}$, ",
            f"$J={self.J_stat:.2f}$ (p={self.J_pvalue:.3f})}}\\\\",
            "\\end{table}",
        ])
        
        return '\n'.join(latex_lines)
    
    def __str__(self) -> str:
        """String representation."""
        return self.summary_string()
    
    def __repr__(self) -> str:
        """REPL representation."""
        return f"EstimationResult('{self.spec.name}', n_obs={self.n_obs}, J={self.J_stat:.2f})"

In [None]:
@dataclass
class SpecificationRunner:
    """
    Run multiple specifications and organize results.

    Streamlined version that returns DataFrames for flexible display and export.
    """

    results: List = None

    def __post_init__(self):
        if self.results is None:
            self.results = []

    def run_specs(self, specifications: List[EstimationSpec], data: pd.DataFrame,
                  methods: List[str] = None, verbose: bool = True):
        """
        Run list of specifications with multiple GMM methods.

        Parameters:
        -----------
        specifications : List[EstimationSpec]
            List of manually defined specifications
        data : pd.DataFrame
            Data for estimation
        methods : List[str], optional
            GMM methods to run. Default: ['one-step', 'two-step', 'iterated']
        verbose : bool
            Show progress

        Returns:
        --------
        self (for method chaining)
        """
        import copy

        if methods is None:
            methods = ['one-step', 'two-step', 'iterated']

        total = len(specifications) * len(methods)
        count = 0

        if verbose:
            print(f"Running {len(specifications)} specifications × {len(methods)} methods = {total} estimations\n")

        for spec in specifications:
            for method in methods:
                count += 1

                # Create copy with specified method
                spec_copy = copy.deepcopy(spec)
                spec_copy.gmm_method = method

                if verbose:
                    print(f"[{count}/{total}] {spec.name} ({method})...", end=' ')

                try:
                    result = estimate_gmm(spec_copy, data)
                    self.results.append(result)
                    if verbose:
                        print("✓")
                except Exception as e:
                    if verbose:
                        print(f"✗ Error: {e}")

        if verbose:
            print(f"\n✓ Completed {count} estimations")

        return self

    def _extract_nlag(self, result) -> int:
        """Extract NLAG from instruments (count consumption lags)."""
        cons_var = result.spec.consumption_var
        nlag = sum(1 for inst in result.spec.instruments
                   if f'{cons_var}_ratio_l' in inst)
        return nlag if nlag > 0 else 1

    def _format_returns(self, returns: List[str]) -> str:
        """Format return variable names."""
        return_map = {
            'ret_ew_gross': 'EW',
            'ret_vw_gross': 'VW',
            'rf_gross': 'RF'
        }
        formatted = [return_map.get(r, r) for r in returns]
        return '+'.join(formatted)

    def _format_consumption(self, cons_var: str) -> str:
        """Format consumption variable name."""
        cons_map = {
            'cons_nds_pc': 'NDS',
            'cons_nds_svc_pc': 'NDS+SVC'
        }
        return cons_map.get(cons_var, cons_var)

    def to_dataframe(self,
                     by_method: bool = True,
                     include_se: bool = True,
                     diagnostics: List[str] = None,
                     decimals: int = 4) -> pd.DataFrame:
        """
        Convert results to DataFrame (replaces display_table).

        Returns one row per model with columns for specification and estimates.

        Parameters:
        -----------
        by_method : bool, default True
            If True, include 'Method' column
            If False, assumes filtering by method happens externally
        include_se : bool, default True
            Include standard error columns
        diagnostics : List[str], optional
            Additional columns to include:
            - 'N': Number of observations
            - 'Moments': Number of moment conditions
            - 'J_stat': J-statistic (χ²)
            - 'J_pval': J-test p-value
            - 'DOF': Degrees of freedom
            - 'Converged': Convergence status
        decimals : int
            Decimal places for rounding

        Returns:
        --------
        pd.DataFrame with one row per model

        Example:
        --------
        >>> df = runner.to_dataframe(diagnostics=['J_stat', 'J_pval', 'DOF'])
        >>> print(df)
        >>> df.to_excel('results.xlsx')
        """
        if not self.results:
            return pd.DataFrame()

        if diagnostics is None:
            diagnostics = []

        # Build rows
        data_rows = []
        for result in self.results:
            # Determine parameter names based on utility
            if result.spec.utility == 'CARA':
                param1_name = 'gamma'
            else:  # CRRA
                param1_name = 'alpha'

            theta = result.theta
            se = result.se

            # Base columns
            row = {
                'Consumption': self._format_consumption(result.spec.consumption_var),
                'Returns': self._format_returns(result.spec.returns),
                'NLAG': self._extract_nlag(result),
                param1_name: round(theta[0], decimals),
                'beta': round(theta[1], decimals)
            }

            # Add method column if requested
            if by_method:
                row = {'Method': result.spec.gmm_method, **row}

            # Add standard errors
            if include_se:
                row[f'se_{param1_name}'] = round(se[0], decimals)
                row['se_beta'] = round(se[1], decimals)

            # Add optional diagnostics
            if 'N' in diagnostics:
                row['N'] = result.n_obs
            if 'Moments' in diagnostics:
                row['Moments'] = result.n_moments
            if 'J_stat' in diagnostics:
                row['J_stat'] = round(result.J_stat, decimals)
            if 'J_pval' in diagnostics:
                row['J_pval'] = round(result.J_pvalue, decimals)
            if 'DOF' in diagnostics:
                row['DOF'] = result.J_dof
            if 'Converged' in diagnostics:
                row['Converged'] = result.converged

            data_rows.append(row)

        return pd.DataFrame(data_rows)

    def to_dataframe_by_method(self, method: str, **kwargs) -> pd.DataFrame:
        """
        Get DataFrame for specific GMM method.

        Parameters:
        -----------
        method : str
            GMM method: 'one-step', 'two-step', or 'iterated'
        **kwargs : passed to to_dataframe()

        Returns:
        --------
        pd.DataFrame filtered to specified method
        """
        df = self.to_dataframe(by_method=True, **kwargs)
        return df[df['Method'] == method].drop(columns=['Method']).reset_index(drop=True)

    def to_excel(self, filepath: str,
                 include_se: bool = True,
                 diagnostics: List[str] = None,
                 decimals: int = 4):
        """
        Export results to Excel with separate sheets per GMM method.

        Parameters:
        -----------
        filepath : str
            Output Excel file path
        include_se : bool
            Include standard error columns
        diagnostics : List[str]
            Optional diagnostic columns (see to_dataframe)
        decimals : int
            Decimal places
        """
        if not self.results:
            print("No results to export")
            return

        with pd.ExcelWriter(filepath, engine='openpyxl') as writer:
            # Sheet per method
            for method in ['one-step', 'two-step', 'iterated']:
                df = self.to_dataframe_by_method(
                    method,
                    include_se=include_se,
                    diagnostics=diagnostics,
                    decimals=decimals
                )
                if len(df) > 0:
                    df.to_excel(writer, sheet_name=method, index=False)

            # Combined sheet
            df_all = self.to_dataframe(
                by_method=True,
                include_se=include_se,
                diagnostics=diagnostics,
                decimals=decimals
            )
            df_all.to_excel(writer, sheet_name='All_Methods', index=False)

        print(f"✓ Results exported to {filepath}")

    def display(self, by_method: bool = True, max_rows: int = None, **kwargs):
        """
        Display results as formatted DataFrame.

        Parameters:
        -----------
        by_method : bool
            If True, show separate tables per GMM method
            If False, show single combined table
        max_rows : int, optional
            Maximum rows to display (None = all)
        **kwargs : passed to to_dataframe()
        """
        if by_method:
            for method in ['one-step', 'two-step', 'iterated']:
                df = self.to_dataframe_by_method(method, **kwargs)
                if len(df) > 0:
                    print("\n" + "="*80)
                    print(f"{method.upper()} GMM".center(80))
                    print("="*80)
                    if max_rows:
                        print(df.head(max_rows).to_string(index=False))
                    else:
                        print(df.to_string(index=False))
        else:
            df = self.to_dataframe(by_method=True, **kwargs)
            print("\n" + "="*80)
            print("ALL METHODS".center(80))
            print("="*80)
            if max_rows:
                print(df.head(max_rows).to_string(index=False))
            else:
                print(df.to_string(index=False))

    def summary_stats(self):
        """Print summary statistics."""
        print("\n" + "="*80)
        print("SUMMARY STATISTICS")
        print("="*80)

        total = len(self.results)
        if total == 0:
            # Nothing to summarize; this also avoids dividing by zero below
            print("\nNo results to summarize")
            return

        rejected_05 = sum(1 for r in self.results if r.J_pvalue < 0.05)
        rejected_10 = sum(1 for r in self.results if r.J_pvalue < 0.10)

        print(f"\nTotal estimations: {total}")
        print(f"Rejected at 5% (J-test): {rejected_05}/{total} ({100*rejected_05/total:.1f}%)")
        print(f"Rejected at 10% (J-test): {rejected_10}/{total} ({100*rejected_10/total:.1f}%)")
        print(f"Not rejected at 5%: {total-rejected_05}/{total} ({100*(total-rejected_05)/total:.1f}%)")

        # By method
        print("\nBy GMM Method:")
        for method in ['one-step', 'two-step', 'iterated']:
            method_results = [r for r in self.results if r.spec.gmm_method == method]
            if method_results:
                n = len(method_results)
                rej = sum(1 for r in method_results if r.J_pvalue < 0.05)
                print(f"  {method}: {rej}/{n} rejected ({100*rej/n:.1f}%)")

    def __len__(self):
        return len(self.results)

    def __repr__(self):
        return f"SpecificationRunner({len(self.results)} results)"


### 3.6 Main estimation function

The `estimate_gmm()` function is the unified interface for running GMM estimation with any specification.
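
As a rough illustration of how the moment conditions behind `estimate_gmm()` are stacked, the sketch below builds a toy moment matrix with the same `einsum` pattern that `build_moment_matrix` uses and checks that each row equals the Kronecker product $x_t \otimes u_t$ of instruments and pricing errors. The array sizes and random inputs are made-up assumptions, not the tutorial data.

```python
import numpy as np

rng = np.random.default_rng(0)
T, L_x, K = 5, 3, 2                  # toy sizes: periods, instruments, assets
x = rng.normal(size=(T, L_x))        # stand-in for the instruments x_t
resid = rng.normal(size=(T, K))      # stand-in for pricing errors M_t * R_t - 1

# Same stacking pattern as build_moment_matrix: outer product per period, then flatten
G = np.einsum('ti,tj->tij', x, resid).reshape(T, L_x * K)

# Row t should equal the Kronecker product of x_t and resid_t
G_kron = np.vstack([np.kron(x[t], resid[t]) for t in range(T)])
assert np.allclose(G, G_kron)

# Sample moment average and the quadratic-form criterion with identity weighting
g_bar = G.mean(axis=0)
W = np.eye(L_x * K)
Q = float(g_bar @ W @ g_bar)
print(G.shape, Q >= 0.0)
```

With this layout, column `i * K + j` of `G` pairs instrument `i` with asset `j`, so the number of moments is `L_x * K`.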

In [None]:
def build_moment_matrix(theta: np.ndarray, cfg: GMMConfig, df: pd.DataFrame) -> np.ndarray:
    """
    Construct moment time series G_t for Hansen-Singleton model.

    Parameters:
    -----------
    theta : np.ndarray
        Parameter vector (alpha/gamma, beta)
    cfg : GMMConfig
        Model configuration
    df : pd.DataFrame
        Data with returns, consumption, and instruments

    Returns:
    --------
    G : np.ndarray, shape (T, L)
        Moment matrix where L = (#instruments) * (#assets)
        Each row is g_t = vec(x_t * (M_t*R_t - 1)')
    """
    x = df[cfg.instruments].to_numpy()     # T x L_x
    R = df[cfg.returns_cols].to_numpy()    # T x K (assets)
    T, L_x = x.shape
    K = R.shape[1]

    # Compute stochastic discount factor
    if cfg.util.upper() == 'CRRA':
        alpha, beta = float(theta[0]), float(theta[1])
        cr = df[cfg.cons_ratio_col].to_numpy()  # T vector
        M = sdf_CRRA(cr.reshape(-1,1), alpha, beta)  # T x 1
    else:
        gamma, beta = float(theta[0]), float(theta[1])
        c_next = df[cfg.cons_f1_col].to_numpy().reshape(-1,1)
        c_now = df[cfg.cons_col].to_numpy().reshape(-1,1)
        M = sdf_CARA(c_next, c_now, gamma, beta)  # T x 1

    # Pricing errors: M * R - 1
    resid = R * M - 1.0  # T x K

    # Moments: x_t ⊗ resid_t, then vectorize
    G = np.einsum('ti,tj->tij', x, resid)  # T x L_x x K
    G = G.reshape(T, L_x * K)
    return G

def sdf_CRRA(cons_ratio, alpha, beta):
    """Stochastic discount factor for CRRA utility."""
    return beta * (cons_ratio ** alpha)

def sdf_CARA(c_next, c_now, gamma, beta):
    """Stochastic discount factor for CARA utility."""
    return beta * np.exp(-gamma * (c_next - c_now))

def gbar(theta, cfg, df):
    """Sample moment average."""
    G = build_moment_matrix(theta, cfg, df)
    return G.mean(axis=0)

In [None]:
def estimate_gmm(spec: EstimationSpec, data: pd.DataFrame) -> EstimationResult:
    """
    Run GMM estimation based on specification.
    
    Supports one-step, two-step, and iterated GMM.
    """
    from scipy.stats import chi2
    
    # Step 1: Filter data by sample period
    df = data.copy()
    if spec.sample_start:
        df = df[df['t_idx'] >= spec.sample_start]
    if spec.sample_end:
        df = df[df['t_idx'] <= spec.sample_end]
    
    # Verify all required columns exist
    required_cols = [spec.consumption_var] + spec.returns + spec.instruments
    missing_cols = [c for c in required_cols if c not in df.columns]
    if missing_cols:
        raise ValueError(f"Missing columns in data: {missing_cols}")
    
    df = df.dropna(subset=required_cols).reset_index(drop=True)
    n_obs = len(df)
    
    if n_obs == 0:
        raise ValueError(f"No observations left after filtering for spec '{spec.name}'")
    
    # Step 2: Set up parameter names and restrictions
    if spec.utility.upper() == 'CRRA':
        all_param_names = ['alpha', 'beta']
    else:
        all_param_names = ['gamma', 'beta']
    
    free_param_names = [p for p in all_param_names if p not in spec.restrictions]
    n_params = len(free_param_names)
    
    # Step 3: Build configuration for GMM functions
    cons_ratio_col = spec.consumption_var + '_ratio'
    cons_f1_col = spec.consumption_var + '_f1'
    
    if cons_ratio_col not in df.columns:
        if cons_f1_col not in df.columns:
            df[cons_f1_col] = df[spec.consumption_var].shift(-1)
        df[cons_ratio_col] = df[cons_f1_col] / df[spec.consumption_var]
    
    cfg = GMMConfig(
        util=spec.utility,
        returns_cols=spec.returns,
        instruments=spec.instruments,
        cons_col=spec.consumption_var,
        cons_ratio_col=cons_ratio_col,
        cons_f1_col=cons_f1_col,
        hac_lags=spec.hac_lags
    )
    
    # Step 4: Prepare initial values and bounds
    theta0_dict = spec.initial_values.copy()
    theta0_free = np.array([theta0_dict[p] for p in free_param_names])
    bounds_free = [spec.bounds[p] for p in free_param_names]
    
    # Step 5: Create wrapper functions that handle restrictions
    def expand_theta(theta_free):
        theta_full = np.zeros(len(all_param_names))
        free_idx = 0
        for i, pname in enumerate(all_param_names):
            if pname in spec.restrictions:
                theta_full[i] = spec.restrictions[pname]
            else:
                theta_full[i] = theta_free[free_idx]
                free_idx += 1
        return theta_full
    
    def moment_fn_restricted(theta_free):
        theta_full = expand_theta(theta_free)
        return gbar(theta_full, cfg, df)

    def build_G_restricted(theta_free):
        theta_full = expand_theta(theta_free)
        return build_moment_matrix(theta_full, cfg, df)
    
    # Step 6: Run GMM estimation
    L = len(spec.instruments) * len(spec.returns)
    
    if spec.gmm_method == 'one-step':
        # One-step GMM: single optimization with identity weighting
        W = np.eye(L)
        res = opt.minimize(
            lambda th: gmm_criterion(th, moment_fn_restricted, W),
            theta0_free, method='L-BFGS-B', bounds=bounds_free
        )
        theta_free_hat = res.x
        converged = res.success
        G = build_G_restricted(theta_free_hat)
        S = hac_covariance(G, spec.hac_lags)
        D = numerical_jacobian(moment_fn_restricted, theta_free_hat)
        M = D.T @ W @ D
        Minv = la.pinv(M)
        V = Minv @ (D.T @ W @ S @ W @ D) @ Minv

    elif spec.gmm_method == 'two-step':
        # ============================================================
        # TODO: IMPLEMENT TWO-STEP GMM
        # ============================================================
        # INSTRUCTIONS:
        # 1. First step: Optimize with identity weighting matrix W0 = I
        # 2. Compute HAC covariance S using first-step estimates
        # 3. Compute optimal weighting W = S^(-1)
        # 4. Second step: Re-optimize with optimal weighting W
        # 5. Compute asymptotic variance using final estimates
        #
        # Hint: Follow the same structure as one-step GMM above
        # Hint: Use res1 for first step, res2 for second step
        assert False, "TODO: Implement two-step GMM. Remove this assertion when implemented."

    elif spec.gmm_method == 'iterated':
        # ============================================================
        # TODO: IMPLEMENT ITERATED GMM
        # ============================================================
        # INSTRUCTIONS:
        # 1. Start with identity weighting matrix W = I
        # 2. Iterate until convergence (max spec.max_iter iterations):
        #    a. Optimize with current W
        #    b. Compute HAC covariance S with new estimates
        #    c. Update W = S^(-1)
        #    d. Check convergence: ||theta_new - theta_old|| < spec.tol
        # 3. Compute final asymptotic variance
        #
        # Hint: Use a for loop with early break on convergence
        # Hint: Track theta_curr and theta_new for convergence check
        assert False, "TODO: Implement iterated GMM. Remove this assertion when implemented."

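        # NOTE: the `else` clause below belongs to the `for` loop you write above
        # (`for ... else` runs only when the loop ends without `break`). It is
        # commented out so this cell parses before the loop exists; restore it
        # directly after your loop.
        # else:
        #     # Max iterations reached
        #     theta_free_hat = theta_curr
        #     converged = False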
        
        # Final variance
        D = numerical_jacobian(moment_fn_restricted, theta_free_hat)
        M = D.T @ W @ D
        Minv = la.pinv(M)
        V = Minv @ (D.T @ W @ S @ W @ D) @ Minv

    else:
        raise ValueError(f"Unknown GMM method: {spec.gmm_method}")

    # Step 7: Compute diagnostics
    theta_full_hat = expand_theta(theta_free_hat)
    g_hat = moment_fn_restricted(theta_free_hat)
    J = n_obs * float(g_hat.T @ W @ g_hat)
    J_dof = L - n_params
    J_pvalue = 1 - chi2.cdf(J, J_dof) if J_dof > 0 else np.nan
    
    # Standard errors and t-statistics
    se = np.sqrt(np.diag(V) / n_obs)
    t_stats = theta_free_hat / se
    
    # Step 8: Package results
    result = EstimationResult(
        spec=spec,
        theta=theta_full_hat,
        theta_free=theta_free_hat,
        se=se,
        t_stats=t_stats,
        param_names=all_param_names,
        free_param_names=free_param_names,
        V=V, S=S, W=W,
        J_stat=J,
        J_pvalue=J_pvalue,
        J_dof=J_dof,
        n_obs=n_obs,
        n_moments=L,
        n_params=n_params,
        converged=converged
    )
    
    return result

## 4. Application Examples

Now we demonstrate how to use the unified configuration system to answer the assignment questions.

### 4.1 Example: Hansen-Singleton replication (Table 1)
<img src="table1.png">

In [1]:
# ============================================================================
# TABLE I EXAMPLE: EW returns only
# ============================================================================

print("="*80)
print("EXAMPLE: TABLE I REPLICATION")
print("="*80)

# Define the Table I specifications manually
panel_specs = [

    # Specification 1: NDS, NLAG=1
    EstimationSpec(
        name='Table1-NDS-NLAG1',
        utility='CRRA',
        consumption_var='cons_nds_pc',
        returns=['ret_ew_gross'],
        instruments=[
            'const',
            'cons_nds_pc_ratio_l1',   # 1 lag of consumption growth
            'ret_ew_gross_l1',         # 1 lag of EW return
        ],  # Total: 3 instruments × 1 return = 3 moments, DOF = 3-2 = 1
        sample_start='1959-04-01',
        sample_end='1978-12-31'
    ),

    # Specification 2: NDS, NLAG=2
    EstimationSpec(
        name='Table1-NDS-NLAG2',
        utility='CRRA',
        consumption_var='cons_nds_pc',
        returns=['ret_ew_gross'],
        instruments=[
            'const',
            'cons_nds_pc_ratio_l1',   # 1 lag of consumption growth
            'cons_nds_pc_ratio_l2',   # 2 lag of consumption growth
            'ret_ew_gross_l1',         # 1 lag of EW return
            'ret_ew_gross_l2',         # 2 lag of EW return
        ],  # Total: 5 instruments × 1 return = 5 moments, DOF = 5-2 = 3
        sample_start='1959-04-01',
        sample_end='1978-12-31'
    ),

    # Specification 3: NDS, NLAG=4
    EstimationSpec(
        name='Table1-NDS-NLAG4',
        utility='CRRA',
        consumption_var='cons_nds_pc',
        returns=['ret_ew_gross'],
        instruments=[
            'const',
            'cons_nds_pc_ratio_l1',   # 1 lag of consumption growth
            'cons_nds_pc_ratio_l2',   # 2 lag of consumption growth
            'cons_nds_pc_ratio_l3',   # 3 lag of consumption growth
            'cons_nds_pc_ratio_l4',   # 4 lag of consumption growth
            'ret_ew_gross_l1',         # 1 lag of EW return
            'ret_ew_gross_l2',         # 2 lag of EW return
            'ret_ew_gross_l3',         # 3 lag of EW return
            'ret_ew_gross_l4',         # 4 lag of EW return
        ],  # Total: 9 instruments × 1 return = 9 moments, DOF = 9-2 = 7
        sample_start='1959-04-01',
        sample_end='1978-12-31'
    ),

    # Specification 4: NDS, NLAG=6
    EstimationSpec(
        name='Table1-NDS-NLAG6',
        utility='CRRA',
        consumption_var='cons_nds_pc',
        returns=['ret_ew_gross'],
        instruments=[
            'const',
            'cons_nds_pc_ratio_l1',   # 1 lag of consumption growth
            'cons_nds_pc_ratio_l2',   # 2 lag of consumption growth
            'cons_nds_pc_ratio_l3',   # 3 lag of consumption growth
            'cons_nds_pc_ratio_l4',   # 4 lag of consumption growth
            'cons_nds_pc_ratio_l5',   # 5 lag of consumption growth
            'cons_nds_pc_ratio_l6',   # 6 lag of consumption growth
            'ret_ew_gross_l1',         # 1 lag of EW return
            'ret_ew_gross_l2',         # 2 lag of EW return
            'ret_ew_gross_l3',         # 3 lag of EW return
            'ret_ew_gross_l4',         # 4 lag of EW return
            'ret_ew_gross_l5',         # 5 lag of EW return
            'ret_ew_gross_l6',         # 6 lag of EW return
        ],  # Total: 13 instruments × 1 return = 13 moments, DOF = 13-2 = 11
        sample_start='1959-04-01',
        sample_end='1978-12-31'
    ),
]

print(f"Defined {len(panel_specs)} specifications")

# Run all Panel specifications with all GMM methods
runner_panel = SpecificationRunner()

# Set verbose = True for detailed information
runner_panel.run_specs(panel_specs, data, verbose=False, methods=['one-step', 'two-step', 'iterated'])

# Collect all results in one DataFrame (by_method=False omits the 'Method' column; diagnostics are optional)
df = runner_panel.to_dataframe(
    by_method=False,
    decimals=4,
    diagnostics=['N', 'Moments', 'J_stat', 'J_pval', 'DOF', 'Converged']
)
print(df)
# df.to_excel('example_results.xlsx', index=False)

# Optional: Export table format to Excel
# runner_panel.to_excel('example_table_1.xlsx', decimals=4)


### 4.2 Question 2: Hansen-Singleton replication (Table 3)

#### Panel A
<img src="table3A.png">

There is one more complication here: we now have two sets of orthogonality conditions, one per return. The first row of Table III uses the equally- and value-weighted returns, with one lag each of the consumption ratio, the equally-weighted return, and the value-weighted return as instruments (plus a constant). This gives 8 population moment conditions in 2 parameters ($\alpha,\beta$).

$$
\begin{align*}
\mathbb{E}\left[\begin{pmatrix}
1\\
\frac{c_t}{c_{t-1}}\\
r_{t}^*\\
r_{t}
\end{pmatrix} \cdot \left(r_{t+1}^*\beta\left(\frac{c_{t+1}^*}{c_t^*}\right)^{\alpha}-1\right)\right] &= 0 \\
\mathbb{E}\left[\begin{pmatrix}
1\\
\frac{c_t}{c_{t-1}}\\
r_{t}^*\\
r_{t}
\end{pmatrix} \cdot \left(r_{t+1}\beta\left(\frac{c_{t+1}^*}{c_t^*}\right)^{\alpha}-1\right)\right] &= 0
\end{align*}
$$

The degrees of freedom (DF) equal the number of moment conditions minus the number of parameters: 2*(1+(NLAG of consumption ratio)+(NLAG of EWR)+(NLAG of VWR))-2
- NLAG=1: $2\times(1+1+1+1)-2=6$
- NLAG=2: $2\times(1+2+2+2)-2=12$
- NLAG=4: $2\times(1+4+4+4)-2=24$
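
The counting rule above can be checked with a few lines of Python. This is a minimal sketch; the helper name `panel_a_dof` is hypothetical and not part of the tutorial code. It mirrors how `estimate_gmm()` computes `L = len(instruments) * len(returns)` and `J_dof = L - n_params`.

```python
def panel_a_dof(nlag: int, n_returns: int = 2, n_params: int = 2) -> int:
    """Degrees of freedom of the J-test for Panel A (hypothetical helper)."""
    # Instruments: constant + NLAG lags each of consumption ratio, EWR and VWR
    n_instruments = 1 + 3 * nlag
    # Each instrument is interacted with each return's pricing error
    n_moments = n_returns * n_instruments
    return n_moments - n_params


for nlag in (1, 2, 4):
    print(f"NLAG={nlag}: DOF={panel_a_dof(nlag)}")
# NLAG=1: DOF=6
# NLAG=2: DOF=12
# NLAG=4: DOF=24
```

Comparing these values with the `DOF` column of `to_dataframe(diagnostics=['DOF'])` is a quick way to confirm that an instrument list has the intended length.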

In [None]:
"""
Question 2: Table III Panel A Replication
"""

# ============================================================================
# TODO: DEFINE PANEL A SPECIFICATIONS
# ============================================================================
# INSTRUCTIONS:
# Replicate Table 3 Panel A from Hansen-Singleton (1982).
#
# Panel A uses:
# - Consumption: cons_nds_pc (Nondurables)
# - Returns: ret_ew_gross AND ret_vw_gross (both EW and VW)
# - Utility: CRRA
# - Sample period: '1959-04-01' to '1978-12-31'
# - NLAG values: 1, 2, 4
#
# For each NLAG, you need to define:
# 1. instruments = ['const', consumption lags, return lags]
#    - Consumption lags: cons_nds_pc_ratio_l1, cons_nds_pc_ratio_l2, ..., cons_nds_pc_ratio_lN
#    - Return lags: ret_ew_gross_l1, ret_vw_gross_l1, ret_ew_gross_l2, ret_vw_gross_l2, ..., ret_*_lN
# 2. Create EstimationSpec with these parameters
#
# Example structure for NLAG=1:
# panel_a_specs = [
#     EstimationSpec(
#         name='Table3A-NDS-NLAG1',
#         utility='CRRA',
#         consumption_var='cons_nds_pc',
#         returns=['ret_ew_gross', 'ret_vw_gross'],
#         instruments=['const', 'cons_nds_pc_ratio_l1', 'ret_ew_gross_l1', 'ret_vw_gross_l1'],
#         sample_start='1959-04-01',
#         sample_end='1978-12-31'
#     ),
#     # TODO: Add NLAG=2 specification
#     # TODO: Add NLAG=4 specification
# ]

# TODO: Define panel_a_specs list here
assert False, "TODO: Define panel_a_specs for Question 2 Panel A. Remove this assertion when implemented."

# ============================================================================
# PANEL A: EW + VW Returns
# ============================================================================

print("="*80)
print("Question 2: Panel A (Nondurables, EW+VW)")
print("="*80)

# Run estimations
runner_panel_a = SpecificationRunner()
runner_panel_a.run_specs(panel_a_specs, data_quarterly)

# Display results
df = runner_panel_a.to_dataframe(
    by_method=False,
    decimals=4,
    diagnostics=['N', 'Moments', 'J_stat', 'J_pval', 'DOF', 'Converged']
)
print(df)
# df.to_excel('question2_panel_a.xlsx', index=False)

# Summary statistics
runner_panel_a.summary_stats()


#### Panel B

<img src="table3B.png">

Hansen and Singleton (1984) describe how they used the nominal risk-free return:

$$
\begin{align*}
\mathbb{E}\left[\begin{pmatrix}
1\\
\frac{c_t}{c_{t-1}}\\
r_{t}\\
\frac{R_{t+1}^f}{R_{t}^f}\\
\frac{R_{t}^f}{R_{t-1}^f}
\end{pmatrix} \cdot \left(r_{t+1}\beta\left(\frac{c_{t+1}^*}{c_t^*}\right)^{\alpha}-1\right)\right] &= 0 \\
\mathbb{E}\left[\begin{pmatrix}
1\\
\frac{c_t}{c_{t-1}}\\
r_t \\
\frac{R_{t+1}^f}{R_{t}^f}\\
\frac{R_{t}^f}{R_{t-1}^f}
\end{pmatrix} \cdot \left(R_{t+1}^f \beta\left(\frac{c_{t+1}}{c_t}\right)^{\alpha}-1\right)\right] &= 0
\end{align*}
$$

By their description, the degrees of freedom are 2*[constant + (risk-free ratio + NLAG for risk-free ratio) + (NLAG for consumption ratio) + (NLAG for VWR)]-2
- NLAG=1: $2\times(1+(1+1)+1+1)-2=8$
- NLAG=2: $2\times(1+(1+2)+2+2)-2=14$
- NLAG=4: $2\times(1+(1+4)+4+4)-2=26$

(Credit to Christine Dabbs) However, Hansen and Singleton appear to have double-counted the risk-free ratios. They report df = 8, 16, 32, which can be replicated by counting the degrees of freedom as:

2*[constant + 2*(NLAG for risk-free ratio) + (NLAG for consumption ratio) + (NLAG for VWR)]-2
- NLAG=1: $2\times(1+2\times 1+1+1)-2=8$
- NLAG=2: $2\times(1+2\times 2+2+2)-2=16$
- NLAG=4: $2\times(1+2\times 4+4+4)-2=32$
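
The two counting conventions for Panel B can be compared side by side. This is a sketch only; `panel_b_dof` and its `double_count_rf` flag are hypothetical names introduced for illustration.

```python
def panel_b_dof(nlag: int, double_count_rf: bool = False,
                n_returns: int = 2, n_params: int = 2) -> int:
    """Degrees of freedom for Panel B under either counting rule (hypothetical helper)."""
    # Risk-free ratio terms: (1 + NLAG) as described, or 2 * NLAG if double-counted
    rf_terms = 2 * nlag if double_count_rf else 1 + nlag
    # Constant + risk-free ratio terms + consumption-ratio lags + VWR lags
    n_instruments = 1 + rf_terms + nlag + nlag
    return n_returns * n_instruments - n_params


for nlag in (1, 2, 4):
    print(nlag, panel_b_dof(nlag), panel_b_dof(nlag, double_count_rf=True))
# 1 8 8
# 2 14 16
# 4 26 32
```

The two rules agree at NLAG=1 and diverge from NLAG=2 onward, which is where the reported degrees of freedom differ from the described instrument set.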


In [None]:
# ============================================================================
# TODO: DEFINE PANEL B SPECIFICATIONS
# ============================================================================
# INSTRUCTIONS:
# Replicate Table 3 Panel B from Hansen-Singleton (1982).
#
# Panel B uses:
# - Consumption: cons_nds_pc (Nondurables)
# - Returns: ret_vw_gross AND rf_gross (VW and Risk-free)
# - Utility: CRRA
# - Sample period: '1959-04-01' to '1978-12-31'
# - NLAG values: 1, 2, 4
#
# Follow the same pattern as Panel A, but change:
# - returns to ['ret_vw_gross', 'rf_gross']
# - return lags: ret_vw_gross_l*, rf_gross_l*
#
# TODO: Define panel_b_specs list here
assert False, "TODO: Define panel_b_specs for Question 2 Panel B. Remove this assertion when implemented."

# ============================================================================
# PANEL B: VW + RF Returns
# ============================================================================

print("="*80)
print("Question 2: Panel B (Nondurables, VW+RF)")
print("="*80)

# Run estimations
runner_panel_b = SpecificationRunner()
runner_panel_b.run_specs(panel_b_specs, data_quarterly)

# Display results
df = runner_panel_b.to_dataframe(
    by_method=False,
    decimals=4,
    diagnostics=['N', 'Moments', 'J_stat', 'J_pval', 'DOF', 'Converged']
)
print(df)
# df.to_excel('question2_panel_b.xlsx', index=False)

# Summary statistics
runner_panel_b.summary_stats()


### 4.3 Question 3: Testing the over-identifying restrictions

Here you estimate 4 parameters, $(\alpha_1,\beta_1,\alpha_2,\beta_2)$: $(\alpha_1,\beta_1)$ using only data from 1960.1 to 1978.12, and $(\alpha_2,\beta_2)$ using only data from 1979.1 onward. You can adapt `f_HS_TABLE3_EXAMPLE` for this exercise: use data from 1960.1 to 1978.12 for `m1` and data from 1979.1 to 2020.12 for `m2`, and stack the parameters in a $4\times1$ vector $\vec{\gamma}$. This is your 'unconstrained' GMM.

Now, you are asked to test $(\alpha_1,\beta_1)=(\alpha_2,\beta_2)$. The following theorem will be helpful:

#### Theorem ([Professor Miller's Lecture Notes, slide 17](https://comlabgames.com/structuraleconometrics/3%20Asymptotic%20Theory%20for%20Nonlinear%20Models/13%20Testing%20Parametric%20Models/13%20Testing%20Parametric%20Models.pdf))
Let $J_{un}$ be the unconstrained GMM criterion function with $l_1$ orthogonality conditions and $k_1$ parameters. Let $J_{con}$ be the constrained GMM criterion function with $l_2$ orthogonality conditions and $k_2$ parameters. Then,
$$
J_{con} - J_{un} \overset{d}{\longrightarrow} \chi^2_{(l_2-k_2)-(l_1-k_1)}.
$$

Additionally, you can test models with more parameters over more subperiods, say before/after the 2008 financial crisis and before/after COVID-19. Doing so may require more instruments built from additional lagged variables.
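
The distance test in the theorem can be sketched as follows. The function name `distance_test` and its argument names are hypothetical, and the numbers in the usage lines are placeholders chosen only to exercise the code, not estimates from the data.

```python
from scipy.stats import chi2


def distance_test(J_con, J_un, l_con, k_con, l_un, k_un):
    """Compare constrained and unconstrained GMM criteria (hypothetical helper).

    l_* = number of orthogonality conditions, k_* = number of parameters.
    """
    stat = J_con - J_un
    dof = (l_con - k_con) - (l_un - k_un)
    if dof <= 0:
        raise ValueError("Degrees of freedom must be positive")
    # Upper-tail probability of a chi-squared variable with `dof` degrees of freedom
    p_value = chi2.sf(stat, dof)
    return stat, dof, p_value


# Placeholder inputs only: 16 moments in both models, 4 vs. 2 parameters
stat, dof, p = distance_test(J_con=10.0, J_un=4.0, l_con=16, k_con=2, l_un=16, k_un=4)
print(f"stat={stat:.2f}, dof={dof}, p-value={p:.4f}")
```

In finite samples the difference `J_con - J_un` can turn out negative unless both criteria are evaluated with the same weighting matrix, so it is worth checking the sign before interpreting the p-value.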

In [None]:
"""
Question 3: Parameter Stability Test Across Time Periods
"""

# ============================================================================
# TODO: IMPLEMENT PARAMETER STABILITY TEST
# ============================================================================
# OBJECTIVE:
# Test if parameters are stable across time by comparing:
# - CONSTRAINED model: One (α, β) for all data → 2 parameters
# - UNCONSTRAINED model: Different (α₁, β₁), (α₂, β₂) per period → 4 parameters
#
# If unconstrained fits significantly better → parameters NOT stable
# If constrained fits just as well → parameters stable
#
# ============================================================================
# PSEUDOCODE STRUCTURE:
# ============================================================================
#
# PART A: CONSTRAINED MODEL (2 parameters)
# ============================================================================
# Estimate ONE set of parameters (α, β) using ALL data from both periods
#
# STEP A1: Create specification for full sample
# --------------------------------
# constrained_spec = EstimationSpec(
#     name='Constrained_FullSample',
#     utility='CRRA',
#     consumption_var='cons_nds_pc',
#     returns=['ret_ew_gross', 'ret_vw_gross'],
#     instruments=[...],  # NLAG=1 instruments
#     gmm_method='two-step',
#     sample_start='1960-01-01',  # Start of Period 1
#     sample_end='2020-12-31'      # End of Period 2
# )
#
# STEP A2: Run constrained estimation
# --------------------------------
# runner_constrained = SpecificationRunner()
# runner_constrained.run_specs([constrained_spec], data_quarterly)
# result_constrained = runner_constrained.results[0]
#
# # Extract constrained estimates
# theta_constrained = result_constrained.theta  # [α, β] - same for both periods
# J_constrained = result_constrained.J_stat     # GMM objective value
# n_constrained = result_constrained.n_obs      # Total sample size
#
# print("CONSTRAINED MODEL (2 parameters):")
# print(f"α = {theta_constrained[0]:.4f}")
# print(f"β = {theta_constrained[1]:.4f}")
# print(f"J-statistic = {J_constrained:.4f}")
# print(f"N = {n_constrained}")
#
#
# PART B: UNCONSTRAINED MODEL (4 parameters)
# ============================================================================
# Estimate SEPARATE parameters (α₁, β₁) and (α₂, β₂) for each period
#
# STEP B1: Estimate for Period 1
# --------------------------------
# import copy
#
# spec_period1 = copy.deepcopy(constrained_spec)
# spec_period1.name = 'Unconstrained_Period1'
# spec_period1.sample_start = '1960-01-01'
# spec_period1.sample_end = '1978-12-31'
#
# runner_period1 = SpecificationRunner()
# runner_period1.run_specs([spec_period1], data_quarterly)
# result_period1 = runner_period1.results[0]
#
# theta1 = result_period1.theta  # [α₁, β₁]
# V1 = result_period1.V          # Covariance matrix
# n1 = result_period1.n_obs
#
# STEP B2: Estimate for Period 2
# --------------------------------
# spec_period2 = copy.deepcopy(constrained_spec)
# spec_period2.name = 'Unconstrained_Period2'
# spec_period2.sample_start = '1979-01-01'
# spec_period2.sample_end = '2020-12-31'
#
# runner_period2 = SpecificationRunner()
# runner_period2.run_specs([spec_period2], data_quarterly)
# result_period2 = runner_period2.results[0]
#
# theta2 = result_period2.theta  # [α₂, β₂]
# V2 = result_period2.V
# n2 = result_period2.n_obs
#
# print("\nUNCONSTRAINED MODEL (4 parameters):")
# print(f"Period 1 (1960-1978): α₁ = {theta1[0]:.4f}, β₁ = {theta1[1]:.4f}, N = {n1}")
# print(f"Period 2 (1979-2020): α₂ = {theta2[0]:.4f}, β₂ = {theta2[1]:.4f}, N = {n2}")
#
#
# PART C: WALD TEST
# ============================================================================
# Test H0: (α₁, β₁) = (α₂, β₂)
# Equivalently: Is unconstrained model significantly better than constrained?
#
# STEP C1: Method 1 - Test parameter differences
# --------------------------------
# # Test if θ₁ = θ₂
# # W = (θ₁ - θ₂)' × Var(θ₁ - θ₂)^(-1) × (θ₁ - θ₂)
# # where Var(θ₁ - θ₂) = V₁/n₁ + V₂/n₂
# # (this scaling assumes result.V is the asymptotic covariance of
# #  sqrt(n) * (theta_hat - theta); if V is already divided by n, use V₁ + V₂.
# #  It also treats the two subsample estimates as independent.)
#
# import numpy as np
# from scipy.stats import chi2
# import numpy.linalg as la
#
# theta_diff = theta1 - theta2
# V_diff = V1/n1 + V2/n2
#
# try:
#     V_diff_inv = la.inv(V_diff)
# except la.LinAlgError:
#     # Fall back to the pseudo-inverse if V_diff is singular
#     V_diff_inv = la.pinv(V_diff)
#
# wald_stat = theta_diff @ V_diff_inv @ theta_diff
# dof = 2  # Testing 2 restrictions: α₁=α₂ and β₁=β₂
#
# # Alternative: Method 2 - Using J-statistics (if available)
# # W = n_constrained * J_constrained - (n1 * J1 + n2 * J2)
# # This works if you have the GMM objective values from each estimation
#
# STEP C2: Compute p-value
# --------------------------------
# pvalue = chi2.sf(wald_stat, dof)  # upper tail; equals 1 - chi2.cdf(...) but is more accurate for large statistics
#
# STEP C3: Display results
# --------------------------------
# print("\n" + "="*80)
# print("WALD TEST: CONSTRAINED vs UNCONSTRAINED")
# print("="*80)
# print("H0: Parameters are STABLE → (α₁, β₁) = (α₂, β₂)")
# print("Ha: Parameters are NOT STABLE → (α₁, β₁) ≠ (α₂, β₂)")
# print()
# print(f"Constrained model:   α = {theta_constrained[0]:.4f}, β = {theta_constrained[1]:.4f}")
# print(f"Period 1:           α₁ = {theta1[0]:.4f}, β₁ = {theta1[1]:.4f}")
# print(f"Period 2:           α₂ = {theta2[0]:.4f}, β₂ = {theta2[1]:.4f}")
# print()
# print(f"Wald statistic: {wald_stat:.4f}")
# print(f"Degrees of freedom: {dof}")
# print(f"P-value: {pvalue:.4f}")
# print()
# if pvalue < 0.05:
#     print("✓ REJECT H0 at 5% level")
#     print("  → Unconstrained model fits significantly better")
#     print("  → Parameters are NOT stable across time")
#     print("  → The consumption-returns relationship has changed")
# else:
#     print("✗ FAIL TO REJECT H0 at 5% level")
#     print("  → Constrained model fits just as well")
#     print("  → Parameters appear stable across time")
#     print("  → Can use one model for both periods")
# print("="*80)
#
#
# ============================================================================
# SUMMARY OF WHAT TO COMPARE:
# ============================================================================
#
# Model                  Parameters        Uses Data From
# --------------------   --------------    ------------------
# CONSTRAINED            (α, β)            Both periods combined
#                        2 params          → Assumes stability
#
# UNCONSTRAINED          (α₁, β₁, α₂, β₂)  Each period separate
#                        4 params          → Allows instability
#
# WALD TEST: Is the extra flexibility of 4 params justified?
# - If yes (reject H0) → Parameters changed over time
# - If no (fail to reject) → Parameters stable, use constrained model
#
# ============================================================================
# IMPLEMENTATION TIPS:
# ============================================================================
# 1. Use the SAME instruments for all three estimations
# 2. Use the SAME GMM method (e.g., two-step) for all three
# 3. Make sure period dates don't overlap
# 4. The constrained model uses n₁ + n₂ observations
# 5. Check that n_constrained ≈ n1 + n2 (accounting for lag losses)
#
# ============================================================================

# TODO: Implement Question 3 here following the pseudocode above
# Remove the assertion below when you start implementing
assert False, "TODO: Implement Question 3 parameter stability test. Remove this assertion when implemented."

# ============================================================================
# EXAMPLE OUTPUT (illustrative format only; your numbers will differ):
# ============================================================================
#
# CONSTRAINED MODEL (2 parameters):
# α = -1.045
# β = 0.978
# J-statistic = 12.45
# N = 499
#
# UNCONSTRAINED MODEL (4 parameters):
# Period 1 (1960-1978): α₁ = -1.001, β₁ = 0.981, N = 235
# Period 2 (1979-2020): α₂ = -1.125, β₂ = 0.975, N = 264
#
# ================================================================================
# WALD TEST: CONSTRAINED vs UNCONSTRAINED
# ================================================================================
# H0: Parameters are STABLE → (α₁, β₁) = (α₂, β₂)
# Ha: Parameters are NOT STABLE → (α₁, β₁) ≠ (α₂, β₂)
#
# Constrained model:   α = -1.045, β = 0.978
# Period 1:           α₁ = -1.001, β₁ = 0.981
# Period 2:           α₂ = -1.125, β₂ = 0.975
#
# Wald statistic: 2.34
# Degrees of freedom: 2
# P-value: 0.3102
#
# ✗ FAIL TO REJECT H0 at 5% level
#   → Constrained model fits just as well
#   → Parameters appear stable across time
#   → Can use one model for both periods
# ================================================================================
#
# INTERPRETATION:
# - We compared two models: one assuming stability, one allowing change
# - The Wald test asks: "Is the unconstrained model significantly better?"
# - P-value > 0.05 means: No, the simple constrained model is sufficient
# - Conclusion: Parameters haven't changed significantly over time
# ============================================================================
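
Before running the test on real estimates, it can help to check the Wald formula on numbers that are easy to verify by hand. The sketch below is one such sanity check for the two-parameter case. The function `wald_two_param` and all inputs are made up for illustration; it assumes `V1` and `V2` are asymptotic covariance matrices (so they are divided by the sample sizes) and that the two subsample estimates are independent. In the notebook itself, `numpy.linalg` would normally handle the matrix inverse.

```python
import math


def wald_two_param(theta1, theta2, V1, V2, n1, n2):
    """Wald statistic and p-value for H0: theta1 == theta2 (two parameters).

    V1 and V2 are 2x2 covariance matrices given as nested lists.
    """
    d0 = theta1[0] - theta2[0]
    d1 = theta1[1] - theta2[1]
    # Var(theta1 - theta2) = V1/n1 + V2/n2, element by element
    s00 = V1[0][0] / n1 + V2[0][0] / n2
    s01 = V1[0][1] / n1 + V2[0][1] / n2
    s10 = V1[1][0] / n1 + V2[1][0] / n2
    s11 = V1[1][1] / n1 + V2[1][1] / n2
    det = s00 * s11 - s01 * s10
    if det <= 0:
        raise ValueError("covariance of the difference is not positive definite")
    # Quadratic form diff' * inv(S) * diff, using the explicit 2x2 inverse
    stat = (s11 * d0 * d0 - (s01 + s10) * d0 * d1 + s00 * d1 * d1) / det
    pvalue = math.exp(-stat / 2.0)  # chi-square(2) upper-tail probability
    return stat, pvalue


# Made-up inputs with diagonal covariances, so the statistic is
# simply d0^2 / s00 + d1^2 / s11 = 0.5 + 0.5.
V = [[4.0, 0.0], [0.0, 0.01]]
stat, pvalue = wald_two_param((-1.0, 0.98), (-1.2, 0.97), V, V, n1=100, n2=100)
print(f"Wald = {stat:.4f}, p-value = {pvalue:.4f}")
```

If your own implementation returns a different statistic for these inputs, the most likely culprit is the scaling of the covariance matrices.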


### 4.4 Question 4: Hansen-Singleton replication (Table 3) using CARA
Repeat the exercises entailed in Questions 2 and 3 under CARA utility. Compare the estimation results and over-identification test results under CRRA and CARA. Discuss any differences you observe.
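
To make the comparison concrete, the sketch below contrasts the stochastic discount factors implied by the two utilities and the Euler-equation residuals they produce. The functional forms are assumptions: CRRA is written in the Hansen-Singleton form $\beta (C_{t+1}/C_t)^{\alpha}$, and CARA as $\beta \exp(-\alpha (C_{t+1} - C_t))$. The sign and scaling conventions used elsewhere in this notebook may differ, in which case the notebook's own definitions take precedence. All function names and data are hypothetical.

```python
import math


def sdf_crra(beta, alpha, c_now, c_next):
    """CRRA discount factor (assumed form): beta * (C_{t+1} / C_t) ** alpha."""
    return beta * (c_next / c_now) ** alpha


def sdf_cara(beta, alpha, c_now, c_next):
    """CARA discount factor (assumed form): beta * exp(-alpha * (C_{t+1} - C_t))."""
    return beta * math.exp(-alpha * (c_next - c_now))


def euler_errors(sdf, beta, alpha, consumption, gross_returns):
    """Euler-equation residuals m_{t+1} * R_{t+1} - 1 for each t."""
    return [
        sdf(beta, alpha, consumption[t], consumption[t + 1]) * gross_returns[t + 1] - 1.0
        for t in range(len(consumption) - 1)
    ]


# Made-up consumption levels and gross returns, for illustration only.
consumption = [1.00, 1.01, 1.03]
gross_returns = [1.00, 1.02, 1.01]

crra_errors = euler_errors(sdf_crra, 0.98, -1.0, consumption, gross_returns)
cara_errors = euler_errors(sdf_cara, 0.98, 1.0, consumption, gross_returns)
print("CRRA residuals:", crra_errors)
print("CARA residuals:", cara_errors)
```

The key practical difference is that CRRA depends on the consumption ratio while CARA depends on the consumption difference, so the CARA estimate of $\alpha$ is sensitive to the units in which consumption is measured.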

### Panel A
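
The NLAG=2 and NLAG=4 specifications differ from NLAG=1 only in their instrument lists, which can be generated rather than typed out. The helper below is a hypothetical sketch: `build_instruments` is not part of the provided code, and the `_l{lag}` column names for lags beyond 1 are extrapolated from the NLAG=1 example, so check them against the actual columns of `data_quarterly` before use.

```python
def build_instruments(nlag, consumption_var="cons_nds_pc",
                      returns=("ret_ew_gross", "ret_vw_gross")):
    """Return ['const', <lag-1 variables>, ..., <lag-nlag variables>]."""
    names = ["const"]
    for lag in range(1, nlag + 1):
        # Assumed naming pattern: '<variable>_ratio_l<lag>' and '<return>_l<lag>'
        names.append(f"{consumption_var}_ratio_l{lag}")
        names.extend(f"{ret}_l{lag}" for ret in returns)
    return names


for nlag in (1, 2, 4):
    instruments = build_instruments(nlag)
    print(f"NLAG={nlag}: {len(instruments)} instruments")
```

For Panel B, the same helper can be called with `returns=("ret_vw_gross", "rf_gross")`, again assuming the lagged columns follow the same naming pattern.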

In [None]:
"""
Question 4: Table III Panel A & B Replication (CARA)
"""

# ============================================================================
# TODO: DEFINE PANEL A SPECIFICATIONS (CARA UTILITY)
# ============================================================================
# INSTRUCTIONS:
# Repeat Question 2, but using CARA utility instead of CRRA.
# 
# Changes from Question 2:
# - utility='CARA' (instead of 'CRRA')
# - All other parameters remain the same
#
# Panel A uses:
# - Consumption: cons_nds_pc (Nondurables)
# - Returns: ret_ew_gross AND ret_vw_gross (both EW and VW)
# - Utility: CARA  ← KEY CHANGE
# - Sample period: '1959-04-01' to '1978-12-31'
# - NLAG values: 1, 2, 4
#
# Example for NLAG=1:
# panel_a_cara_specs = [
#     EstimationSpec(
#         name='Table3A-NDS-NLAG1-CARA',
#         utility='CARA',  # ← Changed from CRRA
#         consumption_var='cons_nds_pc',
#         returns=['ret_ew_gross', 'ret_vw_gross'],
#         instruments=['const', 'cons_nds_pc_ratio_l1', 'ret_ew_gross_l1', 'ret_vw_gross_l1'],
#         sample_start='1959-04-01',
#         sample_end='1978-12-31'
#     ),
#     # TODO: Add NLAG=2 specification
#     # TODO: Add NLAG=4 specification
# ]

# TODO: Define panel_a_cara_specs list here
assert False, "TODO: Define panel_a_cara_specs for Question 4 Panel A. Remove this assertion when implemented."

# ============================================================================
# PANEL A: EW + VW Returns (CARA)
# ============================================================================

print("="*80)
print("Question 4: Panel A (Nondurables, EW+VW, CARA)")
print("="*80)

# Run estimations
runner_panel_a_cara = SpecificationRunner()
runner_panel_a_cara.run_specs(panel_a_cara_specs, data_quarterly)

# Display results
df = runner_panel_a_cara.to_dataframe(
    by_method=False, 
    decimals=4, 
    diagnostics=['N', 'Moments', 'J_stat', 'J_pval', 'DOF', 'Converged']
)
print(df)
# df.to_excel('question4_panel_a_cara.xlsx', index=False)

# Summary statistics
runner_panel_a_cara.summary_stats()


In [None]:
# ============================================================================
# TODO: DEFINE PANEL B SPECIFICATIONS (CARA UTILITY)
# ============================================================================
# INSTRUCTIONS:
# Replicate Panel B with CARA utility.
#
# Panel B uses:
# - Consumption: cons_nds_pc (Nondurables)
# - Returns: ret_vw_gross AND rf_gross (VW and Risk-free)
# - Utility: CARA  ← KEY CHANGE
# - Sample period: '1959-04-01' to '1978-12-31'
# - NLAG values: 1, 2, 4
#
# TODO: Define panel_b_cara_specs list here
assert False, "TODO: Define panel_b_cara_specs for Question 4 Panel B. Remove this assertion when implemented."

# ============================================================================
# PANEL B: VW + RF Returns (CARA)
# ============================================================================

print("="*80)
print("Question 4: Panel B (Nondurables, VW+RF, CARA)")
print("="*80)

# Run estimations
runner_panel_b_cara = SpecificationRunner()
runner_panel_b_cara.run_specs(panel_b_cara_specs, data_quarterly)

# Display results
df = runner_panel_b_cara.to_dataframe(
    by_method=False, 
    decimals=4, 
    diagnostics=['N', 'Moments', 'J_stat', 'J_pval', 'DOF', 'Converged']
)
print(df)
# df.to_excel('question4_panel_b_cara.xlsx', index=False)

# Summary statistics
runner_panel_b_cara.summary_stats()


### 4.5 Question 5: Validation
On the basis of the evidence from your work, which parameterization (CRRA or CARA) is more palatable? Briefly explain the reasons for your choice.
