# Tutorial 07: Synthesis - Comparing All Standard Error Methods

**Author**: PanelBox Development Team  
**Date**: 2026-02-16  
**Estimated Duration**: 120-150 minutes  
**Difficulty Level**: Advanced (synthesis)  
**Prerequisites**: Notebooks 01-06 (all SE methods)

---

## Learning Objectives

By the end of this tutorial, you will be able to:

1. **Synthesize** all standard error methods learned in previous notebooks
2. **Apply** a systematic comparison framework using `StandardErrorComparison`
3. **Develop** decision-making intuition for choosing SE methods
4. **Identify** when SE choice critically affects inference
5. **Create** publication-ready tables with multiple SE specifications
6. **Follow** best practices for reporting robust inference
7. **Conduct** comprehensive sensitivity analysis

---

## Table of Contents

1. [Introduction: The SE Decision Tree](#section1)
2. [Application 1: Financial Panel - Two-Way Clustering](#section2)
3. [Application 2: Macro Panel - Driscoll-Kraay vs PCSE](#section3)
4. [Application 3: Wage Panel - Cluster-Robust vs Quantile](#section4)
5. [Decision-Making Framework](#section5)
6. [Publication-Ready Tables](#section6)
7. [Case Study: When SE Choice Matters Most](#section7)
8. [Exercises](#section8)
9. [Summary and Takeaways](#section9)
10. [References](#section10)

---

<a id='setup'></a>
## Setup

In [None]:
# Standard imports
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.patches as mpatches
from matplotlib.patches import FancyBboxPatch
import seaborn as sns
import warnings
warnings.filterwarnings('ignore')

# PanelBox imports
import sys
sys.path.insert(0, '../../../desenvolvimento')
import panelbox as pb
from panelbox.models.static import FixedEffects, PooledOLS
from panelbox.models.quantile import PooledQuantile
from panelbox.inference.quantile import QuantileBootstrap
from panelbox.standard_errors.comparison import StandardErrorComparison

# Configuration
np.random.seed(42)
sns.set_style("whitegrid")
plt.rcParams['figure.dpi'] = 100
plt.rcParams['figure.figsize'] = (12, 7)

# Paths
DATA_PATH = '../data/'
FIG_PATH = '../outputs/figures/07_comparison/'

import os
os.makedirs(FIG_PATH, exist_ok=True)
os.makedirs('../outputs/reports/', exist_ok=True)

print("Setup complete! All libraries loaded.")
print(f"Output directory: {FIG_PATH}")

---

<a id='section1'></a>
## 1. Introduction: The SE Decision Tree

### 1.1 Review: Taxonomy of Standard Error Methods

After six tutorials covering individual methods, let's synthesize everything into a unified framework.

**Complete Map of Standard Errors in Panel Data**:

```
Standard Errors in Panel Data
│
├─ HETEROSKEDASTICITY ONLY
│  ├─ HC0 (White 1980)                  → Basic sandwich
│  ├─ HC1 (DF-corrected)                → Stata default 'robust'
│  ├─ HC2 (Leverage-adjusted)           → Moderate correction
│  └─ HC3 (Aggressive leverage)         → Best small samples
│
├─ TEMPORAL CORRELATION
│  ├─ Cluster by Entity                 → Most common in micro panels
│  ├─ Newey-West HAC                    → Time series / long-T panels
│  └─ Driscoll-Kraay HAC               → Panel with cross-section corr.
│
├─ CROSS-SECTIONAL CORRELATION
│  ├─ Cluster by Time                   → Common shocks approach
│  ├─ Driscoll-Kraay HAC               → Handles both dimensions
│  └─ PCSE                             → Political science default
│
├─ BOTH TEMPORAL & CROSS-SECTIONAL
│  ├─ Two-Way Clustering               → Finance panels standard
│  ├─ Driscoll-Kraay HAC               → Macro panels
│  └─ Spatial HAC                      → Geographic data
│
└─ NONLINEAR MODELS
   ├─ MLE Sandwich (robust)             → Logit, Probit, Poisson
   ├─ MLE Cluster-robust               → Panel nonlinear models
   └─ Bootstrap                         → Quantile regression
```

### 1.2 Key Questions for Choosing SE Method

Before selecting a SE method, answer these diagnostic questions:

| # | Question | Implication |
|---|----------|-------------|
| 1 | Data type: Cross-section, Time series, or Panel? | Determines applicable methods |
| 2 | Panel type: Micro (N large, T small) or Macro (N small, T large)? | Cluster vs HAC |
| 3 | Heteroskedasticity present? | Almost always yes → use robust |
| 4 | Temporal correlation within entities? | Cluster by entity |
| 5 | Cross-sectional correlation (same time period)? | Two-way or Driscoll-Kraay |
| 6 | Spatial correlation? | Spatial HAC |
| 7 | Number of clusters G ≥ 20? | Rule of thumb for reliable cluster asymptotics; with fewer clusters, consider a bootstrap |
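
The diagnostic questions above can be sketched as a small helper. This is only an illustration of the rules of thumb used in this tutorial (G ≥ 20 clusters, T > 20 for HAC-type methods, T > 2N for PCSE); the function name `suggest_se_method` and its arguments are hypothetical and not part of PanelBox. The returned strings mirror the `cov_type` labels used later in the notebook, except `"bootstrap"`, which is just a placeholder label.

```python
def suggest_se_method(panel_type, n_entities, n_periods,
                      cross_section_corr=False, min_clusters=20):
    """Map the diagnostic questions to a suggested SE method (illustrative only)."""
    if panel_type == "micro":
        # Too few entity clusters for cluster asymptotics
        if n_entities < min_clusters:
            return "bootstrap"
        # Two-way clustering needs enough clusters in BOTH dimensions
        if cross_section_corr and n_periods >= min_clusters:
            return "twoway"
        return "clustered"
    if panel_type == "macro":
        # Short T: HAC-type methods are unreliable, fall back to clustering
        if n_periods <= 20:
            return "clustered"
        # PCSE only when T is comfortably larger than N
        if n_periods > 2 * n_entities:
            return "pcse"
        return "driscoll_kraay"
    raise ValueError("panel_type must be 'micro' or 'macro'")


print(suggest_se_method("micro", 50, 120, cross_section_corr=True))
print(suggest_se_method("macro", 30, 40, cross_section_corr=True))
print(suggest_se_method("micro", 2000, 5))
```

A rule-based helper like this is a starting point for a primary specification, not a substitute for comparing several methods as a robustness check.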

### 1.3 Objectives of This Notebook

We apply these principles to **three real empirical problems**:

| Application | Dataset | Primary Issue | Methods Compared |
|-------------|---------|---------------|------------------|
| **Financial Panel** | Stock returns (50 firms, 120 months) | Both temporal & cross-section corr. | Robust, Cluster, Two-Way, Driscoll-Kraay |
| **Macro Panel** | Growth (30 countries, 40 years) | Cross-section + temporal corr. | Driscoll-Kraay, PCSE, Two-Way Cluster |
| **Wage Panel** | Wages (2000 individuals, 5 years) | Temporal corr. within persons | Robust, Cluster, Two-Way, Quantile regression (cluster SE) |

In [None]:
# Visualize the SE Decision Framework
fig, ax = plt.subplots(figsize=(16, 11))
ax.set_xlim(0, 16)
ax.set_ylim(0, 11)
ax.axis('off')

def add_box(ax, x, y, w, h, text, color, fontsize=9.5, alpha=0.85, bold=False):
    box = FancyBboxPatch((x - w/2, y - h/2), w, h,
                          boxstyle="round,pad=0.1",
                          facecolor=color, edgecolor='#333333', linewidth=1.5,
                          alpha=alpha)
    ax.add_patch(box)
    weight = 'bold' if bold else 'normal'
    ax.text(x, y, text, ha='center', va='center', fontsize=fontsize,
            fontweight=weight, wrap=True,
            multialignment='center')

def add_arrow(ax, x1, y1, x2, y2, label='', color='#555555'):
    ax.annotate('', xy=(x2, y2), xytext=(x1, y1),
                arrowprops=dict(arrowstyle='->', color=color, lw=1.8))
    if label:
        mx, my = (x1+x2)/2, (y1+y2)/2
        ax.text(mx + 0.15, my, label, fontsize=8, color=color, fontstyle='italic')

# Start
add_box(ax, 8, 10.2, 3.2, 0.7, 'Panel Data Model\n(Regression)', '#4C72B0', fontsize=10, bold=True)

# Q1: Panel type
add_box(ax, 8, 9.0, 4.0, 0.7, 'Panel Type?', '#FFA500', fontsize=9.5, alpha=0.9, bold=True)
add_arrow(ax, 8, 9.85, 8, 9.35)

# Micro panel
add_box(ax, 3.5, 7.8, 3.8, 0.65,
        'Micro Panel\n(N large, T small)', '#90EE90', fontsize=9)
add_arrow(ax, 6.0, 9.0, 5.4, 7.8, 'N>>T')

# Macro panel
add_box(ax, 12.5, 7.8, 3.8, 0.65,
        'Macro Panel\n(N small, T large)', '#FFB6C1', fontsize=9)
add_arrow(ax, 10.0, 9.0, 11.6, 7.8, 'T>>N')

# Micro: clustering questions
add_box(ax, 2.0, 6.5, 3.2, 0.6, 'G ≥ 20 clusters?', '#FFA500', fontsize=9, alpha=0.8, bold=False)
add_box(ax, 5.0, 6.5, 3.2, 0.6, 'Cross-section\ncorrelation?', '#FFA500', fontsize=9, alpha=0.8)
add_arrow(ax, 3.5, 7.48, 2.0, 6.8)
add_arrow(ax, 3.5, 7.48, 5.0, 6.8)

# Micro solutions
add_box(ax, 1.0, 5.3, 2.4, 0.65,
        'Bootstrap\n(small G)', '#87CEEB', fontsize=8.5)
add_box(ax, 3.2, 5.3, 2.8, 0.65,
        'Cluster by Entity\n(HC1 baseline)', '#3CB371', fontsize=8.5, alpha=0.9)
add_box(ax, 5.8, 5.3, 2.8, 0.65,
        'Two-Way\nClustering', '#2E8B57', fontsize=8.5, alpha=0.9)
add_arrow(ax, 2.0, 6.2, 1.0, 5.65, 'No')
add_arrow(ax, 2.0, 6.2, 3.2, 5.65, 'Yes')
add_arrow(ax, 5.0, 6.2, 5.8, 5.65, 'Yes')
add_arrow(ax, 5.0, 6.2, 3.2, 5.65, 'No')

# Macro: method questions  
add_box(ax, 11.0, 6.5, 3.2, 0.6, 'T > N?', '#FFA500', fontsize=9, alpha=0.8, bold=False)
add_box(ax, 14.0, 6.5, 3.2, 0.6, 'T > 20?', '#FFA500', fontsize=9, alpha=0.8)
add_arrow(ax, 12.5, 7.48, 11.0, 6.8)
add_arrow(ax, 12.5, 7.48, 14.0, 6.8)

# Macro solutions
add_box(ax, 10.0, 5.3, 2.8, 0.65,
        'Driscoll-Kraay\n(primary)', '#DC143C', fontsize=8.5, alpha=0.9)
add_box(ax, 13.0, 5.3, 2.8, 0.65,
        'PCSE\n(T >> N)', '#FF6347', fontsize=8.5, alpha=0.9)
add_box(ax, 15.2, 5.3, 1.8, 0.65,
        'Cluster\nOnly', '#FFA07A', fontsize=8.5, alpha=0.9)
add_arrow(ax, 11.0, 6.2, 10.0, 5.65, 'No/T≈N')
add_arrow(ax, 11.0, 6.2, 13.0, 5.65, 'Yes')
add_arrow(ax, 14.0, 6.2, 13.0, 5.65, 'Yes')
add_arrow(ax, 14.0, 6.2, 15.2, 5.65, 'No')

# Special cases row
add_box(ax, 8, 4.0, 14, 0.65,
        'Special Cases: Nonlinear Models → MLE Sandwich/Cluster-Robust  |  '
        'Quantile Regression → Cluster Bootstrap  |  Spatial Data → Spatial HAC',
        '#D8BFD8', fontsize=8.5, alpha=0.9)

# Best practices row
add_box(ax, 8, 3.0, 14, 0.65,
        'ALWAYS: Report primary + robustness check  |  '
        'NEVER use nonrobust in panel data  |  '
        'G < 20 clusters → Bootstrap or aggregate',
        '#FFFACD', fontsize=8.5, alpha=0.9, bold=False)

# Legend
legend_elements = [
    mpatches.Patch(facecolor='#FFA500', label='Decision node'),
    mpatches.Patch(facecolor='#3CB371', label='Micro panel method'),
    mpatches.Patch(facecolor='#DC143C', label='Macro panel method'),
    mpatches.Patch(facecolor='#D8BFD8', label='Special cases'),
]
ax.legend(handles=legend_elements, loc='lower left', fontsize=9,
          bbox_to_anchor=(0.01, 0.01))

ax.set_title('Standard Error Decision Flowchart for Panel Data',
             fontsize=16, fontweight='bold', pad=15)
plt.tight_layout()
plt.savefig(FIG_PATH + 'decision_flowchart.png', dpi=150, bbox_inches='tight')
plt.show()
print("Decision flowchart saved.")

---

<a id='section2'></a>
## 2. Application 1: Financial Panel — Two-Way Clustering

### 2.1 Context and Research Question

**Research Question**: Do market risk and the book-to-market ratio explain stock excess returns?

**Data Structure**:
- N = 50 firms, T = 120 months (10 years) → balanced panel
- Variables: `returns` (excess return), `market_ret` (market premium), `book_to_market` (value factor)

**Expected Correlations**:
- **Temporal (within firm)**: Momentum, persistent idiosyncratic risk
- **Cross-sectional (same month)**: Market shocks, macroeconomic news

> **Key Insight**: Both types of correlation are present, so classical and even firm-clustered SEs are inadequate.
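
To make concrete what two-way clustering computes, the sketch below implements the Cameron-Gelbach-Miller combination (firm-clustered variance plus month-clustered variance minus the variance clustered on the firm-month intersection) in plain NumPy. It is not PanelBox's implementation: the function names, the simulated data-generating process, and the omission of small-sample corrections are all assumptions made for illustration.

```python
import numpy as np

def cluster_meat(X, resid, groups):
    """Sum over clusters g of (X_g' e_g)(X_g' e_g)'."""
    scores = X * resid[:, None]                  # per-observation scores x_it * e_it
    _, codes = np.unique(groups, return_inverse=True)
    codes = codes.ravel()
    sums = np.zeros((codes.max() + 1, X.shape[1]))
    np.add.at(sums, codes, scores)               # cluster-level score sums
    return sums.T @ sums

def twoway_cluster_cov(X, resid, firm, month):
    """V_firm + V_month - V_(firm, month); no small-sample correction."""
    bread = np.linalg.inv(X.T @ X)
    pair = np.array([f"{f}_{m}" for f, m in zip(firm, month)])
    meat = (cluster_meat(X, resid, firm)
            + cluster_meat(X, resid, month)
            - cluster_meat(X, resid, pair))
    return bread @ meat @ bread

# Simulated panel (hypothetical): firm effects and common monthly shocks in the error
rng = np.random.default_rng(0)
n_firms, n_months = 50, 120
firm = np.repeat(np.arange(n_firms), n_months)
month = np.tile(np.arange(n_months), n_firms)
x = rng.normal(size=firm.size) + rng.normal(size=n_months)[month]
e = (rng.normal(size=firm.size)
     + rng.normal(size=n_firms)[firm]
     + rng.normal(size=n_months)[month])
y = 1.0 + 0.5 * x + e

X = np.column_stack([np.ones_like(x), x])
beta = np.linalg.lstsq(X, y, rcond=None)[0]
resid = y - X @ beta

V_tw = twoway_cluster_cov(X, resid, firm, month)
se_tw = np.sqrt(np.diag(V_tw))
print("Two-way clustered SEs:", se_tw.round(4))
```

In a balanced panel each firm-month pair holds a single observation, so the subtracted term reduces to the heteroskedasticity-robust (White) meat. That term removes the double counting of each observation's own variance, which appears in both the firm and the month sums.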

In [None]:
# Load financial panel
fin_data = pd.read_csv(DATA_PATH + 'financial_panel.csv')

print("=" * 60)
print("FINANCIAL PANEL DATA")
print("=" * 60)
print(f"Shape: {fin_data.shape}")
print(f"Firms: {fin_data['firm_id'].nunique()}")
print(f"Months: {fin_data['month'].nunique()}")
print(f"Variables: {list(fin_data.columns)}")
print()

print("Descriptive Statistics:")
print(fin_data[['returns', 'market_ret', 'book_to_market']].describe().round(4))

# Note on collinearity: 'size' is time-invariant within firms → absorbed by FE
# We use market_ret and book_to_market (both vary within firms over time)
size_within_std = fin_data.groupby('firm_id')['size'].std().mean()
print(f"\nNote: mean within-firm std of 'size' = {size_within_std:.4f} "
      "(a value near 0 means 'size' is time-invariant and absorbed by firm FE)")
print("Model uses: returns ~ market_ret + book_to_market")

In [None]:
# Fixed effects model (entity index: firm_id, time index: month)
fe_fin = FixedEffects("returns ~ market_ret + book_to_market",
                      fin_data, "firm_id", "month")

# Estimate with different SE methods
results_fin = {}

# 1. Non-robust (baseline — incorrect for panel data)
results_fin['nonrobust'] = fe_fin.fit(cov_type='nonrobust')
print("[1/6] Non-robust: done")

# 2. Robust HC1 (ignores correlation structure)
results_fin['robust'] = fe_fin.fit(cov_type='robust')
print("[2/6] Robust (HC1): done")

# 3. Cluster by firm (captures within-firm temporal correlation)
results_fin['cluster_firm'] = fe_fin.fit(cov_type='clustered')
print("[3/6] Cluster by firm: done")

# 4. Two-way clustering (firm + month — captures both dimensions)
results_fin['twoway'] = fe_fin.fit(cov_type='twoway')
print("[4/6] Two-way clustering: done")

# 5. Driscoll-Kraay (alternative to two-way for macro-style panels)
results_fin['driscoll_kraay'] = fe_fin.fit(cov_type='driscoll_kraay', max_lags=6)
print("[5/6] Driscoll-Kraay (lags=6): done")

# 6. Newey-West HAC
results_fin['newey_west'] = fe_fin.fit(cov_type='newey_west', max_lags=6)
print("[6/6] Newey-West (lags=6): done")

print("\nAll financial panel estimations complete.")

### 2.2 Comprehensive SE Comparison — Financial Panel

In [None]:
# Create comparison using StandardErrorComparison
comp_fin = StandardErrorComparison(results_fin['twoway'])

result_fin = comp_fin.compare_all(
    se_types=['nonrobust', 'robust', 'clustered', 'twoway', 'driscoll_kraay', 'newey_west']
)

print("=" * 80)
print("FINANCIAL PANEL: STANDARD ERROR COMPARISON")
print("=" * 80)
print()

# SE comparison table
se_table = result_fin.se_comparison.copy()
se_table.columns = ['NonRobust', 'Robust', 'Cluster\n(Firm)', 'Two-Way', 'Driscoll\nKraay', 'Newey\nWest']
print("Standard Errors by Method:")
print(se_table.round(5).to_string())
print()

# SE ratios relative to non-robust
print("SE Ratios Relative to Non-Robust:")
print(result_fin.se_ratios.round(3).to_string())
print()

# Significance comparison
print("Significance Stars (* p<0.10, ** p<0.05, *** p<0.01):")
print(result_fin.significance.to_string())

In [None]:
# Quantify the inflation: nonrobust vs twoway vs driscoll_kraay
print("=" * 70)
print("KEY FINDING: SE INFLATION IN FINANCIAL PANEL")
print("=" * 70)

for var in ['market_ret', 'book_to_market']:
    se_nr = results_fin['nonrobust'].std_errors[var]
    se_rob = results_fin['robust'].std_errors[var]
    se_cl = results_fin['cluster_firm'].std_errors[var]
    se_tw = results_fin['twoway'].std_errors[var]
    se_dk = results_fin['driscoll_kraay'].std_errors[var]

    print(f"\n{var}:")
    print(f"  Non-robust:     {se_nr:.5f} (baseline = 1.00x)")
    print(f"  Robust:         {se_rob:.5f} ({se_rob/se_nr:.2f}x vs nonrobust)")
    print(f"  Cluster (firm): {se_cl:.5f} ({se_cl/se_nr:.2f}x vs nonrobust)")
    print(f"  Two-Way:        {se_tw:.5f} ({se_tw/se_nr:.2f}x vs nonrobust)")
    print(f"  Driscoll-Kraay: {se_dk:.5f} ({se_dk/se_nr:.2f}x vs nonrobust)")

print()
print("Observations:")
print("  1. Two-way SEs are substantially larger than robust SEs")
print("  2. Firm clustering > Non-robust (temporal autocorrelation present)")
print("  3. Driscoll-Kraay ≈ Two-Way (both capture multi-dimensional correlation)")
print("  4. Ignoring correlation structure → over-rejection of null hypotheses")

### 2.3 Visualization: SE Comparison Bar Chart

In [None]:
# Visualize SE comparison for financial panel
vars_to_plot = ['market_ret', 'book_to_market']
method_keys = ['nonrobust', 'robust', 'cluster_firm', 'twoway', 'driscoll_kraay', 'newey_west']
method_labels = ['Non\nRobust', 'Robust\n(HC1)', 'Cluster\n(Firm)', 'Two-\nWay', 'Driscoll\nKraay', 'Newey\nWest']
colors = ['#aaaaaa', '#5B9BD5', '#FFA500', '#C00000', '#70AD47', '#9B59B6']

fig, axes = plt.subplots(1, 2, figsize=(16, 6))

for ax, var in zip(axes, vars_to_plot):
    ses = [results_fin[k].std_errors[var] for k in method_keys]
    bars = ax.bar(method_labels, ses, color=colors,
                  edgecolor='black', linewidth=1.2, alpha=0.85)

    # Annotate with values
    for bar, se in zip(bars, ses):
        height = bar.get_height()
        ax.text(bar.get_x() + bar.get_width()/2., height * 1.01,
                f'{se:.4f}', ha='center', va='bottom', fontsize=8.5,
                fontweight='bold')

    # Highlight non-robust as dangerous reference
    ax.axhline(y=ses[0], color='red', linestyle='--', alpha=0.5,
               linewidth=1.5, label='Non-robust (baseline)')

    ax.set_ylabel('Standard Error', fontsize=12)
    ax.set_title(f'SE Comparison: {var}', fontsize=13, fontweight='bold')
    ax.grid(axis='y', alpha=0.3)
    ax.legend(fontsize=9)

plt.suptitle('Financial Panel: Standard Error Comparison Across Methods\n'
             '(50 Firms × 120 Months)',
             fontsize=14, fontweight='bold', y=1.02)
plt.tight_layout()
plt.savefig(FIG_PATH + 'fin_se_comparison.png', dpi=150, bbox_inches='tight')
plt.show()
print("Figure saved.")

In [None]:
# Coefficient plots with CIs under each SE method
fig = comp_fin.plot_comparison(result=result_fin, figsize=(14, 9))
plt.suptitle('Financial Panel: Coefficients and 95% CIs by SE Method',
             fontsize=13, fontweight='bold', y=1.01)
plt.tight_layout()
plt.savefig(FIG_PATH + 'fin_ci_comparison.png', dpi=150, bbox_inches='tight')
plt.show()
print("Confidence interval comparison saved.")

### 2.4 Sensitivity Analysis and Recommendation

**Does SE choice change inference?**
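
The mechanics behind this question are simple: the coefficient is fixed, so a larger SE shrinks the t-statistic and raises the p-value. A minimal sketch, using made-up numbers (the coefficient and both SEs below are hypothetical, not results from this dataset) and a standard normal approximation rather than the exact reference distribution a given estimator would use:

```python
import math

def two_sided_p(coef, se):
    """Two-sided p-value under a standard normal approximation."""
    t = abs(coef / se)
    return math.erfc(t / math.sqrt(2))

coef = 0.50  # hypothetical point estimate
for label, se in [("non-robust", 0.20), ("two-way", 0.30)]:  # hypothetical SEs
    print(f"{label:<11} t = {coef / se:5.2f}  p = {two_sided_p(coef, se):.3f}")
```

With these illustrative values, the same coefficient clears the 5% threshold under the smaller SE but not under the larger one, which is exactly the kind of flip the sensitivity table is designed to reveal.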

In [None]:
print("=" * 85)
print("FINANCIAL PANEL: INFERENCE SENSITIVITY")
print("=" * 85)
print(f"{'Method':<20} | {'Var':<17} | {'Coef':>8} | {'SE':>8} | {'t-stat':>7} | {'p-value':>8} | Sig")
print("-" * 85)

method_display = {
    'nonrobust': 'Non-Robust',
    'robust': 'Robust (HC1)',
    'cluster_firm': 'Cluster (Firm)',
    'twoway': 'Two-Way',
    'driscoll_kraay': 'Driscoll-Kraay',
    'newey_west': 'Newey-West'
}

for method, res in results_fin.items():
    for var in ['market_ret', 'book_to_market']:
        coef = res.params[var]
        se = res.std_errors[var]
        t = coef / se
        p = res.pvalues[var]
        sig = '***' if p < 0.01 else ('**' if p < 0.05 else ('*' if p < 0.10 else 'ns'))
        print(f"{method_display[method]:<20} | {var:<17} | {coef:>8.4f} | {se:>8.5f} | {t:>7.2f} | {p:>8.4f} | {sig}")
    print("-" * 85)

print()
print("=" * 85)
print("RECOMMENDATION: FINANCIAL PANEL")
print("=" * 85)
print("""
Primary specification: Two-way clustering (by firm and by month)
  - Accounts for both within-firm temporal correlation AND cross-firm shocks
  - Standard in financial economics (Petersen 2009, Thompson 2011)
  - N_firms=50, N_months=120 — both above G=20 minimum

Robustness check: Driscoll-Kraay (6 lags, as estimated above)
  - Provides independent confirmation of two-way cluster results
  - HAC kernel approach versus clustering approach

Table note: 'Standard errors two-way clustered by firm and month
  (50 firms, 120 months). * p<0.10, ** p<0.05, *** p<0.01.'

AVOID: Non-robust or simple robust SEs — substantially understate uncertainty!
""")

---

<a id='section3'></a>
## 3. Application 2: Macro Panel — Driscoll-Kraay vs PCSE

### 3.1 Context: Trade and Economic Growth

**Research Question**: Does trade openness promote economic growth?

**Data Structure**:
- N = 30 countries, T = 40 years → balanced macro panel  
- Variables: `gdp_growth`, `investment`, `education`, `openness`

**Expected Features**:
- **Global shocks** (financial crises, commodity cycles) → cross-section correlation
- **Country-specific trends** → temporal correlation within countries
- T = 40 > N = 30 → PCSE condition T > N satisfied

**Applicable Methods**: Cluster by country, Driscoll-Kraay, PCSE, Two-Way

In [None]:
# Load macro panel
macro_data = pd.read_csv(DATA_PATH + 'macro_growth.csv')

print("=" * 60)
print("MACRO PANEL DATA")
print("=" * 60)
print(f"Shape: {macro_data.shape}")
print(f"Countries: {macro_data['country_id'].nunique()}")
print(f"Years: {macro_data['year'].max() - macro_data['year'].min() + 1} " +
      f"({macro_data['year'].min()}–{macro_data['year'].max()})")
print(f"Variables: {list(macro_data.columns)}")

# Check balance
balance = macro_data.groupby('country_id').size()
print(f"\nBalance check:")
print(f"  Min obs per country: {balance.min()}")
print(f"  Max obs per country: {balance.max()}")
print(f"  Balanced: {balance.nunique() == 1}")

# T vs N check for PCSE
T = macro_data['year'].nunique()
N = macro_data['country_id'].nunique()
pcse_note = "PCSE condition satisfied" if T > N else "PCSE condition NOT satisfied"
print(f"\nT = {T}, N = {N}, T > N: {T > N} → {pcse_note}")

print()
print("Descriptive Statistics:")
print(macro_data[['gdp_growth', 'investment', 'education', 'openness']].describe().round(3))

In [None]:
# Fixed effects model
fe_macro = FixedEffects("gdp_growth ~ investment + education + openness",
                        macro_data, "country_id", "year")

results_macro = {}

# 1. Non-robust
results_macro['nonrobust'] = fe_macro.fit(cov_type='nonrobust')
print("[1/6] Non-robust: done")

# 2. Robust (HC1)
results_macro['robust'] = fe_macro.fit(cov_type='robust')
print("[2/6] Robust: done")

# 3. Cluster by country (within-country temporal correlation)
results_macro['cluster_country'] = fe_macro.fit(cov_type='clustered')
print("[3/6] Cluster by country: done")

# 4. Two-way clustering (country + year)
results_macro['twoway'] = fe_macro.fit(cov_type='twoway')
print("[4/6] Two-way clustering: done")

# 5. Driscoll-Kraay (T=40 — well above T>20 requirement)
results_macro['driscoll_kraay'] = fe_macro.fit(cov_type='driscoll_kraay', max_lags=4)
print("[5/6] Driscoll-Kraay (lags=4): done")

# 6. PCSE (T=40 > N=30 ✓)
results_macro['pcse'] = fe_macro.fit(cov_type='pcse')
print("[6/6] PCSE: done")

print("\nAll macro panel estimations complete.")

In [None]:
# Comprehensive comparison using StandardErrorComparison
comp_macro = StandardErrorComparison(results_macro['driscoll_kraay'])
result_macro = comp_macro.compare_all(
    se_types=['nonrobust', 'robust', 'clustered', 'twoway', 'driscoll_kraay', 'pcse']
)

print("=" * 90)
print("MACRO PANEL: STANDARD ERROR COMPARISON")
print("=" * 90)
print()

se_display = result_macro.se_comparison.copy()
se_display.columns = ['NonRobust', 'Robust', 'Cluster\n(Country)', 'Two-Way', 'Driscoll\nKraay', 'PCSE']
print("Standard Errors:")
print(se_display.round(5).to_string())
print()

print("SE Ratios (relative to Non-Robust):")
print(result_macro.se_ratios.round(3).to_string())
print()

print("Significance (* p<0.10, ** p<0.05, *** p<0.01):")
print(result_macro.significance.to_string())

In [None]:
# Focus: Driscoll-Kraay vs PCSE vs Cluster
print("=" * 85)
print("MACRO PANEL: INFERENCE SENSITIVITY TABLE")
print("=" * 85)
print(f"{'Method':<22} | {'Variable':<12} | {'Coef':>8} | {'SE':>8} | {'t-stat':>7} | {'p-value':>8} | Sig")
print("-" * 85)

macro_method_display = {
    'nonrobust': 'Non-Robust',
    'robust': 'Robust (HC1)',
    'cluster_country': 'Cluster (Country)',
    'twoway': 'Two-Way',
    'driscoll_kraay': 'Driscoll-Kraay',
    'pcse': 'PCSE'
}

for method, res in results_macro.items():
    for var in ['investment', 'education', 'openness']:
        coef = res.params[var]
        se = res.std_errors[var]
        t = coef / se
        p = res.pvalues[var]
        sig = '***' if p < 0.01 else ('**' if p < 0.05 else ('*' if p < 0.10 else 'ns'))
        print(f"{macro_method_display[method]:<22} | {var:<12} | {coef:>8.4f} | {se:>8.5f} | {t:>7.2f} | {p:>8.4f} | {sig}")
    print("-" * 85)

print("\nKey: consistent significance across DK and PCSE = robust finding")

### 3.2 Driscoll-Kraay vs PCSE: When to Use Which?

| Criterion | Driscoll-Kraay | PCSE |
|-----------|----------------|------|
| **T requirement** | T > 20 (practical) | T > N (ideally T >> N) |
| **Cross-section correlation** | Yes (via HAC) | Yes (estimates full Σ) |
| **Temporal correlation** | Yes (HAC kernel) | Optional (AR1 spec) |
| **Asymptotics** | T → ∞ (N fixed) | T → ∞, T/N → ∞ |
| **Typical use** | Macro panels, DID | Political science (time-series cross-section data) |
| **This dataset** | ✅ T=40 > 20 | ✅ T=40 > N=30 |

**Practical Rules**:
- T < 2N: Prefer Driscoll-Kraay (safer asymptotics)
- T > 2N: PCSE acceptable, estimate full covariance matrix
- Always: Compare both as robustness check
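
To show what "via HAC" means for Driscoll-Kraay, the sketch below sums the score contributions across entities within each period and then applies a Newey-West estimator with Bartlett weights to that time series of sums. It is a bare-bones NumPy illustration on simulated data, not PanelBox's implementation: the function name, the simulated shocks, and the absence of any small-sample correction are assumptions.

```python
import numpy as np

def driscoll_kraay_cov(X, resid, time_ids, max_lags):
    """Driscoll-Kraay covariance with a Bartlett kernel (no small-sample correction)."""
    scores = X * resid[:, None]
    _, codes = np.unique(time_ids, return_inverse=True)  # sorted, so codes follow time order
    codes = codes.ravel()
    n_periods = codes.max() + 1
    h = np.zeros((n_periods, X.shape[1]))
    np.add.at(h, codes, scores)              # h_t = sum over entities of x_it * e_it
    S = h.T @ h                              # lag-0 term
    for lag in range(1, max_lags + 1):
        w = 1.0 - lag / (max_lags + 1.0)     # Bartlett weight
        gamma = h[lag:].T @ h[:-lag]         # lag-`lag` autocovariance of h_t
        S += w * (gamma + gamma.T)
    bread = np.linalg.inv(X.T @ X)
    return bread @ S @ bread

# Simulated macro-style panel (hypothetical): common yearly shock in the error
rng = np.random.default_rng(1)
N, T = 30, 40
year = np.tile(np.arange(T), N)
x = rng.normal(size=N * T)
e = rng.normal(size=N * T) + rng.normal(size=T)[year]
X = np.column_stack([np.ones(N * T), x])
y = X @ np.array([1.0, 0.5]) + e
resid = y - X @ np.linalg.lstsq(X, y, rcond=None)[0]

V_dk = driscoll_kraay_cov(X, resid, year, max_lags=4)
print("Driscoll-Kraay SEs:", np.sqrt(np.diag(V_dk)).round(4))
```

With `max_lags=0` the estimator collapses to clustering by time period, which is one way to see why Driscoll-Kraay handles cross-sectional correlation without estimating a full N × N error covariance matrix as PCSE does.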

In [None]:
# Visualization: SE comparison across methods for macro panel
macro_method_keys = ['nonrobust', 'robust', 'cluster_country', 'twoway', 'driscoll_kraay', 'pcse']
macro_labels = ['Non\nRobust', 'Robust\n(HC1)', 'Cluster\n(Country)', 'Two-\nWay', 'Driscoll\nKraay', 'PCSE']
colors_macro = ['#aaaaaa', '#5B9BD5', '#FFA500', '#C00000', '#70AD47', '#9B59B6']

vars_macro = ['investment', 'education', 'openness']
fig, axes = plt.subplots(1, 3, figsize=(18, 6))

for ax, var in zip(axes, vars_macro):
    ses = [results_macro[k].std_errors[var] for k in macro_method_keys]
    bars = ax.bar(macro_labels, ses, color=colors_macro,
                  edgecolor='black', linewidth=1.2, alpha=0.85)

    for bar, se in zip(bars, ses):
        ax.text(bar.get_x() + bar.get_width()/2., bar.get_height() * 1.01,
                f'{se:.4f}', ha='center', va='bottom', fontsize=8, fontweight='bold')

    ax.set_ylabel('Standard Error', fontsize=11)
    ax.set_title(f'SE Comparison: {var}', fontsize=12, fontweight='bold')
    ax.grid(axis='y', alpha=0.3)

plt.suptitle('Macro Panel: Standard Error Comparison\n(30 Countries × 40 Years)',
             fontsize=14, fontweight='bold', y=1.01)
plt.tight_layout()
plt.savefig(FIG_PATH + 'macro_se_comparison.png', dpi=150, bbox_inches='tight')
plt.show()
print("Figure saved.")

In [None]:
print("=" * 70)
print("RECOMMENDATION: MACRO PANEL")
print("=" * 70)
print("""
Primary specification: Driscoll-Kraay (4 lags, as estimated above)
  - Handles both cross-section and temporal correlation via HAC
  - Does NOT require T > N (works with T > 20)
  - More robust to misspecification of correlation structure

Robustness check: PCSE
  - T=40 > N=30 → PCSE condition satisfied
  - Estimates full cross-section error covariance matrix
  - Compare results with DK — should be similar if both assumptions hold

Secondary robustness: Two-way clustering
  - Simpler assumption, good if G_country ≥ 20 (✓ N=30) and G_year ≥ 20 (✓ T=40)

AVOID: Simple robust SEs — ignore both temporal and cross-section correlation!

Table note: 'Standard errors are Driscoll-Kraay (4 lags). PCSE results
  in Appendix Table A1 confirm all main conclusions.'
""")

---

<a id='section4'></a>
## 4. Application 3: Wage Panel — Cluster-Robust vs Quantile Regression

### 4.1 Context: Returns to Education and Experience

**Research Question**: How do education and experience affect wages?

**Data Structure**:
- N = 2000 individuals, T = 5 years → typical micro panel

**Expected Features**:
- **Strong within-person correlation**: Persistent wage levels, unobserved ability
- **Time-invariant variables**: Education, gender, union status don't change across years
  → Use PooledOLS (not FE) to retain between-person identification
- **Micro panel**: N >> T → cluster by entity is the right approach
- **Extension**: Quantile regression to capture heterogeneous returns
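
Whether a regressor is time-invariant can be checked directly before choosing between FE and PooledOLS. The sketch below uses a tiny made-up DataFrame (not the tutorial's `wage_panel.csv`), and the helper name `time_invariant_columns` is hypothetical.

```python
import pandas as pd

# Toy panel for illustration only: two people, three years each
toy = pd.DataFrame({
    "person_id":  [1, 1, 1, 2, 2, 2],
    "year":       [2018, 2019, 2020, 2018, 2019, 2020],
    "education":  [12, 12, 12, 16, 16, 16],
    "experience": [5, 6, 7, 2, 3, 4],
})

def time_invariant_columns(df, entity_col, candidates):
    """Return the candidate columns that never vary within an entity."""
    n_unique = df.groupby(entity_col)[candidates].nunique()
    return [c for c in candidates if (n_unique[c] <= 1).all()]

invariant = time_invariant_columns(toy, "person_id", ["education", "experience"])
print(invariant)  # ['education']
```

Any column flagged this way is wiped out by the within transformation, so its coefficient cannot be identified under entity fixed effects.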

In [None]:
# Load wage panel
wage_data = pd.read_csv(DATA_PATH + 'wage_panel.csv')

print("=" * 60)
print("WAGE PANEL DATA")
print("=" * 60)
print(f"Shape: {wage_data.shape}")
print(f"Individuals: {wage_data['person_id'].nunique()}")
print(f"Years: {wage_data['year'].nunique()} ({wage_data['year'].min()}–{wage_data['year'].max()})")
print(f"Variables: {list(wage_data.columns)}")
print()

# Check within-person wage persistence (motivation for clustering)
wage_data['log_wage'] = np.log(wage_data['wage'])
within_corr = wage_data.groupby('person_id')['log_wage'].apply(
    lambda x: x.autocorr() if len(x) > 1 else np.nan
).mean()
print(f"Mean within-person wage autocorrelation: {within_corr:.3f}")
print("(High autocorrelation → clustering is essential)")
print()
print("Descriptive Statistics:")
print(wage_data[['log_wage', 'education', 'experience']].describe().round(3))

In [None]:
# Wage panel: most variables are time-invariant (education, female, union)
# → Use PooledOLS with cluster-robust SEs (standard Mincer wage equation approach)
# Note: PooledOLS preserves between-person variation needed to identify education/gender effects
pool_wage = PooledOLS("log_wage ~ education + experience + female + union",
                      wage_data, "person_id", "year")

results_wage = {}

# 1. Non-robust
results_wage['nonrobust'] = pool_wage.fit(cov_type='nonrobust')
print("[1/4] Non-robust: done")

# 2. Robust
results_wage['robust'] = pool_wage.fit(cov_type='robust')
print("[2/4] Robust (HC1): done")

# 3. Cluster by person (correct for micro panel: accounts for within-person correlation)
results_wage['cluster_person'] = pool_wage.fit(cov_type='clustered')
print("[3/4] Cluster by person: done")

# 4. Two-way clustering (person + year)
results_wage['twoway'] = pool_wage.fit(cov_type='twoway')
print("[4/4] Two-way clustering: done")

print("\nWage panel estimations complete.")
print("\nCoefficients:")
for var, coef in results_wage['cluster_person'].params.items():
    se = results_wage['cluster_person'].std_errors[var]
    print(f"  {var:<12}: {coef:.5f}  (SE: {se:.5f})")

In [None]:
# SE comparison — wage panel
comp_wage = StandardErrorComparison(results_wage['cluster_person'])
result_wage = comp_wage.compare_all(
    se_types=['nonrobust', 'robust', 'clustered', 'twoway']
)

print("=" * 70)
print("WAGE PANEL: STANDARD ERROR COMPARISON (PooledOLS)")
print("=" * 70)
print()

se_w = result_wage.se_comparison.copy()
se_w.columns = ['NonRobust', 'Robust', 'Cluster\n(Person)', 'Two-Way']
print("Standard Errors:")
print(se_w.round(5).to_string())
print()

print("SE Ratios relative to Non-Robust:")
print(result_wage.se_ratios.round(3).to_string())
print()

print("Significance (* p<0.10, ** p<0.05, *** p<0.01):")
print(result_wage.significance.to_string())
print()

# Inflation factor — education and experience
for var in ['education', 'experience']:
    se_nr = results_wage['nonrobust'].std_errors[var]
    se_cl = results_wage['cluster_person'].std_errors[var]
    print(f"{var}: Cluster SE is {se_cl/se_nr:.2f}x larger than non-robust SE")
print("(Reflects strong within-person serial correlation)")

### 4.2 Extension: Quantile Regression with Cluster-Robust SE

PooledOLS estimates the **average** effect. Quantile regression reveals **heterogeneous effects** across the wage distribution — e.g., whether education premiums differ for low vs. high earners.

We use `PooledQuantile` with cluster-robust SEs (clustered by person) to account for within-person correlation across years.
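
For intuition on what a quantile estimator targets, the sketch below shows that minimizing the check (pinball) loss over a constant recovers the sample quantile. It uses simulated right-skewed data and plain NumPy; it does not call `PooledQuantile`, and the variable names are illustrative.

```python
import numpy as np

def check_loss(u, tau):
    """Check (pinball) loss: rho_tau(u) = u * (tau - 1[u < 0])."""
    return u * (tau - (u < 0))

rng = np.random.default_rng(2)
y = rng.lognormal(mean=0.0, sigma=0.5, size=2001)  # hypothetical right-skewed "wages"
grid = np.sort(y)                                   # a minimizer always lies at a data point

minimizers = {}
for tau in (0.25, 0.50, 0.75):
    losses = [check_loss(y - q, tau).sum() for q in grid]
    minimizers[tau] = grid[int(np.argmin(losses))]
    print(f"tau={tau:.2f}: check-loss minimizer = {minimizers[tau]:.4f}, "
          f"np.quantile = {np.quantile(y, tau):.4f}")
```

Quantile regression replaces the constant with a linear index in the regressors and minimizes the same loss, which is why its coefficients describe effects at a given point of the conditional wage distribution rather than at the mean.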

In [None]:
# Prepare data for PooledQuantile (takes numpy arrays)
y_wage = wage_data['log_wage'].values
X_wage = np.column_stack([
    np.ones(len(wage_data)),           # Intercept
    wage_data['education'].values,
    wage_data['experience'].values
])
entity_ids = wage_data['person_id'].values
time_ids = wage_data['year'].values

# Estimate quantile regression at three quantiles
quantiles = [0.25, 0.50, 0.75]
qr_results = {}

for tau in quantiles:
    qr = PooledQuantile(
        endog=y_wage,
        exog=X_wage,
        entity_id=entity_ids,
        time_id=time_ids,
        quantiles=tau
    )
    res = qr.fit(se_type='cluster')
    qr_results[tau] = res
    print(f"Q{int(tau*100)}: params={res.params.flatten().round(4)}, "
          f"se={res.std_errors.round(4)}")

print("\nQuantile estimates complete.")

In [None]:
# Compare median quantile regression with cluster SE vs PooledOLS cluster SE
# (QuantileBootstrap not compatible with PooledQuantile in this version)
# Instead we compare cluster-robust SEs across methods

print("=" * 70)
print("COMPARISON: PooledOLS Cluster vs Quantile Regression (Cluster SE)")
print("=" * 70)
print(f"{'Variable':<15} | {'OLS Coef':>9} | {'OLS SE':>9} | {'QR(0.5)':>9} | {'QR SE':>9}")
print("-" * 60)

# Median QR already fitted in qr_results — extract cluster SEs
# Indices: 0=Intercept, 1=education, 2=experience
for i, var in enumerate(['education', 'experience']):
    ols_coef = results_wage['cluster_person'].params[var]
    ols_se = results_wage['cluster_person'].std_errors[var]
    qr_coef = qr_results[0.50].params.flatten()[i+1]  # skip intercept
    qr_se = qr_results[0.50].std_errors[i+1]
    print(f"{var:<15} | {ols_coef:>9.5f} | {ols_se:>9.5f} | {qr_coef:>9.5f} | {qr_se:>9.5f}")

print()
print("Notes:")
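print("  QR(0.5) = median regression estimate")
print("  QR SEs = cluster-robust (clustered by entity)")
print("  QR omits female and union, so its coefficients are not strictly comparable with OLS")
print("  Close OLS vs QR(0.5) agreement points to a roughly symmetric conditional distribution;")
print("  compare Q25 and Q75 to assess whether returns are heterogeneous")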
print()

# Compute inflation factor of cluster vs nonrobust for QR
qr_nr = PooledQuantile(
    endog=y_wage, exog=X_wage,
    entity_id=entity_ids, time_id=time_ids,
    quantiles=0.50
)
res_qr_nr = qr_nr.fit(se_type='nonrobust')
res_qr_cl = qr_results[0.50]

print("QR SE Inflation (cluster vs. nonrobust):")
for i, var in enumerate(['education', 'experience']):
    ratio = res_qr_cl.std_errors[i+1] / res_qr_nr.std_errors[i+1]
    print(f"  {var}: {ratio:.2f}x larger cluster SE")

In [None]:
# Visualize quantile coefficients vs PooledOLS
fig, axes = plt.subplots(1, 2, figsize=(14, 6))

for ax_idx, (var_idx, var_name) in enumerate(zip([1, 2], ['Education', 'Experience'])):
    ax = axes[ax_idx]

    # Quantile estimates with cluster SEs
    qr_coefs = [qr_results[tau].params.flatten()[var_idx] for tau in quantiles]
    qr_ses = [qr_results[tau].std_errors[var_idx] for tau in quantiles]

    ax.errorbar(quantiles, qr_coefs,
                yerr=[1.96 * se for se in qr_ses],
                fmt='o-', color='#C00000', linewidth=2, markersize=8,
                capsize=5, label='Quantile Regression ± 1.96 SE (cluster)', zorder=5)

    # PooledOLS cluster estimate (constant across distribution)
    ols_coef = results_wage['cluster_person'].params[var_name.lower()]
    ols_se = results_wage['cluster_person'].std_errors[var_name.lower()]
    ax.axhline(ols_coef, color='#5B9BD5', linestyle='--', linewidth=2.5,
               label=f'OLS (cluster) = {ols_coef:.4f}', zorder=3)
    ax.fill_between([0.20, 0.80],
                    ols_coef - 1.96*ols_se, ols_coef + 1.96*ols_se,
                    alpha=0.15, color='#5B9BD5', label='OLS 95% CI')

    ax.set_xlabel('Quantile', fontsize=12)
    ax.set_ylabel('Coefficient', fontsize=12)
    ax.set_title(f'Returns to {var_name}\nAcross Wage Distribution', fontsize=12, fontweight='bold')
    ax.set_xticks(quantiles)
    ax.set_xticklabels(['Q25\n(25th)', 'Q50\n(Median)', 'Q75\n(75th)'])
    ax.legend(fontsize=9)
    ax.grid(alpha=0.3)

plt.suptitle('Wage Panel: Quantile Regression vs PooledOLS\n'
             'Cluster-Robust SEs',
             fontsize=13, fontweight='bold', y=1.02)
plt.tight_layout()
plt.savefig(FIG_PATH + 'wage_quantile_vs_ols.png', dpi=150, bbox_inches='tight')
plt.show()
print("Figure saved.")

---

<a id='section5'></a>
## 5. Decision-Making Framework

### 5.1 Comprehensive Decision Table

In [None]:
# Comprehensive decision table
decision_table = pd.DataFrame({
    'Data Structure': [
        'Cross-section only',
        'Time series only',
        'Micro panel (N large, T small)',
        'Micro panel + cross-section corr.',
        'Macro panel (N small, T large)',
        'Macro panel (T >> N)',
        'Spatial data',
        'Nonlinear model (Logit, Probit)',
        'Quantile regression'
    ],
    'Primary Method': [
        'Robust (HC1)',
        'Newey-West HAC',
        'Cluster by entity',
        'Two-way clustering',
        'Driscoll-Kraay',
        'PCSE',
        'Spatial HAC',
        'MLE Sandwich',
        'Cluster Bootstrap'
    ],
    'Robustness Check': [
        'HC2, HC3',
        'Different lags',
        'Two-way clustering',
        'Driscoll-Kraay',
        'PCSE (if T>N)',
        'Driscoll-Kraay',
        'Vary cutoff distance',
        'Cluster-robust MLE',
        'Pairs bootstrap'
    ],
    'Key Requirement': [
        'n > 50',
        'T > 50',
        'G ≥ 20 entities',
        'G ≥ 20 in both dims',
        'T > 20',
        'T > N (ideally T > 2N)',
        'Geographic coordinates',
        'Model correctly specified',
        'B ≥ 499 bootstrap reps'
    ],
    'Example Dataset': [
        'CPS wage survey',
        'GDP quarterly',
        'Wage panel (N=2000)',
        'Financial panel (N=50)',
        'Macro growth (N=30, T=40)',
        'Legislative voting',
        'Regional housing prices',
        'Binary outcome panels',
        'Wage quantiles'
    ]
})

print("=" * 120)
print("DECISION FRAMEWORK: CHOOSING STANDARD ERROR METHOD")
print("=" * 120)
print(decision_table.to_string(index=False))
print("=" * 120)

### 5.2 Red Flags: When to Be Cautious

**Warning Signs That SE Choice Is Critical**:

| Red Flag | Threshold | Implication |
|----------|-----------|-------------|
| SE ratio (cluster/robust) | > 2x | Strong temporal correlation — clustering essential |
| Significance flip | Any | SE choice affects conclusions — report both |
| Few clusters | G < 20 | Cluster asymptotics unreliable — use bootstrap |
| Short T with DK | T < 20 | Asymptotic approximation poor — use clustering |
| T < N with PCSE | T < N | Σ matrix singular — use Driscoll-Kraay |
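
To make the first two rows concrete, the sketch below applies both checks to a single coefficient. The coefficient and both SEs are made-up numbers, not estimates from the tutorial datasets, and `red_flags_for_coefficient` is an illustrative helper rather than part of PanelBox. It assumes a normal approximation for the p-values and uses only the standard library.

```python
from statistics import NormalDist


def red_flags_for_coefficient(coef, se_baseline, se_alternative,
                              se_threshold=2.0, alpha=0.05):
    """Apply the SE-ratio and significance-flip checks to one coefficient."""
    def p_value(se):
        # Two-sided p-value under a standard normal approximation
        return 2 * (1 - NormalDist().cdf(abs(coef / se)))

    p_base = p_value(se_baseline)
    p_alt = p_value(se_alternative)
    ratio = se_alternative / se_baseline
    return {
        'ratio': ratio,
        'p_baseline': p_base,
        'p_alternative': p_alt,
        'large_diff': ratio > se_threshold,
        'sig_flip': (p_base < alpha) != (p_alt < alpha),
    }


# Hypothetical inputs: the same coefficient with a robust SE and a cluster SE
example = red_flags_for_coefficient(coef=0.10, se_baseline=0.04,
                                    se_alternative=0.09)
for key, value in example.items():
    print(f"{key}: {value}")
```

Here the cluster SE is more than twice the robust SE and the coefficient loses significance at the 5% level, so both flags are raised.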

In [None]:
def check_red_flags(result_baseline, result_alternative,
                    name_baseline='Robust', name_alternative='Cluster',
                    se_threshold=2.0, p_threshold=0.05):
    """
    Compare two SE methods and identify red flags.

    Parameters
    ----------
    result_baseline : PanelResults
        Baseline SE method (usually robust)
    result_alternative : PanelResults
        Alternative SE method (e.g., clustered)
    name_baseline : str
        Name for baseline
    name_alternative : str
        Name for alternative
    se_threshold : float
        Ratio threshold for flagging large SE differences
    p_threshold : float
        Significance threshold for checking inference flips

    Returns
    -------
    dict with flags and details
    """
    print(f"RED FLAG CHECK: {name_baseline} vs {name_alternative}")
    print("-" * 70)

    flags = {}

    for var in result_baseline.params.index:
        se_b = result_baseline.std_errors[var]
        se_a = result_alternative.std_errors[var]
        ratio = se_a / se_b

        p_b = result_baseline.pvalues[var]
        p_a = result_alternative.pvalues[var]
        sig_b = p_b < p_threshold
        sig_a = p_a < p_threshold

        large_diff = ratio > se_threshold
        sig_flip = sig_b != sig_a

        flag_markers = []
        if large_diff:
            flag_markers.append(f"[!] SE ratio={ratio:.2f}x > {se_threshold}x")
        if sig_flip:
            prev_sig = "significant" if sig_b else "NOT significant"
            new_sig = "significant" if sig_a else "NOT significant"
            flag_markers.append(f"[!] Inference flips: {prev_sig} → {new_sig}")

        status = "OK" if not flag_markers else "FLAG"
        print(f"  {var:<20}: ratio={ratio:.3f}x  {name_baseline}:p={p_b:.4f}  "
              f"{name_alternative}:p={p_a:.4f}  → {status}")

        if flag_markers:
            for m in flag_markers:
                print(f"    {m}")

        flags[var] = {
            'ratio': ratio, 'large_diff': large_diff,
            'sig_flip': sig_flip, 'flags': flag_markers
        }

    total_flags = sum(1 for v in flags.values() if v['flags'])
    print(f"\nSummary: {total_flags}/{len(flags)} variables flagged")
    if total_flags > 0:
        print("WARNING: SE choice may affect inference — report multiple methods!")
    else:
        print("OK: Inference consistent across SE methods.")

    return flags


print("=" * 70)
print("APPLICATION 1: Financial Panel")
print("=" * 70)
flags_fin = check_red_flags(
    results_fin['robust'], results_fin['twoway'],
    'Robust', 'Two-Way', se_threshold=2.0
)

print()
print("=" * 70)
print("APPLICATION 2: Macro Panel")
print("=" * 70)
flags_macro = check_red_flags(
    results_macro['robust'], results_macro['driscoll_kraay'],
    'Robust', 'Driscoll-Kraay', se_threshold=2.0
)

print()
print("=" * 70)
print("APPLICATION 3: Wage Panel")
print("=" * 70)
flags_wage = check_red_flags(
    results_wage['nonrobust'], results_wage['cluster_person'],
    'Non-Robust', 'Cluster (Person)', se_threshold=2.0
)

In [None]:
print("=" * 70)
print("BEST PRACTICES CHECKLIST")
print("=" * 70)
print("""
Before Running Regressions:
  [ ] Identify data structure (cross-section / time series / panel)
  [ ] Determine N vs T ratio
  [ ] Check number of clusters (G ≥ 20?)
  [ ] Check T > N for PCSE (if applicable)
  [ ] Pre-specify SE method in analysis plan

When Reporting Results:
  [ ] Estimate with at least 3 SE methods
  [ ] Use StandardErrorComparison for systematic check
  [ ] Run check_red_flags() against baseline method
  [ ] Report primary specification + robustness check
  [ ] Specify SE method in table notes
  [ ] Report number of clusters if applicable
  [ ] Discuss if conclusions sensitive to SE choice

In Tables:
  [ ] Standard errors in parentheses (not t-statistics)
  [ ] Specify SE method in table notes
  [ ] Include number of clusters / observations
  [ ] Consistent significance levels (*, **, ***)
  [ ] Consider multi-column format for robustness
""")

---

<a id='section6'></a>
## 6. Publication-Ready Tables

### 6.1 Regression Table with Multiple SE Specifications

The standard format in applied economics shows the same model with different SE methods as separate columns.

In [None]:
def create_regression_table(models, model_names, title, dep_var, decimals=4,
                             sig_levels=(0.10, 0.05, 0.01)):
    """
    Create a publication-ready regression table.

    Parameters
    ----------
    models : list of PanelResults
        List of fitted model results
    model_names : list of str
        Column headers
    title : str
        Table title
    dep_var : str
        Dependent variable description
    decimals : int
        Decimal places for display
    sig_levels : tuple
        Significance thresholds (10%, 5%, 1%)

    Returns
    -------
    str : Formatted table
    """
    def stars(p):
        if p < sig_levels[2]: return '***'
        if p < sig_levels[1]: return '**'
        if p < sig_levels[0]: return '*'
        return ''

    vars_list = list(models[0].params.index)
    n_models = len(models)
    col_w = 16
    var_w = 22
    total_w = var_w + col_w * n_models + 2

    lines = []
    lines.append('=' * total_w)
    lines.append(title.center(total_w))
    lines.append(f"Dependent Variable: {dep_var}".center(total_w))
    lines.append('-' * total_w)

    # Column headers
    header = ' ' * var_w
    for i, name in enumerate(model_names, 1):
        # Center headers so they line up with the centered coefficient cells
        label = f"({i}){name}"
        header += f"{label:^{col_w}}"
    lines.append(header)
    lines.append('-' * total_w)

    # Coefficients and SEs
    for var in vars_list:
        coef_line = f"{var:<{var_w}}"
        se_line = ' ' * var_w
        for res in models:
            coef = res.params[var]
            se = res.std_errors[var]
            p = res.pvalues[var]
            coef_str = f"{coef:.{decimals}f}{stars(p)}"
            se_str = f"({se:.{decimals}f})"
            coef_line += f"{coef_str:^{col_w}}"
            se_line += f"{se_str:^{col_w}}"
        lines.append(coef_line)
        lines.append(se_line)
        lines.append('')

    lines.append('-' * total_w)

    # Model statistics
    stat_rows = [
        ('Observations', [f"{res.nobs:,}" for res in models]),
        ('Entities', [f"{res.n_entities}" for res in models]),
        ('R-squared (within)', [f"{res.rsquared_within:.3f}" for res in models]),
        ('SE method', model_names),
    ]
    for label, values in stat_rows:
        row = f"{label:<{var_w}}"
        for v in values:
            row += f"{v:^{col_w}}"
        lines.append(row)

    lines.append('=' * total_w)
    lines.append(f"Notes: * p<{sig_levels[0]}, ** p<{sig_levels[1]}, *** p<{sig_levels[2]}.")
    lines.append("Standard errors in parentheses.")

    return '\n'.join(lines)


# Financial panel table: 3 SE specifications
models_to_report = [
    results_fin['robust'],
    results_fin['cluster_firm'],
    results_fin['twoway']
]
model_names = ['Robust', 'Cluster(Firm)', 'Two-Way']

fin_table = create_regression_table(
    models=models_to_report,
    model_names=model_names,
    title='Fama-French Factor Model',
    dep_var='Stock Excess Return'
)

print(fin_table)

In [None]:
# Robustness appendix table — ALL SE methods
all_models = [
    results_fin['nonrobust'],
    results_fin['robust'],
    results_fin['cluster_firm'],
    results_fin['twoway'],
    results_fin['driscoll_kraay'],
    results_fin['newey_west']
]
# Single-line names only: embedded newlines would break the fixed-width layout
all_names = ['Classical', 'Robust', 'Cluster(Firm)', 'Two-Way', 'Drisc.-Kraay', 'Newey-West']

robustness_table = create_regression_table(
    models=all_models,
    model_names=all_names,
    title='Appendix: Robustness — All Standard Error Methods',
    dep_var='Stock Excess Return'
)

print(robustness_table)

# Save to file
with open('../outputs/reports/robustness_table.txt', 'w') as f:
    f.write(robustness_table)
print("\nRobustness table saved to ../outputs/reports/robustness_table.txt")

In [None]:
def create_latex_table(models, model_names, title, dep_var, decimals=4,
                        sig_levels=(0.10, 0.05, 0.01)):
    """
    Generate LaTeX code for regression table.

    Parameters
    ----------
    models : list of PanelResults
    model_names : list of str
    title : str
    dep_var : str
    decimals : int
    sig_levels : tuple

    Returns
    -------
    str : LaTeX table
    """
    def stars(p):
        if p < sig_levels[2]: return '$^{***}$'
        if p < sig_levels[1]: return '$^{**}$'
        if p < sig_levels[0]: return '$^{*}$'
        return ''

    n_models = len(models)
    col_spec = 'l' + 'c' * n_models

    lines = [
        r'\begin{table}[htbp]',
        r'\centering',
        r'\caption{' + title + '}',
        r'\label{tab:regression}',
        r'\begin{tabular}{' + col_spec + '}',
        r'\hline\hline',
    ]

    # Header
    header = 'Variable & ' + ' & '.join(f'({i+1}) {name}'
                                         for i, name in enumerate(model_names))
    lines.append(header + r' \\')
    lines.append(r'\hline')

    vars_list = list(models[0].params.index)
    for var in vars_list:
        # Raw string: '\_' in a normal string is an invalid escape sequence
        coef_row = var.replace('_', r'\_') + ' & '
        se_row = ' & '
        for res in models:
            coef = res.params[var]
            se = res.std_errors[var]
            p = res.pvalues[var]
            coef_row += f"{coef:.{decimals}f}{stars(p)} & "
            se_row += f"({se:.{decimals}f}) & "
        lines.append(coef_row.rstrip(' & ') + r' \\')
        lines.append(se_row.rstrip(' & ') + r' \\')
        lines.append('')

    lines.append(r'\hline')
    obs_row = 'Observations & ' + ' & '.join(f"{m.nobs:,}" for m in models)
    r2_row = 'R$^2$ (within) & ' + ' & '.join(f"{m.rsquared_within:.3f}" for m in models)
    lines.append(obs_row + r' \\')
    lines.append(r2_row + r' \\')
    lines.append(r'\hline\hline')
    lines.append(r'\end{tabular}')
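    # tablenotes is provided by the threeparttable package
    # (add \usepackage{threeparttable} to the LaTeX preamble)
    lines.append(r'\begin{tablenotes}')
    lines.append(r'\small')
    # Build the significance legend from sig_levels instead of hardcoding it
    lines.append(r'\item \textit{Notes}: Dependent variable: ' + dep_var +
                 r'. Standard errors in parentheses. '
                 rf'$^{{*}}$ p$<${sig_levels[0]:.2f}, $^{{**}}$ p$<${sig_levels[1]:.2f}, '
                 rf'$^{{***}}$ p$<${sig_levels[2]:.2f}.')
    lines.append(r'\end{tablenotes}')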
    lines.append(r'\end{table}')

    return '\n'.join(lines)


latex_table = create_latex_table(
    models=models_to_report,
    model_names=['Robust', 'Cluster (Firm)', 'Two-Way'],
    title='Fama-French Factor Model: Return Determinants',
    dep_var='Stock Excess Return'
)

with open('../outputs/reports/regression_table.tex', 'w') as f:
    f.write(latex_table)

print("LaTeX table saved to ../outputs/reports/regression_table.tex")
print()
print("LaTeX output preview:")
print("-" * 50)
print(latex_table)

---

<a id='section7'></a>
## 7. Case Study: When SE Choice Matters Most

### 7.1 Monte Carlo: Type I Error Under Misspecification

**Objective**: Demonstrate empirically that using wrong SEs leads to inflated Type I error rates (false discoveries).
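
How far conventional SEs fall short depends on within-entity persistence in both the errors and the regressor. As a rough guide, the sketch below computes the ratio of the true SE to the conventional (i.i.d.) SE for a pooled regression. It assumes a stationary AR(1) regressor with autocorrelation `rho_x` that is independent of stationary AR(1) errors with autocorrelation `rho_e`. The function name and the no-fixed-effects simplification are assumptions made for illustration; fixed effects change the exact numbers.

```python
import math


def se_understatement(rho_x, rho_e, T):
    """Approximate ratio of the true SE to the conventional (i.i.d.) SE.

    Variance inflation: 1 + 2 * sum_{k=1}^{T-1} (1 - k/T) * (rho_x * rho_e)**k
    """
    r = rho_x * rho_e
    inflation = 1 + 2 * sum((1 - k / T) * r**k for k in range(1, T))
    return math.sqrt(inflation)


for rho_x in (0.0, 0.5, 0.9):
    factor = se_understatement(rho_x, rho_e=0.5, T=20)
    print(f"rho_x={rho_x:.1f}, rho_e=0.5, T=20 -> "
          f"true SE / conventional SE = {factor:.2f}")
```

The factor grows with the product `rho_x * rho_e`, so persistence in the regressor matters as much as persistence in the errors.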

**Design**:
- Panel with N=50 entities, T=20 periods
- True β = 1.0 (test H₀: β = 1.0, which is TRUE)
- Errors autocorrelated within entities (ρ = 0.5)
- Expected rejection rate: 5% (nominal α = 0.05)
- 1,000 Monte Carlo replications
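
With a finite number of replications, the simulated rejection rate is itself an estimate. The short calculation below, which assumes independent replications so that the rejection count is binomial, shows how much simulation noise to expect around the nominal 5% with 1,000 draws. The helper name `mc_standard_error` is illustrative.

```python
import math


def mc_standard_error(true_rate, n_replications):
    """Standard error of a simulated rejection rate (binomial proportion)."""
    return math.sqrt(true_rate * (1 - true_rate) / n_replications)


nominal = 0.05
n_reps = 1000
se_rate = mc_standard_error(nominal, n_reps)
lower = nominal - 1.96 * se_rate
upper = nominal + 1.96 * se_rate
print(f"Monte Carlo SE of the rejection rate: {se_rate:.4f}")
print(f"Approximate 95% band for a correctly sized test: [{lower:.3f}, {upper:.3f}]")
```

A rejection rate well outside this band points to size distortion rather than simulation noise.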

In [None]:
# Monte Carlo simulation
np.random.seed(42)
N_entities = 50
T_periods = 20
n_sim = 1000
true_beta = 1.0
autocorr_rho = 0.5

print("=" * 70)
print("MONTE CARLO SIMULATION")
print("=" * 70)
print(f"N = {N_entities} entities, T = {T_periods} periods")
print(f"True β = {true_beta} (H₀: β = {true_beta} is TRUE)")
print(f"Autocorrelation ρ = {autocorr_rho}")
print(f"Replications = {n_sim}")
print(f"Nominal significance level = 0.05")
print(f"\nRunning simulation...")

reject_nonrobust = []
reject_robust = []
reject_cluster = []

for sim in range(n_sim):
    rows = []
    for i in range(N_entities):
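        # Regressor is also AR(1) within entity. This matters: if x were i.i.d.
        # over time, x*eps would be serially uncorrelated and non-robust SEs
        # would be approximately valid even with autocorrelated errors.
        x_i = np.zeros(T_periods)
        x_i[0] = np.random.normal(0, 1)
        for t in range(1, T_periods):
            x_i[t] = autocorr_rho * x_i[t-1] + np.sqrt(1 - autocorr_rho**2) * np.random.normal(0, 1)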
        # Autocorrelated errors within entity
        eps_i = np.zeros(T_periods)
        eps_i[0] = np.random.normal(0, 1)
        for t in range(1, T_periods):
            eps_i[t] = autocorr_rho * eps_i[t-1] + np.sqrt(1 - autocorr_rho**2) * np.random.normal(0, 1)
        y_i = true_beta * x_i + eps_i

        for t in range(T_periods):
            rows.append({'entity': i, 'time': t, 'y': y_i[t], 'x': x_i[t]})

    sim_df = pd.DataFrame(rows)

    fe_sim = FixedEffects("y ~ x", sim_df, "entity", "time")

    res_nr = fe_sim.fit(cov_type='nonrobust')
    res_rob = fe_sim.fit(cov_type='robust')
    res_cl = fe_sim.fit(cov_type='clustered')

    # Test H0: beta = true_beta (TRUE null)
    t_nr = (res_nr.params['x'] - true_beta) / res_nr.std_errors['x']
    t_rob = (res_rob.params['x'] - true_beta) / res_rob.std_errors['x']
    t_cl = (res_cl.params['x'] - true_beta) / res_cl.std_errors['x']

    reject_nonrobust.append(abs(t_nr) > 1.96)
    reject_robust.append(abs(t_rob) > 1.96)
    reject_cluster.append(abs(t_cl) > 1.96)

    if (sim + 1) % 250 == 0:
        print(f"  {sim + 1}/{n_sim} replications done")

print("Simulation complete!")

In [None]:
# Display Type I error rates
rate_nr = np.mean(reject_nonrobust)
rate_rob = np.mean(reject_robust)
rate_cl = np.mean(reject_cluster)

print("=" * 70)
print("TYPE I ERROR RATES (H₀ is TRUE — should reject ≈ 5%)")
print("=" * 70)
print(f"  Non-robust SEs:      {rate_nr:.3f} ({rate_nr*100:.1f}%)", end='')
print(f" {'← OVER-REJECTS!' if rate_nr > 0.07 else '✓'}")
print(f"  Robust (HC1) SEs:    {rate_rob:.3f} ({rate_rob*100:.1f}%)", end='')
print(f" {'← OVER-REJECTS!' if rate_rob > 0.07 else '✓'}")
print(f"  Cluster SEs:         {rate_cl:.3f} ({rate_cl*100:.1f}%)", end='')
print(f" {'← OVER-REJECTS!' if rate_cl > 0.07 else '← correct size ✓' if 0.03 <= rate_cl <= 0.07 else '← conservative'}")
print()
print(f"  Excess false positives (Robust vs Cluster): "
      f"{(rate_rob - rate_cl)*100:.1f} percentage points")
print(f"  Multiplier: Robust rejects {rate_rob/rate_cl:.1f}x more than Cluster")

print()
print("=" * 70)
print("IMPLICATION")
print("=" * 70)
print(f"""
With autocorrelated panel errors (ρ = {autocorr_rho}):
  - Non-robust SEs reject {rate_nr*100:.1f}% instead of 5% → {rate_nr/0.05:.1f}x over-rejection
  - Robust HC1 SEs reject {rate_rob*100:.1f}% instead of 5% → {rate_rob/0.05:.1f}x over-rejection
  - Cluster SEs reject {rate_cl*100:.1f}% → approximately correct size

Among 100 studies that each test a true null hypothesis at the 5% level:
  - Using robust SEs: expect ~{round(rate_rob*100)} false discoveries
  - Using cluster SEs: expect ~{round(rate_cl*100)} false discoveries

Understated SEs are one statistical mechanism that can contribute to
replication failures in social science.
""")

In [None]:
# Visualize Monte Carlo results
fig, axes = plt.subplots(1, 2, figsize=(14, 6))

# Plot 1: Type I Error Rates
ax = axes[0]
methods = ['Non-Robust', 'Robust\n(HC1)', 'Cluster\n(Entity)']
rates = [rate_nr, rate_rob, rate_cl]
colors_bar = ['#d62728', '#FF7F0E', '#2ca02c']

bars = ax.bar(methods, rates, color=colors_bar, alpha=0.8,
              edgecolor='black', linewidth=1.5)
ax.axhline(0.05, color='black', linestyle='--', linewidth=2.5,
           label='Nominal 5% level', zorder=10)
ax.axhspan(0.03, 0.07, alpha=0.1, color='green', label='Acceptable range')

for bar, rate in zip(bars, rates):
    ax.text(bar.get_x() + bar.get_width()/2., bar.get_height() + 0.005,
            f'{rate:.3f}\n({rate*100:.1f}%)',
            ha='center', va='bottom', fontweight='bold', fontsize=11)

ax.set_ylabel('Rejection Rate', fontsize=12)
ax.set_title(f'Type I Error Rates\nH₀ is TRUE (β={true_beta}), ρ={autocorr_rho}',
             fontsize=12, fontweight='bold')
ax.set_ylim(0, max(rates) * 1.4)
ax.legend(fontsize=10)
ax.grid(axis='y', alpha=0.3)

# Plot 2: Distribution of t-statistics under H0
ax2 = axes[1]

# Run one more simulation to get t-statistic distributions
np.random.seed(99)
t_nr_dist, t_cl_dist = [], []

for sim in range(500):
    rows = []
    for i in range(N_entities):
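        # AR(1) regressor within entity (i.i.d. x would hide the SE problem)
        x_i = np.zeros(T_periods)
        x_i[0] = np.random.normal(0, 1)
        for t in range(1, T_periods):
            x_i[t] = autocorr_rho * x_i[t-1] + np.sqrt(1 - autocorr_rho**2) * np.random.normal(0, 1)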
        eps_i = np.zeros(T_periods)
        eps_i[0] = np.random.normal(0, 1)
        for t in range(1, T_periods):
            eps_i[t] = autocorr_rho * eps_i[t-1] + np.sqrt(1 - autocorr_rho**2) * np.random.normal(0, 1)
        y_i = true_beta * x_i + eps_i
        for t in range(T_periods):
            rows.append({'entity': i, 'time': t, 'y': y_i[t], 'x': x_i[t]})

    sim_df = pd.DataFrame(rows)
    fe_sim = FixedEffects("y ~ x", sim_df, "entity", "time")
    res_nr = fe_sim.fit(cov_type='nonrobust')
    res_cl = fe_sim.fit(cov_type='clustered')
    t_nr_dist.append((res_nr.params['x'] - true_beta) / res_nr.std_errors['x'])
    t_cl_dist.append((res_cl.params['x'] - true_beta) / res_cl.std_errors['x'])

# Plot t-statistic distributions
from scipy import stats as scipy_stats
x_range = np.linspace(-5, 5, 200)
normal_pdf = scipy_stats.norm.pdf(x_range)

ax2.hist(t_nr_dist, bins=40, alpha=0.5, color='#d62728',
         label='Non-Robust t-stats', density=True)
ax2.hist(t_cl_dist, bins=40, alpha=0.5, color='#2ca02c',
         label='Cluster t-stats', density=True)
ax2.plot(x_range, normal_pdf, 'k-', linewidth=2.5, label='Standard Normal')
ax2.axvline(-1.96, color='black', linestyle=':', linewidth=1.5)
ax2.axvline(1.96, color='black', linestyle=':', linewidth=1.5, label='±1.96 critical values')

ax2.set_xlabel('t-statistic', fontsize=12)
ax2.set_ylabel('Density', fontsize=12)
ax2.set_title('Distribution of t-statistics under H₀\n'
              '(Should follow Standard Normal)',
              fontsize=12, fontweight='bold')
ax2.legend(fontsize=9)
ax2.set_xlim(-5, 5)
ax2.grid(alpha=0.3)

plt.suptitle(f'Monte Carlo Evidence: Consequences of Wrong SE Method\n'
             f'(N={N_entities}, T={T_periods}, ρ={autocorr_rho}, B={n_sim} replications)',
             fontsize=13, fontweight='bold', y=1.02)
plt.tight_layout()
plt.savefig(FIG_PATH + 'monte_carlo_type1_error.png', dpi=150, bbox_inches='tight')
plt.show()
print("Monte Carlo figure saved.")

### 7.2 Discussion: SE Choice and the Replication Crisis

The Monte Carlo evidence above illustrates one mechanism that contributes to the **replication crisis** in social science:

1. **Publication bias** → journals favor statistically significant results
2. **Wrong SE choice** → inflated test statistics → spurious significance
3. **Replication attempts** use correct SEs → null results
4. **Conclusion**: Original "finding" was a Type I error

**Prevention**:
- Pre-register analysis including SE method choice
- Report multiple SE methods as robustness checks
- Follow field conventions (for micro panels, cluster by entity by default)
- Report exact SE method in all tables

**Key Reference**: Cameron & Miller (2015) — the definitive practitioner's guide to cluster-robust inference.

---

<a id='section8'></a>
## 8. Exercises

### Exercise 1: Complete SE Analysis (Moderate)

**Task**: Apply the full SE comparison framework to a new dataset.

**Requirements**:
1. Load `agricultural_panel.csv` from the data directory
2. Estimate a model of your choice (explore the variables first)
3. Estimate with nonrobust, robust, clustered, and twoway SEs
4. Create a comparison using `StandardErrorComparison`
5. Run `check_red_flags()` comparing robust vs clustered
6. Recommend primary SE method with justification

**Deliverable**: Summary table + 1-paragraph recommendation

**Starter Code**:

In [None]:
# Exercise 1: Complete SE Analysis
# Load agricultural panel data
agri_data = pd.read_csv(DATA_PATH + 'agricultural_panel.csv')
print("Agricultural Panel Data:")
print(f"Shape: {agri_data.shape}")
print(f"Variables: {list(agri_data.columns)}")
print()
print(agri_data.head())

# YOUR CODE:
# Step 1: Explore N, T, variables
# YOUR CODE HERE

# Step 2: Choose dependent and independent variables, build formula
# formula = "..."
# entity_col = "..."
# time_col = "..."

# Step 3: Estimate FixedEffects model with 4 SE methods
# fe_agri = FixedEffects(formula, agri_data, entity_col, time_col)
# results_agri = {}
# results_agri['nonrobust'] = fe_agri.fit(cov_type='nonrobust')
# results_agri['robust'] = fe_agri.fit(cov_type='robust')
# results_agri['clustered'] = fe_agri.fit(cov_type='clustered')
# results_agri['twoway'] = fe_agri.fit(cov_type='twoway')

# Step 4: StandardErrorComparison
# comp_agri = StandardErrorComparison(results_agri['clustered'])
# result_agri = comp_agri.compare_all(se_types=['nonrobust', 'robust', 'clustered', 'twoway'])
# comp_agri.summary(result_agri)

# Step 5: Red flag check
# check_red_flags(results_agri['robust'], results_agri['clustered'])

# Step 6: Write recommendation
print("\n" + "=" * 70)
print("YOUR RECOMMENDATION (write here):")
print("=" * 70)
print("""
Based on the data structure (N=?, T=?) and the comparison results:
  Primary SE method: [...]
  Reasoning: [...]
  Robustness check: [...]
""")

### Exercise 2: Sensitivity Analysis (Moderate)

**Task**: Determine whether key conclusions depend on SE method choice.

**Requirements**:
1. Use the macro panel (`macro_growth.csv`)
2. Focus on the coefficient on `openness`
3. Estimate with 5 different SE methods
4. Create a table showing: Coef, SE, t-stat, p-value, Significance for each method
5. Conclusion: Is the openness-growth relationship robust to SE choice?

**Starter Code**:

In [None]:
# Exercise 2: Sensitivity Analysis
# We already have results_macro from Section 3
# Focus on 'openness' coefficient sensitivity

# YOUR CODE:
# Step 1: Extract openness results from all macro methods
# Step 2: Build sensitivity table (Coef, SE, t-stat, p-value, Sig)
# Step 3: Visualize with coefficient plot (estimate + CI by method)
# Step 4: Conclusion

print("Exercise 2: Sensitivity Analysis — 'openness' coefficient")
print()

# Starter: extract openness results
var = 'openness'
print(f"{'Method':<22} | {'Coef':>8} | {'SE':>8} | {'t-stat':>7} | {'p-value':>8} | Sig")
print("-" * 70)
for method, res in results_macro.items():
    coef = res.params[var]
    se = res.std_errors[var]
    t = coef / se
    p = res.pvalues[var]
    sig = '***' if p < 0.01 else ('**' if p < 0.05 else ('*' if p < 0.10 else 'ns'))
    print(f"{macro_method_display[method]:<22} | {coef:>8.5f} | {se:>8.5f} | {t:>7.2f} | {p:>8.4f} | {sig}")

print()
print("YOUR CONCLUSION (complete this):")
print("""
The openness coefficient is [significant/not significant] across SE methods.
The most appropriate SE method for this data is [...] because [...].
The conclusion that trade openness [affects/does not affect] growth is [robust/sensitive]
to SE choice.
""")

### Exercise 3: Build an SE Selection Function (Advanced)

**Task**: Create a function that recommends an SE method based on data characteristics.

**Requirements**:
1. Function takes panel dataset and returns recommended SE method + rationale
2. Decision logic based on N, T, and user-specified characteristics
3. Validate with all three datasets used in this notebook

**Starter Code**:

In [None]:
# Exercise 3: SE Selection Function

def recommend_se_method(data, entity_col, time_col,
                         has_cross_section_corr=None,
                         has_temporal_corr=None,
                         is_nonlinear=False,
                         min_clusters=20):
    """
    Recommend standard error method based on data characteristics.

    Parameters
    ----------
    data : pd.DataFrame
        Panel dataset
    entity_col : str
        Entity identifier column
    time_col : str
        Time identifier column
    has_cross_section_corr : bool or None
        Whether cross-section correlation is expected
    has_temporal_corr : bool or None
        Whether temporal correlation is expected
    is_nonlinear : bool
        Whether using nonlinear model
    min_clusters : int
        Minimum clusters for asymptotic validity

    Returns
    -------
    dict with keys: 'primary', 'robustness', 'rationale', 'warnings'
    """
    N = data[entity_col].nunique()
    T = data[time_col].nunique()

    recommendation = {
        'N': N, 'T': T,
        'primary': None,
        'robustness': None,
        'rationale': [],
        'warnings': []
    }

    # YOUR CODE: Implement decision logic
    # Hints:
    # - Micro panel: N >> T → cluster by entity
    # - Macro panel: T >> N → Driscoll-Kraay
    # - Cross-section correlation? → two-way or DK
    # - Few clusters (G < min_clusters)? → bootstrap warning
    # - T > N? → PCSE viable

    # Placeholder (replace with your implementation):
    if N >= T:
        recommendation['primary'] = 'clustered'
        recommendation['robustness'] = 'twoway'
        recommendation['rationale'].append(f'Micro panel: N={N} ≥ T={T} → cluster by entity')
    else:
        recommendation['primary'] = 'driscoll_kraay'
        recommendation['robustness'] = 'pcse'  # T > N always holds in this branch
        recommendation['rationale'].append(f'Macro panel: T={T} > N={N} → Driscoll-Kraay')

    if N < min_clusters:
        recommendation['warnings'].append(f'Only {N} entities < {min_clusters} → cluster asymptotics may be unreliable')

    # YOUR CODE: Add more decision rules here

    return recommendation


# Test with all three datasets
print("SE Method Recommendations:")
print("=" * 70)

datasets = [
    (fin_data, 'firm_id', 'month', 'Financial Panel'),
    (macro_data, 'country_id', 'year', 'Macro Panel'),
    (wage_data, 'person_id', 'year', 'Wage Panel')
]

for df, ent, t, name in datasets:
    rec = recommend_se_method(df, ent, t)
    print(f"\n{name}: N={rec['N']}, T={rec['T']}")
    print(f"  Primary: {rec['primary']}")
    print(f"  Robustness: {rec['robustness']}")
    for r in rec['rationale']:
        print(f"  Rationale: {r}")
    for w in rec['warnings']:
        print(f"  WARNING: {w}")

print()
print("EXTEND THIS FUNCTION: Add logic for cross-section correlation,")
print("nonlinear models, spatial data, and bootstrap recommendations.")

---

<a id='section9'></a>
## 9. Summary and Takeaways

### What We Learned Across All 7 Tutorials

In [None]:
# Summary of all tutorials and methods
tutorial_summary = pd.DataFrame({
    'Tutorial': [
        '01 Robust', '02 Clustering', '03 HAC',
        '04 Spatial', '05 MLE Inference', '06 Bootstrap',
        '07 Synthesis'
    ],
    'Methods Covered': [
        'HC0, HC1, HC2, HC3',
        'Cluster (entity, time, two-way)',
        'Newey-West, Driscoll-Kraay',
        'Spatial HAC (geographic)',
        'MLE Sandwich, cluster MLE',
        'Pairs, cluster, wild bootstrap',
        'All methods compared'
    ],
    'Key Use Case': [
        'Heteroskedasticity (always)',
        'Within-cluster correlation',
        'Temporal + cross-sec. correlation',
        'Geographic proximity',
        'Logit, Probit, Poisson',
        'Quantile regression, small G',
        'Capstone: decision framework'
    ],
    'Key Finding': [
        'HC3 best in small samples',
        'G ≥ 20 for valid asymptotics',
        'DK for macro; NW for time series',
        'Spatial cutoff choice matters',
        'MLE cluster often needed',
        'B ≥ 499 for stable results',
        'No one-size-fits-all — context!'
    ]
})

print("=" * 110)
print("COMPLETE SERIES SUMMARY: 7 TUTORIALS ON STANDARD ERRORS")
print("=" * 110)
print(tutorial_summary.to_string(index=False))
print("=" * 110)

In [None]:
print("=" * 70)
print("THE 8 COMMANDMENTS OF STANDARD ERRORS")
print("=" * 70)
print("""
1. NEVER use non-robust SEs in panel data (heteroskedasticity is pervasive)

2. ALWAYS use cluster-robust SEs for micro panels (N large, T small)
   → Cluster by entity captures within-entity temporal correlation

3. PREFER Driscoll-Kraay for macro panels (N small, T large)
   → Also handles cross-sectional correlation via HAC

4. USE two-way clustering for financial panels
   → Both firm and time dimensions correlated (Petersen 2009)

5. CHECK cluster asymptotics: G ≥ 20 is a rough minimum, not a guarantee
   → With fewer clusters, use bootstrap or aggregate

6. VERIFY T > N before using PCSE
   → If T ≈ N or T < N, Driscoll-Kraay is safer

7. REPORT a primary SE specification plus at least one robustness specification
   → Use StandardErrorComparison for systematic checks

8. DISCUSS when SE choice affects conclusions
   → Transparency essential for replicability
""")

print("=" * 70)
print("SIMPLIFIED DECISION TREE")
print("=" * 70)
print("""
Panel Data
    │
    ├─→ N large, T small (micro panel)
    │       │
    │       ├─→ G ≥ 20? YES → Cluster by entity (primary)
    │       │               → Two-way cluster (robustness)
    │       │
    │       └─→ G < 20?  → Bootstrap (cluster or pairs)
    │
    └─→ N small, T large (macro panel)
            │
            ├─→ T > N? YES → PCSE (primary if T >> N)
            │               → Driscoll-Kraay (always valid check)
            │
            └─→ T ≤ N?  → Driscoll-Kraay (primary)
                         → Cluster by entity (secondary)
""")

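The simplified decision tree above can also be written as a small helper. This is only a sketch: `recommend_se` is a hypothetical function, not part of PanelBox. It assumes the caller already knows whether the panel is "micro" or "macro", and it reuses the `cov_type` keys listed in the API reference at the end of this notebook. The `'bootstrap'` label is a placeholder for the bootstrap procedures from Tutorial 06, not a `cov_type` value. The `min_clusters=20` default mirrors the G ≥ 20 rule of thumb and can be raised for more conservative work.

```python
from typing import Dict, Optional


def recommend_se(
    panel_type: str,
    n_entities: int,
    n_periods: int,
    n_clusters: Optional[int] = None,
    min_clusters: int = 20,
) -> Dict[str, Optional[str]]:
    """Hypothetical helper that encodes the simplified decision tree.

    panel_type : 'micro' (N large, T small) or 'macro' (N small, T large)
    n_clusters : number of clusters G; defaults to n_entities
    Returns the primary SE method and a robustness check.
    """
    if panel_type not in ("micro", "macro"):
        raise ValueError("panel_type must be 'micro' or 'macro'")
    if n_entities <= 0 or n_periods <= 0:
        raise ValueError("n_entities and n_periods must be positive")

    if panel_type == "micro":
        g = n_entities if n_clusters is None else n_clusters
        if g >= min_clusters:
            # Enough clusters: entity clustering, two-way as robustness
            return {"primary": "clustered", "robustness": "twoway"}
        # Too few clusters: fall back to bootstrap (cluster or pairs)
        return {"primary": "bootstrap", "robustness": None}

    # Macro panel branch
    if n_periods > n_entities:
        # PCSE needs T > N; Driscoll-Kraay serves as the check
        return {"primary": "pcse", "robustness": "driscoll_kraay"}
    return {"primary": "driscoll_kraay", "robustness": "clustered"}


# Example calls for three stylized panels
print(recommend_se("micro", n_entities=500, n_periods=5))
print(recommend_se("micro", n_entities=12, n_periods=5))
print(recommend_se("macro", n_entities=20, n_periods=40))
```
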
---

<a id='section10'></a>
## 10. References

### Synthesis Papers

1. **Angrist, J. D., & Pischke, J.-S. (2009)**. *Mostly Harmless Econometrics*. Princeton University Press. [Chapter 8: Nonstandard Standard Error Issues]

2. **Cameron, A. C., & Miller, D. L. (2015)**. "A practitioner's guide to cluster-robust inference." *Journal of Human Resources*, 50(2), 317–372.
   - *The definitive practical reference for clustering in panel data.*

3. **Petersen, M. A. (2009)**. "Estimating standard errors in finance panel data sets: Comparing approaches." *Review of Financial Studies*, 22(1), 435–480.
   - *Key reference for two-way clustering in finance.*

4. **Thompson, S. B. (2011)**. "Simple formulas for standard errors that cluster by both firm and time." *Journal of Financial Economics*, 99(1), 1–10.

### Method-Specific Papers

5. **Driscoll, J. C., & Kraay, A. C. (1998)**. "Consistent covariance matrix estimation with spatially dependent panel data." *Review of Economics and Statistics*, 80(4), 549–560.

6. **Beck, N., & Katz, J. N. (1995)**. "What to do (and not to do) with time-series cross-section data." *American Political Science Review*, 89(3), 634–647. [PCSE]

7. **White, H. (1980)**. "A heteroskedasticity-consistent covariance matrix estimator and a direct test for heteroskedasticity." *Econometrica*, 48(4), 817–838. [HC0]

8. **MacKinnon, J. G., & White, H. (1985)**. "Some heteroskedasticity-consistent covariance matrix estimators with improved finite sample properties." *Journal of Econometrics*, 29(3), 305–325. [HC1-HC3]

### For Reporting and Reproducibility

9. **Ioannidis, J. P. A. (2005)**. "Why most published research findings are false." *PLOS Medicine*, 2(8), e124.
   - *Context for understanding why correct SEs matter for scientific validity.*

10. **Open Science Collaboration (2015)**. "Estimating the reproducibility of psychological science." *Science*, 349(6251), aac4716.

---

### PanelBox API Reference

```python
# Standard error methods summary
cov_types = {
    'nonrobust':      'Classical OLS (baseline, often wrong)',
    'robust':         'HC1 heteroskedasticity-robust',
    'hc0':            'HC0 (White 1980)',
    'hc2':            'HC2 (leverage-adjusted)',
    'hc3':            'HC3 (strongest leverage correction, best small samples)',
    'clustered':      'Cluster by entity (temporal correlation)',
    'twoway':         'Two-way cluster (entity + time)',
    'driscoll_kraay': 'Driscoll-Kraay HAC (macro panels)',
    'newey_west':     'Newey-West HAC (time series / short panels)',
    'pcse':           'Panel-corrected SE (requires T > N)',
}

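# Usage: formula, data, entity_col and time_col are placeholders for your own inputs;
# FixedEffects and StandardErrorComparison are imported in the Setup cell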
model = FixedEffects(formula, data, entity_col, time_col)
results = model.fit(cov_type='clustered')

# Comparison
comp = StandardErrorComparison(results)
result = comp.compare_all()  # or compare_all(se_types=['robust', 'clustered', 'twoway'])
comp.summary(result)
comp.plot_comparison(result)
```
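
To make concrete what such a comparison checks, here is a minimal NumPy sketch of the two most common sandwich estimators, HC1 and one-way cluster-robust. It is not PanelBox's implementation. The function name `ols_with_hc1_and_cluster_se`, the simulated data, and the small-sample adjustment factors (n/(n-k) for HC1 and G/(G-1) · (n-1)/(n-k) for clustering, a common convention) are all assumptions made for illustration.

```python
import numpy as np


def ols_with_hc1_and_cluster_se(X, y, clusters):
    """OLS coefficients with HC1 and one-way cluster-robust standard errors."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n, k = X.shape

    bread = np.linalg.inv(X.T @ X)          # (X'X)^{-1}
    beta = bread @ (X.T @ y)
    resid = y - X @ beta
    scores = X * resid[:, None]             # row i is x_i * e_i

    # HC1: meat sums x_i x_i' e_i^2 over observations
    V_hc1 = (n / (n - k)) * bread @ (scores.T @ scores) @ bread

    # Cluster-robust: sum the scores within each cluster first,
    # so within-cluster cross-products enter the meat
    _, codes = np.unique(np.asarray(clusters), return_inverse=True)
    codes = codes.ravel()
    G = codes.max() + 1
    cluster_scores = np.zeros((G, k))
    np.add.at(cluster_scores, codes, scores)
    adj = (G / (G - 1)) * ((n - 1) / (n - k))
    V_cl = adj * bread @ (cluster_scores.T @ cluster_scores) @ bread

    return beta, np.sqrt(np.diag(V_hc1)), np.sqrt(np.diag(V_cl))


# Simulated panel (illustrative only): 50 entities, 10 periods,
# regressor fixed within entity and an entity-level error component
rng = np.random.default_rng(42)
G, T = 50, 10
entity = np.repeat(np.arange(G), T)
x = np.repeat(rng.normal(size=G), T)
u = np.repeat(rng.normal(size=G), T) + rng.normal(size=G * T)
y = 1.0 + 2.0 * x + u
X = np.column_stack([np.ones(G * T), x])

beta, se_hc1, se_cluster = ols_with_hc1_and_cluster_se(X, y, entity)
print(f"slope estimate:   {beta[1]:.3f}")
print(f"HC1 SE:           {se_hc1[1]:.3f}")
print(f"cluster SE:       {se_cluster[1]:.3f}")
print(f"cluster/HC1 ratio: {se_cluster[1] / se_hc1[1]:.2f}")
```

Because both the regressor and the error are correlated within entity in this simulation, the HC1 standard error understates the uncertainty and the cluster-robust one is noticeably larger. This is the kind of gap a systematic comparison is meant to surface.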

---

**End of Tutorial 07 — Standard Errors Series Complete!**

You have now completed a comprehensive survey of standard error methods for panel data. The key message throughout: **always match your SE method to the correlation structure in your data**, use `StandardErrorComparison` to systematically check robustness, and be transparent about SE choices in your reporting.