# Tutorial 01: Robust Standard Errors - Fundamentals

**Author**: PanelBox Development Team
**Date**: 2026-02-16
**Estimated Duration**: 45-60 minutes
**Prerequisites**: Basic econometrics, Python, pandas

---

## Learning Objectives

By the end of this tutorial, you will be able to:

1. Diagnose heteroskedasticity in panel data using residual plots and formal tests
2. Understand the difference between HC0, HC1, HC2, and HC3 robust standard errors
3. Apply robust standard errors to linear panel models (Pooled OLS and Fixed Effects)
4. Interpret the impact of heteroskedasticity on statistical inference
5. Choose appropriate robust standard error corrections for different data structures

---

## Table of Contents

1. [Setup and Data Loading](#setup)
2. [The Heteroskedasticity Problem](#problem)
3. [Robust Standard Error Variants (HC0-HC3)](#variants)
4. [Application to Panel Data](#application)
5. [Comparison and Interpretation](#comparison)
6. [Exercises](#exercises)
7. [References](#references)

---

<a id='setup'></a>
## 1. Setup and Data Loading

We'll start by importing the necessary libraries and loading the Grunfeld dataset.

In [None]:
# Standard imports
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import warnings
warnings.filterwarnings('ignore')

# PanelBox imports
import panelbox as pb
from panelbox.models.static import PooledOLS, FixedEffects

# Local utilities
import sys
sys.path.append('../utils')
from plotting import plot_residuals, plot_se_comparison, plot_heteroskedasticity_test
from diagnostics import test_heteroskedasticity

# Configuration
np.random.seed(42)
sns.set_style("whitegrid")
plt.rcParams['figure.dpi'] = 100

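# Define paths (create the figure folder so plt.savefig does not fail)
import os
DATA_PATH = '../data/'
FIG_PATH = '../outputs/figures/01_robust/'
os.makedirs(FIG_PATH, exist_ok=True)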

print("Setup complete!")

### Load Grunfeld Dataset

The Grunfeld dataset contains investment data for 10 firms over 20 years (1935-1954).

In [None]:
# Load data
data = pd.read_csv(DATA_PATH + 'grunfeld.csv')

# Display basic info
print(f"Shape: {data.shape}")
print(f"\nColumns: {list(data.columns)}")
print(f"\nEntities (firms): {data['firm_id'].nunique()}")
print(f"Time periods (years): {data['year'].nunique()}")
print(f"\nFirst few rows:")
data.head()

---

<a id='problem'></a>
## 2. The Heteroskedasticity Problem

### What is Heteroskedasticity?

**Heteroskedasticity** occurs when the variance of the error term is not constant across observations:

$$\text{Var}(u_i | X_i) = \sigma_i^2 \neq \sigma^2$$

In the presence of heteroskedasticity:
- OLS coefficients remain **unbiased** and **consistent**
- Standard errors are **biased** → invalid t-tests and confidence intervals
- Efficiency is lost (OLS is no longer BLUE)
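
A small simulated example can make these points concrete. The sketch below uses only NumPy and an assumed data generating process (error standard deviation equal to `x`, so the variance grows with `x²`); the variable names are illustrative and are not part of PanelBox.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5_000
x = rng.uniform(1, 10, n)

# Assumed DGP: error standard deviation equals x, so Var(u_i | x_i) = x_i**2
u = rng.normal(0.0, x)
y = 2.0 + 0.5 * x + u

# OLS fit via least squares
X = np.column_stack([np.ones(n), x])
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ beta_hat

# Residual variance within the lowest and highest thirds of x
lo_cut, hi_cut = np.quantile(x, [1 / 3, 2 / 3])
var_low = resid[x <= lo_cut].var()
var_high = resid[x >= hi_cut].var()

print(f"slope estimate (true value 0.5): {beta_hat[1]:.3f}")
print(f"residual variance, low x:  {var_low:.2f}")
print(f"residual variance, high x: {var_high:.2f}")
```

The slope estimate should land near the true value, while the residual variance differs sharply between the two ends of the `x` range, which is the pattern a residual plot reveals as a fan shape.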

### Diagnosing Heteroskedasticity

Let's estimate a simple investment model and check for heteroskedasticity.

In [None]:
# Estimate pooled OLS model
model_pooled = PooledOLS("invest ~ value + capital", data, "firm_id", "year")

result_pooled = model_pooled.fit()
print(result_pooled.summary())

# Extract fitted values and residuals
fitted = result_pooled.fittedvalues
residuals = result_pooled.resid

In [None]:
# Plot residuals vs fitted values
fig = plot_residuals(fitted, residuals,
                     title="Residuals vs Fitted Values - Grunfeld Data")
plt.savefig(FIG_PATH + 'residuals_vs_fitted.png', dpi=300, bbox_inches='tight')
plt.show()

# Interpretation
print("\n📊 Visual Inspection:")
print("Look for:")
print("  - Fan-shaped pattern → increasing variance")
print("  - Funnel pattern → decreasing variance")
print("  - Horizontal band → homoskedasticity (ideal)")

### Formal Tests for Heteroskedasticity

We'll use two standard tests:

1. **White Test**: General test with no specific functional form assumption
2. **Breusch-Pagan Test**: Assumes the variance is a linear function of the regressors
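
Both tests can be written as auxiliary regressions of the squared residuals. The sketch below is a minimal NumPy version of the LM (n·R²) form, using Koenker's studentized variant for Breusch-Pagan. It runs on simulated data and is an illustration only; it is not necessarily how the local `test_heteroskedasticity` helper is implemented, and `lm_statistic` is a hypothetical name.

```python
import numpy as np

def lm_statistic(resid, Z):
    """Return n * R^2 from regressing squared residuals on Z (Z includes a constant)."""
    u2 = resid ** 2
    gamma, *_ = np.linalg.lstsq(Z, u2, rcond=None)
    fitted = Z @ gamma
    ss_res = np.sum((u2 - fitted) ** 2)
    ss_tot = np.sum((u2 - u2.mean()) ** 2)
    return len(u2) * (1.0 - ss_res / ss_tot)

# Simulated heteroskedastic data (assumed DGP: error sd equals x)
rng = np.random.default_rng(1)
n = 400
x = rng.uniform(1, 10, n)
y = 2.0 + 0.5 * x + rng.normal(0.0, x)

X = np.column_stack([np.ones(n), x])
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ beta_hat

# Breusch-Pagan: auxiliary regressors are the original regressors (df = 1 here)
bp_stat = lm_statistic(resid, X)

# White: add squares (and cross-products when there are several regressors; df = 2 here)
Z_white = np.column_stack([np.ones(n), x, x ** 2])
white_stat = lm_statistic(resid, Z_white)

# 5% chi-square critical values: 3.84 (df = 1) and 5.99 (df = 2)
print(f"Breusch-Pagan LM: {bp_stat:.2f} (reject at 5% if > 3.84)")
print(f"White LM:         {white_stat:.2f} (reject at 5% if > 5.99)")
```

Under the null of homoskedasticity each statistic is asymptotically chi-square, with degrees of freedom equal to the number of auxiliary regressors excluding the constant.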

In [None]:
# Prepare regressor matrix (without constant)
X = data[['value', 'capital']].values

# White test
white_result = test_heteroskedasticity(residuals, X, test_type='white')
print("=" * 60)
print("WHITE TEST FOR HETEROSKEDASTICITY")
print("=" * 60)
print(white_result)
print("\n")

# Breusch-Pagan test
bp_result = test_heteroskedasticity(residuals, X, test_type='breusch_pagan')
print("=" * 60)
print("BREUSCH-PAGAN TEST FOR HETEROSKEDASTICITY")
print("=" * 60)
print(bp_result)

---

<a id='simulation'></a>
## 2.5 Monte Carlo Simulation: Understanding SE Bias

**Objective**: Demonstrate empirically that classical SEs are biased under heteroskedasticity, while robust SEs maintain valid inference.

### Simulation Design

**Data Generating Process (DGP)**:
- Sample size: N = 500
- True model: y = 2 + 0.5·x + ε
- **Heteroskedastic errors**: ε ~ N(0, σ²·x²) [variance proportional to x²]
- Replications: 1000

**Test**: H₀: β = 0.5 (the true value)
**Expected rejection rate**: 5% (if inference is valid)

> **Key Insight**: In this design the error variance rises with x, so classical SEs are systematically too small and tests over-reject (liberal tests). Robust SEs correct this bias.
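
Because the rejection rate is itself estimated from a finite number of replications, it helps to know how much simulation noise to expect around the nominal 5%. The short calculation below is a sketch using the binomial standard error; the variable names are illustrative.

```python
import math

n_simulations = 1000
nominal = 0.05

# Standard error of an estimated proportion when the true rejection rate is 5%
mc_se = math.sqrt(nominal * (1 - nominal) / n_simulations)

# Approximate 95% band for the empirical rejection rate of a correctly sized test
lower = nominal - 1.96 * mc_se
upper = nominal + 1.96 * mc_se

print(f"Monte Carlo SE: {mc_se:.4f}")
print(f"95% band: [{lower:.3f}, {upper:.3f}]")
```

With 1000 replications the band runs from roughly 3.6% to 6.4%, so a correctly sized test can occasionally fall just outside the 4-6% window used when the results are interpreted.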

In [None]:
# Monte Carlo Simulation
n_simulations = 1000
n_obs = 500
true_beta = 0.5

print("=" * 70)
print("MONTE CARLO SIMULATION: CLASSICAL vs ROBUST SE")
print("=" * 70)
print(f"Replications: {n_simulations}")
print(f"Sample size: {n_obs}")
print(f"True β₁: {true_beta}")
print("\nRunning simulation...")

# Storage for results
reject_classical = []
reject_robust = []
se_classical_list = []
se_robust_list = []

for sim in range(n_simulations):
    # Generate heteroskedastic data
    x = np.random.uniform(1, 10, n_obs)
    epsilon = np.random.normal(0, x, n_obs)  # scale is the std. dev.: sd = x, so variance ∝ x²
    y = 2 + true_beta * x + epsilon
    
    # Create DataFrame with required entity and time columns
    sim_data = pd.DataFrame({
        'y': y,
        'x': x,
        'entity': 1,  # Single entity for cross-sectional data
        'time': range(n_obs)  # Time index
    })
    
    # Estimate with classical SEs
    model_sim = PooledOLS("y ~ x", sim_data, "entity", "time")
    res_classical = model_sim.fit(cov_type='nonrobust')
    
    # Estimate with robust SEs (HC1)
    res_robust = model_sim.fit(cov_type='hc1')
    
    # Test H0: beta = true_beta
    beta_hat = res_classical.params['x']
    t_classical = (beta_hat - true_beta) / res_classical.std_errors['x']
    t_robust = (beta_hat - true_beta) / res_robust.std_errors['x']
    
    # Record rejection (|t| > 1.96)
    reject_classical.append(abs(t_classical) > 1.96)
    reject_robust.append(abs(t_robust) > 1.96)
    
    # Store SEs
    se_classical_list.append(res_classical.std_errors['x'])
    se_robust_list.append(res_robust.std_errors['x'])
    
    if (sim + 1) % 250 == 0:
        print(f"  {sim + 1}/{n_simulations} completed...")

print("\nSimulation complete!")

In [None]:
# Calculate empirical rejection rates
rejection_rate_classical = np.mean(reject_classical)
rejection_rate_robust = np.mean(reject_robust)

print("\n" + "=" * 70)
print("SIMULATION RESULTS")
print("=" * 70)
print(f"\nEmpirical Rejection Rates (H₀ is TRUE → should reject ~5%):")
print(f"  Classical SE:    {rejection_rate_classical:.3f} ({rejection_rate_classical*100:.1f}%)")
print(f"  Robust SE (HC1): {rejection_rate_robust:.3f} ({rejection_rate_robust*100:.1f}%)")
print()

print("Interpretation:")
if rejection_rate_classical > 0.07:
    print(f"  ✗ Classical: TOO LIBERAL (rejects {rejection_rate_classical*100:.1f}% > 5%)")
    print("     → Classical SEs are biased downward under heteroskedasticity")
    print("     → Leads to spurious significant findings (Type I error inflation)")
else:
    print("  ✓ Classical: Acceptable rejection rate")

if 0.04 <= rejection_rate_robust <= 0.06:
    print(f"  ✓ Robust: CORRECT SIZE (rejects {rejection_rate_robust*100:.1f}% ≈ 5%)")
    print("     → Robust SEs provide valid inference under heteroskedasticity")
else:
    print(f"  ~ Robust: {rejection_rate_robust*100:.1f}% (outside the 4-6% band around the nominal 5%)")

print("\n" + "=" * 70)
print("KEY TAKEAWAY")
print("=" * 70)
print("Under heteroskedasticity:")
print("  • Classical SEs → Invalid inference (over-rejection)")
print("  • Robust SEs → Valid inference (correct test size)")
print("  • Always use robust SEs when heteroskedasticity is suspected!")

In [None]:
# Visualize simulation results
fig, axes = plt.subplots(1, 2, figsize=(14, 5))

# Plot 1: Distribution of Standard Errors
ax = axes[0]
ax.hist(se_classical_list, bins=50, alpha=0.6, label='Classical',  
        color='steelblue', edgecolor='black', linewidth=0.5)
ax.hist(se_robust_list, bins=50, alpha=0.6, label='Robust (HC1)', 
        color='darkorange', edgecolor='black', linewidth=0.5)

mean_classical = np.mean(se_classical_list)
mean_robust = np.mean(se_robust_list)

ax.axvline(mean_classical, color='blue', linestyle='--', linewidth=2.5,
           label=f'Mean Classical: {mean_classical:.4f}')
ax.axvline(mean_robust, color='red', linestyle='--', linewidth=2.5,
           label=f'Mean Robust: {mean_robust:.4f}')

ax.set_xlabel('Standard Error', fontsize=12, fontweight='bold')
ax.set_ylabel('Frequency', fontsize=12, fontweight='bold')
ax.set_title('Distribution of Standard Errors\n(1000 Monte Carlo replications)', 
             fontsize=13, fontweight='bold')
ax.legend(fontsize=10)
ax.grid(alpha=0.3, axis='y')

# Plot 2: Rejection Rates
ax = axes[1]
methods = ['Classical\nSE', 'Robust SE\n(HC1)']
rates = [rejection_rate_classical, rejection_rate_robust]
colors = ['#d62728' if r > 0.07 else '#2ca02c' for r in rates]

bars = ax.bar(methods, rates, color=colors, alpha=0.7, edgecolor='black', linewidth=2)
ax.axhline(0.05, color='black', linestyle='--', linewidth=2.5, 
           label='Nominal Size (5%)', zorder=10)

ax.set_ylabel('Rejection Rate', fontsize=12, fontweight='bold')
ax.set_title('Empirical Rejection Rates\n(H₀ is TRUE)',
             fontsize=13, fontweight='bold')
ax.set_ylim(0, max(rates) * 1.3)
ax.legend(fontsize=10)
ax.grid(axis='y', alpha=0.3)

# Add value labels
for bar, rate in zip(bars, rates):
    height = bar.get_height()
    ax.text(bar.get_x() + bar.get_width()/2., height + 0.005,
            f'{rate:.3f}\n({rate*100:.1f}%)',
            ha='center', va='bottom', fontweight='bold', fontsize=11)

plt.tight_layout()
plt.savefig(FIG_PATH + 'monte_carlo_simulation.png', dpi=300, bbox_inches='tight')
plt.show()

print("\n✓ Simulation plots saved")

---

<a id='variants'></a>
## 3. Robust Standard Error Variants (HC0-HC3)

When heteroskedasticity is detected, we need to correct the standard errors. The **robust covariance matrix** (also called sandwich estimator or Huber-White estimator) is:

$$\hat{V}_{\text{robust}} = (X'X)^{-1} \left(\sum_{i=1}^n \hat{u}_i^2 x_i x_i'\right) (X'X)^{-1}$$

Several variants exist that differ in how they weight the residuals:

### HC0 (Original White)
$$\hat{V}_{\text{HC0}} = (X'X)^{-1} \left(\sum_{i=1}^n \hat{u}_i^2 x_i x_i'\right) (X'X)^{-1}$$

### HC1 (Degrees of Freedom Correction)
$$\hat{V}_{\text{HC1}} = \frac{n}{n-k} \hat{V}_{\text{HC0}}$$

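### HC2 (Leverage Correction)
$$\hat{u}_i^{(2)} = \frac{\hat{u}_i}{\sqrt{1 - h_i}}$$

where $h_i = x_i'(X'X)^{-1}x_i$ is the leverage of observation $i$. The adjusted residual $\hat{u}_i^{(2)}$ replaces $\hat{u}_i$ in the HC0 formula.

### HC3 (Davidson-MacKinnon)
$$\hat{u}_i^{(3)} = \frac{\hat{u}_i}{1 - h_i}$$

As with HC2, the adjusted residual $\hat{u}_i^{(3)}$ replaces $\hat{u}_i$ in the HC0 formula.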

**Recommendation**: HC3 is generally preferred in small samples as it provides better finite-sample properties.
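
The four formulas translate almost line by line into NumPy. The function below is a minimal sketch for a plain OLS regression on simulated data; `hc_covariances` is a hypothetical helper written for this illustration and is not PanelBox's implementation.

```python
import numpy as np

def hc_covariances(X, resid):
    """Return HC0-HC3 covariance matrices for OLS with design matrix X."""
    n, k = X.shape
    XtX_inv = np.linalg.inv(X.T @ X)
    # Leverage h_i = x_i' (X'X)^{-1} x_i
    h = np.einsum("ij,jk,ik->i", X, XtX_inv, X)

    def sandwich(u):
        # Meat: sum over i of u_i^2 * x_i x_i'
        meat = (X * u[:, None] ** 2).T @ X
        return XtX_inv @ meat @ XtX_inv

    hc0 = sandwich(resid)
    return {
        "HC0": hc0,
        "HC1": hc0 * n / (n - k),
        "HC2": sandwich(resid / np.sqrt(1.0 - h)),
        "HC3": sandwich(resid / (1.0 - h)),
    }

# Small simulated sample (assumed DGP: error sd equals x)
rng = np.random.default_rng(5)
x = rng.uniform(1, 10, 40)
y = 2.0 + 0.5 * x + rng.normal(0.0, x)
X = np.column_stack([np.ones(x.size), x])
n, k = X.shape

beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ beta_hat

covs = hc_covariances(X, resid)
se = {name: np.sqrt(np.diag(V)) for name, V in covs.items()}
for name, values in se.items():
    print(f"{name}: slope SE = {values[1]:.4f}")
```

Since $0 < h_i < 1$, the adjusted residuals are never smaller in magnitude than the raw ones, so the HC2 standard errors are at least as large as HC0 and the HC3 standard errors are at least as large as HC2.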

---

<a id='application'></a>
## 4. Application to Panel Data

Let's estimate the model with different robust SE variants and compare results.

In [None]:
# Re-estimate with different robust SE methods
se_types = ['classical', 'HC0', 'HC1', 'HC2', 'HC3']
results_dict = {}

for se_type in se_types:
    if se_type == 'classical':
        cov_type = 'nonrobust'
    else:
        cov_type = se_type.lower()

    result = model_pooled.fit(cov_type=cov_type)
    results_dict[se_type] = result

print("✓ Estimated models with all SE variants")

In [None]:
# Create comparison table
comparison_data = []

for var in ['value', 'capital']:
    for se_type in se_types:
        res = results_dict[se_type]
        comparison_data.append({
            'Variable': var,
            'SE Type': se_type,
            'Coefficient': res.params[var],
            'Std Error': res.std_errors[var],
            't-statistic': res.tvalues[var],
            'p-value': res.pvalues[var]
        })

comparison_df = pd.DataFrame(comparison_data)
print("\n" + "=" * 80)
print("COMPARISON OF STANDARD ERROR METHODS")
print("=" * 80)
print(comparison_df.to_string(index=False))

In [None]:
# Plot SE comparison for 'value' variable
estimates = {se: results_dict[se].params['value'] for se in se_types}
std_errors = {se: results_dict[se].std_errors['value'] for se in se_types}

fig = plot_se_comparison(
    coef_name='value',
    estimates=estimates,
    std_errors=std_errors,
    methods=se_types,
    title='Comparison of Standard Error Methods: Value Coefficient'
)
plt.savefig(FIG_PATH + 'se_comparison_value.png', dpi=300, bbox_inches='tight')
plt.show()

### Application to Fixed Effects Model

Now let's see how robust SEs work with entity fixed effects.
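
Entity fixed effects can be estimated by subtracting each entity's mean from every variable (the within transformation) and running OLS on the demeaned data; the sandwich formula is then applied to the demeaned regressors and residuals. The sketch below illustrates this on simulated data with NumPy. The data, the `demean_by_group` helper, and the HC0 choice are assumptions made for illustration and do not describe PanelBox internals.

```python
import numpy as np

rng = np.random.default_rng(2)
n_firms, n_years = 10, 20
firm = np.repeat(np.arange(n_firms), n_years)

# Firm effects that are correlated with the regressor
alpha = rng.normal(0.0, 5.0, n_firms)
x_within = rng.normal(0.0, 1.0, firm.size)
x = x_within + 0.5 * alpha[firm]

# Heteroskedastic errors: sd grows with |x_within|
u = rng.normal(0.0, 0.5 * (1.0 + np.abs(x_within)))
y = alpha[firm] + 1.5 * x + u

def demean_by_group(v, groups):
    """Subtract the group mean from each element of v."""
    means = np.bincount(groups, weights=v) / np.bincount(groups)
    return v - means[groups]

x_w = demean_by_group(x, firm)
y_w = demean_by_group(y, firm)

# Within (fixed effects) estimator for a single regressor
beta_fe = (x_w @ y_w) / (x_w @ x_w)
resid = y_w - beta_fe * x_w

# HC0 sandwich variance on the demeaned data
var_hc0 = np.sum(x_w ** 2 * resid ** 2) / (x_w @ x_w) ** 2
se_hc0 = np.sqrt(var_hc0)

print(f"FE slope (true value 1.5): {beta_fe:.3f}, HC0 SE: {se_hc0:.3f}")
```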

In [None]:
# Estimate fixed effects model
model_fe = FixedEffects("invest ~ value + capital", data, "firm_id", "year")

# Compare classical vs robust SEs
result_fe_classical = model_fe.fit(cov_type='nonrobust')
result_fe_robust = model_fe.fit(cov_type='hc3')

print("=" * 80)
print("FIXED EFFECTS: CLASSICAL vs ROBUST SE")
print("=" * 80)
print("\nClassical SE:")
print(result_fe_classical.summary())
print("\nRobust SE (HC3):")
print(result_fe_robust.summary())

---

<a id='comparison'></a>
## 5. Comparison and Interpretation

### Key Insights

1. **Magnitude of Correction**: How much do robust SEs differ from classical SEs?
2. **Inference Impact**: Do conclusions change when using robust SEs?
3. **Choice of Variant**: How much do HC0-HC3 differ in practice?

In [None]:
# Calculate SE ratios (robust/classical)
print("=" * 60)
print("STANDARD ERROR RATIOS (Robust / Classical)")
print("=" * 60)

for var in ['value', 'capital']:
    classical_se = results_dict['classical'].std_errors[var]
    print(f"\nVariable: {var}")
    print(f"  Classical SE: {classical_se:.6f}")

    for se_type in ['HC0', 'HC1', 'HC2', 'HC3']:
        robust_se = results_dict[se_type].std_errors[var]
        ratio = robust_se / classical_se
        print(f"  {se_type} SE: {robust_se:.6f} (ratio: {ratio:.3f})")

### 5.1 Understanding Leverage and HC2/HC3 Adjustments

**Leverage** ($h_i$) measures the influence of observation $i$ on its own fitted value.

High leverage observations:
- Are far from the mean in X-space
- Have large potential influence on regression
- Need special treatment in SE calculation

HC2 and HC3 adjust for leverage:
- **HC2**: Divides residuals by $\sqrt{1 - h_i}$
- **HC3**: Divides residuals by $(1 - h_i)$ (more aggressive)

Effect: Observations with high leverage get **larger weights** → more conservative SEs
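
Two properties of leverage explain the 2k/n rule of thumb used in the next cell: the leverages sum to the number of regressors k, so their mean is k/n, and a point is flagged when its leverage exceeds twice that mean. The sketch below checks this on simulated data with one deliberately extreme observation; all names and values are illustrative.

```python
import numpy as np

rng = np.random.default_rng(3)
n, k = 200, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, k - 1))])
X[0, 1:] = [8.0, -8.0]  # one observation far from the centre of X-space

# Diagonal of the hat matrix: h_i = x_i' (X'X)^{-1} x_i
XtX_inv = np.linalg.inv(X.T @ X)
h = np.einsum("ij,jk,ik->i", X, XtX_inv, X)

print(f"sum of leverages: {h.sum():.4f} (k = {k})")
print(f"mean leverage:    {h.mean():.4f} (k/n = {k / n:.4f})")
print(f"extreme point:    h = {h[0]:.4f}, threshold 2k/n = {2 * k / n:.4f}")
print(f"  HC2 factor 1/sqrt(1-h) = {1 / np.sqrt(1 - h[0]):.3f}")
print(f"  HC3 factor 1/(1-h)     = {1 / (1 - h[0]):.3f}")
```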

In [None]:
# Calculate leverage for Grunfeld data
from scipy.linalg import inv

# Reconstruct design matrix from data
# We need: intercept, value, capital
X_design = np.column_stack([
    np.ones(len(data)),  # Intercept
    data['value'].values,
    data['capital'].values
])

# Calculate hat matrix diagonal: h_i = X_i (X'X)^(-1) X_i'
XtX_inv = inv(X_design.T @ X_design)
leverage = np.sum(X_design @ XtX_inv * X_design, axis=1)

print("=" * 70)
print("LEVERAGE STATISTICS")
print("=" * 70)
print(f"Mean leverage: {leverage.mean():.4f}")
print(f"Max leverage: {leverage.max():.4f}")
print(f"Min leverage: {leverage.min():.4f}")
print(f"Std leverage: {leverage.std():.4f}")
print(f"\nHigh leverage threshold (2k/n): {2 * X_design.shape[1] / len(X_design):.4f}")
high_leverage = leverage > (2 * X_design.shape[1] / len(X_design))
print(f"Observations with high leverage: {high_leverage.sum()} ({high_leverage.sum()/len(leverage)*100:.1f}%)")

# Calculate HC adjustment factors
hc1_factor = np.sqrt(len(leverage) / (len(leverage) - X_design.shape[1]))  # Constant
hc2_factor = 1 / np.sqrt(1 - leverage)
hc3_factor = 1 / (1 - leverage)

print(f"\nAdjustment factors:")
print(f"  HC1: {hc1_factor:.4f} (constant)")
print(f"  HC2: {hc2_factor.mean():.4f} (mean), max={hc2_factor.max():.4f}")
print(f"  HC3: {hc3_factor.mean():.4f} (mean), max={hc3_factor.max():.4f}")

In [None]:
# Visualize leverage and HC adjustments
fig, axes = plt.subplots(1, 2, figsize=(14, 5))

# Plot 1: Leverage distribution
ax = axes[0]
ax.hist(leverage, bins=30, edgecolor='black', alpha=0.7, color='steelblue')
ax.axvline(leverage.mean(), color='red', linestyle='--', linewidth=2, 
           label=f'Mean: {leverage.mean():.4f}')
ax.axvline(2 * X_design.shape[1] / len(X_design), color='orange', 
           linestyle='--', linewidth=2, label='High leverage threshold (2k/n)')
ax.set_xlabel('Leverage (h)', fontsize=12, fontweight='bold')
ax.set_ylabel('Frequency', fontsize=12, fontweight='bold')
ax.set_title('Distribution of Leverage Values', fontsize=13, fontweight='bold')
ax.legend()
ax.grid(alpha=0.3, axis='y')

# Plot 2: HC adjustment factors vs leverage
ax = axes[1]
sorted_idx = np.argsort(leverage)
ax.plot(leverage[sorted_idx], hc2_factor[sorted_idx], 
        label='HC2: 1/√(1-h)', linewidth=2.5, color='green')
ax.plot(leverage[sorted_idx], hc3_factor[sorted_idx], 
        label='HC3: 1/(1-h)', linewidth=2.5, color='red')
ax.axhline(hc1_factor, color='blue', linestyle='--', linewidth=2, 
           label=f'HC1: {hc1_factor:.3f} (constant)')

ax.set_xlabel('Leverage (h)', fontsize=12, fontweight='bold')
ax.set_ylabel('Adjustment Factor', fontsize=12, fontweight='bold')
ax.set_title('HC2 and HC3 Leverage Adjustments', fontsize=13, fontweight='bold')
ax.legend()
ax.grid(alpha=0.3)

plt.tight_layout()
plt.savefig(FIG_PATH + 'leverage_adjustments.png', dpi=300, bbox_inches='tight')
plt.show()

print("\n" + "=" * 70)
print("INTERPRETATION")
print("=" * 70)
print("• HC1: Constant adjustment (simple degrees-of-freedom correction)")
print("• HC2: Moderate leverage adjustment (√ correction)")
print("• HC3: Aggressive leverage adjustment (more conservative)")
print("\nAs leverage increases:")
print("  → HC2/HC3 adjustment factors increase")
print("  → Residuals get weighted more heavily")
print("  → Standard errors become larger (more conservative)")
print("\n✓ Visualization saved")

### Summary of Findings

**When to Use Robust Standard Errors:**

✅ **Always use** when:
- Heteroskedasticity is suspected or detected
- Sample size is moderate to large (n > 50)
- You want inference robust to misspecification

⚠️ **Be cautious** when:
- Sample size is very small (n < 30)
- Model is severely misspecified
- High leverage points are present

**Which Variant?**

- **HC0**: Original White, can underestimate SEs in small samples
- **HC1**: Simple df correction, slightly better than HC0 in small samples
- **HC2**: Leverage adjustment, good theoretical properties
- **HC3**: **Recommended default** - best small-sample performance

**PanelBox Default**: HC3 (when `cov_type='robust'`)

---

<a id='practices'></a>
## 5.2 Best Practices and Reporting Guidelines

### When to Use Robust Standard Errors

✅ **ALWAYS use robust SEs** when:
- Working with cross-sectional data
- Heteroskedasticity is detected (White/BP test)
- You want inference robust to misspecification
- Sample size is moderate to large (n > 50)
- Publishing in applied economics/finance journals

⚠️ **Be cautious** when:
- Sample size is very small (n < 30): consider bootstrap
- Model is severely misspecified: fix model first
- Many high leverage points: investigate outliers

### Which HC Variant to Choose?

**Decision Tree**:

1. **Large samples (n > 500)**:
   - Use HC0 or HC1 (similar performance)
   - Stata default: HC1

2. **Moderate samples (100 < n < 500)**:
   - **Use HC1** (recommended default)
   - Matches Stata `robust` option
   - Good balance of properties

3. **Small samples (50 < n < 100)**:
   - Use HC2 or HC3
   - HC3 is more conservative (recommended)

4. **Very small samples (n < 50)**:
   - **Use HC3** (Long & Ervin 2000 recommendation)
   - Or consider bootstrap methods

**PanelBox Default**: `cov_type='robust'` → HC3
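
The decision tree can be encoded as a small helper so that the choice is made once, before looking at results. This is a sketch: `suggest_hc_variant` is a hypothetical function, and how the exact boundary values (n = 100, n = 500) are assigned is an assumption, since the tree above leaves them open.

```python
def suggest_hc_variant(n_obs: int) -> str:
    """Map a sample size to a cov_type string following the decision tree above."""
    if n_obs < 100:
        # Small and very small samples: HC3 is the conservative choice
        return "hc3"
    # Moderate samples: HC1. Large samples: HC0 and HC1 behave similarly,
    # so HC1 is returned for both.
    return "hc1"

for n_obs in (40, 80, 200, 1000):
    print(f"n = {n_obs:>4}: {suggest_hc_variant(n_obs)}")
```

The returned string can be passed as the `cov_type` argument in the same way as the `'hc1'` and `'hc3'` values used earlier in this tutorial.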

---

### How to Report in Academic Papers

#### Table Notes Example:

```
Note: Robust standard errors (HC1) in parentheses.  
*, **, *** denote significance at 10%, 5%, and 1% levels.
```
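
A table cell in this style can be produced with a short formatting helper. The sketch below is illustrative: `significance_stars` and `format_cell` are hypothetical names, and the numbers passed in are made up for the example.

```python
def significance_stars(p_value: float) -> str:
    """Return *, **, or *** for the 10%, 5%, and 1% significance levels."""
    if p_value < 0.01:
        return "***"
    if p_value < 0.05:
        return "**"
    if p_value < 0.10:
        return "*"
    return ""

def format_cell(coef: float, se: float, p_value: float, digits: int = 3) -> str:
    """Format a coefficient with stars and its standard error in parentheses."""
    return f"{coef:.{digits}f}{significance_stars(p_value)} ({se:.{digits}f})"

# Hypothetical values, for illustration only
print(format_cell(0.110, 0.051, 0.03))   # 0.110** (0.051)
print(format_cell(0.110, 0.071, 0.12))   # 0.110 (0.071)
```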

#### Text Reporting (when results are robust):

> "Our results are robust to the choice of standard error method. Table A2 in the Appendix presents results using nonrobust, HC1, HC2, and HC3 standard errors. While robust standard errors are larger than classical (as expected given evidence of heteroskedasticity; White test: χ²=28.4, p<0.001), all key coefficients remain statistically significant at conventional levels."

#### Text Reporting (when results are SENSITIVE):

> "We note that the coefficient on firm value is significant using classical standard errors (β=0.110, SE=0.051, p=0.03) but not with robust standard errors (β=0.110, SE=0.071, p=0.12). Given strong evidence of heteroskedasticity (White test: χ²=45.3, p<0.001; residual plots show clear fan pattern), we rely on robust inference and conclude that the effect of firm value on investment is **not statistically significant**."

### Common Pitfalls to Avoid

❌ **Pitfall 1**: "Robust = Better Model"
- Wrong: "My model is robust because I used robust SEs"
- Right: "My inference is valid under heteroskedasticity due to robust SEs"
- **Robustness** = trying different specifications, samples, estimation methods

❌ **Pitfall 2**: Ignoring Large SE Differences
- If robust SEs are 2x-3x larger than classical:
  - Investigate why (severe heteroskedasticity? outliers?)
  - May indicate model misspecification
  - Consider: omitted variables, wrong functional form, outliers

❌ **Pitfall 3**: Selective Reporting (p-hacking)
- Don't try HC0, HC1, HC2, HC3 and report only the one with p<0.05
- **Pre-specify** SE choice in analysis plan
- Or report all variants as robustness check

❌ **Pitfall 4**: Using Robust SEs as a "Fix" for Bad Models
- Robust SEs correct inference, not bad modeling
- If heteroskedasticity is severe, consider:
  - Transform dependent variable (log, sqrt)
  - Weighted least squares (WLS)
  - Model variance explicitly
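
When the form of the variance is known, weighted least squares recovers the efficiency that OLS loses. The sketch below compares the sampling spread of the OLS and WLS slope estimates in a small simulation with NumPy. It assumes the error standard deviation equals `x`, so the correct weights are 1/x²; in practice the variance function is rarely known this precisely.

```python
import numpy as np

rng = np.random.default_rng(4)
n_obs, n_reps = 200, 500
slopes_ols, slopes_wls = [], []

for _ in range(n_reps):
    x = rng.uniform(1, 10, n_obs)
    y = 2.0 + 0.5 * x + rng.normal(0.0, x)  # assumed DGP: error sd equals x
    X = np.column_stack([np.ones(n_obs), x])

    # OLS: every observation weighted equally
    beta_ols = np.linalg.lstsq(X, y, rcond=None)[0]

    # WLS with weights 1/x^2: OLS after scaling each row by sqrt(weight) = 1/x
    sqrt_w = 1.0 / x
    beta_wls = np.linalg.lstsq(X * sqrt_w[:, None], y * sqrt_w, rcond=None)[0]

    slopes_ols.append(beta_ols[1])
    slopes_wls.append(beta_wls[1])

sd_ols = np.std(slopes_ols)
sd_wls = np.std(slopes_wls)
print(f"mean slope: OLS {np.mean(slopes_ols):.3f}, WLS {np.mean(slopes_wls):.3f}")
print(f"sd of slope estimates: OLS {sd_ols:.3f}, WLS {sd_wls:.3f}")
```

Both estimators are centred on the true slope of 0.5, but the WLS estimates are less dispersed, which is the efficiency gain that robust SEs alone do not provide.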

---

<a id='exercises'></a>
## 6. Exercises

### Exercise 1: Heteroskedasticity Diagnosis (Easy)

**Task**: Generate a dataset with known heteroskedasticity and diagnose it.

**Requirements**:
1. Generate data with multiplicative heteroskedasticity: ε ~ N(0, σ²·x²)
2. Estimate model and create residual plot
3. Perform White test
4. Compare classical vs robust (HC3) SEs

**Starter Code**:

In [None]:
# Exercise 1: Your code here
np.random.seed(123)
n = 200

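# Step 1: Generate heteroskedastic data
x = np.random.uniform(1, 5, n)
epsilon = np.random.normal(0, x, n)  # scale is the std. dev.: sd = x, so variance ∝ x²
y = 1 + 2*x + epsilon

# PooledOLS needs entity and time columns, as in the Monte Carlo cell above
ex1_data = pd.DataFrame({'y': y, 'x': x, 'entity': 1, 'time': range(n)})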

# Step 2: Estimate model
# YOUR CODE: Estimate PooledOLS model

# Step 3: Create residual plot
# YOUR CODE: Plot residuals vs fitted values

# Step 4: White test
# YOUR CODE: Perform White test

# Step 5: Compare SEs
# YOUR CODE: Estimate with classical and HC3, compare results

### Exercise 2: Confidence Interval Coverage (Moderate)

**Task**: Verify empirically that 95% CIs have correct coverage under heteroskedasticity.

**Requirements**:
1. Simulate 1000 datasets with heteroskedastic errors
2. For each, construct 95% CI using classical and robust SEs
3. Calculate coverage rate (% of times CI contains true β)
4. **Expected**: Classical < 95%, Robust ≈ 95%

**Starter Code**:

In [None]:
# Exercise 2: Your code here
from scipy import stats

n_sims = 1000
n_obs = 200
true_beta = 1.5
coverage_classical = []
coverage_robust = []

for sim in range(n_sims):
    # Generate heteroskedastic data
    x = np.random.uniform(1, 5, n_obs)
    eps = np.random.normal(0, x**1.5, n_obs)
    y = 1 + true_beta * x + eps
    
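    # PooledOLS needs entity and time columns, as in the Monte Carlo cell above
    sim_data = pd.DataFrame({'y': y, 'x': x, 'entity': 1, 'time': range(n_obs)})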
    
    # YOUR CODE:
    # 1. Estimate model with classical and robust SEs
    # 2. Construct 95% CIs for both
    # 3. Check if true_beta is in CI
    # 4. Append to coverage_classical and coverage_robust lists
    
    pass  # Remove this and add your code

# Calculate coverage rates
# YOUR CODE: Calculate mean of coverage lists

print(f"Classical CI coverage: {np.mean(coverage_classical):.3f} (nominal: 0.95)")
print(f"Robust CI coverage: {np.mean(coverage_robust):.3f} (nominal: 0.95)")

### Exercise 3: Real Data Application (Challenging)

**Task**: Apply robust inference to wage panel data.

**Requirements**:
1. Load `wage_panel.csv` dataset
2. Estimate wage equation: `wage ~ education + experience + tenure`
3. Diagnose heteroskedasticity (visual + formal tests)
4. Compare all HC variants (HC0, HC1, HC2, HC3)
5. Create publication-ready table
6. Write 1-paragraph interpretation

**Starter Code**:

In [None]:
# Exercise 3: Your code here

# Load wage data
wage_data = pd.read_csv(DATA_PATH + 'wage_panel.csv')

print("Wage Panel Data:")
print(f"Shape: {wage_data.shape}")
print(f"\nVariables: {list(wage_data.columns)}")
print(f"\nSample:")
display(wage_data.head())

# YOUR CODE:
# 1. Estimate PooledOLS: wage ~ education + experience + tenure
# 2. Create residual plots
# 3. Run heteroskedasticity tests
# 4. Estimate with all HC variants
# 5. Use StandardErrorComparison to create comparison table
# 6. Plot results
# 7. Write interpretation

# Example structure:
# model_wage = PooledOLS("wage ~ education + experience + tenure", wage_data, "entity_id", "time_id")
# result = model_wage.fit(cov_type='nonrobust')
# ...

print("\n" + "=" * 70)
print("INTERPRETATION (write your own!):")
print("=" * 70)
print("""
[Your interpretation here - discuss:]
- Economic meaning of coefficients
- Evidence of heteroskedasticity
- Impact of SE choice on inference
- Recommendation for preferred specification
""")

---

<a id='summary'></a>
## 6.5 Summary and Key Takeaways

### What We Learned

1. **Heteroskedasticity** is common in real data and invalidates classical standard errors
   - OLS coefficients remain unbiased
   - But classical SEs are biased → invalid t-tests and CIs

2. **Diagnosis is important**, but robust SEs remain valid even when the tests do not reject homoskedasticity
   - Visual: Residual plots (look for cone/funnel patterns)
   - Formal: White test, Breusch-Pagan test

3. **Monte Carlo evidence** confirms:
   - Classical SEs: Liberal tests (over-rejection) under heteroskedasticity
   - Robust SEs: Maintain correct test size (~5%)

4. **HC variants** (HC0-HC3) differ in their finite-sample corrections:
   - HC0: Baseline (White 1980)
   - HC1: DF correction **(Stata default, recommended for general use)**
   - HC2: Moderate leverage adjustment
   - HC3: Aggressive leverage adjustment **(PanelBox default, best for small samples)**

5. **Reporting** multiple SE types demonstrates robustness of findings

### Key Formulas

**Classical Variance** (WRONG under heteroskedasticity):
$$\text{Var}(\hat{\beta}) = \sigma^2 (X'X)^{-1}$$

**Robust Variance (HC1)** (CORRECT under heteroskedasticity):
$$\hat{V}_{\text{HC1}} = \frac{n}{n-k} (X'X)^{-1} \left[\sum_{i=1}^n \hat{u}_i^2 x_i x_i' \right] (X'X)^{-1}$$
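
Why the classical formula fails can be seen in one step. The sketch below uses the notation $\sigma_i^2$ from Section 2 and assumes independent errors; the matrix $\Omega = \text{diag}(\sigma_1^2, \dots, \sigma_n^2)$ is introduced here only for convenience. Conditional on $X$, the variance of the OLS estimator is

$$\text{Var}(\hat{\beta} \mid X) = (X'X)^{-1} X' \Omega X (X'X)^{-1} = (X'X)^{-1} \left[\sum_{i=1}^n \sigma_i^2 x_i x_i'\right] (X'X)^{-1}$$

If $\sigma_i^2 = \sigma^2$ for all $i$, then $X' \Omega X = \sigma^2 X'X$ and the expression collapses to the classical formula:

$$(X'X)^{-1} \left(\sigma^2 X'X\right) (X'X)^{-1} = \sigma^2 (X'X)^{-1}$$

Otherwise the middle term does not cancel, and the HC estimators replace the unknown $\sigma_i^2$ with (possibly rescaled) squared residuals.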

### Decision Rules

**When to use robust SEs?**
- ✅ Always in cross-sectional data
- ✅ When heteroskedasticity detected
- ✅ When publishing applied research

**Which variant?**
- Large samples (n>500): HC0 or HC1
- Moderate samples (100<n<500): **HC1** (recommended)
- Small samples (n<100): **HC3** (more conservative)

**In PanelBox**:
```python
# Default robust (HC3)
result = model.fit(cov_type='robust')

# Specific variant
result = model.fit(cov_type='hc1')  # Matches Stata
```

---

### Connection to Next Tutorials

➡️ **Tutorial 02: Clustered Standard Errors**

**Why?** Robust SEs (HC0-HC3) handle **heteroskedasticity** but assume **independence** across observations.

**Problem in panel data**: Observations within the same entity (firm, individual, country) are often **correlated over time**.

**Solution**: **Clustered standard errors** account for:
- Within-cluster correlation (e.g., observations from same firm)
- Both heteroskedasticity AND correlation

**Example**:
- Firm-level data: Cluster by firm_id
- Country-year data: Cluster by country (or two-way clustering)

**Preview**:
```python
# Next tutorial: Clustering
result = model.fit(cov_type='clustered', cluster_entity=True)

# Or two-way clustering
result = model.fit(cov_type='twoway')
```

---

➡️ **Tutorial 03: HAC Standard Errors (Newey-West)**

For time-series and panel data with **autocorrelation**, we'll learn:
- Heteroskedasticity and Autocorrelation Consistent (HAC) SEs
- Newey-West estimator
- Choosing optimal lag length

---

**Learning Path**:
1. ✅ **Tutorial 01**: Robust SEs (heteroskedasticity)
2. ⏭️ **Tutorial 02**: Clustered SEs (correlation within clusters)
3. ⏭️ **Tutorial 03**: HAC SEs (autocorrelation over time)
4. ⏭️ **Tutorial 04**: Spatial SEs (spatial correlation)

---

<a id='references'></a>
## 7. References

### Key Papers

1. **White, H. (1980)**. "A Heteroskedasticity-Consistent Covariance Matrix Estimator and a Direct Test for Heteroskedasticity". *Econometrica*, 48(4), 817-838.

2. **MacKinnon, J. G., & White, H. (1985)**. "Some Heteroskedasticity-Consistent Covariance Matrix Estimators with Improved Finite Sample Properties". *Journal of Econometrics*, 29(3), 305-325.

3. **Long, J. S., & Ervin, L. H. (2000)**. "Using Heteroscedasticity Consistent Standard Errors in the Linear Regression Model". *The American Statistician*, 54(3), 217-224.

### Software Documentation

- [PanelBox Documentation](https://panelbox.readthedocs.io/)
- [Robust Covariance Guide](https://panelbox.readthedocs.io/robust-inference.html)

### Next Tutorial

➡️ **Tutorial 02**: Clustered Standard Errors for Panel Data

---

**End of Tutorial 01**