# Honest DiD: Sensitivity Analysis for Parallel Trends

The **parallel trends assumption** is crucial for difference-in-differences (DiD) validity, but it cannot be tested directly because it concerns counterfactual outcomes that are never observed. **Honest DiD** (Rambachan & Roth 2023) provides a framework for:

1. Relaxing the parallel trends assumption
2. Computing bounds on treatment effects under potential violations
3. Constructing robust confidence intervals that remain valid when parallel trends is violated, provided the violation stays within a stated bound
4. Computing "breakdown values" showing how much violation is needed to nullify results

This notebook covers:
1. Motivation: Why standard event studies can be misleading
2. Basic usage with `HonestDiD`
3. Interpreting bounds and breakdown values
4. Sensitivity analysis over a grid of M values
5. Visualization
6. Advanced: Smoothness restrictions

In [None]:
import numpy as np
import pandas as pd
from diff_diff import MultiPeriodDiD
from diff_diff.honest_did import (
    HonestDiD,
    compute_honest_did,
    DeltaSD,
    DeltaRM,
)

# For plots
try:
    import matplotlib.pyplot as plt
    plt.style.use('seaborn-v0_8-whitegrid')
    HAS_MATPLOTLIB = True
except ImportError:
    HAS_MATPLOTLIB = False
    print("matplotlib not installed - visualization examples will be skipped")

## 1. Motivation: The Problem with Pre-trend Testing

Researchers often test for parallel trends by checking if pre-treatment coefficients are statistically significant. However, this approach has serious problems:

1. **Low power**: With typical sample sizes, we may fail to detect real violations
2. **Pre-test bias**: Conditioning on passing a pre-trends test biases inference
3. **Post-treatment violations**: Even if pre-trends look good, post-treatment violations can occur

**Honest DiD addresses these issues by:**
- Not requiring parallel trends to hold exactly
- Allowing for bounded violations related to observed pre-trends
- Providing valid inference under these weaker assumptions
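
To make the low-power point concrete, the sketch below computes the analytical power of a two-sided z-test on a single pre-period coefficient. It assumes a normal approximation, and the standard error and violation sizes are hypothetical numbers chosen for illustration, not values from this notebook.

```python
from statistics import NormalDist

def pretest_power(violation, se, alpha=0.05):
    """Probability that a two-sided z-test rejects 'no pre-trend'
    when the true pre-period violation equals `violation`."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    shift = violation / se
    # Reject if the z-statistic lands in either tail
    return std_normal.cdf(-z_crit - shift) + std_normal.cdf(-z_crit + shift)

# Hypothetical setting: standard error of 0.5 on the pre-period coefficient
for violation in [0.0, 0.5, 1.0, 1.5]:
    power = pretest_power(violation, se=0.5)
    print(f"violation={violation:.1f}  power={power:.3f}")
```

Under these assumptions, a true violation equal to one standard error is detected less than 20% of the time, so passing a pre-trends test is weak evidence that parallel trends holds.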

## 2. Generate Example Data

We'll create panel data with:
- A true treatment effect of 5.0
- Some pre-trend violations (to make results interesting)

In [None]:
def generate_did_data(n_units=200, n_periods=10, true_att=5.0, 
                      pre_trend_violation=0.3, seed=42):
    """
    Generate panel data with potential parallel trends violations.
    
    Parameters
    ----------
    pre_trend_violation : float
        Slope of a differential linear trend for treated units, applied in
        every period (pre and post). 0 = perfect parallel trends; larger
        values mean a larger violation.
    """
    np.random.seed(seed)
    treatment_time = n_periods // 2
    
    data = []
    for unit in range(n_units):
        is_treated = unit < n_units // 2
        unit_effect = np.random.normal(0, 2)
        
        for period in range(n_periods):
            # Common time trend
            time_effect = period * 1.0
            
            # Add differential pre-trend for treated (parallel trends violation)
            if is_treated:
                time_effect += pre_trend_violation * (period - treatment_time)
            
            y = 10.0 + unit_effect + time_effect
            
            # Treatment effect
            post = period >= treatment_time
            if is_treated and post:
                y += true_att
            
            y += np.random.normal(0, 1)
            
            data.append({
                'unit': unit,
                'period': period,
                'treated': int(is_treated),
                'post': int(post),
                'outcome': y
            })
    
    return pd.DataFrame(data)

# Generate data with mild pre-trend violation
df = generate_did_data(pre_trend_violation=0.2)
print(f"Generated {len(df)} observations")
print(f"Treatment time: period 5")
print(f"True ATT: 5.0")
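
Because the generator applies the differential trend in every period, a simple post-minus-pre comparison is biased by a known amount. The arithmetic below is a sketch of that bias for the settings used above (slope 0.2, treatment at period 5, 10 periods). It describes a plain comparison of means, which is an assumption made for illustration; the event-study estimator may use a different reference period.

```python
# Differential trend used in generate_did_data: violation * (period - treatment_time)
violation = 0.2
treatment_time = 5
n_periods = 10

delta = [violation * (p - treatment_time) for p in range(n_periods)]
pre_mean = sum(delta[:treatment_time]) / treatment_time
post_mean = sum(delta[treatment_time:]) / (n_periods - treatment_time)

# Bias of a simple post-minus-pre comparison of treated-control gaps
bias = post_mean - pre_mean
print(f"Mean violation, pre-periods:  {pre_mean:+.2f}")
print(f"Mean violation, post-periods: {post_mean:+.2f}")
print(f"Implied bias: {bias:+.2f} (naive estimate near {5.0 + bias:.2f} vs. true ATT 5.0)")
```

A slope of 0.2 therefore shifts a simple pre/post comparison by about 1.0, which is the kind of gap the sensitivity analysis in later sections is meant to account for.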

## 3. Fit Standard Event Study

First, let's estimate a standard event study using `MultiPeriodDiD`.

In [None]:
# Fit event study
mp_did = MultiPeriodDiD()
event_results = mp_did.fit(
    df,
    outcome='outcome',
    treatment='treated',
    time='period',
    post_periods=[5, 6, 7, 8, 9]
)

print(event_results.summary())

In [None]:
from diff_diff.visualization import plot_event_study

if HAS_MATPLOTLIB:
    fig, ax = plt.subplots(figsize=(10, 6))
    plot_event_study(
        event_results,
        ax=ax,
        title='Standard Event Study',
        show=False
    )
    plt.tight_layout()
    plt.show()

## 4. Basic Honest DiD: Relative Magnitudes

The **relative magnitudes** approach bounds post-treatment violations by $\bar{M}$ times the maximum observed pre-treatment violation:

$$|\delta_{post}| \leq \bar{M} \times \max(|\delta_{pre}|)$$

Where:
- $\delta_t$ is the violation of parallel trends at time $t$
- $\bar{M} = 1$ means post-treatment violations can be as bad as the worst pre-treatment violation
- $\bar{M} = 0$ is equivalent to assuming parallel trends holds exactly
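
As a worked illustration of the bound above, the sketch below computes the identified set for a single post-period effect. It treats the pre-period coefficients as the observed violations and ignores sampling noise. The function name `rm_identified_set` and all numbers are hypothetical; this is not the library's implementation.

```python
def rm_identified_set(post_estimate, pre_coefs, m_bar):
    """Identified set for one post-period effect under the bound
    |delta_post| <= m_bar * max|delta_pre| (sampling noise ignored)."""
    max_pre = max(abs(c) for c in pre_coefs)
    bound = m_bar * max_pre
    return post_estimate - bound, post_estimate + bound

# Hypothetical event-study coefficients (not output from this notebook)
pre_coefs = [0.4, -0.2, 0.1]   # pre-period estimates, read as observed violations
post_estimate = 5.0            # post-period estimate = effect + violation

for m_bar in [0.0, 0.5, 1.0, 2.0]:
    lb, ub = rm_identified_set(post_estimate, pre_coefs, m_bar)
    print(f"M-bar={m_bar:.1f}: identified set = [{lb:.2f}, {ub:.2f}]")
```

With $\bar{M} = 0$ the set collapses to the point estimate, and each increase in $\bar{M}$ widens it by the largest pre-period violation on each side.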

In [None]:
# Create HonestDiD estimator
honest = HonestDiD(
    method='relative_magnitude',
    M=1.0,  # Post-treatment violations up to 1x max pre-treatment violation
    alpha=0.05
)

# Compute bounds
honest_results = honest.fit(event_results)

print(honest_results.summary())

### Interpreting the Results

The output shows:

1. **Original Estimate**: The point estimate assuming parallel trends (standard DiD)

2. **Identified Set**: The range of treatment effects consistent with the data *and* our assumptions about violations. It widens as M increases.

3. **Robust CI**: A confidence interval that covers the true effect with at least 95% probability, whichever value in the identified set is correct, provided the actual violation satisfies the chosen restriction.

4. **Effect robust to violations**: Whether the robust CI excludes zero. If it does, the effect remains significant under any violation allowed by the chosen M.
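
The difference between the identified set and the robust CI can be sketched with a simple conservative construction: widen the usual interval by the largest bias the restriction allows. This is only an illustration under the assumption that the bias bound is known. Rambachan and Roth propose tighter procedures, and the library's intervals may be computed differently. The function name and inputs are hypothetical.

```python
from statistics import NormalDist

def conservative_robust_ci(estimate, se, bias_bound, alpha=0.05):
    """Widen the usual CI by the largest bias allowed by the restriction.
    Covers the true effect with probability >= 1 - alpha whenever the
    actual bias is no larger than `bias_bound` in absolute value."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    half_width = bias_bound + z * se
    return estimate - half_width, estimate + half_width

# Hypothetical inputs: estimate 5.0, standard error 0.3, bias bound 0.4
estimate, se, bias_bound = 5.0, 0.3, 0.4
identified_set = (estimate - bias_bound, estimate + bias_bound)
ci_lb, ci_ub = conservative_robust_ci(estimate, se, bias_bound)
is_significant = ci_lb > 0 or ci_ub < 0

print(f"Identified set: [{identified_set[0]:.3f}, {identified_set[1]:.3f}]")
print(f"Robust CI:      [{ci_lb:.3f}, {ci_ub:.3f}]")
print(f"Excludes zero:  {is_significant}")
```

The identified set reflects uncertainty about the violation only, while the robust CI adds sampling uncertainty on top, so it always contains the identified set.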

In [None]:
# Key results
print(f"Original estimate: {honest_results.original_estimate:.4f}")
print(f"Identified set: [{honest_results.lb:.4f}, {honest_results.ub:.4f}]")
print(f"Robust 95% CI: [{honest_results.ci_lb:.4f}, {honest_results.ci_ub:.4f}]")
print(f"CI width: {honest_results.ci_width:.4f}")
print()
print(f"Effect robust to M={honest_results.M} violations: {honest_results.is_significant}")

## 5. Sensitivity Analysis

A key feature of Honest DiD is examining how results change as we allow larger violations. This helps answer: "How much would parallel trends need to be violated to overturn our conclusions?"

In [None]:
# Run sensitivity analysis over a grid of M values
sensitivity = honest.sensitivity_analysis(
    event_results,
    M_grid=[0, 0.25, 0.5, 0.75, 1.0, 1.5, 2.0, 3.0]
)

print(sensitivity.summary())

In [None]:
# Key takeaway: the breakdown value
print(f"Breakdown value: {sensitivity.breakdown_M}")
print("")
if sensitivity.breakdown_M is not None:
    print(f"The result is robust to violations up to M = {sensitivity.breakdown_M:.2f}")
    print("This means post-treatment trend violations could be up to")
    print(f"{sensitivity.breakdown_M:.1f}x the worst pre-treatment violation")
    print("and the robust CI would still exclude zero.")
else:
    print("No breakdown found on this grid - the robust CI excludes zero for every M tested.")

In [None]:
# Visualize the sensitivity analysis
if HAS_MATPLOTLIB:
    fig, ax = plt.subplots(figsize=(10, 6))
    sensitivity.plot(ax=ax, show=False)
    plt.tight_layout()
    plt.show()

### Reading the Sensitivity Plot

- **X-axis (M)**: How much we allow post-treatment violations relative to pre-treatment violations
- **Shaded region**: The identified set (range of possible treatment effects)
- **Blue lines**: Robust confidence interval
- **Red dashed line**: Breakdown value (where CI first includes zero)
- **Black line**: Original estimate (under parallel trends)

As M increases:
- The identified set widens (more possible violations)
- Eventually, the CI includes zero (we can no longer rule out no effect)
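
The pattern described above can be reproduced by hand with a small grid loop. The sketch below uses a conservative widening of the identified set by `z * se` and hypothetical inputs. It is meant to show the mechanics, not to replicate the output of `sensitivity_analysis`.

```python
from statistics import NormalDist

# Hypothetical inputs (not output from this notebook)
estimate, se, max_pre_violation = 5.0, 0.3, 2.0
z = NormalDist().inv_cdf(0.975)

print(f"{'M':>5} {'Set LB':>8} {'Set UB':>8} {'CI LB':>8} {'CI UB':>8}  Excludes 0")
rows = []
for m in [0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0]:
    bound = m * max_pre_violation            # largest post-period bias allowed
    lb, ub = estimate - bound, estimate + bound
    ci_lb, ci_ub = lb - z * se, ub + z * se  # conservative widening
    excludes_zero = ci_lb > 0 or ci_ub < 0
    rows.append((m, ci_lb, ci_ub, excludes_zero))
    print(f"{m:>5.2f} {lb:>8.3f} {ub:>8.3f} {ci_lb:>8.3f} {ci_ub:>8.3f}  {excludes_zero}")
```

With these numbers the interval still excludes zero at M = 2.0 and includes it at M = 2.5, so the breakdown value lies between those two grid points.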

## 6. Different Restriction Parameters

Let's compare results for different values of M:

In [None]:
# Compare different M values
M_values = [0, 0.5, 1.0, 2.0]

print(f"{'M':<8} {'CI Lower':>12} {'CI Upper':>12} {'Significant':>12}")
print("-" * 48)

for M in M_values:
    result = honest.fit(event_results, M=M)
    sig = "Yes" if result.is_significant else "No"
    print(f"{M:<8.2f} {result.ci_lb:>12.4f} {result.ci_ub:>12.4f} {sig:>12}")

## 7. Breakdown Value

The **breakdown value** is the smallest M at which the robust CI includes zero. A larger breakdown value means the conclusion survives larger violations of parallel trends.
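
One way to locate a breakdown value to a given tolerance is bisection on M. The sketch below assumes a positive estimate and a conservatively widened CI whose lower bound falls linearly in M. The helper names, the search range `m_max`, and the inputs are hypothetical, and the library's `breakdown_value` may use a different search.

```python
from statistics import NormalDist

def ci_lower(m, estimate, se, max_pre_violation, alpha=0.05):
    """Lower end of a conservatively widened CI at restriction level m."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return estimate - m * max_pre_violation - z * se

def breakdown_by_bisection(estimate, se, max_pre_violation, m_max=10.0, tol=0.01):
    """Smallest m in [0, m_max] at which the CI reaches zero, or None.
    Assumes a positive estimate, so only the lower bound matters."""
    if ci_lower(0.0, estimate, se, max_pre_violation) <= 0:
        return 0.0          # not significant even under exact parallel trends
    if ci_lower(m_max, estimate, se, max_pre_violation) > 0:
        return None         # no breakdown inside the search range
    lo, hi = 0.0, m_max
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if ci_lower(mid, estimate, se, max_pre_violation) > 0:
            lo = mid        # still significant: breakdown is further right
        else:
            hi = mid
    return hi

# Hypothetical inputs
m_star = breakdown_by_bisection(estimate=5.0, se=0.3, max_pre_violation=2.0)
print(f"Breakdown value is approximately {m_star:.2f}")
```

In this linear case the answer also has a closed form, `(estimate - z * se) / max_pre_violation`, which is roughly 2.2 for these inputs. Bisection is shown because it carries over to intervals without a closed form.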

In [None]:
# Compute breakdown value directly
breakdown = honest.breakdown_value(event_results, tol=0.01)

if breakdown is not None:
    print(f"Breakdown value: M = {breakdown:.3f}")
    print("")
    print("Interpretation:")
    print(f"  - For M < {breakdown:.2f}: Effect is statistically significant")
    print(f"  - For M >= {breakdown:.2f}: Cannot rule out zero effect")
    print("")
    print("Is this robust enough?")
    if breakdown >= 1.0:
        print("  Yes! Result holds even if post-treatment violations")
        print("  are as bad as observed pre-treatment violations.")
    else:
        print("  Somewhat. Result requires post-treatment violations")
        print("  to be smaller than pre-treatment violations.")
else:
    print("No breakdown found in the searched range - the robust CI always excludes zero there.")

## 8. Smoothness Restrictions

An alternative approach restricts the **second differences** of the trend violations:

$$|\delta_{t+1} - 2\delta_t + \delta_{t-1}| \leq M$$

This says violations must change smoothly over time:
- $M = 0$: Violations must follow a linear trend (linear extrapolation of pre-trends)
- $M > 0$: Allows some non-linearity in how violations evolve
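
The second-difference condition can be checked directly on a candidate violation path. The sketch below uses two hypothetical paths: a linear one, like the differential trend built into the example data, and a curved one. The function names are illustrative and not part of the library.

```python
def max_second_difference(deltas):
    """Largest |delta[t+1] - 2*delta[t] + delta[t-1]| along a violation path."""
    return max(
        abs(deltas[t + 1] - 2 * deltas[t] + deltas[t - 1])
        for t in range(1, len(deltas) - 1)
    )

def satisfies_smoothness(deltas, m, tol=1e-12):
    """True if every second difference is within m (up to rounding error)."""
    return max_second_difference(deltas) <= m + tol

periods = range(-4, 5)
linear_path = [0.25 * t for t in periods]       # constant slope, no curvature
curved_path = [0.5 * t * t for t in periods]    # constant curvature

print("Linear path, max second difference:", max_second_difference(linear_path))
print("Curved path, max second difference:", max_second_difference(curved_path))
print("Linear path allowed at M=0:  ", satisfies_smoothness(linear_path, m=0.0))
print("Curved path allowed at M=0.5:", satisfies_smoothness(curved_path, m=0.5))
print("Curved path allowed at M=1.0:", satisfies_smoothness(curved_path, m=1.0))
```

A linear violation passes at $M = 0$ no matter how steep its slope is, which is why the smoothness restriction suits settings where a steady differential trend is plausible.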

In [None]:
# Smoothness restriction
honest_smooth = HonestDiD(
    method='smoothness',
    M=0.5,  # Allow some curvature
    alpha=0.05
)

smooth_results = honest_smooth.fit(event_results)
print(smooth_results.summary())

In [None]:
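# Compare smoothness vs relative magnitudes
# Note: M is not on a common scale across methods. For relative magnitudes it is a
# unitless multiple of the largest pre-period violation; for smoothness it bounds
# second differences in outcome units. Equal M values are used here only for illustration.
print("Comparison of Methods (M=1.0)")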
print("=" * 60)

rm_result = HonestDiD(method='relative_magnitude', M=1.0).fit(event_results)
sd_result = HonestDiD(method='smoothness', M=1.0).fit(event_results)

print(f"{'Method':<25} {'CI Lower':>12} {'CI Upper':>12} {'Width':>10}")
print("-" * 60)
print(f"{'Relative Magnitudes':<25} {rm_result.ci_lb:>12.4f} {rm_result.ci_ub:>12.4f} {rm_result.ci_width:>10.4f}")
print(f"{'Smoothness':<25} {sd_result.ci_lb:>12.4f} {sd_result.ci_ub:>12.4f} {sd_result.ci_width:>10.4f}")

## 9. Using the Convenience Function

For quick analysis, use `compute_honest_did()`:

In [None]:
# One-liner for quick bounds
quick_result = compute_honest_did(
    event_results,
    method='relative_magnitude',
    M=1.0
)

print(f"Quick robust CI: [{quick_result.ci_lb:.3f}, {quick_result.ci_ub:.3f}]")

## 10. Converting Results to DataFrames

Results can be exported for further analysis:

In [None]:
# Single result to DataFrame
print("Single result:")
print(honest_results.to_dataframe())

In [None]:
# Sensitivity analysis to DataFrame
print("\nSensitivity analysis:")
sensitivity.to_dataframe()

## Summary

**Key Takeaways:**

1. **Honest DiD** provides robust inference without assuming parallel trends holds exactly

2. **Relative magnitudes** (M̄) bounds post-treatment violations by a multiple of observed pre-treatment violations
   - M̄=0: Standard parallel trends
   - M̄=1: Violations as bad as worst pre-period
   - M̄>1: Even larger violations allowed

3. **Smoothness** (M) bounds the curvature of violations over time
   - M=0: Linear extrapolation of pre-trends
   - M>0: Allows non-linear changes

4. **Breakdown value** tells you how robust your conclusion is

5. **Best practices:**
   - Report results for multiple M values
   - Include the sensitivity plot in publications
   - Discuss what violation magnitudes are plausible in your setting
   - Use breakdown value to assess robustness

**Related Tutorials:**
- `04_parallel_trends.ipynb` - Standard parallel trends testing
- `06_power_analysis.ipynb` - Power analysis for study design
- `07_pretrends_power.ipynb` - Pre-trends power analysis (Roth 2022) - assess what violations your pre-trends test could have detected

**Reference:**

Rambachan, A., & Roth, J. (2023). A More Credible Approach to Parallel Trends. 
*The Review of Economic Studies*, 90(5), 2555-2591. 
https://doi.org/10.1093/restud/rdad018