Merged
22 commits
fa8115e
Add ChaisemartinDHaultfoeuille (dCDH) DID_M estimator (Phase 1)
igerber Apr 11, 2026
be5a516
Address review: bootstrap inference, placebo doc/code, twowayfeweight…
igerber Apr 11, 2026
fc881fb
docs: align practitioner decision tree placebo note with REGISTRY/README
igerber Apr 11, 2026
0d70f4d
Round 2: cell counts, full influence function, variance-only filter
igerber Apr 11, 2026
df7a0a3
Round 3: ragged panels, validation refactor, metadata fixes
igerber Apr 11, 2026
787eae2
Round 4: doc/contract cleanups (joiners_leavers DataFrame, stale docs…
igerber Apr 11, 2026
1e05c9d
Round 5: degenerate-cohort SE NaN, placebo A11 mirroring
igerber Apr 11, 2026
cf639d4
Round 6: document ragged-panel contract + never-switching doc drift fix
igerber Apr 11, 2026
83cc093
Round 7: cluster gate + TWFE diagnostic order + singleton-baseline la…
igerber Apr 12, 2026
8cafae1
Round 8: bootstrap divisor naming + llms-full.txt cluster snippet
igerber Apr 12, 2026
63edb3d
Round 9: TWFE diagnostic sample-contract clarification + warning + tests
igerber Apr 12, 2026
819ba57
Round 10: propagate bootstrap p-value and CI to top-level results
igerber Apr 12, 2026
3824de8
Round 11: summary() footer branches on inference mode
igerber Apr 12, 2026
a8b161c
Round 12: reject within-cell mixed treatment + fix flaky slow test
igerber Apr 12, 2026
b5e1847
Round 13: honor rank_deficient_action='error' on fitted TWFE path
igerber Apr 12, 2026
81871d4
Round 14: document cell-weighting deviation + empty-input guard
igerber Apr 12, 2026
8956d13
Round 15: document period-index semantics for global calendar gaps
igerber Apr 12, 2026
639f5dc
Round 16: reject NaN group/time + fix calendar-gap workaround text
igerber Apr 12, 2026
62b0f93
Round 17: fix sigma_fe formula + soften paper-literal + fix placebo c…
igerber Apr 12, 2026
dd52d27
Round 18: fix sigma_fe to use paper w_{g,t} (not contribution weights)
igerber Apr 12, 2026
5b46f40
Round 19: clarify TWFE diagnostic docstring weight distinction
igerber Apr 12, 2026
c215447
Fix CI: TWFE diagnostic guard for < 2 groups/periods
igerber Apr 12, 2026
15 changes: 15 additions & 0 deletions CHANGELOG.md
@@ -7,6 +7,21 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0

## [Unreleased]

### Added
- **`ChaisemartinDHaultfoeuille`** (alias `DCDH`) — Phase 1 of the de Chaisemartin-D'Haultfœuille estimator family, the only modern staggered DiD estimator in the library that handles **non-absorbing (reversible) treatments**. Treatment can switch on AND off over time (marketing campaigns, seasonal promotions, on/off policy cycles). Implements `DID_M` from de Chaisemartin & D'Haultfœuille (2020) AER, equivalently `DID_1` (horizon `l = 1`) of the dynamic companion paper (NBER WP 29873). Ships:
  - Headline `DID_M` point estimate with cohort-recentered analytical SE from Web Appendix Section 3.7.3 of the dynamic companion paper
  - Joiners-only (`DID_+`) and leavers-only (`DID_-`) decompositions with their own inference
  - Single-lag placebo `DID_M^pl` point estimate (AER 2020 placebo specification). Placebo SE / inference fields are intentionally `NaN` in Phase 1: the dynamic companion paper Section 3.7.3 derives the cohort-recentered analytical variance for `DID_l` only, not for the placebo. Phase 2 will add multiplier-bootstrap support for the placebo. The bootstrap path in Phase 1 covers `DID_M`, `DID_+`, and `DID_-` only.
  - Optional multiplier bootstrap clustered at group level with Rademacher / Mammen / Webb weights for `DID_M`, `DID_+`, and `DID_-` (placebo bootstrap deferred to Phase 2)
  - TWFE decomposition diagnostic from Theorem 1 of AER 2020 (per-cell weights, fraction negative, `sigma_fe`, `beta_fe`)
  - Multi-switch group filtering (`drop_larger_lower=True` default, matches R `DIDmultiplegtDYN`); singleton-baseline filter (footnote 15 of dynamic paper, variance computation only); consolidated A11 zero-retention warnings. All of these emit explicit warnings (no silent failures). Never-switching groups participate in the variance via stable-control roles after the Round 2 full-IF fix; the `n_groups_dropped_never_switching` field is retained as backwards-compatibility metadata only.
  - Phase 1 requires balanced-baseline panels with no interior period gaps. Late-entry groups (missing the first global period) raise `ValueError`; interior-gap groups are dropped with a `UserWarning`; terminally-missing groups (early exit / right-censoring) are retained and contribute from their observed periods only. This is a documented deviation from R `DIDmultiplegtDYN`'s unbalanced-panel support; see `docs/methodology/REGISTRY.md` for rationale and workarounds.
  - Forward-compatible `fit()` signature: Phase 2 (multi-horizon event study, `aggregate`, `L_max`) and Phase 3 (covariate adjustment via `controls`, group-specific linear trends, HonestDiD) parameters present from day one, raising `NotImplementedError` with phase pointers
  - Validated against R `DIDmultiplegtDYN` v2.3.3 at horizon `l = 1` via `tests/test_chaisemartin_dhaultfoeuille_parity.py`
- **`twowayfeweights()`** — standalone helper function for the TWFE decomposition diagnostic (Theorem 1 of de Chaisemartin & D'Haultfœuille 2020), available without instantiating the full estimator. Returns a `TWFEWeightsResult` with per-cell weights, fraction negative, `sigma_fe`, and `beta_fe`.
- **`generate_reversible_did_data()`** — new generator in `diff_diff.prep` producing reversible-treatment panel data for testing and tutorials. Patterns: `single_switch` (default, A5-safe), `joiners_only`, `leavers_only`, `mixed_single_switch`, `random`, `cycles`, `marketing`. Returns columns `group`, `period`, `treatment`, `outcome`, `true_effect`, `d_lag`, `switcher_type`.
- **REGISTRY.md `## ChaisemartinDHaultfoeuille` section** — single canonical source for dCDH methodology, equations, edge cases, and all documented deviations from the R `DIDmultiplegtDYN` reference implementation. Cites the AER 2020 paper and the dynamic companion paper (NBER WP 29873) by reference; primary papers are upstream sources, not in-repo files.

## [3.0.1] - 2026-04-07

### Added
83 changes: 83 additions & 0 deletions README.md
@@ -96,6 +96,7 @@ Already know DiD? The [academic quickstart](docs/quickstart.rst) and [estimator
- **Panel data support**: Two-way fixed effects estimator for panel designs
- **Multi-period analysis**: Event-study style DiD with period-specific treatment effects
- **Staggered adoption**: Callaway-Sant'Anna (2021), Sun-Abraham (2021), Borusyak-Jaravel-Spiess (2024) imputation, Two-Stage DiD (Gardner 2022), Stacked DiD (Wing, Freedman & Hollingsworth 2024), Efficient DiD (Chen, Sant'Anna & Xie 2025), and Wooldridge ETWFE (2021/2023) estimators for heterogeneous treatment timing
- **Reversible (non-absorbing) treatments**: de Chaisemartin-D'Haultfœuille `DID_M` estimator for treatments that switch on AND off over time (marketing campaigns, seasonal promotions, on/off policy cycles) — the only library option for non-absorbing treatments
- **Triple Difference (DDD)**: Ortiz-Villavicencio & Sant'Anna (2025) estimators with proper covariate handling
- **Synthetic DiD**: Combined DiD with synthetic control for improved robustness
- **Triply Robust Panel (TROP)**: Factor-adjusted DiD with synthetic weights (Athey et al. 2025)
@@ -130,6 +131,7 @@ All estimators have short aliases for convenience:
| `Bacon` | `BaconDecomposition` | Goodman-Bacon decomposition |
| `EDiD` | `EfficientDiD` | Efficient DiD |
| `ETWFE` | `WooldridgeDiD` | Wooldridge ETWFE (2021/2023) |
| `DCDH` | `ChaisemartinDHaultfoeuille` | de Chaisemartin & D'Haultfœuille (2020) — reversible treatments |

`TROP` already uses its short canonical name and needs no alias.

@@ -1151,6 +1153,87 @@ EfficientDiD(
| Covariates | Not yet (Phase 2) | Supported (OR, IPW, DR) |
| When to choose | Maximum efficiency, PT-All credible | Covariates needed, weaker PT |

### de Chaisemartin-D'Haultfœuille (dCDH) for Reversible Treatments

`ChaisemartinDHaultfoeuille` (alias `DCDH`) is the only library estimator that handles **non-absorbing (reversible) treatments**, where treatment can switch on AND off over time. This is the natural fit for marketing campaigns, seasonal promotions, and on/off policy cycles.

Phase 1 ships the contemporaneous-switch estimator `DID_M` from the AER 2020 paper, which is mathematically identical to `DID_1` (horizon `l = 1`) of the dynamic companion paper (NBER WP 29873). Phase 2 will add multi-horizon event-study output `DID_l` for `l > 1` on the same class; Phase 3 will add covariate adjustment.

```python
from diff_diff import ChaisemartinDHaultfoeuille
from diff_diff.prep import generate_reversible_did_data

# Generate a reversible-treatment panel
data = generate_reversible_did_data(
    n_groups=80, n_periods=6, pattern="single_switch", seed=42,
)

# Fit the estimator
est = ChaisemartinDHaultfoeuille()
results = est.fit(
    data,
    outcome="outcome",
    group="group",
    time="period",
    treatment="treatment",
)
results.print_summary()

# Decomposition
print(f"DID_M (overall): {results.overall_att:.3f}")
print(f"DID_+ (joiners): {results.joiners_att:.3f}")
print(f"DID_- (leavers): {results.leavers_att:.3f}")
print(f"Placebo (DID_M^pl): {results.placebo_effect:.3f}")
```
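
For intuition about what `DID_M` aggregates, the following is a minimal sketch of the textbook estimator from the AER 2020 paper, not the library's implementation. It assumes a binary treatment, one row per `(group, period)` cell, equal cell weights, and no missing periods. The function name `did_m_sketch` is hypothetical, and the sketch simply skips period pairs that have switchers but no matching stable controls, which may differ from how the library handles A11 violations.

```python
import pandas as pd


def did_m_sketch(df: pd.DataFrame) -> float:
    """Textbook DID_M on a balanced binary-treatment panel (illustrative only)."""
    df = df.sort_values(["group", "period"]).copy()
    # Lagged treatment and first-differenced outcome within each group.
    df["d_lag"] = df.groupby("group")["treatment"].shift(1)
    df["dy"] = df["outcome"] - df.groupby("group")["outcome"].shift(1)
    df = df.dropna(subset=["d_lag"])  # first period has no lag

    weighted_sum, n_switchers = 0.0, 0
    for _, cell in df.groupby("period"):
        d, d_lag = cell["treatment"], cell["d_lag"]
        joiners = cell.loc[(d_lag == 0) & (d == 1), "dy"]
        leavers = cell.loc[(d_lag == 1) & (d == 0), "dy"]
        stable_0 = cell.loc[(d_lag == 0) & (d == 0), "dy"]
        stable_1 = cell.loc[(d_lag == 1) & (d == 1), "dy"]
        # DID_+ at this period: joiners versus stable-untreated groups.
        if len(joiners) and len(stable_0):
            weighted_sum += len(joiners) * (joiners.mean() - stable_0.mean())
            n_switchers += len(joiners)
        # DID_- at this period: stable-treated groups versus leavers.
        if len(leavers) and len(stable_1):
            weighted_sum += len(leavers) * (stable_1.mean() - leavers.mean())
            n_switchers += len(leavers)
    return weighted_sum / n_switchers if n_switchers else float("nan")


# Toy panel: common trend of +1 per period, treatment effect of 2.
toy = pd.DataFrame({
    "group":     ["g1"] * 3 + ["g2"] * 3 + ["g3"] * 3 + ["g4"] * 3,
    "period":    [0, 1, 2] * 4,
    "treatment": [0, 0, 0,  0, 1, 1,  1, 1, 1,  1, 1, 0],
    "outcome":   [1, 2, 3,  1, 4, 5,  3, 4, 5,  3, 4, 3],
})
print(did_m_sketch(toy))
```

In the toy panel, `g2` is a joiner at period 1 and `g4` is a leaver at period 2, so each contributes one switcher cell to the weighted average.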

**Parameters:**

```python
ChaisemartinDHaultfoeuille(
    alpha=0.05,                      # Significance level
    n_bootstrap=0,                   # 0 = analytical SE only; >0 = multiplier bootstrap
    bootstrap_weights="rademacher",  # 'rademacher', 'mammen', or 'webb'
    seed=None,                       # Random seed for bootstrap
    placebo=True,                    # Auto-compute single-lag placebo
    twfe_diagnostic=True,            # Auto-compute TWFE decomposition diagnostic
    drop_larger_lower=True,          # Drop multi-switch groups (matches R DIDmultiplegtDYN)
    rank_deficient_action="warn",    # Used by TWFE diagnostic OLS
)
```
```
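
As a rough illustration of what the three `bootstrap_weights` options mean, the sketch below draws the standard Rademacher, Mammen, and Webb multiplier distributions (each has mean 0 and variance 1) and uses them to perturb a vector of group-level influence values. The helper names `draw_multiplier_weights` and `multiplier_bootstrap_se`, and the `psi` input, are assumptions made for this example; the library's own bootstrap may use a different divisor and different internals.

```python
import numpy as np


def draw_multiplier_weights(n_groups, kind="rademacher", rng=None):
    """Draw one multiplier weight per group (cluster) from a named distribution."""
    rng = np.random.default_rng() if rng is None else rng
    if kind == "rademacher":
        # +1 or -1 with equal probability.
        return rng.choice(np.array([-1.0, 1.0]), size=n_groups)
    if kind == "mammen":
        # Two-point distribution with mean 0, variance 1, third moment 1.
        s5 = np.sqrt(5.0)
        values = np.array([-(s5 - 1) / 2, (s5 + 1) / 2])
        p_low = (s5 + 1) / (2 * s5)
        return rng.choice(values, size=n_groups, p=[p_low, 1 - p_low])
    if kind == "webb":
        # Six-point distribution, each value with probability 1/6.
        a, b = np.sqrt(1.5), np.sqrt(0.5)
        values = np.array([-a, -1.0, -b, b, 1.0, a])
        return rng.choice(values, size=n_groups)
    raise ValueError(f"unknown weight type: {kind!r}")


def multiplier_bootstrap_se(psi, n_bootstrap=999, kind="rademacher", seed=None):
    """Bootstrap SE of a mean of group-level influence values psi (hypothetical input)."""
    psi = np.asarray(psi, dtype=float)
    rng = np.random.default_rng(seed)
    draws = np.empty(n_bootstrap)
    for b in range(n_bootstrap):
        w = draw_multiplier_weights(psi.size, kind, rng)
        draws[b] = np.mean(w * psi)  # perturbed estimator deviation
    return float(draws.std(ddof=1))
```

Drawing one weight per group, rather than one per observation, is what makes the bootstrap clustered at the group level.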

**What you get back on the results object:**

| Field | Description |
|-------|-------------|
| `overall_att`, `overall_se`, `overall_conf_int` | `DID_M` and inference (cohort-recentered analytical SE by default; multiplier-bootstrap percentile inference when `n_bootstrap > 0`) |
| `joiners_att`, `leavers_att` | Decomposition into the joiners (`DID_+`) and leavers (`DID_-`) views |
| `placebo_effect` | Single-lag placebo (`DID_M^pl`) point estimate |
| `per_period_effects` | Per-period decomposition with explicit A11-violation flags |
| `twfe_weights`, `twfe_fraction_negative`, `twfe_sigma_fe`, `twfe_beta_fe` | Theorem 1 decomposition diagnostic |
| `n_groups_dropped_crossers`, `n_groups_dropped_singleton_baseline` | Filter counts (multi-switch groups dropped before estimation; singleton-baseline groups excluded from variance) |
| `n_groups_dropped_never_switching` | Backwards-compatibility metadata. Never-switching groups participate in the variance via stable-control roles; this field is no longer a filter count. |

**Standalone TWFE decomposition diagnostic** (without fitting the full estimator):

```python
from diff_diff import twowayfeweights

diagnostic = twowayfeweights(
    data, outcome="outcome", group="group", time="period", treatment="treatment",
)
print(f"Plain TWFE coefficient: {diagnostic.beta_fe:.3f}")
print(f"Fraction of negative weights: {diagnostic.fraction_negative:.3f}")
print(f"sigma_fe (sign-flipping threshold): {diagnostic.sigma_fe:.3f}")
```
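
To show where negative weights come from, here is a minimal sketch of the Theorem 1 weights for the simplest case: a balanced panel with one observation per cell and equal cell sizes. In that case the residual of treatment on group and period fixed effects reduces to double demeaning, and each treated cell's weight is its residual divided by the average residual over treated cells. The function name `twfe_weights_sketch` is hypothetical, and the library's `twowayfeweights()` may weight cells and report `fraction_negative` differently.

```python
import numpy as np


def twfe_weights_sketch(D):
    """Theorem 1 weights for a balanced (n_groups, n_periods) binary treatment array."""
    D = np.asarray(D, dtype=float)
    # Residual from regressing D on group and period fixed effects
    # (double demeaning is exact for a balanced panel).
    eps = D - D.mean(axis=1, keepdims=True) - D.mean(axis=0, keepdims=True) + D.mean()
    treated = D == 1
    denom = eps[treated].mean()
    if np.isclose(denom, 0.0):
        raise ValueError("treatment is fully explained by the fixed effects")
    # Weights are defined on treated cells only; untreated cells are NaN.
    weights = np.where(treated, eps / denom, np.nan)
    fraction_negative = float((weights[treated] < 0).mean())
    return weights, fraction_negative


# Rows: never treated, early adopter, late adopter.
D = np.array([
    [0, 0, 0, 0],
    [0, 1, 1, 1],
    [0, 0, 0, 1],
])
weights, fraction_negative = twfe_weights_sketch(D)
print(weights.round(2))
print(fraction_negative)
```

The early adopter's last treated period receives a negative weight in this toy design, which is the pattern the diagnostic is meant to surface.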

> **Note:** The Phase 1 placebo SE is intentionally `NaN`, and the estimator emits a warning. The dynamic companion paper Section 3.7.3 derives the cohort-recentered analytical variance for `DID_l` only, not for the placebo `DID_M^pl`. Phase 2 will add multiplier-bootstrap support for the placebo via the dynamic paper's machinery. Until then, the placebo point estimate is meaningful, but its inference fields (`results.placebo_se`, `results.placebo_p_value`, etc.) remain `NaN` even when `n_bootstrap > 0`.

> **Note:** By default (`drop_larger_lower=True`), the estimator drops groups whose treatment switches more than once before estimation. This matches R `DIDmultiplegtDYN`'s default and is required for the analytical variance formula to be consistent with the point estimate. Each drop emits an explicit warning.
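
The multi-switch rule described in the note above can be pictured with a short pandas sketch. This is an assumption-laden illustration rather than the library's filter: `split_multi_switch_groups` is a hypothetical helper, it counts a switch as any period-to-period change in a binary treatment, and it emits a single consolidated warning.

```python
import warnings

import pandas as pd


def split_multi_switch_groups(df, group="group", time="period", treatment="treatment"):
    """Return (kept rows, dropped group ids) after removing multi-switch groups."""
    ordered = df.sort_values([group, time])
    # Number of period-to-period treatment changes within each group.
    n_switches = ordered.groupby(group)[treatment].apply(
        lambda d: int((d.diff().fillna(0) != 0).sum())
    )
    dropped = list(n_switches[n_switches > 1].index)
    if dropped:
        warnings.warn(
            f"Dropping {len(dropped)} group(s) that switch treatment more than once",
            UserWarning,
        )
    kept = df[~df[group].isin(dropped)]
    return kept, dropped


panel = pd.DataFrame({
    "group":     ["a"] * 3 + ["b"] * 3 + ["c"] * 3,
    "period":    [1, 2, 3] * 3,
    "treatment": [0, 1, 1,  0, 1, 0,  0, 0, 0],
})
kept, dropped = split_multi_switch_groups(panel)
print(dropped)
```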

> **Note:** Phase 1 requires panels with a **balanced baseline** (every group observed at the first global period) and **no interior period gaps**. Late-entry groups (missing the baseline) raise `ValueError`; interior-gap groups are dropped with a warning; terminally-missing groups (early exit / right-censoring) are retained and contribute from their observed periods only. This is a documented deviation from R `DIDmultiplegtDYN`, which supports unbalanced panels — see [`docs/methodology/REGISTRY.md`](docs/methodology/REGISTRY.md) for the rationale, the defensive guards that make terminal missingness safe, and workarounds for unbalanced inputs.
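
Before fitting, it can help to check which of the panel categories named in the note above each group falls into. The sketch below is one way to do that, assuming period positions are taken from the sorted set of periods observed anywhere in the data; `classify_panel_groups` and its labels are hypothetical and are not part of the library.

```python
import pandas as pd


def classify_panel_groups(df, group="group", time="period"):
    """Label each group as complete, late_entry, interior_gap, or terminal_missing."""
    periods = sorted(df[time].unique())
    index_of = {p: i for i, p in enumerate(periods)}
    labels = {}
    for g, obs in df.groupby(group)[time]:
        idx = sorted(index_of[p] for p in obs.unique())
        if idx[0] != 0:
            labels[g] = "late_entry"        # missing the first global period
        elif idx[-1] - idx[0] + 1 != len(idx):
            labels[g] = "interior_gap"      # a hole between first and last observation
        elif idx[-1] != len(periods) - 1:
            labels[g] = "terminal_missing"  # early exit / right-censoring
        else:
            labels[g] = "complete"
    return labels


obs = pd.DataFrame({
    "group":  ["a"] * 4 + ["b"] * 3 + ["c"] * 3 + ["d"] * 2,
    "period": [1, 2, 3, 4,  2, 3, 4,  1, 2, 4,  1, 2],
})
print(classify_panel_groups(obs))
```

Under the contract in the note, `late_entry` groups would trigger a `ValueError`, `interior_gap` groups would be dropped with a warning, and `terminal_missing` groups would be retained.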

> **Note:** Survey design (`survey_design`), event-study aggregation (`aggregate`), covariate adjustment (`controls`), and HonestDiD integration (`honest_did`) are not yet supported. They raise `NotImplementedError` with phase pointers — see [`ROADMAP.md`](ROADMAP.md) for the full multi-phase rollout.

### Triple Difference (DDD)

Triple Difference (DDD) is used when treatment requires satisfying two criteria: belonging to a treated **group** AND being in an eligible **partition**. The `TripleDifference` class implements the methodology from Ortiz-Villavicencio & Sant'Anna (2025), which correctly handles covariate adjustment (unlike naive implementations).