Merged
1 change: 1 addition & 0 deletions CHANGELOG.md
@@ -8,6 +8,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
## [Unreleased]

### Added
+- **EfficientDiD methodology paper review on file (PR-A).** Added `docs/methodology/papers/chen-santanna-xie-2025-review.md`, a faithful, paper-sourced fidelity review of Chen, Sant'Anna & Xie (2025, arXiv:2506.17729v1) — the Step-1 artifact of the `EfficientDiD` methodology-review validation. Transcribes (paper-sourced only, no code-deviation verdicts) the identifying assumptions (S/O/NA/PT-Post/PT-All), the single-treatment-date EIF (Theorem 3.1, Eqs 3.2–3.6; Corollary 3.1) and the staggered analog (Theorem 3.2, Eqs 3.9–3.14; Corollary 3.2), the no-covariate closed form and the covariate sieve/kernel doubly-robust estimation path of §4 (Eqs 4.1–4.5), the efficiency + standard-error results (Theorem 4.1; SE = `sqrt(mean(EIF²)/n)` or multiplier bootstrap), the Hausman PT-All-vs-PT-Post pretest (Appendix A, Theorem A.1), and the HRS Table 6 / ARE cross-language anchor for the follow-up validation pass. Docs-only; no code change. `docs/references.rst` updated from "Working Paper" to the arXiv URL; REGISTRY `## EfficientDiD` gains a `Paper review on file:` pointer; the review is registered in `docs/doc-deps.yaml`. The `METHODOLOGY_REVIEW.md` row stays **In Progress** — the source-validation pass (PR-B) flips it.
- **PowerAnalysis methodology-review-tracker promotion: In Progress → Complete, with a panel-variance correction (behavior change).** Closes the Bloom (1995) + Burlig, Preonas & Woerman (2020) source audits on the tracker (PR-A #506 added both paper reviews + under-review Notes; this PR validates the source against the code and reconciles the discrepancies). **Behavior change:** the analytical *panel* DiD variance was the Moulton design-effect factor `(1+(T−1)·rho)/T`, wrong two ways versus the source — wrong period-scaling (~4× too small at `rho=0`, `m=r=5` versus the iid DiD benchmark) and the **opposite `rho`-sign** (it *raised* the MDE as within-unit correlation grew). It is replaced by the within-unit equicorrelated special case of Burlig et al. Eq. 2, `Var(ATT) = sigma² · (1/n_T + 1/n_C) · (1/n_pre + 1/n_post) · (1 − rho)`, in which within-unit (serial) correlation *lowers* the MDE because the difference-in-differences cancels the shared within-unit component. So `PowerAnalysis.mde` / `power` / `sample_size` (and the `compute_*` wrappers) now return a **smaller** MDE / required N as `rho` rises for **all** designs; the 2×2 path matches Bloom's `2σ²` at the default `rho = 0` and is continuous with the panel form at `n_pre = n_post = 1`. New input validation, enforced for **all** designs *before* the 2×2-vs-panel router: `n_pre >= 1`, `n_post >= 1`, `rho ∈ [−1/(T−1), 1)` (`T = n_pre + n_post`), finite `sigma >= 0`, positive group counts, and `treat_frac ∈ (0, 1)` now raise `ValueError` (previously invalid two-period shapes and out-of-range `rho` fell through to `basic_did` silently). The `(1 − rho)` factor applies at `T = 2` too — the 2×2 path is Burlig's `m = r = 1` special case (footnote 11), so a nonzero `rho` is no longer silently ignored there, while `rho = 0` still recovers Bloom's `2σ²`. The MDE multiplier stays the **normal (z)** Bloom multiplier (a deliberate large-sample approximation to Burlig's t, documented as `**Deviation from R:**`) — unchanged. 
New `tests/test_methodology_power.py` (Bloom Table 1 multipliers; 2×2 + panel closed forms; a literal-equicorrelated Monte-Carlo validation of the panel variance; `sample_size`↔`mde` round-trip; input-guard + `rho`-at-`T=2` + `compute_*` wrapper validation; base-R `qnorm` parity at `benchmarks/data/r_power_golden.json`, generator `benchmarks/R/generate_power_golden.R`); the two `tests/test_power.py` ICC-direction tests were inverted to Burlig's sign. REGISTRY `## PowerAnalysis` equation block rewritten (z not t; corrected 2×2 / panel SE + sample-size; removed the cluster-`m` and inverted-`R²` terms that matched neither code nor source); `docs/references.rst` adds Frison & Pocock (1992) + McKenzie (2012) as the equicorrelated lineage; tutorial `06_power_analysis.ipynb` corrected. `METHODOLOGY_REVIEW.md` row promoted to **Complete** (`Last Review = 2026-05-31`); priority queue pruned; the PR-A under-review Notes removed across REGISTRY / `power.py` / `references.rst`.
- **`WooldridgeDiD` outcome-fit hint:** `WooldridgeDiD(method="ols")` now emits a `UserWarning` when the outcome is binary (`{0, 1}`) or a non-negative integer count, noting that a matching nonlinear model (`method="logit"` / `method="poisson"`) is often the **more appropriate specification** for such outcomes. Following Wooldridge (2023): the nonlinear paths impose parallel trends on the link/index scale rather than in levels (level-PT is only valid for continuous/unbounded outcomes), and the paper's Section 5 simulations show the linear model both biased and less precise where the nonlinear mean holds. It is a **different identifying assumption** than linear OLS — which one fits depends on which parallel-trends restriction holds — so the warning frames it as a recommended comparison, not an automatic switch or free efficiency upgrade. OLS remains a valid QMLE for *any* response (Table 1). Always-on (suppress via `warnings.filterwarnings`); detection is high-signal (binary requires exactly `{0, 1}`; the count branch suggests Poisson — the natural unbounded-count model — for *any* non-negative integers with >2 distinct values, so bounded binomial / known-upper-bound integer outcomes are not separately distinguished from unbounded counts; fractional / continuous outcomes are not flagged).
- **New tutorial: `docs/tutorials/24_staggered_vs_collapsed_power.ipynb` — "Staggered Rollout or a Simple 2×2? A Power-Analysis Decision Guide".** A practitioner walkthrough for geo experiments (framed on a 50-state staggered rollout) on when to reach for Callaway-Sant'Anna vs collapsing to a familiar pre/post 2×2. Shows, with live paired Monte Carlo on `generate_staggered_data`, that the collapsed 2×2 silently targets a *diluted* estimand (reports ~60–94% of the true effect-on-treated as the rollout staggers, with near-zero CI coverage of the truth under a slow rollout), and that CS's minimum-detectable-lift penalty is a *fast-rollout* phenomenon that shrinks to parity as the rollout becomes more staggered. Fully self-contained (runs live, no committed data files); ends with a CS-vs-2×2 decision guide.
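The corrected panel variance quoted in the PowerAnalysis entry can be checked numerically with a small standalone sketch. This is not the library's `PowerAnalysis` code: the function names `sketch_did_variance` and `sketch_mde` are hypothetical, and the two-sided test with `alpha = 0.05` and `power = 0.80` defaults is an assumption made for illustration. Only the variance formula, the normal (z) multiplier, and the input guards are taken from the entry above.

```python
import math
from statistics import NormalDist


def sketch_did_variance(sigma, n_treated, n_control, n_pre, n_post, rho=0.0):
    """Var(ATT) = sigma^2 (1/n_T + 1/n_C) (1/n_pre + 1/n_post) (1 - rho).

    Hypothetical helper mirroring the formula and guards in the changelog
    entry; it is not the library's implementation.
    """
    if n_pre < 1 or n_post < 1:
        raise ValueError("n_pre and n_post must be >= 1")
    if n_treated <= 0 or n_control <= 0:
        raise ValueError("group counts must be positive")
    if not math.isfinite(sigma) or sigma < 0:
        raise ValueError("sigma must be finite and >= 0")
    total_periods = n_pre + n_post  # T in the changelog's notation
    if not (-1.0 / (total_periods - 1) <= rho < 1.0):
        raise ValueError("rho must lie in [-1/(T-1), 1)")
    return (
        sigma ** 2
        * (1.0 / n_treated + 1.0 / n_control)
        * (1.0 / n_pre + 1.0 / n_post)
        * (1.0 - rho)
    )


def sketch_mde(sigma, n_treated, n_control, n_pre, n_post, rho=0.0,
               alpha=0.05, power=0.80):
    """MDE = (z_{1-alpha/2} + z_{power}) * SE, using the normal multiplier.

    The two-sided alpha and the default values are assumptions of this sketch.
    """
    z = NormalDist()
    multiplier = z.inv_cdf(1.0 - alpha / 2.0) + z.inv_cdf(power)
    variance = sketch_did_variance(sigma, n_treated, n_control, n_pre, n_post, rho)
    return multiplier * math.sqrt(variance)


# At rho = 0 with one pre and one post period the period factor is 2,
# which is the 2*sigma^2 case the entry attributes to Bloom.
base = sketch_mde(1.0, 50, 50, 5, 5, rho=0.0)
correlated = sketch_mde(1.0, 50, 50, 5, 5, rho=0.5)
```

Under these assumptions `correlated` is smaller than `base`, which is the sign the entry describes: higher within-unit correlation lowers the MDE.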
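The outcome-type detection described in the `WooldridgeDiD` entry can be illustrated with a minimal sketch. The helper name `suggest_nonlinear_method` is hypothetical and the library's actual detection code is not shown in this diff; the sketch only encodes the rules stated in the entry (binary means exactly `{0, 1}`; the count branch fires for non-negative integers with more than two distinct values; fractional or continuous outcomes are not flagged).

```python
import math


def suggest_nonlinear_method(outcome):
    """Return "logit", "poisson", or None for a sequence of outcome values.

    Hypothetical helper; it encodes the changelog's stated rules only.
    """
    values = [float(v) for v in outcome]
    if not values or any(not math.isfinite(v) for v in values):
        return None
    # Fractional or negative values are treated as continuous: no hint.
    if any(v < 0 or not v.is_integer() for v in values):
        return None
    distinct = set(values)
    if distinct == {0.0, 1.0}:
        return "logit"  # binary outcome: exactly {0, 1}
    if len(distinct) > 2:
        return "poisson"  # non-negative integers, more than two distinct values
    return None


hint_binary = suggest_nonlinear_method([0, 1, 1, 0])
hint_count = suggest_nonlinear_method([0, 2, 5, 1])
hint_continuous = suggest_nonlinear_method([0.5, 1.2, 3.0])
```

As the entry notes, a rule of this shape cannot tell a bounded integer outcome from an unbounded count, so the hint is a prompt to compare specifications rather than a switch.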
8 changes: 4 additions & 4 deletions METHODOLOGY_REVIEW.md
@@ -639,9 +639,9 @@ and covariate-adjusted specifications.)
- Implementation: 130 unit tests in `tests/test_efficient_did.py` + 12 validation tests in `tests/test_efficient_did_validation.py`
- Hausman pretest: implemented per Theorem A.1 with Moore-Penrose pseudoinverse for finite-sample non-PSD variance-difference matrix
- Survey support: pweight + strata/PSU/FPC via TSL on EIF scores; covariates DR path with WLS outcome regression and weighted sieve normal equations
+- Paper review on file: `docs/methodology/papers/chen-santanna-xie-2025-review.md` (PR-A, 2026-05-31) — faithful paper-sourced transcription of arXiv:2506.17729v1 (assumptions S/O/NA/PT-Post/PT-All; Theorem 3.1/3.2 EIFs + Corollaries 3.1/3.2; §4 sieve/kernel DR estimation; Theorem 4.1 SEs; Theorem A.1 Hausman; HRS Table 6 anchor)
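The pseudoinverse step mentioned in the Hausman bullet can be sketched as a generic Hausman contrast. This is the textbook quadratic form under assumed inputs, not a transcription of Theorem A.1 or of the library's code: the function name `sketch_hausman` and its arguments (PT-Post and PT-All estimates with their covariance matrices) are hypothetical.

```python
import numpy as np


def sketch_hausman(theta_post, theta_all, cov_post, cov_all):
    """Generic Hausman statistic H = d' (V_post - V_all)^+ d with d = theta_post - theta_all.

    Hypothetical helper. The Moore-Penrose pseudoinverse replaces the plain
    inverse so that a singular variance-difference matrix does not fail.
    """
    diff = np.asarray(theta_post, dtype=float) - np.asarray(theta_all, dtype=float)
    v_diff = np.asarray(cov_post, dtype=float) - np.asarray(cov_all, dtype=float)
    v_diff = 0.5 * (v_diff + v_diff.T)  # enforce symmetry against round-off
    v_pinv = np.linalg.pinv(v_diff, hermitian=True)
    statistic = float(diff @ v_pinv @ diff)
    # Degrees of freedom taken as the numerical rank of the difference matrix.
    dof = int(np.linalg.matrix_rank(v_diff, hermitian=True))
    return statistic, dof


# Toy inputs: the second coordinate has identical variance under both
# estimators, so the difference matrix has rank 1.
stat, dof = sketch_hausman(
    theta_post=[1.0, 2.0],
    theta_all=[0.5, 2.0],
    cov_post=np.diag([0.5, 0.3]),
    cov_all=np.diag([0.25, 0.3]),
)
```

Note that this sketch does not clip negative eigenvalues, so with a non-PSD difference matrix in finite samples the statistic can come out negative; how the implementation handles that case is not stated in this diff.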

-**Outstanding for promotion:**
-- **No paper review on file** under `docs/methodology/papers/` — write one
+**Outstanding for promotion (PR-B source validation; paper review now on file):**
- Dedicated `tests/test_methodology_efficient_did.py` with Theorem 3.2 / Equation 3.5 / Equation 4.3 numbered Verified Components walk-through
- Cross-language anchor: the paper's empirical replication uses HRS data following Sun-Abraham (2021); a same-data benchmark against the paper's reported numbers (or a same-DGP MC against R alternatives) would substantiate the EIF construction
- Documented deviations: linear OLS working models for outcome regressions vs. paper's general nonparametric specification (DR safety net acknowledged but not separately validated); fixed-weight bootstrap aggregation vs. WIF-corrected analytical aggregation
@@ -1410,10 +1410,10 @@ more graceful handling of edge cases while still signaling invalid inference to

Promotion priority for the **In Progress** entries, ordered by what's blocked on substantive review work (top of list = needs review next) vs. consolidation pass (bottom of list = mostly tracker walk-through):

-**Substantive-review-blocked (no methodology test file, no paper review, no R parity):**
+**Substantive-review-blocked (still missing a methodology test file / R parity — and, except for EfficientDiD, a paper review):**

1. **PlaceboTests** — decide first whether to keep standalone or absorb into per-estimator diagnostic sections; methodologically lightweight either way.
-2. **EfficientDiD** — no paper review on file; substantial implementation work (`tests/test_efficient_did.py` + validation tests) needs paper-vs-code audit against Chen, Sant'Anna & Xie (2025).
+2. **EfficientDiD** — **paper review on file** (PR-A, `chen-santanna-xie-2025-review.md`); remaining PR-B work is the source-validation pass — `tests/test_methodology_efficient_did.py` (Theorem 3.1/3.2 / Eq 3.5 / Eq 4.3 Verified Components), the HRS Table 6 cross-language anchor, and the documented deviations against Chen, Sant'Anna & Xie (2025).
3. **ImputationDiD / TwoStageDiD** — natural pair (both single-treatment-effect-imputation methods). Each needs paper review, methodology file, R parity fixture against `didimputation` / `did2s`.

**Consolidation-pass-blocked (already has paper review or methodology file or R parity; mostly Verified Components walk-through):**
2 changes: 2 additions & 0 deletions docs/doc-deps.yaml
@@ -298,6 +298,8 @@ sources:
- path: docs/methodology/REGISTRY.md
section: "EfficientDiD"
type: methodology
+- path: docs/methodology/papers/chen-santanna-xie-2025-review.md
+type: methodology
- path: docs/api/efficient_did.rst
type: api_reference
- path: docs/tutorials/15_efficient_did.ipynb
2 changes: 1 addition & 1 deletion docs/methodology/REGISTRY.md
@@ -881,7 +881,7 @@ labels.*

## EfficientDiD

-**Primary source:** Chen, X., Sant'Anna, P. H. C., & Xie, H. (2025). Efficient Difference-in-Differences and Event Study Estimators.
+**Primary source:** [Chen, X., Sant'Anna, P. H. C., & Xie, H. (2025). Efficient Difference-in-Differences and Event Study Estimators. arXiv:2506.17729v1.](https://arxiv.org/abs/2506.17729v1) (Cowles Foundation Discussion Paper No. 2470). Paper review on file: `docs/methodology/papers/chen-santanna-xie-2025-review.md` (theorem/equation numbering pinned to arXiv v1, currently the only version).

**Key implementation requirements:**
