Add EfficientDiD documentation and tutorial notebook by igerber · Pull Request #199 · igerber/diff-diff

igerber · 2026-03-15T15:36:03Z

Summary

Create RST API page (docs/api/efficient_did.rst) with autoclass directives, usage examples, and estimator comparison table
Create tutorial notebook (docs/tutorials/15_efficient_did.ipynb) covering basic usage, PT-All vs PT-Post, efficiency gains Monte Carlo, event study visualization, group aggregation, bootstrap inference, diagnostics, anticipation, and three-way estimator comparison (EDiD vs CS vs ImputationDiD)
Integrate EfficientDiD into docs/api/index.rst (estimators, results classes, module toctree)
Add EfficientDiD to docs/choosing_estimator.rst (decision flowchart, quick reference table, detailed guidance section)
Add tutorial 15 listing to docs/tutorials/README.md
Add EDiD to README.md features bullet, tutorials table, and detailed estimator section

Methodology references (required if estimator / math changes)

Method name(s): N/A - no methodology changes (documentation only)
Paper / source link(s): N/A
Any intentional deviations from the source (and why): None

Validation

Tests added/updated: No test changes (documentation only)
Backtest / simulation / notebook evidence (if applicable): Tutorial notebook validated via jupyter nbconvert --execute - all cells pass

Security / privacy

Confirm no secrets/PII in this PR: Yes

Generated with Claude Code

- Create docs/api/efficient_did.rst with API reference, examples, and comparison table (EDiD vs CS vs ImputationDiD) - Create docs/tutorials/15_efficient_did.ipynb covering basic usage, PT-All vs PT-Post, efficiency gains, event study, bootstrap, diagnostics, anticipation, and three-way estimator comparison - Add EfficientDiD to docs/api/index.rst (estimators, results, toctree) - Add EfficientDiD to docs/choosing_estimator.rst (flowchart, table, detailed guidance section) - Add tutorial 15 to docs/tutorials/README.md - Add EDiD to README.md features, tutorials table, and estimator section Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

github-actions · 2026-03-15T15:41:06Z

Overall Assessment

⚠️ Needs changes

Executive Summary

This is a documentation-only PR, but the main new docs/tutorial content for EfficientDiD overstates the scope of PT-Post equivalence to Callaway-Sant’Anna.
The blocker is methodological: multiple files say pt_assumption="post" is simply “= CS”, while the current implementation only aligns cleanly with the never-treated, post-treatment CS path; CS’s default pre-treatment/event-study diagnostics use different base-period logic.
The tutorial also imports “few clusters” bootstrap guidance into EfficientDiD, even though clustered inference is explicitly not implemented for this estimator.
The new API page and notebook use a different bibliographic citation than the Methodology Registry’s canonical EfficientDiD reference.
I did not find code, security, or performance regressions in the changed files. The notebook JSON parses and its code cells compile syntactically.

Methodology

Severity P1. Impact: the new docs repeatedly state or imply that PT-Post is fully equivalent to Callaway-Sant’Anna, including in contexts that use aggregate='all' or event-study output: README.md:1093, README.md:1103, docs/api/efficient_did.rst:12, docs/api/efficient_did.rst:97, docs/api/efficient_did.rst:147, docs/choosing_estimator.rst:68, docs/choosing_estimator.rst:241, docs/tutorials/15_efficient_did.ipynb:159, docs/tutorials/15_efficient_did.ipynb:200, docs/tutorials/15_efficient_did.ipynb:554. That is broader than the current implementation. EfficientDiD fixes the PT-Post baseline at g-1-anticipation for all target periods (diff_diff/efficient_did.py:304), whereas CallawaySantAnna defaults to base_period="varying" and switches pre-treatment periods to t-1 (diff_diff/staggered.py:470). The repo’s own regression test only asserts exact PT-Post matching for post-treatment effects (tests/test_efficient_did.py:901). Concrete fix: scope the claim to “matches CallawaySantAnna(control_group="never_treated") for post-treatment ATT(g,t) and derived post-treatment aggregations”; explicitly note that pre-treatment/event-study diagnostics differ from CS’s default base_period="varying" unless the comparison is limited or base_period="universal" is used.
Severity P2. Impact: the tutorial recommends Webb weights “when clusters are very few” in the EfficientDiD bootstrap section (docs/tutorials/15_efficient_did.ipynb:328), but EfficientDiD currently rejects clustered inference outright (diff_diff/efficient_did.py:87), and the registry documents Phase 1 as multiplier bootstrap on EIF values rather than clustered bootstrap (docs/methodology/REGISTRY.md:579). Concrete fix: remove the few-cluster recommendation from the EfficientDiD tutorial or explicitly say these are available multiplier distributions and that clustered inference is not yet supported for this estimator.

Code Quality

No findings in the changed files.

Performance

No findings in the changed files.

Maintainability

No findings in the changed files.

Tech Debt

No new tech-debt blocker. This PR appears to address the existing tracked docs/tutorial item at TODO.md:49.

Security

No findings. I did not see secrets, tokens, or PII introduced in the changed files.

Documentation/Tests

Severity P3. Impact: the new EfficientDiD API page and notebook use a different bibliographic record than the Methodology Registry: docs/api/efficient_did.rst:29, docs/tutorials/15_efficient_did.ipynb:574, versus the canonical registry citation at docs/methodology/REGISTRY.md:483. Concrete fix: use the registry citation consistently across the new docs/notebook.
No additional test finding on the diff itself. The notebook JSON is valid and its code cells compile, but I could not execute it in this review environment because numpy is not installed.

Path to Approval

Replace the blanket = CS / “equivalent to Callaway-Sant’Anna” language in the README, API page, estimator guide, and notebook with scoped wording that is true for the current implementation: post-treatment equivalence to CallawaySantAnna(control_group="never_treated"), not universal equivalence across pre-trend/event-study outputs.
Update the tutorial comparison narrative/output so it does not promise that PT-All has “the same ATT” as CS in finite samples; phrase that as targeting the same estimand or compare only the post-treatment quantities the implementation actually matches.
Remove or qualify the “Webb for very few clusters” guidance in the EfficientDiD bootstrap tutorial until clustered inference exists for this estimator.
Align the EfficientDiD reference text in the API page and notebook with the canonical citation in docs/methodology/REGISTRY.md.

- P1: Scope PT-Post/CS equivalence to post-treatment ATT(g,t) across all doc files (rst, README, notebook) — pre-treatment diagnostics may differ - P2: Remove inappropriate cluster guidance from Webb bootstrap description - P3: Fix citation initials (Chen X., Xie H.) and add full paper title Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

igerber · 2026-03-15T15:56:10Z

/ai-review

github-actions · 2026-03-15T16:01:44Z

🔁 AI review rerun (requested by @igerber)

Head SHA: 41d6c846e7ec202c56299d8f1b790090004b230f

Overall Assessment

⚠️ Needs changes

Executive Summary

Re-review: the prior cluster/Webb guidance issue appears resolved. The new tutorial now describes multiplier-bootstrap weight options without importing unsupported clustered-inference advice for EfficientDiD.
Re-review: most of the earlier PT-Post/Callaway-Sant’Anna equivalence overstatement has been corrected, but two residual phrases still present PT-Post as generally “CS-equivalent.”
That is still broader than the documented and tested behavior: PT-Post in this library fixes the baseline at g-1-anticipation, CallawaySantAnna still defaults to base_period="varying", and the exact-match regression test only asserts post-treatment ATT(g,t) equality.
The new API page and notebook still use a citation variant that does not match the Methodology Registry’s canonical EfficientDiD reference.
I did not find code quality, performance, maintainability, tech-debt, or security blockers in the changed files.

Methodology

Severity P1. Impact: the re-review still leaves residual PT-Post equivalence wording in docs/tutorials/15_efficient_did.ipynb:L77 and docs/choosing_estimator.rst:L241. That wording is broader than the method described in docs/methodology/REGISTRY.md:L493, the PT-Post implementation in diff_diff/efficient_did_weights.py:L59 and diff_diff/efficient_did.py:L313, the default CS base-period behavior in diff_diff/staggered.py:L475, and the exact-match test scope in tests/test_efficient_did.py:L908. Users can still reasonably read “CS-equivalent” / “same point estimates as CS” as full equivalence, even though the implementation only aligns cleanly with post-treatment ATT(g,t) under the never-treated comparison. Concrete fix: rewrite those remaining phrases to say “matches CallawaySantAnna(control_group='never_treated') for post-treatment ATT(g,t) / post-treatment aggregations,” and note that pre-treatment/event-study comparisons differ under CS’s default base_period="varying" unless CS is compared with base_period="universal".

Code Quality

No findings.

Performance

No findings.

Maintainability

No findings.

Tech Debt

No findings. This PR appears to address the tracked EfficientDiD docs/tutorial item in TODO.md:L49.

Security

No findings.

Documentation/Tests

Severity P3. Impact: the new EfficientDiD API page and tutorial still use a bibliography form that differs from the Methodology Registry’s canonical reference in docs/methodology/REGISTRY.md:L484; see docs/api/efficient_did.rst:L29 and docs/tutorials/15_efficient_did.ipynb:L502. This is informational, but it makes methodology references less consistent to audit. Concrete fix: reuse the registry citation text verbatim in the API page and notebook, or centralize the citation so the docs cannot drift.

Path to Approval

Replace the remaining unscoped PT-Post/CS-equivalence wording in the tutorial intro and chooser example with wording limited to post-treatment ATT(g,t) / post-treatment aggregations, and explicitly preserve the existing caveat that CS’s default pre-treatment/event-study outputs differ because of base_period="varying".

- Scope "same point estimates as CS" → "matching post-treatment ATT(g,t) with CS" - Scope "CS-equivalent" → "post-treatment CS match" in code comment - Remove "*Working Paper*" suffix from citations to match REGISTRY.md Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

igerber · 2026-03-15T16:21:59Z

/ai-review

github-actions · 2026-03-15T16:26:49Z

🔁 AI review rerun (requested by @igerber)

Head SHA: e13c358f1618f445d51eefe6c11aa142d874701d

Overall Assessment

⚠️ Needs changes

Executive Summary

Re-review: the prior P1 around PT-Post/Callaway-Sant’Anna equivalence wording appears resolved. The updated docs now scope the match to post-treatment outputs and keep the pre-treatment caveat in docs/tutorials/15_efficient_did.ipynb:L144 and docs/choosing_estimator.rst:L241-L244.
[Newly identified] The tutorial’s intuition section still misstates the PT-All comparison set by treating period_1 as an additional valid baseline. The registry, implementation, and tests all exclude period_1 from the overidentifying (g', t_pre) index set because Y_1 is already the universal reference.
The canonical EfficientDiD citation now matches the Methodology Registry in the new API page and tutorial summary, resolving the earlier citation-consistency note in docs/api/efficient_did.rst:L29-L30 and docs/tutorials/15_efficient_did.ipynb:L488.
I did not find code-quality, performance, maintainability, tech-debt, security, or test blockers in the changed files.

Methodology

Severity P1 [Newly identified]. Impact: docs/tutorials/15_efficient_did.ipynb:L66 says that for ATT(g=5, t=6), “periods 1, 2, 3 can all serve as valid baselines.” That is not the method implemented here. The registry explicitly notes that PT-All excludes period_1 from the candidate (g', t_pre) set because Y_1 is the universal reference, and the implementation/tests enforce exactly that rule in docs/methodology/REGISTRY.md:L607-L609, diff_diff/efficient_did_weights.py:L87-L99, and tests/test_efficient_did.py:L543-L583. This is an undocumented mismatch between the tutorial and the source-backed implementation. Concrete fix: rewrite that paragraph so it states that period_1 is the fixed Y_1 reference, while additional PT-All baselines start after period_1 and must satisfy t_pre < g' for the chosen comparison cohort.

Code Quality

No findings.

Performance

No findings.

Maintainability

No findings.

Tech Debt

No findings. This PR appears to address the tracked docs/tutorial item in TODO.md:L49.

Security

No findings.

Documentation/Tests

No findings.

Path to Approval

Correct the PT-All intuition paragraph in docs/tutorials/15_efficient_did.ipynb:L66 so it no longer presents period_1 as an extra valid baseline and instead describes the implemented valid-pair rule (Y_1 fixed as the universal reference; overidentifying t_pre values begin after period_1 and must be pre-treatment for the comparison cohort).

The tutorial incorrectly stated "periods 1, 2, 3 can all serve as valid baselines". Period 1 is the universal Y_1 reference and is excluded per REGISTRY.md and efficient_did_weights.py. Corrected to "periods 2 and 3". Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

igerber · 2026-03-15T17:09:36Z

/ai-review

github-actions · 2026-03-15T17:16:00Z

🔁 AI review rerun (requested by @igerber)

Head SHA: dba7d2f82f19ddf4200f0fdf44147517d1714771

Overall Assessment

✅ Looks good

Executive Summary

Re-review: the prior PT-All baseline wording issue is resolved. The tutorial now correctly treats period_1 as the fixed reference rather than an extra baseline in docs/tutorials/15_efficient_did.ipynb:L66, consistent with the registry note in docs/methodology/REGISTRY.md:L607.
I did not find any unmitigated P0/P1 issues in the changed EfficientDiD docs.
One P2 documentation-accuracy issue remains: the notebook describes the synthetic DGP as homogeneous with a constant true effect of 2.0, but the actual generator call uses dynamic treatment effects by default.
The new API/index entries are consistent with the exported public symbols for EfficientDiD, EfficientDiDResults, and EDiDBootstrapResults in diff_diff/init.py:L131-L135 and diff_diff/init.py:L240-L244.
I could not independently rerun the notebook in this environment because numpy is unavailable.

Methodology

Severity P2. Impact: the tutorial’s narrative for the three-estimator comparison is internally inconsistent with the code. It says the notebook uses a “known treatment effect of 2.0,” later describes the DGP as having “homogeneous effects,” and prints True effect = 2.0 in the comparison section, but the actual generate_staggered_data() calls do not override dynamic_effects=True / effect_growth=0.1, so treatment effects grow with event time rather than staying constant. See docs/tutorials/15_efficient_did.ipynb:L75-L85, docs/tutorials/15_efficient_did.ipynb:L401-L421, diff_diff/prep_dgp.py:L125-L162, and diff_diff/prep_dgp.py:L245-L251. Concrete fix: either rewrite the prose/headings to say the example uses common dynamic effects (for example, “effect starts at 2.0 and grows over time”), or set dynamic_effects=False anywhere the tutorial intends to demonstrate a constant-effect/homogeneous DGP.

Code Quality

No findings.

Performance

No findings.

Maintainability

No findings.

Tech Debt

No findings. This PR appears to address the tracked EfficientDiD docs/tutorial follow-up in TODO.md:L48-L49.

Security

No findings.

Documentation/Tests

No findings beyond the P2 tutorial-accuracy item above. Residual risk: I was not able to independently execute the new notebook in this review environment because numpy is not installed.

The tutorial describes a homogeneous DGP with "True effect = 2.0" but generate_staggered_data() defaults to dynamic_effects=True, making effects grow with event time. Add dynamic_effects=False to both calls so the DGP matches the prose. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

igerber · 2026-03-15T17:24:39Z

/ai-review

github-actions · 2026-03-15T17:29:39Z

🔁 AI review rerun (requested by @igerber)

Head SHA: 85b76f4fe676b4bd013f89578dd46add7b1fdbbe

Overall Assessment

✅ Looks good

Executive Summary

Re-review: the prior tutorial DGP mismatch is resolved. The notebook now explicitly sets dynamic_effects=False while describing a constant true effect of 2.0 in docs/tutorials/15_efficient_did.ipynb:L99, which matches the generator API in diff_diff/prep_dgp.py:L130 and diff_diff/prep_dgp.py:L158.
I did not find any unmitigated P0/P1 issues in the changed EfficientDiD docs.
The methodology-sensitive claims in the new docs are consistent with the Methodology Registry and implementation for PT-All vs PT-Post, PT-Post matching CS post-treatment ATT, multiplier bootstrap, and the Phase 1 no-covariates limitation in docs/methodology/REGISTRY.md:L493, docs/methodology/REGISTRY.md:L584, docs/methodology/REGISTRY.md:L607, and diff_diff/efficient_did.py:L42.
The new API/index entries are consistent with the exported public symbols in docs/api/index.rst:L26, docs/api/index.rst:L53, and diff_diff/__init__.py:L131.
Residual risk: I could not independently execute the notebook in this environment because numpy and pandas are unavailable.

Methodology

No findings. The new README/API/tutorial language aligns with the registry on PT-All/PT-Post, the period_1 exclusion and overidentifying moments note, multiplier bootstrap as the implemented inference path, and the Phase 1 no-covariates scope in docs/methodology/REGISTRY.md:L493, docs/methodology/REGISTRY.md:L584, docs/methodology/REGISTRY.md:L607, docs/api/efficient_did.rst:L7, and docs/tutorials/15_efficient_did.ipynb:L73.

Code Quality

No findings.

Performance

No findings.

Maintainability

No findings. The new public docs are consistent with the actual exported estimator/results surface in docs/api/index.rst:L26, docs/api/index.rst:L53, and diff_diff/__init__.py:L241.

Tech Debt

No findings. This PR appears to address the tracked EfficientDiD docs/tutorial follow-up in TODO.md:L49.

Security

No findings.

Documentation/Tests

No findings in the changed content.
Residual risk: notebook execution was not independently verified here because the review environment lacks numpy and pandas.

igerber merged commit bc23fa0 into main Mar 15, 2026

igerber deleted the efficient-did-notebook branch March 15, 2026 17:30

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add EfficientDiD documentation and tutorial notebook#199

Add EfficientDiD documentation and tutorial notebook#199
igerber merged 5 commits intomainfrom
efficient-did-notebook

igerber commented Mar 15, 2026

Uh oh!

github-actions bot commented Mar 15, 2026

Uh oh!

igerber commented Mar 15, 2026

Uh oh!

github-actions bot commented Mar 15, 2026

Uh oh!

igerber commented Mar 15, 2026

Uh oh!

github-actions bot commented Mar 15, 2026

Uh oh!

igerber commented Mar 15, 2026

Uh oh!

github-actions bot commented Mar 15, 2026

Uh oh!

igerber commented Mar 15, 2026

Uh oh!

github-actions bot commented Mar 15, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

igerber commented Mar 15, 2026

Summary

Methodology references (required if estimator / math changes)

Validation

Security / privacy

Uh oh!

github-actions bot commented Mar 15, 2026

Uh oh!

igerber commented Mar 15, 2026

Uh oh!

github-actions bot commented Mar 15, 2026

Uh oh!

igerber commented Mar 15, 2026

Uh oh!

github-actions bot commented Mar 15, 2026

Uh oh!

igerber commented Mar 15, 2026

Uh oh!

github-actions bot commented Mar 15, 2026

Uh oh!

igerber commented Mar 15, 2026

Uh oh!

github-actions bot commented Mar 15, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant