
Add EfficientDiD documentation and tutorial notebook #199

Merged
igerber merged 5 commits into main from efficient-did-notebook
Mar 15, 2026

Conversation

@igerber
Owner

@igerber igerber commented Mar 15, 2026

Summary

  • Create RST API page (docs/api/efficient_did.rst) with autoclass directives, usage examples, and estimator comparison table
  • Create tutorial notebook (docs/tutorials/15_efficient_did.ipynb) covering basic usage, PT-All vs PT-Post, efficiency gains Monte Carlo, event study visualization, group aggregation, bootstrap inference, diagnostics, anticipation, and three-way estimator comparison (EDiD vs CS vs ImputationDiD)
  • Integrate EfficientDiD into docs/api/index.rst (estimators, results classes, module toctree)
  • Add EfficientDiD to docs/choosing_estimator.rst (decision flowchart, quick reference table, detailed guidance section)
  • Add tutorial 15 listing to docs/tutorials/README.md
  • Add EDiD to README.md features bullet, tutorials table, and detailed estimator section
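
For orientation, the ATT(g,t) building block that these docs revolve around can be sketched in pure Python. This is an illustration of the estimand only, not diff_diff's API: under a post-treatment parallel-trends assumption with a fixed baseline at period g-1, ATT(g,t) compares outcome changes for cohort g against never-treated units.

```python
# Toy ATT(g, t) with a fixed g-1 baseline and never-treated controls.
# Pure-Python sketch of the estimand; not the library's implementation.
def att_gt(panel, g, t):
    """panel: list of dicts with 'cohort' (first treated period, or None
    for never-treated) and 'y' (dict mapping period -> outcome)."""
    base = g - 1  # fixed pre-treatment baseline (no anticipation assumed)
    treated = [u["y"][t] - u["y"][base] for u in panel if u["cohort"] == g]
    control = [u["y"][t] - u["y"][base] for u in panel if u["cohort"] is None]
    return sum(treated) / len(treated) - sum(control) / len(control)

# Toy panel: cohort-2 units gain 2.0 after treatment; controls are flat.
panel = [
    {"cohort": 2, "y": {1: 1.0, 2: 3.0}},
    {"cohort": 2, "y": {1: 2.0, 2: 4.0}},
    {"cohort": None, "y": {1: 1.0, 2: 1.0}},
    {"cohort": None, "y": {1: 2.0, 2: 2.0}},
]
print(att_gt(panel, g=2, t=2))  # 2.0
```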

Methodology references (required if estimator / math changes)

  • Method name(s): N/A - no methodology changes (documentation only)
  • Paper / source link(s): N/A
  • Any intentional deviations from the source (and why): None

Validation

  • Tests added/updated: No test changes (documentation only)
  • Backtest / simulation / notebook evidence (if applicable): Tutorial notebook validated via jupyter nbconvert --execute; all cells pass

Security / privacy

  • Confirm no secrets/PII in this PR: Yes

Generated with Claude Code

- Create docs/api/efficient_did.rst with API reference, examples, and
  comparison table (EDiD vs CS vs ImputationDiD)
- Create docs/tutorials/15_efficient_did.ipynb covering basic usage,
  PT-All vs PT-Post, efficiency gains, event study, bootstrap, diagnostics,
  anticipation, and three-way estimator comparison
- Add EfficientDiD to docs/api/index.rst (estimators, results, toctree)
- Add EfficientDiD to docs/choosing_estimator.rst (flowchart, table,
  detailed guidance section)
- Add tutorial 15 to docs/tutorials/README.md
- Add EDiD to README.md features, tutorials table, and estimator section

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@github-actions

Overall Assessment

⚠️ Needs changes

Executive Summary

  • This is a documentation-only PR, but the main new docs/tutorial content for EfficientDiD overstates the scope of PT-Post equivalence to Callaway-Sant’Anna.
  • The blocker is methodological: multiple files say pt_assumption="post" is simply “= CS”, while the current implementation only aligns cleanly with the never-treated, post-treatment CS path; CS’s default pre-treatment/event-study diagnostics use different base-period logic.
  • The tutorial also imports “few clusters” bootstrap guidance into EfficientDiD, even though clustered inference is explicitly not implemented for this estimator.
  • The new API page and notebook use a different bibliographic citation than the Methodology Registry’s canonical EfficientDiD reference.
  • I did not find code, security, or performance regressions in the changed files. The notebook JSON parses and its code cells compile syntactically.
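
The syntactic check mentioned in the last bullet can be sketched as follows. This is a minimal stand-in for that kind of validation, not the reviewer's actual tooling; note that cells containing IPython magics (e.g. %matplotlib) would need stripping before compile() would accept them.

```python
import json

def notebook_code_compiles(nb_json: str) -> bool:
    """Parse notebook JSON and syntax-check every code cell.

    Mirrors a lightweight 'JSON parses and code cells compile' check:
    no kernel execution, so heavy dependencies are not required."""
    nb = json.loads(nb_json)
    for cell in nb.get("cells", []):
        if cell.get("cell_type") == "code":
            source = "".join(cell.get("source", []))
            compile(source, "<cell>", "exec")  # raises SyntaxError on bad code
    return True

# Tiny in-memory notebook with one markdown and one code cell.
nb = json.dumps({"cells": [
    {"cell_type": "markdown", "source": ["# Title"]},
    {"cell_type": "code", "source": ["x = 1\n", "print(x)\n"]},
]})
print(notebook_code_compiles(nb))  # True
```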

Methodology

Code Quality

  • No findings in the changed files.

Performance

  • No findings in the changed files.

Maintainability

  • No findings in the changed files.

Tech Debt

  • No new tech-debt blocker. This PR appears to address the existing tracked docs/tutorial item at TODO.md:49.

Security

  • No findings. I did not see secrets, tokens, or PII introduced in the changed files.

Documentation/Tests

  • Severity P3. Impact: the new EfficientDiD API page and notebook use a different bibliographic record than the Methodology Registry: docs/api/efficient_did.rst:29, docs/tutorials/15_efficient_did.ipynb:574, versus the canonical registry citation at docs/methodology/REGISTRY.md:483. Concrete fix: use the registry citation consistently across the new docs/notebook.
  • No additional test finding on the diff itself. The notebook JSON is valid and its code cells compile, but I could not execute it in this review environment because numpy is not installed.

Path to Approval

  1. Replace the blanket = CS / “equivalent to Callaway-Sant’Anna” language in the README, API page, estimator guide, and notebook with scoped wording that is true for the current implementation: post-treatment equivalence to CallawaySantAnna(control_group="never_treated"), not universal equivalence across pre-trend/event-study outputs.
  2. Update the tutorial comparison narrative/output so it does not promise that PT-All has “the same ATT” as CS in finite samples; phrase that as targeting the same estimand or compare only the post-treatment quantities the implementation actually matches.
  3. Remove or qualify the “Webb for very few clusters” guidance in the EfficientDiD bootstrap tutorial until clustered inference exists for this estimator.
  4. Align the EfficientDiD reference text in the API page and notebook with the canonical citation in docs/methodology/REGISTRY.md.
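
As background to item 3, the multiplier-bootstrap weight schemes named in the tutorial (Rademacher, Mammen, Webb) are standard distributions with mean zero and unit variance. A pure-Python sketch of the draws follows; treating these scheme names as library options is an assumption here, though the weight values themselves are the textbook ones.

```python
import math
import random

def draw_weight(scheme: str, rng: random.Random) -> float:
    """Draw one multiplier-bootstrap weight (mean 0, variance 1)."""
    if scheme == "rademacher":
        return rng.choice([-1.0, 1.0])
    if scheme == "mammen":
        golden = (math.sqrt(5) + 1) / 2
        p = (math.sqrt(5) + 1) / (2 * math.sqrt(5))  # P(w = 1 - golden)
        return (1 - golden) if rng.random() < p else golden
    if scheme == "webb":
        # Webb's six-point distribution, each value with probability 1/6.
        return rng.choice([-math.sqrt(1.5), -1.0, -math.sqrt(0.5),
                           math.sqrt(0.5), 1.0, math.sqrt(1.5)])
    raise ValueError(f"unknown scheme: {scheme}")

rng = random.Random(0)
ws = [draw_weight("webb", rng) for _ in range(10_000)]
print(abs(sum(ws) / len(ws)) < 0.05)  # sample mean is approximately zero
```

Webb's extra support points are what make it attractive with very few clusters, which is exactly why the review flags that advice as out of place for an estimator without clustered inference.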

- P1: Scope PT-Post/CS equivalence to post-treatment ATT(g,t) across all
  doc files (rst, README, notebook) — pre-treatment diagnostics may differ
- P2: Remove inappropriate cluster guidance from Webb bootstrap description
- P3: Fix citation initials (Chen X., Xie H.) and add full paper title

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@igerber
Owner Author

igerber commented Mar 15, 2026

/ai-review

@github-actions

🔁 AI review rerun (requested by @igerber)

Head SHA: 41d6c846e7ec202c56299d8f1b790090004b230f


Overall Assessment

⚠️ Needs changes

Executive Summary

  • Re-review: the prior cluster/Webb guidance issue appears resolved. The new tutorial now describes multiplier-bootstrap weight options without importing unsupported clustered-inference advice for EfficientDiD.
  • Re-review: most of the earlier PT-Post/Callaway-Sant’Anna equivalence overstatement has been corrected, but two residual phrases still present PT-Post as generally “CS-equivalent.”
  • That is still broader than the documented and tested behavior: PT-Post in this library fixes the baseline at g-1-anticipation, CallawaySantAnna still defaults to base_period="varying", and the exact-match regression test only asserts post-treatment ATT(g,t) equality.
  • The new API page and notebook still use a citation variant that does not match the Methodology Registry’s canonical EfficientDiD reference.
  • I did not find code quality, performance, maintainability, tech-debt, or security blockers in the changed files.
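
The base-period difference described above can be illustrated in pure Python. This is a conceptual sketch, not the library's code: a fixed scheme anchors every comparison at g - 1 - anticipation (PT-Post style), while a "varying" scheme (the CallawaySantAnna default) moves the baseline with t in pre-treatment periods, so pre-trend diagnostics differ even when post-treatment estimates agree.

```python
def base_period(t: int, g: int, scheme: str, anticipation: int = 0) -> int:
    """Baseline period used when estimating ATT(g, t) (illustrative only)."""
    if scheme == "fixed":      # one universal pre-treatment baseline
        return g - 1 - anticipation
    if scheme == "varying":    # pre-treatment comparisons use t - 1
        return t - 1 if t < g else g - 1 - anticipation
    raise ValueError(f"unknown scheme: {scheme}")

g = 5
for t in (2, 3, 4, 6):
    print(t, base_period(t, g, "fixed"), base_period(t, g, "varying"))
# Pre-treatment rows (t = 2, 3, 4) differ; the post-treatment row (t = 6) agrees.
```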

Methodology

Code Quality

  • No findings.

Performance

  • No findings.

Maintainability

  • No findings.

Tech Debt

  • No findings. This PR appears to address the tracked EfficientDiD docs/tutorial item in TODO.md:L49.

Security

  • No findings.

Documentation/Tests

Path to Approval

  1. Replace the remaining unscoped PT-Post/CS-equivalence wording in the tutorial intro and chooser example with wording limited to post-treatment ATT(g,t) / post-treatment aggregations, and explicitly preserve the existing caveat that CS’s default pre-treatment/event-study outputs differ because of base_period="varying".

- Scope "same point estimates as CS" → "matching post-treatment ATT(g,t) with CS"
- Scope "CS-equivalent" → "post-treatment CS match" in code comment
- Remove "*Working Paper*" suffix from citations to match REGISTRY.md

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@igerber
Owner Author

igerber commented Mar 15, 2026

/ai-review

@github-actions

🔁 AI review rerun (requested by @igerber)

Head SHA: e13c358f1618f445d51eefe6c11aa142d874701d


Overall Assessment

⚠️ Needs changes

Executive Summary

  • Re-review: the prior P1 around PT-Post/Callaway-Sant’Anna equivalence wording appears resolved. The updated docs now scope the match to post-treatment outputs and keep the pre-treatment caveat in docs/tutorials/15_efficient_did.ipynb:L144 and docs/choosing_estimator.rst:L241-L244.
  • [Newly identified] The tutorial’s intuition section still misstates the PT-All comparison set by treating period_1 as an additional valid baseline. The registry, implementation, and tests all exclude period_1 from the overidentifying (g', t_pre) index set because Y_1 is already the universal reference.
  • The canonical EfficientDiD citation now matches the Methodology Registry in the new API page and tutorial summary, resolving the earlier citation-consistency note in docs/api/efficient_did.rst:L29-L30 and docs/tutorials/15_efficient_did.ipynb:L488.
  • I did not find code-quality, performance, maintainability, tech-debt, security, or test blockers in the changed files.

Methodology

  • Severity P1 [Newly identified]. Impact: docs/tutorials/15_efficient_did.ipynb:L66 says that for ATT(g=5, t=6), “periods 1, 2, 3 can all serve as valid baselines.” That is not the method implemented here. The registry explicitly notes that PT-All excludes period_1 from the candidate (g', t_pre) set because Y_1 is the universal reference, and the implementation/tests enforce exactly that rule in docs/methodology/REGISTRY.md:L607-L609, diff_diff/efficient_did_weights.py:L87-L99, and tests/test_efficient_did.py:L543-L583. This is an undocumented mismatch between the tutorial and the source-backed implementation. Concrete fix: rewrite that paragraph so it states that period_1 is the fixed Y_1 reference, while additional PT-All baselines start after period_1 and must satisfy t_pre < g' for the chosen comparison cohort.
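
The valid-pair rule in this finding can be sketched as a tiny helper. The rule is inferred from the review text, not taken from diff_diff's code: period 1 is the fixed Y_1 reference, so overidentifying pre-periods start at 2 and must be strictly before the comparison cohort's treatment date, i.e. {2, ..., g' - 1}.

```python
def overidentifying_pre_periods(comparison_cohort: int) -> list:
    """Candidate extra baselines under PT-All for a comparison cohort g'.

    Period 1 is excluded because Y_1 is the universal reference; the
    remaining candidates must satisfy t_pre < g' (pre-treatment for g')."""
    return list(range(2, comparison_cohort))

print(overidentifying_pre_periods(4))  # [2, 3] -- period 1 is excluded
```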

Code Quality

No findings.

Performance

No findings.

Maintainability

No findings.

Tech Debt

No findings. This PR appears to address the tracked docs/tutorial item in TODO.md:L49.

Security

No findings.

Documentation/Tests

No findings.

Path to Approval

  1. Correct the PT-All intuition paragraph in docs/tutorials/15_efficient_did.ipynb:L66 so it no longer presents period_1 as an extra valid baseline and instead describes the implemented valid-pair rule (Y_1 fixed as the universal reference; overidentifying t_pre values begin after period_1 and must be pre-treatment for the comparison cohort).

The tutorial incorrectly stated "periods 1, 2, 3 can all serve as valid
baselines". Period 1 is the universal Y_1 reference and is excluded per
REGISTRY.md and efficient_did_weights.py. Corrected to "periods 2 and 3".

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@igerber
Owner Author

igerber commented Mar 15, 2026

/ai-review

@github-actions

🔁 AI review rerun (requested by @igerber)

Head SHA: dba7d2f82f19ddf4200f0fdf44147517d1714771


Overall Assessment

✅ Looks good

Executive Summary

  • Re-review: the prior PT-All baseline wording issue is resolved. The tutorial now correctly treats period_1 as the fixed reference rather than an extra baseline in docs/tutorials/15_efficient_did.ipynb:L66, consistent with the registry note in docs/methodology/REGISTRY.md:L607.
  • I did not find any unmitigated P0/P1 issues in the changed EfficientDiD docs.
  • One P2 documentation-accuracy issue remains: the notebook describes the synthetic DGP as homogeneous with a constant true effect of 2.0, but the actual generator call uses dynamic treatment effects by default.
  • The new API/index entries are consistent with the exported public symbols for EfficientDiD, EfficientDiDResults, and EDiDBootstrapResults in diff_diff/init.py:L131-L135 and diff_diff/init.py:L240-L244.
  • I could not independently rerun the notebook in this environment because numpy is unavailable.

Methodology

  • Severity P2. Impact: the tutorial’s narrative for the three-estimator comparison is internally inconsistent with the code. It says the notebook uses a “known treatment effect of 2.0,” later describes the DGP as having “homogeneous effects,” and prints True effect = 2.0 in the comparison section, but the actual generate_staggered_data() calls do not override dynamic_effects=True / effect_growth=0.1, so treatment effects grow with event time rather than staying constant. See docs/tutorials/15_efficient_did.ipynb:L75-L85, docs/tutorials/15_efficient_did.ipynb:L401-L421, diff_diff/prep_dgp.py:L125-L162, and diff_diff/prep_dgp.py:L245-L251. Concrete fix: either rewrite the prose/headings to say the example uses common dynamic effects (for example, “effect starts at 2.0 and grows over time”), or set dynamic_effects=False anywhere the tutorial intends to demonstrate a constant-effect/homogeneous DGP.
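
The mismatch can be made concrete with a toy effect function. The linear form base + growth * event_time is an assumed illustration of "effects grow with event time"; the real DGP formula lives in diff_diff/prep_dgp.py and may differ.

```python
def true_effect(event_time: int, base: float = 2.0, growth: float = 0.1,
                dynamic: bool = True) -> float:
    """True treatment effect at a post-treatment event time (>= 0).

    With dynamic effects the effect depends on event time, so a prose claim
    of a constant 'true effect = 2.0' only holds in the dynamic=False case."""
    return base + growth * event_time if dynamic else base

print([round(true_effect(e), 2) for e in range(4)])                # [2.0, 2.1, 2.2, 2.3]
print([round(true_effect(e, dynamic=False), 2) for e in range(4)]) # [2.0, 2.0, 2.0, 2.0]
```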

Code Quality

No findings.

Performance

No findings.

Maintainability

No findings.

Tech Debt

No findings. This PR appears to address the tracked EfficientDiD docs/tutorial follow-up in TODO.md:L48-L49.

Security

No findings.

Documentation/Tests

No findings beyond the P2 tutorial-accuracy item above. Residual risk: I was not able to independently execute the new notebook in this review environment because numpy is not installed.

The tutorial describes a homogeneous DGP with "True effect = 2.0" but
generate_staggered_data() defaults to dynamic_effects=True, making
effects grow with event time. Add dynamic_effects=False to both calls
so the DGP matches the prose.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@igerber
Owner Author

igerber commented Mar 15, 2026

/ai-review

@github-actions

🔁 AI review rerun (requested by @igerber)

Head SHA: 85b76f4fe676b4bd013f89578dd46add7b1fdbbe


Overall Assessment

✅ Looks good

Executive Summary

Methodology

Code Quality

  • No findings.

Performance

  • No findings.

Maintainability

Tech Debt

  • No findings. This PR appears to address the tracked EfficientDiD docs/tutorial follow-up in TODO.md:L49.

Security

  • No findings.

Documentation/Tests

  • No findings in the changed content.
  • Residual risk: notebook execution was not independently verified here because the review environment lacks numpy and pandas.

@igerber igerber merged commit bc23fa0 into main Mar 15, 2026
@igerber igerber deleted the efficient-did-notebook branch March 15, 2026 17:30