HeterogeneousAdoptionDiD methodology-review-tracker promotion: In Progress -> Complete #473

Merged
igerber merged 14 commits into main from feature/had-methodology-review-tracker-promotion
May 20, 2026

Owner

@igerber igerber commented May 20, 2026

Summary

Promotes the HeterogeneousAdoptionDiD (HAD) row in METHODOLOGY_REVIEW.md from In Progress to Complete for the de Chaisemartin, Ciccia, D'Haultfœuille & Knau (2026) Weighted-Average-Slope estimator (arXiv:2405.04465v6).

  • New tests/test_methodology_had.py (6 classes, 36 tests) with paper-equation-numbered Verified Components walk-through covering Equations 3 / 7 / 11 / 18 / 29 and Theorems 1 / 3 / 4 / 7. Key fixtures: Eq. 3 boundary-subtracted recovery on both zero-boundary and nonzero-boundary-intercept DGPs (locking the att = (mean(ΔY) - τ_bc) / mean(D) subtraction); Eq. 11 mass-point Wald-IV closed-form equivalence at atol=1e-9; Theorem 4 QUG distributional match against closed-form F(t)=t/(1+t) at KS-stat ≤ 0.05, n_draws=5000; Eq. 29 paper-literal σ²_diff=1/(2G) normalization lock; joint Stute H0 fail-to-reject on both pre-trends and homogeneity surfaces plus H1 reject for joint homogeneity under a nonlinear (D + D²) DGP; library-deviation locks (equal-weighting via selective low-dose region replication, sup-t bootstrap gating, staggered-timing fail-closed ValueError, Assumption 5/6 UserWarning lock, safe_inference joint-NaN invariant, last-cohort auto-filter via result.filter_info).
  • Added "Non-testable assumptions (paper Section 3.1.2)" Notes block to the HeterogeneousAdoptionDiD class docstring + "Scope (what this test does NOT cover)" clauses to qug_test / stute_test / yatchew_hr_test / did_had_pretest_workflow Notes sections, explicitly stating QUG tests the support-infimum null d_lower=0 (adjacent evidence on one clause of Assumption 4 only); stute_test / yatchew_hr_test target Assumption 8 linearity; joint_pretrends_test targets Assumption 7 mean-independence; none test Assumptions 5 or 6 directly. Reinforced by the existing fit-time UserWarning on Design 1 family paths.
  • Updated the fit() docstring's staggered-timing contract to explicitly document both branches: first_treat_col supplied → auto-filter to last-cohort + never-treated with UserWarning per Appendix B.2; omitted on multi-cohort panel → fail-closed ValueError. Cross-referenced REGISTRY Deviations § "Library extension: Staggered-timing fail-closed" for the rationale.
  • Phase-4 validation-harness items (Pierce-Schott 2016 Figure 2 replication, Table 1 coverage-rate reproduction across 3 DGPs × G ∈ {100, 500, 2500}) waived with documented rationale: R parity at atol=1e-8 in tests/test_did_had_parity.py (3 DGPs × 5 method combos, bit-exact via rtol=0) is a strictly stronger correctness anchor than coverage-rate Monte Carlo. The paper itself acknowledges (Section 5.2) that NP estimators are too noisy on the LBD-restricted PNTR panel.
  • REGISTRY HAD section gains a consolidated Deviations block (5 entries with framing header categorizing implementation choices vs validation-harness waivers vs library extension). 2 of 3 unchecked Implementation Checklist items closed (staggered fail-closed + Assumption 5/6 docs); covariates= Theorem 6 follow-up and the extensive-margin "consider running standard DiD" main-fit() warning explicitly tracked in TODO.md as Low-priority follow-ups. Paper-review checklist L182-194 closes Phase 1a/1b/1c implementation-status items plus the Assumption 5/6 documentation closure; the extensive-margin item is left explicitly open (partial coverage).
  • 1,137 implementation-detail tests across tests/test_had.py / test_had_pretests.py / test_had_mc.py / test_had_dual_knob_deprecation.py remain unchanged; 5 R-direct parity tests in test_did_had_parity.py at atol=1e-8 are the documented R-parity anchor; nprobust port + bias-corrected port tests at machine precision (atol=1e-12 / 1e-14) cover Eq. 7 separately.
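
The `atol=1e-8` with `rtol=0` parity contract cited in the bullets above amounts to a purely absolute comparison. A minimal sketch using only the standard library; the `parity_ok` helper and the sample values are hypothetical and are not taken from tests/test_did_had_parity.py:

```python
import math


def parity_ok(python_value: float, r_value: float, atol: float = 1e-8) -> bool:
    """Absolute-only comparison: rel_tol=0 disables magnitude scaling."""
    return math.isclose(python_value, r_value, rel_tol=0.0, abs_tol=atol)


# Hypothetical estimates, for illustration only.
print(parity_ok(0.300000001, 0.300000004))  # difference of 3e-9 is within atol
print(parity_ok(1234.5678, 1234.5679))      # difference of 1e-4 is outside atol
```

With `rel_tol=0.0` the tolerance does not grow with the magnitude of the estimate, so large and small values are held to the same absolute bound.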

Methodology references (required if estimator / math changes)

  • Method name(s): HeterogeneousAdoptionDiD, qug_test, stute_test, yatchew_hr_test, joint_pretrends_test, joint_homogeneity_test, did_had_pretest_workflow.
  • Paper / source link(s): de Chaisemartin, C., Ciccia, D., D'Haultfœuille, X., & Knau, F. (2026). Difference-in-Differences Estimators When No Unit Remains Untreated. arXiv:2405.04465v6. Paper review on file: docs/methodology/papers/dechaisemartin-2026-review.md. R reference: chaisemartin::did_had (Credible-Answers/did_had v2.0.0, SHA edc09197); nprobust v0.5.0 (Calonico-Cattaneo-Farrell) for bandwidth selection.
  • Any intentional deviations from the source (and why): documented in REGISTRY ## HeterogeneousAdoptionDiD § "Deviations and library extensions" (5 entries):
    1. Equal-weighting on the continuous path (paper-permitted implementation choice; matches _nprobust_port.lprobust default).
    2. Sup-t bootstrap gating to aggregate="event_study" + (weights= or survey_design=) + cband=True (stability invariant; unweighted event-study bit-exactly preserves pre-Phase 4.5 B output).
    3. Pierce-Schott (2016) Figure 2 replication harness waived — R parity at atol=1e-8 is stronger; paper self-acknowledges NP estimators too noisy on LBD-restricted panel.
    4. Table 1 coverage-rate reproduction waived — R parity covers the same 3 DGPs at stricter tolerance than the asymptotic-coverage MC.
    5. Staggered-timing fail-closed ValueError (paper prescribes "Warn"; library raises — stricter-safety library extension to prevent silent misuse of the last-cohort-only identification).

Validation

  • Tests added/updated: NEW tests/test_methodology_had.py (6 classes, 36 tests, ~960 LoC). All 36 pass; full HAD test sweep (test_methodology_had.py + test_had.py + test_had_pretests.py + test_did_had_parity.py + T20/T21/T22 drift tests) reports 665 passed, 2 skipped, 0 failures.
  • Backtest / simulation / notebook evidence (if applicable): no tutorial changes; T20/T21/T22 drift tests verify no regression in HAD tutorial-pinned outputs.

Security / privacy

  • Confirm no secrets/PII in this PR: Yes

Generated with Claude Code

igerber and others added 11 commits May 20, 2026 07:24
Add tests/test_methodology_had.py (6 classes, 34 tests) with paper-
equation-numbered Verified Components walk-through against de
Chaisemartin, Ciccia, D'Haultfoeuille & Knau (2026) arXiv:2405.04465v6
covering Equations 3 / 7 / 11 / 18 / 29 and Theorems 1 / 3 / 4 / 7:

- TestHADTheorem1Design1Prime: Eq. 3 Design 1' WAS recovery + N(0,1)
  coverage check at n_replicates=200, G=1000 with KS-stat <= 0.05 and
  empirical 95% coverage >= 0.90
- TestHADTheorem3MassPoint: Eq. 11 / Theorem 3 mass-point WAS_{d_lower}
  recovery + Wald-IV closed-form equivalence at atol=1e-9
- TestHADTheorem4QUG: Theorem 4 limit-law distributional match against
  closed-form F(t) = t/(1+t) at KS-stat <= 0.05, n_draws=5000, G=2000
- TestHADTheorem7YatchewHR: Eq. 29 standard-normal limit, paper-literal
  sigma2_diff = 1/(2G) normalization lock
- TestHADJointStute: Section 4.2 step 2 + 4.3 mean-independence variant
  H0 fail-to-reject + H1 reject under nonlinear DGP
- TestHADDeviations: equal-weighting invariance, sup-t bootstrap gating,
  staggered-timing fail-closed ValueError, safe_inference joint NaN
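
The Theorem 4 check above compares simulated draws with the closed form F(t) = t/(1+t). That function is the CDF of the ratio of two independent Exp(1) variables, so the KS comparison can be sketched without the library. This is an illustration only: it does not call `qug_test`, the helper name is hypothetical, and it assumes the limit law can be simulated as such a ratio.

```python
import random


def ks_stat_against_ratio_cdf(n_draws: int, seed: int = 0) -> float:
    """One-sample KS distance between E1/E2 draws and F(t) = t / (1 + t)."""
    rng = random.Random(seed)
    draws = sorted(
        rng.expovariate(1.0) / rng.expovariate(1.0) for _ in range(n_draws)
    )
    worst = 0.0
    for i, t in enumerate(draws):
        cdf = t / (1.0 + t)
        # Compare the closed-form CDF with the empirical CDF just before and at t.
        worst = max(worst, abs(cdf - i / n_draws), abs((i + 1) / n_draws - cdf))
    return worst


ks = ks_stat_against_ratio_cdf(n_draws=5000)
print(f"KS statistic: {ks:.4f}")
```

At 5,000 draws the KS distance for a correctly specified distribution is typically on the order of 0.01 to 0.02, which is why a 0.05 ceiling leaves headroom for Monte Carlo noise.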

Add Assumption 5/6 non-testability documentation:
- HeterogeneousAdoptionDiD class docstring: new "Non-testable assumptions
  (paper Section 3.1.2)" Notes block citing Section 3.1.2 + cross-
  referencing the existing fit-time UserWarning at had.py:3372-3390
- qug_test / stute_test / yatchew_hr_test / did_had_pretest_workflow:
  "Scope (what this test does NOT cover)" clauses in Notes sections
  explicitly stating tests verify ADJACENT assumptions (4 / 7 / 8) and
  CANNOT test Assumptions 5 or 6

Close paper-review checklist L182-L194 + REGISTRY HAD Implementation
Checklist L2602-L2604: Phase 1a/1b/1c implementation closures (panel
validator, design paths, local-linear backend, bias-corrected CI),
staggered-timing fail-closed ValueError, zero-dose UserWarning filter,
Assumption 5/6 non-testability documentation. L2604 (covariates=
Theorem 6 NotImplementedError) remains [ ] with explicit TODO.md
cross-reference (currently a Python TypeError, fail-closed).

Waive Phase-4 validation-harness items #1 (Pierce-Schott 2016 Figure 2)
+ #2 (Table 1 coverage rates) with documented rationale: R parity at
atol=1e-8 in test_did_had_parity.py (3 DGPs x 5 method combos, bit-exact
via rtol=0) is a strictly stronger correctness anchor than coverage-rate
MC. Paper Section 5.2 itself acknowledges NP estimators are too noisy
to be informative on the LBD-restricted PNTR panel.

REGISTRY HAD section gains a consolidated Deviations block (5 entries
with framing header distinguishing Notes #1-#2 = implementation choices
from Notes #3-#4 = waived validation-harness work from #5 = Library
extension for staggered-timing fail-closed). Existing scattered Note
entries at L2313 (equal-weighting) and L2398 (sup-t gating) referenced
from the new block.

METHODOLOGY_REVIEW.md HAD row promoted In Progress -> Complete, detail
section rewritten with Verified Components / Test Coverage / Corrections
Made / Deviations / Outstanding Concerns structure mirroring the Bacon /
TripleDifference Complete-row layout.

TODO.md: existing Phase 4 Pierce-Schott row annotated with the 2026-05-20
waiver decision + rationale; new follow-up row for covariates= Theorem 6
NotImplementedError + Theorem 6 pointer (Low priority).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…test

- P2 (Maintainability): fit() docstring on first_treat_col and aggregate="event_study"
  conflated two staggered-timing branches. Now explicitly documents both: supplied →
  auto-filter + UserWarning; omitted → fail-closed ValueError + DCDH redirect. Keeps
  Appendix B.2 wording aligned with the REGISTRY Library extension #5 note.

- P2 (Documentation/Tests): rebuilt the equal-weighting deviation test. Old test
  duplicated the entire panel uniformly — invariant under both equal and cell-size
  weighting. New test (test_equal_weighting_is_per_row_not_per_dose_cell) replicates
  only low-D units (D <= 0.15) 4x on a nonlinear DGP (delta_Y = 0.5*D + 1.0*D²) and
  asserts the att shifts by > 1.5*max(se) AND moves downward. Per-row equal weighting
  predicts the shift; cell-size weighting (counterfactual) would predict att invariant.

- P2 (Methodology): downgraded the paper-review L191 closure note ("Warnings for
  extensive-margin effects"). Original text overclaimed REGISTRY had a "suggests
  running existing DiD" recommendation that does not exist. Now describes the
  actual library state: qug_test surfaces zero-dose UserWarning; explicit
  main-path "fall back to DiD" recommendation is a Low-priority follow-up.

- P3 (line refs): swapped hard-coded "had.py:3372-3390" references to a search
  string ("---- Assumption 5/6 warning on Design 1 paths ----") so they survive
  future docstring edits. 3 surfaces updated: METHODOLOGY_REVIEW, REGISTRY,
  paper review.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
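
The logic of the rebuilt equal-weighting test can be illustrated with a toy calculation. The sketch below is not the HAD estimator: it uses a noise-free version of the commit's DGP and the simple ratio mean(dY) / mean(D) as a stand-in, and every name in it is hypothetical. It shows only the direction of the effect: replicating low-dose rows moves a per-row average downward, while averaging once per distinct dose leaves it unchanged.

```python
def ratio_estimate(rows):
    """Toy proxy: mean(dY) / mean(D) with every row weighted equally."""
    mean_d = sum(d for d, _ in rows) / len(rows)
    mean_dy = sum(dy for _, dy in rows) / len(rows)
    return mean_dy / mean_d


def make_rows(n_units=1000):
    # Noise-free nonlinear DGP from the commit message: dY = 0.5*D + 1.0*D**2.
    doses = [(i + 0.5) / n_units for i in range(n_units)]
    return [(d, 0.5 * d + 1.0 * d ** 2) for d in doses]


base = make_rows()
# Each low-dose row (D <= 0.15) appears 4 times; all other rows appear once.
replicated = [row for row in base for _ in range(4 if row[0] <= 0.15 else 1)]

per_row_before = ratio_estimate(base)
per_row_after = ratio_estimate(replicated)
# Counterfactual: collapse to one row per distinct dose before averaging.
per_cell_after = ratio_estimate(sorted(set(replicated)))

print(per_row_before, per_row_after, per_cell_after)
```

Uniform duplication of the whole panel would leave both averages unchanged, which is why the old test could not distinguish the two weighting schemes.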
- P1 (Methodology): Eq. 3 / Theorem 1 was previously written as the simplified
  WAS = E[ΔY] / E[D] in test docstring + METHODOLOGY_REVIEW.md. The paper and
  the in-code HAD docs use the boundary-subtracted form WAS = [E(ΔY) - lim_{d↓0}
  E(ΔY | D ≤ d)] / E(D); the library implements
  att = (mean(ΔY) - τ_bc) / mean(D). Old DGP set τ_bc ~ 0 so the subtraction
  term was untested. Fix:

  - Restated Eq. 3 in test_methodology_had.py module + class docstrings,
    METHODOLOGY_REVIEW.md, and REGISTRY Deviations Note #1.
  - Added boundary_intercept kwarg to _make_two_period_panel so DGP can be
    parameterized with delta_Y = c + β*D + ε (c != 0).
  - New test_eq3_was_recovery_nonzero_boundary_intercept: c=0.2, β=0.3 →
    att should recover 0.3 (not 0.7 = 0.35/0.5, the wrong-formula answer).
    Test passes locally; explicit anti-guard against the no-subtraction
    failure mode (abs(att - 0.7) > 5 * se).

- P3 (Maintainability): METHODOLOGY_REVIEW.md cited the fit-time UserWarning
  as inside _fit_continuous / _fit_mass_point_2sls. Actual emission point is
  the outer HeterogeneousAdoptionDiD.fit() dispatch (search anchor preserved).
  Also updated the equal-weighting test reference to the new test name.

- P3 (Tech Debt): paper-review L191 (extensive-margin warning) was marked [x]
  but described as partial / unimplemented. Flipped to [ ] with a status note
  pointing to TODO.md; added a corresponding follow-up row in TODO.md for the
  fit-time "consider running standard DiD" warning.

All 35 methodology tests pass; full HAD sweep clean (664 passed).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
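
The arithmetic behind the nonzero-boundary-intercept fixture can be reproduced by hand. The sketch below is a noise-free illustration with doses spread evenly on (0, 1), which is an assumption about the DGP. It takes the boundary limit τ_bc as known (equal to c) instead of estimating it, and the function name is hypothetical.

```python
def was_estimates(c=0.2, beta=0.3, n_units=1000):
    """Return (no-subtraction ratio, boundary-subtracted ratio) for dY = c + beta*D."""
    doses = [(i + 0.5) / n_units for i in range(n_units)]  # evenly spread on (0, 1)
    delta_y = [c + beta * d for d in doses]                # noise-free outcome change
    mean_d = sum(doses) / n_units
    mean_dy = sum(delta_y) / n_units
    tau_bc = c  # assumption: boundary limit of E[dY | D <= d] is known, not estimated
    naive = mean_dy / mean_d
    subtracted = (mean_dy - tau_bc) / mean_d
    return naive, subtracted


naive, subtracted = was_estimates()
print(f"without subtraction: {naive:.3f}, with subtraction: {subtracted:.3f}")
```

With c = 0.2 and β = 0.3 the means are E(ΔY) = 0.35 and E(D) = 0.5, so the unsubtracted ratio is 0.7 and the boundary-subtracted ratio is 0.3, matching the values quoted in the commit message.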
…im accuracy

- Test file L42 class-structure bullet still summarized Theorem 1 as the
  simplified WAS = E[delta_Y] / E[D] shorthand. Rewritten to describe the
  boundary-subtracted identification + both DGP variants exercised.

- paper-review L193 (multi-period event-study closure) still said staggered
  panels auto-filter to last cohort with UserWarning. Updated to align with
  L190 / the implementation: auto-filter only when first_treat_col supplied;
  ValueError when omitted.

- METHODOLOGY_REVIEW.md test counts updated: 35 methodology tests (was 34;
  added test_eq3_was_recovery_nonzero_boundary_intercept in R2). T21 drift
  17 (was 16); T22 drift 32 (was 28); T20 drift 14 (was unspecified).

- CHANGELOG bullet reworded: was "closes the 3 unchecked Implementation
  Checklist items at L2684-L2686" which overclaimed. Now: "closes 2 of 3
  (staggered fail-closed + Assumption 5/6 docs); covariates= Theorem 6 and
  extensive-margin warning explicitly tracked in TODO.md as follow-ups."
  Boundary-subtracted DGP variant explicitly named in the bullet.

All 35 methodology tests pass.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…unts

- P2 (Methodology): the new Scope notes claimed QUG "targets Assumption 4
  boundary density". The paper's Assumption 4 is broader (positive boundary
  density + twice-differentiable conditional mean + continuous-positive
  conditional variance + bandwidth regularity). QUG / Theorem 4 actually
  tests only the support-infimum null d_lower = 0, which is one clause of
  Assumption 4. Reworded in 4 surfaces: qug_test Notes, did_had_pretest_workflow
  Notes, HeterogeneousAdoptionDiD class docstring, paper-review L192 closure.
  Now phrased as "QUG tests the Theorem 4 / Design 1' support-infimum null
  d_lower = 0 — adjacent evidence on the d_lower = 0 clause of Assumption 4
  only, NOT a test of the full statement".

- P3 (Documentation/Tests): T21/T22 drift-test counts fixed in the remaining
  stale references. METHODOLOGY_REVIEW.md "Verified Components" row updated
  to 17/32 (was 16/28) + 14 for T20. REGISTRY HAD §"Phase 5 wave 2 first
  slice" (PR #409) updated to 17 (was 16). The Test Coverage block (already
  at 17/32) and CHANGELOG (already accurate after R3) unchanged.

All 35 methodology tests pass; lint clean.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…curacy

- P2 (Methodology): tightened stute_test / yatchew_hr_test / class docstring
  to correctly attribute Assumption 7 (mean-independence pre-trends) to
  joint_pretrends_test (intercept-only residual form via
  null_form="mean_independence") rather than to the raw stute_test helper.
  The raw stute_test always fits dy ~ 1 + d and tests Assumption 8 linearity.
  Updated all 5 surfaces: stute_test Notes, yatchew_hr_test Notes (now also
  documents null="linearity" vs null="mean_independence" kwarg correctly,
  no longer references nonexistent "residual_form"), HeterogeneousAdoptionDiD
  class docstring (split into 4 distinct ADJACENT condition bullets), REGISTRY
  HAD checklist L2694 closure, paper-review L192 closure.

- P3 (Documentation/Tests): the new workflow / REGISTRY / paper-review prose
  said the composite verdict surfaces the Assumption 5/6 caveat. Actually
  the verdict string only flags the Assumption 7 step-2 gap on the
  aggregate="overall" path. Reworded in 4 surfaces (workflow Notes, HAD class
  docstring, REGISTRY L2694, paper-review L192) to clarify that the
  Assumption 5/6 caveat is surfaced by (a) the Design 1 fit-time UserWarning
  and (b) T21 tutorial prose — NOT by the workflow verdict string.

- P3 (Documentation/Tests): yatchew_hr_test Notes referenced a nonexistent
  "residual_form" selector. Replaced with the correct kwarg name "null"
  ({"linearity", "mean_independence"}) and described both branches.

All 35 methodology tests pass; full HAD + drift sweep 665 passed; lint clean.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
- P3 (Methodology): the promoted HAD materials described the Eq. 17/18
  `trends_lin=True` linear-trend-detrended variant as "deferred per Phase 4".
  This conflated TWO different things: (a) the FEATURE — which is shipped
  via the `trends_lin: bool = False` keyword-only kwarg on HAD.fit(),
  joint_pretrends_test, and joint_homogeneity_test (PR #389; R-parity locked
  against DIDHAD::did_had(trends_lin=TRUE) v2.0.0 in test_did_had_parity.py);
  and (b) the PIERCE-SCHOTT NUMERICAL REPLICATION against the published
  p=0.51 anchor on the LBD-restricted panel, which IS waived per REGISTRY
  Deviations Note #3. Updated 3 surfaces (paper-review L194, METHODOLOGY_REVIEW
  Eq. 18 Verified-Components row, test_methodology_had.py module docstring +
  TestHADJointStute class docstring) to distinguish "feature shipped + R-parity
  locked elsewhere" from "Pierce-Schott numerical replication waived".

- P3 (Documentation/Tests): TestHADJointStute promotion narrative overstated
  H1 coverage as "H0 fail-to-reject and H1 reject on linear vs nonlinear DGPs"
  for both joint_pretrends_test and joint_homogeneity_test. Reality: H1
  rejection is tested only on joint_homogeneity_test via a quadratic post-
  DGP; joint_pretrends_test gets H0-only coverage in this file (H1 would
  require a violating-pretrends fixture that re-verifies bootstrap calibration
  covered by test_had_pretests.py). Narrowed wording in METHODOLOGY_REVIEW
  Verified-Components row + TestHADJointStute class docstring; CHANGELOG entry
  unchanged (the H1 reject claim in CHANGELOG explicitly cites the homogeneity
  side via "H1 reject under nonlinear DGP", which is accurate).

All 35 methodology tests pass; lint clean.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…1 scope

- R6 fix left METHODOLOGY_REVIEW.md Deviations item #6 stale (only updated
  the Verified-Components row). Item #6 still said "Eq. 18 linear-trend-
  detrended joint Stute deferred". Rewritten to match the rest of the
  HAD tracker: trends_lin=True is SHIPPED + R-parity-locked in
  test_did_had_parity.py; the methodology-walkthrough file deliberately
  doesn't duplicate that coverage; the Pierce-Schott published-value
  numerical replication is what's waived (Deviations Note #3).

- R6 narrowed the Verified-Components row + class docstring but missed the
  CHANGELOG bullet, which still claimed "joint Stute pre-trends + homogeneity
  H0 fail-to-reject + H1 reject under nonlinear DGP". Narrowed to:
  "H0 fail-to-reject on both surfaces and H1 reject for joint homogeneity
  under a nonlinear DGP" — matches the test file's actual scope.

All 35 methodology tests pass; lint clean.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
- P3 (Maintainability): CHANGELOG hard-coded REGISTRY line references
  L2684-L2686. Those lines shifted as we edited REGISTRY across rounds.
  Replaced with stable item names ("staggered-timing fail-closed
  ValueError" / "Assumption 5/6 non-testability documentation" /
  "covariates= Theorem 6 follow-up").

- P3 (Documentation/Tests): two new methodology tests had docstrings
  describing a stronger contract than they asserted.
  - test_sup_t_bootstrap_skipped_when_cband_false: docstring said
    "all-NaN", assertion was "is None". Aligned docstring to the
    actual Optional[ndarray] None contract.
  - test_safe_inference_joint_nan_on_degenerate_panel: docstring said
    "all fields jointly NaN", assertion accepted either all-NaN OR
    all-finite (the no-partial-NaN invariant). Renamed test to
    test_safe_inference_no_partial_nan_on_degenerate_panel and
    rewrote the docstring to match the actual invariant.

All 35 methodology tests pass; lint clean.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
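
The no-partial-NaN invariant described above (all fields NaN, or all fields finite, never a mix) can be written as a small predicate. This is a sketch with a hypothetical helper name and hypothetical field values; it is not the library's `safe_inference` code.

```python
import math


def no_partial_nan(fields):
    """True when the inference fields are all NaN or all finite, never a mix."""
    all_nan = all(math.isnan(v) for v in fields)
    all_finite = all(math.isfinite(v) for v in fields)
    return all_nan or all_finite


nan = float("nan")
print(no_partial_nan([nan, nan, nan]))      # degenerate panel: every field NaN
print(no_partial_nan([0.12, 2.5, 0.012]))   # regular panel: every field finite
print(no_partial_nan([0.12, nan, 0.012]))   # mixed: violates the invariant
```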
…g lock

- P3 (Documentation/Tests): test_first_treat_col_activates_last_cohort_auto_filter
  only asserted n_units < G; that would still pass if never-treated controls
  were accidentally dropped and only the last cohort survived. Strengthened
  to exact-count assertion: with G=600 and 3 equal-sized cohorts (third=200
  each), kept = 200 never-treated + 200 last-cohort = 400. Added a cross-check
  via the panel's first_treat value set + a kept/dropped count identity
  (kept + 200 dropped = G).

- P3 (Documentation/Tests): the shared _fit_overall() helper suppressed the
  Design 1 Assumption 5/6 UserWarning with a comment claiming the warning
  was "covered by TestHADDeviations" — but no test in that class actually
  asserted the warning fires. Added
  test_assumption_5_6_userwarning_fires_on_design_1_family which uses
  pytest.warns(UserWarning, match=r"Assumption [56]") on a mass-point fit
  to lock the warning surface against silent regression. Also narrowed
  the helper's warning filter to the exact "Assumption [56]" pattern
  rather than the broad "(Assumption|continuous_near_d_lower|mass_point)"
  match — keeps test output clean without masking unrelated future warnings.

Methodology test count is now 36 (was 35); CHANGELOG + METHODOLOGY_REVIEW
counts updated.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
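
The difference between the narrow and the broad warning filter can be seen with the standard `warnings` module alone. In the sketch below `fit_design_1` is a hypothetical stand-in that emits two made-up warnings; it is not the library's fit path, and the message texts are assumptions.

```python
import warnings


def fit_design_1():
    """Hypothetical stand-in for a Design 1 fit that emits two warnings."""
    warnings.warn("Assumption 5 or 6 is not testable on this path", UserWarning)
    warnings.warn("mass_point: unrelated diagnostic", UserWarning)
    return 0.3


def warnings_after_filter(pattern):
    """Run the fit with one ignore-filter installed; return what still surfaces."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        # The message regex is matched from the start of the text, hence the ".*".
        warnings.filterwarnings("ignore", message=pattern, category=UserWarning)
        fit_design_1()
    return [str(w.message) for w in caught]


narrow = warnings_after_filter(r".*Assumption [56]")
broad = warnings_after_filter(r".*(Assumption|continuous_near_d_lower|mass_point)")
print(narrow)  # the unrelated warning still surfaces
print(broad)   # everything is silenced
```

The narrow pattern hides only the expected Assumption 5/6 message, so an unrelated warning added later would still appear in test output.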
R9's strengthened test asserted the retained cohort set from the INPUT
panel ({0, 2, 3}), which is invariant to what the auto-filter actually
kept — the test would pass even if the estimator dropped the wrong
200 units. Switched to result.filter_info (the canonical source of
truth for the filter's kept/dropped metadata), asserting:

  - result.filter_info["F_last"] == 3 (last cohort kept)
  - result.filter_info["n_kept"] == 400 (200 never-treated + 200 last)
  - result.filter_info["n_dropped"] == 200
  - result.filter_info["dropped_cohorts"] == [2] (earlier cohort only)

This now genuinely locks the Appendix B.2 last-cohort + never-treated
contract against silent regression to {2, 3} or any other 400-unit
composition.

All 36 methodology tests pass; lint clean.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
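
The counts asserted above follow directly from the fixture's cohort layout. The sketch below is a hypothetical re-implementation of a last-cohort filter, not the library's code: it assumes never-treated units are coded 0 (consistent with the {0, 2, 3} value set mentioned above) and reuses only the four `filter_info` key names quoted in the commit message.

```python
def last_cohort_filter(first_treat):
    """Hypothetical sketch: keep never-treated (coded 0) and the last cohort."""
    cohorts = sorted(c for c in set(first_treat) if c != 0)
    f_last = cohorts[-1]
    kept = [c for c in first_treat if c in (0, f_last)]
    return {
        "F_last": f_last,
        "n_kept": len(kept),
        "n_dropped": len(first_treat) - len(kept),
        "dropped_cohorts": cohorts[:-1],
    }


# G = 600 with three equal-sized groups: never-treated, cohort 2, cohort 3.
first_treat = [0] * 200 + [2] * 200 + [3] * 200
info = last_cohort_filter(first_treat)
print(info)
```

Asserting on the returned metadata, and not on the input panel, is what makes the check sensitive to which 400 units were actually kept.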
@github-actions

Overall Assessment
✅ Looks good

Executive Summary

  • No P0/P1 findings. This PR does not change HAD estimator logic; it adds methodology-facing documentation, tracker updates, and a dedicated methodology test file.
  • I cross-checked the new claims against the existing HAD implementation. The Design 1 Assumption 5/6 warning, staggered fail-closed behavior, last-cohort auto-filter, and weighted-event-study sup-t gating all exist as documented in diff_diff/had.py.
  • The source-material deviations called out here are explicitly documented with **Note:** / **Library extension:** labels in docs/methodology/REGISTRY.md, so under the review rubric they are informational rather than defects.
  • One P2 remains: the new methodology file adds several large Monte Carlo/asymptotic checks to the default suite without the @pytest.mark.slow / ci_params.bootstrap(...) gating used elsewhere in this repo.
  • The remaining HAD follow-ups (covariates= Theorem 6 pointer; fit-time extensive-margin warning) are properly tracked in TODO.md, so they do not block approval.
  • Static review only: this environment does not have numpy or pytest, so I could not execute the new tests.

Methodology

Code Quality

  • No findings.

Performance

  • Severity P2. Impact: tests/test_methodology_had.py (L277-L307), tests/test_methodology_had.py (L490-L515), and tests/test_methodology_had.py (L606-L626) add fixed-cost Monte Carlo checks to the always-on suite: 200 full HAD fits at G=1000, 5,000 QUG draws at G=2000, and 200 Yatchew draws at G=2000. Existing repo practice is to gate similar methodology/MC checks with @pytest.mark.slow or ci_params.bootstrap(...), see tests/test_had_mc.py (L1-L11, L91-L100, L163-L172) and tests/test_methodology_chaisemartin_dhaultfoeuille.py (L16-L18, L521-L548). This is likely to increase default CI latency noticeably. Concrete fix: mark the asymptotic/coverage tests slow, or route their replication counts through ci_params.bootstrap(...) so constrained CI can downshift them while preserving coverage of the code paths.

Maintainability

  • No findings.

Tech Debt

  • Severity P3-informational. Impact: the PR correctly leaves two non-blocking HAD gaps open instead of overstating the tracker promotion: the covariates= Theorem 6 pointer and the fit-time extensive-margin warning remain tracked in TODO.md (L132-L133), and the paper review still marks the extensive-margin warning as partial in docs/methodology/papers/dechaisemartin-2026-review.md. Concrete fix: none required for approval; resolve those follow-ups via the recorded TODO items when prioritized.

Security

  • No findings.

Documentation/Tests

  • No correctness findings. The new docstrings in diff_diff/had.py and diff_diff/had_pretests.py narrow the claims appropriately about what QUG, Stute, Yatchew, and the composite workflow can and cannot test.

… + ci_params

CI codex flagged that three new Monte Carlo / asymptotic tests in
tests/test_methodology_had.py add fixed-cost MC to the always-on suite
without the @pytest.mark.slow / ci_params.bootstrap(...) gating used
elsewhere in the repo (test_had_mc.py L88-101,
test_methodology_chaisemartin_dhaultfoeuille.py L521-548). Concrete fix:
mark slow + route through ci_params.

Gated 3 tests with @pytest.mark.slow + ci_params fixture:
  - test_eq3_normal_pivot_coverage: 200 fits @ G=1000
    -> ci_params.bootstrap(200, min_n=25); coverage floor 0.85 / 0.65
  - test_theorem4_limit_law_distributional_match: 5000 QUG draws @ G=2000
    -> ci_params.bootstrap(5000, min_n=200); KS-tol 0.05 / 0.15
  - test_eq29_standard_normal_limit_under_linearity: 200 Yatchew draws @ G=2000
    -> ci_params.bootstrap(200, min_n=25); KS-tol 0.10 / 0.35

n-conditional tolerance bands per feedback_bootstrap_drift_tests_need_
backend_tolerance: stricter at full n (matches the original pre-gating
test contract), looser at reduced n (covers MC variance with min_n
replicates).

Default suite (no -m '') now: 33 passed + 3 deselected. Slow mode
(-m '') still: 36 passed. METHODOLOGY_REVIEW updated to document the
gating.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
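
The n-conditional tolerance idea can be sketched as follows. The real `ci_params.bootstrap(...)` lives in tests/conftest.py and its scaling rule is not shown in this PR, so `scaled_draws`, its `reduced` flag, and the division factor are all assumptions made for illustration. Only the draw counts, `min_n` floors, and tolerance pairs come from the commit message.

```python
def scaled_draws(n_full: int, min_n: int, reduced: bool, factor: int = 10) -> int:
    """Hypothetical stand-in for ci_params.bootstrap: shrink n, never below min_n."""
    return max(min_n, n_full // factor) if reduced else n_full


def tolerance(n_used: int, n_full: int, strict: float, loose: float) -> float:
    """Strict tolerance at the full draw count, loose tolerance otherwise."""
    return strict if n_used >= n_full else loose


for reduced in (False, True):
    n_qug = scaled_draws(5000, min_n=200, reduced=reduced)
    n_yatchew = scaled_draws(200, min_n=25, reduced=reduced)
    print(
        n_qug, tolerance(n_qug, 5000, strict=0.05, loose=0.15),
        n_yatchew, tolerance(n_yatchew, 200, strict=0.10, loose=0.35),
    )
```

Tying the tolerance to the draw count actually used keeps the original contract at full n while avoiding spurious failures from Monte Carlo variance at reduced n.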
@github-actions

🔁 AI review rerun (requested by @igerber)

Head SHA: eb5de23e6649c2687c34000de89ce5fd93c16b09


Overall Assessment

✅ Looks good

Executive Summary

  • No P0/P1 findings. This PR is documentation/tracker/test-focused; it does not change HAD estimator logic.
  • Re-review outcome: the prior P2 performance concern is addressed. The new methodology file’s 3 Monte Carlo checks are now @pytest.mark.slow and backend-scaled via ci_params.bootstrap(...) in tests/test_methodology_had.py:L277-L317, tests/test_methodology_had.py:L500-L537, and tests/test_methodology_had.py:L628-L658, matching tests/conftest.py:L192-L224.
  • The methodology-facing claims promoted to “Complete” match live code: staggered-without-first_treat_col still fail-closes in diff_diff/had.py:L1491-L1521; last-cohort auto-filter metadata remains in diff_diff/had.py:L1332-L1364; Design 1 Assumption 5/6 warnings exist on both overall and event-study paths in diff_diff/had.py:L3428-L3446 and diff_diff/had.py:L4271-L4280.
  • The HAD deviations/waivers introduced here are explicitly documented with **Note:** / **Library extension:** labels in docs/methodology/REGISTRY.md:L2638-L2646, so under the review rubric they are informational rather than defects.
  • Remaining HAD follow-ups (covariates= phase pointer; extensive-margin fit-time warning) are properly tracked in TODO.md:L131-L133 and left open in docs/methodology/papers/dechaisemartin-2026-review.md:L191-L194, which is compatible with ✅.
  • Static review only: pytest is not installed in this environment, so I could not execute the new tests. I did verify structurally that tests/test_methodology_had.py contains 36 test functions, 3 of them slow-marked.

Methodology

  • Severity P3-informational. Impact: the PR documents, rather than silently changes, the five HAD deviations/waivers used for tracker promotion; the implementation claims align with current code, and the deviations are properly labeled in REGISTRY.md, which makes them non-defects under the rubric. Concrete fix: none. References: docs/methodology/REGISTRY.md:L2638-L2646, diff_diff/had.py:L1491-L1521, diff_diff/had.py:L3428-L3446, diff_diff/had.py:L4515-L4557, diff_diff/had.py:L4271-L4280.

Code Quality

  • No findings.

Performance

  • No findings. The previous re-review P2 is resolved by slow-marking and CI-scaling the heavy Monte Carlo checks in tests/test_methodology_had.py:L277-L317, tests/test_methodology_had.py:L500-L537, and tests/test_methodology_had.py:L628-L658.

Maintainability

  • No findings.

Tech Debt

  • Severity P3-informational. Impact: the PR does not over-claim closure of the remaining HAD gaps; the covariates= Theorem 6 pointer and main-fit extensive-margin warning remain explicitly tracked as deferred follow-ups, so they are non-blocking under the rubric. Concrete fix: none in this PR. References: TODO.md:L131-L133, docs/methodology/papers/dechaisemartin-2026-review.md:L191-L194, METHODOLOGY_REVIEW.md:L730-L734.

Security

  • No findings.

Documentation/Tests

  • No findings on the changed documentation/tests. The new docstrings correctly narrow what the pretests do and do not establish, and the tracker promotion is backed by the new methodology file plus existing parity/registry anchors. Runtime execution was not possible here because pytest is unavailable. References: diff_diff/had.py:L2596-L2640, diff_diff/had.py:L2875-L2907, diff_diff/had_pretests.py:L1350-L1366, diff_diff/had_pretests.py:L1653-L1670, diff_diff/had_pretests.py:L2146-L2160, diff_diff/had_pretests.py:L4596-L4616, METHODOLOGY_REVIEW.md:L694-L729.

@igerber igerber added the ready-for-ci label (Triggers CI test workflows) May 20, 2026
…gy-review-tracker-promotion

# Conflicts:
#	CHANGELOG.md
@github-actions

🔁 AI review rerun (requested by @igerber)

Head SHA: 8255231be7b1d4852e7aa010b449a46ec76d18cc


Overall Assessment

✅ Looks good

Executive Summary

  • P3 only: METHODOLOGY_REVIEW.md still says HAD has no R reference in the summary/field tables, but the same section now relies on chaisemartin::did_had parity and the REGISTRY/paper review both list did_had as a reference implementation. See METHODOLOGY_REVIEW.md:L61, METHODOLOGY_REVIEW.md:L690-L701, docs/methodology/REGISTRY.md:L2633-L2636, docs/methodology/papers/dechaisemartin-2026-review.md:L177-L179.
  • No P0/P1 findings. This PR does not change HAD estimator logic, and the tracker-promotion claims match live behavior for staggered fail-closed handling, Design 1 Assumption 5/6 warnings, and the new pretest-scope notes. See diff_diff/had.py:L1491-L1521, diff_diff/had.py:L2596-L2640, diff_diff/had.py:L2875-L2907, diff_diff/had.py:L3428-L3446, diff_diff/had.py:L4271-L4280, diff_diff/had_pretests.py:L1352-L1366, diff_diff/had_pretests.py:L1655-L1670, diff_diff/had_pretests.py:L2148-L2160, diff_diff/had_pretests.py:L4598-L4617.
  • The new deviations/waivers are properly documented in the registry, so under the rubric they are informational rather than defects. See docs/methodology/REGISTRY.md:L2638-L2646.
  • Re-review focus: the prior performance concern is addressed. The three Monte Carlo-heavy methodology tests are now slow-marked and backend-scaled through ci_params.bootstrap(...). See tests/test_methodology_had.py:L277-L317, tests/test_methodology_had.py:L500-L537, tests/test_methodology_had.py:L628-L658, tests/conftest.py:L192-L224.
  • Static review only: this environment lacks numpy/pytest, so I could not execute the new tests.

Methodology

  • Severity P3 Impact: The promotion-to-complete claims are consistent with the shipped methodology surface rather than masking new behavior. The staggered-timing fail-closed path is live, the Design 1 Assumption 5/6 warning exists on both overall and event-study paths, and the new docstrings correctly narrow what QUG/Stute/Yatchew/workflow do and do not test. Concrete fix: none. References: diff_diff/had.py:L1491-L1521, diff_diff/had.py:L2596-L2640, diff_diff/had.py:L3428-L3446, diff_diff/had.py:L4271-L4280, diff_diff/had_pretests.py:L1352-L1366, diff_diff/had_pretests.py:L1655-L1670, diff_diff/had_pretests.py:L2148-L2160, diff_diff/had_pretests.py:L4598-L4617, docs/methodology/REGISTRY.md:L2638-L2646, docs/methodology/papers/dechaisemartin-2026-review.md:L190-L194.

Code Quality

  • No findings.

Performance

  • Severity P3 Impact: The prior re-review P2 is resolved. The new methodology file’s expensive Monte Carlo checks are isolated behind @pytest.mark.slow and ci_params.bootstrap(...), so the tracker-promotion tests no longer force full-draw counts in constrained CI runs. Concrete fix: none. References: tests/test_methodology_had.py:L277-L317, tests/test_methodology_had.py:L500-L537, tests/test_methodology_had.py:L628-L658, tests/conftest.py:L192-L224.
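
As a rough illustration of the pattern described in this item, the sketch below scales a Monte Carlo draw count depending on the backend. The helper `scaled_draws`, its `pure_python` flag, the divisor, and the floor are all hypothetical stand-ins; the actual `ci_params.bootstrap(...)` fixture in tests/conftest.py may use a different signature and scaling rule.

```python
def scaled_draws(n_full: int, pure_python: bool,
                 divisor: int = 10, floor: int = 50) -> int:
    """Return the number of Monte Carlo draws to run.

    Hypothetical rule: keep the full count on a fast backend, and on a
    slow (pure-Python) backend divide it by `divisor`, never going
    below `floor` and never exceeding the requested full count.
    """
    if n_full <= 0:
        raise ValueError("n_full must be positive")
    if not pure_python:
        return n_full
    return min(n_full, max(floor, n_full // divisor))


# A slow-marked test (e.g. decorated with @pytest.mark.slow) would ask
# for its draw count through a helper like this instead of hard-coding it.
full = scaled_draws(5000, pure_python=False)
reduced = scaled_draws(5000, pure_python=True)
```

The point of routing the count through one helper is that constrained CI runs shrink every Monte Carlo test consistently, while full runs keep the draw counts the statistical thresholds were tuned for.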

Maintainability

  • No findings.

Tech Debt

  • Severity P3 Impact: Remaining HAD gaps are explicitly tracked instead of being over-claimed as closed: the Figure 2/Table 1 replication waiver, the covariates= Theorem 6 pointer, and the main-fit extensive-margin warning all remain open follow-ups. Concrete fix: none in this PR. References: TODO.md:L131-L133, docs/methodology/papers/dechaisemartin-2026-review.md:L191-L194, METHODOLOGY_REVIEW.md:L730-L734.

Security

  • No findings.

Documentation/Tests

  • Severity P3 Impact: The methodology tracker now gives inconsistent provenance for HAD. The summary table and field table still say there is no R reference, while the same section cites chaisemartin::did_had parity and the REGISTRY/paper review list did_had as a reference implementation. That can mislead readers about what external anchor backs the “Complete” status. Concrete fix: update the HAD row/field tables in METHODOLOGY_REVIEW.md to name did_had explicitly, while keeping nprobust as the bandwidth-specific auxiliary reference. References: METHODOLOGY_REVIEW.md:L61, METHODOLOGY_REVIEW.md:L690-L701, docs/methodology/REGISTRY.md:L2633-L2636, docs/methodology/papers/dechaisemartin-2026-review.md:L177-L179.

CI codex flagged that METHODOLOGY_REVIEW.md's HAD row + field table say
"no R reference" / "paper-direct implementation", but the rest of the
PR now relies on chaisemartin::did_had R parity for the Complete
designation. Inconsistent provenance for readers.

- Summary row (L61): swap "(paper-direct; nprobust for bandwidth)" for
  "chaisemartin::did_had (Credible-Answers/did_had v2.0.0); nprobust
  for bandwidth"
- Field table (L690): replace "None (paper-direct implementation)" with
  the explicit chaisemartin::did_had reference (R parity pinned at
  atol=1e-8) plus the nprobust auxiliary reference (atol=1e-14,
  machine precision).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@github-actions

🔁 AI review rerun (requested by @igerber)

Head SHA: cd45928c46c63138b87b03f434a6271ea335bc05


Overall Assessment

✅ Looks good

Executive Summary

  • Previous re-review item is resolved: HAD now consistently names chaisemartin::did_had in both the summary table and the detailed tracker entry. METHODOLOGY_REVIEW.md:L61-L61, METHODOLOGY_REVIEW.md:L688-L690
  • No P0/P1 findings. This PR does not change HAD estimator/inference logic; the source changes in diff_diff/had.py and diff_diff/had_pretests.py are docstring-only, and the promoted deviations are explicitly documented in the Methodology Registry. diff_diff/had.py:L2596-L2640, diff_diff/had.py:L2875-L2907, diff_diff/had_pretests.py:L1352-L1366, diff_diff/had_pretests.py:L1655-L1670, diff_diff/had_pretests.py:L2148-L2160, diff_diff/had_pretests.py:L4598-L4617, docs/methodology/REGISTRY.md:L2638-L2646
  • The tracker-promotion claims are mostly well-supported by the new methodology file and existing parity anchors, and the prior CI-performance concern is addressed by @pytest.mark.slow plus ci_params.bootstrap(...). tests/test_methodology_had.py:L277-L317, tests/test_methodology_had.py:L501-L537, tests/test_methodology_had.py:L629-L658, tests/conftest.py:L192-L224
  • P3: the paper-review file still contains one stale sentence saying Eq. 18 detrending is a future Phase 4 extension, which now contradicts the same file’s updated checklist and the tracker/registry entries marking the detrended variant shipped and only the numerical Pierce-Schott replication waived. docs/methodology/papers/dechaisemartin-2026-review.md:L194-L200, docs/methodology/REGISTRY.md:L2626-L2646, METHODOLOGY_REVIEW.md:L700-L728
  • Static review only: pytest is not installed in this environment, so I could not execute the new tests.

Methodology

  • Affected methods: HeterogeneousAdoptionDiD, qug_test, stute_test, yatchew_hr_test, joint_pretrends_test, joint_homogeneity_test, did_had_pretest_workflow.
  • No findings. The PR’s method-facing changes are documentation/test promotion only, and the deviations it relies on are documented in docs/methodology/REGISTRY.md, so they are informational under the review rubric. docs/methodology/REGISTRY.md:L2638-L2646, diff_diff/had.py:L1491-L1521, diff_diff/had.py:L3428-L3446, diff_diff/had.py:L4271-L4280

Code Quality

  • No findings.

Performance

  • No findings. The earlier performance concern appears addressed: the Monte Carlo-heavy methodology checks are now slow-marked and backend-scaled. tests/test_methodology_had.py:L277-L317, tests/test_methodology_had.py:L501-L537, tests/test_methodology_had.py:L629-L658, tests/conftest.py:L192-L224

Maintainability

  • No findings.

Tech Debt

  • No findings. The remaining HAD follow-ups are explicitly tracked in TODO.md, so they are mitigated rather than blockers. TODO.md:L131-L133

Security

  • No findings.

Documentation/Tests

  • Severity: P3
  • Impact: docs/methodology/papers/dechaisemartin-2026-review.md is internally inconsistent. The updated checklist says the Eq. 18 detrended joint Stute variant is shipped and R-parity-locked, with only the published-number Pierce-Schott replication waived, but the explanatory prose immediately below still says detrending is a future Phase 4 extension. That can mislead readers using the paper review as the source of truth for methodology status. docs/methodology/papers/dechaisemartin-2026-review.md:L194-L200
  • Concrete fix: rewrite the sentence at docs/methodology/papers/dechaisemartin-2026-review.md:L200-L200 to past tense so it matches the rest of the file: the detrended Eq. 18 variant is already shipped/R-parity-locked; only the numerical Pierce-Schott replication remains waived/deferred.

@igerber igerber merged commit 729a376 into main May 20, 2026
26 checks passed
igerber added a commit that referenced this pull request May 20, 2026
…ueue + example refs

One informational P3 from CI codex R2 — METHODOLOGY_REVIEW.md still
described ContinuousDiD as "In Progress" in two surrounding surfaces
even after the status-table flip, creating conflicting status signals.

Fixed both sites:
1. L27 explanatory paragraph: removed the ContinuousDiD example from
   the In Progress band's "has methodology file but no paper review"
   illustration (it's now Complete).
2. L1289-1292 Priority Order queue: removed entry #9 (ContinuousDiD)
   and renumbered the remaining queue.

Retroactive fix per feedback_changelog_accuracy_fixes (when CI review
catches one factual error in the queue, scan for the same mistake
elsewhere): PR #473 promoted HeterogeneousAdoptionDiD to Complete but
left entry #6 (HAD) in the same In Progress queue. Removed HAD's
entry too and renumbered, so the queue is now self-consistent with
the status table for all Complete entries.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
igerber added a commit that referenced this pull request May 22, 2026
Closes the WooldridgeDiD (ETWFE) methodology-review-tracker
promotion in METHODOLOGY_REVIEW.md (In Progress → Complete),
following the primary-source review for Wooldridge (2025) merged
in PR-A (#484). Adds two paper-driven implementation surfaces and
extends R-parity goldens to the nonlinear paths.

Implementation:
- `aggregate(weights="cohort_share")` on WooldridgeDiDResults
  implements paper Eqs. 7.4 (simple-overall) and 7.6 (event-time,
  restricted to k>=0) cohort-share aggregation weights as an
  opt-in alternative to the default cell-count weighting
  (matching Stata `jwdid_estat`). Inference fields fail-closed to
  NaN with UserWarning per paper Section 7.5 conditional-on-shares
  semantics; raises on `survey_design` (design-consistent totals
  deferred); raises on `type ∈ {"group","calendar"}` (no paper
  closed-form); raises on bootstrap fits (no matching bootstrap
  variant). Closes TODO row 95.
- `cohort_trends=True` on `WooldridgeDiD.__init__` adds linear
  `dg_i · t` cohort-specific trend interactions (paper Section 8
  / Eq. 8.1) for the OLS path. Rejects on logit/poisson per
  paper Section 8 OLS scope; rejects on survey_design pending
  full-dummy/TSL validation; enforces per-cohort pre-period
  identification check (≥ 2 observed pre-periods per treated
  cohort). Auto-routes to full-dummy mode regardless of
  vcov_type. Closes the PR-A Requirements Checklist
  heterogeneous-trends gap.

Tests:
- `tests/test_methodology_wooldridge.py` extended with 6
  paper-equation-numbered methodology classes (Theorem 3.1,
  Proposition 5.1, Section 6 event study, Section 7 aggregation
  paths, Section 8 heterogeneous trends, Section 10 unbalanced
  panels) + `TestW2025LibraryDeviations` consolidating 5 surviving
  deviations. Mirrors the HAD PR #473 precedent.
- Two new R-parity surface classes (`TestWooldridgeParityRPoisson`,
  `TestWooldridgeParityRLogit`) lock the structural surface
  against R `etwfe(family=...)` log-link goldens.
- 209 tests total (60 methodology + 149 R-parity and unit-regression
  tests).

R Goldens:
- `benchmarks/R/generate_wooldridge_golden.R` extended with
  Poisson + logit DGPs via R `etwfe`; augmented panel CSV
  retains the same seed-generated `y_pois` + `y_logit` columns
  for cross-language reproducibility.
- `benchmarks/R/requirements.R` pins `etwfe >= 0.5.0`.

Tracker promotion:
- METHODOLOGY_REVIEW.md L52 status flip with merge date; detail
  section L583-605 rewritten to the Verified Components / Test
  Coverage / Corrections Made / Deviations / Outstanding
  Concerns template mirroring HAD / ContinuousDiD / DCDH. L27
  example re-pointed; priority queue items #7-#10 renumbered to
  #6-#9.
- REGISTRY.md `## WooldridgeDiD (ETWFE)` extended with
  `### Deviations from the paper / from R / library extensions`
  block consolidating 7 surviving deviations + opt-in notes for
  cohort_share + cohort_trends + survey rejection + bootstrap
  cohort_share rejection contracts.
- CHANGELOG.md `[Unreleased]` `### Added` documents the new
  parameters, R-parity extension, and tracker flip.
- `docs/methodology/papers/wooldridge-2025-review.md`
  Requirements Checklist + Gaps & Uncertainties items 1 + 11
  marked `**Status:** Closed in PR-B`.
- `docs/api/wooldridge_etwfe.rst` updated with weighting-scheme
  notes alongside the existing aggregation table.

Second of two PRs for the WooldridgeDiD methodology-review-tracker
promotion. PR-A merged at e416aed (#484).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
HanomicsIMF pushed a commit to HanomicsIMF/diff-diff that referenced this pull request May 22, 2026
…omplete)

Flips the ContinuousDiD tracker row to **Complete** with full Verified
Components / Test Coverage / Corrections Made / Deviations / Outstanding
Concerns structure mirroring the HAD precedent (PR igerber#473). Consolidation
only — no source code changes, no new tests, no new docstrings.

- METHODOLOGY_REVIEW.md L59 row flipped In Progress -> Complete with
  Last Review 2026-05-20. L634-655 detail section rewritten with the
  five-block tracker template: 12 Verified Components rows backed by
  15 methodology tests + 80 unit tests + R parity at relative tolerance
  on 6 benchmark configurations.
- docs/methodology/REGISTRY.md ## ContinuousDiD gains a formal
  Deviations block (4 entries with framing header) before the
  Implementation Checklist: boundary-knots Deviation from R + three
  Phase 2 silent-failures audit fixes documented as library extensions
  with no R correspondence. Existing Edge Cases bullet and Note entries
  remain in place — Deviations is the canonical AI-review surface per
  CLAUDE.md "Documenting Deviations" labels.
- CHANGELOG.md [Unreleased] ### Added gains the ContinuousDiD
  tracker-promotion bullet at the top with per-benchmark tolerance
  language calling out the relative-tolerance scope caveat (NOT
  bit-exact like HAD) due to the boundary-knots deviation precluding
  algorithmic bit-equality.
- TODO.md gains one consolidated row tracking the three CGBS 2024
  feature deferrals (covariates kwarg, discrete-treatment saturated
  regression, lowest-dose-as-control Remark 3.1) — these mirror R
  contdid v0.1.0's omissions and are explicitly marked deferred in the
  REGISTRY Implementation Checklist L755-757.

R parity scope: 1% overall ATT on all 6 benchmarks; 1% max ATT(d) curve
and 2% max ACRT(d) curve on benchmarks 1-3 via _compare_with_r helper;
1% overall ACRT on benchmarks 4-5; benchmark 6 is event-study ATT-only.
NOT bit-exact (atol=1e-8) like HAD — boundary-knots divergence precludes
algorithmic bit-equality on aggregated dose-response curves.
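
To make the difference between the two parity regimes concrete, here is a minimal standard-library sketch. The helper names and the sample numbers are illustrative assumptions; they are not benchmark values and do not reproduce the `_compare_with_r` helper.

```python
import math
from typing import Sequence


def max_rel_err(py: Sequence[float], r: Sequence[float]) -> float:
    """Largest relative deviation of `py` from the reference values `r`."""
    if len(py) != len(r):
        raise ValueError("length mismatch")
    if any(b == 0 for b in r):
        raise ValueError("relative error undefined for zero reference values")
    return max(abs(a - b) / abs(b) for a, b in zip(py, r))


def within_rel(py: Sequence[float], r: Sequence[float], rtol: float) -> bool:
    """Relative-tolerance parity, e.g. rtol=0.01 for a 1% band."""
    return max_rel_err(py, r) <= rtol


def within_abs(py: Sequence[float], r: Sequence[float],
               atol: float = 1e-8) -> bool:
    """Absolute-tolerance parity (near bit-exact when atol is tiny)."""
    return len(py) == len(r) and all(
        math.isclose(a, b, rel_tol=0.0, abs_tol=atol) for a, b in zip(py, r)
    )


# Made-up curve values: each point is off by about 0.5% from the reference.
r_curve = [1.0, 2.0, 4.0]
py_curve = [1.005, 1.99, 4.02]
passes_1pct = within_rel(py_curve, r_curve, rtol=0.01)
passes_atol = within_abs(py_curve, r_curve, atol=1e-8)
```

In this made-up case the curve passes a 1% relative band but fails an `atol=1e-8` check, which is the kind of gap the scope caveat above is flagging.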

89 regression tests pass (80 unit + 9 methodology, R benchmarks deselected
without R/contdid installed).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
HanomicsIMF pushed a commit to HanomicsIMF/diff-diff that referenced this pull request May 22, 2026
Release notes consolidate 8 PRs since 3.4.0 (2026-05-19):

Public-surface variance lifts:
- SpilloverDiD survey_design on HC1/CR1 via Binder TSL (Wave E.1, igerber#468)
- SpilloverDiD vcov_type=conley + survey_design via stratified-Conley
  on PSU totals (Wave E.2, igerber#474) + lag_cutoff>0 follow-up (igerber#477)
- SunAbraham vcov_type ∈ {classical, hc1, hc2, hc2_bm} (Phase 1b 1/8, igerber#472)
- WLS-CR2 Bell-McCaffrey gates lifted via clubSandwich port (igerber#475)

Methodology-review-tracker promotions (mostly docs/tests):
- PreTrendsPower R pretrends parity goldens (PR-C, igerber#471)
- HAD methodology-review-tracker promotion (igerber#473)
- ContinuousDiD methodology-review-tracker promotion (igerber#476)

All changes additive; bit-equal defaults preserved across the affected
estimators. No new estimators (patch-level per semver convention).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
HanomicsIMF pushed a commit to HanomicsIMF/diff-diff that referenced this pull request May 22, 2026
Flip the ChaisemartinDHaultfoeuille (DCDH) row from In Progress to
Complete. Adds the Verified Components / Test Coverage / Corrections
Made / Deviations / Outstanding Concerns detail section mirroring the
ContinuousDiD (PR igerber#476) and HAD (PR igerber#473) precedents. Consolidates 7
DCDH deviations from the paper, from R DIDmultiplegtDYN, and library
extensions into a labeled REGISTRY surface per the AI-review
"Documenting Deviations" convention. CHANGELOG [Unreleased] gains a
new Added entry. L27 In Progress example re-pointed to WooldridgeDiD;
L1289 priority-order queue item #6 removed and items #7-#11
renumbered to #6-#10.

No source code changes, no new tests, no new docstrings —
documentation consolidation only.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Labels

ready-for-ci Triggers CI test workflows
