dCDH heterogeneity: per-path + global placebo predict_het R-parity + df threading #449

Merged
igerber merged 5 commits into main from dcdh-heterogeneity-placebo-and-df
May 16, 2026
@igerber
Owner

@igerber igerber commented May 15, 2026

Summary

Closes TODO #422 (per-path placebo predict_het emission) and pilot-412 (heterogeneity OLS df threading) in a single bundle. Same surface, same R fixture, single clean R-parity story.

R-fixture probe (scenario 22) verified that did_multiplegt_dyn(by_path, predict_het, placebo) emits per-by_level heterogeneity OLS rows on backward (placebo) horizons via R's per-by_level dispatcher (DIDmultiplegtDYN:::did_multiplegt_main placebo block at the effect = matrix(-i, ...) rbind site). Scenario 23 confirmed the same emission on the global (non-by_path) surface. Both surfaces now mirror in Python.

R syntax note: did_multiplegt_dyn(predict_het=list("X", c(...)), placebo=N) rejects positive-only horizons (c(1, 2, 3) errors with "specified numbers in predict_het that exceed the number of placebos" because R reuses the same horizon vector for both forward and backward indexing). The c(-1) sentinel triggers "compute heterogeneity for ALL forward (1..effects) AND ALL placebo (1..placebo) positions"; both new fixtures use this syntax.

Survey gate is warn + skip: when survey_design + placebo + heterogeneity are co-set, fit() emits a UserWarning and falls back to forward-horizon-only heterogeneity. The Binder TSL cell-period allocator's REGISTRY justification is tied to post-period attribution; backward-horizon attribution puts ψ_g mass on a pre-period cell, which is a separate library-extension claim deferred to a follow-up methodology PR. Forward-horizon predict_het + survey_design continues to work unchanged.

Phase 2 df threading: _compute_heterogeneity_test now passes df = n_obs - n_params to safe_inference on the non-survey OLS path (matches R qt(0.975, df.residual(model))); pre-PR Python used df=None (Z critical), producing 0.1-2% rtol gaps on p_value/conf_int vs R. Forward parity tests tightened from "unpinned" to INFERENCE_RTOL=1e-4. Rank-deficient designs use design.shape[1] as df denominator (pre-drop column count); fully rank-deficient is NaN-short-circuited by the existing guard. Near-rank-deficient edge case tracked as a Low TODO follow-up.
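To make the Z-versus-t gap concrete, here is a minimal standard-library sketch. n_obs = 90 is taken from scenario 23; n_params = 2 and the beta/se values are assumptions chosen for illustration, and the helpers below are hypothetical stand-ins, not the library's safe_inference.

```python
import math
from statistics import NormalDist


def t_pdf(x: float, df: float) -> float:
    """Density of Student's t distribution with `df` degrees of freedom."""
    log_norm = (
        math.lgamma((df + 1) / 2)
        - math.lgamma(df / 2)
        - 0.5 * math.log(df * math.pi)
    )
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))


def t_cdf(x: float, df: float, steps: int = 2000) -> float:
    """CDF for x >= 0: 0.5 plus a composite Simpson integral of the density on [0, x]."""
    h = x / steps
    total = t_pdf(0.0, df) + t_pdf(x, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 + total * h / 3


def t_critical(prob: float, df: float) -> float:
    """Upper quantile (prob > 0.5) located by bisection on the CDF."""
    lo, hi = 0.0, 50.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < prob:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


# Hypothetical regression: 90 observations, intercept + one covariate.
n_obs, n_params = 90, 2
df = n_obs - n_params

z_crit = NormalDist().inv_cdf(0.975)   # pre-PR behaviour (df=None)
t_crit = t_critical(0.975, df)         # post-PR behaviour (finite df)

beta, se = 0.5, 0.1                    # illustrative values only
ci_z = (beta - z_crit * se, beta + z_crit * se)
ci_t = (beta - t_crit * se, beta + t_crit * se)
rel_gap = (t_crit - z_crit) / z_crit   # relative widening of the CI half-width

print(f"z={z_crit:.4f} t={t_crit:.4f} rel_gap={rel_gap:.4%}")
```

Under these assumed inputs the t interval is slightly wider than the Z interval, which is the kind of discrepancy a 1e-4 relative tolerance would catch.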

Methodology references (required if estimator / math changes)

  • Method name(s): ChaisemartinDHaultfoeuille.predict_het × placebo × by_path/paths_of_interest
  • Paper / source link(s): de Chaisemartin & D'Haultfœuille (NBER WP 29873) Web Appendix Section 1.5 (Lemma 7); R DIDmultiplegtDYN 2.3.3 did_multiplegt_main placebo + predict_het block
  • Any intentional deviations from the source (and why):
    • Survey + backward-horizon predict_het: warn + skip rather than compute. Pre-period Binder TSL allocator derivation is deferred (tracked in TODO.md). Forward + survey continues to work.
    • Joint Wald F-test across all predict_het rows: R aggregates; Python emits per-horizon inference only. Documented in REGISTRY heterogeneity Note.
    • Rank-deficient df: n_obs - n_params (pre-drop column count) vs R's df.residual = n - rank post-drop. Affects only near-rank-deficient designs that solve_ols retains rather than NaN-out. Tracked as Low TODO.
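The difference between the two df denominators in the last bullet can be sketched with plain arithmetic. The design matrix and the `matrix_rank` helper below are hypothetical; the sketch does not model what solve_ols actually drops or short-circuits, only how a pre-drop column count and a post-drop rank diverge.

```python
from fractions import Fraction


def matrix_rank(rows):
    """Exact rank via Gauss-Jordan elimination over rationals."""
    m = [[Fraction(v) for v in row] for row in rows]
    n_cols = len(m[0]) if m else 0
    rank = 0
    for col in range(n_cols):
        # Find a row at or below `rank` with a non-zero entry in this column.
        pivot = next((r for r in range(rank, len(m)) if m[r][col] != 0), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        for r in range(len(m)):
            if r != rank and m[r][col] != 0:
                factor = m[r][col] / m[rank][col]
                m[r] = [a - factor * b for a, b in zip(m[r], m[rank])]
        rank += 1
    return rank


# Hypothetical design: intercept, x, and a redundant column 2*x.
design = [[1, x, 2 * x] for x in range(6)]
n_obs = len(design)
n_params = len(design[0])          # pre-drop column count

df_pre_drop = n_obs - n_params     # the n_obs - n_params convention
df_post_drop = n_obs - matrix_rank(design)  # the n - rank convention

print(df_pre_drop, df_post_drop)
```

When the design has full column rank the two conventions coincide, so the gap only matters in the edge case the TODO tracks.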

Validation

  • Tests added/updated:
    • R-parity (new): tests/test_chaisemartin_dhaultfoeuille_parity.py::TestDCDHDynRParityHeterogeneityWithPlacebo (scenario 23, global) + ::TestDCDHDynRParityByPathHeterogeneityWithPlacebo (scenario 22, per-path × 3 paths) pinning all 6 inference fields at BETA_RTOL=1e-6/SE_RTOL=1e-5 for beta/se/t_stat/n_obs and INFERENCE_RTOL=1e-4 for p_value/conf_int.
    • Forward-horizon parity tightened: existing TestDCDHDynRParityHeterogeneity and TestDCDHDynRParityByPathHeterogeneity extended to assert p_value/conf_int at the new INFERENCE_RTOL=1e-4 (was unpinned pre-PR; replaced the Z-vs-t deviation note with the positive parity claim).
    • Cross-surface (new): tests/test_chaisemartin_dhaultfoeuille.py::TestByPathPredictHetPlacebo covers placebo het column population, survey-gate warn+skip, forward+survey anti-regression, out_idx<0 eligibility guard, single-path telescope (path_heterogeneity_effects[(only_path,)] == heterogeneity_effects bit-exactly), summary rendering, and direct-call NotImplementedError backstop.
    • Local-invariant tests refactored: test_*_inference_matches_safe_inference → test_*_inference_local_invariants, which verify SE-derivation wiring (t_stat = beta/se, symmetric conf_int, p_value in [0, 1]) without back-deriving n_params.
    • 314 dCDH tests pass (was 312 pre-PR).
  • Backtest / simulation / notebook evidence: scenarios 22 + 23 in benchmarks/data/dcdh_dynr_golden_values.json, regenerable via Rscript benchmarks/R/generate_dcdh_dynr_test_values.R.

Security / privacy

  • Confirm no secrets/PII in this PR: yes (canonical secret-pattern scan in pre-merge-check returned no hits).

🤖 Generated with Claude Code

igerber and others added 3 commits May 15, 2026 17:49
Closes TODO #422 + pilot-412 in a single PR (same surface, same R fixture,
same parity story).

Phase 0 probe verified R behavior: did_multiplegt_dyn(by_path, predict_het,
placebo) emits per-path heterogeneity OLS results on backward (placebo)
horizons via R's per-by_level dispatcher (DIDmultiplegtDYN:::did_multiplegt_main
placebo block at the `effect = matrix(-i, ...)` rbind site). New scenario 22
in benchmarks/R/generate_dcdh_dynr_test_values.R captures this with
predict_het=list("het_x", c(-1)) — the c(-1) sentinel triggers "compute
heterogeneity for ALL forward (1..effects) AND ALL placebo (1..placebo)
positions" per the R source path read at script time.

Phase 1A implementation (non-survey): _compute_heterogeneity_test gains a
placebo: int = 0 parameter and iterates forward (1..L_max) and backward
(-1..-placebo) horizons in a single loop. Explicit `if out_idx < 0: continue`
eligibility guard prevents numpy negative-index silent wrap on N_mat[g, out_idx]
when F_g - 1 + l_h < 0. _compute_path_heterogeneity_test forwards the param;
fit() passes placebo=L_max if self.placebo else 0 to both global and per-path
call sites. to_dataframe(level="by_path") placebo rows now read het_* values
from path_heterogeneity_effects negative-int keys (mirroring the existing
path_placebo_event_study negative-key convention) instead of the pre-PR
hardcoded NaN-fill.
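The single-loop iteration and the negative-index guard described above can be sketched as follows. This is a simplified illustration using nested lists (which wrap on negative indices the same way numpy arrays do); `horizons`, `collect_outcomes`, the upper-bound check, and the toy data are assumptions, not the library's code.

```python
def horizons(l_max, placebo):
    """Forward horizons 1..l_max followed by backward horizons -1..-placebo."""
    return list(range(1, l_max + 1)) + list(range(-1, -placebo - 1, -1))


def collect_outcomes(n_mat, first_switch, l_max, placebo):
    """Pick n_mat[g][F_g - 1 + l_h] for every eligible (group, horizon) pair."""
    picked = {}
    for l_h in horizons(l_max, placebo):
        for g, f_g in enumerate(first_switch):
            out_idx = f_g - 1 + l_h
            if out_idx < 0:
                # Without this guard, n_mat[g][-1] would silently read the
                # LAST period instead of signalling "no such pre-period".
                continue
            if out_idx >= len(n_mat[g]):
                continue  # assumption: also skip horizons past the panel end
            picked.setdefault(l_h, []).append((g, n_mat[g][out_idx]))
    return picked


# Toy panel: 2 groups x 4 periods; hypothetical first-switch periods.
n_mat = [[10, 11, 12, 13], [20, 21, 22, 23]]
first_switch = [1, 2]

picked = collect_outcomes(n_mat, first_switch, l_max=2, placebo=1)
print(picked)
```

Group 0 has no period before its first switch, so it contributes nothing at horizon -1; the negative key in the result follows the negative-int key convention the commit message describes.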

Survey gate: when survey_design is active AND placebo > 0 + heterogeneity is
requested, _compute_heterogeneity_test raises NotImplementedError eagerly with
a documented message. The Binder TSL cell-period allocator's REGISTRY
justification is tied to post-period attribution; backward-horizon attribution
puts ψ_g mass on a pre-period cell, which is a separate library-extension
claim that needs its own derivation. Forward-horizon predict_het + survey
continues to work unchanged. Pre-period allocator derivation tracked as a
new follow-up TODO row.

Phase 2 (df threading): _compute_heterogeneity_test now passes
df = n_obs - n_params to safe_inference on the non-survey OLS path, matching
R did_multiplegt_dyn(predict_het=...)'s t-distribution inference
(qt(0.975, df.residual(model)) site). Pre-PR Python used df=None (Z critical),
producing 0.1-2% rtol gaps on p_value/conf_int vs R. Existing forward-horizon
parity tests now pin t/p/CI at INFERENCE_RTOL=1e-4 (was unpinned). Rank-
deficient designs use design.shape[1] as df denominator (pre-drop column
count); fully rank-deficient is NaN-short-circuited by the existing guard.
Near-rank-deficient edge case tracked as a new Low TODO follow-up.

R parity: scenario 22 (multi_path_reversible_predict_het_with_placebo,
placebo=2, effects=3, by_path=3) pinned at BETA_RTOL=1e-6/SE_RTOL=1e-5 for
beta/se/t_stat/n_obs and INFERENCE_RTOL=1e-4 for p_value/conf_int across
3 paths × (3 forward + 2 placebo) = 15 horizons.

Cross-surface tests (TestByPathPredictHetPlacebo): placebo het column
population, survey-gate NotImplementedError, forward+survey anti-regression,
out_idx<0 eligibility guard, single-path telescope
(path_heterogeneity_effects[(only_path,)] == heterogeneity_effects bit-
exactly), summary rendering. The two existing local-invariant tests
(test_*_inference_matches_safe_inference) refactored to verify SE-derivation
wiring (t_stat=beta/se, conf_int symmetric around beta, p_value in [0,1])
without back-deriving n_params.
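A minimal sketch of the three local invariants named above. The field names follow the inference fields listed in this PR; the `local_invariant_violations` helper, the row layout, and the sample numbers are hypothetical.

```python
import math


def local_invariant_violations(row, rel_tol=1e-9):
    """Return the names of invariants that a single inference row violates."""
    violations = []
    lo, hi = row["conf_int"]
    if not math.isclose(row["t_stat"], row["beta"] / row["se"], rel_tol=rel_tol):
        violations.append("t_stat != beta / se")
    if not math.isclose(row["beta"] - lo, hi - row["beta"], rel_tol=rel_tol):
        violations.append("conf_int not symmetric around beta")
    if not 0.0 <= row["p_value"] <= 1.0:
        violations.append("p_value outside [0, 1]")
    return violations


# Illustrative rows only; the numbers are not taken from any fixture.
good_row = {"beta": 0.5, "se": 0.1, "t_stat": 5.0,
            "p_value": 1e-5, "conf_int": (0.3, 0.7)}
bad_row = {"beta": 0.5, "se": 0.1, "t_stat": 4.0,
           "p_value": 1.2, "conf_int": (0.3, 0.8)}

print(local_invariant_violations(good_row))
print(local_invariant_violations(bad_row))
```

None of these checks needs the critical value or n_params, which is why they stay valid regardless of whether inference uses a Z or a t distribution.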

REGISTRY: heterogeneity Z-vs-t deviation note replaced with positive
"R parity (post-2026-05-15 df threading)" framing including the rank-
deficient caveat. New "Per-path placebo heterogeneity" Note documents the
R parity, syntax requirements (c(-1) sentinel), survey gate, and test
anchors. CHANGELOG entry under [Unreleased]. llms-full.txt by_path entry
extended with placebo het composition + survey-gate mention. API rst extended
with the same.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…_het parity

Two changes that resolve scope/contract issues with the previous commit on
this branch:

1. Survey gate (scope correctness). The eager `if use_survey and placebo > 0:
raise NotImplementedError` at function entry of `_compute_heterogeneity_test`
fired on EVERY survey + heterogeneity call when fit() passed
`placebo=L_max if self.placebo else 0` with default `placebo=True`. That
broke the previously-supported forward-horizon survey + predict_het path,
contradicting the changelog's "Forward-horizon predict_het + survey_design
is supported" claim. Refactored to a per-iteration backstop inside the
horizon loop (raises only when actually about to compute a backward
iteration under survey). fit() at both global and per-path heterogeneity
call sites now wraps the placebo arg with a warn-and-skip: if
`survey_design + placebo + heterogeneity` are all active, emit a
UserWarning explaining backward-horizon predict_het is deferred under
survey designs and pass `placebo=0` so only forward-horizon heterogeneity
is computed. The function-level NotImplementedError remains as a defensive
backstop for direct callers that bypass fit().
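The two-layer gate described in item 1 can be sketched as below. Function names, argument names, and message texts are hypothetical; only the overall shape (warn and pass `placebo=0` at the caller, raise inside the loop on a backward iteration under survey) follows the commit message.

```python
import warnings


def resolve_placebo_horizons(l_max, placebo, heterogeneity, survey_design):
    """Caller-side wrapper: how many backward horizons to request."""
    if not placebo or heterogeneity is None:
        return 0
    if survey_design is not None:
        warnings.warn(
            "Backward-horizon predict_het is deferred under survey designs; "
            "computing forward-horizon heterogeneity only.",
            UserWarning,
            stacklevel=2,
        )
        return 0
    return l_max


def compute_heterogeneity(l_max, placebo, use_survey):
    """Per-iteration backstop: raise only when a backward horizon is reached."""
    results = {}
    for l_h in [*range(1, l_max + 1), *range(-1, -placebo - 1, -1)]:
        if l_h < 0 and use_survey:
            raise NotImplementedError(
                "Backward-horizon heterogeneity under survey is not supported."
            )
        results[l_h] = f"row for horizon {l_h}"  # placeholder for the OLS row
    return results


# Through the wrapper: survey + placebo + heterogeneity warns, then skips.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    n_back = resolve_placebo_horizons(3, True, "het_x", survey_design=object())

forward_only = compute_heterogeneity(3, n_back, use_survey=True)
print(sorted(forward_only))
```

Placing the raise inside the loop rather than at function entry is what lets forward-horizon survey calls through when the placebo count is zero.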

2. Global placebo predict_het parity (scope completeness). The previous
commit extended `_compute_heterogeneity_test` to compute backward horizons,
which means `results.heterogeneity_effects` (global, non-by_path surface)
also gains negative-int keys when `placebo=True + heterogeneity`. The R-
parity coverage was scoped to per-path only (scenario 22). Added scenario
23 (`multi_path_reversible_predict_het_with_placebo_global`) that calls
`did_multiplegt_dyn(predict_het=list("het_x", c(-1)), placebo=2, effects=3)`
WITHOUT by_path; verified R emits forward (effect=1,2,3) AND placebo
(effect=-1,-2) rows on the global surface with 3 paths × 30 switchers each
= n_obs=90 per row. New parity test class
`TestDCDHDynRParityHeterogeneityWithPlacebo` pins all 6 inference fields
across 5 horizons at the same tolerances as the per-path version.

Test changes:
- `test_predict_het_placebo_survey_design_raises` renamed to
  `test_predict_het_placebo_survey_design_warns_and_skips_backward`;
  asserts UserWarning is emitted, no exception raised, and both
  `heterogeneity_effects` and `path_heterogeneity_effects` contain only
  positive-int keys (forward-only emission).
- New `test_compute_heterogeneity_test_direct_call_raises_on_backward_survey`
  exercises the per-iteration backstop directly; locks the API contract
  for any future internal call site that bypasses fit().
- New `test_parity_multi_path_reversible_predict_het_with_placebo_global`
  pins scenario 23 R-parity.

Doc updates: REGISTRY heterogeneity Note clarifies "warn + skip" semantics
(was "raises NotImplementedError"), references both global and per-path
parity classes. CHANGELOG entry rewritten to reflect global + per-path
scope. llms-full.txt and API rst updated similarly.

Test count: 312 -> 314 (added 2; the renamed test still passes under its
new contract).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The top-level heterogeneity Note in REGISTRY.md still said "This
implementation provides only post-treatment regressions" — stale after
the placebo predict_het additions in this branch's earlier commits.
Updated to reflect:

1. Python now matches R on per-horizon placebo regressions (when
   `placebo=True` + `heterogeneity=` are co-set) — refers readers to
   the "Placebo predict_het" sub-note for the full contract (global +
   per-path scope, survey warn+skip behavior, R-parity classes).
2. The remaining gap from R is the joint null Wald F-test across all
   predict_het rows — Python emits per-horizon inference only.
3. The `controls` mutex stays unchanged.

Cross-references the existing render-time note in `_render_heterogeneity_section`
("Per-horizon regressions only (no joint F-test)") so the registry text
matches the user-facing summary.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@github-actions

Overall Assessment

✅ Looks good

Executive Summary

  • No unmitigated P0/P1 issues in the estimator changes. The new placebo predict_het path, negative-index guard, and df threading are consistent with the updated methodology contract.
  • The remaining methodology gaps are explicitly documented/tracked rather than silent: survey backward-horizon heterogeneity is gated, and near-rank-deficient df still uses pre-drop column count.
  • Minor user-facing docs drift remains: to_dataframe(level="by_path") still says placebo het_* columns are NaN, but the implementation now fills them.
  • The regenerated golden-value schema is type-unstable for empty placebo_* mappings ([] when empty, object when populated), which is a maintainability/test-asset issue.
  • I could not run pytest in this environment because pytest is not installed.

Methodology

  • No unmitigated P0/P1 findings. The changed estimator path in diff_diff/chaisemartin_dhaultfoeuille.py:L5092-L5215 matches the updated registry note in docs/methodology/REGISTRY.md:L637-L643, and the remaining deviations are documented rather than silent.

Code Quality

  • No findings.

Performance

  • No findings.

Maintainability

  • Severity: P3. Impact: the golden-value contract for placebo heterogeneity is type-unstable: empty placebo_predict_het / placebo_horizons serialize as arrays, while populated cases serialize as objects, which makes generic consumers fragile. See benchmarks/R/generate_dcdh_dynr_test_values.R:L631-L658, benchmarks/R/generate_dcdh_dynr_test_values.R:L665-L699, and the emitted examples at benchmarks/data/dcdh_dynr_golden_values.json:L1492-L1492, benchmarks/data/dcdh_dynr_golden_values.json:L1551-L1551. Concrete fix: normalize empty cases to object-shaped mappings before serialization, or normalize them on load in the parity helpers.

Tech Debt

  • Severity: P3. Impact: the PR intentionally leaves two edge cases deferred rather than silently wrong: survey placebo + predict_het is gated off, and near-rank-deficient non-survey heterogeneity still uses n_obs - n_params instead of n_obs - rank for t df. See diff_diff/chaisemartin_dhaultfoeuille.py:L3885-L3899, diff_diff/chaisemartin_dhaultfoeuille.py:L3936-L3951, diff_diff/chaisemartin_dhaultfoeuille.py:L5182-L5215, and TODO.md:L63-L64. Concrete fix: follow the existing TODO items when the pre-period survey allocator and rank-threading work are picked up.

Security

  • No findings.

Documentation/Tests

  • Severity: P3. Impact: to_dataframe(level="by_path") still documents het_* as populated only on positive horizons and NaN on placebo rows, but the implementation now fills placebo het_* fields from path_heterogeneity_effects. See the stale docstring at diff_diff/chaisemartin_dhaultfoeuille_results.py:L1530-L1558 versus the new behavior at diff_diff/chaisemartin_dhaultfoeuille_results.py:L1865-L1908. Concrete fix: update the docstring to match the new placebo-row contract and mention the survey forward-only fallback.
  • Verification note: I could not execute the added tests because pytest is not installed in this environment.

Two informational findings from the latest review:

1. Stale `to_dataframe(level="by_path")` docstring at
   `chaisemartin_dhaultfoeuille_results.py:1530-1558` still claimed
   placebo `het_*` columns are NaN. Updated to document the post-#422
   contract: positive-horizon AND negative-horizon (placebo) rows are
   both populated when `placebo=True + heterogeneity=` are co-set;
   placebo rows under `survey_design` remain NaN with a fit-time
   UserWarning.

2. JSON golden-fixture type instability for empty `placebo_predict_het`
   / `placebo_horizons` slots. R's `jsonlite::toJSON` serializes
   plain `list()` as `[]` (array) but populated named lists as `{}`
   (object), so consumers iterating `.items()` on the slot saw
   different shapes across scenarios. Fixed both ends:
   - R-side: extractors initialize empty slots with
     `structure(list(), names = character(0))` which jsonlite
     serializes as `{}` even when empty. Verified across 4 scenarios
     (20, 21, 22, 23) — all `placebo_predict_het` / `placebo_horizons`
     slots now serialize as objects regardless of population.
   - Python-side: added `_as_dict` helper in the parity test module
     as a defensive backstop coercing any non-dict (None / [] /
     missing) to {}. Used at the two call sites that read optional
     placebo slots so consumers can call `.items()` uniformly.
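A minimal sketch of the Python-side coercion described above, assuming the helper simply maps any non-dict value to an empty dict. The `as_dict` name and the inline JSON snippets are illustrative; only the slot name `placebo_predict_het` comes from this PR.

```python
import json


def as_dict(value):
    """Return `value` if it is a dict, else an empty dict (None / [] / missing)."""
    return value if isinstance(value, dict) else {}


# Three shapes a consumer might meet for an optional placebo slot.
empty_array = json.loads('{"placebo_predict_het": []}')
populated = json.loads('{"placebo_predict_het": {"-1": {"beta": 0.1}}}')
missing = json.loads('{}')

for scenario in (empty_array, populated, missing):
    slot = as_dict(scenario.get("placebo_predict_het"))
    # .items() is now safe regardless of how the slot was serialized.
    print(sorted(slot.items()))
```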

The golden JSON was regenerated and is now type-stable across all
scenarios (verified via `jq` on each scenario's predict_het type).

314 tests pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@github-actions

🔁 AI review rerun (requested by @igerber)

Head SHA: bcdb00a19d0d673d29a890a148ebc836633201db


Overall Assessment

✅ Looks good

Executive Summary

  • No unmitigated P0/P1 issues found in the changed estimator paths. The new backward-horizon predict_het support and finite-df threading are consistent with the updated methodology contract in docs/methodology/REGISTRY.md:L637-L643.
  • The prior re-review P3s appear addressed: the by-path to_dataframe() docs now describe placebo het_* population in diff_diff/chaisemartin_dhaultfoeuille_results.py:L1546-L1566, and the R fixture extractor now stabilizes empty placebo mappings in benchmarks/R/generate_dcdh_dynr_test_values.R:L631-L707.
  • Remaining methodology caveats are explicitly documented/tracked rather than silent: survey-weighted backward-horizon heterogeneity is warn+skip, and near-rank-deficient non-survey df still uses n_obs - n_params; both are tracked in TODO.md:L63-L64.
  • One minor docs regression remains: the public fit() docstring still says heterogeneity has “no placebo regressions,” and the heterogeneity summary note does not mention the new survey forward-only fallback.
  • Verification note: I could not run pytest here because python -m pytest --version fails with No module named pytest.

Methodology

  • No unmitigated findings. The changed behavior is internally consistent across diff_diff/chaisemartin_dhaultfoeuille.py:L3878-L3915, diff_diff/chaisemartin_dhaultfoeuille.py:L5087-L5215, the updated registry note at docs/methodology/REGISTRY.md:L637-L643, and the new parity coverage in tests/test_chaisemartin_dhaultfoeuille_parity.py:L1558-L1811.

Code Quality

  • No findings.

Performance

  • No findings.

Maintainability

  • No findings. The previous golden-fixture shape instability appears resolved by the explicit empty-object serialization in benchmarks/R/generate_dcdh_dynr_test_values.R:L631-L707.

Tech Debt

  • Severity: P3-informational. Impact: survey_design + placebo + heterogeneity intentionally falls back to forward-horizon-only heterogeneity because the pre-period cell allocator is still underived; this is now explicit and tracked, not silent. Concrete fix: follow the tracked TODO to derive a pre-period survey allocator before enabling backward-horizon survey predict_het. References: diff_diff/chaisemartin_dhaultfoeuille.py:L3885-L3899, docs/methodology/REGISTRY.md:L637-L643, TODO.md:L63-L63.
  • Severity: P3-informational. Impact: near-rank-deficient non-survey heterogeneity still uses df = n_obs - n_params instead of post-drop rank, so R parity can still drift in that edge case; fully rank-deficient cases remain NaN-consistent. Concrete fix: thread effective rank from solve_ols() into _compute_heterogeneity_test, as already tracked. References: diff_diff/chaisemartin_dhaultfoeuille.py:L5195-L5215, TODO.md:L64-L64.

Security

  • No findings.

Documentation/Tests

  • Severity: P3. Impact: the public/in-code documentation is only partially updated. fit() still says heterogeneity is “post-treatment regressions only (no placebo regressions),” which now contradicts the implementation, API docs, registry, and new parity tests; the heterogeneity summary note also omits the survey forward-only fallback. This can mislead users and future reviewers who rely on in-code docstrings as methodology references. Concrete fix: update the heterogeneity parameter docstring in diff_diff/chaisemartin_dhaultfoeuille.py:L980-L989 and the summary note in diff_diff/chaisemartin_dhaultfoeuille_results.py:L1279-L1283 to match the contract already documented in docs/api/chaisemartin_dhaultfoeuille.rst:L33-L41 and docs/methodology/REGISTRY.md:L637-L643.
  • No other findings. The previous by-path to_dataframe() doc drift flagged in the last review looks fixed in diff_diff/chaisemartin_dhaultfoeuille_results.py:L1546-L1566, and the new global/per-path placebo parity tests are present in tests/test_chaisemartin_dhaultfoeuille_parity.py:L1558-L1811.

…tract

Two stale in-code documentation surfaces from the latest review:

1. `chaisemartin_dhaultfoeuille.py:980-989` — the `heterogeneity`
   parameter docstring on `fit()` still said "post-treatment regressions
   only (no placebo regressions)". Updated to document the post-#422
   contract: per-horizon OLS regressions on forward AND backward
   (placebo) horizons when `placebo=True`; survey_design composes with
   forward horizons but warns + skips backward horizons until the
   pre-period cell allocator is derived.

2. `chaisemartin_dhaultfoeuille_results.py:1279-1286` — the heterogeneity
   summary note didn't mention the survey forward-only fallback.
   Extended the note to cover the gating semantics so users reading
   `result.summary()` under `survey_design + heterogeneity` know what
   they're getting.

Both surfaces now match the contract already documented in the API rst,
REGISTRY, and CHANGELOG. No behavior change.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@github-actions

🔁 AI review rerun (requested by @igerber)

Head SHA: d20988da5d099dec24355c6b09e1afaf13469f5b


Overall Assessment

✅ Looks good

Executive Summary

  • No unmitigated P0/P1 issues found in the changed estimator paths.
  • The affected method is dCDH heterogeneity testing from Web Appendix Section 1.5 / Lemma 7; the new backward-horizon predict_het behavior and finite-df inference are aligned with the updated Methodology Registry and the added R-parity fixtures.
  • The prior re-review P3s appear addressed: the public fit() docstring, the heterogeneity summary note, the by-path dataframe docs, and the R fixture extractor’s empty-map handling are all updated.
  • The new out_idx < 0 guard closes the only obvious silent-indexing risk in the placebo heterogeneity loop at diff_diff/chaisemartin_dhaultfoeuille.py:L5136-L5148.
  • Remaining caveats are documented and tracked rather than silent: survey-weighted backward-horizon heterogeneity is intentionally deferred, and near-rank-deficient df still uses n_obs - n_params.
  • Verification note: pytest is not available in this workspace (python -m pytest --version fails with No module named pytest), so this is a diff-based re-review.

Methodology
No findings. The changed heterogeneity paths are consistent with the documented contract in docs/methodology/REGISTRY.md:L637-L647, the fit-time threading in diff_diff/chaisemartin_dhaultfoeuille.py:L3889-L3963, the core implementation in diff_diff/chaisemartin_dhaultfoeuille.py:L5098-L5354, and the new global/per-path R-parity coverage in tests/test_chaisemartin_dhaultfoeuille_parity.py:L1358-L1535 and tests/test_chaisemartin_dhaultfoeuille_parity.py:L1558-L1795.

Code Quality
No findings.

Performance
No findings.

Maintainability
No findings. The earlier fixture-shape instability from empty predict_het maps appears fixed by the extractor changes in benchmarks/R/generate_dcdh_dynr_test_values.R:L631-L707.

Tech Debt

  • Severity: P3-informational. Impact: survey_design + placebo + heterogeneity intentionally falls back to forward-only heterogeneity because the pre-period Binder TSL allocator is not yet derived. Concrete fix: derive the pre-period allocator, or a covariance-aware alternative, before enabling backward-horizon survey predict_het. References: diff_diff/chaisemartin_dhaultfoeuille.py:L3889-L3920, docs/methodology/REGISTRY.md:L637-L647, TODO.md:L63-L63.
  • Severity: P3-informational. Impact: near-rank-deficient non-survey heterogeneity can still drift from R because the t-distribution df uses pre-drop n_obs - n_params instead of post-drop rank. Concrete fix: thread effective rank from solve_ols() into _compute_heterogeneity_test. References: diff_diff/chaisemartin_dhaultfoeuille.py:L5193-L5226, TODO.md:L64-L64.

Security
No findings.

Documentation/Tests
No findings. The previous documentation drift appears resolved at diff_diff/chaisemartin_dhaultfoeuille.py:L980-L1000, diff_diff/chaisemartin_dhaultfoeuille_results.py:L1280-L1286, diff_diff/chaisemartin_dhaultfoeuille_results.py:L1555-L1568, and docs/api/chaisemartin_dhaultfoeuille.rst:L33-L41. The new tests cover the changed paths well, including global/per-path placebo parity, survey warn+skip behavior, and the negative-index eligibility guard at tests/test_chaisemartin_dhaultfoeuille.py:L11140-L11380 and tests/test_chaisemartin_dhaultfoeuille_parity.py:L1558-L1795.

@igerber igerber added the ready-for-ci label (Triggers CI test workflows) May 15, 2026
@igerber igerber merged commit 2cb8def into main May 16, 2026
33 of 34 checks passed
@igerber igerber deleted the dcdh-heterogeneity-placebo-and-df branch May 16, 2026 10:18