dCDH by_path: lift trends_linear + trends_nonparam gates (Wave 3 #6+#7) #393
…strap

CI reviewer flagged that the cumulated layer was built before the bootstrap propagation block, so when n_bootstrap > 0 it surfaced stale analytical SEs / t-stats / p-values / CIs while path_effects[path]["horizons"][l] had been overwritten with bootstrap stats: a public-surface inconsistency with the library-wide bootstrap contract.

Fix: move the _compute_path_cumulated_event_study() call to AFTER the bootstrap propagation block at chaisemartin_dhaultfoeuille.py:3034-3081 (mirrors the global linear_trends_effects placement at :3405-3454). The helper reads path_effects[path]["horizons"][l]["se"], which by then is bootstrap-overwritten under n_bootstrap > 0, so cumulated SE becomes a running sum of bootstrap per-horizon SEs.

Also addresses the P2 test-coverage gap with two new regressions in TestByPathTrendsLinear:
- test_bootstrap_cumulated_uses_post_bootstrap_per_horizon_se: positive case asserting cumulated SE == running sum of post-bootstrap per-horizon SEs (rtol 1e-12).
- test_bootstrap_cumulated_nan_consistent_when_n_bootstrap_one: n_bootstrap=1 case asserting cumulated SE / t_stat / p_value / conf_int are all NaN per the library-wide NaN-on-invalid contract.

REGISTRY note updated with the post-bootstrap recomputation contract.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
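The ordering constraint in this commit can be illustrated with a minimal sketch, assuming each horizon entry is a dict with `effect` and `se` keys. The function name `cumulate_path_horizons` and the `effect` key are hypothetical stand-ins rather than the library's helper; only the `se` key and the running-sum behaviour come from the message above.

```python
import math
from itertools import accumulate


def cumulate_path_horizons(horizons):
    """Running sums of per-horizon effects and SEs for one path.

    `horizons` maps a positive horizon l to {"effect": float, "se": float}.
    Hypothetical helper: it only illustrates why the call has to run after
    the bootstrap block has overwritten the per-horizon "se" values.
    """
    ls = sorted(horizons)
    effects = accumulate(horizons[l]["effect"] for l in ls)
    # Summing SEs gives a conservative upper bound on the SE of the sum;
    # a NaN per-horizon SE propagates to every later cumulated SE.
    ses = accumulate(horizons[l]["se"] for l in ls)
    return {
        l: {"cumulated_effect": e, "cumulated_se": s}
        for l, e, s in zip(ls, effects, ses)
    }


# Made-up per-horizon values, as they might look after bootstrap propagation.
post_bootstrap = {
    1: {"effect": 0.5, "se": 0.25},
    2: {"effect": 0.25, "se": 0.5},
    3: {"effect": 1.0, "se": 0.125},
}
cumulated = cumulate_path_horizons(post_bootstrap)
```

Because the sketch reads whatever `se` values are present when it is called, running it before the bootstrap overwrite would cumulate the analytical SEs instead, which is the stale-surface behaviour the reviewer flagged.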
/ai-review

🔁 AI review rerun (requested by @igerber)

Methodology
Resolved in this rereview: the earlier stale-bootstrap-cumulation issue is fixed by recomputing …
CI reviewer flagged that by_path + trends_linear was ungated on all
panels but only warned on F_g<3 exclusions, while the analogous
by_path + controls path warns on multi-baseline switcher panels
(D_{g,1} multiplicity) where R's per-path full-pipeline call diverges
from Python's global-then-disaggregate architecture. The by_path
docstring already acknowledged the "same multi-baseline divergence
pattern as controls" (lines 461-462), but the runtime warning was
missing: a docstring-vs-implementation gap.
Fix: mirror the controls warning at
chaisemartin_dhaultfoeuille.py:1565-1584 inside the trends_linear block
(right after the F_g<3
warning). Same predicate (switcher mask on first_switch_idx_arr,
unique baseline check). Different message naming the trends_linear
mechanism (full-pipeline including first-differencing per path
subset).
REGISTRY note updated to spell out the warning explicitly.
Three regression tests added in TestByPathTrendsLinear, mirroring
TestByPathControls:
- test_multi_baseline_panel_emits_r_deviation_warning: panel with
joiners and leavers must emit the warning
- test_single_baseline_panel_does_not_emit_r_deviation_warning:
single-baseline panel must not warn
- test_single_baseline_heterogeneous_F_g_does_not_warn: pins the
precise predicate (baseline multiplicity, NOT F_g multiplicity)
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
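The predicate described in this commit (switcher mask, then a unique-baseline check) might look roughly like the sketch below. The function names, the `-1` sentinel for never-switchers, and the warning text are assumptions made for illustration; the library's actual code and message are not shown on this page.

```python
import warnings


def has_multi_baseline_switchers(first_switch_idx, baselines, never=-1):
    """True when switcher groups start from more than one baseline treatment.

    `first_switch_idx[g]` is the group's first-switch period index and
    `baselines[g]` its period-1 treatment D_{g,1}. The `never` sentinel for
    groups that never switch is an assumption of this sketch.
    """
    switcher_baselines = {
        d for idx, d in zip(first_switch_idx, baselines) if idx != never
    }
    return len(switcher_baselines) > 1


def warn_if_multi_baseline(first_switch_idx, baselines):
    """Emit a UserWarning on multi-baseline switcher panels (illustrative text)."""
    if has_multi_baseline_switchers(first_switch_idx, baselines):
        warnings.warn(
            "by_path + trends_linear: switchers have multiple baseline "
            "treatments; per-path results may deviate from R.",
            UserWarning,
            stacklevel=2,
        )
```

Under these assumptions a panel with joiners and leavers, such as `first_switch_idx=[2, 3, -1]` with `baselines=[0, 1, 0]`, triggers the warning, while `first_switch_idx=[2, 3, 4, -1]` with `baselines=[0, 0, 0, 1]` does not: the switchers differ in F_g but share one baseline, and the never-switcher's baseline is masked out.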
/ai-review

🔁 AI review rerun (requested by @igerber)
P1 fix (test coverage gaps for newly lifted parameter interactions):
- test_per_period_effects_unaffected_by_trends_linear_by_path now actually inspects per_period_effects (was checking event_study_effects by mistake). Asserts no_bp == bp on did_plus_t / did_minus_t at every horizon since per-period DID uses raw Y_mat per the contract at chaisemartin_dhaultfoeuille.py:1493-1496.
- Symmetric test_per_period_effects_unaffected_by_trends_nonparam_by_path added; set_ids doesn't touch the per-period path either.
- test_per_path_placebos_with_trends_linear_present asserts path_placebo_event_study is populated under by_path + trends_linear + placebo=True with finite values. The R-parity path skips placebo rows due to documented divergence; this regression locks the Python-side surface population so the path doesn't silently regress.
- test_sup_t_bands_with_trends_linear_finite_crit and the nonparam symmetric test pin the bootstrap-collector path: path_sup_t_bands populated, cband_lower/upper columns surface on to_dataframe(level="by_path") under both new combinations with n_bootstrap > 0.

P3 fix (docstring schema staleness):
- by_path docstring corrected: path_effects[path]["horizons"][l] (not path_effects[path][l]).
- to_dataframe(level="by_path") schema now lists cumulated_effect / cumulated_se columns.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
/ai-review

🔁 AI review rerun (requested by @igerber)

Overall Assessment
✅ Looks good - the prior re-review P1 is addressed, and I found no unmitigated P0/P1 issues in the changed estimator paths. One P3 documentation inconsistency remains.

Methodology
No findings. The affected methods are Phase 3 …

Code Quality
No findings.

Performance
No findings.

Maintainability
No findings.

Tech Debt
No findings.

Security
No findings.

Documentation/Tests
The previous re-review P1 is resolved: per-period invariance, trends-linear negative-horizon placebo population, and trends_nonparam bootstrap/sup-t/cband coverage are now explicitly tested in …
Final P3 cleanup round, no methodology / behavior changes:
- REGISTRY.md by_path Note schema now lists cumulated_effect / cumulated_se in the to_dataframe(level="by_path") column list (was missing).
- REGISTRY.md trends_linear paragraph: corrected path_effects[path][l] -> path_effects[path]["horizons"][l] to match the shipped result-class shape.
- TestByPathTrendsLinear class docstring: same path_effects shape correction.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
/ai-review

🔁 AI review rerun (requested by @igerber)

Methodology
No findings. The implementation now matches the documented registry contract for: …

Code Quality
No findings.

Performance
No findings.

Maintainability
No findings.

Tech Debt
No findings. I did not see an existing …

Security
No findings.
…string

P1 fixes (coverage gaps for newly reachable bootstrap+placebo paths):
- test_per_path_placebos_with_trends_linear_bootstrap_inference: asserts negative-horizon SE differs between analytical and bootstrap fits, proving the placebo bootstrap propagation block runs through the first-differenced path.
- test_per_path_placebos_with_trends_linear_bootstrap_nan_consistent: n_bootstrap=1 case asserting NaN-consistent inference on negative horizons (locks the library-wide NaN-on-invalid contract on this newly reachable surface).
- test_per_path_placebos_with_trends_nonparam_bootstrap_inference: comparison fit with vs without trends_nonparam under bootstrap + placebo; asserts negative-horizon SE differs, proving set_ids reaches _collect_path_placebo_bootstrap_inputs.

P3 fix (Results dataclass attribute documentation):
- Added a path_cumulated_event_study attribute entry to ChaisemartinDHaultfoeuilleResults attributes docstring (was added as a field but missing from the public attribute table).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
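The "NaN-on-invalid" contract these tests lock can be sketched as follows. This is a hedged illustration, not the library's code: `safe_inference` is a hypothetical name, and the standard-normal reference distribution is an assumption of the sketch.

```python
import math
from statistics import NormalDist


def safe_inference(effect, se, alpha=0.05):
    """Return (t_stat, p_value, conf_int), all NaN when the SE is unusable.

    Hypothetical helper: every inference field goes to NaN together, so a
    caller never sees a finite p-value next to a NaN standard error.
    """
    if not (math.isfinite(effect) and math.isfinite(se) and se > 0):
        nan = math.nan
        return nan, nan, (nan, nan)
    t_stat = effect / se
    # Two-sided p-value under a standard normal reference (an assumption).
    p_value = 2.0 * (1.0 - NormalDist().cdf(abs(t_stat)))
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    return t_stat, p_value, (effect - z * se, effect + z * se)
```

A regression in the spirit of the n_bootstrap=1 tests would then assert that `t_stat`, `p_value`, and both confidence bounds are NaN whenever the SE passed in is NaN.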
/ai-review

🔁 AI review rerun (requested by @igerber)

Code Quality
No findings.

Performance
No findings.

Maintainability
No findings.

Tech Debt
No findings.

Security
No findings.

Documentation/Tests
No findings on the PR itself. The last review's two test-surface P1s appear resolved by the new negative-horizon bootstrap tests. Review limitation: …
…ontract

CI reviewer flagged that the by_path + trends_linear public contract overstated R parity. The shipped contract said "single-baseline panels" broadly, but the parity scenario was restricted to F_g >= 4 because F_g=3 switchers trigger a 30%+ point-estimate divergence between Python's global-then-disaggregate architecture and R's per-path full-pipeline call (only 1 valid pre-window Z value remains after first-differencing + time==1 filter, triggering different control-eligibility treatment).

Fix: add a targeted UserWarning fired at fit-time when by_path + trends_linear is set AND any switcher has F_g==3 (predicate: first_switch_idx_arr == 2). Mirrors the existing F_g < 3 exclusion warning but for the boundary case rather than the exclusion case.

Documentation: docstring at the by_path paragraph now narrows the parity claim to "single-baseline panels with sufficient pre-window depth (F_g >= 4 for every selected-path switcher)" and explicitly documents the F_g=3 boundary regime + warning. REGISTRY adds the same narrowing + warning text to the per-path linear-trends Note. CHANGELOG adds the divergence note to the Unreleased entry.

Regression test test_F_g_three_boundary_case_emits_warning added in TestByPathTrendsLinear: 4 F_g=3 switchers + 4 F_g=4 switchers in a single-baseline panel, asserts the new warning fires (named "by_path + trends_linear" + "F_g=3").

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
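A minimal sketch of the boundary-case warning, assuming `first_switch_idx` holds 0-based period indices so that F_g=3 corresponds to index 2, as the predicate above states. The function name, the boolean flags, and the message wording are hypothetical; only the predicate and the two substrings the regression test looks for come from the commit message.

```python
import warnings


def warn_on_f_g_three(first_switch_idx, by_path_enabled, trends_linear):
    """Warn when any switcher first switches in period 3 (0-based index 2).

    Returns True if the warning fired. Hypothetical helper for illustration.
    """
    if not (by_path_enabled and trends_linear):
        return False
    n_boundary = sum(1 for idx in first_switch_idx if idx == 2)
    if n_boundary == 0:
        return False
    warnings.warn(
        f"by_path + trends_linear: {n_boundary} switcher(s) have F_g=3; "
        "per-path estimates may diverge from R in this boundary regime.",
        UserWarning,
        stacklevel=2,
    )
    return True
```

With four F_g=3 switchers and four F_g=4 switchers, that is `first_switch_idx=[2, 2, 2, 2, 3, 3, 3, 3]`, the sketch warns once; with only F_g >= 4 switchers it stays silent.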
/ai-review

🔁 AI review rerun (requested by @igerber)
Wave 3 #6+#7 of the dCDH by_path follow-up sequence (after PR #378 shipped #5 by_path + controls). Removes the two NotImplementedError gates at chaisemartin_dhaultfoeuille.py:1014-1023 and adds:
- New `path_cumulated_event_study` Results field surfacing the cumulated level effect delta_l per path under trends_linear=True (mirrors the global linear_trends_effects cumulation; this is what R returns under did_multiplegt_dyn(..., by_path, trends_lin)).
- `set_ids` parameter threaded through the four per-path IF helpers so trends_nonparam's set-restricted control pool reaches per-path analytical SE, bootstrap, placebos, and sup-t bands automatically.
- to_dataframe(level="by_path") gains cumulated_effect / cumulated_se columns (always present, NaN-when-None, mirroring cband_*).
- summary() renders a per-path "Cumulated Level Effects" sub-section.

Validated against R DIDmultiplegtDYN 2.3.3 via two new golden scenarios:
- single_baseline_multi_path_by_path_trends_lin (custom DGP: F_g >= 4, cohort-single-path, n_periods=13): per-path cumulated point estimates match R bit-exactly (POINT_RTOL=1e-9), cumulated SE within ±20%
- multi_path_reversible_by_path_trends_nonparam: per-path point estimates AND placebos match R bit-exactly, SE within ±15%

Placebo parity for trends_linear is intentionally skipped: R's per-path placebo re-runs on the path-restricted subsample with different control eligibility than Python's global-then-disaggregate architecture, so the divergence is methodological, not a tolerance issue. Internal regression covers placebo + trends_linear (finite values, bootstrap inheritance).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
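The two-tier tolerance scheme above (tight on point estimates, loose on SEs) can be sketched like this. The helper `parity_failures`, the row layout, and the numbers in the usage are all made up for illustration; only the tolerance values come from the commit message. Note that `math.isclose` scales its relative tolerance by the larger of the two magnitudes, which may differ slightly from the comparison the real parity tests use.

```python
import math

POINT_RTOL = 1e-9  # point-estimate tolerance quoted in the commit message
SE_RTOL = 0.20     # the ±20% cumulated-SE band quoted in the commit message


def parity_failures(py_rows, r_rows, point_rtol=POINT_RTOL, se_rtol=SE_RTOL):
    """List (horizon, field) pairs where Python and R disagree beyond tolerance.

    Both arguments map horizon -> {"effect": float, "se": float}.
    """
    failures = []
    for l, r in r_rows.items():
        py = py_rows[l]
        if not math.isclose(py["effect"], r["effect"], rel_tol=point_rtol):
            failures.append((l, "effect"))
        if not math.isclose(py["se"], r["se"], rel_tol=se_rtol):
            failures.append((l, "se"))
    return failures


# Illustrative values only, not taken from the golden files.
r_rows = {1: {"effect": 1.0, "se": 0.10}, 2: {"effect": 2.0, "se": 0.10}}
py_rows = {1: {"effect": 1.0, "se": 0.11}, 2: {"effect": 2.0, "se": 0.13}}
failures = parity_failures(py_rows, r_rows)
```

In this made-up case horizon 1 passes (SE within the band) and horizon 2 is reported as an SE failure, while both point estimates pass the tight check.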
Summary
Lifts the two `NotImplementedError` gates blocking `ChaisemartinDHaultfoeuille.by_path` from combining with `trends_linear` (DID^{fd} group-specific linear trends) and `trends_nonparam` (state-set trends), Wave 3 #6+#7 of the dCDH by_path follow-up sequence (PR #378 shipped #5).

- `path_cumulated_event_study` (new Results field): surfaces the cumulated level effect `delta_l = sum_{l'=1..l} DID^{fd}_{path, l'}` per path under `trends_linear=True`. This is what R returns under `did_multiplegt_dyn(..., by_path, trends_lin)`. Mirrors the global `linear_trends_effects` cumulation; raw per-horizon `DID^{fd}_l` per path still surfaces on `path_effects[path]["horizons"][l]`.
- `set_ids` threading: `set_ids` parameter added to the four per-path IF helpers (`_compute_path_effects`, `_compute_path_placebos`, `_collect_path_bootstrap_inputs`, `_collect_path_placebo_bootstrap_inputs`) so `trends_nonparam`'s set-restricted control pool reaches per-path analytical SE, bootstrap, placebos, and sup-t bands automatically.
- `to_dataframe(level="by_path")` gains `cumulated_effect` / `cumulated_se` columns (always present, NaN-when-None, mirroring the `cband_*` convention).
- `summary()` renders a per-path "Cumulated Level Effects" sub-section under each path block.

R parity validated via two new golden-value scenarios (`single_baseline_multi_path_by_path_trends_lin` for trends_linear and `multi_path_reversible_by_path_trends_nonparam` for trends_nonparam). Per-path point estimates match R bit-exactly (`POINT_RTOL=1e-9`); cumulated SE within `CUM_SE_RTOL=0.20` (widened vs the 0.12 used for non-cumulated by_path parity because the conservative upper-bound SE compounds the cross-path cohort-sharing deviation under summation). Placebo parity for `trends_linear` is intentionally skipped (documented architectural divergence between R's per-path full-pipeline call and Python's global-then-disaggregate architecture).

Test plan
- `TestByPathTrendsLinear` (10 non-slow + 1 slow): gate removal, `path_cumulated_event_study` populated under trends_linear, `is None` without trends_linear, conservative SE upper bound formula, `to_dataframe` columns, `summary()` rendering, bootstrap finite SE inheritance, per-period unaffected
- `TestByPathTrendsNonparam` (5 non-slow + 1 slow): gate removal, set restriction changes per-path estimates (sanity check that `set_ids` threading reached the IF helpers), per-path SE finite, validation passthrough (time-varying / missing column), bootstrap finite SE
- `TestDCDHDynRParityByPathTrendsLinear`: per-path cumulated point estimates match R bit-exactly on event horizons; cumulated SE within ±20%
- `TestDCDHDynRParityByPathTrendsNonparam`: per-path point estimates AND placebos match R bit-exactly; per-path SE within ±15%
- `TestLinearTrends` (non-by_path), `TestStateSetTrends` (non-by_path), `TestCovariateAdjustment` (non-by_path) remain unchanged
- `pytest tests/test_chaisemartin_dhaultfoeuille.py tests/test_chaisemartin_dhaultfoeuille_parity.py tests/test_methodology_chaisemartin_dhaultfoeuille.py` (239 + 10 = 249 tests passing locally)

🤖 Generated with Claude Code