Add HonestDiD integration for dCDH, summary() Phase 3 blocks#303
Add HonestDiD integration for dCDH, summary() Phase 3 blocks#303
Conversation
…regression test - Add ChaisemartinDHaultfoeuilleResults extraction to _extract_event_study_params() in honest_did.py (maps placebo horizons to pre-periods, event study to post-periods) - Lift honest_did gate in fit(), add early L_max>=1 validation, post-computation compute_honest_did() call with fallback warning on solver failures - Add 5 new summary() sections: covariate diagnostics, cumulated level effects, heterogeneity test, design-2 descriptive, HonestDiD sensitivity bounds - Update honest_did_results field type annotation and docstring - 17 new tests: 7 HonestDiD integration, 5 summary rendering, 3 honest_did.py extraction/integration, 1 trends_nonparam unequal-support, 1 gate update - REGISTRY.md: HonestDiD note, checklist update; ROADMAP: 3g shipped Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…-case test - Add UserWarning in dCDH HonestDiD extraction about placebo-based pre-periods - Update REGISTRY.md to explicitly document library extension semantics - Update fit() docstring for honest_did (was "Reserved for Phase 3") - Include exception class name in HonestDiD failure warning - Factor summary() Phase 3 blocks into 5 private helper methods - Add test_dcdh_emits_placebo_warning and test_dcdh_empty_consecutive_block_raises Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
|
Overall Assessment Executive Summary
Path to Approval
Methodology
Code Quality
Performance
Maintainability
Tech Debt
Security
Documentation/Tests
|
…ocstrings
P0: _largest_consecutive_block now raises ValueError when boundary horizon
(-1 or +1) is missing after finite-SE filtering instead of silently
returning the full list (would produce wrong HonestDiD bounds).
P1: honest_did=True now rejects placebo=False early instead of silently
returning honest_did_results=None with no warning.
P2: Added 3 regression tests (boundary -1 missing, boundary +1 missing,
placebo=False + honest_did).
P3: Updated docstrings in honest_did.py (6 locations) and docs/llms.txt
to include ChaisemartinDHaultfoeuilleResults alongside MultiPeriodDiD/CS.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
|
/ai-review |
|
🔁 AI review rerun (requested by @igerber) Head SHA: Overall Assessment Executive Summary
Path to Approval
Methodology
Code Quality
Performance
Maintainability
Tech Debt
Security
Documentation/Tests
|
- Update fit() docstring to specify equal-weight average over post horizons (l_vec=None default) and note R's HonestDiD targets on-impact instead - Update REGISTRY.md with l_vec deviation from R's default - Add "Target: Equal-weight avg over post horizons" line to summary - Add test_honest_did_original_estimate_is_post_average regression Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
|
/ai-review |
|
🔁 AI review rerun (requested by @igerber) Head SHA: Overall Assessment
Executive Summary
Path to Approval
Methodology
Code Quality
Performance
Maintainability
Tech Debt
Security
Documentation/Tests
|
- Add l_vec parameter to compute_honest_did() so the advertised custom-target path actually works (was missing from wrapper) - Add test_honest_did_custom_l_vec_on_impact: l_vec=[1,0] targets on-impact effect, asserts original_estimate matches DID_1 - Add test_honest_did_with_trends_nonparam: end-to-end trends_nonparam + honest_did=True integration Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
|
/ai-review |
|
🔁 AI review rerun (requested by @igerber) Head SHA: Overall Assessment Executive Summary
Path to Approval
Methodology
Code Quality
Performance
Maintainability
Tech Debt
Security
Documentation/Tests
|
…namically - Add target_label field to HonestDiDResults (default: equal-weight avg) - HonestDiD.fit() detects common l_vec patterns and sets human-readable label (on-impact, equal-weight, or custom with vector) - Summary renders hd.target_label instead of hard-coded string - Add test_honest_did_custom_l_vec_summary_label: attaches custom-target results and asserts summary shows "on-impact" not "Equal-weight" Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
|
/ai-review |
1 similar comment
|
/ai-review |
|
🔁 AI review rerun (requested by @igerber) Head SHA: Overall Assessment
Executive Summary
Path to Approval
Methodology
Code Quality No material findings. Performance No material findings. Maintainability No separate findings beyond the retained-horizon metadata propagation issue above. Tech Debt
Security No findings. Documentation/Tests
|
- Add pre_periods_used and post_periods_used fields to HonestDiDResults so the retained horizon set is always available on the results object - HonestDiD.fit() populates both fields from the extracted period lists - Summary renders retained horizons below the target label - Add test_honest_did_retains_period_metadata asserting fields populated and summary shows "Post horizons used:" Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
|
/ai-review |
|
🔁 AI review rerun (requested by @igerber) Head SHA: Overall Assessment
Executive Summary
Path to Approval
Methodology
Code Quality
Performance
Maintainability
Tech Debt
Security
Documentation/Tests
|
…mary/export - HonestDiDResults.summary() now renders target_label, pre/post_periods_used - HonestDiDResults.to_dict() includes target_label, pre/post_periods_used - to_dataframe() inherits from to_dict() automatically - Add test_dcdh_standalone_surfaces_target_metadata verifying all three surfaces include target metadata for custom l_vec Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
|
/ai-review |
|
🔁 AI review rerun (requested by @igerber) Head SHA: Overall Assessment Looks good. This re-review resolves the prior P1 on missing standalone HonestDiD estimand metadata; the highest remaining unmitigated item is P2 test coverage. Executive Summary
Methodology
Code Quality
Performance
Maintainability
Tech Debt
Security
Documentation/Tests
|
test_honest_did_trends_nonparam_trimming: State B's early-switching controls vanish at far horizons, causing N_l=0 at h=3 and h=-3. HonestDiD extraction drops NaN-SE horizons and retains [-2,-1,1,2]. Asserts n_obs=0 at trimmed horizons, finite bounds on retained block, and post_periods_used excludes h=3. This exercises the real trends_nonparam support-trimming path through HonestDiD (not SE mutation), addressing the recurring P2 finding. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
|
/ai-review |
|
🔁 AI review rerun (requested by @igerber) Head SHA: Overall Assessment ⛔ Blocker — one unmitigated Executive Summary
Methodology
Code Quality No additional findings beyond the Performance No findings. Maintainability No findings. The Tech Debt
Security No findings. Documentation/Tests
Path to Approval
|
compute_honest_did() call in fit() now passes alpha=self.alpha instead of inheriting the default 0.05. Added regression test asserting honest_did_results.alpha matches estimator alpha. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
|
/ai-review |
|
🔁 AI review rerun (requested by @igerber) Head SHA: Overall Assessment
Executive Summary
Methodology Affected method: HonestDiD (Rambachan-Roth 2023) applied to the dCDH placebo + event-study surface. The new
Code Quality
Performance
Maintainability
Tech Debt
Security
Documentation/Tests
Path to Approval
|
P1: Add test_honest_did_with_bootstrap - fits with n_bootstrap=49 and
honest_did=True, asserts finite bounds and retained horizons.
P2: test_honest_did_trends_nonparam_trimming now captures warnings and
asserts the placebo-based pre-period warning is emitted (was
suppressing all warnings before).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
|
/ai-review |
|
/ai-review |
|
🔁 AI review rerun (requested by @igerber) Head SHA: Overall Assessment ✅ Looks good — the prior bootstrap/HonestDiD interaction Executive Summary
Methodology
Code Quality No findings. Performance No findings. Maintainability No findings. Tech Debt
Security No findings. Documentation/Tests
Static only: I could not run the changed tests in this environment because the sandbox lacks |
|
/ai-review |
Add test_dcdh_interior_gap_triggers_trimming_warning: corrupts h=-2 SE to create an interior gap [-3, -1], asserts the "Dropping non-consecutive horizons" warning is emitted and pre_periods_used == [-1]. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
|
/ai-review |
|
🔁 AI review rerun (requested by @igerber) Head SHA: Overall Assessment ✅ Looks good No unmitigated Executive Summary
Methodology
Code Quality No findings. Performance No findings. Maintainability No findings. Tech Debt
Security No findings. Documentation/Tests No findings. The previous warning-coverage gap is resolved by tests/test_honest_did.py:L1449-L1470, and the fit-time integration path now has direct coverage for validation, standalone parity, trimming, bootstrap, and summary rendering in tests/test_chaisemartin_dhaultfoeuille.py:L3253-L3311, tests/test_chaisemartin_dhaultfoeuille.py:L3515-L3550, and tests/test_chaisemartin_dhaultfoeuille.py:L3558-L3654. |
|
/ai-review |
|
🔁 AI review rerun (requested by @igerber) Head SHA: Overall Assessment ✅ Looks good No unmitigated P0/P1 findings in the changed dCDH HonestDiD or Executive Summary
Methodology
Code Quality No findings. Performance No findings. Maintainability No findings. Tech Debt
Security No findings. Documentation/Tests No findings. The new tests cover fit-time HonestDiD integration, precondition errors, placebo-warning emission, consecutive-grid trimming, and the new summary blocks at tests/test_chaisemartin_dhaultfoeuille.py:3253, tests/test_chaisemartin_dhaultfoeuille.py:3515, tests/test_chaisemartin_dhaultfoeuille.py:3558, tests/test_honest_did.py:1402, and tests/test_honest_did.py:1449. I could not run them here because |
Summary
honest_did=Trueinfit()and standalonecompute_honest_did(results)pathsummary()rendering sections for Phase 3 results: covariate diagnostics, cumulated level effects, heterogeneity test, design-2 descriptive, HonestDiD sensitivity boundstrends_nonparamunequal-support regression test (Assumption 14 support-trimming)Methodology references (required if estimator / math changes)
DID^{pl}_lplacebo estimates as pre-period coefficients rather than standard event-study pre-treatment coefficients. Runtime warning emitted. Documented in REGISTRY.md.diag(SE^2). Documented in REGISTRY.md.trends_nonparamsupport-trimming are handled by filtering to largest consecutive block with warning (CS branch errors on gaps).Validation
tests/test_chaisemartin_dhaultfoeuille.py: +13 tests (7 HonestDiD integration, 5 summary rendering, 1 unequal-support)tests/test_honest_did.py: +5 tests (integration, extraction, no-placebos error, placebo warning, empty-block edge case)Security / privacy
Generated with Claude Code