SyntheticControl: leave-one-out + in-time placebo (ADH 2015 §4) #514
Merged
…5 §4)

Adds the two ADH-2015 §4 robustness diagnostics to the classic SyntheticControl estimator (the agreed ship-blockers before launch) as opt-in results methods that re-run the validated solver and leave the no-analytical-inference contract intact (se/t_stat/p_value/conf_int/is_significant stay bound to the NaN p_value).

- leave_one_out(): drops each reportably-weighted donor (the >1e-6 support, frozen on the fit snapshot at fit time so it is immune to post-fit donor_weights mutation) and re-fits the treated unit; returns a baseline + per-drop ATT/delta_att table; the headline single-donor-dependence metric is the baseline-relative max_abs_delta_att.
- in_time_placebo(): reassigns the intervention to an earlier pre-date and measures the placebo effect over the held-out window (~0 if no real pre-period effect). TRUNCATE windowing re-cuts predictor specs to the pre-fake window (custom_v subset in lockstep, raveled to support array-like inputs), excludes the true post-periods entirely (no peeking), and requires >=2 pre-fake periods (documented Note). Sweeps feasible dates by default; explicit dates are validated, de-duplicated, and canonicalized; empty explicit input raises. Statuses distinguish ran / infeasible / failed and the mixed all_dates_unusable case with n_failed / n_infeasible counts.

Both fail closed (non-converged treated fit, too-few donors/pre-periods, all-failed refits). Wired into DiagnosticReport (_scm_native opt-in blocks with machine-readable reason_code), BusinessReport, and practitioner_next_steps.

Validation: deterministic self-consistency (each diagnostic == a fresh fit on the equivalent sub-problem, 1e-7) plus an R Synth drop-donor LOO golden on Basque; the custom-V solver's existing Basque R-parity transitively anchors both. R Synth has no in-time/LOO function (documented). Remaining ADH-2015 items (CV V-selection, W_reg extrapolation, sparse-SC) deferred in TODO.md.

Docs: REGISTRY §SyntheticControl, REPORTING.md, api/synthetic_control.rst, LLM guides, README, CHANGELOG.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
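The leave-one-out loop described in the commit message above can be sketched roughly as follows. This is a minimal illustration, not the library's implementation: the `fit` callable, its return shape, the row keys, and `toy_fit` are all hypothetical stand-ins, and only the >1e-6 support threshold, the frozen support, and the `max_abs_delta_att` headline come from the description.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical solver signature: fit(donor_pool) -> (att, weight per donor).
FitFn = Callable[[List[str]], Tuple[float, Dict[str, float]]]

REPORTABLE_FLOOR = 1e-6  # the ">1e-6 support" threshold named in the description


def leave_one_out(fit: FitFn, donors: List[str]) -> dict:
    """Re-fit once per reportably-weighted donor and tabulate the ATT shift."""
    if len(donors) < 2:
        # Fail closed: dropping the only donor would leave an empty pool.
        raise ValueError("need at least two donors for leave-one-out")
    baseline_att, weights = fit(list(donors))
    # Freeze the support as an immutable tuple right after the baseline fit,
    # so later mutation of `weights` cannot change which donors get dropped.
    support = tuple(d for d in donors if weights.get(d, 0.0) > REPORTABLE_FLOOR)
    if not support:
        raise ValueError("no donor carries a reportable weight")
    rows = []
    for dropped in support:
        att, _ = fit([d for d in donors if d != dropped])
        rows.append({"dropped": dropped, "att": att, "delta_att": att - baseline_att})
    return {
        "baseline_att": baseline_att,
        "rows": rows,
        "max_abs_delta_att": max(abs(r["delta_att"]) for r in rows),
    }


def toy_fit(donors: List[str]) -> Tuple[float, Dict[str, float]]:
    """Toy stand-in for a real solver: fixed affinities renormalised over the pool."""
    affinity = {"A": 0.6, "B": 0.4, "C": 0.0}
    donor_gap = {"A": 1.0, "B": 2.0, "C": 5.0}
    total = sum(affinity[d] for d in donors)
    weights = {d: affinity[d] / total for d in donors}
    att = 3.0 - sum(weights[d] * donor_gap[d] for d in donors)
    return att, weights


result = leave_one_out(toy_fit, ["A", "B", "C"])
```

In the toy run, donor C carries zero weight and is therefore never dropped, which mirrors the idea that sub-floor weights are skipped.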
Overall Assessment: ✅ Looks good
Executive Summary
Methodology
Code Quality
Performance
Maintainability
Tech Debt
Security
Documentation/Tests
…odex review)

P2: _scm_native's in_time_placebo "ran" block now surfaces n_ran / n_infeasible (not just n_dates / n_failed), so a partially-usable sweep (some dates ran, some infeasible) is not summarized as full coverage. Regression test added.

P3: aligned the remaining "positively-weighted" copy (docs/api/synthetic_control.rst, CHANGELOG, llms-full) to the documented "reportably-weighted (>1e-6)" contract, and refreshed the _check_estimator_native SCM summary to mention the leave_one_out / in_time_placebo blocks.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
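The P2 fix above is about keeping partial coverage visible. A minimal sketch of such a summary, assuming per-date statuses arrive as plain strings: the function name `summarize_placebo_sweep` and the exact mapping to a top-level status are assumptions made for illustration; only the status names and the n_dates / n_ran / n_infeasible / n_failed counts come from the commit message.

```python
from collections import Counter
from typing import Dict, List

ALLOWED = {"ran", "infeasible", "failed"}


def summarize_placebo_sweep(statuses: List[str]) -> Dict[str, object]:
    """Collapse per-date statuses into one block that keeps every count visible."""
    if not statuses:
        raise ValueError("no placebo dates to summarize")
    unknown = set(statuses) - ALLOWED
    if unknown:
        raise ValueError(f"unknown status values: {sorted(unknown)}")
    counts = Counter(statuses)
    n_ran, n_infeasible, n_failed = counts["ran"], counts["infeasible"], counts["failed"]
    # Assumed mapping: any usable date makes the block "ran"; otherwise report
    # the single failure mode, or the mixed case when both modes occur.
    if n_ran > 0:
        status = "ran"
    elif n_failed == 0:
        status = "infeasible"
    elif n_infeasible == 0:
        status = "failed"
    else:
        status = "all_dates_unusable"
    return {
        "status": status,
        "n_dates": len(statuses),
        "n_ran": n_ran,
        "n_infeasible": n_infeasible,
        "n_failed": n_failed,
    }


partial = summarize_placebo_sweep(["ran", "infeasible", "ran"])
mixed = summarize_placebo_sweep(["failed", "infeasible"])
```

Reporting n_ran next to n_dates lets a reader see that only two of three dates were usable in the `partial` case, even though the block status is "ran".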
🔁 AI review rerun (requested by @igerber)
Head SHA:
Overall Assessment: ✅ Looks good
Executive Summary
Methodology
Code Quality
Performance
Maintainability
Tech Debt
Security
Documentation/Tests
Summary
- Adds the two ADH-2015 §4 robustness diagnostics to the classic SyntheticControl estimator (the agreed ship-blockers before launch), as opt-in SyntheticControlResults methods that re-run the validated solver and leave the no-analytical-inference contract intact (se/t_stat/p_value/conf_int/is_significant stay bound to the NaN p_value).
- leave_one_out(): drops each reportably-weighted donor (the >1e-6 support, frozen on the fit snapshot so it is immune to post-fit donor_weights mutation) and re-fits the treated unit; returns a baseline + per-drop ATT/delta_att table; the reporting headline is the baseline-relative max_abs_delta_att. get_leave_one_out_df() / get_leave_one_out_gaps() accessors.
- in_time_placebo(): reassigns the intervention to an earlier pre-date and measures the placebo effect over the held-out window (~0 if no real pre-period effect). TRUNCATE windowing re-cuts predictor specs to the pre-fake window (custom_v subset in lockstep, raveled for array-like inputs), excludes the true post-periods entirely (no peeking), and requires ≥2 pre-fake periods. Sweeps feasible dates by default; explicit dates are validated, de-duplicated, and canonicalized; empty input raises. Statuses: ran / infeasible / failed / mixed all_dates_unusable with n_failed / n_infeasible. get_in_time_placebo_df() / get_in_time_placebo_gaps() accessors.
- Wired into DiagnosticReport (_scm_native opt-in blocks with machine-readable reason_code), BusinessReport, and practitioner_next_steps.
Methodology references (required if estimator / math changes)
- SyntheticControl: Abadie, Diamond & Hainmueller (2015) §4 robustness diagnostics (leave-one-out donor robustness; in-time / backdating placebo).
- docs/methodology/papers/abadie-diamond-hainmueller-2015-review.md. Documented in docs/methodology/REGISTRY.md §SyntheticControl.
- (1) R Synth has no in-time function, so truncate = a manual dataprep + synth re-run for outcome-predictor fits; documented **Note:**. (2) Leave-one-out drops the reportable (>1e-6) support rather than every strictly-positive weight (sub-floor = numerical dust, ~0 delta_att); documented **Note:**. (3) In-time placebo requires ≥2 pre-fake periods (stricter than the base T0≥1; an auto-swept single-pre-fake date is non-credible); documented **Note:**. Validation: deterministic self-consistency (each diagnostic == a fresh fit on the equivalent sub-problem, to 1e-7) + an R Synth drop-donor LOO golden on Basque; the custom-V solver's existing Basque R-parity transitively anchors both. Remaining ADH-2015 items (CV V-selection, W^reg extrapolation, sparse-SC) deferred in TODO.md.
Validation
- tests/test_methodology_synthetic_control.py (LOO + in-time behavioral/edge/fail-closed/pickle/determinism/self-consistency + Tier-1 R LOO parity), tests/test_diagnostic_report.py, tests/test_business_report.py, tests/test_practitioner.py; R golden benchmarks/R/generate_synth_basque_golden.R + tests/data/synth_basque_golden.json (drop-donor LOO block).
- Local: 478 passed (pure-Python) + Rust/@slow tier green; black/ruff clean; mypy no new errors. Iterated /ai-review-local --backend codex to a clean verdict (9 rounds).
Security / privacy
Generated with Claude Code
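For readers who want a concrete picture of the in-time placebo windowing described in the Summary, here is a small sketch under stated assumptions. The helper names `candidate_dates` and `placebo_window`, the integer period labels, and the choice to raise on unknown dates are hypothetical; only the rules themselves (true post-periods excluded, at least two pre-fake periods, default sweep over feasible dates, explicit dates de-duplicated and canonicalized, empty explicit input raising) come from the PR description.

```python
from typing import List, Optional, Sequence, Tuple

MIN_PRE_FAKE_PERIODS = 2  # the ">=2 pre-fake periods" rule from the description


def candidate_dates(
    periods: Sequence[int], true_start: int, explicit: Optional[Sequence[int]] = None
) -> List[int]:
    """Return placebo dates: the feasible sweep by default, else cleaned explicit input."""
    pre_true = sorted(p for p in periods if p < true_start)
    if explicit is None:
        # Default sweep: every pre-date that leaves enough pre-fake periods.
        return pre_true[MIN_PRE_FAKE_PERIODS:]
    if len(explicit) == 0:
        raise ValueError("explicit placebo dates must not be empty")
    unknown = sorted(set(explicit) - set(pre_true))
    if unknown:
        raise ValueError(f"not pre-treatment periods: {unknown}")
    return sorted(set(explicit))  # de-duplicate and put in canonical order


def placebo_window(
    periods: Sequence[int], true_start: int, fake_start: int
) -> Tuple[List[int], List[int]]:
    """Split into (pre-fake fitting window, held-out window), never using true post-periods."""
    pre_true = sorted(p for p in periods if p < true_start)  # no peeking past true_start
    if fake_start not in pre_true:
        raise ValueError("fake_start must be a true pre-treatment period")
    pre_fake = [p for p in pre_true if p < fake_start]
    held_out = [p for p in pre_true if p >= fake_start]
    if len(pre_fake) < MIN_PRE_FAKE_PERIODS:
        raise ValueError("infeasible: fewer than two pre-fake periods")
    return pre_fake, held_out


years = list(range(2000, 2010))
sweep = candidate_dates(years, true_start=2006)
cleaned = candidate_dates(years, true_start=2006, explicit=[2004, 2003, 2004])
fit_window, held_out_window = placebo_window(years, true_start=2006, fake_start=2003)
```

A real implementation would record an infeasible date as a status rather than raise, so that a sweep can continue; the exception here just keeps the sketch short.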