SyntheticControl: cv (out-of-sample) + inverse-variance V-selection (ADH 2015 / Abadie 2021)#523

Merged
igerber merged 1 commit into main from feature/sc-v-selection on Jun 1, 2026



@igerber igerber commented Jun 1, 2026

Summary

  • Adds two v_method values to SyntheticControl, completing the ADH-2015 / Abadie-2021 predictor-importance V-selection menu. Each is threaded through the in-space / leave-one-out / in-time placebo refits, so every diagnostic uses the same estimator as the headline fit.
  • v_method="cv" — out-of-sample cross-validation (ADH 2015 §; Abadie 2021 Eq. 9). New v_cv_t0 param splits the pre-period (default len(pre)//2). Each predictor spec is re-aggregated per window (a separate dataprep/standardization per window), so the V-search is genuinely out-of-sample for every predictor type: V minimizes the validation-window outcome MSPE of the training-window fit, then the final weights are re-estimated on the validation-window predictors (step 4). v_weights reproduce donor_weights; predictor_balance is reported on the validation-window basis; mspe_v is the held-out validation MSPE. Deterministic, convergence-aware flat-MSPE tie-break (fn. 7).
  • v_method="inverse_variance" — closed-form v_h = 1/Var(X_h) (Abadie 2021 §3.2(a)) applied to the raw predictors (the standardize pre-scaling is intentionally bypassed — inverse-variance weighting is unit-variance rescaling). Exact for every positive variance; zero-variance rows → 0 weight; all-zero/overflow → uniform + warn.
  • Fail-closed identification gates (cv): every predictor must span both windows; each window must have cross-donor variation (a donor-indistinguishable window leaves X0·W constant in W, so the weight solve is unidentified). Violations raise on the headline fit; in-space placebos drop the affected refit; in-time-truncated dates are marked infeasible. Single-donor fits force w=[1], so V is unidentified: they return uniform v_weights and mspe_v=None (documented).
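The flat-MSPE tie-break named in the cv bullet above can be illustrated with a small comparison rule. This is a minimal sketch, not the PR's code: the `Candidate` record, the `should_replace` helper, the tolerance `tol`, and the definition of "density" as the count of entries above `eps` are all assumptions made for illustration. Only the two stated rules (prefer the densest V on a flat MSPE; never let a non-converged candidate displace a converged incumbent) come from the description.

```python
from dataclasses import dataclass

import numpy as np


@dataclass
class Candidate:
    """Hypothetical record for one V candidate from the search."""
    v: np.ndarray      # diagonal of V (predictor importance weights)
    mspe: float        # validation-window outcome MSPE
    converged: bool    # whether the inner weight solve converged


def density(v: np.ndarray, eps: float = 1e-12) -> int:
    """Number of predictors with non-negligible weight (assumed definition)."""
    return int(np.count_nonzero(v > eps))


def should_replace(challenger: Candidate, incumbent: Candidate,
                   tol: float = 1e-12) -> bool:
    """Return True if the challenger should displace the incumbent."""
    # Rule 1: a non-converged candidate never displaces a converged incumbent.
    if incumbent.converged and not challenger.converged:
        return False
    # A strictly lower MSPE (beyond the tolerance) wins outright.
    if challenger.mspe < incumbent.mspe - tol:
        return True
    # Rule 2: on a flat MSPE, prefer the denser V. The strict inequality
    # keeps the first-seen candidate on exact ties, so the result is
    # deterministic for a fixed candidate order.
    if abs(challenger.mspe - incumbent.mspe) <= tol:
        return density(challenger.v) > density(incumbent.v)
    return False
```

Encoding the rule as a pairwise comparison keeps the selection independent of optimizer noise: two runs that visit the same candidates in the same order end on the same V.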

Methodology references (required if estimator / math changes)

  • Method name(s): SyntheticControl V-selection — out-of-sample cross-validation; inverse-variance V.
  • Paper / source link(s): Abadie, Diamond & Hainmueller (2015, Am. J. Pol. Sci.) §; Abadie (2021, JEL) §3.2(a), Eq. 9. On-file reviews: docs/methodology/papers/abadie-diamond-hainmueller-2015-review.md, abadie-2021-review.md.
  • Intentional deviations (documented as REGISTRY **Note:** / **Deviation from R:**): R Synth has no built-in CV (ADH-2015's CV is a manual two-dataprep re-run) — our per-window re-aggregation reproduces it for absolute-period spec aggregates; deterministic densest-V tie-break for fn.7 non-uniqueness; single-donor uniform-V degeneracy. All in docs/methodology/REGISTRY.md §SyntheticControl.

Validation

  • Tests added/updated: tests/test_methodology_synthetic_control.py — config validation, exact inverse-variance (incl. tiny-positive-variance), the spanning / donor-variation / training-convergence fail-closed gates, single-donor degeneracy, and placebo/LOO/in-time propagation + self-consistency for both methods. R-anchored: cv by deterministic equivalence to the R-anchored custom_v path on the per-window re-aggregated predictors (step-3 criterion + step-4 weights) + cv self-consistency (in-time == fresh backdated fit, 1e-7); inverse-variance bit-for-bit vs custom_v=1/Var(X).
  • Backtest / simulation / notebook evidence: N/A (no tutorial change in this PR).

Security / privacy

  • Confirm no secrets/PII in this PR: Yes

🤖 Generated with Claude Code

…ADH 2015 / Abadie 2021)

Completes the ADH-2015 / Abadie-2021 predictor-importance V-selection menu with two new
`v_method` values, each threaded through the in-space / leave-one-out / in-time placebo
refits so a diagnostic uses the same estimator as the headline fit.

`v_method="cv"` — out-of-sample cross-validation (ADH 2015 §; Abadie 2021 Eq. 9). The
pre-period is split at `v_cv_t0` (new constructor param; default `len(pre)//2`) into a
training and a validation window. Each predictor spec is RE-AGGREGATED over each window
(its op recomputed over only that window's periods — a separate `dataprep` per window,
standardized per window), so the V-search is genuinely out-of-sample for every predictor
type: V is selected to minimize the validation-window outcome MSPE of the training-window
fit, then the final weights are re-estimated on the validation-window predictors (step 4).
The same V* drives both fits with no zeroed coordinate, so `v_weights` reproduce
`donor_weights` and `predictor_balance` is reported on the validation-window basis.
`mspe_v` reports the held-out validation MSPE. Abadie 2021 fn.7 non-uniqueness is handled
by a deterministic, convergence-aware flat-MSPE tie-break (prefer the densest V; never let
a non-converged candidate displace a converged incumbent).
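The train/validate/refit sequence described above can be sketched in a few lines of NumPy. This is a simplified illustration under stated assumptions, not the implementation in this PR: the function names are hypothetical, the V-search is replaced by a scan over a caller-supplied list of candidate diagonals, the inner weight solve is a plain projected-gradient loop, and the per-window predictor matrices are taken as already re-aggregated and standardized inputs. Ties are resolved by first-seen order here rather than by the convergence-aware tie-break.

```python
import numpy as np


def project_simplex(w):
    """Euclidean projection onto {w >= 0, sum(w) == 1}."""
    u = np.sort(w)[::-1]
    css = np.cumsum(u) - 1.0
    idx = np.arange(1, len(w) + 1)
    rho = np.nonzero(u - css / idx > 0)[0][-1]
    return np.maximum(w - css[rho] / (rho + 1), 0.0)


def solve_w(x1, X0, v, iters=2000):
    """Minimize (x1 - X0 w)' diag(v) (x1 - X0 w) over the simplex.

    x1: (k,) treated predictors; X0: (k, J) donor predictors; v: (k,) >= 0.
    """
    A = X0 * np.sqrt(v)[:, None]
    b = x1 * np.sqrt(v)
    # Step 1/L, where L is the Lipschitz constant of the gradient.
    step = 1.0 / max(np.linalg.norm(A, 2) ** 2, 1e-12)
    w = np.full(X0.shape[1], 1.0 / X0.shape[1])
    for _ in range(iters):
        w = project_simplex(w - step * (A.T @ (A @ w - b)))
    return w


def cv_select_v(x1_tr, X0_tr, x1_va, X0_va, y1_va, Y0_va, candidates):
    """Pick V by validation-window outcome MSPE, then refit the weights.

    y1_va: (T_va,) treated outcomes; Y0_va: (T_va, J) donor outcomes.
    """
    best_v, best_mspe = None, np.inf
    for v in candidates:
        w_train = solve_w(x1_tr, X0_tr, v)           # fit on the training window
        mspe = float(np.mean((y1_va - Y0_va @ w_train) ** 2))  # score held-out
        if mspe < best_mspe:
            best_v, best_mspe = v, mspe
    # Final weights: same V*, validation-window predictors (step 4).
    w_final = solve_w(x1_va, X0_va, best_v)
    return best_v, w_final, best_mspe
```

Because the same V* drives both the training fit and the final refit, the returned weights can be reproduced from the returned V, which is the property the description refers to when it says `v_weights` reproduce `donor_weights`.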

`v_method="inverse_variance"` — closed-form `v_h = 1/Var(X_h)` (Abadie 2021 §3.2(a)),
variance over donors+treated on the unstandardized predictors, applied to the RAW
predictors (the `standardize` pre-scaling is intentionally bypassed — inverse-variance
weighting IS the unit-variance rescaling). Exact for every positive variance (no flooring);
zero-variance rows get 0 weight; an all-zero / overflow panel falls back to uniform + warn.
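The closed-form rule and its fallbacks can be written down directly. The sketch below is illustrative and makes several assumptions that the description does not state: the function name `inverse_variance_v` is hypothetical, the variance uses `ddof=1`, and the result is normalized to sum to 1 (the weight solve is invariant to the overall scale of V, so the normalization is cosmetic).

```python
import warnings

import numpy as np


def inverse_variance_v(X: np.ndarray) -> np.ndarray:
    """v_h = 1 / Var(X_h) on the raw predictors.

    X: (k, n) matrix with one row per predictor and one column per unit
    (treated unit and donors together), not standardized.
    """
    var = np.var(X, axis=1, ddof=1)  # assumption: sample variance across units
    v = np.zeros_like(var, dtype=float)
    pos = var > 0
    # No flooring: every positive variance is inverted exactly.
    with np.errstate(over="ignore", divide="ignore"):
        v[pos] = 1.0 / var[pos]
    # Zero-variance rows keep weight 0.
    total = v.sum()
    if not np.all(np.isfinite(v)) or total <= 0:
        # All-zero variances or overflow: fall back to uniform and warn.
        warnings.warn("inverse-variance V is degenerate; using uniform V")
        return np.full(len(var), 1.0 / len(var))
    return v / total
```

For example, predictor rows with variances 1, 0, and 4 give raw weights 1, 0, and 0.25, which normalize to 0.8, 0, and 0.2.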

Fail-closed identification gates (cv): every predictor must SPAN both windows (re-aggregation
needs it measurable on each — default single-period lags are rejected), and each window must
have cross-DONOR variation (donor-indistinguishable windows leave X0·W constant in W → the
weight solve is unidentified). Violations raise on the headline fit; in-space placebos drop
the affected refit; in-time-truncated dates are marked `infeasible`. Single-donor fits force
w=[1] so V is unidentified → uniform `v_weights` + `mspe_v=None` for all methods (documented).
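The two cv gates can be expressed as a pre-flight check that raises before any solve is attempted. The following is a hedged sketch: `check_cv_gates`, its arguments, and the representation of a predictor spec as a list of periods are hypothetical, and the donor-variation test shown here (every predictor row constant across donors) is one plausible reading of "donor-indistinguishable".

```python
import numpy as np


def check_cv_gates(spec_periods, train_periods, valid_periods,
                   X0_train, X0_valid):
    """Raise ValueError if a cv identification gate fails.

    spec_periods: dict mapping predictor name -> periods it aggregates over.
    X0_train, X0_valid: (k, J) donor predictor matrices per window.
    """
    train, valid = set(train_periods), set(valid_periods)

    # Gate 1: every predictor must span both windows, so that it can be
    # re-aggregated on each. A single-period lag fails this by construction.
    for name, periods in spec_periods.items():
        covered = set(periods)
        if not (covered & train) or not (covered & valid):
            raise ValueError(f"predictor {name!r} does not span both windows")

    # Gate 2: each window needs cross-donor variation. If every donor column
    # is identical, X0 @ w is the same for all w on the simplex and the
    # weight solve is unidentified.
    for label, X0 in (("training", X0_train), ("validation", X0_valid)):
        spread = X0.max(axis=1) - X0.min(axis=1)
        if np.all(spread == 0):
            raise ValueError(f"{label} window has no cross-donor variation")
```

Raising on the headline fit and letting callers catch the error in placebo refits matches the fail-closed behavior described above: a refit that cannot be identified is dropped or marked rather than silently returning arbitrary weights.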

Validation: R `Synth` has no built-in CV, so cv is anchored by deterministic equivalence to
the R-anchored `custom_v` path on the per-window re-aggregated predictors (both the step-3
criterion and step-4 weights) + cv self-consistency (in_time_placebo == fresh backdated fit);
inverse-variance is bit-for-bit vs `custom_v=1/Var(X)`. New tests cover config validation,
exact inverse-variance, the spanning / donor-variation / convergence fail-closed gates,
single-donor degeneracy, and placebo/LOO/in-time propagation for both methods.

Docs: REGISTRY §SyntheticControl Notes (per-window re-aggregation, fully-spanning + donor
gates, tie-break, inverse-variance, single-donor), checklist tick, `docs/api`, the LLM
guides, README, CHANGELOG. The remaining ADH-2015 tail (`W^reg` extrapolation, sparse-SC)
and an in-space/LOO machine-readable cv-infeasible reason-code stay tracked in TODO.md.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

github-actions Bot commented Jun 1, 2026

Overall assessment

✅ Looks good — no unmitigated P0 or P1 findings.

Executive summary

Methodology

No findings. The changed estimator behavior is aligned with the on-file methodology reviews and the corresponding registry notes in docs/methodology/REGISTRY.md. The key implementation choices that differ from paper/R defaults (per-window re-aggregation, deterministic CV tie-breaking, raw-scale inverse-variance weighting, and single-donor uniform-V handling) are all documented there.

Code Quality

No findings.

Performance

No findings. The added CV guards do extra predictor-matrix work, but only on the new v_method="cv" path and they prevent arbitrary-weight fits.

Maintainability

No findings. The new helper split around _inverse_variance_v(), _outer_solve_V_cv(), and the CV window checks keeps the new logic localized and reuses the same refit path for placebo/LOO/in-time diagnostics.

Tech Debt

  • P3 Impact: in_space_placebo() and leave_one_out() still report structurally infeasible CV refits with a generic status="failed" instead of a separate machine-readable infeasible status in diff_diff/synthetic_control_results.py. Concrete fix: implement the already-tracked follow-up in TODO.md by threading a reason code out of _outer_solve_V_cv() / _placebo_fit_unit() and surfacing a separate status="infeasible" plus counts. This is tracked, so it does not block approval.
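One way the suggested follow-up could look is sketched below. Everything in it is hypothetical: `RefitStatus`, `CVInfeasibleError`, `run_refit`, the "ok" status value, and the reason strings are illustrative names, not part of the library. The sketch only shows the general idea of carrying a reason code out of the refit and counting statuses separately.

```python
from collections import Counter
from enum import Enum


class RefitStatus(str, Enum):
    """Hypothetical machine-readable refit outcomes."""
    OK = "ok"
    FAILED = "failed"          # numerical or unexpected failure
    INFEASIBLE = "infeasible"  # structurally unidentified cv refit


class CVInfeasibleError(ValueError):
    """Hypothetical error carrying a reason code out of the cv solve."""

    def __init__(self, reason: str):
        super().__init__(reason)
        self.reason = reason


def run_refit(fit_fn, unit):
    """Run one placebo refit and classify the outcome."""
    try:
        fit_fn(unit)
    except CVInfeasibleError as err:
        return RefitStatus.INFEASIBLE, err.reason
    except Exception:
        return RefitStatus.FAILED, None
    return RefitStatus.OK, None


def summarize(fit_fn, units):
    """Count refit outcomes by status value."""
    return Counter(run_refit(fit_fn, u)[0].value for u in units)
```

Catching the structured error before the generic one is what keeps an unidentified refit from being folded into the generic failure count.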

Security

No findings.

Documentation/Tests

No findings in the changed docs/tests. The added coverage in tests/test_methodology_synthetic_control.py is strong on the new branches: non-spanning predictors, donor-indistinguishable CV windows, single-donor degeneracy, fail-closed training solves, and propagation through placebo/LOO/in-time.

Residual risk: I could not execute the tests in this environment because pytest and numpy are not installed here.

@igerber igerber added the ready-for-ci Triggers CI test workflows label Jun 1, 2026
@igerber igerber merged commit 1f1efcc into main Jun 1, 2026
33 of 34 checks passed
@igerber igerber deleted the feature/sc-v-selection branch June 1, 2026 20:49
@igerber igerber mentioned this pull request Jun 2, 2026