1 change: 1 addition & 0 deletions CHANGELOG.md
@@ -10,6 +10,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
### Added
- **`SyntheticControl` cross-validation + inverse-variance `V`-selection (ADH 2015 §; Abadie 2021 §3.2(a), Eq. 9).** Two new `v_method` values complete the ADH-2015/Abadie-2021 `V`-selection menu (joining `"nested"` / `"custom"`), each threaded through the in-space / leave-one-out / in-time placebo refits so a diagnostic uses the **same** estimator as the headline fit. **`v_method="cv"`** selects the diagonal predictor-importance `V` by out-of-sample cross-validation: the pre-period is split positionally at `v_cv_t0` (new constructor param; default `len(pre)//2`, Abadie 2021's `t0 = T0/2`) into a training and a validation window, `V` is chosen to minimize the validation-window outcome MSPE of the training-fit weights (`mspe_v` now reports this validation MSPE under cv), and the final reported weights are re-estimated on the validation-window predictors (ADH 2015 step 4). Each predictor spec is **re-aggregated** over each window (its mean/sum/identity recomputed over only the periods that fall in that window — a separate `dataprep` per window, exactly as ADH 2015's CV does, since R `Synth` has no built-in CV function), so the V-search is genuinely out-of-sample for every predictor type and the same `V*` drives both fits with no zeroed coordinate (`v_weights` reproduce `donor_weights` on the validation-window predictors, and `predictor_balance` is reported on that validation-window basis). **Fully-spanning precondition (fail-closed):** re-aggregating a predictor on each window requires it to be observed in **both** windows, so `cv` **requires every predictor to span both the training and validation windows** and raises `ValueError` otherwise — satisfied by ADH 2015's shared covariate / multi-period `special_predictors` (which span the windows) but NOT by the default per-period outcome lags (each is single-period and lives in one window only), so `cv` with the bare default predictors is rejected with guidance to pass spanning predictors. 
In-time-placebo truncation that breaks the fully-spanning precondition (a kept spec stops spanning both windows at the truncated split) marks that date `infeasible`. A second fail-closed gate covers windows that span but carry **no cross-donor variation** (every re-aggregated predictor constant across the donors, so `X0·W` is constant in `W` → a flat, unidentified weight solve that would otherwise return arbitrary "converged" weights — even when the treated unit differs, since donor distinguishability, not treated-vs-donor variation, identifies `W`): the headline fit raises `ValueError`, in-space placebo refits whose donor pool is indistinguishable in a window are dropped from the reference set, and such in-time-truncated dates are marked `infeasible`. Abadie 2021 footnote 7's CV non-uniqueness is handled by a **deterministic tie-break** (prefer the `V` closest to uniform among ties), making the selected `V*` among equally-good optima independent of the multistart evaluation order. The cv fit is reproducible for a fixed `seed` (like `nested`) but is not seed-independent — the multistart fills any slots beyond the distinct heuristic starts with seed-dependent random Dirichlet draws, so the tie-break removes start-order dependence among ties, not seed dependence. The tie-break is convergence-aware (a non-converged optimizer candidate cannot displace a converged incumbent on an objective tie). If the training-window solve that defines `mspe_v` truncates (e.g. `inner_max_iter` too small), the fit fails closed — `mspe_v=NaN` and the fit is marked non-converged — rather than reporting an invalid Eq. 9 criterion. 
**`v_method="inverse_variance"`** uses the closed form `v_h = 1/Var(X_h)` (variance over donors+treated on the unstandardized predictors), applied to the **raw** predictors so the effective objective is the unit-variance-rescaled `Σ_h diff_h²/Var_h` (Abadie 2021 §3.2(a)); the `standardize` pre-scaling is intentionally bypassed on this branch (inverse-variance weighting *is* the unit-variance rescaling — applying it on already-standardized rows would double-rescale to `Σ_h diff_h²/Var_h²`), so it is equivalent to uniform `V` on standardized predictors. No search (`mspe_v=None`); a zero-variance row gets 0 weight and an all-zero-variance panel falls back to uniform `V` with a warning. `custom_v` is rejected (fail-closed) for both methods and `v_cv_t0` is rejected unless `v_method="cv"`. On the degenerate **single-donor** path (`J=1` forces `w=[1]`) `V` is unidentified — every `V` yields the same synthetic — so `v_weights` is **uniform** and `mspe_v=None` for ALL `v_method`s (cv / inverse_variance included; their selected / closed-form `V` would be inert), with a `UserWarning`; the donor weights / gap / ATT are unaffected. An explicitly pinned `v_cv_t0` that no longer fits the truncated pre-fake window is nulled to the `//2` default for the placebo refit (a pinned value that still fits the truncated window is kept). **Validation:** R `Synth` has no built-in CV function (ADH 2015's CV is a manual `dataprep`+`synth` re-run), so cv is anchored by deterministic equivalence to the R-anchored `custom_v` path (the step-3 validation MSPE of the training-window fit and the step-4 validation-window weights each match a `custom_v=V*` fit on the correspondingly re-aggregated predictors) plus cv self-consistency (`in_time_placebo` under cv == a fresh cv fit on the backdated panel to 1e-7); inverse-variance is anchored bit-for-bit to a `custom_v=1/Var(X)` fit. 
Documented in `docs/methodology/REGISTRY.md` §SyntheticControl (new `**Note:**` labels for the per-window re-aggregation convention, the flat-MSPE tie-break, and inverse-variance), `docs/api/synthetic_control.rst`, the LLM guides, and `README.md`. The remaining ADH-2015 items (`W^reg` extrapolation diagnostic, sparse-SC subset search) stay tracked in `TODO.md`.
- **Firpo & Possebom (2018) SCM inference paper review on file (PR-A).** Added `docs/methodology/papers/firpo-possebom-2018-review.md`, a faithful, paper-sourced fidelity review of Firpo & Possebom (2018, *Journal of Causal Inference* 6(2), DOI 10.1515/jci-2016-0026) — the Step-1 artifact for the forthcoming SCM **confidence-set / CI-by-test-inversion** track (PR-B) layered on the existing `SyntheticControl` estimator (classic SCM has no analytical SE; `se`/`p_value`/`conf_int` are NaN). Transcribes (paper-sourced only, no code-deviation verdicts) the benchmark RMSPE-ratio permutation test (Eqs. 4–6), the sensitivity-analysis parametric p-value weights with worst/best-case `φ̲`/`φ̄` (Eqs. 7–9), the sharp-null `RMSPE^f` test (Eqs. 10–13), the **confidence sets by test inversion** (Eq. 14) with the operational constant-effect CI (Eqs. 15–16) and linear-effect CS (Eqs. 17–18), the general test-statistic framework + Monte Carlo size/power of five statistics (Eq. 19, Section 5), and the multiple-outcome FWER (Eqs. 23–24) and multiple-treated-unit pooled (Eqs. 25–26) extensions; the requirements checklist flags the PR-B target (sharp-null test + constant/linear CI + benchmark + one-sided) versus the deferred sensitivity-analysis and multi-outcome/treated extensions. Docs-only; no code change. Registered in `docs/references.rst` (Synthetic Control Method section) and `docs/doc-deps.yaml`; REGISTRY `## SyntheticControl` gains a `firpo-possebom-2018-review.md` reviews-on-file pointer.
- **`SyntheticControl` confidence sets by test inversion (Firpo & Possebom 2018 §4, PR-B).** Classic SCM gains the uncertainty quantification it has lacked — a confidence set for the treatment-effect *path* — without changing its always-NaN analytical inference contract. Two opt-in `SyntheticControlResults` methods built ON TOP of the in-space placebo: `test_sharp_null(effect, gamma=0.1)` tests a sharp null `H_0: α_1t = f(t)` (Eq 11; `effect` a scalar constant effect or a length-`n_post` post-period path) by subtracting `f(t)` from every unit's post-period gaps and re-ranking the modified RMSPE ratio `RMSPE^f` (Eqs 12–13 at `φ=0`, `v=(1,…,1)`), and `confidence_set(family="constant"|"linear", gamma=0.1, bounds=None, n_grid=200)` inverts that test into a confidence set — a constant-in-time interval (Eqs 15–16) or a linear-in-time slope set (Eqs 17–18) — keeping every value whose sharp null is not rejected at the paper's **strict** `p^f > γ` boundary (Eq 14). The whole computation is a **pure re-ranking of the gap paths `in_space_placebo()` already computes** (no synthetic-control refits): under a common-effect null the donor synthetics and the pre-period MSPE denominators are unchanged — only the post gaps shift by `f(t)` — so each grid value costs an `O(J)` rank, not a refit. With `bounds=None` the set is recovered **EXACTLY** by piecewise-constant breakpoint inversion: `p^c` is constant between the real roots of the placebo-vs-treated comparison quadratics, so `p` is evaluated once per induced interval AND at each breakpoint (a tie under `≥` can lift `p` above γ there, yielding an isolated accepted point) — NO centering/monotonicity assumption, so accepted tails, disjoint components, and unbounded/empty sets are all handled (a poor-pre-fit treated unit can have its accepted region in the tails). `bounds=(lo,hi)` instead scans a fixed grid (grid-limited); `n_grid` controls only the returned inspection table when `bounds=None`. 
Results: a pickle-surviving `effect_confidence_set` summary (`{family, parameter, gamma, lower, upper, contiguous, status, …}`, `status ∈ {"ran","empty","unbounded"}`) + a `get_confidence_set_df()` grid table, surfaced under `estimator_native_diagnostics.confidence_set`. **The analytical `conf_int`/`se`/`t_stat`/`p_value` stay NaN** — this is a permutation set at level `1−γ` (γ granular in `1/(J+1)`), possibly a set / unbounded / non-contiguous, so it cannot be coerced into the Wald-interval `conf_int` tuple; it is kept separate exactly as `placebo_p_value` is kept off `p_value`. **Fail-closed:** `γ < 1/(J+1)` (no value rejectable — fn 8) or a treated unit lacking the best pre-fit → `"unbounded"` (`±inf` + warning); no interval or breakpoint accepted → `"empty"` (NaN endpoints); a non-contiguous accepted region (disjoint components / an isolated singleton) → the `[lower, upper]` hull with `contiguous=False` + warning; `< 2` donors / a non-converged treated fit / an unpickled result (no placebo reference set) → `ValueError`. `test_sharp_null(0)` is held bit-for-bit equal to `placebo_p_value` (Eq 5 = Eq 13) by reusing each unit's **per-unit** floored pre-period denominator persisted from the placebo run. **Scope:** the sensitivity-analysis weights (`φ≠0`, Eqs 7–9), the general test-statistic menu (Eq 19), one-sided (§7's signed-`t` statistic), and the multiple-outcome/treated extensions (§6) are deferred (flagged in the paper review checklist). **Validation:** no R anchor (R `Synth` has no test inversion; the authors' Code Ocean capsule was not consulted) — self-consistency to the (Basque-R-anchored) `placebo_p_value`, a numpy oracle on Eqs 12–14 (incl. the strict `p=γ` boundary and the per-unit floor), invariants (the point estimate lies in the constant set for a well-posed fit; a center-rejected/tails-accepted regression; an isolated-breakpoint singleton; monotone-in-γ), and a coverage simulation. 
Consumes the PR-A `firpo-possebom-2018-review.md`; documented in `docs/methodology/REGISTRY.md` §SyntheticControl (new methodology block + `**Note:**` labels for the boundary convention, the grid choice, the non-analytical `conf_int` contract, and the no-R-anchor validation), `docs/api/synthetic_control.rst`, and the LLM guides.
- **`HeterogeneousAdoptionDiD.fit()` fit-time extensive-margin warning + `covariates=` not-implemented pointer.** Two UX additions to the HAD `fit()` surface, with **no change to any estimate or standard error**. (1) The **overall** path now emits a `UserWarning` when a non-trivial fraction (`>= 10%`, a library-convention cutoff in `_HAD_EXTENSIVE_MARGIN_ZERO_DOSE_FRAC`) of units have an exactly-zero post-period dose — a genuine untreated mass for which a standard DiD using those units as controls may be more appropriate (de Chaisemartin et al. 2026, Section 2 / Assumption 3). The paper retains *small* untreated shares (e.g. 12/2954 in Garrett et al., with close-to-nominal coverage), so the 10% cutoff sits ~25× above that; the warning is **overall-path-only** because the event-study path *requires* never-treated units per Appendix B.2. Previously the recommendation surfaced only via `qug_test()`'s zero-dose warning when the user ran the pre-tests. (2) `HeterogeneousAdoptionDiD.fit(covariates=...)` now raises `NotImplementedError` with a pointer to the deferred Appendix B.1 / Theorem 6 covariate-adjusted extension (via an explicit keyword-only `covariates=` param) instead of a bare `TypeError` from an unknown kwarg; pre-residualize the outcome on the covariates as a workaround. Documented in `docs/methodology/REGISTRY.md` §HeterogeneousAdoptionDiD; new tests in `tests/test_had.py` and `tests/test_methodology_had.py`.
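
The re-ranking described in the `SyntheticControl` confidence-set entry above can be illustrated with a small NumPy sketch. This is not the library's implementation: `sharp_null_p_value` and `constant_effect_grid_set` are hypothetical helper names, the inputs are assumed to be plain gap arrays (actual minus synthetic) with one row per unit, and the per-unit denominator floor and the exact breakpoint inversion mentioned in the entry are omitted. It only shows why each candidate effect costs a rank rather than a refit, and how the strict `p > gamma` rule decides membership on a fixed grid.

```python
import numpy as np


def sharp_null_p_value(pre_gaps, post_gaps, effect, treated_index=0):
    """Permutation p-value for H0: alpha_1t = f(t), by re-ranking MSPE ratios.

    pre_gaps:  (n_units, n_pre) pre-period gaps, one row per unit.
    post_gaps: (n_units, n_post) post-period gaps, one row per unit.
    effect:    scalar constant effect or a length-n_post path f(t).
    All names here are illustrative assumptions, not the library API.
    """
    pre_gaps = np.asarray(pre_gaps, dtype=float)
    post_gaps = np.asarray(post_gaps, dtype=float)
    f = np.broadcast_to(np.asarray(effect, dtype=float), (post_gaps.shape[1],))

    # Under the null, only the post-period gaps shift by f(t);
    # the pre-period denominators stay as they are.
    shifted = post_gaps - f
    ratio = np.mean(shifted**2, axis=1) / np.mean(pre_gaps**2, axis=1)

    # Share of units (treated included) whose ratio is at least the treated one.
    return float(np.mean(ratio >= ratio[treated_index]))


def constant_effect_grid_set(pre_gaps, post_gaps, grid, gamma=0.1):
    """Keep every grid value whose sharp null is NOT rejected (strict p > gamma)."""
    grid = np.asarray(grid, dtype=float)
    keep = [sharp_null_p_value(pre_gaps, post_gaps, c) > gamma for c in grid]
    return grid[np.array(keep, dtype=bool)]


# Toy panel: row 0 is the treated unit (best pre-fit), rows 1-4 are donors.
pre = np.array([[0.5, -0.5], [1.0, -1.0], [1.0, -1.0], [1.0, -1.0], [1.0, -1.0]])
post = np.array([[5.0, 5.0], [1.0, 1.0], [-1.0, -1.0], [2.0, 2.0], [0.5, 0.5]])

p_zero = sharp_null_p_value(pre, post, 0.0)   # treated ratio is the largest
p_five = sharp_null_p_value(pre, post, 5.0)   # treated ratio drops to zero
accepted = constant_effect_grid_set(pre, post, [0.0, 5.0, 20.0], gamma=0.2)
```

With five units the p-value moves in steps of 1/5, so at `gamma=0.2` a value with `p = 0.2` is rejected under the strict boundary; in this toy panel only the candidate `5.0` survives.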

### Fixed
43 changes: 43 additions & 0 deletions diff_diff/diagnostic_report.py
@@ -2458,6 +2458,49 @@ def _scm_native(self, r: Any) -> Dict[str, Any]:
                    "placebo (opt-in; refits per backdated date)."
                ),
            }

        # Test-inversion confidence set (Firpo & Possebom 2018 §4): opt-in, surfaced once
        # the user has run results.confidence_set() (it reuses the in-space placebo
        # reference set — no refits). The analytical conf_int stays NaN; this is a SEPARATE
        # permutation set at level 1 - gamma, possibly unbounded or non-contiguous.
        ecs = getattr(r, "effect_confidence_set", None)
        if ecs is not None:
            ecs_status = ecs.get("status")
            _lo, _hi = ecs.get("lower"), ecs.get("upper")
            block = {
                "status": ecs_status,
                "family": ecs.get("family"),
                "parameter": ecs.get("parameter"),
                "gamma": _to_python_float(ecs.get("gamma")),
                # Emit each endpoint independently: a finite float, else None for a non-finite
                # side (NaN for an empty set, +/-inf for an unbounded tail) -- keeps the dict
                # JSON-safe while preserving the FINITE side of a one-sided unbounded set.
                "lower": float(_lo) if isinstance(_lo, (int, float)) and np.isfinite(_lo) else None,
                "upper": float(_hi) if isinstance(_hi, (int, float)) and np.isfinite(_hi) else None,
                "contiguous": bool(ecs.get("contiguous")),
                "n_placebos": _to_python_scalar(ecs.get("n_placebos")),
            }
            if ecs_status == "unbounded":
                block["reason"] = (
                    "confidence_set() ran but the set is unbounded (gamma below the "
                    "1/(J+1) permutation granularity, or the treated unit lacks the best "
                    "pre-treatment fit); endpoint(s) are +/-inf."
                )
            elif ecs_status == "empty":
                block["reason"] = (
                    "confidence_set() ran but the set is empty (every effect in the "
                    "family is rejected at gamma); endpoints are NaN."
                )
            out["confidence_set"] = block
        else:
            out["confidence_set"] = {
                "status": "not_run",
                "reason": (
                    "Call results.confidence_set() for a test-inversion confidence set of "
                    "the effect path (Firpo-Possebom 2018; opt-in, reuses the in-space "
                    "placebo reference set)."
                ),
            }
        return out

# -- Heterogeneity helpers --------------------------------------------
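
The endpoint handling in the `_scm_native` hunk above can be tried in isolation. The following is a minimal sketch under stated assumptions, not the module's code: `json_safe_endpoint` is a hypothetical helper name, and it uses the standard library's `math.isfinite` where the diff uses `np.isfinite`. It shows why each endpoint is converted independently: a one-sided unbounded set keeps its finite side, and the resulting dict serializes even under strict JSON rules.

```python
import json
import math


def json_safe_endpoint(value):
    """Return a finite float, or None for NaN, +/-inf, or a non-numeric value.

    Hypothetical helper mirroring the per-endpoint expression in the diff.
    """
    if isinstance(value, (int, float)) and math.isfinite(value):
        return float(value)
    return None


# A one-sided unbounded set: finite lower endpoint, infinite upper endpoint.
block = {
    "lower": json_safe_endpoint(2.0),
    "upper": json_safe_endpoint(float("inf")),
}

# allow_nan=False makes json.dumps raise on NaN/inf, so success here
# confirms that no non-finite value is left in the dict.
encoded = json.dumps(block, allow_nan=False)
```

An empty set (NaN endpoints) would map both sides to `None`, which is why the `status` field, not the endpoints, is what distinguishes `"empty"` from `"unbounded"` for a downstream reader.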