diff --git a/CHANGELOG.md b/CHANGELOG.md
index 5ee0ce38..15a8ddeb 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -7,6 +7,12 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
## [Unreleased]
+### Changed
+- **HAD survey-design API consolidated to single `survey_design=` kwarg** across all 8 HAD surfaces: `HeterogeneousAdoptionDiD.fit`, `did_had_pretest_workflow`, `qug_test`, `stute_test`, `yatchew_hr_test`, `stute_joint_pretest`, `joint_pretrends_test`, `joint_homogeneity_test`. Matches the rest of the library (`ContinuousDiD`, `EfficientDiD`, `ChaisemartinDHaultfoeuille` already used `survey_design=`). On data-in surfaces (HAD.fit, workflow, joint data-in wrappers) `survey_design=` accepts a `SurveyDesign` instance (column references resolved against `data` at fit time, same convention as the rest of the library). On the three array-in linearity helpers (`stute_test`, `yatchew_hr_test`, `stute_joint_pretest`) `survey_design=` accepts a pre-resolved `ResolvedSurveyDesign`; passing a `SurveyDesign` raises `TypeError` with migration guidance to `make_pweight_design(arr)` (pweight-only) or pre-resolution. `qug_test` is the 8th surface and accepts the same kwarg signature for consistency, but **all** non-`None` values raise `NotImplementedError` per the Phase 4.5 C0 permanent deferral (no migration path; the qug-specific mutex error reflects this). New public helper `make_pweight_design(weights: np.ndarray) -> ResolvedSurveyDesign` exported from the `diff_diff` top level for the pweight-only convenience on the three array-in linearity helpers (formerly the private `survey._make_trivial_resolved`, kept as a permanent private alias); validates 1-D input at the front door. Three-way mutex (`survey_design + survey + weights`) extends the prior 2-way (`survey + weights`) — at most one may be non-None per call. Patch-level addition (additive new kwarg + permanent alias for the helper; no breaking changes this release).
+
+### Deprecated
+- **`HeterogeneousAdoptionDiD.fit(survey=, weights=)`, `did_had_pretest_workflow(survey=, weights=)`, and the 6 HAD pretest helpers' `survey=` / `weights=` kwargs are deprecated** in favor of the canonical `survey_design=`. Emits `DeprecationWarning` with migration guidance; the deprecated kwargs continue to route through the unchanged legacy back-end paths so numerical results are identical to pre-PR (bit-exact regression locked by parity tests in `tests/test_had_dual_knob_deprecation.py`). Both `survey=` and `weights=` will be removed in the next minor release. **Carve-out for `qug_test`**: the deprecation is kwarg-name-consolidation only; `qug_test` permanently rejects all non-`None` `survey_design` / `survey` / `weights` values (Phase 4.5 C0 deferral) and `make_pweight_design(arr)` is NOT a valid migration target — the deprecation warning text on `qug_test` is qug-specific and points users to `did_had_pretest_workflow(..., survey_design=...)` for survey-aware HAD pretesting (which skips the QUG step under survey).
+
### Added
- **HAD linearity-family pretests under survey (Phase 4.5 C).** `stute_test`, `yatchew_hr_test`, `stute_joint_pretest`, `joint_pretrends_test`, `joint_homogeneity_test`, and `did_had_pretest_workflow` now accept `weights=` / `survey=` keyword-only kwargs. Stute family uses **PSU-level Mammen multiplier bootstrap** via `bootstrap_utils.generate_survey_multiplier_weights_batch` (the same kernel as PR #363's HAD event-study sup-t bootstrap): each replicate draws an `(n_bootstrap, n_psu)` Mammen multiplier matrix, broadcast to per-obs perturbation `eta_obs[g] = eta_psu[psu(g)]`, weighted OLS refit, weighted CvM via new `_cvm_statistic_weighted` helper. Joint Stute SHARES the multiplier matrix across horizons within each replicate, preserving both the vector-valued empirical-process unit-level dependence AND PSU clustering. Yatchew uses **closed-form weighted OLS + pweight-sandwich variance components** (no bootstrap): `sigma2_lin = sum(w·eps²)/sum(w)`, `sigma2_diff = sum(w_avg·diff²)/(2·sum(w))` with arithmetic-mean pair weights `w_avg_g = (w_g+w_{g-1})/2`, `sigma4_W = sum(w_avg·prod)/sum(w_avg)`, `T_hr = sqrt(sum(w))·(sigma2_lin-sigma2_diff)/sigma2_W`. All three Yatchew components reduce bit-exactly to the unweighted formulas at `w=ones(G)` (locked at `atol=1e-14` by direct helper test). The pweight `weights=` shortcut routes through a synthetic trivial `ResolvedSurveyDesign` (new `survey._make_trivial_resolved` helper) so the same kernel handles both entry paths. `did_had_pretest_workflow(..., survey=, weights=)` removes the Phase 4.5 C0 `NotImplementedError`, dispatches to the survey-aware sub-tests, **skips the QUG step with `UserWarning`** (per C0 deferral), sets `qug=None` on the report, and appends a `"linearity-conditional verdict; QUG-under-survey deferred per Phase 4.5 C0"` suffix to the verdict. `HADPretestReport.qug` retyped from `QUGTestResults` to `Optional[QUGTestResults]`; `summary()` / `to_dict()` / `to_dataframe()` updated to None-tolerant rendering. 
Replicate-weight survey designs (BRR/Fay/JK1/JKn/SDR) raise `NotImplementedError` at every entry point (defense in depth, reciprocal-guard discipline) — parallel follow-up after this PR. **Stratified designs (`SurveyDesign(strata=...)`) also raise `NotImplementedError` on the Stute family** — the within-stratum demean + `sqrt(n_h/(n_h-1))` correction that the HAD sup-t bootstrap applies to match the Binder-TSL stratified target has not been derived for the Stute CvM functional, so applying raw multipliers from `generate_survey_multiplier_weights_batch` directly to residual perturbations would leave the bootstrap p-value silently miscalibrated. Phase 4.5 C narrows survey support to **pweight-only**, **PSU-only** (`SurveyDesign(weights=, psu=)`), and **FPC-only** (`SurveyDesign(weights=, fpc=)`) designs; stratified is a follow-up after the matching Stute-CvM stratified-correction derivation lands. Strictly positive weights required on Yatchew (the adjacent-difference variance is undefined under contiguous-zero blocks). Per-row `weights=` / `survey=col` aggregated to per-unit via existing HAD helpers `_aggregate_unit_weights` / `_aggregate_unit_resolved_survey` (constant-within-unit invariant enforced). Unweighted code paths preserved bit-exactly. Patch-level addition (additive on stable surfaces). See `docs/methodology/REGISTRY.md` § "QUG Null Test" — Note (Phase 4.5 C) for the full methodology.
- **`ChaisemartinDHaultfoeuille.by_path` + `placebo=True`** — per-path backward-horizon placebos `DID^{pl}_{path, l}` for `l = 1..L_max`. The same per-path SE convention used for the event-study (joiners/leavers IF precedent: switcher-side contributions zeroed for non-path groups; cohort structure and control pool unchanged; plug-in SE with path-specific divisor `N^{pl}_{l, path}`) is applied to backward horizons via the new `switcher_subset_mask` parameter on `_compute_per_group_if_placebo_horizon`. Surfaced on `results.path_placebo_event_study[path][-l]` (negative-int inner keys mirroring `placebo_event_study`); `summary()` renders the rows alongside per-path event-study horizons; `to_dataframe(level="by_path")` emits negative-horizon rows alongside the existing positive-horizon rows. **Bootstrap** (when `n_bootstrap > 0`) propagates per-`(path, lag)` percentile CI / p-value through the same `_bootstrap_one_target` dispatch as the per-path event-study, with the canonical NaN-on-invalid contract enforced on the new surface (PR #364 library-wide invariant). **SE inherits the cross-path cohort-sharing deviation from R** documented for `path_effects` (full-panel cohort-centered plug-in vs R's per-path re-run): tracks R within tolerance on single-path-cohort panels, diverges materially on cohort-mixed panels — the bootstrap SE is a Monte Carlo analog of the analytical SE and inherits the same deviation. R-parity confirmed at `tests/test_chaisemartin_dhaultfoeuille_parity.py::TestDCDHDynRParityByPathPlacebo` on the new `multi_path_reversible_by_path_placebo` scenario (point estimates exact match; SE within Phase-2 envelope rtol ≤ 5%); positive analytical + bootstrap invariants at `tests/test_chaisemartin_dhaultfoeuille.py::TestByPathPlacebo` (and the gated `::TestBootstrap` subclass). See `docs/methodology/REGISTRY.md` §ChaisemartinDHaultfoeuille `Note (Phase 3 by_path ...)` → "Per-path placebos" for the full contract.
diff --git a/TODO.md b/TODO.md
index 0c420b92..8aeff2a2 100644
--- a/TODO.md
+++ b/TODO.md
@@ -99,6 +99,7 @@ Deferred items from PR reviews that were not addressed before merge.
| `HeterogeneousAdoptionDiD` Phase 4.5: weight-aware auto-bandwidth MSE-DPI selector. Phase 4.5 A ships weighted `lprobust` with an unweighted DPI selector; users who want a weight-aware bandwidth must pass `h`/`b` explicitly. Extending `lpbwselect_mse_dpi` to propagate weights through density, second-derivative, and variance stages is ~300 LoC of methodology and was out of scope. | `diff_diff/_nprobust_port.py::lpbwselect_mse_dpi` | Phase 4.5 | Low |
| `HeterogeneousAdoptionDiD` Phase 4.5 C: replicate-weight SurveyDesigns (BRR / Fay / JK1 / JKn / SDR) on the continuous-dose paths. Phase 4.5 A raises `NotImplementedError` on replicate designs in `_aggregate_unit_resolved_survey`. Rao-Wu-style replicate bootstrap for HAD paths requires deriving the per-replicate weight-ratio rescaling for the local-linear intercept IF. | `diff_diff/had.py::_aggregate_unit_resolved_survey` | Phase 4.5 C | Low |
| `HeterogeneousAdoptionDiD` mass-point: `vcov_type in {"hc2", "hc2_bm"}` raises `NotImplementedError` pending a 2SLS-specific leverage derivation. The OLS leverage `x_i' (X'X)^{-1} x_i` is wrong for 2SLS; the correct finite-sample correction uses `x_i' (Z'X)^{-1} (...) (X'Z)^{-1} x_i`. Needs derivation plus an R / Stata (`ivreg2 small robust`) parity anchor. | `diff_diff/had.py::_fit_mass_point_2sls` | Phase 2a | Medium |
+| `HeterogeneousAdoptionDiD` survey-design API consolidation, **next minor bump**: drop the deprecated `survey=` and `weights=` kwargs on all 8 HAD surfaces (`HeterogeneousAdoptionDiD.fit`, `did_had_pretest_workflow`, `qug_test`, `stute_test`, `yatchew_hr_test`, `stute_joint_pretest`, `joint_pretrends_test`, `joint_homogeneity_test`); only `survey_design=` remains. Also fold the legacy back-end `weights=` paths (e.g. `_aggregate_unit_weights` ad-hoc routing) into the unified `_resolve_survey_for_fit`-driven path. The `_make_trivial_resolved` underscore alias on `survey.py` stays (one-line, harmless). DeprecationWarning ships in this PR; the removal PR is ~50 LoC of cleanup. | `diff_diff/had.py`, `diff_diff/had_pretests.py` | next minor bump | Medium |
| `HeterogeneousAdoptionDiD` continuous paths: thread `cluster=` through `bias_corrected_local_linear` (Phase 1c's wrapper already supports cluster; Phase 2a ignores it with a `UserWarning` on the continuous path to keep scope tight). | `diff_diff/had.py`, `diff_diff/local_linear.py` | Phase 2a | Low |
| `HeterogeneousAdoptionDiD` Eq 18 linear-trend detrending (Pierce-Schott style): the joint-Stute infrastructure shipped in the Phase 3 follow-up supports pre-trends (mean-indep) and post-homogeneity (linearity) nulls. The Pierce-Schott application (paper Section 5.2) uses a LINEAR-TREND detrending of pre-period outcomes before the joint CvM — `Y_{g,t} - Y_{g,t_anchor} - (t - t_anchor)*(Y_{g,t_anchor} - Y_{g,t_anchor-1})` — reaching p=0.51 on US-China tariff data. Extends `joint_pretrends_test` with a detrending mode or a separate Eq 18-specific helper. Deferred to Phase 4 replication harness (where the published p=0.51 serves as the parity anchor). | `diff_diff/had_pretests.py::joint_pretrends_test` | Phase 4 | Medium |
| `HeterogeneousAdoptionDiD` Phase 3 Stute performance: Appendix D vectorized matrix form replaces the per-iteration OLS refit with a single precomputed `M = I - X(X'X)^{-1}X'` applied to `eps * eta`. Functionally identical, ~2x faster. Shipped literal-refit form in Phase 3 to match paper text and keep reviewer surface small. | `diff_diff/had_pretests.py::stute_test` | Phase 3 | Low |
diff --git a/diff_diff/__init__.py b/diff_diff/__init__.py
index ccf425f3..e95ec008 100644
--- a/diff_diff/__init__.py
+++ b/diff_diff/__init__.py
@@ -151,6 +151,7 @@
SurveyDesign,
SurveyMetadata,
compute_deff_diagnostics,
+ make_pweight_design,
)
from diff_diff.staggered import (
CallawaySantAnna,
@@ -445,6 +446,7 @@
"SurveyMetadata",
"DEFFDiagnostics",
"compute_deff_diagnostics",
+ "make_pweight_design",
# Rust backend
"HAS_RUST_BACKEND",
# Linear algebra helpers
diff --git a/diff_diff/had.py b/diff_diff/had.py
index 84ac8963..b7a77f05 100644
--- a/diff_diff/had.py
+++ b/diff_diff/had.py
@@ -76,7 +76,13 @@
BiasCorrectedFit,
bias_corrected_local_linear,
)
-from diff_diff.survey import SurveyMetadata, compute_survey_metadata
+from diff_diff.survey import (
+ HAD_DEPRECATION_MSG_SURVEY_KWARG,
+ HAD_DEPRECATION_MSG_WEIGHTS_KWARG_HAD_FIT,
+ HAD_DUAL_KNOB_MUTEX_MSG_DATA_IN,
+ SurveyMetadata,
+ compute_survey_metadata,
+)
from diff_diff.utils import safe_inference
__all__ = [
@@ -695,10 +701,13 @@ class HeterogeneousAdoptionDiDEventStudyResults:
# fits stay unchanged; all None on unweighted fits).
variance_formula: Optional[str] = None
"""Per-horizon variance family label (applied uniformly across all
- horizons in the fit). One of ``"pweight"`` / ``"pweight_2sls"``
- (weights= shortcut; continuous / mass-point), ``"survey_binder_tsl"``
- / ``"survey_binder_tsl_2sls"`` (survey= path), or ``None`` on
- unweighted fits. Mirrors the static-path ``variance_formula`` field."""
+ horizons in the fit). One of ``"pweight"`` / ``"pweight_2sls"`` (when
+ a per-row weight array was supplied, including via the deprecated
+ ``weights=`` alias; continuous / mass-point), ``"survey_binder_tsl"``
+ / ``"survey_binder_tsl_2sls"`` (when a SurveyDesign was supplied via
+ ``survey_design=`` or the deprecated ``survey=`` alias), or ``None``
+ on unweighted fits. Mirrors the static-path ``variance_formula``
+ field."""
effective_dose_mean: Optional[float] = None
"""Weighted denominator used by the β̂-scale rescaling. For continuous
designs: weighted ``sum(w · d)/sum(w)`` (continuous_at_zero) or
@@ -2783,9 +2792,15 @@ def fit(
unit_col: str,
first_treat_col: Optional[str] = None,
aggregate: str = "overall",
+ # PR #376 R4 P1: preserve pre-PR positional-or-keyword status of
+ # `survey`, `weights`, `cband` for back-compat with positional
+ # callers. `survey_design=` is the only new addition and is
+ # keyword-only.
survey: Any = None,
weights: Optional[np.ndarray] = None,
cband: bool = True,
+ *,
+ survey_design: Any = None,
) -> HeterogeneousAdoptionDiDResults:
"""Fit the HAD estimator.
@@ -2835,66 +2850,123 @@ def fit(
CIs per horizon; joint cross-horizon covariance is deferred
to a follow-up PR. Staggered-timing panels are auto-filtered
to the last-treatment cohort with a ``UserWarning``.
- survey : SurveyDesign or None
+ survey_design : SurveyDesign or None, keyword-only
Survey design (sampling weights + optional strata / PSU / FPC)
- for design-based inference on the two continuous-dose paths
- (``continuous_at_zero``, ``continuous_near_d_lower``). Passes
- through :func:`compute_survey_if_variance` (Binder 1983 TSL)
- for the SE; weights propagate pointwise into the lprobust
- kernel composition. Only ``weight_type="pweight"`` is
- supported in Phase 4.5 A — ``aweight`` / ``fweight`` raise
- ``NotImplementedError``. Survey design columns (strata / PSU /
- FPC) must be constant within unit (sampling-unit-level
- assignment); within-unit variance raises ``ValueError``.
- Replicate-weight designs raise ``NotImplementedError``
- (Phase 4.5 C). Phase 4.5 B support matrix: survey / weights
- are now accepted on ALL design × aggregate combinations
- (continuous × {overall, event-study}, mass-point × {overall,
- event-study}); HAD pretests (``qug_test``, ``stute_test``,
- ``yatchew_hr_test``, joint variants,
- ``did_had_pretest_workflow``) still don't accept
- survey/weights — deferred to Phase 4.5 C / C0.
+ for design-based inference. Supported on ALL design × aggregate
+ combinations after Phase 4.5 B: continuous paths
+ (``continuous_at_zero``, ``continuous_near_d_lower``) on both
+ ``aggregate="overall"`` and ``aggregate="event_study"``, AND
+ the ``mass_point`` design on both aggregates. Continuous paths
+ compose the SE via :func:`compute_survey_if_variance` (Binder
+ 1983 TSL); weights propagate pointwise into the lprobust
+ kernel. Mass-point composes the per-unit 2SLS IF on the
+ HC1-scale and Binder-TSL-aggregates that — requires
+ ``vcov_type='hc1'`` (the classical default raises
+ ``NotImplementedError`` on the survey path). Event-study fits
+ with ``cband=True`` add a multiplier-bootstrap simultaneous
+ confidence band. Only ``weight_type="pweight"`` is supported
+ (``aweight`` / ``fweight`` raise ``NotImplementedError``).
+ Survey design columns (strata / PSU / FPC) must be constant
+ within unit (sampling-unit-level assignment); within-unit
+ variance raises ``ValueError``. Replicate-weight designs raise
+ ``NotImplementedError``. Mutually exclusive with the deprecated
+ ``survey=`` and ``weights=`` aliases. See
+ ``docs/methodology/REGISTRY.md`` § HeterogeneousAdoptionDiD —
+ "Note (HAD survey-design API consolidation)" for the full
+ dispatch matrix.
+ survey : SurveyDesign or None
+ DEPRECATED alias of ``survey_design=``. Remains positional-or-
+ keyword for one minor cycle to preserve pre-PR call shapes;
+ will be removed in the next minor release. Prefer
+ ``survey_design=``.
weights : np.ndarray or None
- Per-row sampling weights as a lightweight shortcut equivalent
- to ``survey=SurveyDesign(weights=)``. Produces the same
- ATT; the SE uses the analytical weighted HC1 sandwich
- (continuous: CCT-2014 weighted-robust; mass-point: pweight
- 2SLS sandwich) rather than Binder-TSL. Must be constant
- within each unit; row-order aligned with ``data`` (index
- labels are resolved to positional offsets via
- ``data.index.get_indexer``, so custom non-RangeIndex inputs
- work as long as ``data.index`` is unique). Mutually
- exclusive with ``survey=`` — passing both raises
- ``ValueError``.
+ DEPRECATED alias for the per-row pweight shortcut. Remains
+ positional-or-keyword for one minor cycle. Prefer adding the
+ weights as a column on ``data`` and passing
+ ``survey_design=SurveyDesign(weights='col_name')`` instead.
+ Will be removed in the next minor release. Currently
+ preserved as the analytical-HC1-sandwich shortcut (continuous:
+ CCT-2014 weighted-robust; mass-point: pweight 2SLS sandwich)
+ with the per-row → per-unit aggregation invariant intact.
+ Mutually exclusive with ``survey_design=`` and ``survey=``.
cband : bool, default True
Phase 4.5 B: controls the multiplier-bootstrap simultaneous
confidence band on the weighted event-study path. When
- ``True`` (default) and ``aggregate="event_study"`` AND
- ``weights=`` or ``survey=`` is supplied, the fit populates
- ``cband_low`` / ``cband_high`` / ``cband_crit_value`` /
- ``cband_method`` / ``cband_n_bootstrap`` on the result. When
- ``False`` those fields stay ``None``. No effect on
- ``aggregate="overall"`` or on unweighted event-study.
- ``n_bootstrap`` and ``seed`` (constructor params) control
- replicate count and RNG; defaults are 999 / ``None``.
+ ``True`` (default) and ``aggregate="event_study"`` AND any of
+ ``survey_design=`` / ``survey=`` / ``weights=`` is supplied,
+ the fit populates ``cband_low`` / ``cband_high`` /
+ ``cband_crit_value`` / ``cband_method`` / ``cband_n_bootstrap``
+ on the result. When ``False`` those fields stay ``None``. No
+ effect on ``aggregate="overall"`` or on unweighted event-
+ study. ``n_bootstrap`` and ``seed`` (constructor params)
+ control replicate count and RNG; defaults are 999 / ``None``.
Returns
-------
HeterogeneousAdoptionDiDResults
"""
- # ---- aggregate / survey / weights validation ----
+ # ---- aggregate / survey_design / survey / weights validation ----
if aggregate not in _VALID_AGGREGATES:
raise ValueError(
f"Invalid aggregate={aggregate!r}. Must be one of " f"{_VALID_AGGREGATES}."
)
- if survey is not None and weights is not None:
- raise ValueError(
- "Pass survey= OR weights=, not both. "
- "For SurveyDesign-composed inference (PSU, strata, FPC, "
- "replicate weights), use survey=. For a simple pweight-only "
- "shortcut, use weights=; it is internally equivalent to "
- "survey=SurveyDesign(weights=w)."
+ # Three-way mutex on survey_design / survey / weights (data-in pattern).
+ n_set = sum(x is not None for x in (survey_design, survey, weights))
+ if n_set > 1:
+ raise ValueError(HAD_DUAL_KNOB_MUTEX_MSG_DATA_IN)
+
+ # Soft deprecation: route legacy survey=/weights= aliases to
+ # survey_design=. The internal back-end paths (legacy weights= and
+ # survey= routing below) are unchanged; only the entry signature
+ # wraps them. The bit-exact back-compat invariant is preserved
+ # because we only rebind names, not values: `survey` and
+ # `survey_design` end up pointing at the same object, and a raw
+ # `weights` array is passed through to the back end untouched.
+ if survey is not None:
+ warnings.warn(HAD_DEPRECATION_MSG_SURVEY_KWARG, DeprecationWarning, stacklevel=2)
+ survey_design = survey
+ elif weights is not None:
+ warnings.warn(
+ HAD_DEPRECATION_MSG_WEIGHTS_KWARG_HAD_FIT,
+ DeprecationWarning,
+ stacklevel=2,
)
+ # weights= shortcut preserved as-is on the back end (the
+ # downstream `if weights is not None:` branch consumes the
+ # raw array directly via _aggregate_unit_weights). Don't
+ # rebind survey_design here — the array is not a
+ # SurveyDesign and survey_design= cannot accept arrays.
+ else:
+ # Canonical path: survey_design= may be None or a SurveyDesign
+ # instance. Map back to the internal `survey` variable name
+ # so downstream code (legacy `if survey is not None:` branch)
+ # consumes the input transparently.
+ survey = survey_design
+
+ # Type guard on the data-in surface (PR #376 R8 P1): HAD.fit()
+ # accepts a SurveyDesign that gets resolved against `data` at fit
+ # time; a pre-resolved ResolvedSurveyDesign (or its
+ # make_pweight_design factory output) goes to the array-in pretest
+ # helpers, NOT to fit(). Reject explicitly with migration guidance
+ # rather than letting `survey.resolve(data)` AttributeError or
+ # `survey.weights` (a numpy array on Resolved) be misinterpreted as
+ # a column name. Mirrors the array-in helpers' isinstance-SurveyDesign
+ # rejection in stute_test/yatchew_hr_test/stute_joint_pretest.
+ if survey is not None and not hasattr(survey, "resolve"):
+ raise TypeError(
+ "HeterogeneousAdoptionDiD.fit: `survey_design=` accepts a "
+ "SurveyDesign instance (column-referencing, gets "
+ "`.resolve(data)`'d at fit time) on the data-in estimator "
+ "surface. Got "
+ f"{type(survey).__name__} (no `.resolve()` method). "
+ "If you have a pre-resolved ResolvedSurveyDesign or used "
+ "`make_pweight_design(arr)`, that pattern is for the "
+ "array-in pretest helpers (`stute_test`, `yatchew_hr_test`, "
+ "`stute_joint_pretest`). On HAD.fit, add the weights as a "
+ "column on `data` and pass "
+ "`survey_design=SurveyDesign(weights='col_name', ...)`."
+ )
+
# Dispatch the event-study path to a dedicated method so the
# single-period path stays unchanged (Phase 2a contract preserved).
# Note: event_study returns HeterogeneousAdoptionDiDEventStudyResults
diff --git a/diff_diff/had_pretests.py b/diff_diff/had_pretests.py
index 0f72e3cf..1b0e3bcf 100644
--- a/diff_diff/had_pretests.py
+++ b/diff_diff/had_pretests.py
@@ -75,7 +75,15 @@
_validate_had_panel,
_validate_had_panel_event_study,
)
-from diff_diff.survey import _make_trivial_resolved
+from diff_diff.survey import (
+ HAD_DEPRECATION_MSG_SURVEY_KWARG,
+ HAD_DEPRECATION_MSG_WEIGHTS_KWARG_ARRAY_IN,
+ HAD_DEPRECATION_MSG_WEIGHTS_KWARG_DATA_IN,
+ HAD_DUAL_KNOB_MUTEX_MSG_ARRAY_IN,
+ HAD_DUAL_KNOB_MUTEX_MSG_DATA_IN,
+ SurveyDesign,
+ make_pweight_design,
+)
from diff_diff.utils import _generate_mammen_weights
__all__ = [
@@ -100,6 +108,7 @@
_MIN_N_BOOTSTRAP = 99
_STUTE_LARGE_G_THRESHOLD = 100_000
+
# Scale-invariant tolerance for detecting a numerically exact linear OLS fit.
# The ratio SSR / TSS = sum(eps^2) / sum((dy - dybar)^2) equals 1 - R^2
# and is BOTH TRANSLATION-INVARIANT (centering absorbs additive shifts)
@@ -1201,6 +1210,7 @@ def qug_test(
d: np.ndarray,
alpha: float = 0.05,
*,
+ survey_design: Any = None,
survey: Any = None,
weights: Optional[np.ndarray] = None,
) -> QUGTestResults:
@@ -1223,12 +1233,26 @@ def qug_test(
Post-period dose vector. Must be 1D numeric and contain no NaN.
alpha : float, default 0.05
One-sided significance level. Must satisfy ``0 < alpha < 1``.
- survey : SurveyDesign or None, keyword-only, default None
+ survey_design : ResolvedSurveyDesign or None, keyword-only, default None
Permanently rejected with ``NotImplementedError`` (Phase 4.5 C0
- decision gate). See *Notes -- Survey/weighted data*.
+ decision gate). Surface-symmetric kwarg with the rest of the HAD
+ family — accepted in the signature so all 8 HAD entry points
+ share the canonical kwarg name, but ``qug_test`` has no
+ survey-aware migration target. See *Notes -- Survey/weighted
+ data*.
+ survey : SurveyDesign or None, keyword-only, default None
+ DEPRECATED alias of ``survey_design=``. Surface-symmetric only;
+ any non-``None`` value still raises ``NotImplementedError`` —
+ the deprecation is about kwarg-name consolidation, NOT a
+ migration path (there is no survey-aware QUG). Will be removed
+ in the next minor release.
weights : np.ndarray or None, keyword-only, default None
- Permanently rejected with ``NotImplementedError`` (Phase 4.5 C0
- decision gate). See *Notes -- Survey/weighted data*.
+ DEPRECATED alias of ``survey_design=`` for the per-row pweight
+ shortcut on the rest of the HAD array-in family. On
+ ``qug_test``, surface-symmetric only; any non-``None`` value
+ still raises ``NotImplementedError`` — there is no migration
+ path (``make_pweight_design(arr)`` is NOT a valid QUG migration
+ target). Will be removed in the next minor release.
Returns
-------
@@ -1240,11 +1264,11 @@ def qug_test(
------
ValueError
If ``d`` is not 1D numeric or contains NaN, or if ``alpha`` is
- not in ``(0, 1)``, or if ``survey`` and ``weights`` are both
- non-None (mutex).
+ not in ``(0, 1)``, or if more than one of
+ ``survey_design``/``survey``/``weights`` is non-None (mutex).
NotImplementedError
- If ``survey`` or ``weights`` is non-None. See
- *Notes -- Survey/weighted data*.
+ If any of ``survey_design``, ``survey``, ``weights`` is non-None.
+ See *Notes -- Survey/weighted data*.
Notes
-----
@@ -1277,25 +1301,67 @@ def qug_test(
if not (0.0 < alpha < 1.0):
raise ValueError(f"alpha must satisfy 0 < alpha < 1, got {alpha}.")
- # Mutex on survey/weights, mirroring HeterogeneousAdoptionDiD.fit()
- # at had.py:2890 so users get a consistent error across the HAD
- # surface area.
- if survey is not None and weights is not None:
+ # Three-way mutex on survey_design / survey / weights. qug_test rejects
+ # ALL non-None survey-aware inputs (Phase 4.5 C0 permanent deferral, see
+ # NotImplementedError below), so the mutex message here is qug-specific
+ # and does NOT point users to `make_pweight_design(arr)` (which the
+ # array-in mutex on `stute_test`/`yatchew_hr_test`/`stute_joint_pretest`
+ # does suggest as the migration target). PR #376 R2 P3 fix.
+ n_set = sum(x is not None for x in (survey_design, survey, weights))
+ if n_set > 1:
raise ValueError(
- "Pass survey= OR weights=, not both. "
- "qug_test does not yet accept either kwarg (Phase 4.5 C0 "
- "decision gate); see the NotImplementedError below for the "
- "methodology rationale."
+ "qug_test: pass at most one of `survey_design=`, `survey=`, or "
+ "`weights=`. All three are permanently rejected on qug_test "
+ "(Phase 4.5 C0 deferral) — there is no migration path; see the "
+ "NotImplementedError raised below for the methodology rationale."
)
+ # Soft deprecation: route legacy survey=/weights= aliases through
+ # survey_design= for the gated NotImplementedError below. PR #376 R10
+ # P3: qug_test-specific deprecation messages — the shared
+ # HAD_DEPRECATION_MSG_*_KWARG_ARRAY_IN strings tell users to migrate to
+ # `survey_design=` / `make_pweight_design(...)`, but qug_test
+ # permanently rejects ALL survey-aware kwargs (Phase 4.5 C0 deferral).
+ # Use qug-specific warning text that says the aliases are deprecated
+ # but survey-aware QUG remains unsupported, and points users to
+ # unweighted `qug_test()` or `did_had_pretest_workflow(...,
+ # survey_design=...)` for the survey-aware linearity family.
+ if survey is not None:
+ warnings.warn(
+ "`survey=` is deprecated on qug_test (will be removed in the "
+ "next minor release). Note that qug_test does NOT support "
+ "survey-aware inputs at all (Phase 4.5 C0 permanent deferral; "
+ "see the NotImplementedError below). For survey-aware HAD "
+ "pretesting, use `did_had_pretest_workflow(..., "
+ "survey_design=...)` (the workflow skips the QUG step under "
+ "survey/weights and runs the linearity family).",
+ DeprecationWarning,
+ stacklevel=2,
+ )
+ survey_design = survey
+ elif weights is not None:
+ warnings.warn(
+ "`weights=` is deprecated on qug_test (will be removed in the "
+ "next minor release). Note that qug_test does NOT support "
+ "weighted/survey inputs at all (Phase 4.5 C0 permanent deferral; "
+ "see the NotImplementedError below). For survey-aware HAD "
+ "pretesting, use `did_had_pretest_workflow(..., "
+ "survey_design=...)` (the workflow skips the QUG step under "
+ "survey/weights and runs the linearity family).",
+ DeprecationWarning,
+ stacklevel=2,
+ )
+ survey_design = make_pweight_design(np.asarray(weights, dtype=np.float64))
+
# Phase 4.5 C0 decision gate: QUG-under-survey is permanently deferred.
# Extreme-order-statistic functionals are not smooth in the empirical
# CDF, so standard survey machinery (Binder TSL linearization, Rao-Wu
# rescaled bootstrap) does not provide a calibrated test. See
# REGISTRY.md § "QUG Null Test" for the full methodology note.
- if survey is not None or weights is not None:
+ if survey_design is not None:
raise NotImplementedError(
- "qug_test does not support survey= / weights= kwargs.\n"
+ "qug_test does not support survey_design= / survey= / "
+ "weights= kwargs.\n"
"\n"
"QUG (de Chaisemartin et al. 2026, Theorem 4) tests "
"H_0: d_lower = 0 via the ratio of the two smallest order "
@@ -1311,7 +1377,7 @@ def qug_test(
"boundary tests; no off-the-shelf survey-aware QUG exists.\n"
"\n"
"For survey-aware HAD pretesting, use the joint Stute family "
- "via did_had_pretest_workflow(..., survey=..., "
+ "via did_had_pretest_workflow(..., survey_design=..., "
"aggregate=...) -- shipped in Phase 4.5 C. The workflow "
"skips the QUG step under survey/weights with a UserWarning "
"and runs the linearity family with a PSU-level Mammen "
@@ -1418,8 +1484,9 @@ def stute_test(
n_bootstrap: int = 999,
seed: Optional[int] = None,
*,
- weights: Optional[np.ndarray] = None,
+ survey_design: Any = None,
survey: Any = None,
+ weights: Optional[np.ndarray] = None,
) -> StuteTestResults:
"""Run the Stute Cramer-von Mises linearity test (paper Appendix D).
@@ -1446,19 +1513,22 @@ def stute_test(
seed : int or None, default None
Seed for ``np.random.default_rng``. Pass an integer for
reproducible results.
- weights : np.ndarray or None, keyword-only, default None
- Per-unit positive weights for the pweight shortcut. Mutually
- exclusive with ``survey``. When supplied, the bootstrap is routed
- through a synthetic trivial ``ResolvedSurveyDesign`` (no
- strata/PSU/FPC) so that the same survey-aware kernel handles both
- entry points. See *Notes -- Survey/weighted data*.
- survey : ResolvedSurveyDesign or None, keyword-only, default None
- Already-resolved survey design (per-unit). Triggers the survey-
- aware Stute calibration: PSU-level Mammen multipliers via
+ survey_design : ResolvedSurveyDesign or None, keyword-only, default None
+ Already-resolved survey design (per-unit). Array-in helpers
+ accept ``ResolvedSurveyDesign`` ONLY; passing a ``SurveyDesign``
+ raises ``TypeError`` with migration guidance. For the pweight-only
+ shortcut, use ``survey_design=make_pweight_design(arr)``. Triggers
+ the survey-aware Stute calibration: PSU-level Mammen multipliers
+ via
:func:`diff_diff.bootstrap_utils.generate_survey_multiplier_weights_batch`,
broadcast to per-unit residual perturbation, with weighted CvM
- recompute. Replicate-weight designs raise ``NotImplementedError``
- (deferred to a parallel follow-up after Phase 4.5 C).
+ recompute. Replicate-weight designs raise ``NotImplementedError``.
+ survey : ResolvedSurveyDesign or None, keyword-only, default None
+ DEPRECATED alias of ``survey_design=``. Will be removed in the
+ next minor release.
+ weights : np.ndarray or None, keyword-only, default None
+ DEPRECATED alias of ``survey_design=make_pweight_design(arr)``.
+ Will be removed in the next minor release.
Returns
-------
@@ -1470,8 +1540,16 @@ def stute_test(
If ``d`` / ``dy`` are not 1D numeric, contain NaN, have unequal
lengths, if any ``d`` value is negative (paper Section 2 HAD
support restriction), if ``alpha`` is outside ``(0, 1)``, or if
- ``n_bootstrap < 99``. Also raised if BOTH ``weights`` and
- ``survey`` are supplied (mutex).
+ ``n_bootstrap < 99``. Also raised if more than one of
+ ``survey_design``, ``survey``, ``weights`` is supplied (3-way
+ mutex; ``survey=`` and ``weights=`` are deprecated aliases of
+ ``survey_design=``).
+ TypeError
+ If ``survey_design=SurveyDesign(...)`` (or the deprecated
+ ``survey=SurveyDesign(...)`` alias) is passed; array-in helpers
+ accept ``ResolvedSurveyDesign`` only. Use
+ ``survey_design=make_pweight_design(arr)`` for pweight-only or
+ pre-resolve via ``SurveyDesign(...).resolve(data)``.
NotImplementedError
If ``survey.replicate_weights is not None``. Replicate-weight
pretests are a parallel follow-up after Phase 4.5 C; the
@@ -1524,14 +1602,52 @@ def stute_test(
f"Got n_bootstrap={n_bootstrap}."
)
- # Phase 4.5 C: survey/weights mutex + replicate-weight rejection.
- # Mirrors the C0 pattern from qug_test and HeterogeneousAdoptionDiD.fit().
- if survey is not None and weights is not None:
- raise ValueError(
- "stute_test: pass survey= OR weights=, "
- "not both. survey= triggers full PSU-aware bootstrap; weights= is "
- "the pweight shortcut routed through a synthetic trivial design."
+ # Three-way mutex on survey_design / survey / weights (array-in pattern).
+ n_set = sum(x is not None for x in (survey_design, survey, weights))
+ if n_set > 1:
+ raise ValueError(HAD_DUAL_KNOB_MUTEX_MSG_ARRAY_IN)
+
+ # Soft deprecation: route legacy survey=/weights= aliases to survey_design=
+ # FIRST so the type guard below covers `survey=SurveyDesign(...)` too
+ # (PR #376 R1 P1: alias must behave identically to the canonical kwarg).
+ # The bit-exact normalization-order invariant requires passing UNNORMALIZED
+ # weights to make_pweight_design; the unified path's mean=1 step (~line
+ # 1669) fires downstream EXACTLY ONCE.
+ if survey is not None:
+ warnings.warn(HAD_DEPRECATION_MSG_SURVEY_KWARG, DeprecationWarning, stacklevel=2)
+ survey_design = survey
+ elif weights is not None:
+ warnings.warn(
+ HAD_DEPRECATION_MSG_WEIGHTS_KWARG_ARRAY_IN,
+ DeprecationWarning,
+ stacklevel=2,
)
+ survey_design = make_pweight_design(np.asarray(weights, dtype=np.float64))
+
+ # Type guard: array-in helpers reject SurveyDesign (cannot resolve column
+ # names without `data`). Runs AFTER alias rebinding so it covers both
+ # `survey_design=SurveyDesign(...)` and the deprecated
+ # `survey=SurveyDesign(...)` form identically.
+ if survey_design is not None and isinstance(survey_design, SurveyDesign):
+ raise TypeError(
+ "stute_test: `survey_design=` accepts a pre-resolved "
+ "ResolvedSurveyDesign only (array-in helpers have no `data` to "
+ "resolve column names against). For pweight-only, use "
+ "`survey_design=make_pweight_design(arr)`. For full PSU/strata/"
+ "FPC, pre-resolve via `SurveyDesign(...).resolve(data)` and pass "
+ "the result."
+ )
+
+ # Internal alias rebind: downstream code uses `survey` and `weights` as
+ # internal variable names (Phase 4.5 C convention). After the deprecation
+ # block, fold the canonical survey_design back into the legacy variable
+ # names so the unchanged downstream logic consumes the input transparently.
+ survey = survey_design
+ weights = None # weights= alias has been folded into survey_design
+
+ # Replicate-weight rejection: the per-replicate weight-ratio rescaling for
+ # the OLS-on-residuals refit step is not covered by the multiplier-bootstrap
+ # composition. Parallel follow-up after Phase 4.5 C.
if survey is not None and getattr(survey, "replicate_weights", None) is not None:
raise NotImplementedError(
"stute_test: replicate-weight survey designs (BRR/Fay/JK1/JKn/SDR) "
@@ -1721,7 +1837,7 @@ def stute_test(
# (broadcast to per-obs perturbation), weighted OLS refit, weighted
# CvM recompute. Routes via synthetic trivial ResolvedSurveyDesign
# for the weights= shortcut to share the same kernel.
- resolved_for_boot = survey if survey is not None else _make_trivial_resolved(w_arr)
+ resolved_for_boot = survey if survey is not None else make_pweight_design(w_arr)
# R10 P1: reject stratified designs explicitly until a derived
# Stute-specific correction lands. The HAD sup-t bootstrap
# (had.py:2120+) applies a within-stratum demean +
@@ -1833,8 +1949,9 @@ def yatchew_hr_test(
dy: np.ndarray,
alpha: float = 0.05,
*,
- weights: Optional[np.ndarray] = None,
+ survey_design: Any = None,
survey: Any = None,
+ weights: Optional[np.ndarray] = None,
) -> YatchewTestResults:
"""Run the Yatchew heteroskedasticity-robust linearity test.
@@ -1858,17 +1975,23 @@ def yatchew_hr_test(
Dose and first-difference outcome vectors.
alpha : float, default 0.05
One-sided significance level.
- weights : np.ndarray or None, keyword-only, default None
- Per-unit STRICTLY POSITIVE weights for the pweight shortcut.
- Mutually exclusive with ``survey``. See *Notes -- Survey/weighted data*.
- survey : ResolvedSurveyDesign or None, keyword-only, default None
- Already-resolved survey design (per-unit). When supplied, the OLS
+ survey_design : ResolvedSurveyDesign or None, keyword-only, default None
+ Already-resolved survey design (per-unit). Array-in helpers accept
+ ``ResolvedSurveyDesign`` ONLY; passing a ``SurveyDesign`` raises
+ ``TypeError``. For pweight-only, use
+ ``survey_design=make_pweight_design(arr)``. When supplied, the OLS
baseline becomes weighted OLS and all three variance components
become their pweight-sandwich analogs. PSU clustering is NOT
propagated through the variance-ratio statistic (would require
deriving a survey-aware variance-of-variance estimator; out of
scope per Phase 4.5 C). Replicate-weight designs raise
``NotImplementedError``.
+ survey : ResolvedSurveyDesign or None, keyword-only, default None
+ DEPRECATED alias of ``survey_design=``. Will be removed in the
+ next minor release.
+ weights : np.ndarray or None, keyword-only, default None
+ DEPRECATED alias of ``survey_design=make_pweight_design(arr)``.
+ Will be removed in the next minor release.
Returns
-------
@@ -1880,8 +2003,16 @@ def yatchew_hr_test(
If ``d`` / ``dy`` are not 1D numeric, contain NaN, have unequal
lengths, if any ``d`` value is negative (paper Section 2 HAD
support restriction), or if ``alpha`` is outside ``(0, 1)``.
- Also raised if BOTH ``weights`` and ``survey`` supplied (mutex),
- or if any weight is non-positive.
+ Also raised if more than one of ``survey_design``, ``survey``,
+ ``weights`` is supplied (3-way mutex; ``survey=`` and
+ ``weights=`` are deprecated aliases of ``survey_design=``), or
+ if any weight is non-positive.
+ TypeError
+ If ``survey_design=SurveyDesign(...)`` (or the deprecated
+ ``survey=SurveyDesign(...)`` alias) is passed; array-in helpers
+ accept ``ResolvedSurveyDesign`` only. Use
+ ``survey_design=make_pweight_design(arr)`` for pweight-only or
+ pre-resolve via ``SurveyDesign(...).resolve(data)``.
NotImplementedError
If ``survey.replicate_weights is not None`` (deferred follow-up).
@@ -1956,11 +2087,42 @@ def yatchew_hr_test(
if not (0.0 < alpha < 1.0):
raise ValueError(f"alpha must satisfy 0 < alpha < 1, got {alpha}.")
- # Phase 4.5 C: survey/weights mutex + replicate-weight rejection.
- if survey is not None and weights is not None:
- raise ValueError(
- "yatchew_hr_test: pass survey= OR " "weights=, not both."
+ # Three-way mutex on survey_design / survey / weights (array-in pattern).
+ n_set = sum(x is not None for x in (survey_design, survey, weights))
+ if n_set > 1:
+ raise ValueError(HAD_DUAL_KNOB_MUTEX_MSG_ARRAY_IN)
+
+ # Soft deprecation: route legacy survey=/weights= aliases to survey_design=
+ # FIRST so the type guard below covers `survey=SurveyDesign(...)` too
+ # (PR #376 R1 P1: alias must behave identically to the canonical kwarg).
+ if survey is not None:
+ warnings.warn(HAD_DEPRECATION_MSG_SURVEY_KWARG, DeprecationWarning, stacklevel=2)
+ survey_design = survey
+ elif weights is not None:
+ warnings.warn(
+ HAD_DEPRECATION_MSG_WEIGHTS_KWARG_ARRAY_IN,
+ DeprecationWarning,
+ stacklevel=2,
+ )
+ survey_design = make_pweight_design(np.asarray(weights, dtype=np.float64))
+
+ # Type guard: array-in helpers reject SurveyDesign. Runs AFTER alias
+ # rebinding so it covers both `survey_design=SurveyDesign(...)` and the
+ # deprecated `survey=SurveyDesign(...)` form identically.
+ if survey_design is not None and isinstance(survey_design, SurveyDesign):
+ raise TypeError(
+ "yatchew_hr_test: `survey_design=` accepts a pre-resolved "
+ "ResolvedSurveyDesign only (array-in helpers have no `data` to "
+ "resolve column names against). For pweight-only, use "
+ "`survey_design=make_pweight_design(arr)`. For full PSU/strata/"
+ "FPC, pre-resolve via `SurveyDesign(...).resolve(data)`."
)
+
+ # Internal alias rebind for back-compat with downstream code.
+ survey = survey_design
+ weights = None
+
+ # Replicate-weight rejection.
if survey is not None and getattr(survey, "replicate_weights", None) is not None:
raise NotImplementedError(
"yatchew_hr_test: replicate-weight survey designs (BRR/Fay/JK1/JKn/"
@@ -2516,8 +2678,9 @@ def stute_joint_pretest(
n_bootstrap: int = 999,
seed: Optional[int] = None,
null_form: str = "custom",
- weights: Optional[np.ndarray] = None,
+ survey_design: Any = None,
survey: Any = None,
+ weights: Optional[np.ndarray] = None,
) -> StuteJointResult:
"""Joint Cramer-von Mises pretest across multiple horizons.
@@ -2563,21 +2726,25 @@ def stute_joint_pretest(
(``"mean_independence"`` | ``"linearity"`` | ``"custom"``).
The wrappers :func:`joint_pretrends_test` and
:func:`joint_homogeneity_test` set this automatically.
- weights : np.ndarray or None, keyword-only, default None
- Per-unit positive weights (Phase 4.5 C). When supplied, the
- per-horizon CvM uses :func:`_cvm_statistic_weighted` and the
- bootstrap routes through a synthetic trivial
- ``ResolvedSurveyDesign``. Mutually exclusive with ``survey``.
+ survey_design : ResolvedSurveyDesign or None, keyword-only, default None
+ Already-resolved per-unit survey design (Phase 4.5 C). Array-in
+ helpers accept ``ResolvedSurveyDesign`` ONLY; passing a
+ ``SurveyDesign`` raises ``TypeError``. For pweight-only, use
+ ``survey_design=make_pweight_design(arr)``. When supplied, the
+ bootstrap is a PSU-level Mammen multiplier bootstrap with the
+ multiplier matrix shared across horizons within each replicate
+ (preserves both the vector-valued empirical-process unit-level
+ dependence and PSU clustering). Replicate-weight designs raise
+ ``NotImplementedError``; non-pweight weight types are rejected.
+ Variance-unidentified designs (``df_survey <= 0``) return NaN
+ with a ``UserWarning`` instead of calibrating against an
+ all-zero multiplier matrix.
survey : ResolvedSurveyDesign or None, keyword-only, default None
- Already-resolved per-unit survey design (Phase 4.5 C). When
- supplied, the bootstrap is a PSU-level Mammen multiplier
- bootstrap with the multiplier matrix shared across horizons
- within each replicate (preserves both vector-valued empirical-
- process unit-level dependence + PSU clustering). Replicate-
- weight designs raise ``NotImplementedError``; non-pweight
- weight types are rejected. Variance-unidentified designs
- (``df_survey <= 0``) return NaN with a ``UserWarning`` instead
- of calibrating against an all-zero multiplier matrix.
+ DEPRECATED alias of ``survey_design=``. Will be removed in the
+ next minor release.
+ weights : np.ndarray or None, keyword-only, default None
+ DEPRECATED alias of ``survey_design=make_pweight_design(arr)``.
+ Will be removed in the next minor release.
Returns
-------
@@ -2602,13 +2769,42 @@ def stute_joint_pretest(
negative values, ``n_bootstrap < _MIN_N_BOOTSTRAP``, or invalid
``alpha``. ``G < _MIN_G_STUTE`` does NOT raise; see Returns.
"""
- # Phase 4.5 C: survey/weights mutex + replicate-weight rejection
- # (mirrors stute_test, yatchew_hr_test, did_had_pretest_workflow).
- if survey is not None and weights is not None:
- raise ValueError(
- "stute_joint_pretest: pass survey= OR "
- "weights=, not both."
+ # Three-way mutex on survey_design / survey / weights (array-in pattern).
+ n_set = sum(x is not None for x in (survey_design, survey, weights))
+ if n_set > 1:
+ raise ValueError(HAD_DUAL_KNOB_MUTEX_MSG_ARRAY_IN)
+
+ # Soft deprecation: route legacy survey=/weights= aliases to survey_design=
+ # FIRST so the type guard below covers `survey=SurveyDesign(...)` too
+ # (PR #376 R1 P1: alias must behave identically to the canonical kwarg).
+ if survey is not None:
+ warnings.warn(HAD_DEPRECATION_MSG_SURVEY_KWARG, DeprecationWarning, stacklevel=2)
+ survey_design = survey
+ elif weights is not None:
+ warnings.warn(
+ HAD_DEPRECATION_MSG_WEIGHTS_KWARG_ARRAY_IN,
+ DeprecationWarning,
+ stacklevel=2,
+ )
+ survey_design = make_pweight_design(np.asarray(weights, dtype=np.float64))
+
+ # Type guard: array-in helpers reject SurveyDesign. Runs AFTER alias
+ # rebinding so it covers both `survey_design=SurveyDesign(...)` and the
+ # deprecated `survey=SurveyDesign(...)` form identically.
+ if survey_design is not None and isinstance(survey_design, SurveyDesign):
+ raise TypeError(
+ "stute_joint_pretest: `survey_design=` accepts a pre-resolved "
+ "ResolvedSurveyDesign only (array-in helpers have no `data` to "
+ "resolve column names against). For pweight-only, use "
+ "`survey_design=make_pweight_design(arr)`. For full PSU/strata/"
+ "FPC, pre-resolve via `SurveyDesign(...).resolve(data)`."
)
+
+ # Internal alias rebind for back-compat with downstream code.
+ survey = survey_design
+ weights = None
+
+ # Replicate-weight rejection.
if survey is not None and getattr(survey, "replicate_weights", None) is not None:
raise NotImplementedError(
"stute_joint_pretest: replicate-weight survey designs (BRR/Fay/JK1/"
@@ -2935,7 +3131,7 @@ def stute_joint_pretest(
# broadcasts the SAME multipliers, preserving both the
# vector-valued empirical-process unit-level dependence (paper
# convention) AND PSU clustering (Krieger-Pfeffermann 1997).
- resolved_for_boot = survey if survey is not None else _make_trivial_resolved(w_arr)
+ resolved_for_boot = survey if survey is not None else make_pweight_design(w_arr)
# R10 P1: reject stratified designs explicitly until a derived
# Stute-specific correction lands (mirrors stute_test
# single-horizon).
@@ -3100,9 +3296,22 @@ def _resolve_pretest_unit_weights(
return weights_unit, None
# survey is not None
if not hasattr(survey, "resolve"):
+ # PR #376 R9 P3: error message names the canonical kwarg
+ # `survey_design=` (with the deprecated `survey=` alias mentioned
+ # for back-compat), and points pre-resolved-design users to the
+ # array-in pretest helpers where ResolvedSurveyDesign /
+ # make_pweight_design(arr) belong.
raise TypeError(
- f"{caller_name}: survey= must be a SurveyDesign instance "
- f"(with .resolve()); got {type(survey).__name__}."
+ f"{caller_name}: `survey_design=` (or the deprecated `survey=` "
+ f"alias) accepts a SurveyDesign instance (column-referencing, "
+ f"gets `.resolve(data)`'d at fit time) on data-in surfaces; "
+ f"got {type(survey).__name__} (no `.resolve()` method). "
+ "If you have a pre-resolved ResolvedSurveyDesign or used "
+ "`make_pweight_design(arr)`, that pattern is for the array-in "
+ "pretest helpers (`stute_test`, `yatchew_hr_test`, "
+ "`stute_joint_pretest`). On data-in surfaces, add the weights "
+ "as a column on `data` and pass "
+ "`survey_design=SurveyDesign(weights='col_name', ...)`."
)
resolved_full = survey.resolve(data)
if getattr(resolved_full, "replicate_weights", None) is not None:
@@ -3151,8 +3360,9 @@ def joint_pretrends_test(
alpha: float = 0.05,
n_bootstrap: int = 999,
seed: Optional[int] = None,
- weights: Optional[np.ndarray] = None,
+ survey_design: Any = None,
survey: Any = None,
+ weights: Optional[np.ndarray] = None,
) -> StuteJointResult:
"""Joint Stute pre-trends test (paper Section 4.2 step 2).
@@ -3189,23 +3399,46 @@ def joint_pretrends_test(
handling follows the HAD contract (staggered auto-filter warns
and proceeds on last cohort; solo cohort proceeds).
alpha, n_bootstrap, seed : as in :func:`stute_test`.
- weights : np.ndarray or None, keyword-only, default None
- Per-row positive weights (Phase 4.5 C). Aggregated to per-unit
- via :func:`diff_diff.had._aggregate_unit_weights` (constant-
- within-unit invariant enforced). On staggered panels the
- wrapper subsets ``weights`` to the surviving cohort BEFORE
- aggregation. Mutually exclusive with ``survey``.
- survey : SurveyDesign or None, keyword-only, default None
+ survey_design : SurveyDesign or None, keyword-only, default None
Survey design (Phase 4.5 C). Resolved on the filtered panel;
replicate-weight designs raise ``NotImplementedError``;
``weight_type`` must be ``"pweight"``. Forwarded to
:func:`stute_joint_pretest` as a per-unit
- ``ResolvedSurveyDesign``.
+ ``ResolvedSurveyDesign``. Mutually exclusive with the deprecated
+ ``survey=`` and ``weights=`` aliases.
+ survey : SurveyDesign or None, keyword-only, default None
+ DEPRECATED alias of ``survey_design=``. Will be removed in the
+ next minor release.
+ weights : np.ndarray or None, keyword-only, default None
+ DEPRECATED alias for the per-row pweight shortcut. Prefer
+ ``survey_design=SurveyDesign(weights='col_name')`` against your
+ dataframe instead. Will be removed in the next minor release.
Returns
-------
StuteJointResult with ``null_form = "mean_independence"``.
"""
+ # Three-way mutex on survey_design / survey / weights (data-in pattern).
+ n_set = sum(x is not None for x in (survey_design, survey, weights))
+ if n_set > 1:
+ raise ValueError(HAD_DUAL_KNOB_MUTEX_MSG_DATA_IN)
+
+ # Soft deprecation: route legacy survey=/weights= aliases to survey_design=.
+ if survey is not None:
+ warnings.warn(HAD_DEPRECATION_MSG_SURVEY_KWARG, DeprecationWarning, stacklevel=2)
+ survey_design = survey
+ elif weights is not None:
+ warnings.warn(
+ HAD_DEPRECATION_MSG_WEIGHTS_KWARG_DATA_IN,
+ DeprecationWarning,
+ stacklevel=2,
+ )
+ # weights= shortcut preserved as-is on the back end.
+
+ # Internal alias rebind: downstream code uses `survey` and `weights`.
+ if survey_design is not None and survey is None:
+ survey = survey_design
+
if len(pre_periods) == 0:
raise ValueError(
"pre_periods must be non-empty. Workflow dispatch handles "
@@ -3380,6 +3613,17 @@ def joint_pretrends_test(
design_matrix = np.ones((G, 1), dtype=np.float64)
+ # Internal forwarding: pass survey_design= directly to stute_joint_pretest
+ # to avoid emitting the deprecation warning on every internal call. The
+ # canonical kwarg is the same on both ends; the warning fires ONCE at the
+ # user-facing front door (this wrapper) when the user passed a deprecated
+ # alias.
+ if resolved_unit is not None:
+ joint_survey_design = resolved_unit
+ elif weights_unit is not None:
+ joint_survey_design = make_pweight_design(weights_unit)
+ else:
+ joint_survey_design = None
return stute_joint_pretest(
residuals_by_horizon=residuals_by_horizon,
fitted_by_horizon=fitted_by_horizon,
@@ -3389,8 +3633,7 @@ def joint_pretrends_test(
n_bootstrap=n_bootstrap,
seed=seed,
null_form="mean_independence",
- weights=weights_unit if resolved_unit is None else None,
- survey=resolved_unit,
+ survey_design=joint_survey_design,
)
@@ -3407,8 +3650,9 @@ def joint_homogeneity_test(
alpha: float = 0.05,
n_bootstrap: int = 999,
seed: Optional[int] = None,
- weights: Optional[np.ndarray] = None,
+ survey_design: Any = None,
survey: Any = None,
+ weights: Optional[np.ndarray] = None,
) -> StuteJointResult:
"""Joint Stute homogeneity-linearity test (paper Section 4.3 joint).
@@ -3440,19 +3684,43 @@ def joint_homogeneity_test(
first_treat_col : str or None
Forwarded to the underlying panel validator.
alpha, n_bootstrap, seed : as in :func:`stute_test`.
- weights : np.ndarray or None, keyword-only, default None
- Per-row positive weights (Phase 4.5 C). See
- :func:`joint_pretrends_test` for the contract; semantics are
- identical (per-unit aggregation, staggered subsetting,
- replicate-weight rejection).
- survey : SurveyDesign or None, keyword-only, default None
+ survey_design : SurveyDesign or None, keyword-only, default None
Survey design (Phase 4.5 C). Same contract as
- :func:`joint_pretrends_test`.
+ :func:`joint_pretrends_test`. Mutually exclusive with the
+ deprecated ``survey=`` and ``weights=`` aliases.
+ survey : SurveyDesign or None, keyword-only, default None
+ DEPRECATED alias of ``survey_design=``. Will be removed in the
+ next minor release.
+ weights : np.ndarray or None, keyword-only, default None
+ DEPRECATED alias for the per-row pweight shortcut. Prefer
+ ``survey_design=SurveyDesign(weights='col_name')`` against your
+ dataframe instead. Will be removed in the next minor release.
Returns
-------
StuteJointResult with ``null_form = "linearity"``.
"""
+ # Three-way mutex on survey_design / survey / weights (data-in pattern).
+ n_set = sum(x is not None for x in (survey_design, survey, weights))
+ if n_set > 1:
+ raise ValueError(HAD_DUAL_KNOB_MUTEX_MSG_DATA_IN)
+
+ # Soft deprecation: route legacy survey=/weights= aliases to survey_design=.
+ if survey is not None:
+ warnings.warn(HAD_DEPRECATION_MSG_SURVEY_KWARG, DeprecationWarning, stacklevel=2)
+ survey_design = survey
+ elif weights is not None:
+ warnings.warn(
+ HAD_DEPRECATION_MSG_WEIGHTS_KWARG_DATA_IN,
+ DeprecationWarning,
+ stacklevel=2,
+ )
+ # weights= shortcut preserved as-is on the back end.
+
+ # Internal alias rebind: downstream code uses `survey` and `weights`.
+ if survey_design is not None and survey is None:
+ survey = survey_design
+
if len(post_periods) == 0:
raise ValueError(
"post_periods must be non-empty. Workflow dispatch handles "
@@ -3613,6 +3881,13 @@ def joint_homogeneity_test(
design_matrix = np.column_stack([np.ones(G, dtype=np.float64), d_arr.astype(np.float64)])
+ # Internal forwarding via canonical kwarg (avoids deprecation warning).
+ if resolved_unit is not None:
+ joint_survey_design = resolved_unit
+ elif weights_unit is not None:
+ joint_survey_design = make_pweight_design(weights_unit)
+ else:
+ joint_survey_design = None
return stute_joint_pretest(
residuals_by_horizon=residuals_by_horizon,
fitted_by_horizon=fitted_by_horizon,
@@ -3622,8 +3897,7 @@ def joint_homogeneity_test(
n_bootstrap=n_bootstrap,
seed=seed,
null_form="linearity",
- weights=weights_unit if resolved_unit is None else None,
- survey=resolved_unit,
+ survey_design=joint_survey_design,
)
@@ -3750,6 +4024,7 @@ def did_had_pretest_workflow(
seed: Optional[int] = None,
*,
aggregate: str = "overall",
+ survey_design: Any = None,
survey: Any = None,
weights: Optional[np.ndarray] = None,
) -> HADPretestReport:
@@ -3803,17 +4078,24 @@ def did_had_pretest_workflow(
deterministic.
aggregate : str, keyword-only, default ``"overall"``
Dispatch mode. Invalid values raise ``ValueError``.
- survey : SurveyDesign or None, keyword-only, default None
+ survey_design : SurveyDesign or None, keyword-only, default None
Survey design for design-based pretest inference. Linearity-family
pretests use PSU-level Mammen multiplier bootstrap (Stute family)
and weighted OLS + weighted variance components (Yatchew). The QUG
step is skipped under survey with a ``UserWarning`` (permanent
deferral per Phase 4.5 C0). Replicate-weight designs raise
- ``NotImplementedError``. Mutually exclusive with ``weights``.
+ ``NotImplementedError``. Mutually exclusive with the deprecated
+ ``survey=`` and ``weights=`` aliases.
+ survey : SurveyDesign or None, keyword-only, default None
+ DEPRECATED alias of ``survey_design=``. Will be removed in the
+ next minor release; prefer ``survey_design=``.
weights : np.ndarray or None, keyword-only, default None
- Per-row positive weights for the pweight shortcut. Mutually
- exclusive with ``survey``. Routed through a synthetic trivial
- ``ResolvedSurveyDesign`` so the same kernel handles both paths.
+ DEPRECATED alias for the per-row pweight shortcut. Prefer adding
+ the weights as a column on ``data`` and passing
+ ``survey_design=SurveyDesign(weights='col_name')`` instead. Will
+ be removed in the next minor release. Currently routed through a
+ synthetic trivial ``ResolvedSurveyDesign`` so the same kernel
+ handles both paths.
Returns
-------
@@ -3830,9 +4112,11 @@ def did_had_pretest_workflow(
Raises
------
ValueError
- On invalid ``aggregate``, ``survey`` and ``weights`` both
- non-None, or any downstream front-door failure (panel balance,
- dtype, dose invariant).
+ On invalid ``aggregate``; if more than one of ``survey_design``,
+ ``survey``, ``weights`` is supplied (3-way mutex; ``survey=`` and
+ ``weights=`` are deprecated aliases of ``survey_design=``); or
+ any downstream front-door failure (panel balance, dtype, dose
+ invariant).
NotImplementedError
If ``survey.replicate_weights is not None`` (replicate-weight
pretests deferred to a parallel follow-up after Phase 4.5 C).
@@ -3885,19 +4169,43 @@ def did_had_pretest_workflow(
f"aggregate must be one of {list(_VALID_AGGREGATES)!r}; " f"got {aggregate!r}."
)
- # Phase 4.5 C: survey/weights mutex + presence detection. R6 P1 fix:
- # do NOT call _resolve_pretest_unit_weights on the FULL panel here --
- # under aggregate='event_study' the panel may be staggered and the
- # cohort filter at _validate_multi_period_panel can drop units. If
- # those dropped units have zero/invalid weights, eager full-panel
- # resolution would abort an otherwise-valid event-study run. Defer
- # resolution to the per-aggregate branches: overall path resolves on
- # the original data (no filtering); event-study path lets the joint
- # wrappers handle resolution on data_filtered.
- if survey is not None and weights is not None:
- raise ValueError(
- "did_had_pretest_workflow: pass survey= OR " "weights=, not both."
+ # Three-way mutex on survey_design / survey / weights (data-in pattern).
+ # R6 P1 fix: do NOT call _resolve_pretest_unit_weights on the FULL panel
+ # here -- under aggregate='event_study' the panel may be staggered and the
+ # cohort filter at _validate_multi_period_panel can drop units. If those
+ # dropped units have zero/invalid weights, eager full-panel resolution
+ # would abort an otherwise-valid event-study run. Defer resolution to the
+ # per-aggregate branches: overall path resolves on the original data (no
+ # filtering); event-study path lets the joint wrappers handle resolution
+ # on data_filtered.
+ n_set = sum(x is not None for x in (survey_design, survey, weights))
+ if n_set > 1:
+ raise ValueError(HAD_DUAL_KNOB_MUTEX_MSG_DATA_IN)
+
+ # Soft deprecation: route legacy survey=/weights= aliases to survey_design=.
+ # The internal back-end paths (_resolve_pretest_unit_weights + per-aggregate
+ # dispatch) consume `survey` and `weights` as internal variable names, so
+ # rebind both for back-compat with the unchanged downstream logic. The
+ # bit-exact regression invariant is preserved because we only rebind names,
+ # not values.
+ if survey is not None:
+ warnings.warn(HAD_DEPRECATION_MSG_SURVEY_KWARG, DeprecationWarning, stacklevel=2)
+ survey_design = survey
+ elif weights is not None:
+ warnings.warn(
+ HAD_DEPRECATION_MSG_WEIGHTS_KWARG_DATA_IN,
+ DeprecationWarning,
+ stacklevel=2,
)
+ # weights= shortcut preserved as-is on the back end. Don't rebind
+ # survey_design -- the array is not a SurveyDesign.
+
+ # Internal alias rebind: downstream code uses `survey` (when set, a
+ # SurveyDesign or pre-resolved). Map the canonical input back so the
+ # unchanged downstream `if survey is not None:` branches consume it.
+ if survey_design is not None and survey is None:
+ survey = survey_design
+
use_survey_path = (survey is not None) or (weights is not None)
if use_survey_path:
@@ -3999,41 +4307,54 @@ def did_had_pretest_workflow(
# whose lexical and chronological order disagree (e.g. "q10" <
# "q2" lexically but > chronologically).
earlier_pre = list(t_pre_list[:-1])
- if len(earlier_pre) >= 1:
- pretrends_joint = joint_pretrends_test(
+ # PR #376 R2 P3: when `weights=joint_weights` is forwarded to the joint
+ # wrappers (the only joint-internal entry that takes a numpy array),
+ # the wrapper would re-emit a DeprecationWarning. Suppress those
+ # nested warnings -- the user-facing warning has already fired at the
+ # workflow's front door above. survey_design=joint_survey is a
+ # SurveyDesign (column-referencing) on the survey path and goes
+ # through canonically; only the weights= forwarding path needs the
+ # suppression. The joint wrappers also can't accept a pre-resolved
+ # ResolvedSurveyDesign (their `_resolve_pretest_unit_weights` requires
+ # a SurveyDesign with .resolve()), so converting weights= to
+ # survey_design= via make_pweight_design isn't an option here.
+ with warnings.catch_warnings():
+ warnings.simplefilter("ignore", DeprecationWarning)
+ if len(earlier_pre) >= 1:
+ pretrends_joint = joint_pretrends_test(
+ data_filtered,
+ outcome_col=outcome_col,
+ dose_col=dose_col,
+ time_col=time_col,
+ unit_col=unit_col,
+ pre_periods=earlier_pre,
+ base_period=base_period,
+ first_treat_col=first_treat_col,
+ alpha=alpha,
+ n_bootstrap=n_bootstrap,
+ seed=seed,
+ survey_design=joint_survey,
+ weights=joint_weights,
+ )
+ else:
+ pretrends_joint = None
+
+ # Step 3: joint homogeneity-linearity on post-periods.
+ homogeneity_joint = joint_homogeneity_test(
data_filtered,
outcome_col=outcome_col,
dose_col=dose_col,
time_col=time_col,
unit_col=unit_col,
- pre_periods=earlier_pre,
+ post_periods=list(t_post_list),
base_period=base_period,
first_treat_col=first_treat_col,
alpha=alpha,
n_bootstrap=n_bootstrap,
seed=seed,
+ survey_design=joint_survey,
weights=joint_weights,
- survey=joint_survey,
)
- else:
- pretrends_joint = None
-
- # Step 3: joint homogeneity-linearity on post-periods.
- homogeneity_joint = joint_homogeneity_test(
- data_filtered,
- outcome_col=outcome_col,
- dose_col=dose_col,
- time_col=time_col,
- unit_col=unit_col,
- post_periods=list(t_post_list),
- base_period=base_period,
- first_treat_col=first_treat_col,
- alpha=alpha,
- n_bootstrap=n_bootstrap,
- seed=seed,
- weights=joint_weights,
- survey=joint_survey,
- )
# Event-study `all_pass`. On the unweighted path, every implemented
# step must be conclusive AND none reject (Phase 3 convention). On
@@ -4109,21 +4430,28 @@ def did_had_pretest_workflow(
# already aggregated to per-unit (weights_unit / resolved_unit); the
# _aggregate_first_difference call above also collapses to per-unit
# (one row per unit), so weights_unit and resolved_unit are aligned.
+ # Internal forwarding uses the canonical survey_design= kwarg to skip
+ # deprecation warnings; the user-facing warning has already fired at the
+ # workflow's front door.
+ if resolved_unit is not None:
+ per_test_survey_design = resolved_unit
+ elif weights_unit is not None:
+ per_test_survey_design = make_pweight_design(weights_unit)
+ else:
+ per_test_survey_design = None
stute_res = stute_test(
d_arr,
dy_arr,
alpha=alpha,
n_bootstrap=n_bootstrap,
seed=seed,
- weights=weights_unit if resolved_unit is None else None,
- survey=resolved_unit,
+ survey_design=per_test_survey_design,
)
yatchew_res = yatchew_hr_test(
d_arr,
dy_arr,
alpha=alpha,
- weights=weights_unit if resolved_unit is None else None,
- survey=resolved_unit,
+ survey_design=per_test_survey_design,
)
# `all_pass` must be conclusive under the paper's four-step workflow
diff --git a/diff_diff/survey.py b/diff_diff/survey.py
index 155bce05..085c151b 100644
--- a/diff_diff/survey.py
+++ b/diff_diff/survey.py
@@ -678,22 +678,41 @@ def needs_survey_vcov(self) -> bool:
return True # Any resolved survey design uses the survey vcov path
-def _make_trivial_resolved(weights: np.ndarray) -> "ResolvedSurveyDesign":
- """Construct a trivial pweight-only ResolvedSurveyDesign (no strata/PSU/FPC).
-
- Used by survey-aware code paths invoked via a bare per-row ``weights``
- array (the pweight shortcut). Routing through this synthetic resolved
- design lets the same bootstrap / variance kernel handle both the
- ``weights=`` shortcut and the full ``survey=SurveyDesign(...)`` path
- uniformly. Mirrors the PR #363 synthetic-trivial-resolved pattern that
- fixed sup-t under the ``weights=`` shortcut on
- ``HeterogeneousAdoptionDiD.fit()``.
+def make_pweight_design(weights: np.ndarray) -> "ResolvedSurveyDesign":
+ """Construct a pweight-only ResolvedSurveyDesign from a raw weight array.
+
+ Use this on the array-in HAD pretest helpers (``stute_test``,
+ ``yatchew_hr_test``, ``stute_joint_pretest``) when the caller has only
+ a per-observation weight array and no PSU/strata/FPC structure::
+
+ from diff_diff import stute_test, make_pweight_design
+ result = stute_test(d, dy, survey_design=make_pweight_design(w))
+
+ For the data-in HAD surfaces (``HeterogeneousAdoptionDiD.fit``,
+ ``did_had_pretest_workflow``, ``joint_pretrends_test``,
+ ``joint_homogeneity_test``), prefer adding the weights as a column on
+ your dataframe and passing ``SurveyDesign(weights="col_name")`` instead;
+ those surfaces resolve column references against ``data`` at fit time
+ (the standard library convention used by ContinuousDiD, EfficientDiD,
+ and ChaisemartinDHaultfoeuille).
+
+ Internal note: this constructs a synthetic ``ResolvedSurveyDesign`` with
+ each observation as its own PSU and no strata/FPC, so PSU-level
+ multiplier-bootstrap kernels reduce bit-exactly to per-observation
+ Mammen draws while sharing the survey-aware code path with full PSU /
+ strata / FPC designs (mirrors the PR #363 synthetic-trivial-resolved
+ pattern).
Parameters
----------
weights : np.ndarray, shape (n_obs,)
- Per-observation positive weights. Caller is responsible for any
- non-negativity / per-unit-constancy validation.
+ Per-observation positive weights. Must be 1-D (shape ``(n_obs,)``);
+ scalars, 0-D arrays, and column-vector inputs (shape ``(n, 1)``)
+ raise ``ValueError`` at the front door. Caller is responsible for
+ any non-negativity / per-unit-constancy validation. Typical usage
+ is positional (``make_pweight_design(arr)``); the parameter name
+ ``weights`` collides linguistically with the deprecated
+ ``weights=`` kwarg on HAD surfaces, so prefer positional form.
Returns
-------
@@ -702,8 +721,23 @@ def _make_trivial_resolved(weights: np.ndarray) -> "ResolvedSurveyDesign":
``n_strata=0``, ``n_psu=n_obs`` (each observation is its own PSU
under the trivial design), ``lonely_psu="remove"``,
``replicate_weights=None``.
+
+ Raises
+ ------
+ ValueError
+ If ``weights`` is not 1-D (PR #376 R3 P1: catches scalar / 0-D /
+ column-vector inputs with a clear front-door message instead of
+ bubbling a low-level numpy or dataclass exception).
"""
w = np.asarray(weights, dtype=np.float64)
+ if w.ndim != 1:
+ raise ValueError(
+ f"make_pweight_design: weights must be 1-dimensional (1-D, shape "
+ f"(n_obs,)), got shape {w.shape}. Common mistakes: scalar / 0-D "
+ f"input (`make_pweight_design(1.0)`); column-vector "
+ f"(`make_pweight_design(df[['w']].to_numpy())` produces (n, 1) "
+ f"-- use `df['w'].to_numpy()` for (n,)); 2-D matrix input."
+ )
n_obs = int(w.shape[0])
return ResolvedSurveyDesign(
weights=w,
@@ -717,6 +751,70 @@ def _make_trivial_resolved(weights: np.ndarray) -> "ResolvedSurveyDesign":
)
+_make_trivial_resolved = make_pweight_design
+
+
+# Three-way mutex error messages for `survey_design=` / `survey=` / `weights=`
+# kwargs across the 8 HAD surfaces (HeterogeneousAdoptionDiD.fit +
+# did_had_pretest_workflow + 5 pretest helpers + qug_test). The migration
+# target text differs between data-in surfaces (which can resolve
+# ``SurveyDesign(weights="col_name")`` against ``data``) and array-in
+# surfaces (which take pre-resolved ``ResolvedSurveyDesign`` and use
+# ``make_pweight_design(arr)`` for the pweight-only convenience). Defined
+# here to avoid circular imports between had.py and had_pretests.py.
+HAD_DUAL_KNOB_MUTEX_MSG_DATA_IN = (
+ "Pass at most one of `survey_design=`, `survey=`, or `weights=`. "
+ "`survey=` and `weights=` are deprecated aliases of `survey_design=` "
+ "and will be removed in the next minor release. Prefer "
+ "`survey_design=SurveyDesign(weights='col_name', ...)`."
+)
+HAD_DUAL_KNOB_MUTEX_MSG_ARRAY_IN = (
+ "Pass at most one of `survey_design=`, `survey=`, or `weights=`. "
+ "`survey=` and `weights=` are deprecated aliases of `survey_design=` "
+ "and will be removed in the next minor release. Prefer "
+ "`survey_design=make_pweight_design(arr)` for pweight-only or "
+ "a pre-resolved `ResolvedSurveyDesign` passed as `survey_design=` "
+ "for full PSU/strata/FPC."
+)
+HAD_DEPRECATION_MSG_SURVEY_KWARG = (
+ "`survey=` is deprecated; use `survey_design=` instead "
+ "(same accepted types). Will be removed in the next minor release."
+)
+HAD_DEPRECATION_MSG_WEIGHTS_KWARG_DATA_IN = (
+ "`weights=np.ndarray` is deprecated; add the weights as a column on "
+ "`data` and pass `survey_design=SurveyDesign(weights='col_name')` "
+ "instead. Will be removed in the next minor release."
+)
+# PR #376 R11 P3: HAD.fit-specific weights= deprecation message — the
+# generic data-in suggestion above (use `survey_design=SurveyDesign(...)`)
+# is the long-term API target, but on `HeterogeneousAdoptionDiD.fit` the
+# two paths currently produce different SE families: the deprecated
+# `weights=np.ndarray` shortcut yields `variance_formula="pweight"` /
+# `"pweight_2sls"` (CCT-2014 weighted-robust / 2SLS pweight-sandwich)
+# while `survey_design=SurveyDesign(...)` yields `"survey_binder_tsl"` /
+# `"survey_binder_tsl_2sls"`. The next-minor cleanup (TODO row 102) will
+# unify the two; until then, document the SE-family caveat explicitly so
+# users know what changes when they migrate.
+HAD_DEPRECATION_MSG_WEIGHTS_KWARG_HAD_FIT = (
+ "`weights=np.ndarray` is deprecated on HeterogeneousAdoptionDiD.fit; "
+ "the long-term API is to add the weights as a column on `data` and "
+ "pass `survey_design=SurveyDesign(weights='col_name')`. Will be "
+ "removed in the next minor release. NOTE: in the current release the "
+ "two paths produce different SE families on this surface — the "
+ "`weights=` shortcut keeps the analytical CCT-2014 / 2SLS pweight-"
+ "sandwich (`variance_formula='pweight'` or `'pweight_2sls'`), while "
+ "`survey_design=SurveyDesign(...)` composes Binder-TSL "
+ "(`'survey_binder_tsl'` or `'survey_binder_tsl_2sls'`). The "
+ "long-term unification is tracked for the next minor release."
+)
+HAD_DEPRECATION_MSG_WEIGHTS_KWARG_ARRAY_IN = (
+ "`weights=np.ndarray` is deprecated on array-in pretest helpers; use "
+ "`survey_design=make_pweight_design(weights)` instead "
+ "(import `make_pweight_design` from `diff_diff`). Will be removed in "
+ "the next minor release."
+)
+
+
@dataclass
class SurveyMetadata:
"""
diff --git a/docs/methodology/REGISTRY.md b/docs/methodology/REGISTRY.md
index 77c62195..f5d1c596 100644
--- a/docs/methodology/REGISTRY.md
+++ b/docs/methodology/REGISTRY.md
@@ -2347,7 +2347,8 @@ Under `survey=SurveyDesign(weights, strata, psu, fpc)`, the variance composes vi
- **Note:** Monte Carlo oracle consistency — `tests/test_had_mc.py` validates that the weighted estimator recovers the oracle τ under informative sampling, with coverage near nominal and visible bias reduction vs unweighted. Slow-gated; 4 tests.
- **Note:** Auto-bandwidth selection (Phase 1b MSE-DPI via `lpbwselect_mse_dpi`) remains UNWEIGHTED in this phase; users who want a weight-aware bandwidth should pass `h`/`b` explicitly. The auto path with uniform weights reduces to the existing unweighted bandwidth selector, so the uniform-weights bit-parity chain is preserved.
- **Note:** Replicate-weight SurveyDesigns (BRR / Fay / JK1 / JKn / SDR) on the HAD continuous path raise `NotImplementedError` in this PR; Rao-Wu-style rescaled bootstrap is deferred to Phase 4.5 C (survey-under-pretests).
-- **Note:** `HeterogeneousAdoptionDiD.fit()` dispatch matrix after Phase 4.5 B — survey / weights are supported on ALL design × aggregate combinations (continuous × {overall, event-study}, mass-point × {overall, event-study}). Pretests (`qug_test`, `stute_test`, `yatchew_hr_test`, joint Stute variants, `did_had_pretest_workflow`) still do NOT accept `survey=` / `weights=` — deferred to Phase 4.5 C / C0 per reciprocal-guard discipline.
+- **Note:** `HeterogeneousAdoptionDiD.fit()` dispatch matrix after Phase 4.5 B + 4.5 C — survey/weights are supported on ALL design × aggregate combinations (continuous × {overall, event-study}, mass-point × {overall, event-study}). The HAD pretests (`qug_test`, `stute_test`, `yatchew_hr_test`, joint Stute variants, `did_had_pretest_workflow`) ship survey support in Phase 4.5 C (PR #370) — `qug_test` permanently rejects (Phase 4.5 C0 deferral; see "QUG Null Test" §); the linearity family supports pweight + PSU + FPC via PSU-level Mammen multipliers (Stute) + closed-form weighted variance components (Yatchew); replicate-weight and stratified designs raise `NotImplementedError` (parallel follow-ups). The canonical kwarg on all 8 HAD surfaces is `survey_design=` (see "Note (HAD survey-design API consolidation)" below); `survey=` / `weights=` remain accepted as deprecated aliases for one minor cycle.
+- **Note (HAD survey-design API consolidation):** All 8 HAD surfaces (`HeterogeneousAdoptionDiD.fit`, `did_had_pretest_workflow`, `qug_test`, `stute_test`, `yatchew_hr_test`, `stute_joint_pretest`, `joint_pretrends_test`, `joint_homogeneity_test`) accept the canonical kwarg `survey_design=` (matching `ContinuousDiD`, `EfficientDiD`, `ChaisemartinDHaultfoeuille`). The pre-existing dual `survey=` and `weights=` kwargs become deprecated aliases (`DeprecationWarning`); both will be removed in the next minor release. Internal back-end behavior is UNCHANGED (the legacy paths for `weights=np.ndarray` and `survey=SurveyDesign(...)` still execute the same code; only the entry signature wraps them). Mutex semantics extend from 2-way (`survey + weights`) to 3-way (`survey_design + survey + weights`): at most one may be non-None per call. Mutex error messages differ by surface group: data-in surfaces (HAD.fit + workflow + joint data-in wrappers) point users to `survey_design=SurveyDesign(weights='col_name', ...)`; the three array-in linearity helpers (`stute_test` / `yatchew_hr_test` / `stute_joint_pretest`) point to `survey_design=make_pweight_design(arr)` (for pweight-only) or a pre-resolved `ResolvedSurveyDesign` passed as `survey_design=` (for full PSU/strata/FPC). The 8th surface, `qug_test`, has its own qug-specific mutex message that does NOT advertise `make_pweight_design(arr)` as a migration target; QUG-under-survey is permanently rejected (Phase 4.5 C0 deferral, see "QUG Null Test" §) regardless of which kwarg variant the caller uses, so the migration path does not apply. Array-in helpers reject `survey_design=SurveyDesign(...)` with `TypeError` since they have no `data` to resolve column names against. The `make_pweight_design(weights: np.ndarray) -> ResolvedSurveyDesign` factory is exported from the `diff_diff` top level (formerly `survey._make_trivial_resolved`, kept as a permanent private alias for back-compat); `weights` must be 1-D (scalar / 0-D / column-vector inputs raise `ValueError` at the front door).
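The three-way mutex and the alias deprecation described in the note above can be sketched as a single front-door check. This is a minimal illustration only: `resolve_survey_kwarg` is a hypothetical helper name, not a function in `diff_diff`, and the message strings are abbreviated. The library itself routes the deprecated kwargs through its unchanged legacy back-end paths, which this sketch does not model.

```python
import warnings


def resolve_survey_kwarg(survey_design=None, survey=None, weights=None):
    """Hypothetical front-door shim (not part of diff_diff).

    Returns the single non-None argument, or None if all three are None.
    Raises ValueError when more than one is supplied, and emits a
    DeprecationWarning when a deprecated alias is used.
    """
    supplied = {
        name: value
        for name, value in (
            ("survey_design", survey_design),
            ("survey", survey),
            ("weights", weights),
        )
        if value is not None
    }
    # Three-way mutex: at most one of the kwargs may be non-None per call.
    if len(supplied) > 1:
        raise ValueError(
            "Pass at most one of `survey_design=`, `survey=`, or `weights=`."
        )
    if not supplied:
        return None
    name, value = next(iter(supplied.items()))
    # Only the canonical kwarg is warning-free.
    if name != "survey_design":
        warnings.warn(
            f"`{name}=` is deprecated; use `survey_design=` instead.",
            DeprecationWarning,
            stacklevel=2,
        )
    return value
```

Running the mutex check before the deprecation warning means a call that mixes kwargs fails with `ValueError` without also emitting a warning; whether the library orders the two checks this way is an assumption of the sketch.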
*Weighted 2SLS (Phase 4.5 B):* `_fit_mass_point_2sls(..., weights=, return_influence=)` extends the Wald-IV / 2SLS sandwich with pweight semantics:
- **Weighted bread**: `Z'WX = Z'·diag(w)·X` (`w¹`, matches `estimatr::iv_robust(..., weights=)` weighted-bread convention).
@@ -2431,8 +2432,8 @@ Tuning-parameter-free test of `H_0: d̲ = 0` versus `H_1: d̲ > 0`. Shipped in `
3. **The literature on EVT under unequal-probability sampling is sparse.** Quintos et al. (2001) and Beirlant et al. cover tail-INDEX estimation under unequal sample sizes. There is no off-the-shelf method for "test the support endpoint under complex sampling" in the standard survey-statistics toolkit. Adapting Hill / Pickands / DEdH estimators to the boundary problem would be novel research, not engineering. The de Chaisemartin et al. (2026) paper itself does not discuss survey extensions of QUG.
The survey-compatible alternative for HAD pretesting is **joint Stute** (a CvM cusum of regression residuals) — a smooth functional of the empirical CDF for which Krieger-Pfeffermann (1997) + a survey-aware multiplier bootstrap give a calibrated test. Phase 4.5 C (PR #370) ships survey support for the linearity family — the **PSU-level Mammen multiplier bootstrap** for `stute_test` and the joint variants (NOT Rao-Wu rescaling — multiplier bootstrap is a different mechanism), and **closed-form weighted OLS + pweight-sandwich variance components** for `yatchew_hr_test`. See the dedicated Note (Phase 4.5 C) below for the full algorithm.
**Research direction (out of scope for diff-diff):** the bridge IS sketchable by combining (a) endpoint-estimation EVT under iid (Hall 1982, Aarssen-de Haan 1994, Hall-Wang 1999, Beirlant-de Wet-Goegebeur 2006); (b) survey-aware functional CLT for the empirical process (Boistard-Lopuhaä-Ruiz-Gazen 2017, Bertail-Chautru-Clémençon 2017); and (c) tail-empirical-process theory (Drees 2003) to define a "design-effective boundary intensity" `λ_eff = Σ_h W_h · f_h(0+)`. Under a "no boundary clumping" assumption (`P(D_{(1)}, D_{(2)}` in same PSU `| both ≤ δ) → 0`), the `Exp(1)/Exp(1)` limit law's pivotality is preserved and only the calibration needs a survey-aware bootstrap (subsampling within strata per Politis-Romano-Wolf, or Bertail et al.'s design-aware bootstrap). This is publishable methodology research — one paper, ~6-12 months for a methods PhD student. If the bridge gets built and published externally, this gate can be revisited.
-- **Note (Phase 4.5 C):** `stute_test`, `yatchew_hr_test`, `stute_joint_pretest`, `joint_pretrends_test`, `joint_homogeneity_test`, and `did_had_pretest_workflow` accept `weights=` and `survey=ResolvedSurveyDesign` kwargs (or `survey=SurveyDesign` for the data-in entries). Mechanism varies by test:
- - **Stute family** (`stute_test`, `stute_joint_pretest`, joint wrappers) uses **PSU-level Mammen multiplier bootstrap** via `bootstrap_utils.generate_survey_multiplier_weights_batch` (the same kernel as PR #363's HAD event-study sup-t bootstrap). Each replicate draws an `(n_bootstrap, n_psu)` Mammen multiplier matrix; multipliers broadcast to per-obs perturbation `eta_obs[g] = eta_psu[psu(g)]`. The bootstrap residual perturbation is `dy_b = fitted + eps * eta_obs` (paper Appendix D wild-bootstrap form — multipliers attach to UNWEIGHTED residuals; the weighting flows through the OLS refit + the weighted CvM, NOT through the perturbation step). Followed by weighted OLS refit (`_fit_weighted_ols_intercept_slope`) and weighted CvM recompute via `_cvm_statistic_weighted`. Joint Stute SHARES the multiplier matrix across horizons within each replicate, preserving both the vector-valued empirical-process unit-level dependence (Delgado 1993; Escanciano 2006) AND PSU clustering (Krieger-Pfeffermann 1997). PSU-shared multipliers are conservative under no-within-PSU outcome correlation (over-clustering gives conservative size in finite samples), asymptotically correct under the standard survey assumption that PSU is the ultimate sampling unit AND outcomes correlate within PSU. The pweight `weights=` shortcut routes through a synthetic trivial `ResolvedSurveyDesign` (constructed via `survey._make_trivial_resolved`) so the kernel is shared across both entry paths. NOT "Rao-Wu rescaled bootstrap" — different mechanism (the Rao-Wu kernel rescales per-unit weights via stratified PSU resampling, while this kernel applies multipliers without resampling).
+- **Note (Phase 4.5 C):** `stute_test`, `yatchew_hr_test`, `stute_joint_pretest`, `joint_pretrends_test`, `joint_homogeneity_test`, and `did_had_pretest_workflow` accept `survey_design=` (canonical) plus the deprecated aliases `survey=` and `weights=` (`DeprecationWarning`, removal in the next minor release; see "Note (HAD survey-design API consolidation)" above). On data-in surfaces (`did_had_pretest_workflow`, `joint_pretrends_test`, `joint_homogeneity_test`), `survey_design=` accepts a `SurveyDesign` (resolved against `data` at fit time). On array-in surfaces (`stute_test`, `yatchew_hr_test`, `stute_joint_pretest`), `survey_design=` accepts a pre-resolved `ResolvedSurveyDesign`; for the pweight-only convenience, construct via `survey_design=make_pweight_design(arr)` (`make_pweight_design` exported from the `diff_diff` top level). Mechanism varies by test:
+ - **Stute family** (`stute_test`, `stute_joint_pretest`, joint wrappers) uses **PSU-level Mammen multiplier bootstrap** via `bootstrap_utils.generate_survey_multiplier_weights_batch` (the same kernel as PR #363's HAD event-study sup-t bootstrap). The kernel draws an `(n_bootstrap, n_psu)` Mammen multiplier matrix, one row per replicate; multipliers broadcast to per-obs perturbation `eta_obs[g] = eta_psu[psu(g)]`. The bootstrap residual perturbation is `dy_b = fitted + eps * eta_obs` (paper Appendix D wild-bootstrap form: multipliers attach to UNWEIGHTED residuals; the weighting flows through the OLS refit + the weighted CvM, NOT through the perturbation step). Each replicate then refits weighted OLS (`_fit_weighted_ols_intercept_slope`) and recomputes the weighted CvM via `_cvm_statistic_weighted`. Joint Stute SHARES the multiplier matrix across horizons within each replicate, preserving both the vector-valued empirical-process unit-level dependence (Delgado 1993; Escanciano 2006) AND PSU clustering (Krieger-Pfeffermann 1997). PSU-shared multipliers are conservative under no-within-PSU outcome correlation (over-clustering gives conservative size in finite samples), and asymptotically correct under the standard survey assumption that PSU is the ultimate sampling unit AND outcomes correlate within PSU. The pweight-only entry (`survey_design=make_pweight_design(arr)`, or the deprecated `weights=arr` alias) routes through a synthetic trivial `ResolvedSurveyDesign` (constructed via `make_pweight_design`, the public name for the formerly private `survey._make_trivial_resolved`, which remains as a private alias) so the kernel is shared across both entry paths. NOT "Rao-Wu rescaled bootstrap": that is a different mechanism (the Rao-Wu kernel rescales per-unit weights via stratified PSU resampling, while this kernel applies multipliers without resampling).
- **Yatchew** (`yatchew_hr_test`) uses **closed-form weighted OLS + pweight-sandwich variance components** (no bootstrap). All three components reduce bit-exactly to the unweighted formulas at `w=ones(G)` (locked at `atol=1e-14` in `TestYatchewHRTestSurvey::test_weighted_reduces_to_unweighted_at_uniform_weights`):
- `sigma2_lin = sum(w * eps^2) / sum(w)` (weighted OLS residual variance).
- `sigma2_diff = sum(w_avg * (dy_g - dy_{g-1})^2) / (2 * sum(w))` with arithmetic-mean pair weights `w_avg_g = (w_g + w_{g-1})/2`. Divisor uses `sum(w)` (=G at `w=1`), NOT `sum(w_avg)`, to match the existing `(1/(2G))` unweighted formula at `had_pretests.py:1635`.
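The two weighted variance components listed above can be checked numerically with a small pure-Python sketch. The function names `sigma2_lin` and `sigma2_diff` are hypothetical stand-ins that mirror the formula labels, not library functions, and the sketch assumes `dy` and `w` are already ordered by dose so that `dy_g - dy_{g-1}` is a difference between dose-adjacent units (the ordering step is an assumption here; it is not shown in this excerpt).

```python
def sigma2_lin(eps, w):
    """Weighted OLS residual variance: sum(w * eps^2) / sum(w)."""
    return sum(wi * ei ** 2 for wi, ei in zip(w, eps)) / sum(w)


def sigma2_diff(dy_sorted, w_sorted):
    """Weighted first-difference variance.

    Uses arithmetic-mean pair weights w_avg_g = (w_g + w_{g-1}) / 2 and
    divides by 2 * sum(w) (not 2 * sum(w_avg)), so that uniform weights
    recover the unweighted 1 / (2G) normalisation.
    Inputs are assumed to be pre-sorted by dose.
    """
    numerator = 0.0
    for g in range(1, len(dy_sorted)):
        w_avg = 0.5 * (w_sorted[g] + w_sorted[g - 1])
        numerator += w_avg * (dy_sorted[g] - dy_sorted[g - 1]) ** 2
    return numerator / (2.0 * sum(w_sorted))


# Uniform weights reduce to the unweighted formulas.
dy = [1.0, 2.0, 4.0]
eps = [1.0, -1.0, 2.0]
ones = [1.0, 1.0, 1.0]
unweighted_lin = sum(e ** 2 for e in eps) / len(eps)
unweighted_diff = sum(
    (dy[g] - dy[g - 1]) ** 2 for g in range(1, len(dy))
) / (2 * len(dy))
```

Because both the numerator and the divisor are linear in `w`, multiplying every weight by a constant leaves both components unchanged up to floating-point rounding, which is consistent with the scale-invariance tests elsewhere in this PR.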
diff --git a/tests/test_had.py b/tests/test_had.py
index b499cf3c..a7b5d79a 100644
--- a/tests/test_had.py
+++ b/tests/test_had.py
@@ -972,13 +972,16 @@ def test_aggregate_invalid_raises(self):
)
def test_survey_bad_type_raises(self):
- """survey= must be a SurveyDesign-like object with a .weights
- attribute; a bare string (or any object lacking .weights) raises
- TypeError front-door."""
+ """survey= must be a SurveyDesign-like object with a `.resolve()`
+ method; a bare string (or any object lacking `.resolve()`) raises
+ TypeError front-door. Updated PR #376 R8 P1: the data-in type
+ guard now runs at the canonical entry and rejects on the
+ `hasattr(survey, "resolve")` check (which catches both bare
+ strings and ResolvedSurveyDesign / make_pweight_design output)."""
d, dy = _dgp_continuous_at_zero(200, seed=0)
panel = _make_panel(d, dy)
est = HeterogeneousAdoptionDiD()
- with pytest.raises(TypeError, match="SurveyDesign-like"):
+ with pytest.raises(TypeError, match="SurveyDesign"):
est.fit(
panel,
"outcome",
@@ -3389,7 +3392,7 @@ def test_survey_and_weights_mutex(self):
panel_with_w = panel.assign(w=row_w)
sd = SurveyDesign(weights="w")
est = HeterogeneousAdoptionDiD(design="continuous_at_zero")
- with pytest.raises(ValueError, match="OR weights"):
+ with pytest.raises(ValueError, match="at most one of"):
est.fit(
panel_with_w,
"outcome",
diff --git a/tests/test_had_dual_knob_deprecation.py b/tests/test_had_dual_knob_deprecation.py
new file mode 100644
index 00000000..d66e65fb
--- /dev/null
+++ b/tests/test_had_dual_knob_deprecation.py
@@ -0,0 +1,1312 @@
+"""Tests for HAD survey_design= consolidation + soft deprecation cycle.
+
+Covers all 8 HAD surfaces (HAD.fit + did_had_pretest_workflow + 4 array-in
+pretests + 2 data-in joint wrappers) per the consolidation plan
+(`whimsical-brewing-liskov.md`). Each surface gets:
+
+1. survey_design= positive smoke (new kwarg accepted, finite output).
+2. weights= deprecation warning (DeprecationWarning emitted; back-compat
+ numerics preserved).
+3. survey= deprecation warning (DeprecationWarning emitted; back-compat
+ numerics preserved).
+4. Numerical parity legacy ≡ new at atol=0 (skipped on qug_test, which
+ raises NotImplementedError on all paths).
+5. Three-way mutex ValueError (any 2-of-3 combo).
+
+Plus surface-spanning tests:
+- make_pweight_design importable from diff_diff top-level.
+- make_pweight_design ≡ _make_trivial_resolved (private alias).
+- Array-in helpers reject SurveyDesign (TypeError).
+- Bit-exact normalization-order invariant (scale-invariance).
+- qug_test surface symmetry (signature consistent with siblings).
+"""
+
+import warnings
+
+import numpy as np
+import pandas as pd
+import pytest
+
+from diff_diff import (
+ HeterogeneousAdoptionDiD,
+ SurveyDesign,
+ did_had_pretest_workflow,
+ joint_homogeneity_test,
+ joint_pretrends_test,
+ make_pweight_design,
+ qug_test,
+ stute_joint_pretest,
+ stute_test,
+ yatchew_hr_test,
+)
+from diff_diff.survey import ResolvedSurveyDesign
+
+# =============================================================================
+# Fixtures
+# =============================================================================
+
+
+@pytest.fixture
+def array_in_data():
+ """Simple (d, dy) arrays for the 3 numeric array-in helpers."""
+ rng = np.random.default_rng(0)
+ G = 30
+ d = rng.uniform(0, 1, size=G)
+ dy = 0.5 + 1.5 * d + rng.normal(0, 0.3, size=G)
+ return d, dy
+
+
+@pytest.fixture
+def array_in_doses():
+ """Just doses for qug_test (single-array)."""
+ return np.array([0.1, 0.3, 0.5, 0.7, 0.9])
+
+
+@pytest.fixture
+def two_period_panel():
+ """Two-period panel for HAD.fit + did_had_pretest_workflow on
+ aggregate='overall'. G=200 units, T=2 periods, dose constant within unit,
+ Beta(0.5, 1) draws so d.min() approaches 0 (boundary at 0 satisfied for
+ Design 1' continuous_at_zero)."""
+ rng = np.random.default_rng(1)
+ G = 200
+ # Beta(0.5, 1) puts mass near 0; d.min() will be very small relative to
+ # median, satisfying the Design 1' boundary heuristic.
+ d = rng.beta(0.5, 1.0, size=G)
+ rows = []
+ for g in range(G):
+ for t in (0, 1):
+ y = 0.0 if t == 0 else d[g] * 1.2 + rng.normal(0, 0.1)
+ rows.append({"unit": g, "time": t, "y": y, "d": (0.0 if t == 0 else d[g])})
+ df = pd.DataFrame(rows)
+ df["w"] = 1.0 # uniform weight column for SurveyDesign(weights="w")
+ return df
+
+
+@pytest.fixture
+def event_study_panel():
+ """Multi-period panel for joint_pretrends/joint_homogeneity workflows."""
+ rng = np.random.default_rng(2)
+ G = 30
+ rows = []
+ F = 2
+ for g in range(G):
+ d_g = rng.uniform(0.0, 1.0)
+ for t in range(4):
+ d_t = 0.0 if t < F else d_g
+ y = (0.0 if t < F else d_t * 1.5) + rng.normal(0, 0.15)
+ rows.append({"unit": g, "time": t, "y": y, "d": d_t})
+ df = pd.DataFrame(rows)
+ df["w"] = 1.0
+ return df
+
+
+# =============================================================================
+# 1. Surface-spanning tests
+# =============================================================================
+
+
+class TestPublicHelpers:
+ def test_make_pweight_design_export(self):
+ """make_pweight_design is importable from the diff_diff top level."""
+ from diff_diff import make_pweight_design as mpd
+
+ assert mpd is make_pweight_design
+
+ def test_make_pweight_design_returns_resolved(self):
+ w = np.array([1.0, 2.0, 3.0, 4.0])
+ resolved = make_pweight_design(w)
+ assert isinstance(resolved, ResolvedSurveyDesign)
+ assert resolved.weight_type == "pweight"
+ assert resolved.strata is None
+ assert resolved.psu is None
+ assert resolved.fpc is None
+ assert resolved.replicate_weights is None
+ assert resolved.n_strata == 0
+ assert resolved.n_psu == 4
+ assert np.array_equal(resolved.weights, w.astype(np.float64))
+
+ def test_make_pweight_design_eq_underscore_alias(self):
+ """Permanent private alias _make_trivial_resolved IS make_pweight_design."""
+ from diff_diff.survey import _make_trivial_resolved
+
+ assert _make_trivial_resolved is make_pweight_design
+
+ def test_make_pweight_design_rejects_scalar(self):
+ """PR #376 R3 P1: scalar / 0-D inputs raise a clear front-door
+ ValueError instead of bubbling a low-level numpy or dataclass
+ exception (was: `1.0` would fail at `int(w.shape[0])` with
+ `IndexError: tuple index out of range`)."""
+ with pytest.raises(ValueError, match="weights must be 1-dimensional"):
+ make_pweight_design(1.0)
+
+ def test_make_pweight_design_rejects_zero_d_array(self):
+ """PR #376 R3 P1: `np.array(1.0)` (0-D ndarray) raises ValueError."""
+ with pytest.raises(ValueError, match="weights must be 1-dimensional"):
+ make_pweight_design(np.array(1.0))
+
+ def test_make_pweight_design_rejects_column_vector(self):
+ """PR #376 R3 P1: `(n, 1)` column vectors raise ValueError pointing
+ users to `df['w'].to_numpy()` instead of `df[['w']].to_numpy()`."""
+ with pytest.raises(ValueError, match="weights must be 1-dimensional"):
+ make_pweight_design(np.ones((5, 1)))
+
+ def test_array_in_helpers_legacy_weights_scalar_raises_value_error(self, array_in_data):
+ """PR #376 R3 P1: deprecated `weights=scalar` on array-in helpers
+ also raises ValueError (the shim routes through make_pweight_design,
+ which catches scalars at its front door)."""
+ d, dy = array_in_data
+ with pytest.raises(ValueError, match="weights must be 1-dimensional"):
+ with warnings.catch_warnings():
+ warnings.simplefilter("ignore", DeprecationWarning)
+ stute_test(d, dy, weights=1.0, n_bootstrap=199, seed=0)
+ with pytest.raises(ValueError, match="weights must be 1-dimensional"):
+ with warnings.catch_warnings():
+ warnings.simplefilter("ignore", DeprecationWarning)
+ yatchew_hr_test(d, dy, weights=1.0)
+
+ def test_qug_test_legacy_weights_scalar_raises_value_error(self, array_in_doses):
+ """PR #376 R3 P1: deprecated `weights=scalar` on qug_test also raises
+ ValueError (the shim routes through make_pweight_design before the
+ NotImplementedError gate)."""
+ with pytest.raises(ValueError, match="weights must be 1-dimensional"):
+ with warnings.catch_warnings():
+ warnings.simplefilter("ignore", DeprecationWarning)
+ qug_test(array_in_doses, weights=1.0)
+
+
+class TestArrayInTypeGuard:
+ """Array-in helpers reject SurveyDesign (cannot resolve column names).
+
+ Both the canonical `survey_design=SurveyDesign(...)` form AND the
+ deprecated `survey=SurveyDesign(...)` alias trigger the same TypeError
+ (PR #376 R1 P1: alias must behave identically to the canonical kwarg).
+ """
+
+ def test_stute_test_rejects_SurveyDesign(self, array_in_data):
+ d, dy = array_in_data
+ with pytest.raises(TypeError, match="make_pweight_design"):
+ stute_test(d, dy, survey_design=SurveyDesign(weights="w"), n_bootstrap=199, seed=0)
+
+ def test_stute_test_rejects_SurveyDesign_via_legacy_alias(self, array_in_data):
+ """PR #376 R1 P1: `survey=SurveyDesign(...)` (deprecated alias) must
+ trigger the same TypeError as `survey_design=SurveyDesign(...)`."""
+ d, dy = array_in_data
+ with pytest.raises(TypeError, match="make_pweight_design"):
+ with warnings.catch_warnings():
+ warnings.simplefilter("ignore", DeprecationWarning)
+ stute_test(d, dy, survey=SurveyDesign(weights="w"), n_bootstrap=199, seed=0)
+
+ def test_yatchew_hr_test_rejects_SurveyDesign(self, array_in_data):
+ d, dy = array_in_data
+ with pytest.raises(TypeError, match="make_pweight_design"):
+ yatchew_hr_test(d, dy, survey_design=SurveyDesign(weights="w"))
+
+ def test_yatchew_hr_test_rejects_SurveyDesign_via_legacy_alias(self, array_in_data):
+ """PR #376 R1 P1: alias parity with canonical kwarg."""
+ d, dy = array_in_data
+ with pytest.raises(TypeError, match="make_pweight_design"):
+ with warnings.catch_warnings():
+ warnings.simplefilter("ignore", DeprecationWarning)
+ yatchew_hr_test(d, dy, survey=SurveyDesign(weights="w"))
+
+ def test_stute_joint_pretest_rejects_SurveyDesign(self):
+ rng = np.random.default_rng(3)
+ G = 30
+ d = rng.uniform(0, 1, size=G)
+ residuals = {0: rng.normal(0, 0.1, G)}
+ fitted = {0: np.zeros(G)}
+ X = np.column_stack([np.ones(G), d])
+ with pytest.raises(TypeError, match="make_pweight_design"):
+ stute_joint_pretest(
+ residuals_by_horizon=residuals,
+ fitted_by_horizon=fitted,
+ doses=d,
+ design_matrix=X,
+ survey_design=SurveyDesign(weights="w"),
+ n_bootstrap=199,
+ seed=0,
+ )
+
+ def test_stute_joint_pretest_rejects_SurveyDesign_via_legacy_alias(self):
+ """PR #376 R1 P1: alias parity with canonical kwarg."""
+ rng = np.random.default_rng(3)
+ G = 30
+ d = rng.uniform(0, 1, size=G)
+ residuals = {0: rng.normal(0, 0.1, G)}
+ fitted = {0: np.zeros(G)}
+ X = np.column_stack([np.ones(G), d])
+ with pytest.raises(TypeError, match="make_pweight_design"):
+ with warnings.catch_warnings():
+ warnings.simplefilter("ignore", DeprecationWarning)
+ stute_joint_pretest(
+ residuals_by_horizon=residuals,
+ fitted_by_horizon=fitted,
+ doses=d,
+ design_matrix=X,
+ survey=SurveyDesign(weights="w"),
+ n_bootstrap=199,
+ seed=0,
+ )
+
+
+class TestScaleInvariance:
+ """Bit-exact normalization-order invariant (Stability invariant #7).
+
+ The legacy weights= deprecation shim binds
+ `survey_design = make_pweight_design(weights_unnormalized)` and lets
+ the unified survey_design= path apply the mean=1 normalization step
+ EXACTLY ONCE downstream. If the shim pre-normalized AND the unified
+ path also normalized, the test statistic would scale differently
+ under multiplicative weight rescaling.
+ """
+
+ def test_stute_weights_alias_scale_invariant(self, array_in_data):
+ d, dy = array_in_data
+ w = np.random.default_rng(4).uniform(0.5, 1.5, size=30)
+ with warnings.catch_warnings():
+ warnings.simplefilter("ignore", DeprecationWarning)
+ r1 = stute_test(d, dy, weights=w, n_bootstrap=199, seed=0)
+ r2 = stute_test(d, dy, weights=w * 100.0, n_bootstrap=199, seed=0)
+ # Use atol/rtol=1e-14 (per `feedback_assert_allclose_numerical_parity`):
+ # the mean=1 normalization step `w * G/sum(w)` produces results that
+ # agree to ~16 significant figures but not bit-exactly across
+ # multiplicative rescaling (FP rounding in the renormalization step).
+ np.testing.assert_allclose(r1.cvm_stat, r2.cvm_stat, atol=1e-14, rtol=1e-14)
+
+ def test_yatchew_weights_alias_scale_invariant(self, array_in_data):
+ d, dy = array_in_data
+ w = np.random.default_rng(5).uniform(0.5, 1.5, size=30)
+ with warnings.catch_warnings():
+ warnings.simplefilter("ignore", DeprecationWarning)
+ r1 = yatchew_hr_test(d, dy, weights=w)
+ r2 = yatchew_hr_test(d, dy, weights=w * 100.0)
+ np.testing.assert_allclose(r1.t_stat_hr, r2.t_stat_hr, atol=1e-14, rtol=1e-14)
+
+
+# =============================================================================
+# 2. Per-surface deprecation + parity tests
+# =============================================================================
+
+
+class TestQUGTestDeprecation:
+ """qug_test (array-in, gated): every non-None survey_design / survey / weights
+ value raises NotImplementedError; tests focus on the deprecation/mutex cascade."""
+
+ def test_survey_design_kwarg_raises_notimpl(self, array_in_doses):
+ with pytest.raises(NotImplementedError, match="QUG"):
+ qug_test(array_in_doses, survey_design=make_pweight_design(np.ones(5)))
+
+ def test_weights_emits_deprecation_warning(self, array_in_doses):
+ with pytest.warns(DeprecationWarning, match="weights=.*deprecated"):
+ with pytest.raises(NotImplementedError):
+ qug_test(array_in_doses, weights=np.ones(5))
+
+ def test_survey_emits_deprecation_warning(self, array_in_doses):
+ with pytest.warns(DeprecationWarning, match="survey=.*deprecated"):
+ with pytest.raises(NotImplementedError):
+ qug_test(array_in_doses, survey=SurveyDesign(weights="w"))
+
+ def test_three_way_mutex_design_plus_survey(self, array_in_doses):
+ with pytest.raises(ValueError, match="at most one of"):
+ qug_test(
+ array_in_doses,
+ survey_design=make_pweight_design(np.ones(5)),
+ survey=SurveyDesign(weights="w"),
+ )
+
+ def test_three_way_mutex_design_plus_weights(self, array_in_doses):
+ with pytest.raises(ValueError, match="at most one of"):
+ qug_test(
+ array_in_doses,
+ survey_design=make_pweight_design(np.ones(5)),
+ weights=np.ones(5),
+ )
+
+ def test_three_way_mutex_all_three(self, array_in_doses):
+ with pytest.raises(ValueError, match="at most one of"):
+ qug_test(
+ array_in_doses,
+ survey_design=make_pweight_design(np.ones(5)),
+ survey=SurveyDesign(weights="w"),
+ weights=np.ones(5),
+ )
+
+
+class TestStuteTestDeprecation:
+ def test_survey_design_kwarg_smoke(self, array_in_data):
+ d, dy = array_in_data
+ w = np.ones(30)
+ r = stute_test(d, dy, survey_design=make_pweight_design(w), n_bootstrap=199, seed=0)
+ assert np.isfinite(r.cvm_stat)
+ assert 0.0 <= r.p_value <= 1.0
+
+ def test_weights_emits_deprecation_warning(self, array_in_data):
+ d, dy = array_in_data
+ with pytest.warns(DeprecationWarning, match="weights=.*deprecated"):
+ stute_test(d, dy, weights=np.ones(30), n_bootstrap=199, seed=0)
+
+ def test_survey_emits_deprecation_warning(self, array_in_data):
+ d, dy = array_in_data
+ with pytest.warns(DeprecationWarning, match="survey=.*deprecated"):
+ stute_test(
+ d,
+ dy,
+ survey=make_pweight_design(np.ones(30)),
+ n_bootstrap=199,
+ seed=0,
+ )
+
+ def test_numerical_parity_weights_legacy_eq_new(self, array_in_data):
+ d, dy = array_in_data
+ w = np.random.default_rng(7).uniform(0.5, 1.5, size=30)
+ with warnings.catch_warnings():
+ warnings.simplefilter("ignore", DeprecationWarning)
+ r_legacy = stute_test(d, dy, weights=w, n_bootstrap=199, seed=0)
+ r_new = stute_test(d, dy, survey_design=make_pweight_design(w), n_bootstrap=199, seed=0)
+ assert r_legacy.cvm_stat == r_new.cvm_stat
+ assert r_legacy.p_value == r_new.p_value
+
+ def test_numerical_parity_survey_legacy_eq_new(self, array_in_data):
+ d, dy = array_in_data
+ w = np.random.default_rng(8).uniform(0.5, 1.5, size=30)
+ resolved = make_pweight_design(w)
+ with warnings.catch_warnings():
+ warnings.simplefilter("ignore", DeprecationWarning)
+ r_legacy = stute_test(d, dy, survey=resolved, n_bootstrap=199, seed=0)
+ r_new = stute_test(d, dy, survey_design=resolved, n_bootstrap=199, seed=0)
+ assert r_legacy.cvm_stat == r_new.cvm_stat
+ assert r_legacy.p_value == r_new.p_value
+
+ def test_three_way_mutex_design_plus_survey(self, array_in_data):
+ d, dy = array_in_data
+ w = np.ones(30)
+ with pytest.raises(ValueError, match="at most one of"):
+ stute_test(
+ d,
+ dy,
+ survey_design=make_pweight_design(w),
+ survey=make_pweight_design(w),
+ n_bootstrap=199,
+ seed=0,
+ )
+
+ def test_three_way_mutex_all_three(self, array_in_data):
+ d, dy = array_in_data
+ w = np.ones(30)
+ with pytest.raises(ValueError, match="at most one of"):
+ stute_test(
+ d,
+ dy,
+ survey_design=make_pweight_design(w),
+ survey=make_pweight_design(w),
+ weights=w,
+ n_bootstrap=199,
+ seed=0,
+ )
+
+
+class TestYatchewHRTestDeprecation:
+ def test_survey_design_kwarg_smoke(self, array_in_data):
+ d, dy = array_in_data
+ w = np.ones(30)
+ r = yatchew_hr_test(d, dy, survey_design=make_pweight_design(w))
+ assert np.isfinite(r.t_stat_hr)
+
+ def test_weights_emits_deprecation_warning(self, array_in_data):
+ d, dy = array_in_data
+ with pytest.warns(DeprecationWarning, match="weights=.*deprecated"):
+ yatchew_hr_test(d, dy, weights=np.ones(30))
+
+ def test_survey_emits_deprecation_warning(self, array_in_data):
+ d, dy = array_in_data
+ with pytest.warns(DeprecationWarning, match="survey=.*deprecated"):
+ yatchew_hr_test(d, dy, survey=make_pweight_design(np.ones(30)))
+
+ def test_numerical_parity_weights_legacy_eq_new(self, array_in_data):
+ d, dy = array_in_data
+ w = np.random.default_rng(9).uniform(0.5, 1.5, size=30)
+ with warnings.catch_warnings():
+ warnings.simplefilter("ignore", DeprecationWarning)
+ r_legacy = yatchew_hr_test(d, dy, weights=w)
+ r_new = yatchew_hr_test(d, dy, survey_design=make_pweight_design(w))
+ assert r_legacy.t_stat_hr == r_new.t_stat_hr
+ assert r_legacy.p_value == r_new.p_value
+
+ def test_three_way_mutex_design_plus_weights(self, array_in_data):
+ d, dy = array_in_data
+ with pytest.raises(ValueError, match="at most one of"):
+ yatchew_hr_test(
+ d,
+ dy,
+ survey_design=make_pweight_design(np.ones(30)),
+ weights=np.ones(30),
+ )
+
+
+class TestStuteJointPretestDeprecation:
+ def _setup(self):
+ rng = np.random.default_rng(10)
+ G = 30
+ d = rng.uniform(0, 1, size=G)
+ residuals = {0: rng.normal(0, 0.1, G), 1: rng.normal(0, 0.1, G)}
+ fitted = {0: np.zeros(G), 1: np.zeros(G)}
+ X = np.column_stack([np.ones(G), d])
+ return d, residuals, fitted, X
+
+ def test_survey_design_kwarg_smoke(self):
+ d, residuals, fitted, X = self._setup()
+ w = np.ones(30)
+ r = stute_joint_pretest(
+ residuals_by_horizon=residuals,
+ fitted_by_horizon=fitted,
+ doses=d,
+ design_matrix=X,
+ survey_design=make_pweight_design(w),
+ n_bootstrap=199,
+ seed=0,
+ )
+ assert np.isfinite(r.cvm_stat_joint)
+
+ def test_weights_emits_deprecation_warning(self):
+ d, residuals, fitted, X = self._setup()
+ with pytest.warns(DeprecationWarning, match="weights=.*deprecated"):
+ stute_joint_pretest(
+ residuals_by_horizon=residuals,
+ fitted_by_horizon=fitted,
+ doses=d,
+ design_matrix=X,
+ weights=np.ones(30),
+ n_bootstrap=199,
+ seed=0,
+ )
+
+ def test_survey_emits_deprecation_warning(self):
+ d, residuals, fitted, X = self._setup()
+ with pytest.warns(DeprecationWarning, match="survey=.*deprecated"):
+ stute_joint_pretest(
+ residuals_by_horizon=residuals,
+ fitted_by_horizon=fitted,
+ doses=d,
+ design_matrix=X,
+ survey=make_pweight_design(np.ones(30)),
+ n_bootstrap=199,
+ seed=0,
+ )
+
+ def test_numerical_parity_weights_legacy_eq_new(self):
+ d, residuals, fitted, X = self._setup()
+ w = np.random.default_rng(11).uniform(0.5, 1.5, size=30)
+ with warnings.catch_warnings():
+ warnings.simplefilter("ignore", DeprecationWarning)
+ r_legacy = stute_joint_pretest(
+ residuals_by_horizon=residuals,
+ fitted_by_horizon=fitted,
+ doses=d,
+ design_matrix=X,
+ weights=w,
+ n_bootstrap=199,
+ seed=0,
+ )
+ r_new = stute_joint_pretest(
+ residuals_by_horizon=residuals,
+ fitted_by_horizon=fitted,
+ doses=d,
+ design_matrix=X,
+ survey_design=make_pweight_design(w),
+ n_bootstrap=199,
+ seed=0,
+ )
+ assert r_legacy.cvm_stat_joint == r_new.cvm_stat_joint
+ assert r_legacy.p_value == r_new.p_value
+
+ def test_three_way_mutex_all_three(self):
+ d, residuals, fitted, X = self._setup()
+ w = np.ones(30)
+ with pytest.raises(ValueError, match="at most one of"):
+ stute_joint_pretest(
+ residuals_by_horizon=residuals,
+ fitted_by_horizon=fitted,
+ doses=d,
+ design_matrix=X,
+ survey_design=make_pweight_design(w),
+ survey=make_pweight_design(w),
+ weights=w,
+ n_bootstrap=199,
+ seed=0,
+ )
+
+
+class TestJointPretrendsTestDeprecation:
+ def test_survey_design_kwarg_smoke(self, event_study_panel):
+ df = event_study_panel
+ r = joint_pretrends_test(
+ df,
+ "y",
+ "d",
+ "time",
+ "unit",
+ pre_periods=[0],
+ base_period=1,
+ survey_design=SurveyDesign(weights="w"),
+ n_bootstrap=199,
+ seed=0,
+ )
+ assert np.isfinite(r.cvm_stat_joint)
+
+ def test_weights_emits_deprecation_warning(self, event_study_panel):
+ df = event_study_panel
+ n = len(df)
+ with pytest.warns(DeprecationWarning, match="weights=.*deprecated"):
+ joint_pretrends_test(
+ df,
+ "y",
+ "d",
+ "time",
+ "unit",
+ pre_periods=[0],
+ base_period=1,
+ weights=np.ones(n),
+ n_bootstrap=199,
+ seed=0,
+ )
+
+ def test_survey_emits_deprecation_warning(self, event_study_panel):
+ df = event_study_panel
+ with pytest.warns(DeprecationWarning, match="survey=.*deprecated"):
+ joint_pretrends_test(
+ df,
+ "y",
+ "d",
+ "time",
+ "unit",
+ pre_periods=[0],
+ base_period=1,
+ survey=SurveyDesign(weights="w"),
+ n_bootstrap=199,
+ seed=0,
+ )
+
+ def test_three_way_mutex_design_plus_weights(self, event_study_panel):
+ df = event_study_panel
+ n = len(df)
+ with pytest.raises(ValueError, match="at most one of"):
+ joint_pretrends_test(
+ df,
+ "y",
+ "d",
+ "time",
+ "unit",
+ pre_periods=[0],
+ base_period=1,
+ survey_design=SurveyDesign(weights="w"),
+ weights=np.ones(n),
+ n_bootstrap=199,
+ seed=0,
+ )
+
+ def test_legacy_alias_parity_survey(self, event_study_panel):
+ """PR #376 R9 P3: deprecated `survey=SurveyDesign(...)` ≡ canonical
+ `survey_design=SurveyDesign(...)` on joint_pretrends_test (locks
+ rebinding parity)."""
+ df = event_study_panel
+ sd = SurveyDesign(weights="w")
+ with warnings.catch_warnings():
+ warnings.simplefilter("ignore", DeprecationWarning)
+ r_legacy = joint_pretrends_test(
+ df,
+ "y",
+ "d",
+ "time",
+ "unit",
+ pre_periods=[0],
+ base_period=1,
+ survey=sd,
+ n_bootstrap=199,
+ seed=0,
+ )
+ r_new = joint_pretrends_test(
+ df,
+ "y",
+ "d",
+ "time",
+ "unit",
+ pre_periods=[0],
+ base_period=1,
+ survey_design=sd,
+ n_bootstrap=199,
+ seed=0,
+ )
+ assert r_legacy.cvm_stat_joint == r_new.cvm_stat_joint
+ assert r_legacy.p_value == r_new.p_value
+
+ def test_legacy_alias_parity_weights(self, event_study_panel):
+ """PR #376 R10 P3: deprecated `weights=np.ones(n)` ≡ canonical
+ `survey_design=SurveyDesign(weights="w")` (uniform 1.0 column) on
+ joint_pretrends_test."""
+ df = event_study_panel
+ n = len(df)
+ with warnings.catch_warnings():
+ warnings.simplefilter("ignore", DeprecationWarning)
+ r_legacy = joint_pretrends_test(
+ df,
+ "y",
+ "d",
+ "time",
+ "unit",
+ pre_periods=[0],
+ base_period=1,
+ weights=np.ones(n),
+ n_bootstrap=199,
+ seed=0,
+ )
+ r_new = joint_pretrends_test(
+ df,
+ "y",
+ "d",
+ "time",
+ "unit",
+ pre_periods=[0],
+ base_period=1,
+ survey_design=SurveyDesign(weights="w"),
+ n_bootstrap=199,
+ seed=0,
+ )
+ assert r_legacy.cvm_stat_joint == r_new.cvm_stat_joint
+ assert r_legacy.p_value == r_new.p_value
+
+
+class TestJointHomogeneityTestDeprecation:
+ def test_survey_design_kwarg_smoke(self, event_study_panel):
+ df = event_study_panel
+ r = joint_homogeneity_test(
+ df,
+ "y",
+ "d",
+ "time",
+ "unit",
+ post_periods=[2, 3],
+ base_period=1,
+ survey_design=SurveyDesign(weights="w"),
+ n_bootstrap=199,
+ seed=0,
+ )
+ assert np.isfinite(r.cvm_stat_joint)
+
+ def test_weights_emits_deprecation_warning(self, event_study_panel):
+ df = event_study_panel
+ n = len(df)
+ with pytest.warns(DeprecationWarning, match="weights=.*deprecated"):
+ joint_homogeneity_test(
+ df,
+ "y",
+ "d",
+ "time",
+ "unit",
+ post_periods=[2, 3],
+ base_period=1,
+ weights=np.ones(n),
+ n_bootstrap=199,
+ seed=0,
+ )
+
+ def test_survey_emits_deprecation_warning(self, event_study_panel):
+ df = event_study_panel
+ with pytest.warns(DeprecationWarning, match="survey=.*deprecated"):
+ joint_homogeneity_test(
+ df,
+ "y",
+ "d",
+ "time",
+ "unit",
+ post_periods=[2, 3],
+ base_period=1,
+ survey=SurveyDesign(weights="w"),
+ n_bootstrap=199,
+ seed=0,
+ )
+
+ def test_legacy_alias_parity_survey(self, event_study_panel):
+ """PR #376 R9 P3: deprecated `survey=SurveyDesign(...)` ≡ canonical
+ `survey_design=SurveyDesign(...)` on joint_homogeneity_test."""
+ df = event_study_panel
+ sd = SurveyDesign(weights="w")
+ with warnings.catch_warnings():
+ warnings.simplefilter("ignore", DeprecationWarning)
+ r_legacy = joint_homogeneity_test(
+ df,
+ "y",
+ "d",
+ "time",
+ "unit",
+ post_periods=[2, 3],
+ base_period=1,
+ survey=sd,
+ n_bootstrap=199,
+ seed=0,
+ )
+ r_new = joint_homogeneity_test(
+ df,
+ "y",
+ "d",
+ "time",
+ "unit",
+ post_periods=[2, 3],
+ base_period=1,
+ survey_design=sd,
+ n_bootstrap=199,
+ seed=0,
+ )
+ assert r_legacy.cvm_stat_joint == r_new.cvm_stat_joint
+ assert r_legacy.p_value == r_new.p_value
+
+ def test_legacy_alias_parity_weights(self, event_study_panel):
+ """PR #376 R10 P3: deprecated `weights=np.ones(n)` ≡ canonical
+ `survey_design=SurveyDesign(weights="w")` on
+ joint_homogeneity_test."""
+ df = event_study_panel
+ n = len(df)
+ with warnings.catch_warnings():
+ warnings.simplefilter("ignore", DeprecationWarning)
+ r_legacy = joint_homogeneity_test(
+ df,
+ "y",
+ "d",
+ "time",
+ "unit",
+ post_periods=[2, 3],
+ base_period=1,
+ weights=np.ones(n),
+ n_bootstrap=199,
+ seed=0,
+ )
+ r_new = joint_homogeneity_test(
+ df,
+ "y",
+ "d",
+ "time",
+ "unit",
+ post_periods=[2, 3],
+ base_period=1,
+ survey_design=SurveyDesign(weights="w"),
+ n_bootstrap=199,
+ seed=0,
+ )
+ assert r_legacy.cvm_stat_joint == r_new.cvm_stat_joint
+ assert r_legacy.p_value == r_new.p_value
+
+
+class TestHADFitDeprecation:
+ def test_survey_design_kwarg_smoke(self, two_period_panel):
+ df = two_period_panel
+ est = HeterogeneousAdoptionDiD(design="continuous_at_zero")
+ r = est.fit(df, "y", "d", "time", "unit", survey_design=SurveyDesign(weights="w"))
+ assert np.isfinite(r.att)
+
+ def test_weights_emits_deprecation_warning(self, two_period_panel):
+ df = two_period_panel
+ n = len(df)
+ est = HeterogeneousAdoptionDiD(design="continuous_at_zero")
+ with pytest.warns(DeprecationWarning, match="weights=.*deprecated"):
+ est.fit(df, "y", "d", "time", "unit", weights=np.ones(n))
+
+ def test_survey_emits_deprecation_warning(self, two_period_panel):
+ df = two_period_panel
+ est = HeterogeneousAdoptionDiD(design="continuous_at_zero")
+ with pytest.warns(DeprecationWarning, match="survey=.*deprecated"):
+ est.fit(df, "y", "d", "time", "unit", survey=SurveyDesign(weights="w"))
+
+ def test_three_way_mutex_design_plus_weights(self, two_period_panel):
+ df = two_period_panel
+ n = len(df)
+ est = HeterogeneousAdoptionDiD(design="continuous_at_zero")
+ with pytest.raises(ValueError, match="at most one of"):
+ est.fit(
+ df,
+ "y",
+ "d",
+ "time",
+ "unit",
+ survey_design=SurveyDesign(weights="w"),
+ weights=np.ones(n),
+ )
+
+ def test_fit_rejects_pre_resolved_design_overall(self, two_period_panel):
+ """PR #376 R8 P1: HAD.fit() data-in surface must reject a
+ pre-resolved ResolvedSurveyDesign with TypeError pointing users to
+ `SurveyDesign(weights='col_name', ...)`. Mirrors the array-in
+ helpers' rejection of SurveyDesign — the data-in/array-in surface
+ split is symmetric."""
+ df = two_period_panel
+ n = len(df)
+ est = HeterogeneousAdoptionDiD(design="continuous_at_zero")
+ # survey_design=ResolvedSurveyDesign should raise TypeError.
+ with pytest.raises(TypeError, match=r"`survey_design=` accepts a SurveyDesign"):
+ est.fit(
+ df,
+ "y",
+ "d",
+ "time",
+ "unit",
+ survey_design=make_pweight_design(np.ones(n // 2)),
+ )
+
+ def test_fit_rejects_pre_resolved_design_event_study(self, event_study_continuous_panel):
+ """PR #376 R8 P1: same TypeError on aggregate='event_study'."""
+ df = event_study_continuous_panel
+ est = HeterogeneousAdoptionDiD(design="continuous_at_zero", n_bootstrap=99, seed=0)
+ with pytest.raises(TypeError, match=r"`survey_design=` accepts a SurveyDesign"):
+ est.fit(
+ df,
+ "y",
+ "d",
+ "time",
+ "unit",
+ aggregate="event_study",
+ survey_design=make_pweight_design(np.ones(200)),
+ )
+
+ def test_fit_rejects_pre_resolved_design_via_legacy_alias_overall(self, two_period_panel):
+ """PR #376 R8 P1: deprecated `survey=ResolvedSurveyDesign` (alias)
+ also raises TypeError after the alias rebinding."""
+ df = two_period_panel
+ n = len(df)
+ est = HeterogeneousAdoptionDiD(design="continuous_at_zero")
+ with pytest.raises(TypeError, match=r"`survey_design=` accepts a SurveyDesign"):
+ with warnings.catch_warnings():
+ warnings.simplefilter("ignore", DeprecationWarning)
+ est.fit(
+ df,
+ "y",
+ "d",
+ "time",
+ "unit",
+ survey=make_pweight_design(np.ones(n // 2)),
+ )
+
+ def test_fit_rejects_pre_resolved_design_via_legacy_alias_event_study(
+ self, event_study_continuous_panel
+ ):
+ """PR #376 R8 P1: deprecated `survey=ResolvedSurveyDesign` (alias)
+ on event-study path also raises TypeError."""
+ df = event_study_continuous_panel
+ est = HeterogeneousAdoptionDiD(design="continuous_at_zero", n_bootstrap=99, seed=0)
+ with pytest.raises(TypeError, match=r"`survey_design=` accepts a SurveyDesign"):
+ with warnings.catch_warnings():
+ warnings.simplefilter("ignore", DeprecationWarning)
+ est.fit(
+ df,
+ "y",
+ "d",
+ "time",
+ "unit",
+ aggregate="event_study",
+ survey=make_pweight_design(np.ones(200)),
+ )
+
+ def test_legacy_positional_call_back_compat(self, two_period_panel):
+ """PR #376 R4 P1: pre-PR positional call shape for `survey`,
+ `weights`, `cband` MUST still work (the consolidation is additive,
+ not breaking). Tests the full positional sequence:
+ `fit(data, outcome, dose, time, unit, first_treat, aggregate,
+ survey, weights, cband)`."""
+ df = two_period_panel
+ est = HeterogeneousAdoptionDiD(design="continuous_at_zero")
+ # Pre-PR positional order: ..., first_treat_col, aggregate, survey,
+ # weights, cband. None of these should be flagged as keyword-only.
+ with warnings.catch_warnings():
+ warnings.simplefilter("ignore", DeprecationWarning)
+ r = est.fit(
+ df,
+ "y",
+ "d",
+ "time",
+ "unit",
+ None, # first_treat_col
+ "overall", # aggregate
+ SurveyDesign(weights="w"), # survey (positional)
+ None, # weights (positional)
+ True, # cband (positional)
+ )
+ assert np.isfinite(r.att)
+
+
+class TestDidHadPretestWorkflowDeprecation:
+ def test_survey_design_kwarg_smoke(self, two_period_panel):
+ df = two_period_panel
+ with warnings.catch_warnings():
+ warnings.simplefilter("ignore", UserWarning) # QUG-skip warning
+ report = did_had_pretest_workflow(
+ df,
+ "y",
+ "d",
+ "time",
+ "unit",
+ survey_design=SurveyDesign(weights="w"),
+ n_bootstrap=199,
+ seed=0,
+ )
+ assert report.qug is None # skipped under survey path
+ assert report.stute is not None
+
+ def test_weights_emits_deprecation_warning(self, two_period_panel):
+ df = two_period_panel
+ n = len(df)
+ with pytest.warns(DeprecationWarning, match="weights=.*deprecated"):
+ with warnings.catch_warnings():
+ warnings.simplefilter("ignore", UserWarning)
+ # We still need to allow the DeprecationWarning to propagate
+ # to the outer pytest.warns; only filter UserWarning.
+ warnings.simplefilter("always", DeprecationWarning)
+ did_had_pretest_workflow(
+ df,
+ "y",
+ "d",
+ "time",
+ "unit",
+ weights=np.ones(n),
+ n_bootstrap=199,
+ seed=0,
+ )
+
+ def test_survey_emits_deprecation_warning(self, two_period_panel):
+ df = two_period_panel
+ with pytest.warns(DeprecationWarning, match="survey=.*deprecated"):
+ with warnings.catch_warnings():
+ warnings.simplefilter("ignore", UserWarning)
+ warnings.simplefilter("always", DeprecationWarning)
+ did_had_pretest_workflow(
+ df,
+ "y",
+ "d",
+ "time",
+ "unit",
+ survey=SurveyDesign(weights="w"),
+ n_bootstrap=199,
+ seed=0,
+ )
+
+ def test_three_way_mutex_all_three(self, two_period_panel):
+ df = two_period_panel
+ n = len(df)
+ with pytest.raises(ValueError, match="at most one of"):
+ did_had_pretest_workflow(
+ df,
+ "y",
+ "d",
+ "time",
+ "unit",
+ survey_design=SurveyDesign(weights="w"),
+ survey=SurveyDesign(weights="w"),
+ weights=np.ones(n),
+ n_bootstrap=199,
+ seed=0,
+ )
+
+ def test_legacy_alias_parity_survey_overall(self, two_period_panel):
+ """PR #376 R9 P3: deprecated `survey=SurveyDesign(...)` ≡ canonical
+ `survey_design=SurveyDesign(...)` on
+ did_had_pretest_workflow(aggregate='overall'). Locks rebinding
+ parity on the workflow's overall-path data-in surface."""
+ df = two_period_panel
+ sd = SurveyDesign(weights="w")
+ with warnings.catch_warnings():
+ warnings.simplefilter("ignore", UserWarning) # QUG-skip warning
+ warnings.simplefilter("ignore", DeprecationWarning)
+ r_legacy = did_had_pretest_workflow(
+ df,
+ "y",
+ "d",
+ "time",
+ "unit",
+ survey=sd,
+ n_bootstrap=199,
+ seed=0,
+ )
+ r_new = did_had_pretest_workflow(
+ df,
+ "y",
+ "d",
+ "time",
+ "unit",
+ survey_design=sd,
+ n_bootstrap=199,
+ seed=0,
+ )
+ assert r_legacy.stute.cvm_stat == r_new.stute.cvm_stat
+ assert r_legacy.stute.p_value == r_new.stute.p_value
+ assert r_legacy.yatchew.t_stat_hr == r_new.yatchew.t_stat_hr
+
+ def test_legacy_alias_parity_weights_overall(self, two_period_panel):
+ """PR #376 R10 P3: deprecated `weights=np.ones(n)` ≡ canonical
+ `survey_design=SurveyDesign(weights="w")` on
+ did_had_pretest_workflow(aggregate='overall'). Closes the data-in
+ rebinding-parity gap on the weights= shortcut path."""
+ df = two_period_panel
+ n = len(df)
+ with warnings.catch_warnings():
+ warnings.simplefilter("ignore", UserWarning)
+ warnings.simplefilter("ignore", DeprecationWarning)
+ r_legacy = did_had_pretest_workflow(
+ df,
+ "y",
+ "d",
+ "time",
+ "unit",
+ weights=np.ones(n),
+ n_bootstrap=199,
+ seed=0,
+ )
+ r_new = did_had_pretest_workflow(
+ df,
+ "y",
+ "d",
+ "time",
+ "unit",
+ survey_design=SurveyDesign(weights="w"),
+ n_bootstrap=199,
+ seed=0,
+ )
+ assert r_legacy.stute.cvm_stat == r_new.stute.cvm_stat
+ assert r_legacy.stute.p_value == r_new.stute.p_value
+ assert r_legacy.yatchew.t_stat_hr == r_new.yatchew.t_stat_hr
+
+
+# =============================================================================
+# 3. PR #376 R2 P1: extended dispatch-matrix coverage on the new front door
+# =============================================================================
+#
+# Reviewer flagged that the canonical `survey_design=` kwarg was added across
+# all HAD design × aggregate combinations but only directly tested on the
+# two-period continuous_at_zero / overall path. These tests cover the
+# weighted mass_point overall path, the weighted continuous event-study
+# path, and the workflow event-study path — each with both a
+# `survey_design=` smoke and a legacy-alias parity check.
+
+
+@pytest.fixture
+def mass_point_panel():
+ """Two-period panel with a mass point at d_lower=0.05 and continuous doses above it.
+
+ G=200 units; the modal fraction at d_lower (`0.06 > 0.02`) triggers the
+ mass-point heuristic in HAD's auto-detection. Used to exercise
+ `design="mass_point"` survey_design= forwarding through the weighted
+ 2SLS sandwich.
+ """
+ rng = np.random.default_rng(13)
+ G = 200
+ n_modal = int(0.06 * G) # 12 units at d_lower
+ d_modal = np.full(n_modal, 0.05)
+ d_continuous = rng.uniform(0.06, 1.0, size=G - n_modal)
+ d = np.concatenate([d_modal, d_continuous])
+ rng.shuffle(d)
+ rows = []
+ for g in range(G):
+ for t in (0, 1):
+ y = 0.0 if t == 0 else d[g] * 1.2 + rng.normal(0, 0.1)
+ rows.append({"unit": g, "time": t, "y": y, "d": (0.0 if t == 0 else d[g])})
+ df = pd.DataFrame(rows)
+ df["w"] = 1.0
+ return df
+
+
+@pytest.fixture
+def event_study_continuous_panel():
+ """Multi-period continuous_at_zero panel for HAD.fit aggregate='event_study'.
+
+ G=200 units, T=3 periods (t=0 pre, t=1 base, t=2 post), Beta(0.5, 1)
+ doses so d.min() approaches 0 (Design 1' boundary heuristic satisfied),
+ F=2 (treatment starts at t=2)."""
+ rng = np.random.default_rng(14)
+ G = 200
+ d = rng.beta(0.5, 1.0, size=G)
+ rows = []
+ F = 2
+ for g in range(G):
+ for t in range(3):
+ d_t = 0.0 if t < F else d[g]
+ y = (0.0 if t < F else d_t * 1.2) + rng.normal(0, 0.1)
+ rows.append({"unit": g, "time": t, "y": y, "d": d_t})
+ df = pd.DataFrame(rows)
+ df["w"] = 1.0
+ return df
+
+
+class TestHADFitMassPointSurveyDesign:
+ """PR #376 R2 P1: cover `design='mass_point'` + survey_design= path.
+
+ Mass-point + survey requires vcov_type='hc1' (not the classical default)
+ per the documented Phase 4.5 B deviation: the survey path composes
+ Binder-TSL on the HC1-scale IF.
+ """
+
+ def test_survey_design_kwarg_smoke(self, mass_point_panel):
+ df = mass_point_panel
+ est = HeterogeneousAdoptionDiD(design="mass_point", vcov_type="hc1")
+ with warnings.catch_warnings():
+ warnings.simplefilter("ignore", UserWarning) # mass-point methodology warning
+ r = est.fit(df, "y", "d", "time", "unit", survey_design=SurveyDesign(weights="w"))
+ assert np.isfinite(r.att)
+ assert np.isfinite(r.se)
+
+ def test_legacy_alias_parity_weights(self, mass_point_panel):
+ """weights=arr (deprecated) and survey_design=SurveyDesign(weights='w')
+ produce identical point estimates on the mass_point overall path. The SE
+ differs by variance family (weights= → HC1 sandwich; survey_design= →
+ Binder-TSL on HC1-scale IF), so we assert att-only parity."""
+ df = mass_point_panel
+ n = len(df)
+ est = HeterogeneousAdoptionDiD(design="mass_point", vcov_type="hc1")
+ with warnings.catch_warnings():
+ warnings.simplefilter("ignore", DeprecationWarning)
+ warnings.simplefilter("ignore", UserWarning)
+ r_legacy = est.fit(df, "y", "d", "time", "unit", weights=np.ones(n))
+ with warnings.catch_warnings():
+ warnings.simplefilter("ignore", UserWarning)
+ r_new = est.fit(df, "y", "d", "time", "unit", survey_design=SurveyDesign(weights="w"))
+ np.testing.assert_allclose(r_legacy.att, r_new.att, atol=1e-10, rtol=1e-10)
+
+
+class TestHADFitEventStudySurveyDesign:
+ """PR #376 R2 P1: cover aggregate='event_study' + cband=True + survey_design=."""
+
+ def test_survey_design_kwarg_smoke(self, event_study_continuous_panel):
+ df = event_study_continuous_panel
+ est = HeterogeneousAdoptionDiD(design="continuous_at_zero", n_bootstrap=99, seed=0)
+ r = est.fit(
+ df,
+ "y",
+ "d",
+ "time",
+ "unit",
+ aggregate="event_study",
+ survey_design=SurveyDesign(weights="w"),
+ cband=True,
+ )
+ # Event-study returns HeterogeneousAdoptionDiDEventStudyResults
+ assert r.att.shape[0] >= 1
+ assert np.all(np.isfinite(r.att))
+ assert r.cband_low is not None
+ assert r.cband_high is not None
+
+ def test_legacy_alias_parity_survey(self, event_study_continuous_panel):
+ """survey=SurveyDesign(...) (deprecated) ≡ survey_design=SurveyDesign(...)
+ on event-study path."""
+ df = event_study_continuous_panel
+ sd = SurveyDesign(weights="w")
+ with warnings.catch_warnings():
+ warnings.simplefilter("ignore", DeprecationWarning)
+ r_legacy = HeterogeneousAdoptionDiD(
+ design="continuous_at_zero", n_bootstrap=99, seed=0
+ ).fit(df, "y", "d", "time", "unit", aggregate="event_study", survey=sd, cband=True)
+ r_new = HeterogeneousAdoptionDiD(design="continuous_at_zero", n_bootstrap=99, seed=0).fit(
+ df, "y", "d", "time", "unit", aggregate="event_study", survey_design=sd, cband=True
+ )
+ np.testing.assert_array_equal(r_legacy.att, r_new.att)
+ np.testing.assert_array_equal(r_legacy.se, r_new.se)
+
+
+class TestDidHadPretestWorkflowEventStudySurveyDesign:
+ """PR #376 R2 P1: cover did_had_pretest_workflow(aggregate='event_study',
+ survey_design=...)."""
+
+ def test_survey_design_kwarg_smoke(self, event_study_panel):
+ df = event_study_panel
+ with warnings.catch_warnings():
+ warnings.simplefilter("ignore", UserWarning) # QUG-skip + staggered
+ report = did_had_pretest_workflow(
+ df,
+ "y",
+ "d",
+ "time",
+ "unit",
+ aggregate="event_study",
+ survey_design=SurveyDesign(weights="w"),
+ n_bootstrap=199,
+ seed=0,
+ )
+ assert report.qug is None # skipped under survey path
+ assert report.homogeneity_joint is not None
+
+ def test_legacy_alias_parity_survey(self, event_study_panel):
+ """survey=SurveyDesign(...) (deprecated) ≡ survey_design=SurveyDesign(...)
+ on workflow event-study path."""
+ df = event_study_panel
+ sd = SurveyDesign(weights="w")
+ with warnings.catch_warnings():
+ warnings.simplefilter("ignore", UserWarning)
+ warnings.simplefilter("ignore", DeprecationWarning)
+ r_legacy = did_had_pretest_workflow(
+ df,
+ "y",
+ "d",
+ "time",
+ "unit",
+ aggregate="event_study",
+ survey=sd,
+ n_bootstrap=199,
+ seed=0,
+ )
+ r_new = did_had_pretest_workflow(
+ df,
+ "y",
+ "d",
+ "time",
+ "unit",
+ aggregate="event_study",
+ survey_design=sd,
+ n_bootstrap=199,
+ seed=0,
+ )
+ # Joint Stute on the event-study path is bootstrap-driven; both calls
+ # use the same seed=0 + same survey design → identical bootstrap
+ # multiplier draws → identical p-values + statistics.
+ assert r_legacy.homogeneity_joint.cvm_stat_joint == r_new.homogeneity_joint.cvm_stat_joint
+ assert r_legacy.homogeneity_joint.p_value == r_new.homogeneity_joint.p_value
+
+ def test_legacy_alias_parity_weights(self, event_study_panel):
+ """weights=arr (deprecated) ≡ survey_design=SurveyDesign(weights='w')
+ with uniform 1.0 weights on the workflow event-study path. Locks the
+ nested-DeprecationWarning suppression: the user-facing warning fires
+ ONCE at the workflow front door, no extra warnings from the joint
+ wrappers when survey/weights are forwarded internally."""
+ df = event_study_panel
+ n = len(df)
+ with warnings.catch_warnings():
+ warnings.simplefilter("ignore", UserWarning)
+ with warnings.catch_warnings(record=True) as w_record:
+ warnings.simplefilter("always", DeprecationWarning)
+ r_legacy = did_had_pretest_workflow(
+ df,
+ "y",
+ "d",
+ "time",
+ "unit",
+ aggregate="event_study",
+ weights=np.ones(n),
+ n_bootstrap=199,
+ seed=0,
+ )
+ # PR #376 R2 P3 fix: workflow event-study weights= path emits
+ # exactly ONE DeprecationWarning (not three — joint wrappers'
+ # nested warnings are suppressed since the user-facing one
+ # already fired at the workflow's front door).
+ n_dep_warnings = sum(1 for w in w_record if issubclass(w.category, DeprecationWarning))
+ assert n_dep_warnings == 1, (
+ f"expected 1 DeprecationWarning at workflow front door, got {n_dep_warnings}"
+ )
+ with warnings.catch_warnings():
+ warnings.simplefilter("ignore", UserWarning)
+ r_new = did_had_pretest_workflow(
+ df,
+ "y",
+ "d",
+ "time",
+ "unit",
+ aggregate="event_study",
+ survey_design=SurveyDesign(weights="w"),
+ n_bootstrap=199,
+ seed=0,
+ )
+ assert r_legacy.homogeneity_joint.cvm_stat_joint == r_new.homogeneity_joint.cvm_stat_joint
+ assert r_legacy.homogeneity_joint.p_value == r_new.homogeneity_joint.p_value
diff --git a/tests/test_had_pretests.py b/tests/test_had_pretests.py
index 4b97e2c2..b122e339 100644
--- a/tests/test_had_pretests.py
+++ b/tests/test_had_pretests.py
@@ -205,7 +205,7 @@ def test_mutex_both_set_raises_value_error(self):
from diff_diff import SurveyDesign
d = np.array([0.1, 0.5, 0.9])
- with pytest.raises(ValueError, match="OR weights=.*not both"):
+ with pytest.raises(ValueError, match="at most one of"):
qug_test(d, survey=SurveyDesign(weights="w"), weights=np.ones(3))
def test_methodology_pointer_in_message(self):
@@ -2881,7 +2881,7 @@ def test_workflow_mutex_both_raises(self):
from diff_diff import SurveyDesign
df = self._make_minimal_overall_panel(with_weight_col=True)
- with pytest.raises(ValueError, match="OR weights=.*not both"):
+ with pytest.raises(ValueError, match="at most one of"):
did_had_pretest_workflow(
df,
"y",
@@ -3059,23 +3059,23 @@ def test_weights_smoke(self):
def test_survey_smoke(self):
"""survey= via trivial ResolvedSurveyDesign produces a finite result."""
- from diff_diff.survey import _make_trivial_resolved
+ from diff_diff.survey import make_pweight_design
d, dy = self._setup()
w = np.random.default_rng(7).uniform(0.5, 2.0, size=30)
- resolved = _make_trivial_resolved(w)
+ resolved = make_pweight_design(w)
r = stute_test(d, dy, survey=resolved, n_bootstrap=199, seed=0)
assert np.isfinite(r.cvm_stat)
assert 0.0 <= r.p_value <= 1.0
def test_mutex_both_raises(self):
"""survey + weights mutex (mirrors workflow + qug_test pattern)."""
- from diff_diff.survey import _make_trivial_resolved
+ from diff_diff.survey import make_pweight_design
d, dy = self._setup()
w = np.ones(30)
- with pytest.raises(ValueError, match="OR weights=.*not both"):
- stute_test(d, dy, weights=w, survey=_make_trivial_resolved(w), n_bootstrap=199, seed=0)
+ with pytest.raises(ValueError, match="at most one of"):
+ stute_test(d, dy, weights=w, survey=make_pweight_design(w), n_bootstrap=199, seed=0)
def test_replicate_weights_raises(self):
"""Phase 4.5 C MEDIUM #4: replicate-weight survey designs raise
@@ -3167,20 +3167,20 @@ def test_weights_smoke(self):
assert 0.0 <= r.p_value <= 1.0
def test_survey_smoke(self):
- from diff_diff.survey import _make_trivial_resolved
+ from diff_diff.survey import make_pweight_design
d, dy = self._setup()
w = np.random.default_rng(7).uniform(0.5, 2.0, size=30)
- r = yatchew_hr_test(d, dy, survey=_make_trivial_resolved(w))
+ r = yatchew_hr_test(d, dy, survey=make_pweight_design(w))
assert np.isfinite(r.t_stat_hr)
def test_mutex_both_raises(self):
- from diff_diff.survey import _make_trivial_resolved
+ from diff_diff.survey import make_pweight_design
d, dy = self._setup()
w = np.ones(30)
- with pytest.raises(ValueError, match="not both"):
- yatchew_hr_test(d, dy, weights=w, survey=_make_trivial_resolved(w))
+ with pytest.raises(ValueError, match="at most one of"):
+ yatchew_hr_test(d, dy, weights=w, survey=make_pweight_design(w))
def test_zero_weight_rejected(self):
"""Per Reviewer Question #4: strictly-positive weights required
@@ -3302,7 +3302,7 @@ def test_joint_pretrends_mutex_both_raises(self):
df = self._make_event_study_panel()
df["w"] = 1.0
- with pytest.raises(ValueError, match="not both"):
+ with pytest.raises(ValueError, match="at most one of"):
joint_pretrends_test(
df,
"y",