Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
Expand Up @@ -9,6 +9,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0

### Added
- **HAD linearity-family pretests under survey (Phase 4.5 C).** `stute_test`, `yatchew_hr_test`, `stute_joint_pretest`, `joint_pretrends_test`, `joint_homogeneity_test`, and `did_had_pretest_workflow` now accept `weights=` / `survey=` keyword-only kwargs. Stute family uses **PSU-level Mammen multiplier bootstrap** via `bootstrap_utils.generate_survey_multiplier_weights_batch` (the same kernel as PR #363's HAD event-study sup-t bootstrap): each replicate draws an `(n_bootstrap, n_psu)` Mammen multiplier matrix, broadcast to per-obs perturbation `eta_obs[g] = eta_psu[psu(g)]`, weighted OLS refit, weighted CvM via new `_cvm_statistic_weighted` helper. Joint Stute SHARES the multiplier matrix across horizons within each replicate, preserving both the vector-valued empirical-process unit-level dependence AND PSU clustering. Yatchew uses **closed-form weighted OLS + pweight-sandwich variance components** (no bootstrap): `sigma2_lin = sum(w·eps²)/sum(w)`, `sigma2_diff = sum(w_avg·diff²)/(2·sum(w))` with arithmetic-mean pair weights `w_avg_g = (w_g+w_{g-1})/2`, `sigma4_W = sum(w_avg·prod)/sum(w_avg)`, `T_hr = sqrt(sum(w))·(sigma2_lin-sigma2_diff)/sigma2_W`. All three Yatchew components reduce bit-exactly to the unweighted formulas at `w=ones(G)` (locked at `atol=1e-14` by direct helper test). The pweight `weights=` shortcut routes through a synthetic trivial `ResolvedSurveyDesign` (new `survey._make_trivial_resolved` helper) so the same kernel handles both entry paths. `did_had_pretest_workflow(..., survey=, weights=)` removes the Phase 4.5 C0 `NotImplementedError`, dispatches to the survey-aware sub-tests, **skips the QUG step with `UserWarning`** (per C0 deferral), sets `qug=None` on the report, and appends a `"linearity-conditional verdict; QUG-under-survey deferred per Phase 4.5 C0"` suffix to the verdict. `HADPretestReport.qug` retyped from `QUGTestResults` to `Optional[QUGTestResults]`; `summary()` / `to_dict()` / `to_dataframe()` updated to None-tolerant rendering. Replicate-weight survey designs (BRR/Fay/JK1/JKn/SDR) raise `NotImplementedError` at every entry point (defense in depth, reciprocal-guard discipline) — parallel follow-up after this PR. **Stratified designs (`SurveyDesign(strata=...)`) also raise `NotImplementedError` on the Stute family** — the within-stratum demean + `sqrt(n_h/(n_h-1))` correction that the HAD sup-t bootstrap applies to match the Binder-TSL stratified target has not been derived for the Stute CvM functional, so applying raw multipliers from `generate_survey_multiplier_weights_batch` directly to residual perturbations would leave the bootstrap p-value silently miscalibrated. Phase 4.5 C narrows survey support to **pweight-only**, **PSU-only** (`SurveyDesign(weights=, psu=)`), and **FPC-only** (`SurveyDesign(weights=, fpc=)`) designs; stratified is a follow-up after the matching Stute-CvM stratified-correction derivation lands. Strictly positive weights required on Yatchew (the adjacent-difference variance is undefined under contiguous-zero blocks). Per-row `weights=` / `survey=col` aggregated to per-unit via existing HAD helpers `_aggregate_unit_weights` / `_aggregate_unit_resolved_survey` (constant-within-unit invariant enforced). Unweighted code paths preserved bit-exactly. Patch-level addition (additive on stable surfaces). See `docs/methodology/REGISTRY.md` § "QUG Null Test" — Note (Phase 4.5 C) for the full methodology.
- **`ChaisemartinDHaultfoeuille.by_path` + `n_bootstrap > 0` joint sup-t bands** — per-path joint sup-t simultaneous confidence intervals across horizons `1..L_max` within each path. A single shared `(n_bootstrap, n_eligible)` multiplier weight matrix (using the estimator's configured `bootstrap_weights` — Rademacher / Mammen / Webb) is drawn per path and broadcast across all horizons of that path, producing correlated bootstrap distributions across horizons. The path-specific critical value `c_p = quantile(max_l |t_l|, 1 - α)` is used to construct symmetric joint bands `effect_l ± c_p · se_l` per horizon. Surfaced on `results.path_sup_t_bands` (dict keyed by path tuple, each entry with `crit_value / alpha / n_bootstrap / method / n_valid_horizons`); as `cband_conf_int` per horizon entry on `path_effects[path]["horizons"][l]`; and as `cband_lower` / `cband_upper` columns on `results.to_dataframe(level="by_path")` (mirrors the OVERALL `level="event_study"` schema; positive-horizon rows of banded paths get populated values, placebo / unbanded / empty-window rows get NaN). Gates: a path needs `>= 2` valid horizons (finite bootstrap SE > 0) AND a strict majority (more than 50%) of finite sup-t draws to receive a band. Empty-state contract: `path_sup_t_bands is None` when not requested; `{}` when requested but no path passes both gates. **Methodology asymmetry vs OVERALL `event_study_sup_t_bands`:** the per-path sup-t draws a fresh shared weight matrix per path AFTER the per-path SE bootstrap block has already populated `results.path_ses` via independent per-(path, horizon) draws — asymptotically equivalent to OVERALL's self-consistent reuse but NOT bit-identical. Documented intentional choice to preserve RNG-state isolation for existing per-path SE seed-reproducibility tests. Inherits the cross-path cohort-sharing SE deviation from R documented for `path_effects`. **Deviation from R:** `did_multiplegt_dyn` does not provide joint / sup-t bands at any surface — this is a Python-only methodology extension consistent with the existing OVERALL sup-t bands (also Python-only). Bands cover joint inference WITHIN a single path across horizons; they do NOT provide simultaneous coverage across paths. Pre-audit fix bundled: stale "Phase 2 placeholder" docstring on the existing `sup_t_bands` field updated to the actual contract description. Tests at `tests/test_chaisemartin_dhaultfoeuille.py::TestByPathSupTBands` (`@pytest.mark.slow`). See `docs/methodology/REGISTRY.md` §ChaisemartinDHaultfoeuille `Note (Phase 3 by_path per-path joint sup-t bands)` for the full contract.
- **`ChaisemartinDHaultfoeuille.by_path` + `placebo=True`** — per-path backward-horizon placebos `DID^{pl}_{path, l}` for `l = 1..L_max`. The same per-path SE convention used for the event-study (joiners/leavers IF precedent: switcher-side contributions zeroed for non-path groups; cohort structure and control pool unchanged; plug-in SE with path-specific divisor `N^{pl}_{l, path}`) is applied to backward horizons via the new `switcher_subset_mask` parameter on `_compute_per_group_if_placebo_horizon`. Surfaced on `results.path_placebo_event_study[path][-l]` (negative-int inner keys mirroring `placebo_event_study`); `summary()` renders the rows alongside per-path event-study horizons; `to_dataframe(level="by_path")` emits negative-horizon rows alongside the existing positive-horizon rows. **Bootstrap** (when `n_bootstrap > 0`) propagates per-`(path, lag)` percentile CI / p-value through the same `_bootstrap_one_target` dispatch as the per-path event-study, with the canonical NaN-on-invalid contract enforced on the new surface (PR #364 library-wide invariant). **SE inherits the cross-path cohort-sharing deviation from R** documented for `path_effects` (full-panel cohort-centered plug-in vs R's per-path re-run): tracks R within tolerance on single-path-cohort panels, diverges materially on cohort-mixed panels — the bootstrap SE is a Monte Carlo analog of the analytical SE and inherits the same deviation. R-parity confirmed at `tests/test_chaisemartin_dhaultfoeuille_parity.py::TestDCDHDynRParityByPathPlacebo` on the new `multi_path_reversible_by_path_placebo` scenario (point estimates exact match; SE within Phase-2 envelope rtol ≤ 5%); positive analytical + bootstrap invariants at `tests/test_chaisemartin_dhaultfoeuille.py::TestByPathPlacebo` (and the gated `::TestBootstrap` subclass). See `docs/methodology/REGISTRY.md` §ChaisemartinDHaultfoeuille `Note (Phase 3 by_path ...)` → "Per-path placebos" for the full contract.
- **Tutorial 19: dCDH for Marketing Pulse Campaigns** (`docs/tutorials/19_dcdh_marketing_pulse.ipynb`) — end-to-end practitioner walkthrough on a 60-market reversible-treatment panel covering the TWFE decomposition diagnostic (`twowayfeweights`), `DCDH` Phase 1 (DID_M, joiners-vs-leavers, single-lag placebo), the `L_max` multi-horizon event study with multiplier bootstrap, a stakeholder communication template, and drift guards. README listing for Tutorial 17 (Brand Awareness Survey) backfilled in the same edit. Cross-link from `docs/practitioner_decision_tree.rst` § "Reversible Treatment" added.

Expand Down
84 changes: 84 additions & 0 deletions diff_diff/chaisemartin_dhaultfoeuille.py
Original file line number Diff line number Diff line change
Expand Up @@ -431,6 +431,24 @@ class ChaisemartinDHaultfoeuille(ChaisemartinDHaultfoeuilleBootstrapMixin):
cross-path cohort-sharing deviation from R is inherited from
the analytical event-study path.

With ``n_bootstrap > 0``, per-path joint sup-t simultaneous
confidence bands are also computed across horizons
``1..L_max`` within each path. A path-specific critical value
``c_p`` (constructed from a fresh shared-weights multiplier-
bootstrap draw per path) is surfaced at top level as
``results.path_sup_t_bands[path] = {"crit_value", "alpha",
"n_bootstrap", "method", "n_valid_horizons"}``, applied
per-horizon as ``cband_conf_int`` on
``path_effects[path]["horizons"][l]``, and rendered as
``cband_lower`` / ``cband_upper`` columns on
``results.to_dataframe(level="by_path")`` (mirroring the
OVERALL ``level="event_study"`` schema). Bands cover joint
inference WITHIN a single path across horizons; they do NOT
provide simultaneous coverage across paths. Python-only
library extension; R ``did_multiplegt_dyn`` provides no joint
bands at any surface. See REGISTRY.md ``Note (Phase 3 by_path
per-path joint sup-t bands)``.

SE convention: per-path IF parallels the joiners / leavers
construction — the switcher-side contribution is zeroed for
groups not in the selected path, and the cohort structure and
Expand Down Expand Up @@ -2986,6 +3004,33 @@ def fit(
path_placebos[path_key][neg_key]["conf_int"] = (np.nan, np.nan)
path_placebos[path_key][neg_key]["t_stat"] = np.nan

# Phase 3: propagate per-path sup-t critical values to per-
# horizon `cband_conf_int` entries on path_effects (by_path +
# n_bootstrap > 0). Sibling of the OVERALL event-study cband
# propagation at `:2865-2875`. For each path with a finite
# crit, write `cband_conf_int = (eff - c_p*se, eff + c_p*se)`
# into each horizon's dict whose bootstrap-replaced SE is
# finite > 0. Mirror the OVERALL absent-key pattern: non-finite
# SE horizons simply don't get the `cband_conf_int` key.
if (
bootstrap_results is not None
and bootstrap_results.path_cband_crit_values is not None
and path_effects is not None
):
for path_key, crit in bootstrap_results.path_cband_crit_values.items():
if path_key not in path_effects:
continue
if not np.isfinite(crit):
continue
for l_h, h_dict in path_effects[path_key]["horizons"].items():
se = h_dict.get("se", np.nan)
eff = h_dict.get("effect", np.nan)
if np.isfinite(se) and se > 0:
h_dict["cband_conf_int"] = (
eff - crit * se,
eff + crit * se,
)

# When L_max >= 1 and the per-group path is active, sync
# overall_* from event_study_effects[1] AFTER bootstrap propagation
# so that bootstrap SE/p/CI flow to the top-level surface.
Expand Down Expand Up @@ -3618,6 +3663,45 @@ def fit(
),
path_effects=path_effects,
path_placebo_event_study=path_placebos,
path_sup_t_bands=(
# When by_path + n_bootstrap > 0 is active, surface a
# dict (possibly empty) — preserving the documented
# `None` (not requested) vs `{}` (requested but empty)
# contract that mirrors `path_effects` / `path_placebo_
# event_study` empty-state behavior. The empty case
# arises in two ways:
# 1. `path_effects == {}` — no observed path has a
# complete window; the per-path bootstrap collector
# is skipped upstream and `path_cband_crit_values`
# stays `None`. We materialize `{}` here.
# 2. Bootstrap ran but no path passed both gates
# (>=2 valid horizons AND a strict majority — more
# than 50% — of finite sup-t draws);
# `path_cband_crit_values == {}` — passes through.
{
path_key: {
"crit_value": crit,
"alpha": self.alpha,
"n_bootstrap": self.n_bootstrap,
"method": "multiplier_bootstrap",
"n_valid_horizons": (
bootstrap_results.path_cband_n_valid_horizons.get(path_key, 0)
if bootstrap_results is not None
and bootstrap_results.path_cband_n_valid_horizons is not None
else 0
),
}
for path_key, crit in (
bootstrap_results.path_cband_crit_values
if bootstrap_results is not None
and bootstrap_results.path_cband_crit_values is not None
else {}
).items()
if np.isfinite(crit)
}
if (self.by_path is not None and self.n_bootstrap > 0)
else None
),
survey_metadata=survey_metadata,
_estimator_ref=self,
)
Expand Down
88 changes: 88 additions & 0 deletions diff_diff/chaisemartin_dhaultfoeuille_bootstrap.py
Original file line number Diff line number Diff line change
Expand Up @@ -778,6 +778,94 @@ def _compute_dcdh_bootstrap(
results.path_placebo_cis = path_pl_cis
results.path_placebo_p_values = path_pl_pvals

# --- Phase 3: Per-path joint sup-t (by_path + n_bootstrap > 0) ---
# Sibling of the OVERALL event-study sup-t at the multi-horizon
# block above (`:599-614`). Per-path joint simultaneous
# confidence bands across horizons 1..L_max within each path:
# one shared (n_bootstrap, n_eligible) multiplier weight matrix
# (using `self.bootstrap_weights` — Rademacher / Mammen / Webb)
# per path is broadcast across all valid horizons of that path,
# producing correlated bootstrap distributions across horizons.
# The path-specific critical value
# `c_p = quantile(max_l |t_l|, 1-alpha)` is the band half-width
# multiplier applied to each horizon's bootstrap SE in fit().
#
# Note (asymmetry vs OVERALL): this draws a FRESH shared-weights
# matrix per path AFTER the per-path SE block above has populated
# results.path_ses via independent per-(path, horizon) draws.
# Numerator: fresh shared draws; denominator: bootstrap SEs from
# the earlier independent draws. Asymptotically equivalent to
# OVERALL's self-consistent reuse, but NOT bit-identical. The
# fresh draw is intentional: it preserves RNG-state isolation
# for existing per-path SE seed-reproducibility tests.
#
# Gates: a path needs >=2 valid horizons (finite bootstrap SE>0)
# AND a strict majority (>50%) of finite sup-t draws to receive
# a band. Otherwise the path is absent from
# path_cband_crit_values (mirrors OVERALL absent-key pattern at
# `:605,612`; the strict-majority gate matches the OVERALL
# `finite_mask.sum() > 0.5 * n_bootstrap` semantics — exactly
# half finite is NOT enough).
if path_bootstrap_inputs is not None and results.path_ses:
path_cband_crits: Dict[Tuple[int, ...], float] = {}
path_cband_n_valid: Dict[Tuple[int, ...], int] = {}

for path_key, horizon_inputs in path_bootstrap_inputs.items():
bs_ses_for_path = results.path_ses.get(path_key, {})
valid_horizons = []
for l_h, (u_h, n_h, eff_h, _u_pp_h) in sorted(horizon_inputs.items()):
if u_h.size == 0 or n_h <= 0:
continue
bs_se = bs_ses_for_path.get(l_h, np.nan)
if not np.isfinite(bs_se) or bs_se <= 0:
continue
valid_horizons.append((l_h, u_h, n_h, eff_h, bs_se))

if len(valid_horizons) < 2:
continue

# All horizons within a path use the same n_eligible
# (variance-eligible group ordering enforced by
# _collect_path_bootstrap_inputs's use of
# eligible_mask_var for cohort-recentering); use the
# first valid horizon's IF size as the shared dim.
n_dim = valid_horizons[0][1].size
map_path = _map_for_target(
n_dim,
group_id_to_psu_code,
eligible_group_ids,
)
with np.errstate(invalid="ignore", divide="ignore"):
shared_weights = _generate_psu_or_group_weights(
n_bootstrap=self.n_bootstrap,
n_groups_target=n_dim,
weight_type=self.bootstrap_weights,
rng=rng,
group_to_psu_map=map_path,
)
es_dists_path = []
for _l_h, u_h, n_h, eff_h, _bs_se in valid_horizons:
deviations = (shared_weights @ u_h) / n_h
es_dists_path.append(eff_h + deviations)
boot_matrix = np.asarray(es_dists_path)
effects_vec = np.array([v[3] for v in valid_horizons])
ses_vec = np.array([v[4] for v in valid_horizons])
t_stats = np.abs((boot_matrix - effects_vec[:, None]) / ses_vec[:, None])
sup_t_dist = np.max(t_stats, axis=0)
finite_mask = np.isfinite(sup_t_dist)
if finite_mask.sum() <= 0.5 * self.n_bootstrap:
continue
crit_p = float(np.quantile(sup_t_dist[finite_mask], 1.0 - self.alpha))

if not np.isfinite(crit_p):
continue

path_cband_crits[path_key] = crit_p
path_cband_n_valid[path_key] = len(valid_horizons)

results.path_cband_crit_values = path_cband_crits
results.path_cband_n_valid_horizons = path_cband_n_valid

return results


Expand Down
Loading
Loading