
Consolidate HAD survey-design API to single survey_design= kwarg #376

Merged
igerber merged 13 commits into main from had-survey-design-consolidation
Apr 26, 2026

@igerber
Owner

@igerber igerber commented Apr 25, 2026

Summary

  • Consolidates all 8 HAD surfaces (HAD.fit + workflow + 6 pretests) to the canonical survey_design= kwarg matching ContinuousDiD/EfficientDiD/dCDH.
  • Soft deprecation cycle: survey= and weights= become DeprecationWarning aliases; removal queued for the next minor (TODO row added).
  • New public helper make_pweight_design(weights: np.ndarray) -> ResolvedSurveyDesign exported from diff_diff top level for the pweight-only convenience on array-in pretest helpers (formerly the private survey._make_trivial_resolved).
  • Bit-exact regression preserved — internal back-end paths unchanged; deprecation shim only rebinds entry kwarg names.

Key design choices

  • Surface split (data-in vs array-in): data-in surfaces (HAD.fit, workflow, joint data-in wrappers) accept survey_design=SurveyDesign(weights="col") and resolve against data at fit time. Array-in surfaces (stute_test, yatchew_hr_test, stute_joint_pretest, qug_test) take pre-resolved ResolvedSurveyDesign only; passing a SurveyDesign raises TypeError with migration guidance to make_pweight_design(arr) or pre-resolution.
  • Three-way mutex: at most one of {survey_design, survey, weights} may be non-None per call. Two distinct error messages per surface group (data-in vs array-in) point users to the right migration target.
  • Normalization-order invariant (load-bearing): the weights= deprecation shim binds survey_design = make_pweight_design(weights_unnormalized) and lets the unified path apply the mean=1 normalization step exactly once. Locked by scale-invariance test.
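
The normalization-order invariant can be sketched in a few lines. This is an illustrative, dependency-free toy rather than the library's code: normalize_mean_one, unified_statistic, and legacy_weights_shim are hypothetical names, and plain lists stand in for NumPy arrays. The shim forwards the unnormalized weights and the unified path normalizes exactly once, so rescaling the input weights by a constant should leave the result unchanged.

```python
import math


def normalize_mean_one(weights):
    """Rescale weights so their mean is 1 (hypothetical stand-in)."""
    mean = sum(weights) / len(weights)
    if mean <= 0:
        raise ValueError("weights must have a positive mean")
    return [w / mean for w in weights]


def unified_statistic(values, raw_weights):
    """Hypothetical unified path: normalize once, then use the weights."""
    w = normalize_mean_one(raw_weights)
    return sum(wi * vi for wi, vi in zip(w, values)) / len(values)


def legacy_weights_shim(values, weights):
    """Hypothetical shim: forward the UNNORMALIZED weights unchanged.

    Normalizing here as well would apply the mean=1 step twice, which is
    what the invariant is meant to rule out.
    """
    return unified_statistic(values, weights)


values = [1.0, 2.0, 4.0, 8.0]
weights = [0.5, 1.5, 1.0, 3.0]
scaled = [10.0 * w for w in weights]

base = legacy_weights_shim(values, weights)
rescaled = legacy_weights_shim(values, scaled)

# Scale invariance: multiplying every weight by a constant is a no-op.
assert math.isclose(base, rescaled, rel_tol=1e-12)
```

A scale-invariance check of this shape is one way such an invariant could be locked in a test; the PR's actual test may differ in detail.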

Test plan

  • 535 tests pass (489 pre-PR baseline + 46 new in tests/test_had_dual_knob_deprecation.py)
  • Bit-exact numerical regression on the legacy weights= and survey= paths (all existing weighted tests still produce identical numbers + a DeprecationWarning)
  • 8 surfaces × {survey_design= smoke, weights= warn, survey= warn, parity legacy ≡ new, three-way mutex} — 40 cases
  • Surface-spanning: make_pweight_design exported, alias of _make_trivial_resolved, array-in TypeError on SurveyDesign, scale-invariance for both stute_test and yatchew_hr_test
  • Cross-estimator regression: tests/test_survey.py + test_continuous_did.py + test_efficient_did.py + test_chaisemartin_dhaultfoeuille.py — 536 passed (no breakage from survey.py changes)
  • black + ruff clean on touched files

🤖 Generated with Claude Code

@github-actions

Overall Assessment

⚠️ Needs changes

Executive Summary

  • I did not find a paper-level estimator/SE rewrite here; the HAD changes are intended as an API-front-door consolidation, and the core math/variance paths still look unchanged.
  • P1: the deprecated survey= alias is not actually equivalent to survey_design= on the array-in helpers. SurveyDesign(...) bypasses the new type guard in stute_test, yatchew_hr_test, and stute_joint_pretest, so the alias path fails later with the wrong exception/message instead of the documented TypeError.
  • P2: the PR removes or de-indexes existing API docs for HeterogeneousAdoptionDiD and StaggeredTripleDifference while README still points to those pages/anchors, creating broken/orphaned public docs.
  • P2: the bundled practitioner guide regresses by removing dCDH HonestDiD support even though the implementation, registry, and tests still support compute_honest_did() on ChaisemartinDHaultfoeuille results.
  • P2: docs/methodology/REGISTRY.md becomes internally contradictory on HAD survey support, which is risky because the registry is supposed to be the load-bearing methodology contract.
  • P3: alias removal/back-end cleanup is properly tracked in TODO.md; that follow-up is not a blocker.

Methodology

  • Severity: P2. Impact: docs/methodology/REGISTRY.md:L2350-L2351 says both that HAD pretests “still do NOT accept survey= / weights=” and that “all 8 HAD surfaces” now accept survey_design= plus deprecated aliases. Because the registry is the source of truth for method behavior, leaving both statements in place makes the survey-support contract ambiguous for future reviewers and users. Concrete fix: remove/update the stale pre-Phase-4.5 note so the registry has one HAD survey-support contract.
  • No unmitigated paper/math/SE deviation found in the estimator logic itself.

Code Quality

  • Severity: P1. Impact: in diff_diff/had_pretests.py:L1556-L1593, L2029-L2058, and L2707-L2736, the SurveyDesign type guard runs before deprecated alias rebinding. So survey_design=SurveyDesign(...) gets the intended TypeError, but survey=SurveyDesign(...) does not; it is later treated as if it were a ResolvedSurveyDesign. That breaks the advertised “survey= is a deprecation alias of survey_design=” contract on three public surfaces. Concrete fix: move alias rebinding ahead of the type guard, or rerun the same guard immediately after survey_design = survey; add regression tests for survey=SurveyDesign(...) on all three helpers. tests/test_had_dual_knob_deprecation.py:L136-L165 and L251-L430 currently cover only the canonical-kwarg guard, not the deprecated-alias version.
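
The ordering issue in this finding can be sketched as follows. The classes and array_in_helper below are hypothetical stand-ins, not the library's definitions; the only point illustrated is that rebinding the deprecated alias before the type guard makes both spellings hit the same TypeError. The three-way mutex is deliberately left out of this toy.

```python
import warnings


class SurveyDesign:
    """Hypothetical stand-in for an unresolved, column-referencing design."""

    def __init__(self, weights):
        self.weights = weights


class ResolvedSurveyDesign:
    """Hypothetical stand-in for a design already resolved to arrays."""

    def __init__(self, weights):
        self.weights = list(weights)


def array_in_helper(*, survey_design=None, survey=None):
    """Toy array-in surface: rebind the deprecated alias first, then guard."""
    # Step 1: soft-deprecation block folds the alias into the canonical name.
    if survey is not None:
        warnings.warn(
            "survey= is deprecated; use survey_design=",
            DeprecationWarning,
            stacklevel=2,
        )
        survey_design = survey

    # Step 2: type guard. Because it runs after the rebinding, the canonical
    # and deprecated spellings raise the same TypeError.
    if isinstance(survey_design, SurveyDesign):
        raise TypeError(
            "array-in helpers need a pre-resolved design, not a SurveyDesign"
        )
    return survey_design


def raises_type_error(**kwargs):
    """Return True if the helper rejects the call with TypeError."""
    with warnings.catch_warnings():
        warnings.simplefilter("ignore", DeprecationWarning)
        try:
            array_in_helper(**kwargs)
        except TypeError:
            return True
    return False


canonical_rejected = raises_type_error(survey_design=SurveyDesign(weights="w"))
alias_rejected = raises_type_error(survey=SurveyDesign(weights="w"))
assert canonical_rejected and alias_rejected
```

If the guard ran before Step 1, only the canonical call would be rejected up front, which is the asymmetry described above.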

Performance

  • No findings.

Maintainability

  • No additional findings beyond the documentation regressions below.

Tech Debt

  • Severity: P3. Impact: the next-minor cleanup to remove deprecated HAD aliases and fold remaining legacy back-end paths is properly tracked in TODO.md:L102. Concrete fix: none for this PR.

Security

  • No findings.

Documentation/Tests

  • Severity: P2. Impact: the diff deletes or de-indexes major public API docs for existing estimators (docs/api/had.rst deleted; docs/api/index.rst removes HAD entries; docs/api/staggered.rst:L128-L145 removes StaggeredTripleDifference) while README still links to those pages (README.md:L113-L117). That leaves published docs with broken/orphaned links for two public estimators. Concrete fix: either keep the HAD / StaggeredTripleDifference API pages in the Sphinx toctree, or update README/guides to point to the new canonical destination before merge.
  • Severity: P2. Impact: the Step 6 edit in diff_diff/guides/llms-practitioner.txt removes dCDH from the list of result types supported by compute_honest_did(), but the implementation and tests still support it (docs/methodology/REGISTRY.md:L637; tests/test_honest_did.py:L1343-L1415). This turns the bundled guide into a false negative for a supported analysis path. Concrete fix: restore the dCDH-specific HonestDiD guidance, including its placebo-based caveats, instead of collapsing the section to MultiPeriodDiD/CallawaySantAnna only.
  • Severity: P3. Impact: the README survey-support text remains stale for HAD by describing survey= / weights= as the HAD entry point instead of survey_design= with deprecated aliases (README.md:L129-L133 in the changed block). Concrete fix: update that sentence to reflect the new canonical kwarg.

Path to Approval

  1. Make the deprecated survey= alias behaviorally identical to survey_design= on stute_test, yatchew_hr_test, and stute_joint_pretest, and add regression tests for survey=SurveyDesign(...).
  2. Reconcile docs/methodology/REGISTRY.md so it contains one HAD survey-support contract, not mutually inconsistent old and new notes.
  3. Restore or replace the deleted/orphaned API docs for HeterogeneousAdoptionDiD and StaggeredTripleDifference, and ensure README links resolve.
  4. Restore the dCDH HonestDiD guidance in the bundled practitioner guide.

igerber and others added 2 commits April 25, 2026 15:45
Adds survey_design= as the canonical kwarg on all 8 HAD surfaces (HAD.fit,
did_had_pretest_workflow, qug_test, stute_test, yatchew_hr_test,
stute_joint_pretest, joint_pretrends_test, joint_homogeneity_test) to match
the rest of the library (ContinuousDiD/EfficientDiD/dCDH already use
survey_design=). The existing survey= and weights= kwargs become deprecated
aliases (DeprecationWarning, removal next minor); internal back-end paths
unchanged so numerical results are bit-exact pre-PR.

Promotes survey._make_trivial_resolved → public make_pweight_design helper
for the pweight-only convenience on array-in pretest helpers (which take
ResolvedSurveyDesign, not column-referencing SurveyDesign). Underscore name
kept as permanent private alias for back-compat.

Three-way mutex (survey_design + survey + weights) extends the prior 2-way;
two distinct error messages per surface group point users to the right
migration target (SurveyDesign(weights='col') for data-in surfaces vs
make_pweight_design(arr) for array-in helpers).
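
A minimal sketch of a three-way mutex with per-group messages might look like the following. Everything here is an assumption for illustration: pick_design is a hypothetical name, the message wording is invented, ValueError is a guess at the exception type (the commit message does not state it), and the tuple returned on the weights= path is only a placeholder for a pweight-only design.

```python
import warnings

# Illustrative wording only; not the library's actual messages.
MUTEX_MSG_DATA_IN = (
    "pass at most one of survey_design=, survey=, weights=; "
    "migrate to survey_design=SurveyDesign(weights='col')"
)
MUTEX_MSG_ARRAY_IN = (
    "pass at most one of survey_design=, survey=, weights=; "
    "migrate to survey_design=make_pweight_design(arr)"
)


def pick_design(survey_design=None, survey=None, weights=None, *, array_in=False):
    """Hypothetical front door: enforce the mutex, then rebind aliases."""
    supplied = [
        name
        for name, value in (
            ("survey_design", survey_design),
            ("survey", survey),
            ("weights", weights),
        )
        if value is not None
    ]
    if len(supplied) > 1:
        # Exception type is an assumption made for this sketch.
        raise ValueError(MUTEX_MSG_ARRAY_IN if array_in else MUTEX_MSG_DATA_IN)

    if survey is not None:
        warnings.warn(
            "survey= is deprecated; use survey_design=",
            DeprecationWarning,
            stacklevel=2,
        )
        return survey
    if weights is not None:
        warnings.warn(
            "weights= is deprecated; use survey_design=",
            DeprecationWarning,
            stacklevel=2,
        )
        # Placeholder for wrapping raw weights in a pweight-only design.
        return ("pweight", tuple(weights))
    return survey_design
```

Checking the mutex before any rebinding keeps the error independent of which alias happened to be supplied.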

535 tests pass (489 pre-PR + 46 new in tests/test_had_dual_knob_deprecation.py
covering 8 surfaces × {survey_design= smoke, weights= warn, survey= warn,
parity, mutex} plus surface-spanning tests for type guards, normalization-
order invariant, and public-helper export). Bit-exact regression locked.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
R1 P1: deprecated `survey=SurveyDesign(...)` alias didn't trigger the
SurveyDesign type guard on stute_test, yatchew_hr_test, stute_joint_pretest
because the guard ran BEFORE the alias rebinding. Move the guard AFTER the
soft-deprecation block so it covers both `survey_design=SurveyDesign(...)`
(canonical) and `survey=SurveyDesign(...)` (deprecated alias) identically.
Adds 3 regression tests in TestArrayInTypeGuard covering the alias path on
all 3 array-in surfaces.

R1 P2: REGISTRY.md had two contradictory notes on HAD survey support — the
pre-Phase-4.5-C bullet said "pretests still do NOT accept survey/weights"
while the Phase 4.5 C bullet listed all 8 surfaces as supporting them.
Rewrote the older bullet to reflect the current Phase 4.5 B + C state.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
igerber force-pushed the had-survey-design-consolidation branch from efdcb2d to eef8af4 April 25, 2026 19:45
@igerber
Owner Author

igerber commented Apr 25, 2026

/ai-review

@github-actions

🔁 AI review rerun (requested by @igerber)

Head SHA: eef8af4d46d13f910f875c7b5f3a2086455885d1


Overall Assessment
⚠️ Needs changes

Executive Summary

  • The previous array-in alias-parity bug appears fixed: survey=SurveyDesign(...) now goes through the same guard path as survey_design=... in the three array-in linearity helpers.
  • I did not find a paper-level estimator, weighting, or variance rewrite here; this diff is an API-front-door consolidation, and the registry now documents it as such.
  • P1 [Newly identified]: the new survey_design= front door is not directly tested across the full HAD dispatch matrix it now fronts. The added coverage only exercises HeterogeneousAdoptionDiD.fit on the two-period continuous_at_zero/overall path and did_had_pretest_workflow on the overall path, leaving weighted event-study and mass-point canonical entry points unverified.
  • P3: did_had_pretest_workflow(..., weights=..., aggregate="event_study") still re-emits inner deprecation warnings from the joint wrappers.
  • P3: the shared array-in mutex/docs text is still slightly misleading for qug_test, which permanently rejects any non-None survey input.

Methodology

  • No unmitigated P0/P1 methodology findings. Affected methods are the HAD survey-design entry points only (HeterogeneousAdoptionDiD.fit, did_had_pretest_workflow, qug_test, stute_test, yatchew_hr_test, stute_joint_pretest, joint_pretrends_test, joint_homogeneity_test), and I did not find an undocumented estimator/SE change relative to docs/methodology/REGISTRY.md:L2350-L2351 and docs/methodology/REGISTRY.md:L2435-L2440.
  • P3 Impact: the shared array-in migration text says array-in surfaces should prefer survey_design=make_pweight_design(arr) in diff_diff/survey.py:L754-L760, and the registry repeats that wording for qug_test in docs/methodology/REGISTRY.md:L2351-L2351, but qug_test still permanently rejects any non-None survey_design/survey/weights in diff_diff/had_pretests.py:L1294-L1323. That makes the QUG exception slightly self-contradictory in the load-bearing docs. Concrete fix: special-case qug_test in the mutex text and registry note so the canonical-name guidance does not imply a supported migration path.

Code Quality

  • P3 Impact: did_had_pretest_workflow comments say the internal event-study calls use canonical kwargs to avoid duplicate deprecation warnings, but on the deprecated weights= path it still forwards weights=joint_weights into both joint wrappers at diff_diff/had_pretests.py:L4239-L4255 and diff_diff/had_pretests.py:L4261-L4276; those wrappers re-warn on any non-None weights at diff_diff/had_pretests.py:L3360-L3365 and diff_diff/had_pretests.py:L3642-L3647. One deprecated workflow call can therefore emit three DeprecationWarnings. Concrete fix: suppress inner deprecation warnings on the workflow’s deprecated weights= event-study path, or route through a private non-warning helper.

Performance

  • No findings.

Maintainability

  • No findings.

Tech Debt

  • P3 Impact: the next-minor cleanup to remove the deprecated HAD aliases is properly tracked in TODO.md:L102-L102. Concrete fix: none for this PR.

Security

  • No findings.

Documentation/Tests

  • P1 [Newly identified] Impact: the common survey_design= front door added in diff_diff/had.py:L2893-L2924 now fronts all HAD design × aggregate combinations, but the new direct coverage only exercises the two-period continuous_at_zero path in tests/test_had_dual_knob_deprecation.py:L618-L650. Likewise, the workflow front door changed in diff_diff/had_pretests.py:L4109-L4134 now fronts both overall and event-study dispatch, but the new direct workflow coverage is only the overall path in tests/test_had_dual_knob_deprecation.py:L654-L720. Per the parameter-interaction checklist, the new kwarg is still unverified on the weighted event-study and mass-point branches it now fronts. Concrete fix: add direct survey_design= smoke/parity tests for fit(design="mass_point", ...), fit(aggregate="event_study", ..., cband=True), and did_had_pretest_workflow(aggregate="event_study", ...), plus legacy-alias parity on those same branches.

Path to Approval

  1. Add direct survey_design= regression coverage for HeterogeneousAdoptionDiD.fit on at least the weighted mass_point path and the weighted aggregate="event_study" path, with parity checks against the legacy survey=/weights= entry points.
  2. Add direct did_had_pretest_workflow(..., aggregate="event_study", survey_design=SurveyDesign(...)) coverage, plus parity checks for deprecated survey= and weights= on that same front door.
  3. Clean up the two user-facing contract mismatches: suppress nested deprecation warnings on the workflow event-study weights= path, and special-case the qug_test mutex/registry wording so it does not point users to an unsupported make_pweight_design migration.

R2 P1: extended dispatch-matrix coverage on the new survey_design= front
door. Added 3 test classes covering paths that PR #376 fronted but didn't
directly test:

- TestHADFitMassPointSurveyDesign: design='mass_point' + survey_design=
  smoke + legacy-alias att-parity (vcov_type='hc1' required by the Phase
  4.5 B mass-point + survey deviation).
- TestHADFitEventStudySurveyDesign: aggregate='event_study' + cband=True +
  survey_design= smoke + legacy survey= parity (full bit-equality on att,
  se under same seed + design).
- TestDidHadPretestWorkflowEventStudySurveyDesign: workflow event-study
  smoke via survey_design=, plus legacy survey= and weights= parity. The
  weights= parity test also locks the R2 P3 nested-warning suppression
  (asserts exactly ONE DeprecationWarning fires from the workflow front
  door, not three from cascading joint wrappers).

R2 P3 #1: workflow's event-study `weights=` path was emitting up to 3
DeprecationWarnings (one at workflow front door + one each from the
joint wrappers' internal weights= path). Wrap the internal joint wrapper
calls in `warnings.catch_warnings() + simplefilter("ignore",
DeprecationWarning)` since the user-facing warning has already fired at
the workflow front door. Joint wrappers can't accept ResolvedSurveyDesign
(their `_resolve_pretest_unit_weights` requires a SurveyDesign with
.resolve()), so converting weights= to survey_design= via
make_pweight_design isn't an option here. Locked by the new
test_legacy_alias_parity_weights assertion `n_dep_warnings == 1`.
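
The suppression pattern described in R2 P3 #1 can be sketched with the standard warnings module. The functions joint_wrapper and workflow below are hypothetical toys, not the library's code; they only show that wrapping the inner calls in warnings.catch_warnings() with an "ignore" filter leaves a single user-facing DeprecationWarning.

```python
import warnings


def joint_wrapper(weights=None):
    """Toy inner wrapper that warns on any non-None weights."""
    if weights is not None:
        warnings.warn("weights= is deprecated", DeprecationWarning, stacklevel=2)
    return "ok"


def workflow(weights=None):
    """Toy workflow front door: warn once, silence the nested warnings."""
    if weights is not None:
        warnings.warn(
            "weights= is deprecated; use survey_design=",
            DeprecationWarning,
            stacklevel=2,
        )
    # The user-facing warning has already fired, so the inner calls are
    # wrapped in a local filter that ignores DeprecationWarning.
    with warnings.catch_warnings():
        warnings.simplefilter("ignore", DeprecationWarning)
        first = joint_wrapper(weights=weights)
        second = joint_wrapper(weights=weights)
    return first, second


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    workflow(weights=[1.0, 2.0])

n_dep_warnings = sum(issubclass(w.category, DeprecationWarning) for w in caught)
assert n_dep_warnings == 1
```

The filter change is scoped to the with block, so DeprecationWarnings raised elsewhere in the calling process are not affected.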

R2 P3 #2: qug_test mutex error pointed users to
`survey_design=make_pweight_design(arr)` as a migration target via the
shared HAD_DUAL_KNOB_MUTEX_MSG_ARRAY_IN constant, but qug_test
permanently rejects ALL survey_design/survey/weights inputs (Phase 4.5 C0
deferral). Replaced with a qug-specific mutex message that says "no
migration path; see NotImplementedError below" instead of suggesting
make_pweight_design.

545 tests pass (538 before this commit + 7 new dispatch-matrix tests).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@igerber
Owner Author

igerber commented Apr 25, 2026

/ai-review

@github-actions

🔁 AI review rerun (requested by @igerber)

Head SHA: 3a81e735f5f88bf76ba0bff094a606ce1cf8deb3


Overall Assessment
⚠️ Needs changes

Executive Summary

  • Previous P1s from the last review look addressed: array-in alias parity now goes through the same type guard, the workflow event-study path suppresses nested deprecation warnings, and the new regression file adds direct coverage for mass-point overall, weighted event-study fit, and workflow event-study (diff_diff/had_pretests.py:L1568-L1604, diff_diff/had_pretests.py:L2045-L2073, diff_diff/had_pretests.py:L2727-L2755, diff_diff/had_pretests.py:L4245-L4292, tests/test_had_dual_knob_deprecation.py:L727-L964).
  • I did not find an estimator, weighting, or variance rewrite here; the registry documents this PR as an API consolidation and the changed entry points mostly just rebind into the legacy back ends (docs/methodology/REGISTRY.md:L2350-L2351, diff_diff/had.py:L2898-L2924, diff_diff/had_pretests.py:L4107-L4144).
  • P1: the new make_pweight_design() / array-in weights= shim path skips 1-D validation, so malformed scalar weights can raise a raw low-level exception instead of the library’s normal ValueError contract (diff_diff/survey.py:L723-L725, diff_diff/had_pretests.py:L1314-L1320, diff_diff/had_pretests.py:L1577-L1583, diff_diff/had_pretests.py:L2051-L2057, diff_diff/had_pretests.py:L2733-L2739).
  • P3: the new registry/changelog wording still describes qug_test as if it shared the make_pweight_design(arr) migration path, but the implementation correctly says there is no migration path and permanently rejects all survey-aware inputs (docs/methodology/REGISTRY.md:L2351, CHANGELOG.md:L11, diff_diff/had_pretests.py:L1294-L1307).

Methodology

  • No unmitigated P0/P1 findings. The affected methods are the HAD survey-design entry points only, and the implementation matches the registry’s “API consolidation, back-end unchanged” note plus the existing Phase 4.5 C / C0 methodology notes (docs/methodology/REGISTRY.md:L2350-L2351, docs/methodology/REGISTRY.md:L2429-L2449).
  • Severity: P3. Impact: the methodology registry/changelog still overstate qug_test’s migration contract by grouping it with the array-in helpers that point users to make_pweight_design(arr), while the code explicitly says qug_test has no survey-aware migration path at all. This is a documentation-contract mismatch, not a numerical defect (docs/methodology/REGISTRY.md:L2351, CHANGELOG.md:L11, diff_diff/had_pretests.py:L1294-L1307). Concrete fix: special-case qug_test in the consolidation note/changelog so only stute_test, yatchew_hr_test, and stute_joint_pretest advertise make_pweight_design(arr).

Code Quality

  • Severity: P1. Impact: make_pweight_design() assumes weights.shape[0] exists, and the new array-in deprecation shims call it before any _validate_1d_numeric() gate. A scalar deprecated weights= input therefore now fails with a low-level exception instead of the package’s normal front-door ValueError; this regresses the prior validation behavior on these public helper paths (diff_diff/survey.py:L681-L725, diff_diff/had_pretests.py:L1314-L1320, diff_diff/had_pretests.py:L1577-L1583, diff_diff/had_pretests.py:L2051-L2057, diff_diff/had_pretests.py:L2733-L2739, versus the intended 1-D validation at diff_diff/had_pretests.py:L1675-L1694, diff_diff/had_pretests.py:L2113-L2128, diff_diff/had_pretests.py:L2946-L2965). Concrete fix: validate weights as 1-D inside make_pweight_design() (preferred, since it is now a public top-level helper) or validate before every shim call, and add regression tests for scalar / 0-D inputs.

Performance

  • No findings.

Maintainability

  • No findings.

Tech Debt

  • No findings. The next-minor alias-removal cleanup is properly tracked in TODO.md:L102.

Security

  • No findings.

Documentation/Tests

  • No blocking findings. The previous direct-coverage gap is addressed by the new regression cases for mass-point overall, weighted event-study fit, and workflow event-study in tests/test_had_dual_knob_deprecation.py:L727-L964.
  • I could not execute the tests in this environment because pytest is not installed.

Path to Approval

  1. Add 1-D front-door validation to make_pweight_design() (or before each array-in weights= shim) so malformed scalar / 0-D inputs raise a clean ValueError instead of bubbling a low-level exception.
  2. Add regression coverage for that error contract on both the new helper and at least one deprecated alias path, e.g. make_pweight_design(1.0), stute_test(..., weights=1.0), and/or qug_test(..., weights=1.0).

R3 P1: make_pweight_design() now validates 1-D input at the front door.
Was: scalar / 0-D / column-vector inputs reached `int(w.shape[0])` and
failed with cryptic low-level exceptions (IndexError on scalars,
inconsistent results on column vectors). Now: clear ValueError
("weights must be 1-dimensional...") that points users to common
mistakes (e.g. df[['w']].to_numpy() vs df['w'].to_numpy()). The
validation also propagates to the deprecated `weights=` shim path on
all 4 array-in helpers (stute_test, yatchew_hr_test, stute_joint_pretest,
qug_test), since the shim routes through make_pweight_design.
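
A front-door 1-D check of the kind described in R3 P1 might be sketched as below. This is a dependency-free stand-in: the real helper takes an np.ndarray, whereas validate_1d_weights is a hypothetical function operating on plain Python sequences, and its message text is illustrative apart from the "1-dimensional" wording quoted above.

```python
from numbers import Real


def validate_1d_weights(weights):
    """Return weights as a flat list of floats, or raise ValueError.

    Hypothetical pure-Python analogue of a 1-D array check: scalars and
    nested sequences (e.g. a column vector) are rejected up front.
    """
    if isinstance(weights, Real):
        raise ValueError("weights must be 1-dimensional; got a scalar")
    try:
        items = list(weights)
    except TypeError:
        raise ValueError(
            "weights must be 1-dimensional; got a non-iterable"
        ) from None
    if any(not isinstance(x, Real) for x in items):
        raise ValueError(
            "weights must be 1-dimensional; got nested or non-numeric "
            "entries (a column vector such as [[1.0], [2.0]]?)"
        )
    return [float(x) for x in items]


def rejects(value):
    """Return True if validation raises ValueError for this input."""
    try:
        validate_1d_weights(value)
    except ValueError:
        return True
    return False


scalar_rejected = rejects(1.0)
column_rejected = rejects([[1.0], [2.0]])
flat_ok = validate_1d_weights([1, 2, 3])
```

Placing the check inside the public helper means every shim that routes through it inherits the same ValueError without repeating the validation.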

5 new regression tests in TestPublicHelpers cover scalar, 0-D ndarray,
and column-vector inputs to make_pweight_design directly, plus the
deprecated `weights=scalar` path on the 3 linearity helpers and qug_test.

R3 P3: REGISTRY note + CHANGELOG entry now special-case qug_test as
having no migration path. Was: both lumped qug_test in with the array-in
helpers' "use survey_design=make_pweight_design(arr)" advice, but
qug_test permanently rejects all survey-aware inputs (Phase 4.5 C0
deferral) regardless of which kwarg variant is used. Now: REGISTRY +
CHANGELOG explicitly distinguish the 3 linearity helpers (which have a
migration path) from qug_test (which doesn't).

Error message uses "1-dimensional" wording to also satisfy the existing
TestPhase45CR1Regressions tests (test_stute_test_rejects_2d_weights /
test_yatchew_hr_test_rejects_2d_weights) that were already gating on the
"1-dimensional" regex; both my new tests and the pre-PR tests pass.

550 tests pass (545 before this commit + 5 new R3 P1 regression tests).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@igerber
Owner Author

igerber commented Apr 25, 2026

/ai-review

@github-actions

🔁 AI review rerun (requested by @igerber)

Head SHA: 03c5c9d0c9b3989f9938ba90066f74a916455c13


Overall Assessment

⚠️ Needs changes

Executive Summary

  • Severity: P1 [Newly identified]. HeterogeneousAdoptionDiD.fit() now inserts * before survey / weights / cband, so legacy positional callers will fail with TypeError even though this PR is described as an additive, non-breaking deprecation shim (diff_diff/had.py:L2783-L2796, CHANGELOG.md:L11-L14).
  • Previous re-review issues look addressed: make_pweight_design() now rejects non-1D inputs, the array-in alias/type-guard parity is in place, and the registry/changelog now correctly special-case qug_test (diff_diff/survey.py:L681-L754, diff_diff/had_pretests.py:L1294-L1328,L1563-L1604, docs/methodology/REGISTRY.md:L2350-L2450).
  • I did not find a methodology, weighting, or variance/SE mismatch in the changed HAD estimator/pretest logic. The new code is an API/deprecation layer over the existing Phase 4.5 B/C/C0 back ends (diff_diff/had.py:L2893-L2924, diff_diff/had_pretests.py:L1568-L1604,L2045-L2073,L2727-L2755, docs/methodology/REGISTRY.md:L2350-L2450).
  • The new regression file is strong on keyword-based alias parity, but it does not cover legacy positional fit() calls, so the break above is currently unguarded (tests/test_had_dual_knob_deprecation.py:L660-L693,L864-L900).
  • I could not execute the tests here because pytest is not installed.

Methodology

  • No findings. Affected methods are the HAD survey-design entry points only (HeterogeneousAdoptionDiD.fit, did_had_pretest_workflow, qug_test, stute_test, yatchew_hr_test, stute_joint_pretest, joint_pretrends_test, joint_homogeneity_test). The registry documents this PR as an API consolidation with unchanged back-end behavior, and the changed code matches that contract (docs/methodology/REGISTRY.md:L2350-L2450, diff_diff/had.py:L2893-L2924, diff_diff/had_pretests.py:L1309-L1328,L1568-L1604,L2045-L2073,L2727-L2755).

Code Quality

  • Severity: P1 [Newly identified]. Impact: HeterogeneousAdoptionDiD.fit() changes survey, weights, and cband from positional-or-keyword to keyword-only by inserting * ahead of the new survey_design parameter. Any existing call sites using the pre-PR positional order now error immediately, which contradicts the changelog’s “patch-level addition / no breaking changes” statement and the TODO entry that schedules alias removal for the next minor release, not this one (diff_diff/had.py:L2783-L2796, CHANGELOG.md:L11-L14, TODO.md:L102). Concrete fix: preserve the legacy positional order (survey, weights, cband) and make only survey_design the new keyword-only addition.

Performance

  • No findings.

Maintainability

  • No findings.

Tech Debt

  • No blocking findings. The planned alias-removal cleanup is properly tracked in TODO.md:L102.

Security

  • No findings.

Documentation/Tests

  • Severity: P3. Impact: the new regression coverage exercises keyword-form survey_design=, survey=, and weights= calls, but it never asserts that legacy positional HeterogeneousAdoptionDiD.fit(...) calls still work, so it does not protect the compatibility contract this PR is trying to preserve (tests/test_had_dual_knob_deprecation.py:L660-L693,L864-L900). Concrete fix: add at least one regression using the old positional order for survey and/or weights/cband.
  • I could not run the suite because pytest is unavailable in this environment.

Path to Approval

  1. Reorder HeterogeneousAdoptionDiD.fit’s parameters so the pre-PR positional call shapes remain valid, with survey_design added as a new keyword-only argument rather than making survey / weights / cband keyword-only.
  2. Add regression coverage for at least one legacy positional fit() call shape next to the existing keyword-based deprecation tests.

R4 P1: HeterogeneousAdoptionDiD.fit() inadvertently made `survey`,
`weights`, and `cband` keyword-only when adding the new `survey_design=`
kwarg, by inserting `*,` before all four. This broke pre-PR positional
callers, contradicting the "additive, non-breaking" CHANGELOG claim.
Reorder so `survey`, `weights`, `cband` keep their pre-PR
positional-or-keyword status; only `survey_design=` is the new
keyword-only addition (placed after the `*,` separator at the end).

R4 P3: added test_legacy_positional_call_back_compat in
TestHADFitDeprecation that exercises the full pre-PR positional call
shape: `fit(df, "y", "d", "time", "unit", None, "overall", sd, None,
True)` — locks the back-compat contract.

The 6 pretest helpers (including qug_test) and the workflow were
unaffected by this issue: their pre-PR signatures already had `*,` before
survey/weights/etc, so those kwargs were already keyword-only.
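
The effect of where the `*` separator sits can be shown with three toy signatures. These are heavily simplified, hypothetical stand-ins for fit() (the real signature has more leading parameters); only the relative placement of `*` and survey_design mirrors the R4 P1 discussion.

```python
def fit_pre_pr(data, survey=None, weights=None, cband=True):
    """Toy pre-PR shape: everything is positional-or-keyword."""
    return (survey, weights, cband)


def fit_regressed(data, *, survey_design=None, survey=None, weights=None, cband=True):
    """Toy regressed shape: `*` placed before all four makes them keyword-only."""
    return (survey, weights, cband, survey_design)


def fit_restored(data, survey=None, weights=None, cband=True, *, survey_design=None):
    """Toy restored shape: only the new survey_design is keyword-only."""
    return (survey, weights, cband, survey_design)


# A legacy caller passing survey / weights / cband positionally.
legacy_args = ("df", "sd", None, True)

try:
    fit_regressed(*legacy_args)
    regressed_accepts_positional = True
except TypeError:
    # Keyword-only parameters cannot be filled positionally.
    regressed_accepts_positional = False

restored = fit_restored(*legacy_args)
assert not regressed_accepts_positional
assert restored[:3] == fit_pre_pr(*legacy_args)
```

Appending new parameters after a trailing `*` is a common way to extend a signature without disturbing existing positional callers.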

551 tests pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@igerber
Owner Author

igerber commented Apr 25, 2026

/ai-review

@github-actions

🔁 AI review rerun (requested by @igerber)

Head SHA: af998df267482848edf7aeb80d90f492575168ee


Overall Assessment

✅ Looks good

Executive Summary

  • Previous P1 is resolved: HeterogeneousAdoptionDiD.fit() again preserves the pre-PR positional survey / weights / cband calling convention, and a regression now locks that shape. diff_diff/had.py:L2783-L2800, tests/test_had_dual_knob_deprecation.py:L695-L719
  • Methodology cross-check passed: the Methodology Registry documents this PR as an API consolidation with unchanged HAD estimator/pretest back ends, and the code matches that contract by rebinding survey_design= to the existing survey/weights paths before the established weighting/bootstrap logic runs. docs/methodology/REGISTRY.md:L2350-L2351, diff_diff/had.py:L2897-L2928, diff_diff/had_pretests.py:L1294-L1328, diff_diff/had_pretests.py:L1563-L1604, diff_diff/had_pretests.py:L2040-L2073, diff_diff/had_pretests.py:L2722-L2755, diff_diff/had_pretests.py:L4107-L4144
  • The new test coverage closes the earlier surface gaps, including positional fit(), mass-point weighted fit, weighted event-study fit, and workflow event-study parity. tests/test_had_dual_knob_deprecation.py:L695-L719, tests/test_had_dual_knob_deprecation.py:L863-L1032
  • No new P0/P1 issues found in estimator math, weighting, variance/SE, identification checks, or default behaviors.
  • One minor documentation drift remains in fit()’s docstring: it still describes deprecated aliases as keyword-only and omits survey_design= from the cband paragraph. diff_diff/had.py:L2864-L2883
  • I could not execute the test suite here because the environment lacks numpy, so the test assessment below is from static inspection only.

Methodology

  • No findings. Affected methods are HeterogeneousAdoptionDiD.fit, did_had_pretest_workflow, qug_test, stute_test, yatchew_hr_test, stute_joint_pretest, joint_pretrends_test, and joint_homogeneity_test. The registry explicitly treats this as a canonical-kwarg/deprecation change, not a change to the underlying estimators or inference, and the implementation follows that design. docs/methodology/REGISTRY.md:L2350-L2450, diff_diff/had.py:L2897-L2928, diff_diff/had_pretests.py:L1294-L1328, diff_diff/had_pretests.py:L1563-L1604, diff_diff/had_pretests.py:L2040-L2073, diff_diff/had_pretests.py:L2722-L2755, diff_diff/had_pretests.py:L4107-L4144

Code Quality

  • No findings. The prior compatibility blocker is fixed by restoring positional compatibility in fit() and adding a regression for the legacy call shape. diff_diff/had.py:L2783-L2800, tests/test_had_dual_knob_deprecation.py:L695-L719

Performance

  • No findings. The PR stays in argument-validation / alias-routing territory and does not materially alter the hot estimator or bootstrap loops.

Maintainability

  • No findings. Centralizing mutex/deprecation strings and promoting make_pweight_design reduces duplicated API glue across the eight HAD surfaces. diff_diff/survey.py:L681-L791

Tech Debt

  • Severity: P3
    Impact: The remaining alias-removal cleanup is properly tracked and therefore non-blocking for this release. TODO.md:L102-L102
    Concrete fix: None in this PR; remove survey= / weights= and collapse the legacy routing in the next minor as already tracked.

Security

  • No findings.

Documentation/Tests

  • Severity: P3
    Impact: HeterogeneousAdoptionDiD.fit()’s updated docstring still says deprecated survey / weights are “keyword-only”, and its cband text only mentions the old survey= / weights= knobs even though the canonical public surface is now survey_design=. Runtime behavior is correct, but generated help text slightly misstates the API. diff_diff/had.py:L2864-L2883
    Concrete fix: Update that docstring to say survey / weights remain positional-or-keyword deprecated aliases for one minor cycle, and explicitly mention that cband applies on the weighted event-study survey_design= path too.
  • No blocking test gap found in the diff. The new regression file covers the previous re-review gaps, including positional fit() compatibility and the mass-point / event-study / workflow survey-design paths. tests/test_had_dual_knob_deprecation.py:L695-L719, tests/test_had_dual_knob_deprecation.py:L863-L1032
  • I could not execute the suite here because the environment is missing numpy.

R5 P3: HAD.fit() docstring still described `survey` and `weights` as
"keyword-only" (they're positional-or-keyword after the R4 P1 fix that
restored back-compat) and the `cband` paragraph only mentioned the old
`survey=` / `weights=` kwargs (omitted the canonical `survey_design=`).
Runtime behavior was correct; only the generated help text had drifted.

- Drop "keyword-only" qualifier from `survey` / `weights` parameter lines
  in the docstring; add note that they remain positional-or-keyword for
  one minor cycle for back-compat.
- Update `cband` paragraph to cover all three of `survey_design=` /
  `survey=` / `weights=` (was: only the deprecated two).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@igerber
Owner Author

igerber commented Apr 25, 2026

/ai-review

@github-actions

🔁 AI review rerun (requested by @igerber)

Head SHA: 86ca61c35d040cf685b99e98648bce47381f56c9


Overall Assessment

Looks good

Executive Summary

  • Re-review focus passed: the prior fit() compatibility/doc drift concern appears resolved. HeterogeneousAdoptionDiD.fit() now preserves the legacy positional survey / weights / cband call shape, and the new regression file locks that behavior. diff_diff/had.py:L2783-L2800, tests/test_had_dual_knob_deprecation.py:L695-L719
  • Methodology cross-check passed. The registry documents this PR as an API consolidation with unchanged HAD estimator/pretest back ends, permanent QUG deferral under survey, and a data-in vs array-in surface split; the implementation matches that contract. docs/methodology/REGISTRY.md:L2350-L2351, docs/methodology/REGISTRY.md:L2429-L2448, diff_diff/had.py:L2900-L2931, diff_diff/had_pretests.py:L1294-L1328, diff_diff/had_pretests.py:L1563-L1604, diff_diff/had_pretests.py:L2040-L2073, diff_diff/had_pretests.py:L2722-L2755, diff_diff/had_pretests.py:L3358-L3377, diff_diff/had_pretests.py:L3640-L3659, diff_diff/had_pretests.py:L4107-L4144
  • I found no new P0/P1 issues in weighting, variance/SE, identification checks, or default behavior.
  • One minor documentation drift remains: several had_pretests.py Raises blocks still describe the old two-way survey/weights mutex and omit survey_design=. diff_diff/had_pretests.py:L1503-L1510, diff_diff/had_pretests.py:L1958-L1965, diff_diff/had_pretests.py:L4049-L4054
  • Static review only: I could not run the suite here because the environment is missing both numpy and pytest.

Methodology

  • No findings. The affected methods are HeterogeneousAdoptionDiD.fit, did_had_pretest_workflow, qug_test, stute_test, yatchew_hr_test, stute_joint_pretest, joint_pretrends_test, and joint_homogeneity_test; I did not find an undocumented change to estimator math, weighting, or inference. docs/methodology/REGISTRY.md:L2350-L2351, docs/methodology/REGISTRY.md:L2429-L2448

Code Quality

  • No findings.

Performance

  • No findings.

Maintainability

  • No findings.

Tech Debt

  • Severity: P3
    Impact: The next-minor cleanup to remove deprecated survey= / weights= aliases is explicitly tracked, so the deferred cleanup is non-blocking for this PR. TODO.md:L102-L102
    Concrete fix: None in this PR; remove the aliases in the next minor as already tracked.

Security

  • No findings.

Documentation/Tests

  • Severity: P3
    Impact: The Raises sections for stute_test, yatchew_hr_test, and did_had_pretest_workflow still document the pre-PR two-way survey/weights mutex instead of the new three-way survey_design / survey / weights surface, so generated help text is slightly out of sync with runtime behavior. diff_diff/had_pretests.py:L1503-L1510, diff_diff/had_pretests.py:L1958-L1965, diff_diff/had_pretests.py:L4049-L4054
    Concrete fix: Update those Raises blocks to describe the three-way mutex and the canonical survey_design= entry point.
  • No blocking test-gap finding from static inspection. The new regression file covers helper export/type-guard behavior, legacy positional fit() compatibility, and the expanded dispatch matrix including mass-point and event-study paths. tests/test_had_dual_knob_deprecation.py:L109-L260, tests/test_had_dual_knob_deprecation.py:L695-L1032
  • I could not execute the tests here because the environment lacks numpy and pytest.
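The three-way mutex that the Raises blocks are asked to document can be sketched as follows. This is a minimal illustration, not the code in diff_diff/had_pretests.py: the helper name `check_survey_kwarg_mutex` is hypothetical, and `ValueError` is an assumption because the thread does not name the exception type.

```python
def check_survey_kwarg_mutex(survey_design=None, survey=None, weights=None):
    """Hypothetical helper: allow at most one of the three survey knobs."""
    # Compare against None explicitly; array-like weights have no single truth value.
    supplied = [
        name
        for name, value in (
            ("survey_design", survey_design),
            ("survey", survey),
            ("weights", weights),
        )
        if value is not None
    ]
    if len(supplied) > 1:
        # ValueError is an assumption made for this sketch.
        raise ValueError(
            "Pass at most one of survey_design=, survey=, weights=; got "
            + ", ".join(f"{name}=" for name in supplied)
            + ". survey= and weights= are deprecated aliases of survey_design=."
        )
    # Return the name of the knob in use, or None for the unweighted call.
    return supplied[0] if supplied else None
```

A docstring that describes this contract would list all three kwargs in its Raises entry rather than only the legacy pair.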

R6 P3: Raises blocks on stute_test, yatchew_hr_test, and
did_had_pretest_workflow still described the pre-PR 2-way
`survey`/`weights` mutex. Updated to the 3-way `survey_design + survey +
weights` form, with `survey=` / `weights=` flagged as deprecated aliases.
Also added the TypeError raise to the array-in helpers' Raises blocks
(stute_test, yatchew_hr_test) since they reject SurveyDesign instances
on `survey_design=` (and equivalently on the deprecated `survey=` alias
after R1's guard reorder).

Runtime behavior was correct; only the generated help text had drifted.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@igerber
Owner Author

igerber commented Apr 25, 2026

/ai-review

@github-actions

🔁 AI review rerun (requested by @igerber)

Head SHA: 83329a2ef6639b6728dfc1befc3ecd17150409a5


Overall Assessment

✅ Looks good

Executive Summary

  • Re-review focus passed: the prior had_pretests.py Raises-block drift is fixed; the updated docstrings now describe the 3-way mutex and survey_design= surface. diff_diff/had_pretests.py:L1503-L1518, diff_diff/had_pretests.py:L1966-L1981, diff_diff/had_pretests.py:L4065-L4072
  • Methodology cross-check passed. The registry documents this as an API-only consolidation with unchanged HAD weighting/inference back ends, canonical survey_design=, documented QUG permanent rejection under survey, and documented data-in vs array-in split; the implementation matches that contract. docs/methodology/REGISTRY.md:L2350-L2351, docs/methodology/REGISTRY.md:L2429-L2450, diff_diff/had.py:L2895-L2931, diff_diff/had_pretests.py:L1294-L1328, diff_diff/had_pretests.py:L1571-L1612, diff_diff/had_pretests.py:L4125-L4408
  • I found no new P0/P1 issues in estimator math, weighting, variance/SE, identification checks, or default behavior.
  • One minor documentation drift remains: HeterogeneousAdoptionDiD.fit()’s survey_design parameter doc still reads as continuous-path-only even though mass-point support is documented and regression-tested. diff_diff/had.py:L2850-L2863, docs/methodology/REGISTRY.md:L2350-L2351, tests/test_had_dual_knob_deprecation.py:L855-L926
  • Static review only: I could not execute the test suite here because numpy, pandas, and pytest are not installed.

Methodology

  • No findings. The affected methods are HeterogeneousAdoptionDiD.fit, did_had_pretest_workflow, qug_test, stute_test, yatchew_hr_test, stute_joint_pretest, joint_pretrends_test, and joint_homogeneity_test; the code follows the documented API-only consolidation and keeps the previously documented weighting / variance behavior intact. docs/methodology/REGISTRY.md:L2350-L2351, docs/methodology/REGISTRY.md:L2429-L2450

Code Quality

  • No findings.

Performance

  • No findings.

Maintainability

  • No findings.

Tech Debt

  • Severity: P3
    Impact: The alias-removal and routing unification work is intentionally deferred, but it is explicitly tracked for the next minor release, so it is non-blocking for this PR. TODO.md:L102-L102
    Concrete fix: None in this PR; complete the tracked cleanup in the next minor.

Security

  • No findings.

Documentation/Tests

  • Severity: P3
    Impact: HeterogeneousAdoptionDiD.fit() now presents survey_design= as the canonical public kwarg, but its parameter doc still says survey design-based inference is for the two continuous paths only. That understates actual mass-point support on both overall and event_study, which the registry and new regression tests now cover. diff_diff/had.py:L2850-L2863, docs/methodology/REGISTRY.md:L2350-L2351, tests/test_had_dual_knob_deprecation.py:L855-L926
    Concrete fix: Update that parameter docstring to mention mass-point support and its variance behavior, or point readers directly to the registry note.
  • No blocking test-gap finding from static inspection. The new regression file covers helper export/validation, array-in type guards, positional fit() compatibility, mass-point and event-study front doors, and the nested-warning suppression case. tests/test_had_dual_knob_deprecation.py:L109-L284, tests/test_had_dual_knob_deprecation.py:L660-L792, tests/test_had_dual_knob_deprecation.py:L855-L1032

R7 P3: HAD.fit()'s `survey_design` parameter docstring was still scoped to
"the two continuous-dose paths" even though Phase 4.5 B added both
mass-point support and event-study survey composition (each with
test coverage in TestHADFitMassPointSurveyDesign and
TestHADFitEventStudySurveyDesign). Widened the description to cover the
full dispatch matrix: continuous × {overall, event_study} + mass_point ×
{overall, event_study}. Notes mass-point's vcov_type='hc1' requirement,
event-study's cband=True simultaneous CI, and points readers to
REGISTRY.md for the full matrix.

Proactive sweep (per user request): also updated
HeterogeneousAdoptionDiDEventStudyResults.variance_formula's class
docstring to clarify that the "weights= shortcut" / "survey= path"
labels refer to internal variance-source families (still accurate
internally) — added explicit "including via the deprecated weights=
alias" / "via survey_design= or the deprecated survey= alias" so the
field-level help text agrees with the consolidation.

Other surfaces audited (no drift found): did_had_pretest_workflow,
joint_pretrends_test, joint_homogeneity_test, qug_test, stute_test,
yatchew_hr_test, stute_joint_pretest survey_design= docstrings; all
already align with the canonical kwarg + 3-way mutex contract. Internal
back-end comments using "weights= shortcut" / "survey= path" describe
the (unchanged) routing mechanism; left as-is.

551 tests pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@igerber
Owner Author

igerber commented Apr 25, 2026

/ai-review

@github-actions

🔁 AI review rerun (requested by @igerber)

Head SHA: 41d7d45e12868db809e5e78d82fcb4a241a35b3e


Overall Assessment

⚠️ Needs changes

Executive Summary

  • Re-review focus mostly passed: the prior docstring drift on HeterogeneousAdoptionDiD.fit() is fixed, and the revised survey_design= docs now match the methodology registry on mass-point and event-study support. diff_diff/had.py:L2853-L2891, docs/methodology/REGISTRY.md:L2352-L2353
  • Methodology cross-check passed. I found no undocumented change to HAD estimator math, weighting, variance/SE, or identification assumptions; this PR remains an API-surface consolidation/deprecation cycle as documented in the registry. docs/methodology/REGISTRY.md:L2352-L2450
  • Severity P1 [Newly identified]: HeterogeneousAdoptionDiD.fit() does not enforce the documented data-in type split for survey_design=. Passing the new make_pweight_design(...)/ResolvedSurveyDesign into fit() falls through to low-level aggregate-dependent errors instead of a front-door TypeError. diff_diff/had.py:L2918-L2944, diff_diff/had.py:L3011-L3057, diff_diff/had.py:L3815-L3892
  • The new regression file covers alias parity and the array-in type guard well, but it does not lock the corresponding data-in misuse case on fit(). tests/test_had_dual_knob_deprecation.py:L181-L250, tests/test_had_dual_knob_deprecation.py:L661-L719, tests/test_had_dual_knob_deprecation.py:L893-L926
  • Static review only: I could not run the test suite here because numpy, pandas, and pytest are not installed.

Methodology

No findings. The affected methods are HeterogeneousAdoptionDiD.fit, did_had_pretest_workflow, qug_test, stute_test, yatchew_hr_test, stute_joint_pretest, joint_pretrends_test, and joint_homogeneity_test. The registry explicitly documents the canonical survey_design= surface, deprecated aliases, QUG’s permanent survey rejection, and unchanged weighting/variance back ends, and the modified docstrings align with that contract. docs/methodology/REGISTRY.md:L2352-L2450, diff_diff/had.py:L2853-L2891

Code Quality

  • Severity: P1 [Newly identified]
    Impact: The PR introduces a surface split where data-in APIs take SurveyDesign and array-in APIs take pre-resolved designs, and it adds explicit type guards for the array-in helpers. HeterogeneousAdoptionDiD.fit() does not add the matching data-in guard. After the alias rebinding, the overall path assumes survey.weights is a column name, while the event-study path unconditionally calls survey.resolve(data_filtered), so fit(..., survey_design=make_pweight_design(w)) will fall through to low-level failures instead of the documented front-door contract. This is a production-facing edge-case bug on the new public API surface. diff_diff/had_pretests.py:L1571-L1605, diff_diff/had_pretests.py:L2056-L2085, diff_diff/had_pretests.py:L2738-L2767, diff_diff/had.py:L2918-L2944, diff_diff/had.py:L3011-L3057, diff_diff/had.py:L3815-L3892, docs/methodology/REGISTRY.md:L2352-L2353
    Concrete fix: After the deprecation rebinding in fit(), reject non-SurveyDesign inputs on the data-in surface with a clear TypeError that points users to survey_design=SurveyDesign(weights='col', ...) and reserves make_pweight_design(...) for the array-in helpers. Reusing the existing shared survey validator in diff_diff/survey.py:L1202-L1213 would align HAD with the rest of the codebase. Add regression tests for both aggregate="overall" and aggregate="event_study" covering survey_design=make_pweight_design(w) and the deprecated survey=make_pweight_design(w) alias.

Performance

No findings.

Maintainability

No findings.

Tech Debt

  • Severity: P3
    Impact: The alias-removal and deeper routing cleanup are intentionally deferred, but that work is already tracked in TODO.md, so it is non-blocking for this PR. TODO.md:L102-L102
    Concrete fix: None in this PR; complete the tracked next-minor cleanup.

Security

No findings.

Documentation/Tests

  • Severity: P3
    Impact: The new deprecation suite explicitly tests the array-in wrong-type guard, but it does not add the corresponding negative coverage for the data-in HeterogeneousAdoptionDiD.fit() surface, which is why the P1 above is still unpinned. tests/test_had_dual_knob_deprecation.py:L181-L250, tests/test_had_dual_knob_deprecation.py:L661-L719, tests/test_had_dual_knob_deprecation.py:L893-L926
    Concrete fix: Add negative tests asserting a clear TypeError for fit(..., survey_design=make_pweight_design(...)) and fit(..., survey=make_pweight_design(...)) on both aggregates.

Path to Approval

  1. Add a front-door type guard in HeterogeneousAdoptionDiD.fit() so the data-in survey_design=/survey= surface accepts only SurveyDesign-like inputs with .resolve, and rejects ResolvedSurveyDesign/make_pweight_design(...) with migration guidance.
  2. Add regression tests for the four misuse cases above: aggregate="overall" and "event_study", each via both survey_design= and deprecated survey=.
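The front-door guard requested in step 1 might look like the sketch below. It is an illustration under stated assumptions: `SurveyDesignStub` and `ResolvedDesignStub` are stand-ins for the library's `SurveyDesign` and `ResolvedSurveyDesign`, `guard_data_in_design` is a hypothetical name, and the error wording is invented for the example.

```python
class SurveyDesignStub:
    """Stand-in for a data-in design: stores a column name, resolves later."""

    def __init__(self, weights):
        self.weights = weights  # a column name such as "w"

    def resolve(self, data):
        # Look the weights column up in the data only at resolve time.
        return ResolvedDesignStub(list(data[self.weights]))


class ResolvedDesignStub:
    """Stand-in for a pre-resolved design: stores weights, has no .resolve()."""

    def __init__(self, weights):
        self.weights = weights  # the weights themselves, not a column name


def guard_data_in_design(survey_design):
    """Hypothetical guard: reject pre-resolved designs on a data-in surface."""
    if survey_design is None:
        return None
    if not hasattr(survey_design, "resolve"):
        raise TypeError(
            "survey_design= on a data-in surface must be a SurveyDesign, e.g. "
            "survey_design=SurveyDesign(weights='col_name'); got "
            f"{type(survey_design).__name__}. Pre-resolved designs such as "
            "make_pweight_design(arr) belong on the array-in pretest helpers."
        )
    return survey_design
```

Running the guard after the alias rebinding means one check covers both the canonical kwarg and the deprecated `survey=` alias, which is why the four misuse cases in step 2 can share one expected error.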

R8 P1: HAD.fit() lacked the data-in symmetric type guard that was already
present on the array-in pretest helpers. Result: passing
`survey_design=make_pweight_design(arr)` (or the deprecated
`survey=make_pweight_design(arr)` alias) to `fit()` would fall through to
low-level errors -- `survey.resolve(data)` AttributeError on event-study,
or `survey.weights` (a numpy array on Resolved) misinterpreted as a
column name on overall.

Fix: after the alias rebinding in `fit()`, reject any non-`SurveyDesign`-
like input (no `.resolve()` method) with a clear TypeError that points
users to `survey_design=SurveyDesign(weights='col_name', ...)` for the
data-in surface and reserves `make_pweight_design(arr)` for the array-in
pretest helpers. Mirrors the array-in helpers'
`isinstance(survey_design, SurveyDesign)` rejection — the data-in /
array-in surface split is now symmetric on type guards in both
directions.

The `did_had_pretest_workflow` and joint data-in wrappers already had
this protection via `_resolve_pretest_unit_weights`'s
`if not hasattr(survey, "resolve"): raise TypeError(...)` check; HAD.fit
was the missing surface.

4 new regression tests in TestHADFitDeprecation cover the misuse cases
the reviewer named: `survey_design=ResolvedSurveyDesign` and
`survey=ResolvedSurveyDesign` on both `aggregate="overall"` and
`aggregate="event_study"`.

Updated the existing test_survey_bad_type_raises regex from
"SurveyDesign-like" → "SurveyDesign" since the new guard fires before the
prior hasattr-weights check (both messages contain "SurveyDesign", so the
broader regex matches both).

555 tests pass (551 previously, plus 4 new R8 P1 regression tests).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@igerber
Owner Author

igerber commented Apr 25, 2026

/ai-review

@github-actions

🔁 AI review rerun (requested by @igerber)

Head SHA: 9c0d74266894b26eb9cac9693fe525e2741fd898


Overall Assessment

✅ Looks good

Executive Summary

  • Prior P1 is resolved: HeterogeneousAdoptionDiD.fit() now explicitly rejects pre-resolved survey objects on the data-in surface, and the new tests cover overall/event-study plus deprecated-alias variants. diff_diff/had.py:L2908-L2968, tests/test_had_dual_knob_deprecation.py:L695-L766
  • Methodology cross-check passed. The registry documents this PR as a HAD survey-API consolidation with unchanged estimator, weighting, variance, and identification back ends; I found no undocumented methodology drift. docs/methodology/REGISTRY.md:L2352-L2353, docs/methodology/REGISTRY.md:L2437-L2448
  • The new public make_pweight_design() helper matches the documented contract and top-level export plan, including 1-D front-door validation and retention of the private alias. diff_diff/survey.py:L681-L792, diff_diff/__init__.py:L149-L155, diff_diff/__init__.py:L444-L449
  • Severity: P3. The data-in pretest wrappers still reject canonical survey_design= misuse through a helper error that says survey= must be a SurveyDesign, so the canonical kwarg name is not surfaced consistently outside HeterogeneousAdoptionDiD.fit(). diff_diff/had_pretests.py:L3264-L3269, diff_diff/had_pretests.py:L3374-L3393, diff_diff/had_pretests.py:L3656-L3675, diff_diff/had_pretests.py:L4125-L4160
  • Severity: P3. The new deprecation suite does not yet lock legacy-vs-canonical parity on joint_pretrends_test, joint_homogeneity_test, or did_had_pretest_workflow(..., aggregate="overall"). tests/test_had_dual_knob_deprecation.py:L540-L606, tests/test_had_dual_knob_deprecation.py:L609-L656, tests/test_had_dual_knob_deprecation.py:L795-L865
  • Static review only: I could not run pytest, and package imports were unavailable because numpy is not installed in this environment.

Methodology

No findings. The affected surfaces are the eight HAD survey-design entry points, and the registry’s API-consolidation note matches the implementation: canonical survey_design=, deprecated survey= / weights=, unchanged numerical back ends, array-in vs data-in split preserved, and QUG still permanently rejected under survey inputs. docs/methodology/REGISTRY.md:L2352-L2353, docs/methodology/REGISTRY.md:L2431-L2448

Code Quality

  • Severity: P3
    Impact: did_had_pretest_workflow, joint_pretrends_test, and joint_homogeneity_test enforce the new data-in contract indirectly through _resolve_pretest_unit_weights(). If a caller misuses canonical survey_design= with make_pweight_design(...) / ResolvedSurveyDesign, the TypeError still says survey= must be a SurveyDesign, which is inconsistent with the new public surface and less clear than the dedicated HeterogeneousAdoptionDiD.fit() guard. diff_diff/had_pretests.py:L3264-L3269, diff_diff/had.py:L2946-L2967
    Concrete fix: add the same explicit front-door type guard used in HeterogeneousAdoptionDiD.fit(), or parameterize _resolve_pretest_unit_weights() so the error names survey_design= and points array-in callers to make_pweight_design(...).

Performance

No findings.

Maintainability

No findings.

Tech Debt

  • Severity: P3
    Impact: alias removal and full cleanup of the legacy weights= routing are intentionally deferred, but the PR correctly tracks that work in TODO.md, so it is non-blocking. TODO.md:L102-L102
    Concrete fix: none in this PR; complete the tracked next-minor cleanup.

Security

No findings.

Documentation/Tests

  • Severity: P3
    Impact: the deprecation suite covers smoke/warning/mutex behavior for joint_pretrends_test, joint_homogeneity_test, and the overall workflow, but unlike the other surfaces it does not assert that deprecated survey= / weights= remain numerically identical to canonical survey_design= there. For an API-only consolidation, those are the missing regression locks most likely to catch future rebinding drift. tests/test_had_dual_knob_deprecation.py:L540-L606, tests/test_had_dual_knob_deprecation.py:L609-L656, tests/test_had_dual_knob_deprecation.py:L795-L865
    Concrete fix: add parity tests for survey_design=SurveyDesign(weights="w") vs deprecated survey=SurveyDesign(weights="w"), and uniform-weight weights=np.ones(n) where applicable, on joint_pretrends_test, joint_homogeneity_test, and did_had_pretest_workflow(..., aggregate="overall").

R9 P3 #1 (helper error message canonical-kwarg consistency):
`_resolve_pretest_unit_weights`'s TypeError on non-`SurveyDesign`-like
input still said `survey=` must be a SurveyDesign — but on the data-in
wrappers (workflow / joint_pretrends_test / joint_homogeneity_test) the
canonical kwarg is now `survey_design=`. Updated the message to name
`survey_design=` (with `survey=` flagged as the deprecated alias) and
to point pre-resolved-design users to the array-in pretest helpers,
mirroring HAD.fit's data-in guard.

R9 P3 #2 (legacy-vs-canonical parity coverage on data-in pretests):
Added 3 parity tests (test_legacy_alias_parity_survey on
joint_pretrends_test + joint_homogeneity_test, plus
test_legacy_alias_parity_survey_overall on did_had_pretest_workflow
overall path). Locks the rebinding contract on the data-in surfaces
that previously only had smoke / warning / mutex coverage.

558 tests pass (555 previously, plus 3 new R9 P3 parity tests).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@igerber
Owner Author

igerber commented Apr 25, 2026

/ai-review

@github-actions

🔁 AI review rerun (requested by @igerber)

Head SHA: b2370229739cea9c69572c6abbcacf59ab2facba


Overall Assessment

✅ Looks good

No unmitigated P0/P1 findings.

Executive Summary

  • Prior P3 code-quality issue is resolved: the data-in pretest wrappers now reject pre-resolved inputs with a canonical survey_design= message and correct migration guidance. diff_diff/had_pretests.py:L3264-L3280
  • Methodology cross-check passed. The registry documents this as an API consolidation with unchanged back-end estimator, weighting, and variance behavior; QUG remains permanently deferred under survey, and the Stute/Yatchew survey mechanics still match the documented Phase 4.5 C note. docs/methodology/REGISTRY.md:L2352-L2353, docs/methodology/REGISTRY.md:L2431-L2448
  • Severity P3: qug_test still emits generic deprecation warnings that tell users to migrate to survey_design= / make_pweight_design(...), even though QUG has no supported survey-aware migration path. diff_diff/had_pretests.py:L1309-L1320, diff_diff/survey.py:L779-L792, docs/methodology/REGISTRY.md:L2353-L2353, docs/methodology/REGISTRY.md:L2431-L2437
  • Severity P3: the prior parity-test gap is only partially closed. The new suite adds survey= parity for joint_pretrends_test, joint_homogeneity_test, and did_had_pretest_workflow(..., aggregate="overall"), but weights= on those same surfaces is still warning-only with no direct legacy-vs-canonical parity lock. tests/test_had_dual_knob_deprecation.py:L557-L640, tests/test_had_dual_knob_deprecation.py:L661-L725, tests/test_had_dual_knob_deprecation.py:L882-L968
  • The next-minor alias cleanup is properly tracked in TODO.md. TODO.md:L102-L102
  • Static review only: pytest is not available in this environment.

Methodology

No findings. The affected methods are the HAD survey-design entry surfaces, and the implementation matches the registry’s consolidation note plus the unchanged QUG/Stute/Yatchew method notes. docs/methodology/REGISTRY.md:L2352-L2353, docs/methodology/REGISTRY.md:L2431-L2448

Code Quality

  • Severity: P3
    Impact: qug_test’s deprecation warnings contradict both the new registry text and the function’s own mutex/NotImplementedError: they tell users to migrate to survey_design= even though qug_test permanently rejects all survey-aware kwargs. That makes the deprecation guidance internally inconsistent. diff_diff/had_pretests.py:L1309-L1320, diff_diff/survey.py:L779-L792, docs/methodology/REGISTRY.md:L2353-L2353, docs/methodology/REGISTRY.md:L2431-L2437
    Concrete fix: give qug_test its own deprecation warning text for survey= / weights= that says the aliases are deprecated but survey-aware QUG remains unsupported, and point users to unweighted qug_test or did_had_pretest_workflow(..., survey_design=...) for the survey-aware linearity family.
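A QUG-specific warning of the kind suggested here could be sketched as below. The function `qug_test_sketch`, the constant name, and the message text are hypothetical; only the behavior (deprecated aliases warn, and every survey-aware kwarg is rejected with NotImplementedError) is taken from the review text. The return value is a placeholder, not the QUG statistic.

```python
import warnings

# Hypothetical wording: names the deprecation without advertising a migration
# target that qug_test would reject anyway.
QUG_ALIAS_DEPRECATION_MSG = (
    "survey= and weights= are deprecated on qug_test, and survey-aware QUG "
    "remains unsupported. Call qug_test without survey kwargs, or use "
    "did_had_pretest_workflow(..., survey_design=...) for the survey-aware "
    "linearity tests."
)


def qug_test_sketch(d, survey_design=None, survey=None, weights=None):
    """Illustrative front door only; the real test statistic is not computed."""
    if survey is not None or weights is not None:
        warnings.warn(QUG_ALIAS_DEPRECATION_MSG, DeprecationWarning, stacklevel=2)
    if survey_design is not None or survey is not None or weights is not None:
        raise NotImplementedError("qug_test does not support survey-aware inputs.")
    return min(d)  # placeholder result for the unweighted path
```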

Performance

No findings.

Maintainability

No findings.

Tech Debt

No findings. The deprecated-alias removal / legacy-path folding is already tracked in TODO.md:L102-L102.

Security

No findings.

Documentation/Tests

  • Severity: P3
    Impact: The earlier test-gap finding is only partially resolved. The new tests lock survey= parity on the joint data-in wrappers and the overall workflow, but the corresponding weights= paths still only assert warning emission, not numerical parity against canonical survey_design=. That leaves those rebinding paths without the same regression protection now present on other surfaces. tests/test_had_dual_knob_deprecation.py:L557-L640, tests/test_had_dual_knob_deprecation.py:L661-L725, tests/test_had_dual_knob_deprecation.py:L882-L968
    Concrete fix: add direct parity tests comparing weights=np.ones(n) to survey_design=SurveyDesign(weights="w") for joint_pretrends_test, joint_homogeneity_test, and did_had_pretest_workflow(..., aggregate="overall"), asserting identical cvm_stat_joint / p_value and workflow stute / yatchew statistics.
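The shape of such a parity lock can be illustrated with a toy statistic. Everything below is a stand-in: `toy_stat` is a weighted mean rather than a Stute or Yatchew statistic, the two entry functions are hypothetical, and plain dicts and lists replace the DataFrame and NumPy inputs. The mean=1 normalization mirrors the normalization step named in the PR summary.

```python
import math


def normalize_mean_one(weights):
    """Rescale weights so that their mean is 1."""
    mean = math.fsum(weights) / len(weights)
    return [w / mean for w in weights]


def toy_stat(y, weights):
    """Toy weighted mean standing in for a pretest statistic."""
    w = normalize_mean_one(weights)
    return math.fsum(wi * yi for wi, yi in zip(w, y)) / math.fsum(w)


def stat_via_weights_alias(data, weights):
    # Deprecated-style entry: the caller passes the weights directly.
    return toy_stat(data["y"], weights)


def stat_via_survey_design(data, weights_col):
    # Canonical-style entry: the weights are looked up by column name.
    return toy_stat(data["y"], data[weights_col])


data = {"y": [1.0, 2.0, 4.0], "w": [1.0, 1.0, 1.0]}
legacy = stat_via_weights_alias(data, [1.0, 1.0, 1.0])
canonical = stat_via_survey_design(data, "w")
```

With uniform weights both entries feed identical inputs to the same kernel, so a parity test can assert exact equality of `legacy` and `canonical` rather than a tolerance.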

R10 P3 #1 (qug_test deprecation warning text): qug_test was using the
shared array-in deprecation messages that point users to migrate to
`survey_design=` / `make_pweight_design(arr)`, but qug_test permanently
rejects ALL survey-aware kwargs (Phase 4.5 C0 deferral). Replaced with
qug-specific warning text that says the aliases are deprecated AND
that survey-aware QUG remains unsupported, pointing users to
`did_had_pretest_workflow(..., survey_design=...)` for the survey-aware
linearity family instead.

R10 P3 #2 (weights= parity tests on data-in wrappers): the previous
round added survey= parity for joint_pretrends_test,
joint_homogeneity_test, and did_had_pretest_workflow(aggregate='overall')
but left the weights= rebinding paths warning-only with no numerical
parity lock. Added 3 new tests:
test_legacy_alias_parity_weights (joint_pretrends_test +
joint_homogeneity_test) and test_legacy_alias_parity_weights_overall
(workflow). Each asserts that `weights=np.ones(n)` and
`survey_design=SurveyDesign(weights="w")` (uniform 1.0 column) produce
numerically identical output, locking the rebinding contract.

561 tests pass (558 previously, plus 3 new R10 P3 parity tests).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@igerber
Owner Author

igerber commented Apr 25, 2026

/ai-review

@github-actions

🔁 AI review rerun (requested by @igerber)

Head SHA: 236bdcebe6247775a3160d29918463981968f770


Overall Assessment

✅ Looks good

Executive Summary

  • Methodology cross-check passed for the 8 affected HAD entry surfaces: HeterogeneousAdoptionDiD.fit, did_had_pretest_workflow, qug_test, stute_test, yatchew_hr_test, stute_joint_pretest, joint_pretrends_test, and joint_homogeneity_test. The registry now documents this as an API consolidation with unchanged Phase 4.5 B/C/C0 estimator, weighting, and variance behavior underneath the new front door. docs/methodology/REGISTRY.md:L2352-L2353, docs/methodology/REGISTRY.md:L2431-L2452
  • Previous re-review P3 is resolved: qug_test now has qug-specific mutex/deprecation text and no longer advertises make_pweight_design(...) as a valid migration path for a permanently unsupported survey-QUG surface. diff_diff/had_pretests.py:L1294-L1377
  • Previous re-review P3 is resolved: direct weights= parity locks were added for joint_pretrends_test, joint_homogeneity_test, and did_had_pretest_workflow on the relevant overall/event-study paths. tests/test_had_dual_knob_deprecation.py:L608-L676, tests/test_had_dual_knob_deprecation.py:L729-L796, tests/test_had_dual_knob_deprecation.py:L1040-L1072, tests/test_had_dual_knob_deprecation.py:L1267-L1312
  • Severity P3 informational: the shared data-in weights= deprecation warning is still slightly misleading on HeterogeneousAdoptionDiD.fit, because the suggested migration changes the variance family today; the next-minor cleanup is already tracked in TODO.md. diff_diff/survey.py:L783-L787, TODO.md:L102-L102
  • Static review only: pytest is not installed in this environment, so I could not execute the suite.

Methodology

No findings. The PR rebadges the entry kwargs and adds front-door type/mutex handling, but the underlying methodology remains the documented split: QUG stays permanently rejected under survey; Stute/joint Stute keep the Phase 4.5 C PSU-level Mammen multiplier bootstrap; Yatchew keeps the documented weighted closed-form variance components. docs/methodology/REGISTRY.md:L2352-L2353, docs/methodology/REGISTRY.md:L2431-L2452, diff_diff/had.py:L2908-L2968, diff_diff/had_pretests.py:L1595-L1636, diff_diff/had_pretests.py:L2080-L2112, diff_diff/had_pretests.py:L2762-L2795, diff_diff/had_pretests.py:L4162-L4199

Code Quality

No findings.

Performance

No findings.

Maintainability

No findings.

Tech Debt

  • Severity: P3
    Impact: HeterogeneousAdoptionDiD.fit(weights=...) emits a deprecation warning that points users to survey_design=SurveyDesign(weights='col_name'), but that migration is not inference-equivalent today: the deprecated shortcut still uses variance_formula="pweight" / "pweight_2sls", while the canonical survey-design path uses survey_binder_tsl / survey_binder_tsl_2sls. This is already tracked for the next minor cleanup, so it is informational rather than blocking. diff_diff/survey.py:L783-L787, diff_diff/had.py:L2882-L2891, diff_diff/had.py:L3514-L3576, diff_diff/had.py:L4402-L4433, tests/test_had.py:L3314-L3335, TODO.md:L102-L102
    Concrete fix: split the shared data-in weights= deprecation message so HeterogeneousAdoptionDiD.fit explicitly states that survey_design=SurveyDesign(weights='col') is the long-term API but that current releases preserve a distinct pweight-shortcut variance family until the TODO-tracked unification lands.

Security

No findings.

Documentation/Tests

No findings. The earlier re-review test-gap items are now closed by dedicated parity coverage on the joint wrappers and workflow, plus a warning-count lock on the workflow event-study weights= path. tests/test_had_dual_knob_deprecation.py:L608-L676, tests/test_had_dual_knob_deprecation.py:L729-L796, tests/test_had_dual_knob_deprecation.py:L1040-L1072, tests/test_had_dual_knob_deprecation.py:L1267-L1312
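The idea behind a warning-count lock can be sketched as follows. `inner_pretest`, `workflow`, and `count_deprecations` are hypothetical stand-ins rather than the library's functions; the sketch only assumes that an outer entry point warns once and silences the duplicate warning of a nested helper that still receives the legacy kwarg.

```python
import warnings


def inner_pretest(weights=None):
    # Hypothetical array-in helper that still accepts the deprecated kwarg.
    if weights is not None:
        warnings.warn("weights= is deprecated", DeprecationWarning, stacklevel=2)
    return "ok"


def workflow(weights=None):
    # Hypothetical data-in entry point: warn once at the front door...
    if weights is not None:
        warnings.warn("weights= is deprecated", DeprecationWarning, stacklevel=2)
    # ...then suppress the nested helper's duplicate warning.
    with warnings.catch_warnings():
        warnings.simplefilter("ignore", DeprecationWarning)
        return inner_pretest(weights=weights)


def count_deprecations(func, **kwargs):
    """Call func and return how many DeprecationWarnings it emitted."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        func(**kwargs)
    return sum(issubclass(w.category, DeprecationWarning) for w in caught)
```

A test built on `count_deprecations` can then assert that the outer call emits exactly one warning, which would fail if the nested suppression were ever removed.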

R11 P3 (informational): HAD.fit's `weights=` deprecation message used the
shared HAD_DEPRECATION_MSG_WEIGHTS_KWARG_DATA_IN constant, which suggests
migrating to `survey_design=SurveyDesign(weights='col_name')`. On the
workflow + joint data-in wrappers that migration is inference-equivalent
(both go through _resolve_pretest_unit_weights to per-unit weights, then
through the shared Stute / Yatchew kernels). On HAD.fit it is NOT
inference-equivalent today: the deprecated `weights=arr` shortcut keeps
`variance_formula="pweight"` / `"pweight_2sls"` (CCT-2014 weighted-robust
/ 2SLS pweight-sandwich), while `survey_design=SurveyDesign(weights=col)`
composes Binder-TSL (`"survey_binder_tsl"` / `"survey_binder_tsl_2sls"`).
Following the migration changes the SE family — the long-term unification
is tracked in TODO row 102 for the next minor.

Fix: add a HAD.fit-specific HAD_DEPRECATION_MSG_WEIGHTS_KWARG_HAD_FIT
constant that says the long-term API is still survey_design= but
explicitly notes the SE-family caveat applies in the current release.
HAD.fit consumes this new constant; the workflow + joint wrappers keep
the original HAD_DEPRECATION_MSG_WEIGHTS_KWARG_DATA_IN (no SE-family
divergence on those surfaces).

561 tests pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@igerber
Owner Author

igerber commented Apr 25, 2026

/ai-review

@github-actions

🔁 AI review rerun (requested by @igerber)

Head SHA: 6fb44e6aaf505ed9825902fb5964838f987e54d4


Overall Assessment

✅ Looks good

Static review only: I could not execute the test suite in this environment because runtime deps are missing (pytest and numpy were unavailable).

Executive Summary

  • No P0/P1 findings. The changed surfaces are API/front-door consolidations; the registry and code both preserve the underlying methodology: QUG remains permanently unsupported under survey, the Stute family still uses PSU-level Mammen multipliers, Yatchew still uses the documented weighted closed-form variance components, and HeterogeneousAdoptionDiD.fit keeps its existing pweight-vs-Binder split. docs/methodology/REGISTRY.md:L2352-L2353, docs/methodology/REGISTRY.md:L2431-L2445, diff_diff/had.py:L2918-L2945, diff_diff/had_pretests.py:L1294-L1352, diff_diff/had_pretests.py:L1595-L1636, diff_diff/had_pretests.py:L2080-L2113, diff_diff/had_pretests.py:L2762-L2795
  • The prior re-review items appear addressed: qug_test now has qug-specific runtime messaging, alias-parity coverage was added across the joint wrappers/workflow, and the HeterogeneousAdoptionDiD.fit(weights=...) warning now explicitly discloses the current SE-family caveat while the cleanup is tracked in TODO.md. diff_diff/had_pretests.py:L1294-L1345, tests/test_had_dual_knob_deprecation.py:L608-L676, tests/test_had_dual_knob_deprecation.py:L729-L796, tests/test_had_dual_knob_deprecation.py:L1006-L1072, tests/test_had_dual_knob_deprecation.py:L1231-L1312, diff_diff/survey.py:L788-L809, TODO.md:L102-L102
  • One non-blocking documentation inconsistency remains: qug_test prose still partially implies a migration to survey_design= / make_pweight_design(...), but runtime and the registry say there is no migration path because all survey-aware QUG kwargs are permanently rejected. diff_diff/had_pretests.py:L1236-L1245, diff_diff/had_pretests.py:L1351-L1367, CHANGELOG.md:L11-L14, docs/methodology/REGISTRY.md:L2352-L2353, docs/methodology/REGISTRY.md:L2431-L2437

Methodology

  • No findings. The PR’s registry note documents this as an API consolidation with unchanged backend behavior, and the touched estimator/pretest code still routes into the pre-existing methodology-specific kernels rather than changing estimator math, weighting, or variance formulas. docs/methodology/REGISTRY.md:L2352-L2353, docs/methodology/REGISTRY.md:L2437-L2445, diff_diff/had.py:L2918-L2945, diff_diff/had_pretests.py:L1595-L1636, diff_diff/had_pretests.py:L2080-L2113, diff_diff/had_pretests.py:L2762-L2795

Code Quality

  • No findings.

Performance

  • No findings.

Maintainability

  • No findings.

Tech Debt

  • No findings. The next-minor cleanup for removing the deprecated knobs and folding the legacy back-end routing is explicitly tracked in TODO.md, so the remaining duplication is properly deferred rather than silently accruing. TODO.md:L102-L102

Security

  • No findings.

Documentation/Tests

  • Severity P3 | Impact: qug_test documentation still describes weights= as a deprecated alias of survey_design=make_pweight_design(arr), and the changelog’s deprecation bullet broadly says the pretest kwargs are deprecated “in favor of” survey_design=. That is inconsistent with the actual contract: qug_test permanently rejects all non-None survey_design / survey / weights inputs, with no supported migration path. diff_diff/had_pretests.py:L1236-L1245, diff_diff/had_pretests.py:L1351-L1367, CHANGELOG.md:L11-L14, docs/methodology/REGISTRY.md:L2352-L2353, docs/methodology/REGISTRY.md:L2431-L2437
    Concrete fix: update the qug_test parameter docs and the changelog deprecation note to carve out qug_test explicitly: the kwarg is surface-symmetric only, all non-None values still raise NotImplementedError, and make_pweight_design(...) is not a valid QUG migration target.
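The contract described here (a surface-symmetric kwarg where every non-None value is rejected) can be sketched as a front-door guard. `qug_test_guard` and its error wording are hypothetical, written only to illustrate the behavior the review describes; the real `qug_test` signature and message may differ.

```python
def qug_test_guard(survey_design=None, survey=None, weights=None):
    """Illustrative front-door check: reject every survey-aware input."""
    supplied = [
        name
        for name, value in (
            ("survey_design", survey_design),
            ("survey", survey),
            ("weights", weights),
        )
        if value is not None
    ]
    if supplied:
        # No migration target is offered: all three kwargs are rejected alike.
        raise NotImplementedError(
            "qug_test does not support survey-aware inputs "
            f"(got {', '.join(supplied)}); there is no migration path. "
            "Use did_had_pretest_workflow(..., survey_design=...) instead."
        )
```

Because the guard treats the canonical and deprecated names identically, documentation that calls `weights=` an alias of `survey_design=make_pweight_design(arr)` would promise a path that the runtime never allows.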

R12 P3: qug_test parameter docs and CHANGELOG deprecation entry still
described `weights=` as a deprecated alias of
`survey_design=make_pweight_design(arr)` -- but qug_test permanently
rejects all non-`None` survey_design / survey / weights values (Phase 4.5
C0 deferral). The deprecation on qug_test is kwarg-name-consolidation
only, NOT a migration path; `make_pweight_design(arr)` is not a valid
QUG migration target.

Updated:
- qug_test parameter docs (had_pretests.py:1236-1257) explicitly carve
  out the contract: the kwarg is surface-symmetric only, all non-None
  values still raise NotImplementedError, and there is no migration path.
- CHANGELOG Deprecated entry adds an explicit qug_test carve-out paragraph
  pointing users to did_had_pretest_workflow(..., survey_design=...) for
  survey-aware HAD pretesting (which skips QUG under survey).

The runtime warning text was already qug-specific from R10; this round
aligns the static docs with the runtime contract.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@igerber
Owner Author

igerber commented Apr 25, 2026

/ai-review

@github-actions

🔁 AI review rerun (requested by @igerber)

Head SHA: f869576500d655d8e3269c199e5d778ddb4973e3


Overall Assessment

✅ Looks good

Static review only: I could not execute the test suite here because pytest is not installed, and importing diff_diff in this environment also fails because numpy is unavailable.

Executive Summary

  • No unmitigated P0/P1 findings. The affected methods are the HAD survey/pretest entry points, and the registry now explicitly documents this PR as an API consolidation with unchanged backend methodology rather than a math/variance rewrite: docs/methodology/REGISTRY.md:L2352-L2452, diff_diff/had.py:L2908-L2968, diff_diff/had_pretests.py:L1605-L1646, diff_diff/had_pretests.py:L4172-L4455.
  • The prior P3 documentation concern around qug_test appears resolved. The changelog, registry, docstrings, warnings, and runtime error now consistently state that QUG remains permanently unsupported under survey/weights and that the survey-aware alternative is the workflow, not make_pweight_design(...): CHANGELOG.md:L13-L14, docs/methodology/REGISTRY.md:L2431-L2438, diff_diff/had_pretests.py:L1236-L1255, diff_diff/had_pretests.py:L1319-L1387.
  • Parameter propagation looks complete across all 8 affected surfaces. Data-in surfaces take SurveyDesign, array-in linearity helpers take pre-resolved ResolvedSurveyDesign, and HeterogeneousAdoptionDiD.fit now explicitly rejects pre-resolved designs on the data-in surface: docs/methodology/REGISTRY.md:L2353-L2353, diff_diff/had.py:L2946-L2968, diff_diff/had_pretests.py:L1605-L1646, diff_diff/had_pretests.py:L2090-L2123, diff_diff/had_pretests.py:L2772-L2805, diff_diff/had_pretests.py:L3298-L3315.
  • Test coverage is materially stronger than in the prior review: the new suite covers helper export/input guards, alias parity, type guards, positional back-compat on HAD.fit, event-study/mass-point dispatch, and nested-warning suppression on the workflow event-study path: tests/test_had_dual_knob_deprecation.py:L1-L22, tests/test_had_dual_knob_deprecation.py:L109-L220, tests/test_had_dual_knob_deprecation.py:L540-L676, tests/test_had_dual_knob_deprecation.py:L799-L920, tests/test_had_dual_knob_deprecation.py:L1006-L1072, tests/test_had_dual_knob_deprecation.py:L1075-L1312.
  • The remaining cleanup is already properly tracked for the next minor release, so it is informational only: TODO.md:L102-L102.
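The data-in versus array-in split noted in the parameter-propagation bullet can be illustrated with a small type-guard sketch. The two dataclasses are simplified stand-ins for the library's `SurveyDesign` and `ResolvedSurveyDesign`, the `check_*` helpers are hypothetical, and raising `TypeError` on the data-in side is an assumption.

```python
from dataclasses import dataclass


@dataclass
class SurveyDesign:  # stand-in: unresolved, refers to a column name
    weights: str


@dataclass
class ResolvedSurveyDesign:  # stand-in: already holds per-unit weights
    weights: list


def check_data_in(survey_design):
    """Data-in surfaces resolve against a DataFrame, so they need SurveyDesign."""
    if survey_design is not None and not isinstance(survey_design, SurveyDesign):
        raise TypeError(
            "data-in surface expects SurveyDesign, got "
            f"{type(survey_design).__name__}"
        )
    return survey_design


def check_array_in(survey_design):
    """Array-in helpers have no DataFrame, so they need a pre-resolved design."""
    if survey_design is not None and not isinstance(
        survey_design, ResolvedSurveyDesign
    ):
        raise TypeError(
            "array-in surface expects ResolvedSurveyDesign, got "
            f"{type(survey_design).__name__}"
        )
    return survey_design
```

Rejecting the wrong type at the front door keeps each surface's resolution step unambiguous: a column name is meaningless without data, and a pre-resolved design cannot be re-resolved against a frame.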

Methodology

  • No findings. Affected methods: HeterogeneousAdoptionDiD.fit, qug_test, stute_test, yatchew_hr_test, stute_joint_pretest, joint_pretrends_test, joint_homogeneity_test, and did_had_pretest_workflow. The registry’s new API-consolidation note matches the code, and I did not find an undocumented change to estimator math, weighting, identification, or variance/SE behavior: docs/methodology/REGISTRY.md:L2352-L2452, diff_diff/had.py:L2908-L2968, diff_diff/had_pretests.py:L1209-L1387, diff_diff/had_pretests.py:L1605-L1646, diff_diff/had_pretests.py:L2090-L2123, diff_diff/had_pretests.py:L2772-L2805, diff_diff/had_pretests.py:L4172-L4455.

Code Quality

  • No findings.

Performance

  • No findings.

Maintainability

  • No findings.

Tech Debt

  • Severity P3 | Impact: The deprecated survey= / weights= shims and duplicate legacy routing remain until the next minor release, but this is explicitly tracked and does not create a current correctness issue. TODO.md:L102-L102, diff_diff/had.py:L2918-L2945, diff_diff/had_pretests.py:L1605-L1646, diff_diff/had_pretests.py:L4172-L4209
    Concrete fix: In the planned next-minor cleanup, remove the deprecated kwargs and collapse the remaining legacy survey= / weights= branches into the unified survey_design= path.
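A rough sketch of the kind of shim this cleanup would eventually delete. `resolve_survey_kwargs` is a hypothetical helper, `make_design` is a placeholder for `make_pweight_design`, and the use of `ValueError` for the mutex check is an assumption; none of this is the library's actual code.

```python
import warnings


def resolve_survey_kwargs(
    survey_design=None,
    survey=None,
    weights=None,
    make_design=lambda w: ("pweight", tuple(w)),  # placeholder factory
):
    """Map the three entry kwargs onto one survey_design value (illustrative)."""
    supplied = [v is not None for v in (survey_design, survey, weights)]
    if sum(supplied) > 1:
        # Three-way mutex: at most one of the kwargs may be non-None.
        raise ValueError("pass at most one of survey_design, survey, weights")
    if survey is not None:
        warnings.warn(
            "survey= is deprecated; use survey_design=",
            DeprecationWarning,
            stacklevel=3,
        )
        return survey
    if weights is not None:
        warnings.warn(
            "weights= is deprecated; use survey_design=",
            DeprecationWarning,
            stacklevel=3,
        )
        # Raw weights are wrapped without normalizing; the unified path
        # downstream is expected to normalize exactly once.
        return make_design(weights)
    return survey_design
```

Once the deprecated kwargs are removed, a helper like this collapses to returning `survey_design` unchanged, which is the simplification the tracked TODO row defers to the next minor release.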

Security

  • No findings.

Documentation/Tests

  • No findings. The prior qug_test documentation inconsistency looks resolved, and the new regression file substantially improves front-door coverage for the consolidated API: CHANGELOG.md:L13-L14, diff_diff/had_pretests.py:L1236-L1255, diff_diff/had_pretests.py:L1319-L1387, tests/test_had_dual_knob_deprecation.py:L1-L22, tests/test_had_dual_knob_deprecation.py:L1075-L1312.
  • Residual risk: I could not run the tests in this environment because pytest and numpy are unavailable, so this is a static review only.

@igerber igerber added the ready-for-ci label (Triggers CI test workflows) Apr 25, 2026
@igerber igerber merged commit 631bfc5 into main Apr 26, 2026
26 of 27 checks passed
@igerber igerber deleted the had-survey-design-consolidation branch April 26, 2026 00:41