Add yatchew_hr_test(null='mean_independence') mode by igerber · Pull Request #397 · igerber/diff-diff

igerber · 2026-04-26T20:28:31Z

Summary

Adds keyword-only null: Literal["linearity", "mean_independence"] = "linearity" to yatchew_hr_test. New mode mirrors R YatchewTest::yatchew_test(order=0) — fits intercept-only OLS so residuals are dy - mean(dy) instead of dy - (a + b·d). Downstream sigma2_diff / sigma2_W / sort-by-d machinery is shared between the two modes.
Wired into both unweighted and survey-weighted code paths (4-arm dispatch on (null × weighted)).
Closes the placebo Yatchew R-parity gap from PR HAD Phase 4: trends_lin (Eq 17/18) + R-package end-to-end parity #392 — tests/test_did_had_parity.py::TestYatchewParity now routes effect rows through null="linearity" and placebo rows through null="mean_independence". Both modes share the documented × G/(G-1) finite-sample convention shift and parity holds at atol=1e-10.
YatchewTestResults gained null_form: str = "linearity" field; summary() renders the correct null-hypothesis title; __repr__ and to_dict() updated.

Patch-level (additive keyword-only kwarg + additive dataclass field with default). Default null="linearity" is bit-exact backcompat across all existing tests.

Test plan

pytest tests/test_had_pretests.py::TestYatchewHRTest tests/test_had_pretests.py::TestYatchewHRTestSurvey tests/test_had_pretests.py::TestYatchewHRTestMeanIndependence tests/test_did_had_parity.py (58 tests)
pytest tests/test_had_pretests.py (full file regression — 237 tests)
pytest tests/test_had.py tests/test_had_dual_knob_deprecation.py (HAD estimator + deprecation regressions — 368 tests)
CI's pure-Python and Rust matrices

🤖 Generated with Claude Code

github-actions · 2026-04-26T20:33:03Z

Overall Assessment

✅ Looks good — no unmitigated P0 or P1 findings.

Executive Summary

Affected method: diff_diff.had_pretests.yatchew_hr_test. The PR adds an additive null switch and changes only residual construction; the variance-ratio core and existing guardrails remain intact (diff_diff/had_pretests.py:L2008-L2498).
The new null="mean_independence" mode is documented in the Methodology Registry as the R-placebo parity extension, so this is not an undocumented methodology deviation (docs/methodology/REGISTRY.md:L2502-L2504).
Default behavior stays on the paper’s linearity null, including the internal workflow caller, so existing post-treatment workflow semantics are preserved (diff_diff/had_pretests.py:L2046-L2056, diff_diff/had_pretests.py:L4895-L4903).
Result metadata is propagated consistently via null_form through the dataclass, summary, and dict export (diff_diff/had_pretests.py:L323-L430).
Test coverage is strong for the new branch: hand-computed unweighted/weighted baselines, tie/low-G/invalid-null edges, and end-to-end R parity routing for placebo vs. effect rows (tests/test_had_pretests.py:L3243-L3515, tests/test_did_had_parity.py:L360-L450).
Runtime validation was not executed in this sandbox because pytest and numpy are unavailable here.

Methodology

No findings. The affected estimator surface is yatchew_hr_test; the PR preserves the documented 2G/G-1 variance conventions and one-sided normal inference, and limits the new mode to an intercept-only residual definition that is explicitly registry-documented for R placebo parity (diff_diff/had_pretests.py:L2031-L2063, diff_diff/had_pretests.py:L2350-L2484, docs/methodology/REGISTRY.md:L2502-L2504).

Code Quality

No findings. null_form is threaded through every return path and every user-facing renderer/serializer, so the new public parameter does not leave stale or partially propagated result state (diff_diff/had_pretests.py:L380-L429, diff_diff/had_pretests.py:L2303-L2498).

Performance

No findings. The change adds only a small dispatch branch plus intercept-only helper(s); the asymptotic cost of the Yatchew path is unchanged (diff_diff/had_pretests.py:L1016-L1050, diff_diff/had_pretests.py:L2350-L2433).

Maintainability

No findings. The intercept-only mode is isolated in dedicated helpers and reuses the existing downstream variance code instead of duplicating formula logic (diff_diff/had_pretests.py:L1016-L1050, diff_diff/had_pretests.py:L2350-L2433).

Tech Debt

No findings. This PR closes a previously tracked follow-up rather than introducing new untracked deferred work.

Security

No findings. The diff is limited to numerical logic, documentation, and tests; it does not add new I/O, parsing, credential, or command-execution surface.

Documentation/Tests

No findings. The added tests cover the exact interaction points that mattered here: default back-compat, weighted/unweighted composition, duplicate-dose and low-G guards, and R parity for placebo/effect routing (tests/test_had_pretests.py:L3243-L3515, tests/test_did_had_parity.py:L360-L450).

igerber · 2026-04-26T20:50:48Z

/ai-review

github-actions · 2026-04-26T20:55:28Z

🔁 AI review rerun (requested by @igerber)

Head SHA: d7c45db722cc3e440c780372f59110e879151890

Overall Assessment

✅ Looks good

Executive Summary

Re-review scope is clean: the prior AI review had no P1+ findings, and I did not identify any new unmitigated P0/P1 issues in this PR.
Affected surface is diff_diff.had_pretests.yatchew_hr_test; the PR adds null={"linearity","mean_independence"} while preserving the default null="linearity" behavior and the workflow’s post-treatment call site (diff_diff/had_pretests.py:L2008-L2063, diff_diff/had_pretests.py:L4895-L4903).
Methodology-wise, null="mean_independence" is documented in the Methodology Registry as an R-parity extension, so this is not an undocumented deviation (docs/methodology/REGISTRY.md:L2502-L2502).
The implementation changes only residual construction; sigma2_diff, sigma2_W, low-G handling, tie handling, and weighted normalization remain shared across both nulls (diff_diff/had_pretests.py:L2350-L2498).
Parameter/result propagation is complete inside the changed surface: every YatchewTestResults return path now carries null_form, and summary() / to_dict() / __repr__ were updated (diff_diff/had_pretests.py:L323-L430, diff_diff/had_pretests.py:L2303-L2497).
Runtime verification was not possible here because this environment does not have numpy, scipy, pandas, or pytest installed.

Methodology

No findings. The new placebo-facing null is explicitly documented as an R-parity extension, and the workflow still uses the paper’s linearity null for the post-treatment Yatchew step (docs/methodology/REGISTRY.md:L2502-L2502, diff_diff/had_pretests.py:L4895-L4903).

Code Quality

No findings. Front-door validation and all YatchewTestResults(...) construction sites were updated consistently (diff_diff/had_pretests.py:L2176-L2180, diff_diff/had_pretests.py:L2303-L2497).

Performance

No findings. The change is a small dispatch plus two intercept-only helpers; asymptotic cost is unchanged (diff_diff/had_pretests.py:L1016-L1050, diff_diff/had_pretests.py:L2350-L2498).

Maintainability

No findings. The new mode is isolated in dedicated helpers and reuses the existing variance-ratio path instead of duplicating formula logic (diff_diff/had_pretests.py:L1016-L1050, diff_diff/had_pretests.py:L2350-L2498).

Tech Debt

No findings. The previously deferred mean_independence item is explicitly marked shipped rather than silently left open (TODO.md:L106-L106).

Security

No findings. The diff does not add new I/O, parsing, shelling, or secret-handling surface.

Documentation/Tests

Severity P3. Impact: the source-of-truth narrative is internally inconsistent. The new mode is documented in the parity note, but the main Yatchew algorithm block and some in-code docstrings still describe yatchew_hr_test as linearity-only, which can mislead future maintainers about what is paper-derived versus what is the documented R-parity extension (docs/methodology/REGISTRY.md:L2469-L2477, docs/methodology/REGISTRY.md:L2546-L2546, diff_diff/had_pretests.py:L14-L16, diff_diff/had_pretests.py:L323-L330). Concrete fix: update those narratives to describe yatchew_hr_test as a specification test with null={"linearity","mean_independence"}, and label mean_independence explicitly as the documented R-parity extension.
Severity P3. Impact: the new null_form propagation is only lightly regression-tested. Current tests cover arithmetic and the attribute itself, but not whether summary(), to_dict(), and __repr__ actually surface the new mode correctly, so a future regression there could slip through (tests/test_had_pretests.py:L840-L843, tests/test_had_pretests.py:L1189-L1194, tests/test_had_pretests.py:L3242-L3515). Concrete fix: add explicit assertions that to_dict()["null_form"] matches the selected mode, summary() switches headings for both nulls, and repr() includes null_form=.

P3: REGISTRY.md / docstring narrative inconsistency. Updates main Yatchew algorithm block (REGISTRY.md:2469-2477), Phase 3 row (REGISTRY.md:2546), module-level docstring (had_pretests.py:14-16), and YatchewTestResults class docstring (had_pretests.py:323-330) to describe yatchew_hr_test as a specification test with both nulls, labelling mean_independence explicitly as the documented R-parity extension (vs paper-derived linearity). P3: null_form propagation lightly tested. Adds three explicit regression tests in the existing test classes the reviewer cited: - TestJSONSerialization.test_yatchew_to_dict_carries_null_form: locks to_dict()["null_form"] for both modes. - TestSummary.test_yatchew_summary_switches_title_on_null_form: locks summary() title switching with both positive and negative controls. - TestSummary.test_yatchew_repr_includes_null_form: locks null_form= appears in repr() for both modes. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

igerber · 2026-04-26T21:07:11Z

/ai-review

github-actions · 2026-04-26T21:11:23Z

🔁 AI review rerun (requested by @igerber)

Head SHA: ac66214d2338e0cb9d8077fbb7aeae737a381d53

Overall Assessment

✅ Looks good

Executive Summary

Re-review scope is clean: I did not identify any new unmitigated P0/P1 issues in this diff.
Affected methodology surface is yatchew_hr_test; the PR adds null={"linearity","mean_independence"} via dedicated intercept-only helpers and a 4-arm (null × weighted) dispatch while preserving default null="linearity" behavior in diff_diff/had_pretests.py:L1029-L1057 and diff_diff/had_pretests.py:L2021-L2511.
The paper deviation is documented correctly: null="mean_independence" is explicitly labeled as an R-parity extension in docs/methodology/REGISTRY.md:L2469-L2478, while the workflow keeps the paper-derived linearity null for its post-treatment Yatchew step in diff_diff/had_pretests.py:L4908-L4916.
Parameter/result propagation looks complete: all YatchewTestResults return paths carry null_form, and summary(), __repr__, and to_dict() were updated in diff_diff/had_pretests.py:L328-L443.
Test coverage for the new mode is strong: the suite adds direct arithmetic checks, weighted/unweighted coverage, edge-case guards, result-surface checks, and parity routing for placebo vs effect rows in tests/test_had_pretests.py:L845-L853, tests/test_had_pretests.py:L1206-L1224, tests/test_had_pretests.py:L3288-L3545, and tests/test_did_had_parity.py:L360-L449.
I could not execute the tests here because this environment does not have pytest, numpy, scipy, or pandas installed.

Methodology

No findings.

Code Quality

No findings.

Performance

No findings.

Maintainability

No findings.

Tech Debt

No findings. TODO.md appropriately marks the previously deferred mean_independence item as shipped in TODO.md:L106-L106.

Security

No findings.

Documentation/Tests

No findings.
Residual risk: runtime verification was not possible in this environment because the Python scientific stack and pytest are unavailable.

Mirrors R YatchewTest::yatchew_test(order=0). Closes the placebo Yatchew R-parity gap from PR #392. - New keyword-only `null: Literal["linearity", "mean_independence"]` on `yatchew_hr_test` (default `"linearity"` is bit-exact backcompat). - `"mean_independence"` fits intercept-only OLS (residuals = dy - mean(dy)); the downstream sigma2_diff / sigma2_W / sort-by-d machinery is shared. - Wired through both unweighted and survey-weighted code paths (4-arm dispatch on (null × weighted)). - `YatchewTestResults` gained `null_form: str = "linearity"` field; `summary()` renders the correct null-hypothesis title; `__repr__` and `to_dict()` updated. - `tests/test_did_had_parity.py::TestYatchewParity` removed the placebo skip; routes effect rows through `null="linearity"` (R order=1) and placebo rows through `null="mean_independence"` (R order=0); both modes share the documented `× G/(G-1)` finite-sample convention shift and parity holds at `atol=1e-10`. - New `TestYatchewHRTestMeanIndependence` class (15 tests) covering happy path, naive Python baseline at `atol=1e-12`, population-variance closed form, invalid value, default backcompat, mode-agnostic tie/constant-d rejection, NaN handling, weighted reduction at w=ones(G) at `atol=1e-14`, weighted non-uniform baseline, default-under-weights, survey×null orthogonality, the (linearity, weighted) baseline (4-arm coverage), zero/replicate-weight rejection, and G<3 mode-agnostic. One additive backcompat case in each of `TestYatchewHRTest` and `TestYatchewHRTestSurvey`. - REGISTRY.md HAD § Yatchew note: TODO marker replaced with shipped description. CHANGELOG.md and TODO.md updated. Patch-level (additive keyword-only kwarg + additive dataclass field with default). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

P3: REGISTRY.md / docstring narrative inconsistency. Updates main Yatchew algorithm block (REGISTRY.md:2469-2477), Phase 3 row (REGISTRY.md:2546), module-level docstring (had_pretests.py:14-16), and YatchewTestResults class docstring (had_pretests.py:323-330) to describe yatchew_hr_test as a specification test with both nulls, labelling mean_independence explicitly as the documented R-parity extension (vs paper-derived linearity). P3: null_form propagation lightly tested. Adds three explicit regression tests in the existing test classes the reviewer cited: - TestJSONSerialization.test_yatchew_to_dict_carries_null_form: locks to_dict()["null_form"] for both modes. - TestSummary.test_yatchew_summary_switches_title_on_null_form: locks summary() title switching with both positive and negative controls. - TestSummary.test_yatchew_repr_includes_null_form: locks null_form= appears in repr() for both modes. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

igerber · 2026-04-26T22:01:28Z

/ai-review

github-actions · 2026-04-26T22:12:30Z

🔁 AI review rerun (requested by @igerber)

Head SHA: ebdceded75bcbbffd7f7c47837da7f8d58d3955a

Overall Assessment

✅ Looks good

Executive Summary

Re-review scope is clean: I did not identify any new unmitigated P0/P1 findings in this diff.
The affected methodology surface is diff_diff/had_pretests.py:L2021-L2511; the new null={"linearity","mean_independence"} branch only changes the residual fit, via new intercept-only helpers in diff_diff/had_pretests.py:L1029-L1060, while preserving the shared Yatchew variance-ratio machinery and default null="linearity" behavior.
The paper deviation is documented correctly in docs/methodology/REGISTRY.md:L2469-L2478: null="mean_independence" is explicitly labeled as an R-parity extension, not a paper-derived Theorem 7 mode.
Result propagation looks complete: YatchewTestResults now carries null_form, and summary(), __repr__(), to_dict(), plus the NaN-return paths all preserve it in diff_diff/had_pretests.py:L328-L443 and diff_diff/had_pretests.py:L2316-L2360.
Wrapper/parity behavior is consistent: the workflow keeps the paper-derived linearity null in diff_diff/had_pretests.py:L4908-L4916, while parity routes effect rows to null="linearity" and placebo rows to null="mean_independence" in tests/test_did_had_parity.py:L360-L455.
Added coverage is strong across backcompat, result-surface propagation, weighted/unweighted branches, edge cases, and parity in tests/test_had_pretests.py:L446-L457, tests/test_had_pretests.py:L845-L853, tests/test_had_pretests.py:L1206-L1224, tests/test_had_pretests.py:L3184-L3202, and tests/test_had_pretests.py:L3272-L3545. I could not run the suite locally because this environment lacks pytest, numpy, scipy, and pandas.

Methodology

Severity: P3-informational. Impact: null="mean_independence" is a documented R-parity extension rather than an undocumented methodology drift. The registry and in-code docstrings agree that only the residual definition changes, and the implementation matches that contract by swapping dy ~ 1 + d vs dy ~ 1 while leaving sigma2_diff, sigma2_W, sort-by-d, and default backcompat unchanged. Concrete fix: none. References: docs/methodology/REGISTRY.md:L2469-L2478, diff_diff/had_pretests.py:L2021-L2511.

Code Quality

No findings.

Performance

No findings.

Maintainability

No findings.

Tech Debt

No findings. The previously deferred mean_independence item is appropriately closed as shipped in TODO.md:L106-L106.

Security

No findings.

Documentation/Tests

No findings. The registry was updated alongside the code, and the new tests cover the new null mode’s arithmetic, weighted/unweighted dispatch, result serialization/summary surfaces, edge-case NaN behavior, and R parity routing. Local execution was not possible here because the Python scientific stack and pytest are unavailable.

igerber force-pushed the had-yatchew-mean-independence branch from 35833fb to d7c45db Compare April 26, 2026 20:50

igerber added the ready-for-ci Triggers CI test workflows label Apr 26, 2026

igerber and others added 2 commits April 26, 2026 18:00

igerber force-pushed the had-yatchew-mean-independence branch from ac66214 to ebdcede Compare April 26, 2026 22:01

igerber merged commit 5443c13 into main Apr 26, 2026
22 checks passed

igerber deleted the had-yatchew-mean-independence branch April 26, 2026 23:18

igerber mentioned this pull request Apr 26, 2026

Release 3.3.2: dCDH by_path × trends extensions, Yatchew mean_independence, HAD Phase 4 R-parity #400

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add yatchew_hr_test(null='mean_independence') mode#397

Add yatchew_hr_test(null='mean_independence') mode#397
igerber merged 2 commits intomainfrom
had-yatchew-mean-independence

igerber commented Apr 26, 2026

Uh oh!

github-actions Bot commented Apr 26, 2026

Uh oh!

igerber commented Apr 26, 2026

Uh oh!

github-actions Bot commented Apr 26, 2026

Uh oh!

igerber commented Apr 26, 2026

Uh oh!

github-actions Bot commented Apr 26, 2026

Uh oh!

igerber commented Apr 26, 2026

Uh oh!

github-actions Bot commented Apr 26, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

igerber commented Apr 26, 2026

Summary

Test plan

Uh oh!

github-actions Bot commented Apr 26, 2026

Uh oh!

igerber commented Apr 26, 2026

Uh oh!

github-actions Bot commented Apr 26, 2026

Uh oh!

igerber commented Apr 26, 2026

Uh oh!

github-actions Bot commented Apr 26, 2026

Uh oh!

igerber commented Apr 26, 2026

Uh oh!

github-actions Bot commented Apr 26, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant