WooldridgeDiD: outcome-fit hint for OLS on binary/count outcomes by igerber · Pull Request #513 · igerber/diff-diff

igerber · 2026-05-31T22:19:42Z

Summary

Add a non-fatal UserWarning (fit-time gate 0g, OLS-only) to WooldridgeDiD.fit() when method="ols" is used on a binary ({0, 1}) or non-negative integer-count outcome, noting that a matching nonlinear model (logit / Poisson) is often the more appropriate specification for such outcomes.
Framing is paper-faithful (Wooldridge 2023): the nonlinear paths impose parallel trends on the link/index scale (Index-PT) rather than in levels (Linear-PT, which the paper notes is "only valid for continuous/unbounded outcomes"), and the paper's Section 5 simulations show the linear model both biased and less precise where the nonlinear mean holds. It is a different identifying assumption — a recommended comparison, not an automatic switch or a free efficiency upgrade. OLS remains a valid QMLE for any response (Table 1).
Pure, non-fatal detector _suggest_nonlinear_method (binary → logit; non-negative count with >2 distinct → poisson; fractional / continuous / negative / non-numeric → None). Advisory only — it never alters the ATT/SE/inference paths and never raises.
Detection reads the full outcome column; _filter_sample is row-preserving (the control group is expressed via the design matrix, not by dropping rows), so the full column and the estimation sample always share the same outcome support (pinned by a regression test).
Retires the WooldridgeDiD efficiency-hint Tier A + Methodology/Correctness rows in TODO.md (Feature/wooldridge did #216), migrating the corrected-framing provenance (PR Wave 3 estimator observability: HonestDiD M=0 test, Wooldridge canonical-link warning, ARP vertex diagnostic #453 R1) into the REGISTRY note.

Methodology references (required if estimator / math changes)

Method name(s): WooldridgeDiD (ETWFE) — advisory outcome-fit hint only; no change to estimation or inference.
Paper / source link(s): Wooldridge, J. M. (2023), "Simple approaches to nonlinear difference-in-differences with panel data," The Econometrics Journal 26(3), C31–C66 — Table 1 (canonical link / LEF pairings), Eq. 2.5 (Linear PT) vs Eq. 2.6–2.7 (Index PT), Section 5 (simulations). Reconciled against docs/methodology/papers/wooldridge-2023-review.md.
Any intentional deviations from the source (and why): None (advisory warning, no estimand change). Documented design choices in docs/methodology/REGISTRY.md §WooldridgeDiD "Nonlinear extensions": detection is a high-signal heuristic (binary {0,1}; non-negative integer count with >2 distinct → Poisson) that does not separate bounded binomial from unbounded counts or detect fractional outcomes — that would require user-supplied outcome-family metadata.

Validation

Tests added/updated: tests/test_wooldridge.py — new TestOutcomeFitHint (detector unit cases incl. bool dtype / bounded-support / non-numeric; binary → logit and count → poisson emission; continuous + logit silence; cohort_trends smoke; warnings.filterwarnings suppression; paper-faithful framing guard; and a _filter_sample row-preserving invariant). Full tests/test_wooldridge.py passes.
Backtest / simulation / notebook evidence (if applicable): N/A (advisory warning; tutorial 16's OLS fits use continuous outcomes, so the hint does not fire there).

Security / privacy

Confirm no secrets/PII in this PR: Yes

Generated with Claude Code

github-actions · 2026-05-31T22:23:28Z

Overall Assessment

✅ Looks good

Executive Summary

Affected method: WooldridgeDiD, but only on the OLS fit() warning path; I did not find any change to ATT estimation, weighting, variance/SE, identification checks, or defaults beyond emitting an advisory UserWarning.
The warning’s framing is consistent with Wooldridge (2023) and the Methodology Registry: nonlinear ETWFE for binary/count outcomes is presented as a different identifying assumption and often a more appropriate specification, not an automatic switch or a validity requirement.
The only methodological shortcut I found is explicitly documented: the heuristic does not distinguish known-upper-bound/binomial integer outcomes from unbounded counts, and it does not auto-detect fractional outcomes.
Edge-case coverage for the new path is solid: detector behavior, OLS-only gating, suppression, cohort_trends, framing text, and the _filter_sample support invariant are all tested.
No unmitigated P0 or P1 issues identified.

Methodology

Severity: P3-informational. Impact: WooldridgeDiD is the only affected method, and the change is advisory only: _suggest_nonlinear_method() classifies outcome support and fit() emits a warning, but the estimator and inference paths are unchanged (diff_diff/wooldridge.py:L134-L170, diff_diff/wooldridge.py:L762-L799). The one substantive deviation is the documented heuristic boundary: non-negative integers with >2 distinct values route to a Poisson suggestion, bounded/binomial integer outcomes are not separately identified, and fractional outcomes are not auto-detected (docs/methodology/REGISTRY.md:L1582-L1582, docs/api/wooldridge_etwfe.rst:L108-L122). Concrete fix: none required; if finer routing is wanted later, add explicit user-supplied outcome-family metadata rather than inferring it from support alone.

Code Quality

Severity: none. Impact: No code-quality defect identified in the changed lines. The helper is pure and fail-open on non-numeric / empty inputs, and the gate is narrowly scoped to method="ols" (diff_diff/wooldridge.py:L134-L170, diff_diff/wooldridge.py:L779-L799). Concrete fix: none.

Performance

Severity: none. Impact: No material performance regression identified. The added full-column np.asarray/np.unique scan is minor relative to ETWFE fitting. Concrete fix: none.

Maintainability

Severity: none. Impact: No maintainability defect identified. The behavior is documented consistently in code, API docs, changelog, registry, and tests, which lowers drift risk (CHANGELOG.md:L8-L11, docs/api/wooldridge_etwfe.rst:L108-L122, docs/methodology/REGISTRY.md:L1582-L1582). Concrete fix: none.

Tech Debt

Severity: P3-informational. Impact: The prior TODO item for the Wooldridge efficiency hint was properly retired and replaced with a registry note, so this PR does not appear to introduce new untracked methodology debt (TODO.md:L187-L189, docs/methodology/REGISTRY.md:L1582-L1582). Concrete fix: none.

Security

Severity: none. Impact: No security issue identified in the changed files. Concrete fix: none.

Documentation/Tests

Severity: none. Impact: No documentation or test gap specific to this warning path stood out. The new tests cover detector cases, OLS-only emission, suppression, cohort_trends, framing lock, and the sample-support invariant (tests/test_wooldridge.py:L2238-L2353). Concrete fix: none.

Review note: this was a static review. I could not execute pytest in this environment because pytest and numpy were not installed.

Add a non-fatal UserWarning when WooldridgeDiD(method="ols") is used on a binary ({0,1}) or non-negative integer-count outcome. Following Wooldridge (2023), the warning notes that a matching nonlinear model (logit / Poisson) is often the MORE APPROPRIATE specification for such outcomes: it imposes parallel trends on the link/index scale rather than in levels (level-PT is only valid for continuous/unbounded outcomes), and the paper's Section 5 simulations show the linear model both biased and less precise where the nonlinear mean holds. It is a different identifying assumption than linear OLS -- which fits depends on which parallel-trends restriction holds -- so the hint is a recommended comparison, not an automatic switch or a free efficiency upgrade. OLS remains a valid QMLE for any response (Table 1). - _suggest_nonlinear_method: pure, non-fatal detector (binary -> logit, non-negative count with >2 distinct -> poisson; fractional/continuous/ negative/non-numeric -> None). Bounded binomial-style integers are not separately distinguished from unbounded counts (documented heuristic). - fit() gate 0g (OLS-only, stacklevel=2) emits the hint on the full outcome column before sample filtering; never alters the fit or raises - TestOutcomeFitHint: detector units (incl. the bounded-support case) + gate behavior + suppression + paper-faithful framing guard (more-appropriate / biased / different assumption / recommended-comparison) - Docs: REGISTRY Note (LPT-vs-IPT + Section 5 evidence + two-sided framing provenance), method docstring, llms-full.txt, wooldridge_etwfe.rst, CHANGELOG - TODO: remove the Tier A + Methodology/Correctness rows (#216) Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

github-actions · 2026-05-31T22:54:31Z

🔁 AI review rerun (requested by @igerber)

Head SHA: a214168eb144b9c63ba65c68fff74222c64dcfcb

Overall Assessment

✅ Looks good

Executive Summary

Affected method: WooldridgeDiD. The only behavioral change is an OLS-only advisory UserWarning for binary / count outcomes; I did not find any change to ATT estimation, weighting, variance/SE, or inference paths (diff_diff/wooldridge.py:L134-L170, diff_diff/wooldridge.py:L762-L799).
The warning’s framing matches the local Wooldridge (2023) review: LPT is for continuous/unbounded outcomes, nonlinear ETWFE moves PT to the link/index scale, and Section 5 reports large OLS bias and worse precision when the nonlinear mean is correct (docs/methodology/papers/wooldridge-2023-review.md:L22-L28, docs/methodology/papers/wooldridge-2023-review.md:L141-L149, docs/methodology/papers/wooldridge-2023-review.md:L219-L239).
The only methodology shortcut I found is explicitly documented in the Methodology Registry: the heuristic does not distinguish known-upper-bound/binomial integer outcomes from unbounded counts, and it does not auto-detect fractional outcomes, so this is mitigated and informational rather than a defect (docs/methodology/REGISTRY.md:L1582-L1582).
Re-review scope: the prior AI review had no P1+ findings to re-check, and I did not find any new unmitigated P0/P1 issues in the changed lines.
Test coverage for the new path is solid: detector cases, OLS-only gating, warning suppression, cohort_trends, framing text, and the _filter_sample support invariant are all covered (tests/test_wooldridge.py:L2238-L2353).

Methodology

Severity: P3-informational. Impact: _suggest_nonlinear_method() intentionally uses a high-signal heuristic: it maps exact {0,1} to logit, and non-negative integers with >2 distinct values to poisson, which means bounded/binomial integer outcomes are not separated from unbounded counts and fractional outcomes are not auto-detected. This is explicitly documented under a **Note:** in the registry, so it is not a defect under the review policy. Concrete fix: none required; if finer routing is desired later, add explicit outcome-family metadata rather than infer it from support alone (diff_diff/wooldridge.py:L134-L170, docs/methodology/REGISTRY.md:L1582-L1582).

Code Quality

Severity: none. Impact: No code-quality defect identified in the changed implementation. The helper is pure/fail-open, and the fit-time gate is narrowly scoped to method="ols" without touching downstream estimation or inference (diff_diff/wooldridge.py:L134-L170, diff_diff/wooldridge.py:L762-L799). Concrete fix: none.

Performance

Severity: none. Impact: No material performance issue identified. The added full-column np.asarray/np.unique scan is minor relative to ETWFE fitting. Concrete fix: none.

Maintainability

Severity: none. Impact: Behavior, rationale, and limitations are documented consistently across code, API docs, changelog, and registry, which lowers drift risk (docs/api/wooldridge_etwfe.rst:L108-L122, docs/methodology/REGISTRY.md:L1582-L1582, CHANGELOG.md:L12-L12). Concrete fix: none.

Tech Debt

Severity: none. Impact: I did not find new untracked deferred work in this PR. The prior TODO item for this hint was retired, and the remaining heuristic limits are carried as documented registry notes rather than silent debt (TODO.md:L188-L188, docs/methodology/REGISTRY.md:L1582-L1582). Concrete fix: none.

Security

Severity: none. Impact: No security issue identified in the changed files. Concrete fix: none.

Documentation/Tests

Severity: none. Impact: No documentation or test gap specific to this warning path stood out. The new tests cover detector behavior, warning emission/suppression, cohort_trends, framing, and the row-preserving sample invariant (tests/test_wooldridge.py:L2238-L2353). Concrete fix: none.

Static review only; I could not execute pytest here because the environment is missing numpy/pytest.

igerber added the ready-for-ci Triggers CI test workflows label May 31, 2026

igerber force-pushed the feature/wooldridge-efficiency-hint branch from 4b3215a to a214168 Compare May 31, 2026 22:48

igerber merged commit 7506221 into main Jun 1, 2026
26 checks passed

igerber deleted the feature/wooldridge-efficiency-hint branch June 1, 2026 00:23

igerber mentioned this pull request Jun 1, 2026

Bump version to 3.5.0 #516

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

WooldridgeDiD: outcome-fit hint for OLS on binary/count outcomes#513

WooldridgeDiD: outcome-fit hint for OLS on binary/count outcomes#513
igerber merged 1 commit into
mainfrom
feature/wooldridge-efficiency-hint

igerber commented May 31, 2026

Uh oh!

github-actions Bot commented May 31, 2026

Uh oh!

github-actions Bot commented May 31, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

igerber commented May 31, 2026

Summary

Methodology references (required if estimator / math changes)

Validation

Security / privacy

Uh oh!

github-actions Bot commented May 31, 2026

Uh oh!

github-actions Bot commented May 31, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant