Skip to content

WooldridgeDiD: outcome-fit hint for OLS on binary/count outcomes#513

Merged
igerber merged 1 commit into
mainfrom
feature/wooldridge-efficiency-hint
Jun 1, 2026
Merged

WooldridgeDiD: outcome-fit hint for OLS on binary/count outcomes#513
igerber merged 1 commit into
mainfrom
feature/wooldridge-efficiency-hint

Conversation

@igerber
Copy link
Copy Markdown
Owner

@igerber igerber commented May 31, 2026

Summary

  • Add a non-fatal UserWarning (fit-time gate 0g, OLS-only) to WooldridgeDiD.fit() when method="ols" is used on a binary ({0, 1}) or non-negative integer-count outcome, noting that a matching nonlinear model (logit / Poisson) is often the more appropriate specification for such outcomes.
  • Framing is paper-faithful (Wooldridge 2023): the nonlinear paths impose parallel trends on the link/index scale (Index-PT) rather than in levels (Linear-PT, which the paper notes is "only valid for continuous/unbounded outcomes"), and the paper's Section 5 simulations show the linear model both biased and less precise where the nonlinear mean holds. It is a different identifying assumption — a recommended comparison, not an automatic switch or a free efficiency upgrade. OLS remains a valid QMLE for any response (Table 1).
  • Pure, non-fatal detector _suggest_nonlinear_method (binary → logit; non-negative count with >2 distinct → poisson; fractional / continuous / negative / non-numeric → None). Advisory only — it never alters the ATT/SE/inference paths and never raises.
  • Detection reads the full outcome column; _filter_sample is row-preserving (the control group is expressed via the design matrix, not by dropping rows), so the full column and the estimation sample always share the same outcome support (pinned by a regression test).
  • Retires the WooldridgeDiD efficiency-hint Tier A + Methodology/Correctness rows in TODO.md (Feature/wooldridge did #216), migrating the corrected-framing provenance (PR Wave 3 estimator observability: HonestDiD M=0 test, Wooldridge canonical-link warning, ARP vertex diagnostic #453 R1) into the REGISTRY note.

Methodology references (required if estimator / math changes)

  • Method name(s): WooldridgeDiD (ETWFE) — advisory outcome-fit hint only; no change to estimation or inference.
  • Paper / source link(s): Wooldridge, J. M. (2023), "Simple approaches to nonlinear difference-in-differences with panel data," The Econometrics Journal 26(3), C31–C66 — Table 1 (canonical link / LEF pairings), Eq. 2.5 (Linear PT) vs Eq. 2.6–2.7 (Index PT), Section 5 (simulations). Reconciled against docs/methodology/papers/wooldridge-2023-review.md.
  • Any intentional deviations from the source (and why): None (advisory warning, no estimand change). Documented design choices in docs/methodology/REGISTRY.md §WooldridgeDiD "Nonlinear extensions": detection is a high-signal heuristic (binary {0,1}; non-negative integer count with >2 distinct → Poisson) that does not separate bounded binomial from unbounded counts or detect fractional outcomes — that would require user-supplied outcome-family metadata.

Validation

  • Tests added/updated: tests/test_wooldridge.py — new TestOutcomeFitHint (detector unit cases incl. bool dtype / bounded-support / non-numeric; binary → logit and count → poisson emission; continuous + logit silence; cohort_trends smoke; warnings.filterwarnings suppression; paper-faithful framing guard; and a _filter_sample row-preserving invariant). Full tests/test_wooldridge.py passes.
  • Backtest / simulation / notebook evidence (if applicable): N/A (advisory warning; tutorial 16's OLS fits use continuous outcomes, so the hint does not fire there).

Security / privacy

  • Confirm no secrets/PII in this PR: Yes

Generated with Claude Code

@github-actions
Copy link
Copy Markdown

Overall Assessment

✅ Looks good

Executive Summary

  • Affected method: WooldridgeDiD, but only on the OLS fit() warning path; I did not find any change to ATT estimation, weighting, variance/SE, identification checks, or defaults beyond emitting an advisory UserWarning.
  • The warning’s framing is consistent with Wooldridge (2023) and the Methodology Registry: nonlinear ETWFE for binary/count outcomes is presented as a different identifying assumption and often a more appropriate specification, not an automatic switch or a validity requirement.
  • The only methodological shortcut I found is explicitly documented: the heuristic does not distinguish known-upper-bound/binomial integer outcomes from unbounded counts, and it does not auto-detect fractional outcomes.
  • Edge-case coverage for the new path is solid: detector behavior, OLS-only gating, suppression, cohort_trends, framing text, and the _filter_sample support invariant are all tested.
  • No unmitigated P0 or P1 issues identified.

Methodology

  • Severity: P3-informational. Impact: WooldridgeDiD is the only affected method, and the change is advisory only: _suggest_nonlinear_method() classifies outcome support and fit() emits a warning, but the estimator and inference paths are unchanged (diff_diff/wooldridge.py:L134-L170, diff_diff/wooldridge.py:L762-L799). The one substantive deviation is the documented heuristic boundary: non-negative integers with >2 distinct values route to a Poisson suggestion, bounded/binomial integer outcomes are not separately identified, and fractional outcomes are not auto-detected (docs/methodology/REGISTRY.md:L1582-L1582, docs/api/wooldridge_etwfe.rst:L108-L122). Concrete fix: none required; if finer routing is wanted later, add explicit user-supplied outcome-family metadata rather than inferring it from support alone.

Code Quality

  • Severity: none. Impact: No code-quality defect identified in the changed lines. The helper is pure and fail-open on non-numeric / empty inputs, and the gate is narrowly scoped to method="ols" (diff_diff/wooldridge.py:L134-L170, diff_diff/wooldridge.py:L779-L799). Concrete fix: none.

Performance

  • Severity: none. Impact: No material performance regression identified. The added full-column np.asarray/np.unique scan is minor relative to ETWFE fitting. Concrete fix: none.

Maintainability

  • Severity: none. Impact: No maintainability defect identified. The behavior is documented consistently in code, API docs, changelog, registry, and tests, which lowers drift risk (CHANGELOG.md:L8-L11, docs/api/wooldridge_etwfe.rst:L108-L122, docs/methodology/REGISTRY.md:L1582-L1582). Concrete fix: none.

Tech Debt

  • Severity: P3-informational. Impact: The prior TODO item for the Wooldridge efficiency hint was properly retired and replaced with a registry note, so this PR does not appear to introduce new untracked methodology debt (TODO.md:L187-L189, docs/methodology/REGISTRY.md:L1582-L1582). Concrete fix: none.

Security

  • Severity: none. Impact: No security issue identified in the changed files. Concrete fix: none.

Documentation/Tests

  • Severity: none. Impact: No documentation or test gap specific to this warning path stood out. The new tests cover detector cases, OLS-only emission, suppression, cohort_trends, framing lock, and the sample-support invariant (tests/test_wooldridge.py:L2238-L2353). Concrete fix: none.

Review note: this was a static review. I could not execute pytest in this environment because pytest and numpy were not installed.

@igerber igerber added the ready-for-ci Triggers CI test workflows label May 31, 2026
Add a non-fatal UserWarning when WooldridgeDiD(method="ols") is used on a
binary ({0,1}) or non-negative integer-count outcome. Following Wooldridge
(2023), the warning notes that a matching nonlinear model (logit / Poisson)
is often the MORE APPROPRIATE specification for such outcomes: it imposes
parallel trends on the link/index scale rather than in levels (level-PT is
only valid for continuous/unbounded outcomes), and the paper's Section 5
simulations show the linear model both biased and less precise where the
nonlinear mean holds. It is a different identifying assumption than linear
OLS -- which fits depends on which parallel-trends restriction holds -- so
the hint is a recommended comparison, not an automatic switch or a free
efficiency upgrade. OLS remains a valid QMLE for any response (Table 1).

- _suggest_nonlinear_method: pure, non-fatal detector (binary -> logit,
  non-negative count with >2 distinct -> poisson; fractional/continuous/
  negative/non-numeric -> None). Bounded binomial-style integers are not
  separately distinguished from unbounded counts (documented heuristic).
- fit() gate 0g (OLS-only, stacklevel=2) emits the hint on the full
  outcome column before sample filtering; never alters the fit or raises
- TestOutcomeFitHint: detector units (incl. the bounded-support case) +
  gate behavior + suppression + paper-faithful framing guard
  (more-appropriate / biased / different assumption / recommended-comparison)
- Docs: REGISTRY Note (LPT-vs-IPT + Section 5 evidence + two-sided framing
  provenance), method docstring, llms-full.txt, wooldridge_etwfe.rst, CHANGELOG
- TODO: remove the Tier A + Methodology/Correctness rows (#216)

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@igerber igerber force-pushed the feature/wooldridge-efficiency-hint branch from 4b3215a to a214168 Compare May 31, 2026 22:48
@github-actions
Copy link
Copy Markdown

🔁 AI review rerun (requested by @igerber)

Head SHA: a214168eb144b9c63ba65c68fff74222c64dcfcb


Overall Assessment

✅ Looks good

Executive Summary

  • Affected method: WooldridgeDiD. The only behavioral change is an OLS-only advisory UserWarning for binary / count outcomes; I did not find any change to ATT estimation, weighting, variance/SE, or inference paths (diff_diff/wooldridge.py:L134-L170, diff_diff/wooldridge.py:L762-L799).
  • The warning’s framing matches the local Wooldridge (2023) review: LPT is for continuous/unbounded outcomes, nonlinear ETWFE moves PT to the link/index scale, and Section 5 reports large OLS bias and worse precision when the nonlinear mean is correct (docs/methodology/papers/wooldridge-2023-review.md:L22-L28, docs/methodology/papers/wooldridge-2023-review.md:L141-L149, docs/methodology/papers/wooldridge-2023-review.md:L219-L239).
  • The only methodology shortcut I found is explicitly documented in the Methodology Registry: the heuristic does not distinguish known-upper-bound/binomial integer outcomes from unbounded counts, and it does not auto-detect fractional outcomes, so this is mitigated and informational rather than a defect (docs/methodology/REGISTRY.md:L1582-L1582).
  • Re-review scope: the prior AI review had no P1+ findings to re-check, and I did not find any new unmitigated P0/P1 issues in the changed lines.
  • Test coverage for the new path is solid: detector cases, OLS-only gating, warning suppression, cohort_trends, framing text, and the _filter_sample support invariant are all covered (tests/test_wooldridge.py:L2238-L2353).

Methodology

  • Severity: P3-informational. Impact: _suggest_nonlinear_method() intentionally uses a high-signal heuristic: it maps exact {0,1} to logit, and non-negative integers with >2 distinct values to poisson, which means bounded/binomial integer outcomes are not separated from unbounded counts and fractional outcomes are not auto-detected. This is explicitly documented under a **Note:** in the registry, so it is not a defect under the review policy. Concrete fix: none required; if finer routing is desired later, add explicit outcome-family metadata rather than infer it from support alone (diff_diff/wooldridge.py:L134-L170, docs/methodology/REGISTRY.md:L1582-L1582).

Code Quality

  • Severity: none. Impact: No code-quality defect identified in the changed implementation. The helper is pure/fail-open, and the fit-time gate is narrowly scoped to method="ols" without touching downstream estimation or inference (diff_diff/wooldridge.py:L134-L170, diff_diff/wooldridge.py:L762-L799). Concrete fix: none.

Performance

  • Severity: none. Impact: No material performance issue identified. The added full-column np.asarray/np.unique scan is minor relative to ETWFE fitting. Concrete fix: none.

Maintainability

  • Severity: none. Impact: Behavior, rationale, and limitations are documented consistently across code, API docs, changelog, and registry, which lowers drift risk (docs/api/wooldridge_etwfe.rst:L108-L122, docs/methodology/REGISTRY.md:L1582-L1582, CHANGELOG.md:L12-L12). Concrete fix: none.

Tech Debt

  • Severity: none. Impact: I did not find new untracked deferred work in this PR. The prior TODO item for this hint was retired, and the remaining heuristic limits are carried as documented registry notes rather than silent debt (TODO.md:L188-L188, docs/methodology/REGISTRY.md:L1582-L1582). Concrete fix: none.

Security

  • Severity: none. Impact: No security issue identified in the changed files. Concrete fix: none.

Documentation/Tests

  • Severity: none. Impact: No documentation or test gap specific to this warning path stood out. The new tests cover detector behavior, warning emission/suppression, cohort_trends, framing, and the row-preserving sample invariant (tests/test_wooldridge.py:L2238-L2353). Concrete fix: none.

Static review only; I could not execute pytest here because the environment is missing numpy/pytest.

@igerber igerber merged commit 7506221 into main Jun 1, 2026
26 checks passed
@igerber igerber deleted the feature/wooldridge-efficiency-hint branch June 1, 2026 00:23
@igerber igerber mentioned this pull request Jun 1, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

ready-for-ci Triggers CI test workflows

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant