
Signal silent sparse -> dense lstsq fallback in ImputationDiD and TwoStageDiD#319

Merged
igerber merged 2 commits into main from fix/sparse-to-dense-lstsq-fallback
Apr 18, 2026

Conversation

@igerber
Owner

@igerber igerber commented Apr 18, 2026

Summary

  • Three sites where a sparse factorization failure silently fell back to dense lstsq now emit a UserWarning with the original exception type and a note about the degraded-path implication.
    • diff_diff/imputation.py:1516 — variance path, scipy.sparse.linalg.spsolve on (A_0' W A_0) z = A_1' w. Previous bare except Exception was swallowing the root cause.
    • diff_diff/two_stage.py:1647 — GMM sandwich Stage 1 normal-equations via sparse_factorized.
    • diff_diff/two_stage_bootstrap.py:134 — bootstrap path, same pattern.
  • Each site is called at most a handful of times per .fit() (per aggregation level for the main variance path; per bootstrap preparation stage for bootstrap), so one warning per fallback event is appropriate and the axis-B aggregation wrapper pattern is not needed here.
  • No numerical change: the returned gamma_hat / z is identical to the silent path; only the signal is new.
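A minimal sketch of the warn-then-fall-back pattern described above. It is not the code in diff_diff; the function name `solve_with_signalled_fallback`, the injectable `sparse_solver` parameter, and the warning wording are assumptions made for illustration. Only `scipy.sparse.linalg.spsolve`, `numpy.linalg.lstsq`, and `warnings.warn` are real library calls.

```python
import warnings

import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import spsolve


def solve_with_signalled_fallback(A_sparse, b, sparse_solver=spsolve):
    """Solve A x = b on the sparse path; warn and use dense lstsq on failure.

    Hypothetical helper: `sparse_solver` is injectable only so the failure
    path can be exercised in this example.
    """
    try:
        return sparse_solver(A_sparse.tocsc(), b)
    except Exception as exc:  # broad catch, as described for the imputation site
        warnings.warn(
            f"Sparse solve failed ({type(exc).__name__}); "
            "falling back to dense lstsq, which may use far more memory.",
            UserWarning,
            stacklevel=2,
        )
        # Same dense solve a silent fallback would have used.
        x, *_ = np.linalg.lstsq(A_sparse.toarray(), b, rcond=None)
        return x


def failing_solver(*args, **kwargs):
    """Stand-in sparse solver that always fails."""
    raise RuntimeError("forced failure")


A = sp.csc_matrix(np.array([[4.0, 1.0], [1.0, 3.0]]))
b = np.array([1.0, 2.0])

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    x = solve_with_signalled_fallback(A, b, sparse_solver=failing_solver)
```

The returned `x` comes from the same dense solve either way; the only addition is the recorded `UserWarning` naming the original exception type.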

Methodology references

  • Methods: Borusyak-Jaravel-Spiess (2024) imputation DiD (variance via Theorem 3 auxiliary-residual sparse solve); Gardner (2022) two-stage DiD (GMM sandwich via sparse factorization of Stage 1 normal equations).
  • REGISTRY updated under both estimators. The ImputationDiD Sparse variance solver bullet now notes the warning-on-fallback; the TwoStageDiD section gains a new **Note:** describing the signal on both the GMM sandwich and bootstrap fallback paths.

Validation

  • 3 new tests use unittest.mock.patch to force spsolve / sparse_factorized to raise RuntimeError, then run .fit() and assert the expected UserWarning fires:
    • tests/test_imputation.py::TestImputationVariance::test_sparse_solver_dense_fallback_emits_warning
    • tests/test_two_stage.py::TestTwoStageDiDVariance::test_sparse_factorized_dense_fallback_emits_warning
    • tests/test_two_stage.py::TestTwoStageDiDVariance::test_sparse_factorized_bootstrap_dense_fallback_emits_warning
  • Existing pre-PR test test_sparse_solver_dense_fallback (which verifies the fallback produces finite SE) continues to pass.
  • 14 tests pass in the two affected variance test classes.
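The test shape described above can be sketched with the standard library alone. `ToyEstimator` and its methods are hypothetical stand-ins, not diff_diff classes; the real tests patch `spsolve` / `sparse_factorized` and call the estimators' `.fit()`.

```python
import warnings
from unittest.mock import patch


class ToyEstimator:
    """Hypothetical estimator with a sparse -> dense fallback."""

    def _sparse_solve(self, a, b):
        # Placeholder for the sparse entry point.
        return b / a

    def _dense_solve(self, a, b):
        # Placeholder for the dense lstsq fallback.
        return b / a

    def fit(self, a, b):
        try:
            return self._sparse_solve(a, b)
        except RuntimeError as exc:
            warnings.warn(
                f"Sparse solve failed ({type(exc).__name__}); using dense fallback.",
                UserWarning,
                stacklevel=2,
            )
            return self._dense_solve(a, b)


def run_fit_with_forced_failure():
    """Force the sparse path to raise, then record warnings from fit()."""
    with patch.object(ToyEstimator, "_sparse_solve", side_effect=RuntimeError("forced")):
        with warnings.catch_warnings(record=True) as caught:
            warnings.simplefilter("always")
            result = ToyEstimator().fit(2.0, 6.0)
    return result, caught


result, caught = run_fit_with_forced_failure()
```

In a pytest suite the `catch_warnings` block would typically be replaced by `pytest.warns(UserWarning, match=...)`.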

Security / privacy

  • Confirm no secrets/PII in this PR: Yes

…StageDiD

Addresses axis-C findings #8, #9, and #10 from the silent-failures audit:
three sites where a sparse factorization failure silently fell back to
dense lstsq without any user-facing signal.

- diff_diff/imputation.py:1516 (variance path: scipy.sparse.linalg.spsolve
  on (A_0' W A_0) z = A_1' w). Bare `except Exception` was swallowing
  the root cause before dense lstsq. Now emits a UserWarning identifying
  the exception type and explaining the fallback implication.
- diff_diff/two_stage.py:1647 (GMM sandwich: sparse_factorized on
  X'_{10} W X_{10} for Stage 1 normal equations). `except RuntimeError`
  was silent; now emits a UserWarning.
- diff_diff/two_stage_bootstrap.py:134 (bootstrap path: same pattern as
  above). `except RuntimeError` was silent; now emits a UserWarning.

Each of the three sites is called at most a handful of times (once per
fit, per aggregation level, or per bootstrap replicate), so no
aggregation wrapper pattern is needed; one warning per fallback event
is appropriate.

REGISTRY.md updated under ImputationDiD and TwoStageDiD.

New tests (3): monkey-patch the sparse entry point to raise a
RuntimeError, run .fit(), assert the UserWarning fires with the
expected message prefix. Works against both the variance and bootstrap
surfaces.

Axis-C baseline: 3 major silent-fallback sites (imputation, two_stage,
two_stage_bootstrap) -> 0 remaining in these files. PowerAnalysis
simulation counter (finding #11) and ContinuousDiD B-spline (#12)
still open as separate follow-ups.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@github-actions

Overall Assessment

Looks good.

Executive Summary

  • The PR touches only the sparse-to-dense fallback sites in ImputationDiD’s Theorem 3 variance path and TwoStageDiD’s analytical/bootstrap Stage 1 normal-equations solve; I did not find any change to the underlying estimand, identification assumptions, weighting, or variance formula.
  • The new warning behavior is documented in the registry at docs/methodology/REGISTRY.md:L1082-L1082 and docs/methodology/REGISTRY.md:L1165-L1165.
  • I found no unmitigated P0/P1 issues in methodology adherence, NaN/inference handling, control-group logic, or parameter propagation in the changed code.
  • P3: the new TwoStage tests assert warning emission but do not also verify that the dense fallback still returns usable SEs/results.
  • The pre-existing ImputationDiD dense-fallback memory risk remains tracked in TODO.md:L62-L62 and is non-blocking for this review.

Methodology

No findings. The affected methods are the ImputationDiD variance solve in diff_diff/imputation.py:L1516-L1531, the TwoStageDiD GMM sandwich solve in diff_diff/two_stage.py:L1647-L1668, and the TwoStage bootstrap solve in diff_diff/two_stage_bootstrap.py:L134-L154. Cross-checking those sites against the estimator docstrings and the cited Borusyak-Jaravel-Spiess and Gardner/did2s methodology indicates this PR only adds UserWarning signaling around an existing sparse-to-dense linear-algebra fallback. That is a documented implementation choice, not a methodology deviation.

Code Quality

No findings. The edits are limited to warnings.warn(...) calls immediately ahead of the existing dense lstsq fallback branches in the three touched sites.

Performance

No findings. The fallback still uses the same dense solve path as before, so the PR does not introduce a new runtime or asymptotic performance regression in the reviewed code.

Maintainability

No findings. The three warning sites are mirrored by corresponding registry updates and targeted tests, so the new behavior is documented where users and maintainers will look first.

Tech Debt

  • Severity: P3 (tracked in TODO.md). Impact: the ImputationDiD fallback still materializes A0tA0_sparse.toarray(), so the existing degraded-path O((U+T+K)^2) memory risk remains. This is already tracked and is not approval-blocking for this PR. References: diff_diff/imputation.py:L1530-L1531, TODO.md:L62-L62. Concrete fix: none required for approval; address via the existing TODO item or a follow-up memory-bounded fallback.
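To give a rough sense of the quadratic memory cost, the sketch below estimates the size of a dense float64 square matrix of dimension U+T+K. Reading U, T, and K as counts of units, periods, and covariates is an assumption, as are the function name and the example panel size; the only fixed fact is that a dense float64 n-by-n array needs 8n² bytes.

```python
def dense_gram_bytes(n_units, n_periods, n_covariates, itemsize=8):
    """Bytes needed to hold a dense (U+T+K) x (U+T+K) matrix.

    Hypothetical helper; itemsize=8 corresponds to float64.
    """
    dim = n_units + n_periods + n_covariates
    return dim * dim * itemsize


# Illustrative panel: 50,000 units, 20 periods, 5 covariates.
example_bytes = dense_gram_bytes(50_000, 20, 5)
example_gb = example_bytes / 1e9  # roughly 20 GB for this example
```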

Security

No findings. I did not see any new secret-handling, data-exposure, or unsafe input-surface changes in the diff.

Documentation/Tests

  • Severity: P3. Impact: the new TwoStage tests in tests/test_two_stage.py:L493-L530 prove warning emission, but unlike the existing ImputationDiD fallback regression in tests/test_imputation.py:L862-L885, they do not assert that the dense fallback still produces finite, usable SEs/results. A future control-flow regression could therefore keep the warning while breaking the degraded path. Concrete fix: after each warned fit(), assert results.overall_se is finite and positive, or compare the warned run against an unpatched fit on a fixed dataset.
  • Residual validation gap: I could not execute the tests in this environment because pytest is not available on PATH, and the default python interpreter here does not have numpy installed, so the test assessment is by inspection only.
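The suggested fix can be sketched as a small check applied to the SE from the warned fit. The helper name `check_fallback_se` and the tolerance are assumptions; `overall_se` is the results attribute named in the finding above.

```python
import math


def check_fallback_se(warned_se, reference_se=None, rel_tol=1e-8):
    """Return True if the SE from a warned (fallback) fit looks usable.

    Hypothetical helper. With `reference_se` (an unpatched fit on the same
    fixed dataset) it also checks that the two paths agree within `rel_tol`.
    """
    if not (math.isfinite(warned_se) and warned_se > 0):
        return False
    if reference_se is not None:
        return math.isclose(warned_se, reference_se, rel_tol=rel_tol)
    return True
```

In a test this would be used as `assert check_fallback_se(results.overall_se)` after the warned `fit()`, optionally passing the unpatched run's `overall_se` as the reference.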

AI review on PR #319 flagged that my new TwoStage warning tests only
verify that the UserWarning fires but not that the dense fallback still
produces finite, usable SEs. A future control-flow regression could
keep the warning while breaking the degraded path.

Mirror the assertion shape used in the pre-existing ImputationDiD
test_sparse_solver_dense_fallback: after the warned .fit(), assert
overall_se is finite and > 0 for both the GMM sandwich and bootstrap
paths.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@igerber
Owner Author

igerber commented Apr 18, 2026

/ai-review

@github-actions

🔁 AI review rerun (requested by @igerber)

Head SHA: 28507ec7fd32a44cfecc87fdc628d0eca3e1158e


Overall Assessment

✅ Looks good

Executive Summary

Methodology

Code Quality

  • No findings. The diff is narrowly scoped to warning on degraded-path entry and does not introduce any inline inference computation, partial NaN guards, or parameter-propagation issues in the changed code.

Performance

Maintainability

Tech Debt

  • Severity: P3 (tracked in TODO.md). Impact: the pre-existing ImputationDiD degraded path still densifies (A_0' A_0) via .toarray(), so the large-panel O((U+T+K)^2) memory risk remains unchanged by this PR diff_diff/imputation.py:L1530-L1531 TODO.md:L62-L62. Concrete fix: none required for approval; retain the existing TODO or replace the dense fallback with a memory-bounded solver in a follow-up.

Security

  • No findings. The diff adds warning emission and tests only; it introduces no new secret-handling, data-exposure, or unsafe execution surface.

Documentation/Tests

  • No findings. The previous P3 test gap is resolved: tests/test_two_stage.py:L493-L548 now verifies both warning emission and finite usable SEs on the warned fallback paths, and Imputation still has both the existing finite-SE fallback regression and the new warning assertion in tests/test_imputation.py:L862-L911.
  • Execution note: I could not run the tests in this environment because pytest is not installed (pytest and python -m pytest were both unavailable).

@igerber igerber added the ready-for-ci label (Triggers CI test workflows) Apr 18, 2026
@igerber igerber merged commit 7514cbe into main Apr 18, 2026
22 of 24 checks passed
@igerber igerber deleted the fix/sparse-to-dense-lstsq-fallback branch April 18, 2026 21:29
igerber added a commit that referenced this pull request Apr 18, 2026
…odo config

Packages five merged PRs since v3.1.2 as patch release 3.1.3:

- #311 Replicate-weight variance and PSU-level bootstrap for dCDH — new
  variance_method="replicate" (BRR / Fay / JK1 / JKn / SDR) and PSU-level
  multiplier bootstrap, with df-aware inference and group-level PSU map.
- #321 Zenodo DOI auto-minting config — .zenodo.json + top-level LICENSE
  so the next GitHub Release mints a concept + versioned DOI automatically.
- #319 Silent sparse->dense lstsq fallback signaling in ImputationDiD and
  TwoStageDiD — emits ConvergenceWarning instead of switching paths silently.
- #317 Non-convergence signaling in TROP alternating-minimization solvers,
  including LOOCV and bootstrap aggregation. Top-level warning aggregation.
- #320 /bump-version skill now updates CITATION.cff; single RELEASE_DATE
  resolved upfront and threaded through all date-bearing files.

Version strings bumped in diff_diff/__init__.py, pyproject.toml,
rust/Cargo.toml, diff_diff/guides/llms-full.txt, and CITATION.cff
(version: 3.1.3, date-released: 2026-04-18). CHANGELOG populated with
Added / Fixed / Changed sections and comparison-link footer.

Per project SemVer convention, minor bumps are reserved for new estimators
or new module-level API; additive extensions to existing estimators (like
PR #311's new variance_method values) are patch-level.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
igerber added a commit that referenced this pull request Apr 18, 2026
CI review on #322 flagged that the 3.1.3 entries for PR #319 (sparse->dense
lstsq fallback) and PR #317 (TROP non-convergence) claimed ConvergenceWarning,
but the actual implementations emit UserWarning (imputation.py, two_stage.py,
utils.py, trop.py, and the REGISTRY.md contract all use UserWarning). Users
filtering warnings by category would be misled.

The same factual error was in the 3.1.2 entries I wrote in PR #316 for PR #314
and PR #315. Fixing both entries in this PR: CHANGELOG is a living doc, and
the warning-category drift would mislead users who act on it.

No code or test changes; CHANGELOG-only edit.
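Why the category matters can be sketched with the standard `warnings` module. The `ConvergenceWarning` class below is a hypothetical stand-in for a library-defined category, and the helper names are illustrative only.

```python
import warnings


class ConvergenceWarning(UserWarning):
    """Hypothetical stand-in for a library-defined ConvergenceWarning."""


def emit_fallback_warning():
    # What the implementation actually emits: a plain UserWarning.
    warnings.warn(
        "Sparse solve failed; falling back to dense lstsq.",
        UserWarning,
        stacklevel=2,
    )


def count_caught(category):
    """Count warnings recorded when only `category` is allowed through."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("ignore")                     # drop everything...
        warnings.simplefilter("always", category=category)  # ...except `category`
        emit_fallback_warning()
    return len(caught)


seen_by_convergence_filter = count_caught(ConvergenceWarning)
seen_by_user_filter = count_caught(UserWarning)
```

A filter keyed on the documented but wrong category sees nothing, while one keyed on `UserWarning` catches the fallback signal.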

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>