Signal non-convergence in TROP alternating-minimization solvers#317
Addresses axis B findings #6 and #7 from the silent-failures audit: the outer alternating-min loop at trop_global.py:448, the hard-coded range(20) inner FISTA loop at trop_global.py:466, and the alternating-minimization loop at trop_local.py:680 all exited silently on max_iter exhaustion, returning the current iterate as if converged.

- trop_global._solve_global_with_lowrank: thread a converged flag through the outer loop; count non-convergence events from the inner FISTA and surface the count in the outer warning for diagnostic context. One warn_if_not_converged call per solver invocation.
- trop_local._estimate_model: thread a converged flag through the outer alternating-min loop; call warn_if_not_converged on exhaustion.
- REGISTRY updated under TROP.

New TestTROPConvergenceWarnings class (4 tests) exercises both global and local paths with forced non-convergence (max_iter=1, tol=1e-15) and a convergent negative control.

Notable: the default TROP local config (max_iter=100, tol=1e-6) does not converge within max_iter on typical synthetic panels, so this PR surfaces a previously silent non-convergence that affected routine user fits. No numerical change in the returned iterate; the warning is additive.

Axis-B regression-lint baseline: 5 -> 2 silent range(max_iter) loops remaining (minor loops in honest_did/power not yet addressed).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
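For readers unfamiliar with the pattern, here is a minimal sketch of threading a `converged` flag through an alternating-minimization loop and warning once on exhaustion. It is a toy two-variable problem, not the TROP solver, and `warn_if_not_converged_sketch` is a hypothetical stand-in: the real helper's signature is not shown in this PR text.

```python
import warnings


def warn_if_not_converged_sketch(converged, solver_name, max_iter, tol):
    """Hypothetical stand-in for the project's warning helper (signature assumed)."""
    if not converged:
        warnings.warn(
            f"{solver_name} did not converge in {max_iter} iterations (tol={tol}).",
            UserWarning,
            stacklevel=2,
        )


def alternating_min_sketch(max_iter=100, tol=1e-6):
    """Alternately minimize f(x, y) = (x - y)^2 + (x - 1)^2 + (y + 1)^2."""
    x, y = 0.0, 0.0
    converged = False  # stays False if the loop exhausts max_iter
    for _ in range(max_iter):
        x_new = (y + 1.0) / 2.0      # exact minimizer over x with y fixed
        y_new = (x_new - 1.0) / 2.0  # exact minimizer over y with x fixed
        change = max(abs(x_new - x), abs(y_new - y))
        x, y = x_new, y_new
        if change < tol:
            converged = True
            break
    # Exactly one warning per solver invocation; the iterate is returned unchanged.
    warn_if_not_converged_sketch(converged, "alternating_min_sketch", max_iter, tol)
    return x, y, converged
```

Calling `alternating_min_sketch(max_iter=1, tol=1e-15)` forces the non-convergent path in the same way the new tests do, while the defaults converge without a warning.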
|
Overall Assessment ✅ Looks good Highest unmitigated issue I found is P2. I did not find a P0/P1 methodology or correctness defect in the touched TROP code. Executive Summary
Methodology
Code Quality
Performance
Maintainability
Tech Debt
Security
Documentation/Tests
|
AI review on PR #317 flagged a P2 fan-out problem: the new warnings fired from low-level solver helpers that are called inside LOOCV, per-treated-observation, and bootstrap loops. A single non-convergent configuration could emit dozens to hundreds of duplicate warnings, which is noisy, slow on Python fallback paths, and a hard failure in environments that escalate warnings to errors.

Fix pattern: add an optional _nonconvergence_tracker: list kwarg to _solve_global_with_lowrank, _solve_global_model, _fit_global_with_fixed_lambda, _estimate_model, and _fit_with_fixed_lambda. When provided, the solver appends non-convergence events to the tracker instead of warning directly. Each top-level caller (LOOCV, bootstrap, Rao-Wu bootstrap, per-treated-observation fit) supplies a tracker and emits a single consolidated warning summarizing the count of non-converged fits.

Seven call sites wrapped:

- trop.py:768 local per-treated-observation main fit loop
- trop_local.py:815 local LOOCV
- trop_local.py:1044 local bootstrap
- trop_local.py:1199 local Rao-Wu bootstrap
- trop_global.py:283 global LOOCV
- trop_global.py:1048 global bootstrap
- trop_global.py:1226 global Rao-Wu bootstrap

Also addresses the P3 test-shape finding: the convergence tests now pass observation-level `control_mask = (D == 0)`, matching the production call contract at trop.py:567 and trop_local.py:625 (not the unit-level mask I had earlier). Plus a new fit()-level test, `test_local_fit_emits_single_aggregate_warning`, that pins the aggregate-per-call warning contract.

Smoke check: test_basic_fit previously emitted one warning per observation; it now emits a single aggregate like "TROP local per-treated-observation fit: 15 of 15 fits did not converge".

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
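The tracker pattern described above can be sketched roughly as follows. Only the `_nonconvergence_tracker` kwarg name comes from this PR; `solve_one_sketch`, `fit_many_sketch`, and the "iterations needed" stand-in for a real solve are hypothetical and exist only to keep the example self-contained.

```python
import warnings


def solve_one_sketch(iters_needed, max_iter, _nonconvergence_tracker=None):
    """Hypothetical low-level solver: converges iff iters_needed <= max_iter."""
    converged = iters_needed <= max_iter
    if not converged:
        if _nonconvergence_tracker is not None:
            # A caller is aggregating: record the event instead of warning here.
            _nonconvergence_tracker.append(iters_needed)
        else:
            # Standalone call: warn directly, as before.
            warnings.warn("solver did not converge", UserWarning, stacklevel=2)
    return converged


def fit_many_sketch(problems, max_iter=100):
    """Hypothetical top-level caller (think LOOCV or a bootstrap loop)."""
    tracker = []
    results = [
        solve_one_sketch(p, max_iter, _nonconvergence_tracker=tracker)
        for p in problems
    ]
    if tracker:
        # One consolidated warning per top-level call, however many fits failed.
        warnings.warn(
            f"{len(tracker)} of {len(problems)} fits did not converge "
            f"(max_iter={max_iter}).",
            UserWarning,
            stacklevel=2,
        )
    return results
```

Keeping the direct-warning branch when no tracker is passed means standalone solver calls still signal non-convergence, which appears to be the behavior the original four solver-level tests rely on.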
|
/ai-review |
|
🔁 AI review rerun (requested by @igerber) Head SHA: Overall Assessment ✅ Looks good Highest unmitigated issue I found is P3. I do not see a P0/P1 methodology or correctness defect in the touched TROP code. Executive Summary
Methodology
Code Quality No findings. The previous warning fan-out concern appears resolved by the new tracker plumbing in diff_diff/trop.py:L749-L795, diff_diff/trop_local.py:L811-L855, and diff_diff/trop_global.py:L278-L309. Performance No findings. Maintainability No findings. Tech Debt No findings. Security No findings. Documentation/Tests
|
AI review on PR #317 flagged that my earlier fit()-level test only covered the per-treated-observation aggregation path, not the LOOCV or bootstrap wrapper paths. A regression in _nonconvergence_tracker plumbing for those paths could slip through. - test_local_fit_emits_single_aggregate_warning: expanded to assert per-obs, LOOCV, and bootstrap warnings each appear at most once per .fit(). - test_global_fit_emits_single_aggregate_warning: new test mirroring the local one for method="global" (LOOCV + bootstrap paths). Both use n_bootstrap=2, minimal lambda grid, and max_iter=1/tol=1e-15 to keep cost low: ~3.4s for all 6 TROP convergence tests combined. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
/ai-review |
|
🔁 AI review rerun (requested by @igerber) Head SHA: Overall Assessment ✅ Looks good Highest unmitigated finding is P3. Executive Summary
Methodology
Code Quality
Performance
Maintainability
Tech Debt
Security
Documentation/Tests
|
AI review on PR #317 flagged that my fit()-level tests did not force HAS_RUST_BACKEND=False, so in Rust-enabled environments they could pass without exercising the Python aggregation code they were intended to cover. The earlier <= 1 assertion also would not catch a dropped warning. Changes: - patch.object(sys.modules[...], "HAS_RUST_BACKEND", False) across diff_diff.trop, diff_diff.trop_local, diff_diff.trop_global so the LOOCV and bootstrap paths route through the Python aggregation wrappers. (Uses sys.modules to bypass the name collision between the trop() convenience function and the trop module at diff_diff.trop.) - Tightened assertions: per-treated-observation and bootstrap aggregation are called exactly once per fit() so assert == 1. LOOCV is called multiple times by the coordinate-descent grid refinement in trop.py, so the per-call single-emission contract is verified via message format ("N of M per-observation fits") on every LOOCV aggregate rather than by global occurrence count. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
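As a rough illustration of the `sys.modules` patching approach, the sketch below registers a throwaway module and flips a flag on it for the duration of a `with` block. The module name `fake_trop_sketch` and both helper functions are hypothetical; only `HAS_RUST_BACKEND` and the `patch.object(sys.modules[...], ...)` shape come from the comment above.

```python
import sys
import types
from unittest.mock import patch

# Hypothetical stand-in for a module that exposes a backend flag.
fake = types.ModuleType("fake_trop_sketch")
fake.HAS_RUST_BACKEND = True
sys.modules["fake_trop_sketch"] = fake


def chosen_backend():
    """Read the flag at call time, as dispatch code typically would."""
    module = sys.modules["fake_trop_sketch"]
    return "rust" if module.HAS_RUST_BACKEND else "python"


def backend_under_patch():
    # Looking the module up in sys.modules yields the module object itself,
    # even if a same-named function shadows it as a package attribute.
    with patch.object(sys.modules["fake_trop_sketch"], "HAS_RUST_BACKEND", False):
        return chosen_backend()
```

`patch.object` restores the original attribute value when the block exits, so the override does not leak into other tests.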
|
/ai-review |
|
🔁 AI review rerun (requested by @igerber) Head SHA: Overall Assessment ✅ Looks good Highest unmitigated finding is P3. Executive Summary
Methodology
Code Quality
Performance
Maintainability
Tech Debt
Security
Documentation/Tests
|
…e warning AI review noted that the per-treated-observation aggregate warning used len(treated_observations) as the denominator, but the loop skips cells with non-finite outcomes before calling _estimate_model(). On panels with missing treated outcomes, the reported non-convergence rate would be understated because attempted-but-failed fits were compared against a total that included never-attempted cells. Track n_fits_attempted separately and use that as the denominator. Report is now "X of N-attempted fits did not converge" rather than "X of N-treated-cells fits did not converge". Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
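A small sketch of the denominator fix, assuming a loop that skips non-finite outcomes before fitting. The function name, the `fit_one` callback, and the message wording are hypothetical; only the `n_fits_attempted` counter is taken from the commit message.

```python
import math
import warnings


def fit_treated_cells_sketch(outcomes, fit_one):
    """Hypothetical per-cell loop; fit_one(y) returns True if the fit converged."""
    n_fits_attempted = 0
    n_not_converged = 0
    for y in outcomes:
        if not math.isfinite(y):
            continue  # never attempted, so it must not count in the denominator
        n_fits_attempted += 1
        if not fit_one(y):
            n_not_converged += 1
    if n_not_converged:
        warnings.warn(
            f"{n_not_converged} of {n_fits_attempted} attempted fits "
            "did not converge.",
            UserWarning,
            stacklevel=2,
        )
    return n_not_converged, n_fits_attempted
```

With five treated cells of which two have non-finite outcomes and two fits fail, this reports 2 of 3 rather than 2 of 5.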
|
/ai-review |
|
🔁 AI review rerun (requested by @igerber) Head SHA: Overall Assessment ✅ Looks good No unmitigated P0/P1 findings. Executive Summary
Methodology
Code Quality
Performance
Maintainability
Tech Debt
Security
Documentation/Tests
|
…odo config Packages five merged PRs since v3.1.2 as patch release 3.1.3: - #311 Replicate-weight variance and PSU-level bootstrap for dCDH — new variance_method="replicate" (BRR / Fay / JK1 / JKn / SDR) and PSU-level multiplier bootstrap, with df-aware inference and group-level PSU map. - #321 Zenodo DOI auto-minting config — .zenodo.json + top-level LICENSE so the next GitHub Release mints a concept + versioned DOI automatically. - #319 Silent sparse->dense lstsq fallback signaling in ImputationDiD and TwoStageDiD — emits ConvergenceWarning instead of switching paths silently. - #317 Non-convergence signaling in TROP alternating-minimization solvers, including LOOCV and bootstrap aggregation. Top-level warning aggregation. - #320 /bump-version skill now updates CITATION.cff; single RELEASE_DATE resolved upfront and threaded through all date-bearing files. Version strings bumped in diff_diff/__init__.py, pyproject.toml, rust/Cargo.toml, diff_diff/guides/llms-full.txt, and CITATION.cff (version: 3.1.3, date-released: 2026-04-18). CHANGELOG populated with Added / Fixed / Changed sections and comparison-link footer. Per project SemVer convention, minor bumps are reserved for new estimators or new module-level API; additive extensions to existing estimators (like PR #311's new variance_method values) are patch-level. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
CI review on #322 flagged that the 3.1.3 entries for PR #319 (sparse->dense lstsq fallback) and PR #317 (TROP non-convergence) claimed ConvergenceWarning, but the actual implementations emit UserWarning (imputation.py, two_stage.py, utils.py, trop.py, and the REGISTRY.md contract all use UserWarning). Users filtering warnings by category would be misled. The same factual error was in the 3.1.2 entries I wrote in PR #316 for PR #314 and PR #315. Fixing both entries in this PR: CHANGELOG is a living doc and the warning-category drift is actionably misleading. No code or test changes; CHANGELOG-only edit. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
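To see why the category matters for users, here is a minimal sketch using only the standard `warnings` module. `ConvergenceWarningSketch` is a hypothetical class standing in for whatever convergence category a user might filter on; it is not the library's class.

```python
import warnings


class ConvergenceWarningSketch(UserWarning):
    """Hypothetical category a user might try to filter on."""


def emit_nonconvergence():
    # Mirrors the category the changelog fix says is actually emitted.
    warnings.warn("solver did not converge", UserWarning, stacklevel=2)


def count_surfaced(ignored_category):
    """Count warnings that still surface after ignoring one category."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        warnings.simplefilter("ignore", category=ignored_category)
        emit_nonconvergence()
    return len(caught)
```

A filter on the narrower class does not match a plain `UserWarning`, so a user who trusted the old changelog wording would still see the warning, while a filter on `UserWarning` suppresses it.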
Summary
- `trop_global._solve_global_with_lowrank` at `trop_global.py:448-488` now threads a `converged` flag through its outer alternating-minimization loop. The hard-coded `range(20)` inner FISTA loop (line 466) contributes a non-convergence count that is surfaced in the outer warning for diagnostic context: one consolidated `warn_if_not_converged` call per solver invocation.
- `trop_local._estimate_model` at `trop_local.py:680` now threads a `converged` flag through its outer alternating-min loop and calls the same helper.
- `diff_diff.utils.warn_if_not_converged` (introduced in Signal non-convergence in FE imputation alternating-projection solvers #314).

Methodology references
Validation
- `TestTROPConvergenceWarnings` class (4 tests) in `tests/test_trop.py`:
  - `test_global_alternating_min_warns_on_nonconvergence`: forced non-convergence via `max_iter=1, tol=1e-15`, asserts warning
  - `test_global_alternating_min_no_warning_on_convergence`: convergent negative control (`max_iter=500, tol=1e-6`)
  - `test_local_alternating_min_warns_on_nonconvergence`: same for local path
  - `test_local_alternating_min_no_warning_on_convergence`: convergent negative control
- `TestTROP::test_basic_fit` passes unchanged.
- The default TROP local config (`max_iter=100, tol=1e-6`) does not converge within `max_iter` on the simple synthetic panel fixture, so this PR will surface a previously silent non-convergence that was affecting routine user fits. The warning is the correct signal; users who want to silence it can raise `max_iter` or loosen `tol`.
- `feedback_targeted_tests` convention.

Security / privacy