Update roadmap with current implementation limitations #21

Merged
igerber merged 2 commits into main from claude/update-roadmap-doc-u8rcF on Jan 3, 2026

igerber (Owner) commented on Jan 3, 2026

Add Priority 1 section documenting features that are partially implemented
or have known limitations in existing estimators:

  • CallawaySantAnna bootstrap inference (n_bootstrap raises NotImplementedError)
  • CallawaySantAnna covariate adjustment (parameter accepted but unused)
  • MultiPeriodDiD wild bootstrap (warns and falls back to analytical)
  • DifferenceInDifferences.predict() (raises NotImplementedError)
  • SyntheticDiD robustness (silent bootstrap failures)
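The limitations listed above fail in three different ways: some raise, some warn and fall back, and some silently ignore the argument. A minimal sketch of how a caller could tell these apart follows. `ToyEstimator`, its parameters, and `probe` are hypothetical stand-ins for illustration only; they are not the library's real classes or signatures.

```python
import warnings


class ToyEstimator:
    """Hypothetical stand-in; not the library's real classes or signatures."""

    def __init__(self, n_bootstrap=0, covariates=None, inference="analytical"):
        if n_bootstrap:
            # Loud failure: the caller learns immediately that the feature is missing.
            raise NotImplementedError("bootstrap inference is not implemented")
        # Accepted but never read afterwards: a silent no-op.
        self.covariates = covariates
        if inference == "wild_bootstrap":
            # Middle ground: warn, then fall back to the supported method.
            warnings.warn("wild bootstrap unavailable; using analytical SEs", UserWarning)
            inference = "analytical"
        self.inference = inference


def probe(**kwargs):
    """Report how the constructor reacted: 'raises', 'warns', or 'silent'."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        try:
            ToyEstimator(**kwargs)
        except NotImplementedError:
            return "raises"
    return "warns" if caught else "silent"


results = {
    "n_bootstrap": probe(n_bootstrap=999),
    "wild_bootstrap": probe(inference="wild_bootstrap"),
    "covariates": probe(covariates=["age"]),
}
```

Only the first two modes are visible to the caller at run time; the "silent" case is the one that needs documentation, since nothing in the program's behavior reveals it.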

Also add:

  • Quick overview table for at-a-glance status
  • Goodman-Bacon decomposition to usability section
  • Code quality & technical debt section
  • Future considerations for alternative inference methods
  • Updated visualization and formula interface status

claude added 2 commits January 3, 2026 22:28
Add Priority 1 section documenting features that are partially implemented
or have known limitations in existing estimators:

- CallawaySantAnna bootstrap inference (n_bootstrap raises NotImplementedError)
- CallawaySantAnna covariate adjustment (parameter accepted but unused)
- MultiPeriodDiD wild bootstrap (warns and falls back to analytical)
- DifferenceInDifferences.predict() (raises NotImplementedError)
- SyntheticDiD robustness (silent bootstrap failures)

Also add:
- Quick overview table for at-a-glance status
- Goodman-Bacon decomposition to usability section
- Code quality & technical debt section
- Future considerations for alternative inference methods
- Updated visualization and formula interface status

Reorganize priorities based on what practitioners actually need:

1.0 Blockers (essential for credibility):
- Honest DiD / Rambachan-Roth sensitivity analysis
- CallawaySantAnna covariate adjustment
- API documentation site

1.0 Target (strengthen release):
- Goodman-Bacon decomposition
- Power analysis tools
- CallawaySantAnna bootstrap inference

Post-1.0 (future versions):
- Sun-Abraham, Borusyak-Jaravel-Spiess, ML extensions

Demoted to technical debt:
- predict() method (rarely needed)
- MultiPeriodDiD wild bootstrap (edge case)

Added clear rationale for each feature explaining why it matters
to practitioners and how it compares to R ecosystem.
igerber merged commit 1774ffd into main on Jan 3, 2026
igerber deleted the claude/update-roadmap-doc-u8rcF branch on January 3, 2026 at 22:37
igerber added a commit that referenced this pull request on Apr 19, 2026
Phase 2 silent-failures audit — axis-G (backend parity). Closes the
coverage gap the audit flagged in three Rust-backed solver surfaces.
Test-only PR; any discovered divergences are marked `xfail(strict=True)`
and logged to `TODO.md` as P1 follow-ups rather than fixed in-scope.

Finding #21 — `solve_ols` skip-rank-check parity (`linalg.py:369-373,
597-639`): three parity tests in `TestSolveOLSSkipRankCheckParity`
covering mixed-scale columns (norm ratio > 1e6), near-singular full-rank
(cond > 1e10), and rank-deficient collinear designs under
`skip_rank_check=True` on HC1. Backends agree on fitted values within
`rtol=1e-6, atol=1e-8`. All pass; no Rust-side code change needed.
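
The tolerance rule behind a parity check of this kind can be sketched as follows. This is a stdlib-only illustration under stated assumptions: the two fitting functions are hypothetical stand-ins for two backends (they are not `solve_ols`), and `allclose` reimplements the `|a - b| <= atol + rtol * |b|` rule that `numpy.allclose` uses.

```python
def allclose(a, b, rtol=1e-6, atol=1e-8):
    """Elementwise |a - b| <= atol + rtol * |b| (the numpy.allclose rule)."""
    return all(abs(x - y) <= atol + rtol * abs(y) for x, y in zip(a, b))


def fit_normal_equations(x, y):
    """Hypothetical 'backend A': simple regression from raw sums."""
    n = len(x)
    sx, sy = sum(x), sum(y)
    sxx = sum(v * v for v in x)
    sxy = sum(u * v for u, v in zip(x, y))
    slope = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    intercept = (sy - slope * sx) / n
    return [intercept + slope * v for v in x]


def fit_centered(x, y):
    """Hypothetical 'backend B': the same model, computed on centered data."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    slope = sum((u - mx) * (v - my) for u, v in zip(x, y)) / sum((u - mx) ** 2 for u in x)
    return [my + slope * (v - mx) for v in x]


# A badly scaled regressor (values near 1e6 to 1e7) with a small alternating noise term.
x = [1e6 * i for i in range(1, 11)]
y = [3.0 + 2e-6 * v + 0.1 * (-1) ** i for i, v in enumerate(x)]

fitted_a = fit_normal_equations(x, y)
fitted_b = fit_centered(x, y)
parity_ok = allclose(fitted_a, fitted_b)
```

Comparing fitted values rather than coefficients is the relevant design choice here: under collinearity the coefficients are not unique, while the fitted values still are.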

Finding #22 — `compute_synthetic_weights` parity (`utils.py:1134-1199`):
three parity tests in `TestSyntheticWeightsBackendParity`. Near-singular
`Y'Y` passes at `atol=1e-7`; extreme Y scale (1e9) and lambda_reg
variations are `xfail(strict=True)` with a baselined ~15-80% weight
divergence. Root cause: Rust path is Frank-Wolfe, Python fallback is
projected gradient descent (`utils.py:1228`) — same QP, different
simplex vertices under near-degenerate inputs.
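
How two correct solvers can return different weights for the same problem can be shown on a toy case. The sketch below is not the library's code: it assumes an unregularized objective `||Y w - t||^2` on the simplex and uses a duplicated donor column, so every split of weight between the two copies is optimal. All function names are hypothetical.

```python
def gradient(Y, t, w):
    """Gradient of ||Y w - t||^2; rows of Y are periods, columns are donors."""
    resid = [sum(y * wj for y, wj in zip(row, w)) - tk for row, tk in zip(Y, t)]
    return [2.0 * sum(row[j] * r for row, r in zip(Y, resid)) for j in range(len(w))]


def objective(Y, t, w):
    return sum((sum(y * wj for y, wj in zip(row, w)) - tk) ** 2 for row, tk in zip(Y, t))


def frank_wolfe(Y, t, n_iter=200, tol=1e-12):
    n = len(Y[0])
    w = [1.0 / n] * n
    for k in range(n_iter):
        g = gradient(Y, t, w)
        j = g.index(min(g))  # best vertex; ties go to the first index
        gap = sum(gi * wi for gi, wi in zip(g, w)) - g[j]
        if gap <= tol:  # Frank-Wolfe duality gap as the stopping rule
            break
        gamma = 2.0 / (k + 2)
        w = [(1 - gamma) * wi + (gamma if i == j else 0.0) for i, wi in enumerate(w)]
    return w


def project_to_simplex(v):
    """Euclidean projection onto {w >= 0, sum(w) = 1} via the sort-based rule."""
    tau, cumulative = 0.0, 0.0
    for k, u_k in enumerate(sorted(v, reverse=True), start=1):
        cumulative += u_k
        candidate = (cumulative - 1.0) / k
        if u_k - candidate > 0:
            tau = candidate
    return [max(x - tau, 0.0) for x in v]


def projected_gradient(Y, t, step=0.01, n_iter=500):
    n = len(Y[0])
    w = [1.0 / n] * n
    for _ in range(n_iter):
        g = gradient(Y, t, w)
        w = project_to_simplex([wi - step * gi for wi, gi in zip(w, g)])
    return w


# Donors 0 and 1 are identical and both match the target exactly.
Y = [[1.0, 1.0, 3.0], [2.0, 2.0, 2.0], [3.0, 3.0, 1.0]]
t = [1.0, 2.0, 3.0]

w_fw = frank_wolfe(Y, t)
w_pgd = projected_gradient(Y, t)
obj_gap = abs(objective(Y, t, w_fw) - objective(Y, t, w_pgd))
max_weight_gap = max(abs(a - b) for a, b in zip(w_fw, w_pgd))
```

In this toy, Frank-Wolfe jumps to a single vertex while projected gradient keeps the symmetric split, so the objective values agree and the weights do not. A parity test on weights alone would flag this even though neither answer is wrong.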

Finding #23 — TROP Rust grid-search + bootstrap parity
(`trop_global.py:688-750, 966-1006`): two parity tests in
`TestTROPRustEdgeCaseParity`, `@pytest.mark.slow` class-level. Both
`xfail(strict=True)`: grid-search ATT on rank-deficient Y (~6%
divergence), bootstrap SE under `seed=42` (~28% divergence, RNG
backend mismatch — Rust `rand` crate vs numpy `default_rng`).
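
The RNG mismatch can be illustrated without either real backend: two generators seeded with the same integer produce different draw sequences, so a bootstrap SE computed from a finite number of replications will not match across them. In the sketch below, `random.Random` and a small linear congruential generator are assumed stand-ins for the two backends; `LcgRng` and `bootstrap_se` are hypothetical names, and the sketch makes no claim about the size of the divergence reported above.

```python
import random
import statistics


class LcgRng:
    """Hypothetical second backend: a small linear congruential generator."""

    def __init__(self, seed):
        self.state = seed

    def randrange(self, n):
        # Numerical Recipes LCG constants; use the high bits of the state.
        self.state = (1664525 * self.state + 1013904223) % 2**32
        return (self.state >> 16) % n


def bootstrap_se(data, rng, n_boot=200):
    """Standard error of the mean from resampling with replacement."""
    n = len(data)
    means = []
    for _ in range(n_boot):
        sample = [data[rng.randrange(n)] for _ in range(n)]
        means.append(sum(sample) / n)
    return statistics.stdev(means)


data = [float(i % 7) for i in range(30)]

se_py = bootstrap_se(data, random.Random(42))
se_lcg = bootstrap_se(data, LcgRng(42))
se_py_again = bootstrap_se(data, random.Random(42))
```

Each generator is reproducible on its own (`se_py` equals `se_py_again`), but the seed only pins the result within one backend. Exact cross-backend parity would require sharing either the generator or the drawn indices.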

Plan governance:
- Per `feedback_ci_reviewer_pattern_checks`, grepped adjacent Rust
  entry points (`_solve_ols_rust`, `_rust_synthetic_weights`,
  `_rust_loocv_grid_search_global`, `_rust_bootstrap_trop_variance_global`);
  no additional silent-fallback surfaces identified.
- Per plan Non-goal #4, did not open an axis-H finding on TROP's
  `seed=None → 0` substitution at `trop_global.py:994` (out of scope).
- No behavioral changes, no warnings, no REGISTRY changes, no flags.
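
For context on the `seed=None → 0` item, the effect of such a substitution can be sketched with the standard library alone. The helper names are hypothetical and the code is not taken from `trop_global.py`; it only contrasts the two conventions.

```python
import random


def resolve_substituting(seed):
    # Hypothetical helper mirroring the substitution described above: None becomes 0.
    return 0 if seed is None else seed


def resolve_passthrough(seed):
    # Conventional behavior: None is passed on, so the generator seeds from OS entropy.
    return seed


def draw(seed, resolver, size=3):
    rng = random.Random(resolver(seed))
    return [rng.random() for _ in range(size)]


same_every_run = draw(None, resolve_substituting) == draw(None, resolve_substituting)
differs_between_runs = draw(None, resolve_passthrough) != draw(None, resolve_passthrough)
```

With the substitution, a caller who passes no seed gets identical draws on every call, which differs from what `seed=None` conventionally signals.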

TODO.md logs three P1 follow-up entries: algorithmic unification for
`compute_synthetic_weights` (FW vs PGD), TROP grid-search divergence on
rank-deficient Y, TROP bootstrap RNG unification.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>