Skip to content

Fix quick start guide output accuracy#17

Merged
igerber merged 1 commit intomainfrom
claude/fix-quickstart-output-072BQ
Jan 3, 2026
Merged

Fix quick start guide output accuracy#17
igerber merged 1 commit intomainfrom
claude/fix-quickstart-output-072BQ

Conversation

@igerber
Copy link
Copy Markdown
Owner

@igerber igerber commented Jan 3, 2026

The documented output values were incorrect. Updated to match actual output from running the Quick Start example code (ATT=3.0, SE=1.7321, p=0.1583).

The documented output values were incorrect. Updated to match actual
output from running the Quick Start example code (ATT=3.0, SE=1.7321,
p=0.1583).
@igerber igerber merged commit 73c3ae4 into main Jan 3, 2026
@igerber igerber deleted the claude/fix-quickstart-output-072BQ branch January 3, 2026 19:37
igerber added a commit that referenced this pull request Apr 19, 2026
…aths

Addresses findings #17, #18, #19 from the Phase 2 silent-failures audit (axis A,
all Minor). Each site previously ran np.linalg.solve against a matrix that
could be rank-deficient or near-singular with no user-facing signal.

- StaggeredTripleDifference: `_compute_did_panel` now appends a condition-number
  sample to an instance tracker on LinAlgError; `fit()` emits ONE aggregate
  UserWarning listing affected (g, g_c, t) cells and the max condition number
  instead of silently falling back to np.linalg.lstsq per pair. Tracker resets
  on repeat fit.
- EfficientDiD covariate sieve (estimate_propensity_ratio_sieve,
  estimate_inverse_propensity_sieve): precondition-check the normal-equations
  matrix via np.linalg.cond before solve and reject K values above
  1/sqrt(eps); partial-K skips now surface via UserWarning listing the
  skipped K values, instead of being swallowed by `continue`.
- compute_survey_vcov: check cond(X'WX) before the sandwich solve; emit
  UserWarning above the 1/sqrt(eps) threshold so ill-conditioned bread
  matrices don't silently produce unstable variance estimates.

Sibling sites picked up via repo-wide lstsq-fallback pattern grep (per
the pattern-check feedback memory):
- two_stage.py:1768 (TSL variance bread)
- two_stage_bootstrap.py:197 (multiplier bootstrap bread)
Both now warn before the silent lstsq fallback.

Adds 8 targeted tests across test_staggered_triple_diff.py,
test_efficient_did.py, and test_survey.py, covering collinear/ill-conditioned
triggers and happy-path negatives. REGISTRY.md notes added for each affected
estimator section. No behavioral change on well-conditioned inputs.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
igerber added a commit that referenced this pull request Apr 19, 2026
…oStage tests

CI review on PR #334 flagged one P1 (incomplete audit of silent solve fallbacks)
and two P3s.

P1 — Callaway-Sant'Anna `_safe_inv()` sibling site:
  The shared `_safe_inv(A)` helper in staggered.py was still silently falling
  back from np.linalg.solve to np.linalg.lstsq. It feeds ~13 analytical SE
  paths (propensity-score Hessian, OR bread, event-study bread, etc.), so
  a rank-deficient design could silently ship degraded analytical SEs.

  Extended `_safe_inv(A, tracker: Optional[list] = None)` to append a
  condition-number sample on LinAlgError when a tracker is passed. Initialize
  `self._safe_inv_tracker: List[float] = []` at the top of
  `CallawaySantAnna.fit()`, thread `tracker=self._safe_inv_tracker` through
  all 13 callsites, and emit ONE aggregate UserWarning at the end of fit()
  listing the number of fallbacks and max condition number. Matches the
  tracker pattern established in STD finding #17.

  Added TestCallawaySantAnnaSafeInvFallback with two tests: collinear
  covariates trigger the aggregate warning; well-conditioned data emits no
  warning (happy-path regression guard).

  REGISTRY.md §CallawaySantAnna notes the new warning contract.

P3 — Sieve docstrings lag behavior: Updated estimate_propensity_ratio_sieve
  and estimate_inverse_propensity_sieve docstrings to describe the new
  cond(A) > 1/sqrt(eps) precondition check, the partial-K skip warning,
  and the all-K fallback semantics.

P3 — No TwoStage regression coverage: Added TestTwoStageStage2BreadWarning
  with two tests covering both the analytical TSL and bootstrap bread
  paths (contract: if the lstsq fallback triggers, it must warn).

TODO.md: logged honest_did.py:1907 basis-enumeration skip as an intentional
algorithm behavior (not a silent failure per the Phase 2 audit definition)
but notable for a future diagnostic enhancement.

No behavioral change on well-conditioned inputs.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants