Skip to content

chore: prune completed-vcov-initiative leftovers + audit-verified stale TODO rows#502

Merged
igerber merged 1 commit into
mainfrom
chore/todo-cleanup
May 30, 2026
Merged

chore: prune completed-vcov-initiative leftovers + audit-verified stale TODO rows#502
igerber merged 1 commit into
mainfrom
chore/todo-cleanup

Conversation

@igerber
Copy link
Copy Markdown
Owner

@igerber igerber commented May 30, 2026

Summary

  • Prune TODO.md leftovers from the now-complete vcov_type threading initiative (all 8 standalone estimators merged, TwoStageDiD TwoStageDiD: thread vcov_type as narrow {hc1} contract (Phase 1b interstitial #5, final) #498 last): the | Done | umbrella row, the Tier B threading bullet, the stale duplicate TwoStageDiD-Conley row, the "Rows 104-105 LIFTED" comment block + the two ~~LIFTED~~ weighted-BM rows + the Tier C LIFTED bullet, and two resolved-marker HTML comments; rewrite the Standard Error Consistency prose to "complete" and repoint its weighted-CR2 gate at the open multi-absorb row.
  • Audit-verified prune of stale follow-up rows (5 subagents over ~50 rows, every finding re-verified against current source before acting — the vast majority are genuine open deferrals): remove the bias_corrected_local_linear: weights= row (shipped) and narrow the Tier D lprobust bullet's weightsweight-aware auto-bandwidth DPI; remove the compute_survey_metadata/raw_w_for_meta dedup row (done via the shared survey._resolve_survey_for_fit helper); tighten the HAD Phase-4.5-C survey-aware-pretests row to its two remaining open items.
  • Net: TODO.md 363 → 346 lines. Documentation/tracker only — no methodology or source changes.

Methodology references (required if estimator / math changes)

  • Method name(s): N/A — no methodology changes (TODO.md tracker only; nothing changed in diff_diff/, rust/src/, or docs/methodology/)
  • Paper / source link(s): N/A
  • Any intentional deviations from the source (and why): None

Validation

Security / privacy

  • Confirm no secrets/PII in this PR: Yes

🤖 Generated with Claude Code

@github-actions
Copy link
Copy Markdown

Overall Assessment
✅ Looks good

Executive Summary

  • This is a tracker/documentation-only PR; I found no methodology or implementation regressions.
  • The methodology-facing removals and rewrites are consistent with the current registry and code: shipped weighted CR2-BM support, completed vcov_type threading, retained {"hc1"} narrowing for IF/GMM estimators, and the narrowed HAD / local-linear follow-up rows all check out.
  • No P0/P1 issues found.
  • One minor P3 documentation drift remains: TODO.md still cites REGISTRY:2382, but the relevant HAD sup-t singleton-strata deviation now lives at docs/methodology/REGISTRY.md.
  • I could not run pytest in this environment because pytest is not installed; validation here was static against source and existing test files.

Methodology

Code Quality

  • No findings.

Performance

  • No findings.

Maintainability

  • Severity: P3
    Impact: TODO.md says the Stute singleton-strata gap is “the same gap as the HAD sup-t deviation at REGISTRY:2382,” but docs/methodology/REGISTRY.md is now in the TROP section. The relevant HAD note is docs/methodology/REGISTRY.md. This does not affect behavior, but it weakens the accuracy of an audit/pruning PR.
    Concrete fix: Replace the brittle absolute line reference with the current HAD location, or better, use a section title/anchor instead of a raw line number.

Tech Debt

  • No findings. The deleted helper/weights rows are consistent with current helper/wrapper surfaces, including _resolve_survey_for_fit in diff_diff/survey.py and the documented weighted local-linear gap in diff_diff/local_linear.py.

Security

  • No findings.

Documentation/Tests

@igerber igerber force-pushed the chore/todo-cleanup branch from 23f390e to 770af78 Compare May 30, 2026 22:34
@github-actions
Copy link
Copy Markdown

🔁 AI review rerun (requested by @igerber)

Head SHA: 770af785917f66d2da6791f5cd917239ea2aaeb2


Overall Assessment
✅ Looks good

Executive Summary

Methodology

  • Severity: P3
    Impact: TODO.md:231 correctly says TwoStageDiD is permanently narrow to {"hc1"}, but the rationale is slightly overstated: the four IF-based estimators are narrow because of IF-based variance, while TwoStageDiD is narrow because its Gardner GMM-corrected meat has no single cross-stage hat-matrix analogue for classical/hc2/hc2_bm. The conclusion is right; the audit trail is just less precise than the registry. See docs/methodology/REGISTRY.md:348, docs/methodology/REGISTRY.md:1408, and diff_diff/two_stage.py:3262.
    Concrete fix: Split the rationale in that sentence, e.g. “the four IF-based estimators … are permanently narrow to {"hc1"} per their IF variance, and TwoStageDiD is likewise permanently narrow because its GMM-corrected meat has no derived HC2/CR2/classical analogue.”

Code Quality

  • No findings.

Performance

  • No findings.

Maintainability

  • No findings. The previous TODO→REGISTRY drift called out in the last review is resolved in the changed file at TODO.md:128.

Tech Debt

Security

  • No findings.

Documentation/Tests

  • No findings. Static review only; I did not run the test suite in this environment.

…ed stale rows

Cleanup of TODO.md now that the vcov_type threading initiative is complete (all 8
standalone estimators merged, TwoStageDiD #498 last). TODO.md only — no methodology
or source changes.

Compact prune of the completed initiative's leftovers:
- the `| Done |` umbrella row + its orphan blank lines (rejoins the methodology
  table to its header), the Tier B threading bullet, and the stale duplicate
  TwoStageDiD-Conley row whose "`__init__` lacks `vcov_type`" premise is false post-#498
- the "Rows 104-105 LIFTED" comment block, the two ~~LIFTED~~ weighted-BM rows, and
  the Tier C LIFTED bullet (clubSandwich WLS-CR2 port, #475)
- two resolved-marker HTML comments (WooldridgeDiD cohort_share; PreTrendsPower)
- rewrote the Standard Error Consistency prose to "complete" and repointed its
  weighted-CR2 gate at the open multi-absorb row

Staleness audit of the ~50 remaining follow-up rows (5 subagents; every finding
re-verified against current source before acting -- the vast majority are genuine
open deferrals):
- removed the `bias_corrected_local_linear: weights=` row (shipped; residual
  unweighted-DPI gap already tracked by the sibling row) and narrowed the Tier D
  lprobust bullet's stale `weights` -> `weight-aware auto-bandwidth DPI`
- removed the `compute_survey_metadata`/`raw_w_for_meta` dedup row (done via the
  shared `survey._resolve_survey_for_fit` helper)
- tightened the HAD Phase-4.5-C survey-aware-pretests row: dropped the shipped
  pweight+PSU+FPC+strata narration, kept the two open items (replicate-weight
  designs; lonely_psu='adjust'+singleton on the Stute family)

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@igerber igerber force-pushed the chore/todo-cleanup branch from 770af78 to 68c0026 Compare May 30, 2026 22:41
@github-actions
Copy link
Copy Markdown

🔁 AI review rerun (requested by @igerber)

Head SHA: 68c0026e1f2e92ffdba867be97037b1673c2c4e3


Overall Assessment
✅ Looks good

Executive Summary

Methodology

  • No findings. The changed prose in TODO.md:L229-L231 now matches the Methodology Registry and runtime guard for TwoStageDiD; the prior P3 wording issue is resolved.

Code Quality

  • No findings.

Performance

  • No findings.

Maintainability

  • No findings. The tracker deletions I spot-checked correspond to shipped state or genuine de-duplication, not disappearing active work.

Tech Debt

  • No findings. Remaining deferred items are still tracked where expected, including standalone vcov_type="conley" follow-ups and the HAD pretest gaps at TODO.md:L103-L128.

Security

  • No findings.

Documentation/Tests

  • No findings. Static review only; I did not run the test suite in this environment.

@igerber igerber added the ready-for-ci Triggers CI test workflows label May 30, 2026
@igerber igerber merged commit 56db3a8 into main May 30, 2026
5 of 6 checks passed
@igerber igerber deleted the chore/todo-cleanup branch May 30, 2026 22:53
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

ready-for-ci Triggers CI test workflows

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant