diff --git a/TODO.md b/TODO.md index c2fb6662..faf56ce2 100644 --- a/TODO.md +++ b/TODO.md @@ -174,7 +174,6 @@ Deferred items from PR reviews that were not addressed before merge. | CS R helpers hard-code `xformla = ~ 1`; no covariate-adjusted R benchmark for IRLS path | `tests/test_methodology_callaway.py` | #202 | Low | | Validating the `.txt` AI guides (`diff_diff/guides/llms-full.txt`, `llms-practitioner.txt`) as executable snippets is **not low-lift** (re-scoped 2026-06-01): of their ~112 fenced Python blocks only ~20% are standalone-runnable — the rest are API-signature references (`Foo(param: type = default)` pseudo-signatures that are `SyntaxError` by design), context fragments (e.g. `results.att` on an undefined `results`), or dataset-shape-specific blocks. The guides are reference documentation, not runnable examples; a real implementation needs signature-block detection + a context/data skip-allowlist + per-snippet fixtures (multi-round curation), unlike the curated `.rst` files the existing smoke test covers. | `tests/test_doc_snippets.py` | #239 | Low | | SyntheticDiD: rename internal `placebo_effects` variable to `variance_effects` (or `resampled_effects`). Misleading name across the placebo/bootstrap/jackknife dispatch paths — holds three different contents depending on variance method. Low-risk refactor; user-facing field rename should preserve `placebo_effects` as a deprecated alias for one release. | `synthetic_did.py`, `results.py` | follow-up | Medium | -| AI review CI: pin workflow contract via test (uses `openai/codex-action@v1`, passes `prompt-file`, reads `steps.run_codex.outputs.final-message`, preserves diff-exclude paths and comment markers). Currently only the wrapper-tag and closing-tag-escape strings are asserted. | `tests/test_openai_review.py`, `.github/workflows/ai_pr_review.yml` | #416 | Low | | `TestWorkflowDoesNotExecutePRHeadCode` (CodeQL #14 dismissal guard) does not model: `bash