diff --git a/.github/workflows/docs-tests.yml b/.github/workflows/docs-tests.yml
index 66b87c5e..40ae1adf 100644
--- a/.github/workflows/docs-tests.yml
+++ b/.github/workflows/docs-tests.yml
@@ -7,6 +7,7 @@ on:
       - 'docs/**'
       - 'diff_diff/**'
       - 'tests/test_doc_snippets.py'
+      - 'tests/test_doc_deps_integrity.py'
       # tests/conftest.py is auto-loaded by pytest for the snippet
       # test run and mutates sys.path + MPLBACKEND (conftest.py:14, 18);
       # changes there can break snippet exec without touching the test
@@ -24,6 +25,7 @@ on:
       - 'docs/**'
       - 'diff_diff/**'
       - 'tests/test_doc_snippets.py'
+      - 'tests/test_doc_deps_integrity.py'
       - 'tests/conftest.py'
       - 'pyproject.toml'
       # sphinx-build job mirrors RTD setup; trigger when RTD config drifts
@@ -40,7 +42,7 @@ permissions:
 
 jobs:
   doc-snippets:
-    name: Validate RST code snippets
+    name: Validate doc snippets + doc-deps integrity
     # Skip unrelated label churn: a non-ready-for-ci label add/remove won't run this job.
     if: >-
       github.event_name != 'pull_request'
@@ -70,6 +72,13 @@ jobs:
         # `python-fallback` job in rust-test.yml).
         run: PYTHONPATH=. DIFF_DIFF_BACKEND=python pytest tests/test_doc_snippets.py -v
 
+      - name: Run doc-deps integrity check
+        # Validates docs/doc-deps.yaml: no stale referenced paths, no unmapped
+        # public source modules. Same PYTHONPATH=. prefix as above because
+        # tests/conftest.py imports diff_diff at collection and the install
+        # step does not install diff_diff.
+        run: PYTHONPATH=. DIFF_DIFF_BACKEND=python pytest tests/test_doc_deps_integrity.py -v
+
   sphinx-build:
     name: Sphinx HTML build (-W warnings as errors)
     # Skip unrelated label churn: a non-ready-for-ci label add/remove won't run this job.
diff --git a/TODO.md b/TODO.md
index fe7971cd..5b04c6c6 100644
--- a/TODO.md
+++ b/TODO.md
@@ -171,8 +171,7 @@ Deferred items from PR reviews that were not addressed before merge.
 | Port the CI `` extraction into the reviewer-eval harness so `docs/tutorials/*.ipynb` cases (currently guarded out of `verify-corpus`/`run`) can be reviewed with CI-equivalent context | `tools/reviewer-eval/adapters/ci_prompt.py` | local-review | Low |
 | R comparison tests spawn separate `Rscript` per test (slow CI) | `tests/test_methodology_twfe.py:294` | #139 | Low |
 | CS R helpers hard-code `xformla = ~ 1`; no covariate-adjusted R benchmark for IRLS path | `tests/test_methodology_callaway.py` | #202 | Low |
-| Doc-snippet smoke tests only cover `.rst` files; `.txt` AI guides outside CI validation | `tests/test_doc_snippets.py` | #239 | Low |
-| Add CI validation for `docs/doc-deps.yaml` integrity (stale paths, unmapped source files) | `docs/doc-deps.yaml` | #269 | Low |
+| Validating the `.txt` AI guides (`diff_diff/guides/llms-full.txt`, `llms-practitioner.txt`) as executable snippets is **not low-lift** (re-scoped 2026-06-01): of their ~112 fenced Python blocks only ~20% are standalone-runnable — the rest are API-signature references (`Foo(param: type = default)` pseudo-signatures that are `SyntaxError` by design), context fragments (e.g. `results.att` on an undefined `results`), or dataset-shape-specific blocks. The guides are reference documentation, not runnable examples; a real implementation needs signature-block detection + a context/data skip-allowlist + per-snippet fixtures (multi-round curation), unlike the curated `.rst` files the existing smoke test covers. | `tests/test_doc_snippets.py` | #239 | Low |
 | SyntheticDiD: rename internal `placebo_effects` variable to `variance_effects` (or `resampled_effects`). Misleading name across the placebo/bootstrap/jackknife dispatch paths — holds three different contents depending on variance method. Low-risk refactor; user-facing field rename should preserve `placebo_effects` as a deprecated alias for one release. | `synthetic_did.py`, `results.py` | follow-up | Medium |
 | AI review CI: pin workflow contract via test (uses `openai/codex-action@v1`, passes `prompt-file`, reads `steps.run_codex.outputs.final-message`, preserves diff-exclude paths and comment markers). Currently only the wrapper-tag and closing-tag-escape strings are asserted. | `tests/test_openai_review.py`, `.github/workflows/ai_pr_review.yml` | #416 | Low |
 | `TestWorkflowDoesNotExecutePRHeadCode` (CodeQL #14 dismissal guard) does not model: `bash