feat: shared test coverage audit across plan/ship/review (v0.9.5.0) #259

Open
garrytan wants to merge 10 commits into main from garrytan/test-coverage-catalog

Summary

  • Extract {{TEST_COVERAGE_AUDIT}} shared resolver with 3 mode-specific placeholders (plan/ship/review)
  • Add /review Step 4.75 for codepath tracing during pre-landing review
  • Shared methodology: ASCII coverage diagrams, quality rubric (★/★★/★★★), E2E decision matrix, regression detection iron rule, test framework auto-detection
  • Plan mode traces the plan (not git diff); ship auto-generates tests; review uses Fix-First ASK
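
To make the shared-resolver idea concrete, here is a minimal sketch of how one generator might expand three mode-specific placeholders. Only the placeholder names and the per-mode behavior come from this PR; `renderCoverageAudit`, `resolvePlaceholders`, `SHARED_SECTIONS`, and `MODE_NOTES` are hypothetical names, and the real code in scripts/gen-skill-docs.ts may be organized differently.

```typescript
// Hypothetical sketch: only the three placeholder names are taken from the PR.
type AuditMode = "plan" | "ship" | "review";

// Methodology shared by every mode, as listed in the summary.
const SHARED_SECTIONS: string[] = [
  "ASCII coverage diagram",
  "Quality rubric (★/★★/★★★)",
  "E2E decision matrix",
  "Regression detection iron rule",
  "Test framework auto-detection",
];

// Mode-specific behavior, paraphrased from the summary.
const MODE_NOTES: Record<AuditMode, string> = {
  plan: "Trace the plan (not git diff).",
  ship: "Auto-generate tests for gaps.",
  review: "Fix-First: ASK before generating tests.",
};

// Render the audit text for one mode: mode note first, shared sections after.
function renderCoverageAudit(mode: AuditMode): string {
  return [MODE_NOTES[mode], ...SHARED_SECTIONS.map((s) => `- ${s}`)].join("\n");
}

const PLACEHOLDERS: Record<string, AuditMode> = {
  TEST_COVERAGE_AUDIT_PLAN: "plan",
  TEST_COVERAGE_AUDIT_SHIP: "ship",
  TEST_COVERAGE_AUDIT_REVIEW: "review",
};

// Expand {{NAME}} tokens; unknown placeholders are left untouched.
function resolvePlaceholders(template: string): string {
  return template.replace(/\{\{(\w+)\}\}/g, (match: string, name: string) => {
    const mode = PLACEHOLDERS[name];
    return mode ? renderCoverageAudit(mode) : match;
  });
}
```

Keeping the shared sections in one list means a methodology change lands in all three skills at once, which is the DRY benefit the PR is after.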

Test Coverage

  • 451 free tests pass (skill validation + gen-skill-docs)
  • 2/2 coverage audit E2Es pass (ship + review, stable across 2 consecutive runs)
  • Full E2E suite: 7 pre-existing failures (design-lite, qa-only, etc.), all unrelated
  • Codex code review: GATE PASS (0 findings)
  • Codex adversarial: 1 finding fixed (plan mode git diff text)

Tests: 417 → 451 (+34 from main merge)

Pre-Landing Review

No issues found. All auto-fixable items addressed in prior commits.

Reviews

  • Eng Review: CLEARED (2 runs, clean)
  • Codex Review: PASS (gate clean, 0 findings)
  • Codex Adversarial: 1 finding → fixed in commit aa9f186

Test plan

  • All free tests pass (451 tests, 0 failures)
  • Ship coverage audit E2E passes
  • Review coverage audit E2E passes
  • Merge conflicts resolved cleanly
  • Regenerated SKILL.md files are fresh

🤖 Generated with Claude Code

garrytan and others added 10 commits March 20, 2026 07:57
DRY extraction of the test coverage audit methodology into a shared
generator function with three explicit placeholders:
- TEST_COVERAGE_AUDIT_PLAN (plan-eng-review)
- TEST_COVERAGE_AUDIT_SHIP (ship)
- TEST_COVERAGE_AUDIT_REVIEW (review)

Shared across all modes: codepath tracing, ASCII diagram format,
quality scoring rubric, E2E test decision matrix, regression rule,
and test framework detection via CLAUDE.md.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Replace the thin 6-line Section 3 test review with the full shared
methodology via {{TEST_COVERAGE_AUDIT_PLAN}}. Plan mode now:
- Traces every codepath with full ASCII diagrams
- Adds missing tests to the plan (not just "check for tests")
- Writes test plan artifact for /qa consumption
- Includes E2E/eval recommendations and regression detection

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Replace 135 lines of inline Step 3.4 methodology with
{{TEST_COVERAGE_AUDIT_SHIP}}. Functionally identical output plus:
- E2E test decision matrix (marks paths needing E2E vs unit)
- Eval recommendations for LLM prompt changes
- Regression detection iron rule
- Test framework detection via CLAUDE.md first
- Test plan artifact for /qa consumption

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Add codepath tracing to the pre-landing review via
{{TEST_COVERAGE_AUDIT_REVIEW}}. Review mode:
- Produces ASCII coverage diagram (same methodology as plan/ship)
- Generates tests for gaps via Fix-First (ASK user)
- Subsumes Pass 2 "Test Gaps" checklist category
- Gaps are INFORMATIONAL findings

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
10 new tests verifying the three TEST_COVERAGE_AUDIT placeholders:
- All modes share: codepath tracing, E2E matrix, regression rule
- Plan mode: adds to plan + artifact, no ship-specific content
- Ship mode: auto-generates + before/after count + coverage summary
- Review mode: Fix-First ASK + INFORMATIONAL, no artifact
- Regression guard: ship SKILL.md preserves all key phrases

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
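
The mode checks described in this commit (required content present, content from other modes absent) can be sketched as a small phrase guard. `checkPhrases` and the sample phrases below are hypothetical illustrations, not the assertions actually added to test/gen-skill-docs.test.ts.

```typescript
// Hypothetical helper: reports required phrases that are missing and
// forbidden phrases that unexpectedly appear in a generated document.
interface PhraseCheck {
  missing: string[];
  unexpected: string[];
}

function checkPhrases(
  doc: string,
  required: string[],
  forbidden: string[] = [],
): PhraseCheck {
  return {
    missing: required.filter((phrase) => !doc.includes(phrase)),
    unexpected: forbidden.filter((phrase) => doc.includes(phrase)),
  };
}

// Illustrative usage with made-up plan-mode text.
const planDoc = "Trace every codepath in the plan. Add missing tests to the plan.";
const planCheck = checkPhrases(
  planDoc,
  ["codepath", "Add missing tests to the plan"],
  ["before/after count"], // ship-specific wording should not leak into plan mode
);
// planCheck.missing and planCheck.unexpected are both empty for this input.
```

Returning both lists, rather than a boolean, lets a failing test report exactly which phrase regressed.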
- Extract billing.ts fixture into coverage-audit-fixture.ts (DRY)
- Refactor ship-coverage-audit E2E to use shared fixture
- Add review-coverage-audit E2E for Step 4.75
- Update touchfiles: both E2Es depend on shared fixture

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The coverage audit E2E tests (ship + review) were only asserting
exitReason === 'success' and readCalls > 0 — they passed even
if the agent produced no coverage diagram. Add assertion that
the output contains either GAP or TESTED markers.

Found during /review.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
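
A rough sketch of the stricter check this commit describes: besides a successful exit and at least one read call, the run must emit a coverage marker. `exitReason` and `readCalls` are named in the commit message; the `AuditRunResult` shape, the `output` field, and `coverageAuditFailures` are assumptions made for illustration.

```typescript
// Assumed result shape; only exitReason and readCalls appear in the commit message.
interface AuditRunResult {
  exitReason: string;
  readCalls: number;
  output: string;
}

// Returns a list of human-readable failures; an empty list means the run passes.
function coverageAuditFailures(result: AuditRunResult): string[] {
  const failures: string[] = [];
  if (result.exitReason !== "success") {
    failures.push(`exitReason was ${result.exitReason}`);
  }
  if (result.readCalls <= 0) {
    failures.push("agent made no read calls");
  }
  // The added assertion: a run with no coverage diagram must fail.
  if (!result.output.includes("GAP") && !result.output.includes("TESTED")) {
    failures.push("output contains neither GAP nor TESTED markers");
  }
  return failures;
}
```

With only the first two checks, a run that exits cleanly but prints nothing would return an empty list; the marker check closes that hole.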
Codex adversarial review caught that plan-eng-review was inheriting
"git diff origin/<base>...HEAD" from the shared resolver, but plan mode
reviews a plan document, not a code diff. Plan mode now says:
"Trace every codepath in the plan" and "Read the plan document."

Ship and review modes keep the git diff instruction.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
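
The fix amounts to choosing the tracing instruction per mode instead of inheriting one string everywhere. A minimal sketch, assuming a hypothetical `traceInstructions` function and a `base` parameter standing in for `<base>`; the quoted instruction text is taken from the commit message above.

```typescript
type AuditMode = "plan" | "ship" | "review";

// Hypothetical selector: plan mode reads the plan document, while
// ship and review modes diff the branch against its base.
function traceInstructions(mode: AuditMode, base: string): string[] {
  if (mode === "plan") {
    return ["Read the plan document.", "Trace every codepath in the plan."];
  }
  return [`git diff origin/${base}...HEAD`];
}
```

For example, `traceInstructions("ship", "main")` yields the single git diff instruction, while plan mode never mentions git diff at all.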
…e-catalog

# Conflicts:
#	scripts/gen-skill-docs.ts
#	test/gen-skill-docs.test.ts
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>