feat(build): TDD integration — Red→Green enforced by state machine (v1.14.0)#2

Merged
anbangr merged 23 commits into main from feat/tdd-build-skill
Apr 28, 2026
anbangr commented Apr 28, 2026

Summary

  • New 3-checkbox TDD plan format per phase: Test Specification → Implementation → Review & QA
  • 7-step execution loop: Gemini writes failing tests → VERIFY_RED confirms they fail → Gemini implements → recursive test+fix loop until green → Codex review → flip all 3 checkboxes → context save
  • detectTestCmd(cwd) auto-detects test runner (package.json scripts.test, pytest.ini, pyproject.toml, go.mod, Cargo.toml); --test-cmd flag overrides
  • Full backward compat: legacy 2-checkbox plans skip TDD steps entirely (parser sets testSpecDone=true)
  • 105 unit tests across 9 files (23 new), 0 failing — all TDD state transitions covered
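
The 7-step loop above can be read as a small state machine. The following is a minimal sketch under assumptions, not the code in phase-runner.ts: decideNextAction and applyResult are named in this PR, and VERIFY_RED, RUN_TESTS and RUN_GEMINI_FIX appear in its text, but the signatures, every status name and the remaining action names are hypothetical.

```typescript
// Hypothetical status names; the PR only says that five new PhaseStatus values exist.
type Status =
  | "pending" | "spec_written" | "red_confirmed" | "implemented"
  | "tests_failed" | "tests_green" | "reviewed" | "checked_off"
  | "done" | "failed";

// VERIFY_RED, RUN_TESTS and RUN_GEMINI_FIX come from the PR text; the rest are assumed.
type Action =
  | "RUN_GEMINI_TEST_SPEC" | "VERIFY_RED" | "RUN_GEMINI" | "RUN_TESTS"
  | "RUN_GEMINI_FIX" | "RUN_CODEX_REVIEW" | "FLIP_CHECKBOXES"
  | "SAVE_CONTEXT" | "DONE" | "ABORT";

// Pure mapping from the current status to the next step of the loop.
function decideNextAction(status: Status): Action {
  switch (status) {
    case "pending": return "RUN_GEMINI_TEST_SPEC";
    case "spec_written": return "VERIFY_RED";
    case "red_confirmed": return "RUN_GEMINI";
    case "implemented": return "RUN_TESTS";
    case "tests_failed": return "RUN_GEMINI_FIX";
    case "tests_green": return "RUN_CODEX_REVIEW";
    case "reviewed": return "FLIP_CHECKBOXES";
    case "checked_off": return "SAVE_CONTEXT";
    case "done": return "DONE";
    case "failed": return "ABORT";
  }
}

// Folds the outcome of an action back into the status.
// testsPassed is only meaningful for VERIFY_RED and RUN_TESTS.
function applyResult(status: Status, action: Action, testsPassed: boolean): Status {
  switch (action) {
    case "RUN_GEMINI_TEST_SPEC": return "spec_written";
    // Red step: the new tests must fail before any implementation exists.
    case "VERIFY_RED": return testsPassed ? "failed" : "red_confirmed";
    case "RUN_GEMINI": return "implemented";
    case "RUN_TESTS": return testsPassed ? "tests_green" : "tests_failed";
    // After a fix attempt, go back and run the tests again.
    case "RUN_GEMINI_FIX": return "implemented";
    case "RUN_CODEX_REVIEW": return "reviewed";
    case "FLIP_CHECKBOXES": return "checked_off";
    case "SAVE_CONTEXT": return "done";
    default: return status;
  }
}

// Dry-run driver: feeds scripted test outcomes and records the actions taken.
function simulate(testOutcomes: boolean[]): Action[] {
  const trace: Action[] = [];
  let status: Status = "pending";
  let next = 0;
  while (true) {
    const action = decideNextAction(status);
    trace.push(action);
    if (action === "DONE" || action === "ABORT") return trace;
    const runsTests = action === "VERIFY_RED" || action === "RUN_TESTS";
    const passed = runsTests ? (testOutcomes[next++] ?? true) : true;
    status = applyResult(status, action, passed);
  }
}
```

Calling simulate([false, false, true]) models a red run, one failing test run after implementation, and a green run after one fix. The sketch has no cap on fix iterations and leaves out the legacy 2-checkbox path; a real runner would need both.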

Key files changed

| File | Change |
| --- | --- |
| build/SKILL.md.tmpl + build/SKILL.md | v1.14.0: 3-checkbox format + 7-step loop |
| build/orchestrator/types.ts | 5 new PhaseStatus values; TDD fields on Phase/PhaseState |
| build/orchestrator/parser.ts | Parse **Test Specification checkbox; backward compat fallback |
| build/orchestrator/phase-runner.ts | TDD state machine (decideNextAction + applyResult extended) |
| build/orchestrator/sub-agents.ts | detectTestCmd, runGeminiTestSpec, runTests; logPrefix on runGemini |
| build/orchestrator/cli.ts | --test-cmd flag; buildGeminiTestSpecPrompt; RUN_GEMINI_FIX log prefix |
| build/orchestrator/plan-mutator.ts | flipTestSpecCheckbox |
| build/orchestrator/README.md | TDD Workflow section, env vars, file layout |
| build/orchestrator/__tests__/integration.test.ts | Dry-run integration test covering full TDD flow |
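
To illustrate the parser change and its backward compat fallback, here is a rough sketch. It assumes each checkbox is a line of the form `- [ ] **Label**`; the actual plan syntax is not shown in this PR. parsePhase, ParsedPhase and the isLegacy field are hypothetical names. Only testSpecDone, and the rule that legacy plans get testSpecDone=true, come from the description.

```typescript
interface ParsedPhase {
  testSpecDone: boolean;
  implementationDone: boolean;
  reviewDone: boolean;
  isLegacy: boolean; // hypothetical convenience flag, not from the PR
}

// Returns true/false for a found checkbox, undefined when the label is absent.
function readCheckbox(body: string, label: string): boolean | undefined {
  const escaped = label.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
  const match = new RegExp(`^\\s*- \\[([ xX])\\] \\*\\*${escaped}`, "m").exec(body);
  if (match === null) return undefined;
  return match[1]?.toLowerCase() === "x";
}

function parsePhase(body: string): ParsedPhase {
  const spec = readCheckbox(body, "Test Specification");
  return {
    // Legacy 2-checkbox plans have no spec line, so treat the spec as already done.
    testSpecDone: spec ?? true,
    implementationDone: readCheckbox(body, "Implementation") ?? false,
    reviewDone: readCheckbox(body, "Review & QA") ?? false,
    isLegacy: spec === undefined,
  };
}

// Example inputs (assumed syntax).
const tddPhase = [
  "- [ ] **Test Specification**: write failing tests",
  "- [ ] **Implementation**: make them pass",
  "- [ ] **Review & QA**: Codex review",
].join("\n");

const legacyPhase = [
  "- [x] **Implementation**: build the feature",
  "- [ ] **Review & QA**: Codex review",
].join("\n");
```

Defaulting the missing checkbox to "done" lets a downstream state machine skip the test-spec steps without a separate code path for old plans.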

Adversarial review findings addressed

  • JSON.parse in detectTestCmd wrapped in try/catch (malformed package.json no longer crashes orchestrator)
  • RUN_GEMINI_FIX now passes logPrefix: 'gemini-fix' to runGemini — fix-iteration logs go to phase-N-gemini-fix-1.log, not phase-N-gemini-1.log (fixes collision with implementation logs per README docs)
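
The first finding can be sketched as follows. This is an in-memory approximation, assuming a hypothetical ReadFile callback in place of the real detectTestCmd(cwd) filesystem access, so it runs without touching disk. The marker files match the list in the Summary, but the commands returned for each marker are illustrative guesses and are not taken from sub-agents.ts.

```typescript
// Hypothetical abstraction: returns file contents, or undefined if the file is absent.
type ReadFile = (name: string) => string | undefined;

function detectTestCmd(read: ReadFile): string | undefined {
  const pkg = read("package.json");
  if (pkg !== undefined) {
    try {
      const parsed = JSON.parse(pkg) as { scripts?: { test?: unknown } } | null;
      const test = parsed?.scripts?.test;
      // Assumed command; the PR does not say what is returned for package.json.
      if (typeof test === "string" && test.trim() !== "") return "npm test";
    } catch {
      // Malformed package.json: ignore it and fall through to the next detector.
    }
  }
  // The commands below are assumptions chosen for illustration.
  if (read("pytest.ini") !== undefined) return "pytest";
  if (read("pyproject.toml") !== undefined) return "pytest";
  if (read("go.mod") !== undefined) return "go test ./...";
  if (read("Cargo.toml") !== undefined) return "cargo test";
  return undefined;
}

// Helper for the examples: serve files from a plain object.
const fromFiles = (files: Record<string, string>): ReadFile =>
  (name) => files[name];

const broken = detectTestCmd(fromFiles({ "package.json": "{ not json", "go.mod": "module x" }));
const valid = detectTestCmd(fromFiles({ "package.json": '{"scripts":{"test":"bun test"}}' }));
const none = detectTestCmd(fromFiles({}));
```

The point is the catch branch: a parse error is treated the same as "no test script here", so a later detector still gets its turn.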

Test plan

  • HOME=/tmp/gstack-review-home bun test build/orchestrator/__tests__/ — 105 pass, 0 fail
  • bun build build/orchestrator/cli.ts --target=bun — compiles clean
  • ./bin/gstack-build --help — shows --test-cmd flag
  • gstack-build --dry-run --test-cmd "bun test" <3-checkbox-plan> — walks all 7 steps, exits 0
  • Legacy 2-checkbox plan runs unchanged (no TDD steps in output)

🤖 Generated with Claude Code

anbangr and others added 23 commits April 28, 2026 05:00
TDD Red phase for Phase 4. Tests import detectTestCmd from sub-agents.ts
and buildGeminiTestSpecPrompt from cli.ts — both exports not yet implemented.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Observability fix from Codex review M1 finding: silent skip was inconsistent
with RUN_TESTS which already warns. Now both log a ⚠ when no test command found.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…eration field)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…dry-run output

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- README.md: add TDD Workflow section (7-step loop, test command detection,
  --test-cmd usage, 3 new env vars, updated file layout and failure modes)
- cli.ts: log 'Test Specification' (not 'writing test spec') so integration
  test assertion passes and output is consistent with the phase format
- parser.test.ts: add TDD checkbox parsing tests (3-checkbox TDD phase,
  legacy 2-checkbox backward compat, testSpecDone=false for unchecked spec)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- fresh-start line: rm <slug>.lock → <slug>.json (lock prevents concurrent runs; .json holds state)
- monorepo --test-cmd example: cd ... && ... → bash -c wrapper (runTests splits on whitespace; cd is a shell builtin)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
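
A small illustration of why the `cd ... && ...` form cannot work when the command string is split on whitespace, as the commit above says runTests does. splitOnWhitespace is a hypothetical stand-in for that behavior and packages/app is a made-up path. How the quoted bash -c form is tokenized is not shown in this PR, so it is left out here.

```typescript
// Naive tokenizer: no quoting, no shell operators, just whitespace.
function splitOnWhitespace(cmd: string): string[] {
  return cmd.trim().split(/\s+/);
}

const argv = splitOnWhitespace("cd packages/app && bun test");
// argv[0] is "cd", a shell builtin rather than an executable that can be spawned,
// and "&&" is passed along as an ordinary argument instead of chaining two commands.
```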
- wrap JSON.parse in detectTestCmd with try/catch; malformed package.json
  no longer crashes the orchestrator, falls through to next detector
- remove stale Red-phase comment from integration.test.ts

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
runGemini now accepts logPrefix option (default 'gemini').
RUN_GEMINI_FIX passes logPrefix='gemini-fix' so fix iterations write
to phase-N-gemini-fix-1.log, not phase-N-gemini-1.log (which collides
with implementation iteration logs).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
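
The naming rule in the commit above might look like this. logFileName is a hypothetical helper and its signature is an assumption; the file name pattern and the default prefix 'gemini' are taken from the commit message.

```typescript
// Builds a per-phase, per-iteration log file name.
function logFileName(phase: number, iteration: number, logPrefix: string = "gemini"): string {
  return `phase-${phase}-${logPrefix}-${iteration}.log`;
}

const implLog = logFileName(4, 1);               // implementation iteration
const fixLog = logFileName(4, 1, "gemini-fix");  // fix iteration, distinct file
```

With a shared prefix both calls would produce phase-4-gemini-1.log, which is the collision the commit removes.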
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
anbangr merged commit fae1a8c into main Apr 28, 2026
anbangr deleted the feat/tdd-build-skill branch April 28, 2026 03:42
anbangr added a commit that referenced this pull request Apr 29, 2026
…it diff

- codexFixHistory injection in buildJudgePrompt (positive case)
- parseJudgeVerdict with Windows CRLF line endings (\r\n strip)
- REASONING regex Fix #3: 'HARDENING:' mid-sentence in prose does not truncate
- Hygiene gate (Fix #1/#2): git diff detects test file modification in worktree
- Hygiene gate (Fix #1/#2): git diff is empty for source-only worktree changes

Coverage: 14/23 → 19/23 (61% → 83%)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>