feat: add fact-checking for implementation plan file claims (#68) (#68) #72

Merged
jafreck merged 3 commits into main from cadre/issue-68
Feb 23, 2026


@jafreck commented Feb 23, 2026

Summary

The implementation planner could hallucinate claims about file contents (e.g., "a postbuild script already exists"), causing the code-writer to skip critical changes. This PR addresses the root cause by (1) mandating that the planner agent read every source file before referencing it, and (2) adding a fileExistenceCheck warning to PlanningToImplementationGate so the pipeline surfaces missing file references at plan-validation time rather than silently continuing.

Closes #68

Changes

  • src/agents/templates/implementation-planner.md: Added a mandatory rule requiring the agent to read every source file before making claims about its contents or structure; updated the Tool Permissions entry to mark file reads as required.
  • src/core/phase-gate.ts: Imported access from node:fs/promises and extended PlanningToImplementationGate.validate() to check each file path listed across all plan tasks against context.worktreePath, emitting a warning (not an error) for any path that does not exist on disk.
  • tests/phase-gate.test.ts: Added four new test cases covering the new file-existence warning behaviour: all files present (pass), all files missing (warn), mixed present/missing (warn only for missing), and single missing file scenario.
  • tests/implementation-planner-template.test.ts: Added six new test assertions verifying that the updated template contains a Rules section, the mandatory MUST read every source file instruction, the "before making any claims" language, and that Tool Permissions marks the read as required.
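
The template assertions described above amount to substring checks against the template text. A minimal sketch follows, assuming a hypothetical template excerpt and a hypothetical missingPhrases helper; the real wording lives in src/agents/templates/implementation-planner.md and the real tests may be structured differently.

```typescript
// Hypothetical excerpt for illustration only; the actual template text
// is in src/agents/templates/implementation-planner.md and may differ.
const templateExcerpt: string = [
  "## Rules",
  "- You MUST read every source file before making any claims about its contents or structure.",
  "",
  "## Tool Permissions",
  "- Read files (required)",
].join("\n");

// Phrases the updated template is expected to contain (assumed wording,
// based on the phrases quoted in the change description).
const requiredPhrases: string[] = [
  "## Rules",
  "MUST read every source file",
  "before making any claims",
  "(required)",
];

// Hypothetical helper: returns the phrases that are absent from the template.
// The match is case-sensitive so that the emphasis in "MUST" is enforced.
function missingPhrases(template: string, required: string[]): string[] {
  return required.filter((phrase) => !template.includes(phrase));
}

const absent = missingPhrases(templateExcerpt, requiredPhrases);
if (absent.length > 0) {
  console.warn(`Template is missing: ${absent.join(", ")}`);
}
```

Reporting the list of missing phrases, rather than a single boolean, makes a failing assertion point directly at the instruction that was dropped.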

Implementation Details

The file-existence check resolves each plan path against context.worktreePath and wraps access() from node:fs/promises in a try/catch, a common way to test for a file's presence in Node.js without fetching full stat metadata. Warnings are appended to the existing warnings[] array, and the gate returns 'warn' status only when there are no errors, so the existing pass/fail contract is preserved. No new dependencies were introduced.
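
The check described above can be sketched roughly as follows. This is an illustrative approximation, not the code in src/core/phase-gate.ts: the PlanTask and GateResult shapes, the function names findMissingFiles and validatePlanFiles, and the warning message text are all assumptions.

```typescript
import { access } from "node:fs/promises";
import { resolve } from "node:path";

// Hypothetical shapes; the real types in src/core/phase-gate.ts may differ.
interface PlanTask {
  id: string;
  files: string[];
}

type GateStatus = "pass" | "warn" | "fail";

interface GateResult {
  status: GateStatus;
  errors: string[];
  warnings: string[];
}

// Returns the plan paths that do not exist under worktreePath.
async function findMissingFiles(
  tasks: PlanTask[],
  worktreePath: string,
): Promise<string[]> {
  const missing: string[] = [];
  for (const task of tasks) {
    for (const file of task.files) {
      try {
        // access() with no mode argument only checks that the path is visible.
        await access(resolve(worktreePath, file));
      } catch {
        missing.push(file);
      }
    }
  }
  return missing;
}

// Missing files become warnings, never errors. Any pre-existing error still
// yields 'fail'; 'warn' is returned only when there are no errors.
async function validatePlanFiles(
  tasks: PlanTask[],
  worktreePath: string,
  errors: string[] = [],
): Promise<GateResult> {
  const missing = await findMissingFiles(tasks, worktreePath);
  const warnings = missing.map(
    (file) => `Plan references a file that does not exist: ${file}`,
  );
  const status: GateStatus =
    errors.length > 0 ? "fail" : warnings.length > 0 ? "warn" : "pass";
  return { status, errors, warnings };
}
```

Under these assumptions, a plan that lists one present and one missing file produces a single warning for the missing path and a 'warn' status, which matches the mixed present/missing scenario in the new tests.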

Testing

  • All 4 new PlanningToImplementationGate file-existence tests pass
  • All 6 new template assertion tests pass
  • Full test suite (npx vitest run) passes with exit code 0

Integration Verification

  • Install: pass
  • Build: pass
  • Tests: pass

Notes

  • The "spot-check specific claims" feature mentioned in the issue (e.g., verifying that a file contains a named function) is not implemented in this PR; it is deferred as future work since the issue marked it as optional.
  • The file-existence check also warns on files that the task itself is expected to create, since they do not yet exist at plan-validation time. This is a known false-positive edge case; operators can treat those warnings as informational.

Cadre Process Challenges

This section is required for all CADRE-generated PRs (dogfooding data).

  • Issue clarity: The issue was well-structured with clear acceptance criteria. The main ambiguity was whether "optional spot-checks" were in scope — the analysis agent correctly flagged this and deferred it.
  • Agent contracts: The makeContext helper in tests/phase-gate.test.ts accepted only a single directory argument, and existing tests constructed a context with worktreePath equal to tempDir. The code-writer had to add a second worktreeDir argument to makeContext and update its existing call sites, which meant modifying already-passing tests rather than only adding new ones. This was a mild friction point, since the implementation plan did not explicitly call it out.
  • Context limitations: The implementation plan did not note that existing tests passed tempDir as worktreePath, so the code-writer discovered the need to update callers only after reading the actual test file. More thorough scout output (enumerating existing makeContext call sites) would have prevented this surprise.
  • Git/worktree: No branch, worktree, or commit issues encountered.
  • Parsing/output: Agent outputs parsed correctly; no schema mismatches observed.
  • Retry behavior: No agent retries were required in this run.
  • Overall: The biggest friction point was that the implementation plan described tasks at a file level but did not inspect existing test helper signatures, leading to a necessary but unplanned refactor of makeContext callers. Scout reports that enumerate specific function signatures and call sites in test files would eliminate this category of surprise.

@jafreck merged commit e6dcb48 into main Feb 23, 2026
2 checks passed
@jafreck added the cadre-generated label (Pull request automatically generated by cadre) Feb 24, 2026
@jafreck deleted the cadre/issue-68 branch February 25, 2026 01:18

Linked issue: Add fact-checking for implementation plan claims about file contents