Skip to content

feat: scope code-reviewer diff to per-task commit to reduce context window pressure (#67) (#67)#74

Merged
jafreck merged 5 commits intomainfrom
cadre/issue-67
Feb 23, 2026
Merged

feat: scope code-reviewer diff to per-task commit to reduce context window pressure (#67) (#67)#74
jafreck merged 5 commits intomainfrom
cadre/issue-67

Conversation

@jafreck
Copy link
Copy Markdown
Owner

@jafreck jafreck commented Feb 23, 2026

Summary

The code-reviewer agent was previously receiving the full cumulative diff from the base commit to HEAD, meaning by task 13 of a 13-task decomposition the reviewer would see all prior changes. This PR scopes the diff to the current task only by committing code-writer + test-writer output before the review step and using git diff HEAD~1..HEAD. A 200,000-character truncation guard is also applied to prevent extremely large single-task diffs from overflowing the context window.

Closes #67

Changes

  • src/git/commit.ts: Added getTaskDiff() method to CommitManager that returns git diff HEAD~1..HEAD, with a graceful fallback to git show HEAD when no parent commit exists (first commit scenario).
  • src/executors/implementation-phase-executor.ts: Added an intermediate wip: commit after code-writer + test-writer complete, then calls getTaskDiff() (instead of getDiff(baseCommit)) to obtain the task-scoped diff. Added a truncateDiff() helper that caps the diff at 200,000 characters and appends a truncation notice when the limit is exceeded.
  • tests/git-commit.test.ts: Added tests for getTaskDiff() covering the normal case, the first-commit fallback, and empty-diff behavior.
  • tests/implementation-phase-executor.test.ts: Added tests verifying the executor calls getTaskDiff() (not getDiff()), commits before the review step with the expected wip: prefix, and correctly applies truncation logic at and above the 200,000-character threshold.
  • tests/e2e-pipeline.test.ts and tests/issue-orchestrator-zod-retry.test.ts: Updated mocks to expose getTaskDiff so existing integration tests continue to pass.

Implementation Details

The key insight is that at the point the code-reviewer runs, the code-writer and test-writer have already produced their changes but no commit has been made yet for the current task. Using HEAD~1..HEAD without an intermediate commit would return the previous task's diff. The fix introduces an intermediate wip: commit before the review, then calls getTaskDiff() so HEAD~1..HEAD correctly reflects the current task's changes. The final task commit still occurs at the queue.complete step as before, so the overall commit structure gains one additional intermediate commit per task per attempt.

Testing

  • Added 3 unit tests for CommitManager.getTaskDiff() covering normal, fallback, and empty-diff scenarios
  • Added 11 new tests in implementation-phase-executor.test.ts covering task diff usage, intermediate commit ordering, truncation at/above/below the 200,000-character limit, and truncation notice content
  • All existing tests pass (build: ✅, tests: ✅, lint: ✅ per integration report)

Integration Verification

  • Build: pass
  • Tests: pass
  • Lint: pass (no lint step configured; build was clean)

Notes

  • The intermediate wip: commit adds one extra commit to the git history per task attempt. If a task is retried, each attempt produces its own wip: commit. The squash step at the end of the implementation phase will collapse these into the final commit, so the PR diff remains clean.
  • The 200,000-character limit (~50k tokens) is hardcoded. A follow-up could make this configurable via cadre.config.json.
  • Summarization of large diffs (as proposed in the issue) was not implemented; truncation-with-notice was chosen as a simpler, sufficient first step.

Cadre Process Challenges

This section is required for all CADRE-generated PRs (dogfooding data).
Document honestly what was difficult, confusing, or error-prone when CADRE processed this issue.

  • Issue clarity: The issue was well-specified on the high level but the proposed solution had a subtle ordering flaw — the suggestion to use git diff HEAD~1..HEAD without noting that the current task's changes are uncommitted at review time required the implementation plan to reason carefully about timing. This took significant space in the task description to work through.
  • Agent contracts: No major schema mismatches observed. Mock updates in existing test files (e2e-pipeline.test.ts, issue-orchestrator-zod-retry.test.ts) were needed to expose getTaskDiff — these were not surfaced by the implementation plan and were caught only at test time.
  • Context limitations: The implementation plan's task-002 description is unusually long because it had to reason through the commit-ordering problem in prose; ideally the planner would encode this as structured constraints rather than free text in the description field.
  • Git/worktree: No worktree or branch problems encountered during this run.
  • Parsing/output: Agent outputs parsed correctly; no schema mismatches in this run.
  • Retry behavior: No retries were required.
  • Overall: The biggest friction point was the commit-ordering subtlety — the issue proposed HEAD~1..HEAD without accounting for the uncommitted state of the worktree when the reviewer runs. The implementation-planner had to reason through this and encode the intermediate-commit approach, which added complexity to task-002's description and to the test surface.

Closes #67

@jafreck jafreck merged commit 72bd5a7 into main Feb 23, 2026
2 checks passed
@jafreck jafreck added the cadre-generated Pull request automatically generated by cadre label Feb 24, 2026
@jafreck jafreck deleted the cadre/issue-67 branch February 25, 2026 01:18
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

cadre-generated Pull request automatically generated by cadre

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Scope code-reviewer context to per-task diff to reduce context window pressure

1 participant