Problem
Two related artifact-resolution defects in the consult harness produce verdicts that don't reflect the artifact the architect actually intends to review.
Defect A — workspace-rooted artifact resolution
consult --type {spec,plan,impl} --issue N resolves the artifact (spec/plan/PR diff) from the invoking process's workspace root rather than from the builder's branch. When run from an architect's checkout (typically the repo's integration branch), the consult reviews the version of the artifact that's on the integration branch, not the builder's reworked version on builder/.... Verdicts are therefore about the stale artifact, not the work-in-progress one the architect intends to review.
commands/consult/index.ts: findPlanContent(workspaceRoot, issueId) / findSpecContent / impl-diff resolution all read from workspaceRoot — wherever the architect's invocation is rooted. There is no flag/parameter to point the consult at a specific branch/ref/worktree.
Defect B — main hardcoded as default integration branch (multi-layer, uneven distribution)
A multi-layer defect: the same assumption is fossilised in both the porch CLI implementation and the protocol instruction markdown. Fixing one without the other leaves the symptom intact. Distribution across protocols is uneven — PIR is roughly 4× more exposed at Layer 2 than its siblings.
Layer 1 — porch CLI / consult code (2 sites, protocol-agnostic)
packages/codev/src/commands/consult/index.ts:989 — git merge-base HEAD main (the buildImplQuery fallback).
packages/codev/src/commands/consult/index.ts:1297 — git merge-base main origin/${pr.headRefName} (the impl-review path).
This layer hits every protocol equally — any --type impl consult on a non-main-default repo produces false-positive "scope creep" verdicts: the diff sweeps in every commit that's on the integration branch but not on main, attributing them to the branch under review.
Layer 2 — protocol .md instruction files (per-protocol breakdown)
| Protocol | Hard-fail (literal shell commands) | Wrong-answer (review questions) | Convention | Layer-2 exposure |
|---|---|---|---|---|
| PIR | 5: builder-prompt.md:85, prompts/implement.md:21,110, prompts/review.md:41,250 | 1: consult-types/pr-review.md:25 | 2: implement.md:136, review.md:242 | High |
| BUGFIX | 1: protocol.md:327 | 1: consult-types/pr-review.md:44 | minor | Medium |
| SPIR | 0 | 1: consult-types/pr-review.md:31 | 2: protocol.md:148,243 ("Commit to main...") | Low |
| ASPIR | 0 | 1: consult-types/pr-review.md:31 | (mirrors SPIR pattern) | Low |
| AIR | 0 | 1: consult-types/pr-review.md:37 | 0 | Lowest |
| MAINTAIN | 0 | 1: consult-types/pr-review.md:31 | 0 | Lowest |
Categories:
Hard-fail (literal shell commands agents execute): the prompt instructs the agent to run git diff main, git diff --stat main, or git fetch origin main && git rebase origin/main. On a non-main-default repo these either error (no main branch) or produce diffs against an irrelevant branch.
Wrong-answer (review questions agents must answer): the prompt asks "Is the branch up to date with main?" On a non-main-default repo this is always "no" and the consult typically suggests rebase to main — the wrong action.
Convention (semantically right, literally wrong on the wire): the prompt says "Don't push to main", "Commit to main so builder worktrees include the artifact", "merge to main", etc. The semantic intent is correct ("the integration branch"), but the literal name is fossilised. SPIR/ASPIR's "Commit to main..." instruction in particular misleads an architect on a non-main repo into committing to a branch builders won't see.
Why PIR is more exposed: PIR's prompts (introduced more recently) lean on concrete command examples to guide the agent. SPIR/ASPIR use more abstract phrasing. PIR's design choice happens to fossilise the main assumption more visibly than its older siblings.
codev-skeleton propagation: Identical copies under codev-skeleton/protocols/ ship to every project that adopts codev via codev init / codev adopt, so consumer repos inherit the hardcoded prompts. Fixing codev's protocols upstream doesn't retroactively fix already-adopted projects — they have static checked-in copies.
Layer 3 — test infrastructure
packages/codev/src/__tests__/bugfix-280-consult-diff.test.ts:31 — git init -b main. No existing test exercises a non-main-default repo through either Layer 1 or Layer 2.
Origin
9e95a2ab ("Fix architect auto-restart, consultation loops, and maxIterations escape", 2026-02-12) — pattern introduced conceptually.
92300008 ("[Spec 325][Phase: cli-rewrite] Rewrite consult CLI with flag-based mode routing", 2026-02-16) — current line shape; the rewrite carried the assumption forward without re-examination.
Existing fix pattern (in-tree)
packages/vscode/src/commands/view-diff.ts:262-271 already does the right thing — reads origin/HEAD via git symbolic-ref --short refs/remotes/origin/HEAD, falls back to main only when unset.
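The pattern described above can be sketched as a small helper. This is a minimal illustration, not the code in view-diff.ts: `resolveDefaultBranch` takes its name from this issue's fix surface, while `parseDefaultBranch` and the exact error handling are assumptions made for the example.

```typescript
import { execFileSync } from "node:child_process";

// Pure helper (hypothetical name): turn the output of
//   git symbolic-ref --short refs/remotes/origin/HEAD
// (for example "origin/dev") into a bare branch name, or the fallback when unset.
export function parseDefaultBranch(
  symbolicRefOutput: string | null,
  fallback = "main",
): string {
  const trimmed = (symbolicRefOutput ?? "").trim();
  if (trimmed === "") return fallback;
  // Strip the leading remote name ("origin/") but keep slashes inside the branch name.
  const slash = trimmed.indexOf("/");
  const branch = slash === -1 ? trimmed : trimmed.slice(slash + 1);
  return branch === "" ? fallback : branch;
}

// Sketch of the shared helper: ask git for origin/HEAD, fall back to "main" only when unset.
export function resolveDefaultBranch(workspaceRoot: string): string {
  let output: string | null = null;
  try {
    output = execFileSync(
      "git",
      ["symbolic-ref", "--short", "refs/remotes/origin/HEAD"],
      { cwd: workspaceRoot, encoding: "utf8", stdio: ["ignore", "pipe", "ignore"] },
    );
  } catch {
    // origin/HEAD is unset (or this is not a git repo): use the fallback below.
  }
  return parseDefaultBranch(output);
}
```

Keeping the parsing separate from the git call makes the fallback behaviour testable without a real repository.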
Fix surface (scoped by layer and exposure)
Layer 1 (CLI, protocol-agnostic):
1. Extract resolveDefaultBranch(workspaceRoot) into @cluesmith/codev-core (or similar shared module).
2. Use it at consult/index.ts:989 and :1297.
3. Replace view-diff.ts:262-271's inline implementation with the helper so the three sites stay in sync.
Layer 2 (per-protocol edits, scoped by exposure — PIR is the bulk of the work):
4. PIR-specific pass (8 sites in PIR + skeleton copies): replace literal git diff main / git rebase origin/main commands with branch-agnostic phrasing or a placeholder porch substitutes at send time. Replace conventions ("Don't push to main") with "Don't push to the integration branch". (Shipped on feat/pir-and-vscode-updates-2, commit 01f733d3.)
5. BUGFIX-specific pass (1 hard-fail in protocol.md:327): same treatment.
6. Cross-protocol one-line edit: replace "Is the branch up to date with main?" in PIR/SPIR/ASPIR/AIR/BUGFIX/MAINTAIN consult-types/pr-review.md files with branch-agnostic phrasing (6 identical edits).
7. SPIR/ASPIR convention pass: spir/protocol.md:148,243 (and ASPIR equivalents) — replace "Commit to main..." with "Commit to the integration branch...".
Layer 3 (tests):
8. Add a non-main-default repo test (e.g. git init -b dev workspace) that exercises both Layer 1 (consult CLI) and one or two of the more egregious Layer 2 prompts.
Optional (cleaner, larger scope):
9. porch resolves the default branch once per workspace and substitutes a placeholder into prompts before handing them to agents — avoids relying on agent-side variable expansion in markdown.
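One way the optional substitution step could look, as a rough sketch. The `{{default_branch}}` token and the function name are hypothetical; the issue does not specify a placeholder syntax.

```typescript
// Hypothetical placeholder token; the issue does not specify one.
const DEFAULT_BRANCH_PLACEHOLDER = /\{\{\s*default_branch\s*\}\}/g;

// Replace every placeholder occurrence in a protocol prompt with the resolved branch name.
export function substituteDefaultBranch(prompt: string, defaultBranch: string): string {
  // A replacer function inserts the branch name literally, so "$" sequences are not interpreted.
  return prompt.replace(DEFAULT_BRANCH_PLACEHOLDER, () => defaultBranch);
}
```

With this shape, a prompt line such as `git diff {{default_branch}}` reaches the agent with the branch name already filled in, so nothing depends on the agent expanding variables itself.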
Note on layer interaction: fixing only Layer 1 (the CLI code) leaves the protocol-prompt commands hardcoded; agents will still run git diff main literally even when the consult harness internally resolves the right base. Both layers must move together for consumer-facing behaviour to change. The bulk of Layer 2 work is concentrated in PIR specifically; SPIR/ASPIR/AIR/MAINTAIN need only the cross-protocol one-line edit plus (for SPIR/ASPIR) the convention rewording.
Defect C — two-dot diff semantics produce reverse-included upstream "scope creep"
Independent of Defect B's branch-name issue but observable in the same wild scenarios. A reviewer-level defect rooted in three sites in the same consult CLI codepath.
Mechanism. The impl-review prompt has two failure modes that allow a reviewer (e.g. codex) to compute its own diff with two-dot semantics (git diff <base>..HEAD) instead of three-dot (git diff <base>...HEAD). Two-dot reverse-includes every commit that landed on the base branch since the builder's branch base — those commits appear as file changes in the builder's "scope" even though they aren't in the PR. Three-dot (what GitHub PR diff and gh pr diff --name-only compute) is clean.
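The difference between the two forms can be shown with a toy model that needs no git at all. The snapshots and file names below are invented for illustration; the only claim is the standard git behaviour that `A..B` compares the two tips while `A...B` compares the merge-base to `B`.

```typescript
// A snapshot maps file path -> content at some commit.
type Snapshot = Record<string, string>;

// Files whose content differs between two snapshots (added, removed, or modified).
export function changedFiles(from: Snapshot, to: Snapshot): string[] {
  const paths = Array.from(new Set(Object.keys(from).concat(Object.keys(to))));
  return paths.filter((p) => from[p] !== to[p]).sort();
}

// Toy history: the builder branched at `mergeBase`, then the base branch moved on.
const mergeBase: Snapshot = { "a.ts": "v1", "b.ts": "v1" };
// Upstream modified b.ts and added c.ts after the builder branched.
const baseTip: Snapshot = { "a.ts": "v1", "b.ts": "v2", "c.ts": "v1" };
// The builder touched only a.ts.
const builderHead: Snapshot = { "a.ts": "v2", "b.ts": "v1" };

// Two-dot (git diff base..HEAD) compares the two tips directly.
export const twoDot = changedFiles(baseTip, builderHead);
// Three-dot (git diff base...HEAD) compares the merge-base to HEAD.
export const threeDot = changedFiles(mergeBase, builderHead);
```

In this model the two-dot result lists b.ts and c.ts alongside a.ts, although the builder never touched them, while the three-dot result lists only a.ts.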
Sites (all in packages/codev/src/commands/consult/index.ts):
- Silent fallback when merge-base fails (lines 990-995, 1029-1030):
try {
const ref = diffRef ?? execSync('git merge-base HEAD main', ...).trim();
// …
} catch {
// If git diff fails, reviewer will explore filesystem
}
// … if changedFiles is empty:
query += `\n## Instructions\n\nExplore the filesystem to find and review the implementation changes.\n`;
When the merge-base call fails (e.g., on a non-main-default repo without a local main), changedFiles stays empty and the prompt drops to "explore the filesystem." The reviewer is given no scope anchor and naturally defaults to git diff <integration-branch>..HEAD (two-dot, the more common idiom in agent training data).
- Prompt actively discourages the canonical diff (lines 1026-1028):
**Read the changed files from disk** to review their actual content. …
Do NOT rely on git diffs to determine the current state of code — diffs miss uncommitted changes in worktrees.
Even when consult emits a correct three-dot-equivalent file list, this instruction tells reviewers to look at disk instead. Reviewers reading the worktree see whatever's there — including upstream-rebased files that aren't in their PR.
- Range syntax in getDiffStat (lines 772-773):
execSync(`git diff --stat ${ref}`, …);
When ref is the range string ${mergeBase}..origin/${pr.headRefName} (passed from line 1297), this expands to git diff --stat A..B — two-dot syntax. Output is correct because A is the merge-base (so A..B ≡ A...B), but the code reads as two-dot and survives copy-paste. Cosmetic but worth tightening.
Observed in the wild. An impl-review on a non-main-default repo flagged four files as "scope creep" that aren't in the PR. The reviewer had correctly resolved the integration branch (so Defect B didn't fire), but used two-dot diff against it. Three-dot diff was clean.
Fix surface (all in consult/index.ts):
- Fix the merge-base source (already in Defect B Layer 1's fix surface): use resolveDefaultBranch(workspaceRoot). Eliminates the most common reason merge-base lookup fails.
- Don't swallow merge-base errors silently (line 994 catch block): on failure, fall back to the PR's actual base ref from gh pr view ${pr.number} --json baseRefName, the canonical answer for what GitHub considers the PR's scope. Only as a last resort drop to "explore the filesystem", and even then include a clear directive: "if you compute your own diff, use git diff origin/<base>...HEAD (three-dot)".
- Tighten the scope-anchoring prompt (lines 1020-1028): when changedFiles is non-empty, state explicitly that the listed files are the canonical scope (three-dot diff equivalent to GitHub's PR view); replace the "Do NOT rely on git diffs" line with something that directs reviewers to the list rather than away from it. Suggested wording: "The files above are the canonical scope of this PR (three-dot diff against the PR's base, equivalent to GitHub's PR view). Do not flag files outside this list, even if you see uncommitted changes in the worktree."
- Make the range explicit at emit time (line 1297's buildImplQuery call): use ... (three-dot) instead of .. (two-dot). The two are semantically equivalent given the merge-base on the LHS, but three-dot disambiguates in code review and in any prompt text that surfaces the command.
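The prompt-side part of this fix surface could be sketched as follows. `buildScopeSection` is a hypothetical helper, not an existing function in consult/index.ts; the two strings it emits are the suggested wording and the last-resort directive quoted above, and `baseRef` is assumed to be whatever base the harness resolved (for example the PR's baseRefName).

```typescript
// Hypothetical helper: builds the scope section of the impl-review prompt.
export function buildScopeSection(changedFiles: string[], baseRef: string): string {
  if (changedFiles.length > 0) {
    // Scope is known: anchor the reviewer to the file list instead of steering away from it.
    const list = changedFiles.map((f) => `- ${f}`).join("\n");
    return [
      "## Changed files",
      list,
      "",
      "The files above are the canonical scope of this PR (three-dot diff against the PR's base, equivalent to GitHub's PR view). Do not flag files outside this list, even if you see uncommitted changes in the worktree.",
    ].join("\n");
  }
  // Last resort: no file list, so at least pin the diff form the reviewer should use.
  return [
    "## Instructions",
    "",
    "Explore the filesystem to find and review the implementation changes.",
    `If you compute your own diff, use git diff origin/${baseRef}...HEAD (three-dot).`,
  ].join("\n");
}
```

The point of the empty-list branch is that the reviewer is never left without a scope anchor: even the fallback names the three-dot form explicitly.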
Layer. All sites are in the CLI (commands/consult/index.ts). The protocol .md files have no git diff invocations across any protocol (confirmed via grep). No Layer 2 / Layer 3 surface for Defect C — the protocol prompts are clean. Fix surface is purely CLI, identical shape to Defect B Layer 1; the same PR can carry both.
Observed impact
- Architect-supplied verify workaround broken. When porch's verify is wedged (see sibling issue: porch verify-wedge) and the architect runs consult --type plan --issue N manually to supply the missing review, Defect A causes the consult to read the stale on-integration-branch plan, not the builder's reworked plan. All three reviewers return REQUEST_CHANGES citing findings that the reworked plan has already corrected. Verdicts are useless for the artifact actually under review. Verified by grepping the on-integration-branch plan for the symbols the consult flagged (still present) vs. the rework branch (corrected).
- Phantom "scope creep" verdicts on impl review in non-main-integration repos (Defect B Layer 1, all protocols equally): codex returns REQUEST_CHANGES citing files the PR never touched; every flagged file is present on the integration branch but absent on main. Verified by: gh pr diff --name-only (true scope) ≠ codex's listed files (polluted scope), with the difference being exactly the files changed by the commits between main and the integration branch.
- Hard-fail or misleading prompt commands (Defect B Layer 2, primarily PIR + BUGFIX): agents on non-main-default repos that literally execute the prompt's git diff main either error (no main branch) or produce a diff against an irrelevant branch, polluting their planning/review reasoning. SPIR/ASPIR/AIR/MAINTAIN are less affected at this layer: their Layer 2 manifestation is limited to the "up to date with main?" review question and (for SPIR/ASPIR) the "commit to main" architect instruction.
- Phantom "scope creep" from two-dot reviewer diffs (Defect C, observable independently of Defect B): even when the reviewer resolves the integration branch correctly, two-dot diff against it reverse-includes upstream churn since the branch base, attributing it to the builder. Verified by: the difference between gh pr diff --name-only (true scope) and the reviewer's flagged files matches the files changed by the commits the integration branch added since the builder's branch base.
Together these silently invalidate the consult signal whenever the architect needs it most: when supplying a missing verify, when reviewing PRs against a non-main integration branch, or, even on a main-default repo, when the integration branch has advanced since the builder branched. They also stack with the sibling porch verify-wedge issue to make the independent 3-way review structurally unreachable for reworked SPIR artifacts.