Issue 3 — PIR protocol: plan-review + code-review gates for issue-driven work
Problem
Codev's existing protocols cover two extremes for GitHub-issue–driven work:
- BUGFIX / AIR — autonomous by design. Builder spawns, works alone, surfaces a PR. No human gates between investigation and PR. Review happens ON the PR.
- SPIR / ASPIR — heavyweight, spec + plan file ceremony. Produces
codev/specs/<N>-*.md and codev/plans/<N>-*.md files, with multi-agent review passes. Appropriate for new features from scratch, architectural changes, or work that deserves formal design artifacts.
There is no middle ground for work where the engineer's assessment is that either the approach or the implementation — or both — genuinely needs human checking before a PR exists. The need is driven by the nature of the change, not its size. PIR fills this gap.
When the existing options fail
Autonomous protocols fail when the reviewer must see the approach before the builder commits to it:
- Root cause is ambiguous; multiple valid fixes exist and the reviewer wants to pick
- Change touches unfamiliar or high-blast-radius code (shared utilities, auth, migrations, public APIs)
- Design-sensitive work where picking the wrong abstraction early is expensive to undo at PR time
Autonomous protocols mean this review only happens after the builder has already implemented a full fix. Redirecting at PR time means throwing away work.
PR-time review fails when the code cannot be adequately verified by reading a diff. The classic case is mobile apps: a 10-line UI change affects Android, iOS, and potentially web. The PR diff shows you the lines changed, but it doesn't show you:
- How the change looks on each platform
- How it interacts with platform-specific behavior (keyboard, permissions, push notifications, device orientation)
- Whether it preserves existing behavior across device sizes / OS versions
- Cross-platform consistency
The same applies to:
- UI/UX changes in any app — visual correctness, interaction flow, accessibility behavior require running the app, not just reading code
- Hardware-adjacent changes — camera, microphone, sensors, Bluetooth, location
- Integration with external services that don't mock cleanly (OAuth flows, payment SDKs, analytics)
- User-journey changes that require exercising the full flow (multi-step forms, checkout, onboarding)
- Performance-sensitive changes that need profiling on the running app
- Accessibility changes (screen readers, keyboard navigation)
For any of these, discovering a regression at PR time is already too late — the reviewer either merges and hopes, or rejects after the builder has wasted effort.
What's needed
A protocol that lets the engineer, based on their own judgment about the work's nature, opt into:
- Plan review before implementation starts — reviewer sees the proposed approach, can redirect cheaply before code is written
- Code review AND testing before PR creation — reviewer can run the builder's branch locally on their own devices / browsers / hardware to verify behavior
Either alone is useful. Together they cover the full spectrum of cases where reading the PR diff at merge time is insufficient.
This protocol should NOT produce standalone spec/plan markdown files (those belong to SPIR). It should use GitHub issue comments as its artifacts so they remain part of the issue's discussion thread.
Proposed solution — new pir protocol
A new protocol pir (Plan / Implement / Review) that slots between AIR and SPIR in the existing ladder:
| Protocol |
Phases |
Gates |
Use case |
| AIR |
implement → review |
review (PR) |
Autonomous — review happens on the PR |
| BUGFIX |
investigate → fix → pr |
review (PR) |
Autonomous — review happens on the PR |
| PIR |
plan → implement → review |
plan-approval + code-review + PR |
Engineer's judgment: work needs approach review before coding, and/or needs testing before PR (e.g., mobile / cross-platform / UI changes) |
| ASPIR |
specify → plan → implement → review |
review (PR) only |
New features, autonomous from spec |
| SPIR |
specify → plan → implement → review |
spec-approval + plan-approval + review |
New features, full supervision with file artifacts |
PIR is structurally SPIR minus the specify phase: the GitHub issue is the implicit spec. The builder reads the issue, investigates the codebase, posts a structured plan as a GitHub issue comment, waits for approval, implements, posts a structured code-review summary as a second comment, waits for approval again, then creates the PR.
PIR is chosen by the engineer based on the nature of the work, not its size. A 10-line mobile UI tweak might warrant PIR. A 500-line internal refactor might not.
Phases
plan → implement → review
- plan: builder reads the issue, investigates the codebase, writes a structured plan, posts it as a comment on the GitHub issue, waits at the
plan-approval gate. Builder stays interactive (live Claude session in a PTY) — reviewer can type feedback directly, send via afx send, or comment on the issue; builder revises and re-posts plan.
- implement: after plan approval, builder writes code, runs tests locally in the worktree, posts a structured code-review summary as a second comment on the issue, waits at the
code-review gate. Same interactive feedback model.
- review: after code-review approval, builder pushes branch, creates PR with
Fixes #<N>, runs CMAP (3-way Gemini + Codex + Claude), notifies architect. Architect reviews PR, instructs merge. Builder merges with gh pr merge <M> --merge.
Gates
Two human-approved gates, plus the implicit PR merge gate inherited from BUGFIX pattern:
plan-approval on the plan phase — reused from SPIR's existing gate machinery. Porch treats gate names as opaque strings; reuse is safe (gates are keyed by (project_id, gate_name)).
code-review on the implement phase — new gate name.
Artifacts (posted as GitHub issue comments, NOT files)
Plan comment (at plan-approval):
## Proposed Plan — Issue #<N>
**Understanding**: <what the issue asks for; for a bug, the root cause>
**Proposed change**: <what the builder will do>
**Files to change**: <paths>
**Risks / alternatives considered**: <trade-offs; considered-and-rejected alternatives>
**Test plan**: <how to verify — for the reviewer to use at the code-review gate>
Code-review comment (at code-review):
## Code Review — Issue #<N>
**Summary**: <2–3 sentence overview of what was implemented>
**Files changed**: <from `git diff --stat main`, with +/- counts>
**Commits**: <from `git log main..HEAD --oneline`>
**Test results**: <what was run; pass/fail>
**Things to look at**: <tricky spots the reviewer should focus on>
**To review**: use the "View Diff" context menu in VSCode's Codev sidebar, or `git -C .builders/pir-<id> diff main`.
**To run locally**: use the "Run Dev Server" context menu, or `afx dev <builder-id>`.
No codev/plans/<N>-*.md or codev/reviews/<N>-*.md files. The GitHub issue is the source of truth; the issue comments are the durable artifacts.
Builder banners at each gate
The builder is a long-running interactive Claude session in a PTY pane managed by Tower. At each gate, the builder's prompt instructs it to stay alive with a banner:
At plan-approval:
═══════════════════════════════════════════════════════════
✋ Plan posted to issue #<N>. Awaiting review.
Review paths:
• Terminal: porch approve <id> plan-approval --a-human-explicitly-approved-this
(no explicit reject — use `afx send <id> "feedback"` to request changes)
• VSCode: Codev sidebar → Needs Attention → "#<N>" → Approve Gate (Cmd+K G)
or Send Message (Cmd+K D) to request changes
• Read plan: gh issue view <N> --comments
Or type feedback here / on the issue — I'll revise and re-post.
═══════════════════════════════════════════════════════════
At code-review:
═══════════════════════════════════════════════════════════
✋ Implementation complete. Awaiting code review.
Summary posted to issue #<N>.
Review paths (no manual cd needed):
• View diff: git -C .builders/pir-<id> diff main | VSCode: "View Diff"
• Run the app: afx dev <id> | VSCode: "Run Dev Server"
• Read summary: gh issue view <N> --comments
• Approve: porch approve <id> code-review --a-human-explicitly-approved-this
| VSCode: "Approve Gate" (Cmd+K G)
• Request changes: afx send <id> "<feedback>" | VSCode: "Send Message" (Cmd+K D)
(builder revises; gate stays pending until approved)
Or type feedback here / on the issue — I'll iterate.
═══════════════════════════════════════════════════════════
Rejection mechanism (no formal reject command needed)
Porch has no reject command — it only has approve. Rejection in SPIR works implicitly: the human provides feedback (via file edits, afx send, or issue comments), the builder iterates its artifact, the human approves when satisfied. PIR uses the same pattern:
- Reviewer wants changes: runs
afx send <builder-id> "<feedback>" OR posts a comment on the GitHub issue OR types directly in the builder's PTY pane
- Builder reads the feedback on its next iteration (via the prompt instructing it to re-check for messages and issue comments), revises its artifact (plan comment or code-review summary), re-posts the comment
- Gate stays pending until the reviewer runs
porch approve <id> <gate>
This matches how SPIR's spec-approval and plan-approval gates already work. No new porch command needed.
Serial review model (one dev env at a time)
The code-review gate assumes runnable worktrees (Issue 1) so the reviewer can test the change locally via afx dev <builder-id> or VSCode's "Run Dev Server" (Issue 2). Per the design rationale in Issue 1, only one dev env runs at a time — the worktree reuses main's ports and URLs. This preserves CORS, OAuth callbacks, cookie scoping, CSP, and webhooks, which would otherwise break if the worktree's dev ran on rewritten ports.
Parallel review (multiple dev envs side-by-side) requires deploy previews (Vercel/Netlify/Render) — out of Codev scope.
Porch integration — concrete details
PIR uses porch's existing gate machinery. This section itemizes what works as-is, what the SPIR spec phase must verify before coding, and what porch-side changes (if any) may be required.
What works without porch engine changes
- Gate declaration in
protocol.json — "gate": "<name>" syntax on a phase, already supported. Copy SPIR's pattern at codev/protocols/spir/protocol.json.
- Gate approval command —
porch approve <id> <gate> --a-human-explicitly-approved-this. Unchanged; treats gate name as an opaque string.
- Gate-pending notification to architect — porch already notifies on gate reach (per
codev/protocols/spir/protocol.md:49). PIR inherits this automatically.
- Gate rollback —
porch rollback <id> <phase> exists as an escape hatch if a gate needs to be undone entirely (rare).
- Protocol loading —
loadProtocol('pir') at packages/codev/src/commands/porch/protocol.ts:22 auto-picks up the new codev/protocols/pir/ directory via directory-based discovery.
Open questions the spec phase MUST investigate and resolve
Each of these needs a concrete answer (with code verification) before the plan phase can design the prompts:
Q1. Is code-review as a new gate name accepted without any whitelist / schema change?
- Action for spec phase:
grep -r "plan-approval\|spec-approval" packages/codev/src/commands/porch/ and check whether gate names are referenced beyond protocol.json files (enums, validation schemas, type unions, docs that would need updating)
- If there's a schema whitelist,
code-review must be added to it
- If purely data-driven, no change needed
Q2. How does an interactive builder detect that a gate was approved out-of-band?
- Scenario: builder is sitting at
plan-approval, displaying its banner, responsive to user input. User (or architect) runs porch approve <id> plan-approval from a different terminal. How does the builder's live Claude session notice?
- Candidate mechanisms (spec phase must evaluate):
- Polling: prompt instructs Claude to run
porch status every N seconds — wasteful, but simple
- User-prompted: user types "proceed" in the pane after approving; builder runs
porch status to confirm, then advances
- Signal-based: does
porch approve write to a file the builder's prompt can watch? Check porch state storage at codev/projects/<id>/status.yaml
- Existing pattern reuse: check how SPIR builders detect approval — they may already exit and rely on the restart loop, in which case PIR just inherits that (and the "interactive hold" is an illusion — Claude exits after posting, restarts, sees advanced state, moves on)
- This is the single most important integration question. The entire banner/stay-alive design depends on the answer. Resolve before drafting prompts.
Q3. Does PHASE_COMPLETE with a pending gate correctly mark phase-complete-but-blocked in porch state?
- Scenario: builder emits
<signal>PHASE_COMPLETE</signal> after posting its plan comment. Does porch (a) advance to the next phase immediately, (b) mark phase complete but gate pending, or (c) error because the gate hasn't been approved?
- Action: test against existing SPIR behavior — signal PHASE_COMPLETE on specify phase and observe whether porch advances without the spec-approval gate being run first
- If PHASE_COMPLETE requires gate approval before being emitted, the prompt must not emit until after
porch approve runs — meaning the builder does need to detect approval somehow (see Q2)
- If PHASE_COMPLETE can be emitted while gate is pending (phase "complete but waiting"), the simpler design works
Q4. What does porch status output when a gate is pending?
- Exact string/JSON format matters for:
- Dashboard rendering (existing — verify it handles
code-review gate state)
- VSCode Needs Attention view (existing — same)
- Builder's own introspection if it needs to check gate state
- Action: run
porch init pir <test-id> and porch status <test-id> after sim-emitting PHASE_COMPLETE; verify output format
Q5. Feedback-loop iteration and porch state
- When the reviewer requests changes via
afx send or issue comment, the builder revises its artifact and re-posts. Does the builder emit PHASE_COMPLETE again on each revision? Does porch distinguish "first attempt" from "revised attempt"?
- Action: check if
PHASE_COMPLETE is idempotent or if porch tracks emission count
- If porch doesn't care, the prompt just tells the builder to re-post freely
- If porch tracks, may need to adjust prompt to emit PHASE_COMPLETE only once per artifact cycle
Q6. Resumption behavior after builder crash
- If the builder's Claude session crashes mid-feedback-loop, the Tower restart loop re-launches Claude with the same prompt file. The new Claude instance calls
porch status, sees phase: plan, gate: plan-approval, pending. It must know to:
- Re-read the GitHub issue (including comments posted since it last checked)
- Re-read
afx send messages that arrived while it was down
- Check its own last-posted comment to decide whether to re-post or continue iterating
- None of this is porch's responsibility directly — it's prompt design — but it interacts with porch's state ("phase X, gate pending" must be unambiguous on resumption)
- Action for spec phase: ensure the
plan.md and implement.md prompts include explicit resumption logic
Potential porch-side changes (only if spec-phase investigation reveals them)
List the changes that could be needed, for transparency. The spec phase will determine which (if any) are actually required.
| Potential change |
Trigger condition |
Location |
Register code-review in a gate-name allowlist |
If Q1 reveals a whitelist exists |
TBD via grep |
Add a porch wait-for-approval <id> <gate> command |
If Q2 concludes polling is the only option and we want porch to own the poll loop |
packages/codev/src/commands/porch/ — new command module |
Document PHASE_COMPLETE + gate semantics |
If Q3 behavior isn't obvious from existing docs |
codev/resources/protocol-format.md or equivalent |
Expose gate state in a structured porch status --json format |
If Q4 reveals dashboard/VSCode need parsable output |
packages/codev/src/commands/porch/status.ts |
If none of these are needed, PIR remains a pure data-only addition (protocol.json + prompts + skeleton mirror + spawn routing + docs). If some are needed, they get scoped into this issue (not deferred) — they're prerequisites for PIR to work end-to-end.
Spec-phase deliverable
Before the plan phase begins, the SPIR spec phase for this issue must answer Q1–Q6 with code-verified answers and document them in the spec. The plan phase designs the prompts and tests around those answers.
Files to create
Protocol directory — codev/protocols/pir/
codev/protocols/pir/
├── protocol.json # phases + plan-approval on plan + code-review on implement
├── protocol.md # workflow doc, comparison to BUGFIX/AIR/SPIR
├── builder-prompt.md # top-level builder prompt (mirror of bugfix's, adjusted for PIR phases)
├── prompts/
│ ├── plan.md # read issue, investigate, post plan comment, await feedback in interactive loop
│ ├── implement.md # write code, test locally, post code-review summary, await review in interactive loop
│ └── review.md # create PR, run CMAP, notify architect, merge on instruction
└── consult-types/
├── impl-review.md # CMAP consult type for implementation review
└── pr-review.md # CMAP consult type for PR review
Skeleton mirror — codev-skeleton/protocols/pir/
Byte-identical mirror of the above. packages/codev/skeleton/protocols/pir/ is auto-regenerated by the copy-skeleton build script on next build.
Reference implementations: codev/protocols/bugfix/ for the autonomous cousin (investigate → fix → pr phase structure), codev/protocols/spir/ for the gate-at-plan pattern (see protocol.json for gate field syntax).
Files to modify
-
packages/codev/src/agent-farm/cli.ts — update --protocol option help text to include pir
-
packages/codev/src/agent-farm/commands/spawn.ts — route --protocol pir through the bugfix spawn path. One line in getSpawnMode:
if (options.protocol === 'bugfix' || options.protocol === 'pir') return 'bugfix';
PIR's spawn is issue-driven with a worktree — identical infrastructure to bugfix. Protocol-specific behavior lives entirely in the prompts, not in the spawn code.
-
CLAUDE.md — add pir to the Available Protocols list and a "Use PIR for" entry in the Protocol Selection Guide:
### Use PIR for (engineer-judged — based on the nature of the work, not its size):
Pick PIR when ONE or BOTH of the following apply to a GitHub-issue-driven change:
**1. The approach needs review before coding starts**:
- Root cause is ambiguous; multiple valid fixes exist
- Area is unfamiliar or high-blast-radius
- Design-sensitive (affects conventions, patterns, architecture)
- Cheaper to redirect at plan time than at PR time
**2. The implementation needs to be TESTED before a PR is created**, because the PR diff alone is insufficient:
- Mobile app changes (needs device testing on Android, iOS, possibly web)
- UI/UX changes (visual inspection, interaction flow, accessibility)
- Hardware-adjacent behavior (sensors, camera, permissions, notifications)
- Integration with external services that don't mock cleanly
- User journeys needing full-flow exercise
- Performance-sensitive changes needing profiling on the running app
**PIR uses GitHub Issues as source of truth.** Three phases: Plan (gated by plan-approval) → Implement (gated by code-review) → Review (PR). Artifacts are GitHub issue comments, not files. Lighter than SPIR (no spec/plan files) but more supervised than autonomous BUGFIX/AIR.
-
AGENTS.md — mirror the CLAUDE.md update (AGENTS.md standard compatibility)
-
packages/codev/src/__tests__/skeleton.test.ts — add pir to the expected protocol list if the test enumerates protocols
-
packages/codev/src/commands/porch/__tests__/protocol.test.ts — add a test asserting loadProtocol('pir') returns a protocol with a plan phase carrying plan-approval gate and an implement phase carrying code-review gate
Acceptance criteria
- Protocol loads:
porch init pir 42 --worktree .builders/pir-42-test initializes a project with the PIR phases + gates. Porch state correctly shows plan phase with pending plan-approval gate.
- Spawn routing:
afx spawn <N> --protocol pir spawns an issue-driven builder in .builders/pir-<N>-<slug>/. No error; uses the same spawn path as bugfix.
- Plan phase produces artifact: builder reads the issue, investigates, posts a structured plan comment to the GitHub issue via
gh issue comment. Comment contains all five sections (Understanding, Proposed change, Files to change, Risks, Test plan).
- Plan gate pauses builder: after posting, builder emits
PHASE_COMPLETE; porch marks plan-approval as pending. Builder's Claude session stays interactive, displays the banner. Dashboard + VSCode Needs Attention view show the builder as blocked at plan-approval.
- Plan feedback loop: reviewer runs
afx send <builder-id> "<feedback>" OR posts an issue comment → builder reads, revises plan, re-posts updated comment. Gate stays pending.
- Plan approval advances: reviewer runs
porch approve <id> plan-approval --a-human-explicitly-approved-this (or VSCode "Approve Gate") → builder transitions to implement phase on next poll.
- Implement phase produces code + code-review artifact: builder writes code, runs tests locally in the worktree, commits. Posts a structured code-review comment with Summary, Files changed, Commits, Test results, Things to look at. Emits
PHASE_COMPLETE.
- Code-review gate pauses builder: porch marks
code-review as pending. Builder stays interactive with the second banner. Dashboard + sidebar reflect the state.
- Code-review feedback loop: same mechanism as plan —
afx send, issue comments, or direct terminal input. Builder revises and re-posts updated summary.
- Code-review approval advances: reviewer runs
porch approve <id> code-review → builder transitions to review phase.
- Review phase creates PR: builder pushes branch, runs
gh pr create with Fixes #<N>, runs CMAP 3-way review, updates PR description, notifies architect via afx send architect "PR #<M> ready (PIR #<N>)".
- Merge flow: architect approves PR → instructs builder to merge → builder runs
gh pr merge <M> --merge (non-squash). Issue auto-closes via Fixes #<N>.
- Runnable worktree (requires Issue 1 merged): at the
code-review gate, reviewer runs afx dev <builder-id> and the worktree's dev server starts on main's ports. Reviewer tests in browser.
- VSCode flow (requires Issue 2 merged): reviewer right-clicks the builder at
code-review in the Codev sidebar → View Diff (opens native diff editor) → Run Dev Server (spawns Tower dev PTY, auto-swaps from previous builder's dev if any) → Approve Gate (Cmd+K G) advances.
- Skeleton mirror:
pnpm build regenerates packages/codev/skeleton/protocols/pir/ from codev-skeleton/protocols/pir/.
- Tests: protocol load test passes; skeleton enumeration test includes pir.
Non-goals (explicitly out of scope)
- Changes to BUGFIX, AIR, SPIR, ASPIR behavior — PIR is a pure sibling protocol
- Porch engine changes — reusing existing gate mechanism;
plan-approval gate name is already defined in SPIR, code-review is a new string but no engine code changes are needed
- A
codev/reviews/<N>-pir.md file — PIR doesn't produce a retrospective review file; the issue comments + PR body are the artifacts
- A dedicated
porch reject command — rejection is via the feedback-iterate pattern, matching SPIR
- Claude Code plan mode (
--permission-mode plan) — not used; PIR's approach is prompt-driven, not runtime-enforced
- Harness abstraction changes — PIR's mechanism is protocol-level (prompts + porch state), works for any harness
Dependencies
- Issue 1 merged — PIR's
code-review gate is meaningless without runnable worktrees. Without Issue 1, the reviewer can't test the builder's changes; the gate degenerates to a diff-read, which is no better than what existing protocols offer at PR time.
- Issue 2 merged — Without it, the VSCode review flow requires terminal trips (still workable via
git -C + afx dev, but the sidebar-native UX that motivates PIR's existence isn't delivered).
PIR can technically ship without Issues 1+2, but the end-to-end review UX it promises requires them. Recommend shipping Issues 1 and 2 before this.
Benefits delivered
- A new lightweight middleweight protocol for gated work on GitHub issues — fills the gap between autonomous (BUGFIX/AIR) and ceremonial (SPIR/ASPIR)
- Two human gates without the overhead of markdown artifacts
- Leverages the infrastructure from Issues 1 and 2 for testable + reviewable worktrees
- Opt-in per spawn — existing protocols unchanged
Reference implementations / existing patterns
- BUGFIX protocol structure:
codev/protocols/bugfix/protocol.json, codev/protocols/bugfix/prompts/*.md, codev/protocols/bugfix/builder-prompt.md
- SPIR's gate-at-plan pattern:
codev/protocols/spir/protocol.json — shows "gate": "plan-approval" on the plan phase
- Porch gate mechanism:
packages/codev/src/commands/porch/protocol.ts (loadProtocol), packages/codev/src/commands/porch/state.ts (gate state)
- Porch's notify-architect-on-gate behavior: documented at
codev/protocols/spir/protocol.md:49 — "When a gate is reached, porch notifies the architect and blocks until approved" — PIR inherits this automatically
- Spawn routing:
getSpawnMode in packages/codev/src/agent-farm/commands/spawn.ts
- Skeleton sync:
copy-skeleton script in packages/codev/package.json
Risks and mitigations
- Risk: PIR's builder-stays-interactive-at-gate pattern differs from how BUGFIX builders behave (BUGFIX exits cleanly between phases). Mitigation: the approach and implement prompts explicitly instruct Claude to stay alive after posting the artifact and signaling PHASE_COMPLETE. Tower's existing restart loop is a safety net — even if Claude exits, it'll restart and re-poll porch status.
- Risk: The phrase "plan" in PIR may confuse users familiar with SPIR's
codev/plans/<N>.md file convention. Mitigation: PIR's protocol.md explicitly documents that its plan is a GitHub issue comment, not a file; gate name (plan-approval) is reused but scoped by project id.
- Risk: The code-review gate assumes the user will actually test the change via a runnable worktree. If they don't (or Issue 1 isn't configured with
worktree.devCommand), the gate is just a diff-read. Mitigation: documented explicitly; users who don't want this step should use BUGFIX or AIR.
Issue 3 — PIR protocol: plan-review + code-review gates for issue-driven work
Problem
Codev's existing protocols cover two extremes for GitHub-issue–driven work:
codev/specs/<N>-*.mdandcodev/plans/<N>-*.mdfiles, with multi-agent review passes. Appropriate for new features from scratch, architectural changes, or work that deserves formal design artifacts.There is no middle ground for work where the engineer's assessment is that either the approach or the implementation — or both — genuinely needs human checking before a PR exists. The need is driven by the nature of the change, not its size. PIR fills this gap.
When the existing options fail
Autonomous protocols fail when the reviewer must see the approach before the builder commits to it:
Autonomous protocols mean this review only happens after the builder has already implemented a full fix. Redirecting at PR time means throwing away work.
PR-time review fails when the code cannot be adequately verified by reading a diff. The classic case is mobile apps: a 10-line UI change affects Android, iOS, and potentially web. The PR diff shows you the lines changed, but it doesn't show you:
The same applies to:
For any of these, discovering a regression at PR time is already too late — the reviewer either merges and hopes, or rejects after the builder has wasted effort.
What's needed
A protocol that lets the engineer, based on their own judgment about the work's nature, opt into:
Either alone is useful. Together they cover the full spectrum of cases where reading the PR diff at merge time is insufficient.
This protocol should NOT produce standalone spec/plan markdown files (those belong to SPIR). It should use GitHub issue comments as its artifacts so they remain part of the issue's discussion thread.
Proposed solution — new
pirprotocolA new protocol
pir(Plan / Implement / Review) that slots between AIR and SPIR in the existing ladder:PIR is structurally SPIR minus the specify phase: the GitHub issue is the implicit spec. The builder reads the issue, investigates the codebase, posts a structured plan as a GitHub issue comment, waits for approval, implements, posts a structured code-review summary as a second comment, waits for approval again, then creates the PR.
PIR is chosen by the engineer based on the nature of the work, not its size. A 10-line mobile UI tweak might warrant PIR. A 500-line internal refactor might not.
Phases
plan-approvalgate. Builder stays interactive (live Claude session in a PTY) — reviewer can type feedback directly, send viaafx send, or comment on the issue; builder revises and re-posts plan.code-reviewgate. Same interactive feedback model.Fixes #<N>, runs CMAP (3-way Gemini + Codex + Claude), notifies architect. Architect reviews PR, instructs merge. Builder merges withgh pr merge <M> --merge.Gates
Two human-approved gates, plus the implicit PR merge gate inherited from BUGFIX pattern:
plan-approvalon theplanphase — reused from SPIR's existing gate machinery. Porch treats gate names as opaque strings; reuse is safe (gates are keyed by(project_id, gate_name)).code-reviewon theimplementphase — new gate name.Artifacts (posted as GitHub issue comments, NOT files)
Plan comment (at
plan-approval):Code-review comment (at
code-review):No
codev/plans/<N>-*.mdorcodev/reviews/<N>-*.mdfiles. The GitHub issue is the source of truth; the issue comments are the durable artifacts.Builder banners at each gate
The builder is a long-running interactive Claude session in a PTY pane managed by Tower. At each gate, the builder's prompt instructs it to stay alive with a banner:
At
plan-approval:At
code-review:Rejection mechanism (no formal reject command needed)
Porch has no
rejectcommand — it only hasapprove. Rejection in SPIR works implicitly: the human provides feedback (via file edits,afx send, or issue comments), the builder iterates its artifact, the human approves when satisfied. PIR uses the same pattern:afx send <builder-id> "<feedback>"OR posts a comment on the GitHub issue OR types directly in the builder's PTY paneporch approve <id> <gate>This matches how SPIR's spec-approval and plan-approval gates already work. No new porch command needed.
Serial review model (one dev env at a time)
The
code-reviewgate assumes runnable worktrees (Issue 1) so the reviewer can test the change locally viaafx dev <builder-id>or VSCode's "Run Dev Server" (Issue 2). Per the design rationale in Issue 1, only one dev env runs at a time — the worktree reuses main's ports and URLs. This preserves CORS, OAuth callbacks, cookie scoping, CSP, and webhooks, which would otherwise break if the worktree's dev ran on rewritten ports.Parallel review (multiple dev envs side-by-side) requires deploy previews (Vercel/Netlify/Render) — out of Codev scope.
Porch integration — concrete details
PIR uses porch's existing gate machinery. This section itemizes what works as-is, what the SPIR spec phase must verify before coding, and what porch-side changes (if any) may be required.
What works without porch engine changes
protocol.json—"gate": "<name>"syntax on a phase, already supported. Copy SPIR's pattern atcodev/protocols/spir/protocol.json.porch approve <id> <gate> --a-human-explicitly-approved-this. Unchanged; treats gate name as an opaque string.codev/protocols/spir/protocol.md:49). PIR inherits this automatically.porch rollback <id> <phase>exists as an escape hatch if a gate needs to be undone entirely (rare).loadProtocol('pir')atpackages/codev/src/commands/porch/protocol.ts:22auto-picks up the newcodev/protocols/pir/directory via directory-based discovery.Open questions the spec phase MUST investigate and resolve
Each of these needs a concrete answer (with code verification) before the plan phase can design the prompts:
Q1. Is
code-reviewas a new gate name accepted without any whitelist / schema change?grep -r "plan-approval\|spec-approval" packages/codev/src/commands/porch/and check whether gate names are referenced beyondprotocol.jsonfiles (enums, validation schemas, type unions, docs that would need updating)code-reviewmust be added to itQ2. How does an interactive builder detect that a gate was approved out-of-band?
plan-approval, displaying its banner, responsive to user input. User (or architect) runsporch approve <id> plan-approvalfrom a different terminal. How does the builder's live Claude session notice?porch statusevery N seconds — wasteful, but simpleporch statusto confirm, then advancesporch approvewrite to a file the builder's prompt can watch? Check porch state storage atcodev/projects/<id>/status.yamlQ3. Does
PHASE_COMPLETEwith a pending gate correctly mark phase-complete-but-blocked in porch state?<signal>PHASE_COMPLETE</signal>after posting its plan comment. Does porch (a) advance to the next phase immediately, (b) mark phase complete but gate pending, or (c) error because the gate hasn't been approved?porch approveruns — meaning the builder does need to detect approval somehow (see Q2)Q4. What does
porch statusoutput when a gate is pending?code-reviewgate state)porch init pir <test-id>andporch status <test-id>after sim-emitting PHASE_COMPLETE; verify output formatQ5. Feedback-loop iteration and porch state
afx sendor issue comment, the builder revises its artifact and re-posts. Does the builder emitPHASE_COMPLETEagain on each revision? Does porch distinguish "first attempt" from "revised attempt"?PHASE_COMPLETEis idempotent or if porch tracks emission countQ6. Resumption behavior after builder crash
porch status, seesphase: plan, gate: plan-approval, pending. It must know to:afx sendmessages that arrived while it was downplan.mdandimplement.mdprompts include explicit resumption logicPotential porch-side changes (only if spec-phase investigation reveals them)
List the changes that could be needed, for transparency. The spec phase will determine which (if any) are actually required.
code-reviewin a gate-name allowlistporch wait-for-approval <id> <gate>commandpackages/codev/src/commands/porch/— new command modulePHASE_COMPLETE+ gate semanticscodev/resources/protocol-format.mdor equivalentporch status --jsonformatpackages/codev/src/commands/porch/status.tsIf none of these are needed, PIR remains a pure data-only addition (protocol.json + prompts + skeleton mirror + spawn routing + docs). If some are needed, they get scoped into this issue (not deferred) — they're prerequisites for PIR to work end-to-end.
Spec-phase deliverable
Before the plan phase begins, the SPIR spec phase for this issue must answer Q1–Q6 with code-verified answers and document them in the spec. The plan phase designs the prompts and tests around those answers.
Files to create
Protocol directory —
codev/protocols/pir/Skeleton mirror —
codev-skeleton/protocols/pir/Byte-identical mirror of the above.
packages/codev/skeleton/protocols/pir/is auto-regenerated by thecopy-skeletonbuild script on next build.Reference implementations:
codev/protocols/bugfix/for the autonomous cousin (investigate → fix → pr phase structure),codev/protocols/spir/for the gate-at-plan pattern (seeprotocol.jsonfor gate field syntax).Files to modify
packages/codev/src/agent-farm/cli.ts— update--protocoloption help text to includepirpackages/codev/src/agent-farm/commands/spawn.ts— route--protocol pirthrough the bugfix spawn path. One line ingetSpawnMode:PIR's spawn is issue-driven with a worktree — identical infrastructure to bugfix. Protocol-specific behavior lives entirely in the prompts, not in the spawn code.
CLAUDE.md— addpirto the Available Protocols list and a "Use PIR for" entry in the Protocol Selection Guide:AGENTS.md— mirror the CLAUDE.md update (AGENTS.md standard compatibility)packages/codev/src/__tests__/skeleton.test.ts— addpirto the expected protocol list if the test enumerates protocolspackages/codev/src/commands/porch/__tests__/protocol.test.ts— add a test assertingloadProtocol('pir')returns a protocol with aplanphase carryingplan-approvalgate and animplementphase carryingcode-reviewgateAcceptance criteria
porch init pir 42 --worktree .builders/pir-42-testinitializes a project with the PIR phases + gates. Porch state correctly showsplanphase with pendingplan-approvalgate.afx spawn <N> --protocol pirspawns an issue-driven builder in.builders/pir-<N>-<slug>/. No error; uses the same spawn path as bugfix.gh issue comment. Comment contains all five sections (Understanding, Proposed change, Files to change, Risks, Test plan).PHASE_COMPLETE; porch marksplan-approvalas pending. Builder's Claude session stays interactive, displays the banner. Dashboard + VSCode Needs Attention view show the builder as blocked atplan-approval.afx send <builder-id> "<feedback>"OR posts an issue comment → builder reads, revises plan, re-posts updated comment. Gate stays pending.porch approve <id> plan-approval --a-human-explicitly-approved-this(or VSCode "Approve Gate") → builder transitions toimplementphase on next poll.PHASE_COMPLETE.code-reviewas pending. Builder stays interactive with the second banner. Dashboard + sidebar reflect the state.afx send, issue comments, or direct terminal input. Builder revises and re-posts updated summary.porch approve <id> code-review→ builder transitions toreviewphase.gh pr createwithFixes #<N>, runs CMAP 3-way review, updates PR description, notifies architect viaafx send architect "PR #<M> ready (PIR #<N>)".gh pr merge <M> --merge(non-squash). Issue auto-closes viaFixes #<N>.code-reviewgate, reviewer runsafx dev <builder-id>and the worktree's dev server starts on main's ports. Reviewer tests in browser.code-reviewin the Codev sidebar → View Diff (opens native diff editor) → Run Dev Server (spawns Tower dev PTY, auto-swaps from previous builder's dev if any) → Approve Gate (Cmd+K G) advances.pnpm buildregeneratespackages/codev/skeleton/protocols/pir/fromcodev-skeleton/protocols/pir/.Non-goals (explicitly out of scope)
plan-approvalgate name is already defined in SPIR,code-reviewis a new string but no engine code changes are neededcodev/reviews/<N>-pir.mdfile — PIR doesn't produce a retrospective review file; the issue comments + PR body are the artifactsporch rejectcommand — rejection is via the feedback-iterate pattern, matching SPIR--permission-mode plan) — not used; PIR's approach is prompt-driven, not runtime-enforcedDependencies
code-reviewgate is meaningless without runnable worktrees. Without Issue 1, the reviewer can't test the builder's changes; the gate degenerates to a diff-read, which is no better than what existing protocols offer at PR time.git -C+afx dev, but the sidebar-native UX that motivates PIR's existence isn't delivered).PIR can technically ship without Issues 1+2, but the end-to-end review UX it promises requires them. Recommend shipping Issues 1 and 2 before this.
Benefits delivered
Reference implementations / existing patterns
codev/protocols/bugfix/protocol.json,codev/protocols/bugfix/prompts/*.md,codev/protocols/bugfix/builder-prompt.mdcodev/protocols/spir/protocol.json— shows"gate": "plan-approval"on the plan phasepackages/codev/src/commands/porch/protocol.ts(loadProtocol),packages/codev/src/commands/porch/state.ts(gate state)codev/protocols/spir/protocol.md:49— "When a gate is reached, porch notifies the architect and blocks until approved" — PIR inherits this automaticallygetSpawnModeinpackages/codev/src/agent-farm/commands/spawn.tscopy-skeletonscript inpackages/codev/package.jsonRisks and mitigations
codev/plans/<N>.mdfile convention. Mitigation: PIR's protocol.md explicitly documents that its plan is a GitHub issue comment, not a file; gate name (plan-approval) is reused but scoped by project id.worktree.devCommand), the gate is just a diff-read. Mitigation: documented explicitly; users who don't want this step should use BUGFIX or AIR.