Skip to content

[Bugfix #872] Fix: emit canonical pr_ready_for_human signal across all protocols#874

Merged
waleedkadous merged 10 commits into
mainfrom
builder/bugfix-872
May 26, 2026
Merged

[Bugfix #872] Fix: emit canonical pr_ready_for_human signal across all protocols#874
waleedkadous merged 10 commits into
mainfrom
builder/bugfix-872

Conversation

@waleedkadous
Copy link
Copy Markdown
Contributor

Summary

Fixes #872.

PR #844 (v3.1.3) gated Needs Attention on b.blocked === 'PR review', a protocol-specific derivation that worked for AIR/SPIR/ASPIR/PIR but silently dropped BUGFIX PRs (BUGFIX has no pr gate — it transitions straight to phase: verified after CMAP). This change makes the signal canonical: porch writes pr_ready_for_human: boolean to status.yaml when transitioning out of the CMAP-emitting state for the PR-creating phase, for ALL five bundled protocols.

Root Cause

"CMAP complete for the PR-creating phase" had one definition in porch's state machine but two surface manifestations depending on whether the protocol had a pr gate. Consumers had to know the protocol-specific shape, and the only way to discover you got it wrong was to ship a regression — which #844 did.

Fix

State machine (3 touchpoints, one canonical write each):

  • next.ts handleVerifyApproved — SPIR/ASPIR/PIR review (build_verify, gate=pr): set true at gate-pending
  • index.ts done() — AIR pr (once-phase, gate=pr): set true when auto-requesting the pr gate
  • index.ts advanceProtocolPhase — BUGFIX pr (once-phase, no gate, terminal): set true on transition to verified

PR-creating phase identification: new isPrCreatingPhase(protocol, phaseId) helper returns true when the phase has gate === 'pr' OR a consultation block in protocol.json (preserved on ProtocolPhase.hasConsultation). Both markers are picked up retroactively — no protocol.json edits required.

Reset to false on pr-gate approval (covers all four gate-protocols) and on rollback. BUGFIX has no gate, so the field stays true until cleanup deletes the worktree — matching the model that a BUGFIX builder is done after PR creation.

Consumer (overview.ts): new derivePrReady reads the explicit field if present, else falls back to blocked === 'PR review' || (phase === 'verified' && protocol === 'bugfix'). The BUGFIX-specific fallback closes the v3.1.3 gap for in-flight builders whose status.yaml pre-dates this field. New prReady field on BuilderOverview exposes the signal.

Dashboard (NeedsAttentionList.tsx): gates on b.prReady instead of b.blocked === 'PR review'. Falls back to pr.createdAt for prReady builders without a gate (BUGFIX shape — no blockedSince).

Test Plan

  • 8 new tests in pr-ready-872.test.ts covering the field lifecycle for SPIR/ASPIR/PIR (build_verify review), AIR (once-phase pr with gate), BUGFIX (once-phase pr without gate — the regression case), the REQUEST_CHANGES rebuttal cycle, pr-gate approval reset, and rollback reset.
  • 6 new tests in overview.test.ts for derivePrReady: explicit-true, explicit-false (trumps fallback), legacy SPIR/AIR fallback, BUGFIX gap closure, and non-BUGFIX verified no-fire.
  • 2 new tests in NeedsAttentionList.test.tsx for BUGFIX visibility (the regression) and blockedSincecreatedAt fallback for gateless prReady builders.
  • All 3152 porch+codev tests pass (148 files); all 17 dashboard component tests pass.
  • Build clean (pnpm build).
  • Backward compat: in-flight projects on v3.1.x continue working — the OverviewBuilder.prReady fallback derives the same signal until porch writes the explicit field.

Files

  • Field + helpers: packages/codev/src/commands/porch/{types,protocol,next,index}.ts
  • Consumer: packages/codev/src/agent-farm/servers/overview.ts, packages/types/src/api.ts, packages/dashboard/src/components/NeedsAttentionList.tsx
  • Tests: packages/codev/src/commands/porch/__tests__/pr-ready-872.test.ts (new), packages/codev/src/agent-farm/__tests__/overview.test.ts, packages/dashboard/__tests__/NeedsAttentionList.test.tsx
  • Two existing test fixtures gained prReady: false to satisfy the new shared type (BuilderCard.test.tsx, spec-823-builder-attribution.test.ts)

Notes

  • One pre-existing test failure in packages/dashboard/__tests__/scrollController.test.ts (terminal scroll behavior) — completely unrelated to this work and fails identically on this branch in isolation.
  • Net diff: ~290 LOC added across implementation + tests; well under the 300 LOC BUGFIX threshold for the implementation alone.

…l protocols

PR #844 gated the dashboard's Needs Attention list on `b.blocked === 'PR review'`,
which silently dropped BUGFIX PRs because BUGFIX has no `pr` gate (it transitions
straight to phase `verified` after CMAP). The derivation worked for AIR/SPIR/
ASPIR/PIR but never for BUGFIX, and the only way to find out was to ship the
regression — which we did in v3.1.3.

This change adds an explicit `pr_ready_for_human: boolean` field to status.yaml
that porch sets when transitioning out of the CMAP-emitting state for the
PR-creating phase, for ALL bundled protocols:

  - SPIR/ASPIR/PIR review (build_verify, gate=pr) → next.ts handleVerifyApproved
  - AIR pr (once-phase, gate=pr)                  → index.ts done() auto-request
  - BUGFIX pr (once-phase, no gate, terminal)     → index.ts advanceProtocolPhase

The PR-creating phase is identified by either `gate === 'pr'` or a
`consultation` block in protocol.json (preserved on ProtocolPhase via
`hasConsultation`). The new `isPrCreatingPhase` helper centralizes this so
adding a new protocol with a CMAP-emitting PR phase just means landing either
marker.

The field is reset to false when the human acts: `pr` gate approval (covers
all four protocols with a gate) and rollback. BUGFIX has no gate so the field
stays true until cleanup deletes the worktree — which matches the cohort's
mental model (a BUGFIX builder is done after the PR is created).

The consumer side (overview.ts) reads the field if present, else falls back to
the v3.1.3 derivation plus a BUGFIX-specific case (`phase === 'verified' &&
protocol === 'bugfix'`) so in-flight BUGFIX builders from before this change
become visible immediately. The dashboard NeedsAttentionList now gates on the
canonical `b.prReady` boolean instead of the protocol-specific
`b.blocked === 'PR review'` check.

Fixes #872
…ocols

Adds pr-ready-872.test.ts with 8 regression tests covering the state-machine
touchpoints for SPIR/ASPIR/PIR (build_verify review), AIR (once-phase pr with
gate), and BUGFIX (once-phase pr without gate — the v3.1.3 regression).
Covers the REQUEST_CHANGES path (field stays false during rebuttal cycle) and
the reset transitions (pr-gate approval + rollback).

Extends overview.test.ts with 6 tests for derivePrReady: explicit field wins
over derivation, fallback fires for legacy state files (SPIR-shape and the
BUGFIX gap the canonical signal closes), and post-merge SPIR `verified` does
NOT fire the BUGFIX-specific fallback case.

Updates the existing NeedsAttentionList tests to use the new `prReady` API
and adds two new tests: BUGFIX PR visibility (the v3.1.3 regression) and
blockedSince→createdAt fallback for prReady builders without a gate.
@waleedkadous
Copy link
Copy Markdown
Contributor Author

CMAP Review (3-way, iter 1)

All three reviewers APPROVE with HIGH confidence.

Model Verdict Summary
Gemini APPROVE The PR perfectly implements the canonical pr_ready_for_human signal across all protocols, closing the Needs Attention gap for BUGFIX PRs, backed by thorough tests.
Codex APPROVE The fix is focused, matches issue #872, and adds solid regression coverage for BUGFIX plus the cross-protocol state transitions.
Claude APPROVE Clean, well-tested canonical signal that fixes BUGFIX PR visibility regression with solid backward compatibility.

KEY_ISSUES from all three: none.

Claude flagged one non-blocking observation: the condition gate === 'pr' || isPrCreatingPhase(...) is technically redundant when the first half matches, but is kept for clarity ("this is a PR gate OR this is a PR-creating phase"). Not worth changing.

Review files committed in codev/projects/bugfix-872-porch-emit-explicit-pr-ready-f/.

@waleedkadous
Copy link
Copy Markdown
Contributor Author

Architect integration review

Ran a 3-way architect-side CMAP per the dual-CMAP discipline. The builder's own CMAP was unanimous APPROVE but the architect-side caught two real bugs both other reviewers missed individually. Posting findings here so the PR record reflects them.

Verdicts: Gemini REQUEST_CHANGES (HIGH) · Codex REQUEST_CHANGES (HIGH) · Claude APPROVE (HIGH).

The state-machine three-tier design (status.yaml write → derivePrReady normalize → consumer gates on boolean) is well-architected and the right shape. Two specific defects need iter-2.

Required fix 1: isPrCreatingPhase overmatches RESEARCH (Codex)

isPrCreatingPhase() returns true for any phase with a consultation block. But codev/protocols/research/protocol.json has investigate and critique phases that carry consultation blocks and are not PR-creating. Consequences:

  • A research builder gets marked pr_ready_for_human: true even though no PR is involved
  • approve() only clears the flag for gate pr, so a non-PR consultation phase that advances via a different gate leaves the flag stuck true
  • The "canonical" signal is now inaccurate outside the five target protocols

Fix direction: narrow the classifier. consultation alone isn't enough. Options:

  • Require consultation.type === 'pr' (or consultation.on === 'review' matching the PR-creating shape) in addition to having consultation
  • Add an explicit is_pr_phase: true marker to the protocol.json schema for PR-creating phases and check that
  • Whichever you pick, add a regression test using RESEARCH (or a contrived non-PR consultation phase) so this can't regress silently

Required fix 2: NeedsAttentionList builder fallback drops BUGFIX builders (Gemini)

The builder loop (// Builders blocked on gate approvals) does if (!b.blocked || !b.blockedSince) continue before checking anything else. BUGFIX builders always have b.blocked === null — so if a BUGFIX PR is missing from the prs array (cache delay, pagination, transient API failure), the builder hits the continue and disappears entirely. That defeats the exact defensive case this PR is supposed to fix.

The test still surfaces a prReady builder when its PR is missing from prs mocks the BUGFIX builder with blocked: 'PR review' — that's unrealistic. Real BUGFIX builders have blocked: null and blockedSince: null. The test passes the unrealistic case but the realistic one breaks silently.

Fix direction: in the builder loop, check b.prReady before the !b.blocked skip. Emit a prReady-driven attention row when present; otherwise fall through to the gate-blocked logic. Update the BUGFIX test fixture to use realistic shape (blocked: null, blockedSince: null).

Non-blocking notes

  • Claude: VSCode builder tree (packages/vscode/src/views/builders.ts) could use prReady to boost BUGFIX builders into the priority section. Not in this PR's scope; flag for a small follow-up.
  • Claude: the derivePrReady fallback becomes dead code once all in-flight builders have cycled. Worth a v3.2+ cleanup pass.
  • Claude: minor redundancy in gate === 'pr' || isPrCreatingPhase(...) since isPrCreatingPhase already checks gate === 'pr'. Readable as-is, leave it.

Please push iter-2 with the two fixes above and the porch pr gate will re-trigger CMAP. Don't merge yet; same explicit-human-approval pattern as before.

…allback

Architect-side CMAP on iter-1 caught two real defects that the builder-side
CMAP missed individually (classic dual-CMAP-catches-more pattern). Both are
fixed here.

**Fix 1: isPrCreatingPhase overmatched RESEARCH** (Codex finding).
The iter-1 classifier triggered on any phase carrying a consultation block,
but RESEARCH's investigate / critique phases also carry consultation blocks
for non-PR purposes (type: "investigation" / "critique"). Consequences: a
research builder would have been marked pr_ready_for_human: true even with no
PR involved, and approve() only clears the flag for gate=pr, so a non-PR
consultation phase advancing via a different gate would leak the flag.

Narrowed the marker to `consultation.on === 'review'`, which BUGFIX and AIR
both carry today and RESEARCH does not. Renamed ProtocolPhase.hasConsultation
to hasPrConsultation to reflect the narrower semantics.

**Fix 2: NeedsAttentionList dropped BUGFIX builders on cache miss** (Gemini
finding). The builder loop early-exited on `!b.blocked || !b.blockedSince`
before checking prReady, and BUGFIX builders always have both as null (no
`pr` gate). With a realistic BUGFIX shape, if the PR was missing from the
cached prs array the builder would silently disappear — the exact defensive
case this PR is supposed to cover.

Restructured the loop so prReady is checked BEFORE the gate-blocked early-out.
prReady builders are emitted with kind='PR review' and a waitingSince fallback
chain (blockedSince → startedAt → now) since gateless protocols don't supply
blockedSince.
…stic BUGFIX shape

Adds 6 isPrCreatingPhase classifier tests covering the two markers
(gate==='pr' / consultation.on==='review') across AIR, BUGFIX, SPIR-shape,
RESEARCH investigate (must be false), RESEARCH critique (must be false), and
non-PR phases. Without the iter-2 narrowing fix the two RESEARCH cases would
fail — pins the regression so it can't return silently.

Updates the iter-1 'still surfaces builder when PR missing from prs' test
into two cleanly separated cases: a realistic BUGFIX shape (blocked=null,
blockedSince=null) that would have failed against the iter-1 NeedsAttentionList
loop, and the AIR-style gated variant that preserves the original coverage.
The BUGFIX assertion also pins the waitingSince fallback chain
(blockedSince → startedAt) so silent fallback regressions surface.
@waleedkadous
Copy link
Copy Markdown
Contributor Author

CMAP Review (3-way, iter 2)

Re-ran 3-way CMAP after addressing both architect findings. All three reviewers APPROVE with HIGH confidence.

Model Verdict Summary
Gemini APPROVE Both issues requested by the Architect have been fully addressed (RESEARCH protocol overmatch and NeedsAttentionList BUGFIX drop) with complete test coverage.
Codex APPROVE Iter-2 fixes the two real architect-raised defects, keeps the bugfix focused, and adds strong regression coverage for both the porch state machine and dashboard fallback paths.
Claude APPROVE Clean, well-tested canonical signal; both architect iter-2 bugs addressed and verified against bundled protocol.json files.

KEY_ISSUES from all three: none.

What changed in iter-2

  1. isPrCreatingPhase narrowed: classifier no longer matches bare consultation presence. Now requires consultation.on === 'review' (the marker BUGFIX/AIR pr phases already carry; RESEARCH investigate/critique do not). Renamed ProtocolPhase.hasConsultationhasPrConsultation. 6 new classifier tests including 2 that pin RESEARCH investigate/critique to false — would fail under iter-1 logic.

  2. NeedsAttentionList builder loop restructured: prReady check now comes BEFORE the !b.blocked early-out so BUGFIX builders (where blocked is always null) survive the cache-miss defense path. Added waitingSince fallback chain (blockedSince → startedAt → now) for gateless protocols. The iter-1 missing-PR test split into a realistic BUGFIX case (blocked: null, blockedSince: null) — which would fail under iter-1 logic — and a preserved AIR-style gated variant.

@waleedkadous
Copy link
Copy Markdown
Contributor Author

Approved. Both fixes are surgical and well-commented — the inline note in isPrCreatingPhase's JSDoc explaining why bare consultation presence isn't enough is exactly the kind of comment that prevents this regression from recurring. The renaming hasConsultationhasPrConsultation is the right call too — the field name now matches what it actually means.

Please merge with --admin bypass.

@waleedkadous waleedkadous merged commit d854849 into main May 26, 2026
6 checks passed
@waleedkadous waleedkadous deleted the builder/bugfix-872 branch May 26, 2026 23:17
waleedkadous added a commit that referenced this pull request May 27, 2026
v3.1.4 PR #874 wired BUGFIX's pr_ready_for_human=true to the
advanceProtocolPhase terminal-exit setter because BUGFIX had no
`pr` gate. That set-point only fires when the builder calls
`porch done` post-merge — by which point the human has already
acted. BUGFIX PRs sat at `phase: pr, pr_ready_for_human: false`
indefinitely, never surfacing in Needs Attention.

This adds "gate": "pr" to BUGFIX's pr phase (same shape as AIR),
so the canonical signal fires via the gate-request setter at the
right moment — when CMAP completes and the PR is waiting for a
reviewer. The BUGFIX-no-gate branches in advanceProtocolPhase,
handleVerifyApproved, and isPrCreatingPhase are now redundant and
removed; hasPrConsultation field dropped from ProtocolPhase. The
derivePrReady BUGFIX fallback in overview.ts is intentionally left
in place as graceful-degradation for in-flight v3.1.4 projects.

Fixes #887
waleedkadous added a commit that referenced this pull request May 28, 2026
Cross-reference the defensive prReady builder-fallback against the
workspace's recently-merged PR set. After a PR merges, pendingPRs (open
PRs only) correctly omits it, but in-flight builders that crossed the
v3.1.4 → v3.1.5 line-453 setter boundary keep stale pr_ready_for_human
in status.yaml. The iter-2 cache-miss defense (PR #874) couldn't
distinguish "PR missing because cache miss" from "PR missing because
merged" and surfaced both — emitting a stale "PR review" row for work
already shipped.

Adds OverviewData.recentlyMergedIssueIds (derived from the already-
fetched mergedPRs list via parseLinkedIssue) and plumbs it into
NeedsAttentionList.buildItems. The defensive emit now skips when the
builder's issueId is in the merged set.

Fixes #901
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

porch: emit explicit pr_ready_for_human state field when CMAP completes

1 participant