feat(v4): cursor-driven engine refactor + UX hardening by jwaldrip · Pull Request #323 · gigsmart/haiku-method

jwaldrip · 2026-05-07T03:02:32Z

Summary

Replaces the pre-v4 phase/status FSM with a cursor that walks per-unit frontmatter (iterations[] / reviews{} / approvals{}) instead of per-stage state.json. On top of the v4 base, this PR adds 9 numbered priorities (P1–P9) — each scoped to a single class of bug or UX pain surfaced in real session logs from the 2026-05-04 → 2026-05-06 run (588M tokens, 52h wall-clock, retry loops + branch hijacks + missing collaboration moments).

What landed

Engine

v0→v4 soft-scrub migrator (BFS-graph migrate-registry). Strips deprecated FM, stamps plugin_version, synthesizes terminal approvals for v3-completed units, relocates v3 FBs whose upstream_stage pointed elsewhere, preserves replies[] (caught a silent data-loss bug where the migrator's denylist included replies while v4 still reads them), tolerates malformed YAML per-file with structured warnings.
New field-hygiene.ts cruft detector for post-migration audit.
Cursor walks 26 lifecycle scenarios (was 13). Adds: design_direction_required hard gate (studio/stage-conditional), clarify_required gate at elaborate entry sourced from per-stage clarify/*.md, finished discovery_required (cursor.ts:419 was a stub) via readStageArtifactDefs + per-unit fm.discovery.<agent> records, classifier-first dispatch on unclassified user feedback, merge_stage transition, cross-stage FB priority, mid-wave noop, closed-FB invalidation re-route, reject_hat re-entry.
New haiku_feedback_set_targets MCP tool. Classifier hat lands user-FB triage (target_unit, target_invalidates, reasoning) once and only once — immutable post-classification.
Reply-on-closure: terminal haiku_feedback_advance_hat requires a reply string. Surfaces in SPA as a "Resolved" card with dismiss action; filter chip for unread replies.
Resolution dropdown removed from SPA. Agent classifies, not user.

P1–P9 (this session's priorities)

P1: file-backed dispatch DEFAULT for every haiku_run_next tick. Skip set is noop/sealed/error. Cuts 47K-char tool-result blobs.
P2: self-contained dispatch prompts. start_feedback_hat emits one per-FB subagent block with the canonical FB ID inlined. Kills the 6-retry ID-guessing loop.
P3: design-direction hard gate. STAGE.md requires_design_direction: true. pick_design_direction stamps design_directions.<stage> on intent.md. software/design opted in.
P4: clarify gate at elaborate entry. Sources questions from <studio>/stages/<stage>/clarify/*.md. Records clarifications.<stage> on intent.md. Stage-conditional. No autopilot bypass.
P5: research only (channels = recommended for server-push, Monitor + file polling = lean fallback). Citations in research subagent output.
P6: edit-auto-read-hint hook. PostToolUse on Edit/MultiEdit detects "file not read yet" and surfaces actionable Read+retry message.
P7: discovery_required wiring (was stubbed in v4 base).
P8: discovery template uniqueness validation. Studio loader throws on colliding location: fields.
P9: lock-aware branch enforcement. ensureOnStageBranch refuses checkout under a locked worktree. Closes the hijack class — this very worktree got switched mid-session by a stray run_next from a different intent. The recovery walked through git stash pop of WIP on haiku/v4-engine-refactor.

Studios

Classifier hat distributed to all 23 non-software studios (114 stages, 114 fix_hats updated). All software/* stages already had it. Sequence: [classifier, <implementer>, feedback-assessor].
design-stage opted in to requires_design_direction: true.

SPA

Dual-path for v3↔v4 status / phase / completion across 9 files.
New helpers: isIntentTerminal (sealed_at + v3 fallback), deriveUnitStatus (iterations[]-based), resolveWalkthroughForDetail (tab-scoped fallback — fixes the advance/back UX bug Chris flagged).
Schema indicator chip (v4 / v3) on IntentReview.
Migrated breadcrumb banner when intent or unit approvals carry migrated: true.
await_gate UX: 30min → 4h timeout, isError: false on timeout (continuation, not fault), per-session announced_at so retries don't re-post the URL.

Website (browse)

Dual-paths in parseIntentFromRaw, all three providers (local, github, gitlab).
New helpers: deriveStageStatusFromUnits, deriveV4ActiveStage.
Schema indicator chip on IntentDetailView.
20 new derivation/integration tests via npx tsx.

Tests

Engine: 1244/1244 (was ~1130 at session start). 73 → 74 test files.
SPA: 453/453 (was 441).
New test files: cursor-walk (26 scenarios), v0-to-v4-realistic-scenario (11), v0-to-v4-migrator (4), feedback-set-targets (7), haiku-reflect-dual-path (8), field-hygiene (4), skill-tools-smoke (14), announcement-contract (7), await-gate-timeout-classifier (9), await-gate-announce-dedup (4), worktree-lock-guard (4), closure-reply-filter (6), plus SPA-side walkthrough/review-helpers/UnitsTable derivation.
run-all.mjs: silent-test-loss guard + node:test output parsing (caught a hidden-coverage bug where 12 tests were invisible to the aggregator).

Skill audit

Dual-path on haiku_dashboard, haiku_capacity, haiku_reflect. Smoke tests for previously-untested haiku_intent_archive/unarchive, haiku_backlog, haiku_seed, haiku_release_notes, haiku_review, haiku_version_info.
4 skill docs updated (refine, revisit, adopt, repair).
repair: documented split-purpose under v4.

Known gaps (deferred)

No real-world .haiku/intents/ fixture run-through. All migrator + cursor fixtures are synthesized.
No end-to-end "create → run → seal" pipeline test.
Drift sweep cursor-emit path not scenario-tested.
intent_completion_review / intent_completion_fix cursor scenarios stubbed.
P5 channels / Monitor pattern researched, not built.

Test plan

cd packages/haiku && node test/run-all.mjs shows 1244/1244
cd packages/haiku-ui && npx vitest run shows 453/453
cd website && npx tsx lib/browse/__tests__/v4-derivation.test.ts shows 20/20
Type-check: cd packages/haiku && npx tsc --noEmit clean (pre-existing v4-base errors notwithstanding)
Type-check: cd packages/haiku-ui && npx tsc --noEmit clean (modulo routeTree.gen codegen)
Run a real v3 intent through the migrator (point at an existing .haiku/intents/ from a current production repo) and verify the field-hygiene report is empty afterwards
Open the SPA against a v4 intent — verify schema chip shows v4, no v3-only fields render
Open the SPA against a migrated v3 intent — verify migrated-from-v3 banner renders if any approval carries migrated: true
Verify worktree lock guard: lock a worktree, try to haiku_run_next for a different intent from inside it, confirm refusal with worktree_locked block

🤖 Generated with Claude Code

Skip archetype generation when the designer already has finished designs. The picker now opens with no archetypes — the user either uploads files (`mode: "upload"`, written to `stages/<stage>/artifacts/design-direction/uploads/` and surfaced via a new `design_direction_uploaded` action) or signals the agent to generate variants (`mode: "generate"`). Generation only happens after the explicit ask, so we don't burn tokens producing variants the user is going to throw away. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Replaces the pre-v4 phase/status FSM with a cursor that walks per-unit frontmatter (`iterations[]` / `reviews{}` / `approvals{}`) instead of per-stage `state.json`. Adds 9 numbered priorities (P1–P9) on top of the v4 base, each scoped to a single class of bug or UX pain surfaced in real session logs. Engine - v0→v4 soft-scrub migrator (BFS-graph migrate-registry). Strips deprecated FM, stamps `plugin_version`, synthesizes terminal approvals for v3-completed units, relocates v3 FBs whose `upstream_stage` pointed elsewhere, preserves `replies[]` (caught silent data-loss bug), tolerates malformed YAML per-file. - New `field-hygiene.ts` cruft detector for post-migration audit. - Cursor walks 26 lifecycle scenarios (was 13). Adds: • design_direction_required hard gate (studio/stage-conditional) • clarify_required gate at elaborate entry, sourced from per-stage `clarify/*.md` • discovery_required (finished the cursor.ts:419 stub via readStageArtifactDefs + per-unit `fm.discovery.<agent>` records) • classifier-first dispatch on unclassified user feedback • merge_stage transition, cross-stage FB priority, mid-wave noop, closed-FB invalidation re-route, reject_hat re-entry - New `haiku_feedback_set_targets` MCP tool. Classifier hat lands the user-FB triage decision (target_unit, target_invalidates, reasoning) once and only once — immutable post-classification. - Reply-on-closure: terminal `haiku_feedback_advance_hat` requires a `reply` string; surfaces in SPA as a "Resolved" card with dismiss. Filter chip for unread replies in the FB summary bar. - Resolution dropdown removed from SPA. Agent classifies, not user. - File-backed dispatch is now the DEFAULT for every haiku_run_next tick (P1). Skip set is just noop/sealed/error. Cuts the 47K-char tool-result blobs that ate context. - Self-contained dispatch prompts (P2): `start_feedback_hat` emits one per-FB subagent block with the canonical FB ID inlined into every tool call. No more `<FB-NN>` placeholders. Kills the 6-retry ID-guessing loop pattern. - Lock-aware branch enforcement (P9): `ensureOnStageBranch` refuses to checkout under a locked worktree. Closes the hijack class that caused the 2026-05-06 incident where this very worktree got switched to a different intent's branch. - Edit-auto-read-hint hook (P6): PostToolUse hook on Edit/MultiEdit detects "file not read yet" and surfaces an actionable Read+retry message with the file path inlined. Studios - Classifier hat distributed to all 23 non-software studios (114 stages, 114 fix_hats updated). software/* stages already had it. fix_hats sequence: [classifier, <implementer>, feedback-assessor]. - design-stage opted in to `requires_design_direction: true`. - Discovery template uniqueness validation (P8): studio-loader refuses to load when two discovery templates within a stage share the same `location:` field. SPA - Dual-pathing for v3↔v4 status / phase / completion across 9 files. - New helpers: `isIntentTerminal` (sealed_at + v3 fallback), `deriveUnitStatus` (iterations[]-based with v3 status fallback), `resolveWalkthroughForDetail` (tab-scoped fallback for off-tab browsing — fixes Chris Downard's advance/back UX bug). - Schema indicator chip (v4 / v3) on IntentReview. - Migrated breadcrumb banner when intent or unit approvals carry `migrated: true`. - await_gate UX: 30min → 4h timeout, isError: false on timeout (continuation, not fault), per-session `announced_at` so retries don't re-post the URL. Website (browse) - Dual-paths for v4 in `parseIntentFromRaw`, all three providers (local, github, gitlab), search index inherits derivation via `intent.status`. - New helpers: `deriveStageStatusFromUnits`, `deriveV4ActiveStage`. - Schema indicator chip on IntentDetailView header. - 20 new derivation/integration tests via `npx tsx`. Tests - Engine: 1244/1244 (was ~1130 at session start). 73 → 74 test files. - SPA: 453/453 (was 441). - New test files (this session): cursor-walk (26 scenarios), v0-to-v4-realistic-scenario (11), v0-to-v4-migrator (4), feedback-set-targets (7), haiku-reflect-dual-path (8), field-hygiene (4), skill-tools-smoke (14), announcement-contract (7), await-gate-timeout-classifier (9), await-gate-announce-dedup (4), worktree-lock-guard (4), closure-reply-filter (6), and SPA-side walkthrough/review-helpers/UnitsTable derivation tests. - run-all.mjs: silent-test-loss guard + node:test output parsing (caught a hidden-coverage bug where 12 tests were invisible). Skill audit - Refresh + dual-path on `haiku_dashboard`, `haiku_capacity`, `haiku_reflect`. Smoke tests for previously-untested `haiku_intent_archive/unarchive`, `haiku_backlog`, `haiku_seed`, `haiku_release_notes`, `haiku_review`, `haiku_version_info`. - 4 skill docs updated (refine, revisit, adopt, repair) for v4 wording. - repair: documented split-purpose under v4 (drift-baseline rebuilds + worktree relocation are still relevant; metadata-cleanup paths are obsolete and no-op on v4 intents). Known gaps (deferred): - No real-world `.haiku/intents/` fixture run-through. All migrator + cursor fixtures are synthesized. - No end-to-end "create → run → seal" pipeline test. - Drift sweep cursor-emit path not scenario-tested (the sweep itself has unit tests). - intent_completion_review / intent_completion_fix cursor scenarios are stubbed. - P5 channels / Monitor pattern researched, not built. Channels (https://code.claude.com/docs/en/channels.md) is the recommended path for SPA-decision push when we get to it. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

claude · 2026-05-07T03:02:48Z

Claude finished @jwaldrip's task in 15m 3s —— View job

PR Review: feat(v4): cursor-driven engine refactor + UX hardening

Gather PR context and metadata
Review core engine changes (cursor.ts, run-tick.ts)
Review migration logic (v0-to-v4.ts, migrate-registry.ts)
Review new MCP tools and state-tools changes
Review SPA changes
Review security / correctness issues
Post final review summary

This is a large, architecturally significant PR. The cursor model is cleaner than the handler FSM it replaces. Several issues found range from a guaranteed post-ship breakage to data-loss-class migration bugs. Listed in priority order.

CRITICAL

1. Version bump after ship permanently blocks all v4 intents

File: packages/haiku/src/orchestrator/workflow/run-tick.ts:71

if (sourceVersion !== target) {   // exact string comparison
    migrateIntent(ctx, sourceVersion, target)   // throws if no edge

The registry (migrate-registry.ts) has exactly one edge: "0" → "4.0.0". CI auto-bumps the plugin version on every merge to main. The first merge after this PR ships (say 4.0.1) causes every intent with plugin_version: "4.0.0" to hit migrateIntent("4.0.0", "4.0.1"). findChain finds no edge → throws "no migration path from 4.0.0 to 4.0.1" → every tick on every v4 intent returns action: "error".

Fix: Change line 71 to compare majors only, not full semver:

const sourceMajor = Number(sourceVersion.split(".")[0] ?? "0")
if (sourceMajor !== targetMajor) {

Fix this →

2. `result: "rejected"` (v3 form) not normalized — migrated units re-dispatch forever

File: packages/haiku/src/orchestrator/migrations/v0-to-v4.ts, packages/haiku/src/orchestrator/workflow/cursor.ts:287

The realistic scenario test at test/v0-to-v4-realistic-scenario.test.mjs:176 shows v3 units writing result: "rejected" (past tense). The v4 cursor at cursor.ts:287 checks last.result === "reject" (present tense). The migrator preserves iterations[] as-is with no normalization pass.

A migrated unit whose last iteration has result: "rejected" falls through both "advance" and "reject" checks and hits return null (line 300) — treated as in-progress on the current hat. walkStageUnitsTrack returns null for this unit at step 3 (in-flight), preventing the wave from progressing. Every tick for this stage returns a noop. The unit is never re-dispatched but also never completed.

Fix (two options):

In migrateUnitFile, add a normalization pass over iterations[]: if (iter.result === "rejected") iter.result = "reject".
Or widen cursor.ts:287 to last.result === "reject" || last.result === "rejected" as a compat shim.

Option 1 is cleaner because it eliminates the drift at the source.

Fix this →

3. `withIntentMainLock` is imported and documented but never called — stage→main merges are unserialised

File: packages/haiku/src/state-tools.ts:62, packages/haiku/src/orchestrator/prompts/merge_stage.ts:28

$ grep -rn "withIntentMainLock" packages/haiku/src/
packages/haiku/src/state-tools.ts:62:          import { withIntentMainLock, ... }
packages/haiku/src/locks.ts:202:          export function withIntentMainLock<T>(...)
packages/haiku/src/orchestrator/prompts/merge_stage.ts:28:  "... the engine will perform the stage→main merge under `withIntentMainLock`..."

withIntentMainLock is imported in state-tools.ts, defined in locks.ts, and promised to the agent in merge_stage.ts:28 — but it is never called anywhere. The stage-branch-into-main merge runs without serialisation. Two parallel merge_stage ticks (e.g. from concurrent autopilot sessions or a retry during a slow merge) can both call mergeStageBranchIntoMain simultaneously, producing a merge in progress git error or silently clobbering each other's work.

Fix: Wrap mergeStageBranchIntoMain call-sites (wherever that function is invoked in run-tick.ts or the merge handler) in withIntentMainLock(slug, () => mergeStageBranchIntoMain(...)).

HIGH

4. `close_feedback` has no prompt builder — agent receives "Unknown Action" JSON dump

File: packages/haiku/src/orchestrator/workflow/cursor.ts:425, packages/haiku/src/orchestrator/prompts/index.ts:61–108

The cursor emits { kind: "close_feedback" } when a terminal fix-hat advances (cursor.ts:425). cursorActionToOrchestratorAction maps this to action: "close_feedback". The actionPromptBuilders map has no entry for "close_feedback". buildRunInstructions (orchestrator.ts:255–258) falls through to:

## Unknown Action: close_feedback
{ "kind": "close_feedback", "stage": "...", "feedback_id": "..." }

The agent has no explicit instruction on which tool to call. The cursor re-emits close_feedback on every tick until closed_at is stamped. The agent must self-recover from raw JSON — possible but fragile. Under retry pressure (P2 sessions), this becomes a multi-turn stall.

Fix: Register a "close_feedback" entry in actionPromptBuilders that instructs the agent to call haiku_feedback_update { intent, feedback_id, status: "closed" }.

5. `isAlive(pid)` catches EPERM as dead — premature lock-stealing on Windows and cross-user Linux

File: packages/haiku/src/locks.ts:57–64

function isAlive(pid: number): boolean {
    try {
        process.kill(pid, 0)
        return true
    } catch {          // ← catches ALL errors, including EPERM
        return false
    }
}

On Windows, process.kill(pid, 0) throws EPERM (not ESRCH) when the target process is alive but belongs to a different user account. The catch-all treats this as "dead" → isStale returns true → steal() deletes the lock while the holder is still running. Two different user sessions sharing a machine (common in CI environments with multiple agents) will steal each other's stage-merge locks.

Fix: Differentiate errors: ESRCH means dead, EPERM means alive-but-unpingable:

} catch (err) {
    return (err as NodeJS.ErrnoException).code === "EPERM"
}

Fix this →

6. Stage[0] fallback when all stages merged runs drift/feedback tracks on completed work

File: packages/haiku/src/orchestrator/workflow/cursor.ts:737–759

const activeStage = firstUnmergedStage(slug, studio)
const stage = activeStage ?? resolveStudioStages(studio)[0] ?? ""

if (stage) {
    const drift = runDriftSweep({ intentDir, stage, studio })  // runs on stage[0]
    ...
}
if (stage) {
    const fbAction = walkFeedbackTrack({ intentDir, studio, currentStage: stage })  // walks 0..0
    ...
}
// "All stages merged" path is never reached if an open FB exists on stage[0]

When firstUnmergedStage returns null (all stages merged), stage falls back to the first stage. Track C runs a drift sweep on stage[0]'s outputs. Track B walks feedback from stages 0..0. Any lingering open FB on stage[0] (e.g. from a user-chat FB that was filed but not triaged) preempts the intent-level approvals track forever — the cursor never reaches line 774 (// All stages merged → intent-level approvals).

Fix: When activeStage === null, skip Tracks C and B entirely. The intent is past all stage work; only the intent-level approval walk should run.

MEDIUM

7. `walkFeedbackTrack` slices all stages when `currentStage` is unknown

File: packages/haiku/src/orchestrator/workflow/cursor.ts:363–365

When currentStage is not found in the studio's configured stage list (cutoff === -1), the slice falls back to stages.slice() — the entire stage list. The comment says it covers "stages 0..currentStage" but the fallback covers stages 0..N. This surfaces FBs from future stages (not yet started) if currentStage is an unknown value (e.g., after a studio reconfiguration with a renamed stage).

Fix: When cutoff === -1, return [] (walk no stages) rather than walking all stages.

8. Drift sweep resolves all output paths relative to `intentDir` — repo-relative outputs never checked

File: packages/haiku/src/orchestrator/workflow/drift-sweep.ts:205 (approx)

Output paths in unit.outputs[] come in two forms: intent-relative (stages/design/foo.md) and repo-relative (src/components/Button.tsx). The drift sweep does join(intentDir, out) for all paths. For repo-relative paths this resolves to <intentDir>/src/components/Button.tsx — a path that doesn't exist. The file is not found, no drift event is emitted, and the drift check silently passes for the most important artifacts (actual code files).

Fix: Distinguish intent-relative from repo-relative paths before resolving. Repo-relative paths (those that don't start with stages/) should be joined against repoRoot, not intentDir.

9. `parseFbIdFromFilename` fallback returns non-parseable IDs — causes infinite `start_feedback_hat` dispatch

File: packages/haiku/src/orchestrator/workflow/cursor.ts:438–441

function parseFbIdFromFilename(fbPath: string): string {
    const m = basename(fbPath).match(/^(\d+)-/)
    return m ? `FB-${m[1].padStart(2, "0")}` : basename(fbPath, ".md")

When a feedback file doesn't start with digits, the fallback returns the raw basename (e.g., orphan-note). The downstream tools (haiku_feedback_advance_hat, haiku_feedback_reject_hat) call findFeedbackFile which matches only ^(?:FB-)?(\d+)$. A bare orphan-note ID fails this → feedback_not_found error → the agent retries → cursor re-emits start_feedback_hat every tick.

Fix: In nextActionForFeedback, validate the parsed ID before returning an action. If parseFbIdFromFilename returns a value that doesn't match the expected pattern, skip the file (return null) rather than emitting an unresolvable action.

MINOR

10. Migration error path returns misleading `track: "drift"` with empty events

File: packages/haiku/src/orchestrator/workflow/run-tick.ts:87–89

When migration fails, the tick returns position: { track: "drift", action: { kind: "drift_detected", events: [] } }. A monitor or SPA reading position.track sees "drift with no events" — a benign no-op. The actual error is in action.action: "error". Consider returning track: "intent" (or a new "error" track) so the position reflects the error state, not a ghost drift.

11. `migrateUnitFile` line 243–245: no-op double-write of `started_at`

File: packages/haiku/src/orchestrator/migrations/v0-to-v4.ts:243

started_at is not in DEPRECATED_UNIT_FIELDS, so strip() preserves it unchanged. The if (next.started_at == null) { next.started_at = data.started_at ?? null } block is a no-op — next.started_at already equals data.started_at. Dead code that misleads readers into thinking started_at needs re-application after stripping.

Summary

#	Severity	File	Description
1	Critical	`run-tick.ts:71`	Exact semver comparison → all v4 intents break on next CI bump
2	Critical	`v0-to-v4.ts`, `cursor.ts:287`	`"rejected"` not normalized → migrated units stuck as in-progress
3	Critical	`state-tools.ts:62`	`withIntentMainLock` imported but never called — stage merges unserialised
4	High	`cursor.ts:425`, `prompts/index.ts`	`close_feedback` has no prompt builder → "Unknown Action" dump
5	High	`locks.ts:61`	`isAlive` catches EPERM as dead → premature lock-stealing on Windows
6	High	`cursor.ts:737–759`	Stage[0] fallback when all merged blocks intent-level approvals
7	Medium	`cursor.ts:363–365`	Unknown stage → walks entire stage list for FBs
8	Medium	`drift-sweep.ts`	Repo-relative output paths resolved against wrong base
9	Medium	`cursor.ts:438–441`	Non-numeric FB filenames produce unresolvable dispatch IDs
10	Minor	`run-tick.ts:87`	Migration error returns misleading `track: "drift"`
11	Minor	`v0-to-v4.ts:243`	No-op `started_at` double-write

Issues 1, 2, and 3 should be resolved before merging. Issue 1 in particular is a ticking clock — it activates on the first CI bump after this lands.

Checkpoint of in-flight v4 engine work before merging the design-direction intake/upload PR (#322). What's in this checkpoint: - Cursor-driven engine refactor: derivePosition + Track A/B/C walks, firstUnmergedStage with strict-ahead semantic, body-sha256 witness model replacing baseline.json, drift-sweep dedup by source_ref. - Filesystem-mode persistence first-class: stages_merged in intent.md as canonical merge signal when no git repo. - All MCP elicitation removed; studio/mode/stage selection routes through SPA picker (createPickerSession + runPicker helper). - Tick-driven select_*: run-tick.ts emits select_studio / select_mode / select_stage when fields missing; haiku_run_next intercepts and runs the picker inline so the agent sees a blocking tick, never "Call haiku_select_*" instructions. - close_feedback / merge_stage / merge_intent auto-execute in haiku_run_next instead of bouncing back to the agent. - E2E suites: software studio, multi-mode, feedback mid-flight, drift mid-flight, filesystem mode, squash-merge fallback, picker wire round-trip, SPA wire round-trip. Next up (not yet done): - Consolidate URL+await pairs (gate_review, visual question, design direction) into single blocking tool calls so the agent never sees "post URL + call await_*" two-step instructions. - Pull in PR #322 (intake-first design picker + upload mode) and align the new wire shapes with the engine-side blocking pattern. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Pulls in the intake-first design picker (PR #322) and aligns the new wire shapes with the v4 cursor-driven engine. What carried over verbatim: - haiku-api schemas: DirectionUploadFile + DirectionUploadModeSchema + DirectionGenerateModeSchema discriminator branches. - haiku-ui DirectionPage split into IntakePage + ArchetypePage with drag/drop, per-file caption, generate fallback. - HTTP /direction/:id/select route handles upload mode via persistDesignDirectionUploads (decodes data URLs into stages/<stage>/artifacts/design-direction/uploads/). - Studio docs (ARCHITECTURE.md, software/design ELABORATION.md) + ModalRouter blurb describe intake-first. - New prompts: design_direction_required (rewritten for intake-first) + design_direction_uploaded. What had to be ported (PR #322 modified handlers/elaborate.ts; v4 deleted that file because the cursor itself now emits dispatch actions): - cursor.ts: design-direction gate is now two-phase. After selection, the cursor's surface-once branch emits ONE `design_direction_complete` (archetype mode) or `design_direction_uploaded` (upload mode) action so the agent can Read screenshots / uploaded files before elaborate. Surfaced state tracked by `surfaced_at` on the same intent.md record. - haiku_run_next.ts: stamps `surfaced_at` on intent.md right before returning the surface-once action so the next tick falls through to elaborate (mirrors the PR's design_direction_surfaced flag, but on intent.md FM instead of stage state.json). - state-tools.ts persistDesignDirectionUploads: also stamps `design_directions[<stage>] = { mode: "upload", uploads, … }` on intent.md so the v4 cursor's gate check sees the selection (state.json is no longer authoritative in v4). - state-tools.ts persistDesignDirectionSelection (archetype mode): the intent.md stamp now carries `mode: "archetype"`, comments, annotations, and `at` so the cursor can render the full payload on surface-once. Tests: - cursor-walk.test.mjs: existing fixtures stamping design_directions[stage] now also stamp `surfaced_at` so the walk tests assert on downstream gates (clarify, discovery) instead of the new surface-once branch. - New cursor-walk cases: archetype-mode without surfaced_at → design_direction_complete; upload-mode without surfaced_at → design_direction_uploaded. - haiku-api test suite: 182 passed. - haiku-ui test suite: 456 passed. - haiku focused tests (sad-paths + cursor-walk + multi-tick + picker-wire-round-trip + select-mode-stage-constraints + select-studio-prompt): 50 passed. Still open (queued for the URL+await consolidation pass): - pick_design_direction → haiku_await_design_direction is still a two-step (URL + await) pair. Will collapse into a single blocking call alongside gate_review and ask_user_visual_question. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

The agent used to see "post URL + call await tool" pairs for every interactive surface — gate review, visual question, design direction. Each pair was three turns minimum: tool 1 returns URL, agent posts it, agent calls tool 2 which blocks. Half the time the agent forgot step 2 and the user clicked Approve into the void. Now every interactive surface is a single blocking tool call. The engine creates the session, launches the browser best-effort, and blocks until the user submits. The await tools stay as resume entry points for the case where the original blocking call timed out. ## ask_user_visual_question + haiku_await_visual_answer - ask_user_visual_question now creates the session, launches the browser, and blocks on awaitVisualAnswerSession inline. Returns the answer + screenshot annotations directly. - haiku_await_visual_answer becomes a thin resume wrapper around the same helper. - Extracted awaitVisualAnswerSession() to share the wait + response builder between the two paths. ## pick_design_direction + haiku_await_design_direction Same shape as visual: pick_design_direction inlines the await; the old await tool is the resume entry. Extracted awaitDesignDirectionSession() that handles all four submission modes (select / regenerate / generate / upload). ## gate_review + haiku_await_gate haiku_run_next's gate_review block used to return the URL and ask the agent to call haiku_await_gate. Now it: 1. Prepares the session (creates it, writes the resume pointers to intent.md, stamps announced_at on first prepare). 2. Calls haiku_await_gate inline — same code path as the resume entry, so post-decision side effects (stampGateApproval, workflowAdvancePhase/Stage, writeReviewFeedbackFiles, sealIntentState, etc.) all go through the canonical handler. 3. Inspects the returned action name. For advance cases (advance_phase / advance_stage / intent_approved) it re-ticks so the cursor surfaces the natural-next workflow action. For terminal/feedback cases (intent_complete, external_review_requested, changes_requested, revise_unit_specs, stage_revisit) it returns the await response directly. The gate_review handling is now a `while` loop, so a chain of gates (e.g. final stage approval → intent_review pre-completion gate) can all process in the same blocking tick — the agent sees one tool call that returns the post-everything action. Added extractActionFromAwaitResponse() helper that parses the action name out of haiku_await_gate's rendered response. ## Tests - Added design_direction_complete / design_direction_uploaded handlers to the e2e test scaffolding (e2e-software-studio + real-intent-dry-run). The scaffolding mirrors haiku_run_next's surface-once stamp by writing surfaced_at on the design_directions record. - All 1340 haiku tests, 182 haiku-api tests, and 456 haiku-ui tests pass (1978 total). Type-check clean. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Picker UI - New /picker/:sessionId route in haiku-ui rendering studio (card grid + stage-chain preview), mode (cards with mini-timeline showing where the engine pauses), stage (simple list), confirm (two-button decision), and url_input (free-text input for the new external-review URL collection path). - haiku-api carries PickerSessionPayloadSchema in the SessionPayload discriminated union plus PickerSelectRequest/Response schemas and /picker/:id GET + /picker/:id/select POST routes. - session-api projects picker fields (kind / title / prompt / options / selection) into the wire response. ApiClient gained submitPicker(). External review URL — engine direction - haiku_run_next's external_review_requested response now tells the agent to open the change request via gh / glab directly (the agent has CLI access; no need to ask the user). Fallback: ask the user in chat if gh/glab isn't reachable. The url_input picker stays available as a future fallback path. - FeedbackSidebar gained gateAcceptsLocalApprove() — pure-external gates no longer render Approve (the merge IS the approval signal). Compound [external, ask] still renders both Approve + External buttons so the user can pick either path. Dead code + migration cleanup - Deleted prompts/gate_review.ts and its companion test (gate-review-prompt.test.mjs). Under v4 the gate_review action never reaches the agent — haiku_run_next inlines the await + post-decision processing in one tool call. - Removed the gate_review entry from the prompt registry. - v0-to-v4 migrator now scrubs pre-v4 baseline noise: deletes baseline.json, drift-markers.json, and baseline-content/ at both intent and per-stage scope. Added a migrator test covering both scopes. Stale tool descriptions reframed - haiku_await_gate, haiku_await_visual_answer, and haiku_await_design_direction descriptions now read as resume entry points, not "Call this AFTER X" two-step instructions. Canonical flow blocks inside haiku_run_next / ask_user_visual_question / pick_design_direction. Sync surfaces refreshed (per CLAUDE.md sync rule) - ArchitectureMap orchestrator-actor + webui-actor blurbs rewritten for engine-side blocking (no more URL+await two-step language). The actor talks_to lists now mention the SPA picker and engine-side gate review explicitly. - plugin/studios/ARCHITECTURE.md action table: select_studio / select_mode / select_stage / gate_review entries describe engine-side blocking, not "Call haiku_select_*" / "Call haiku_await_gate". Tests - haiku 1336 / 1336 (added migrator baseline-cleanup case) - haiku-api 182 / 182 - haiku-ui 456 / 456 (test mocks for ApiClient gained submitPicker) - All type-checks clean - Stray packages/haiku/.haiku/intents/picker-mockups/ scaffolding removed. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

claude · 2026-05-07T17:59:09Z

Claude finished @jwaldrip's task in 11m 22s —— View job

PR Re-Review: feat(v4): cursor-driven engine refactor + UX hardening

Gather PR context and metadata
Check status of previously-flagged critical issues (1–3)
Review new commits since last review (4 new commits: picker SPA, URL+await collapse, design-direction port, select_* engine dispatch)
Check high/medium issues (4–9) from prior review
Full re-review: engine, migrations, MCP tools, SPA, security
Post final review summary

The new commits (31c6ee1 → 580e77d) close two of the eleven prior issues (Issues #4 and #6), introduce the SPA picker flow and engine-side select_* dispatch, and introduce two new high-priority bugs. The three critical issues from the first review remain unaddressed.

Status of previously-flagged issues

#	Severity	Status	Note
1	Critical	STILL OPEN	Exact semver comparison on `run-tick.ts:82`
2	Critical	STILL OPEN	`"rejected"` not normalized in migrator
3	Critical	STILL OPEN	`withIntentMainLock` never called (comment added on line 471 claims it is)
4	High	FIXED ✓	`close_feedback.ts` added and registered in `prompts/index.ts:80`
5	High	STILL OPEN	`isAlive` bare catch on `locks.ts:61`
6	High	FIXED ✓	Stage[0] fallback removed ("2026-05-06 P11 bug fix" comment in cursor.ts:927)
7	Medium	STILL OPEN	`walkFeedbackTrack` cutoff === -1 fallback to `stages.slice()`
8	Medium	STILL OPEN	Repo-relative output paths resolved against `intentDir` (now also in new `sign-slot.ts:65`)
9	Medium	STILL OPEN	`parseFbIdFromFilename` raw-basename fallback → unresolvable dispatch ID
10	Minor	STILL OPEN	Migration error returns `track: "drift"`
11	Minor	STILL OPEN	No-op `started_at` double-write in migrator

NEW issues from the latest commits

N1 (HIGH): `close_feedback` handler never finds feedback files

File: packages/haiku/src/tools/orchestrator/haiku_run_next.ts:392

for (const f of readdirSync(fbDir)) {
    if (!f.endsWith(".md")) continue
    if (!f.startsWith(`${fbId}-`) && !f.startsWith(`${fbId}.`)) continue  // ← bug

fbId is "FB-01" (from parseFbIdFromFilename). Real feedback files on disk are named 01-some-title.md (state-tools.ts:4972: const filename = \${nn}-${fileSlug}.md`). "01-some-title.md".startsWith("FB-01-")is alwaysfalse. fbFilestays empty →if (!fbFile) break→ feedback is never closed → cursor emitsclose_feedback` again on the next tick → infinite loop.

The tests in drift-mid-flight-e2e.test.mjs accidentally avoid this because they create files like FB-DRIFT-01-drift.md (the raw-basename fallback path in parseFbIdFromFilename), which do satisfy startsWith("FB-DRIFT-01-drift"). Production files created via haiku_feedback MCP will all hit the broken path.

Fix: Replace the manual readdirSync scan with the existing findFeedbackFile(slug, stage, fbId) from state-tools.ts:5335, which correctly normalises "FB-01" → numeric 1 → finds 01-*.md.

Fix this →

N2 (HIGH): `select_*` picker infinite loop on timeout

File: packages/haiku/src/tools/orchestrator/haiku_run_next.ts:100–133

async function runSelectionPicker(...): Promise<{ ok: true } | { ok: false; message: string }> {
    ...
    if (result.isError) { return { ok: false, ... } }
    return { ok: true }   // ← fires on timeout too
}

When a studio/mode/stage picker times out, haiku_select_studio.handle() returns text(JSON.stringify({action:"cancelled",...})) — isError is undefined (the text() helper at _text.ts:7 doesn't set it). runSelectionPicker returns { ok: true }. The while (result.action === "select_studio" || ...) loop at line 302 re-dispatches, the cursor still emits select_studio (nothing was written), a new 30-minute picker session opens, and the loop runs again — indefinitely until the AbortSignal fires.

Fix: In runSelectionPicker, parse the response text for "action":"cancelled" and return { ok: false }:

const responseText = result.content?.find(c => c.type === "text")?.text ?? ""
if (responseText.includes('"action":"cancelled"')) {
    return { ok: false, message: "Picker timed out or was cancelled." }
}

Or, simpler: have the select tools return isError: true on timeout (consistent with the if (!selectedStudio) path at line ~162 of haiku_select_studio.ts which already returns isError: true).

Fix this →

N3 (MEDIUM): `clarify_required` prompt instructs an impossible write path

File: packages/haiku/src/orchestrator/prompts/clarify_required.ts:56

"3. Stamp the answers on `intent.md` via `haiku_human_write` ..."

haiku_human_write (tool-defs.ts:225 description) explicitly refuses intent.md: "Workflow-managed files (units, feedback, intent.md, state.json) are refused." And haiku_intent_set rejects clarifications because it's not in AGENT_AUTHORABLE_INTENT_FIELDS (intent.ts:105: Object.keys(INTENT_FRONTMATTER_SCHEMA.properties ?? {})). The P4 clarify gate cannot be satisfied — agents hitting it get an error on whichever tool they try.

Fix: Add clarifications to INTENT_FRONTMATTER_SCHEMA as an optional Type.Unknown() field so it passes the AJV schema and is included in AGENT_AUTHORABLE_INTENT_FIELDS. Then update the prompt to instruct haiku_intent_set { field: "clarifications", value: { [stage]: { answers, at } } }.

Carry-forward: still critical before merge

The three critical items from the prior review are still blocking. The comment at haiku_run_next.ts:471 even says "under the intent-main lock" but no lock is called — that's now actively misleading documentation on top of the underlying bug.

Issue	Location	What breaks
#1	`run-tick.ts:82`	All v4 intents fail on the first CI bump after landing (4.0.0 → 4.0.1 → `"no migration path"` error)
#2	`v0-to-v4.ts` + `cursor.ts:312`	Migrated units with v3-style `result:"rejected"` treated as in-progress forever
#3	`haiku_run_next.ts:531`, `side-effects.ts:129/333`	Stage-into-main merges are unserialised; concurrent `haiku_run_next` calls can clobber each other's merge commit
N1	`haiku_run_next.ts:392`	Production FB fix loops never close; `close_feedback` loops infinitely (test coverage only works by accident)
N2	`haiku_run_next.ts:302`	Picker timeout causes infinite re-open loop; effectively hangs any session where the user doesn't pick

Issues #1, #2, #3, N1, and N2 should all be resolved before merging.

Two real-world footguns in the discrete / external-review path that broke pickup for User B (the "Pedro issue"): 1. Stage PRs landing on the repo default branch instead of `haiku/<slug>/main`. Agents would run `gh pr create` without --base, so the PR opened against `main`. Once merged, the cursor's `firstUnmergedStage` (which checks against intent main, not the repo default) kept the stage pinned and pickup wedged. 2. No graceful recovery when the misroute had already happened. User B fetches, sees the stage commits on `main` but not on `haiku/<slug>/main`, and the engine has no way to reconcile. ## Engine-side stage PR opening `openStagePullRequest(slug, stage)` in `git-worktree.ts`: - Pushes `haiku/<slug>/<stage>` to origin. - Calls `gh pr create --base haiku/<slug>/main` (or `glab mr create --target-branch haiku/<slug>/main`). - If gh/glab CLI fails or isn't on PATH, builds a provider-specific "open MR" URL the user can click: GitHub → /compare/<base>...<head>?expand=1 (auto-opens create form) GitLab → /-/merge_requests/new?merge_request[source_branch]=…&[target_branch]=… (pre-fills source + target) - Returns a structured result with createdUrl / compareUrl / pushed / errors so callers can surface whichever path succeeded. Wired into `haiku_await_gate.ts`'s external_review handling: when the user clicks External Review at the gate, the engine opens the MR itself with the right base, persists `external_review_url` on intent.md so downstream pickups see it, and surfaces the URL to the agent. The agent's instruction set no longer relies on the agent remembering to pass `--base`. `haiku_run_next.ts` external_review_requested handling does the same when the action arrives without an URL already set (the cursor path, not the gate-await path). ## Misrouted-merge reconciliation `reconcileMisroutedStageMerges(slug, stages)` in `git-worktree.ts`: - Walks each stage branch. - Detects when the branch's commits exist on the repo mainline (local or `origin/<mainline>`) but not on `haiku/<slug>/main`. - Fast-forwards intent main to mainline when safe (i.e. intent main is itself an ancestor of mainline). Pushes the FF to origin so other clones see the fix. - Refuses to FF when intent main has divergent commits — surfaces a structured error with the exact `git checkout && git merge` recipe the operator needs to run. Pre-cursor reconciliation step in `haiku_run_next.ts.handle()`: when the intent has `external_review_url` set and we're in a git repo, fetch from origin and run reconciliation BEFORE the cursor walks. That's the "User B opens a fresh checkout, runs /haiku:pickup, expects the engine to figure it out" path. Cleanly reconciled stages are invisible to the agent (which is the right UX); divergent ones surface a hard error with a fix recipe. ## Tests Added `misrouted-stage-merge.test.mjs` covering all three paths: - Happy path (merged into intent main → no reconciliation needed). - Misroute + FF-safe (engine fast-forwards intent main, stage now appears merged to the cursor). - Misroute + divergent intent main (engine refuses to FF, surfaces a manual-fix error message). All 1339 haiku tests pass across 94 test files. Type-check clean. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

claude · 2026-05-07T18:12:58Z

Claude finished @jwaldrip's task in 7m 1s —— View job

Claude finished @jwaldrip's task in ~15m ——

PR Re-Review: feat(v4): cursor-driven engine refactor + UX hardening

Gather PR context and prior review history
Check status of critical issues from prior reviews (perf: optimize session start hook performance #1, feat: add Google Antigravity CLI support #2, feat(ci): add automatic version bump and changelog pipeline #3, N1, N2, N3)
Review new commits since last review (cf1142b: misrouted stage merge recovery)
Review engine / cursor / migration changes
Review MCP tools and lock serialization
Post final review summary

The latest commit (cf1142b) adds misrouted-stage-merge recovery and engine-side PR opening in haiku_await_gate. Three of the prior eleven flagged issues are now fixed. Eight remain open, including all three criticals. One new medium issue is introduced by the latest commit.

Status of previously-flagged issues

#	Severity	Status	Note
1	Critical	STILL OPEN	`run-tick.ts:82` exact semver compare
2	Critical	STILL OPEN	`"rejected"` not normalized in migrator
3	Critical	STILL OPEN	`withIntentMainLock` still never called
4	High	FIXED ✓	`close_feedback.ts` prompt builder added
5	High	STILL OPEN	`isAlive` bare catch in `locks.ts:61`
6	High	FIXED ✓	Stage[0] fallback removed from cursor
7	Medium	STILL OPEN	`walkFeedbackTrack` cutoff === -1 walks all stages
8	Medium	STILL OPEN	Drift sweep resolves repo-relative paths against `intentDir`
9	Medium	STILL OPEN	`parseFbIdFromFilename` raw-basename fallback → unresolvable ID
10	Minor	STILL OPEN	Migration error returns `track: "drift"`
11	Minor	STILL OPEN	No-op `started_at` double-write in migrator
N1	High	STILL OPEN	`close_feedback` handler never finds feedback files
N2	High	FIXED ✓	Picker timeout now returns `isError: true` correctly
N3	Medium	STILL OPEN	`clarify_required` prompt writes to `haiku_human_write` + field not in `AGENT_AUTHORABLE_INTENT_FIELDS`

CRITICAL — unchanged, still blocking

1. Exact semver comparison still breaks all v4 intents on CI bump

File: packages/haiku/src/orchestrator/workflow/run-tick.ts:82

targetMajor was computed on line 75 but is only used for the outer if (targetMajor >= 4) gate (line 77). The migration guard on line 82 is still an exact string compare:

if (sourceVersion !== target) {          // ← "4.0.0" !== "4.0.1" → triggers
    migrateIntent(ctx, sourceVersion, target)  // no path → throws

The registry has exactly one edge: "0" → "4.0.0". The first CI auto-bump after this PR lands (e.g. 4.0.0 → 4.0.1) makes every v4 intent throw "no migration path from 4.0.0 to 4.0.1" on every tick. This is a ticking clock.

Fix: Change line 82 to compare majors:

const sourceMajor = Number(sourceVersion.split(".")[0] ?? "0") || 0
if (sourceMajor !== targetMajor) {

Fix this →

2. `result: "rejected"` (v3 form) still not normalized — migrated units stuck forever

File: packages/haiku/src/orchestrator/migrations/v0-to-v4.ts, packages/haiku/src/orchestrator/workflow/cursor.ts:312

cursor.ts:312 checks last.result === "reject" (present tense). v3 units used result: "rejected" (past tense). migrateUnitFile copies iterations[] as-is (next.iterations = data.iterations ?? []) with no normalization pass. A migrated unit whose last iteration has result: "rejected" falls through both checks and returns null — treated as in-flight, blocking the wave every tick.

Fix: In migrateUnitFile (after the iterations copy), add:

for (const iter of next.iterations as Array<Record<string, unknown>>) {
    if (iter.result === "rejected") iter.result = "reject"
    if (iter.result === "advanced") iter.result = "advance"  // belt+suspenders
}

Fix this →

3. `withIntentMainLock` still never called — stage→main merges unserialised

File: packages/haiku/src/tools/orchestrator/haiku_run_next.ts:601-604

withIntentMainLock is defined in locks.ts:202, imported in state-tools.ts:62, and promised to the agent in two prompt files (merge_stage.ts:28, merge_intent.ts:28). The while (result.action === "merge_stage") loop at line 568 calls:

const mergeOutcome = mergeStageBranchIntoMain(slug, stageToMerge)

…with no lock. Two concurrent haiku_run_next calls targeting the same intent produce a race on the stage→main merge.

Fix: In haiku_run_next.ts, import withIntentMainLock from ../../locks.js and wrap the mergeStageBranchIntoMain call:

const mergeOutcome = withIntentMainLock(slug, () =>
    mergeStageBranchIntoMain(slug, stageToMerge)
)

HIGH — still blocking

N1. `close_feedback` handler never finds FB files — infinite loop on every production FB

File: packages/haiku/src/tools/orchestrator/haiku_run_next.ts:460-462

parseFbIdFromFilename (cursor.ts:568) produces "FB-01" from a file named 01-slug.md. The close handler matches:

if (!f.startsWith(`${fbId}-`) && !f.startsWith(`${fbId}.`)) continue

"01-slug.md".startsWith("FB-01-") → false. fbFile stays empty → break exits without closing → cursor re-emits close_feedback on the next tick → infinite loop. Tests pass accidentally because test fixtures use non-standard FB-DRIFT-01-... naming which happens to satisfy the prefix check.

Fix: Replace the manual readdirSync scan with the existing findFeedbackFile(slug, stage, fbId) from state-tools.ts, which correctly normalises "FB-01" → numeric 1 → matches 01-*.md. Also update the drift test fixtures to use production-standard naming so this stays caught.

Fix this →

5. `isAlive` catches EPERM as dead — premature lock-stealing on Windows and cross-user Linux

File: packages/haiku/src/locks.ts:57-63

function isAlive(pid: number): boolean {
    try {
        process.kill(pid, 0)
        return true
    } catch {      // ← catches EPERM as dead
        return false
    }
}

Fix: ESRCH = dead, EPERM = alive-but-unpingable:

} catch (err) {
    return (err as NodeJS.ErrnoException).code === "EPERM"
}

Fix this →

NEW — introduced by the latest commit (`cf1142b`)

N5. `reconcileMisroutedStageMerges` may leave worktree on wrong branch on unexpected throw

File: packages/haiku/src/git-worktree.ts:732-785

After execFileSync("git", ["checkout", intentMain]) succeeds and restoreBranch is captured, any unhandled exception inside the outer try block before the if (restoreBranch) restore block causes the outer catch to fire — but restoreBranch goes out of scope and git checkout <originalBranch> never runs. The agent's next operation then runs on haiku/<slug>/main instead of the stage branch, where all writes land on the wrong branch.

In practice this requires an uncaught throw after checkout, which is rare (push errors return result objects), but the pattern should use finally to guarantee the restore:

if (restoreBranch) {
    try {
        execFileSync("git", ["checkout", restoreBranch], { stdio: "pipe" })
    } catch { /* non-fatal */ }
}

…should be in a finally {} wrapping the whole operation, not inside the try block.

Fix this →

MEDIUM — carry-forward open

7. `walkFeedbackTrack` walks all stages when `currentStage` is not found

File: packages/haiku/src/orchestrator/workflow/cursor.ts:493-495

const cutoff = stages.indexOf(currentStage)
const stagesUpToCurrent =
    cutoff >= 0 ? stages.slice(0, cutoff + 1) : stages.slice()  // walks ALL stages

When currentStage is unrecognised (renamed stage, misconfigured studio), stages.slice() returns the full list and the feedback walk surfaces FBs from future stages.

Fix: Return [] when cutoff === -1 — walk no stages when the current stage can't be located.

Fix this →

8. Drift sweep joins repo-relative output paths against `intentDir` — check silently passes

File: packages/haiku/src/orchestrator/workflow/drift-sweep.ts:224

const outAbs = join(args.intentDir, outRel)

Unit outputs like src/components/Button.tsx resolve to <intentDir>/src/components/Button.tsx, which doesn't exist. fileSha256(outAbs) returns null → continue → no drift event → drift check silently passes for the most important code artifacts.

Fix: Paths that don't start with stages/ are repo-relative and should resolve against args.repoRoot, not args.intentDir.

9. `parseFbIdFromFilename` fallback returns raw basename — causes unresolvable dispatch IDs

File: packages/haiku/src/orchestrator/workflow/cursor.ts:568-571

function parseFbIdFromFilename(fbPath: string): string {
    const m = basename(fbPath).match(/^(\d+)-/)
    return m ? `FB-${m[1].padStart(2, "0")}` : basename(fbPath, ".md")
}

The raw-basename fallback (e.g., orphan-note) passes findFeedbackFile's ^(?:FB-)?(\d+)$ check and causes feedback_not_found on every tick. The agent retries, the cursor re-emits, infinite loop.

Fix: Return null when the filename doesn't match the expected pattern; in nextActionForFeedback, skip the file if parseFbIdFromFilename returns null.

N3. `clarify_required` prompt instructs a permanently-blocked write path

File: packages/haiku/src/orchestrator/prompts/clarify_required.ts:56

"3. Stamp the answers on `intent.md` via `haiku_human_write`..."

Two failures:

haiku_human_write explicitly refuses intent.md (state-tools.ts comment line 32: "Workflow-managed files... are refused").
clarifications is not in INTENT_FRONTMATTER_SCHEMA.properties and therefore not in AGENT_AUTHORABLE_INTENT_FIELDS (intent.ts:105). haiku_intent_set { field: "clarifications" } returns the "Field not in INTENT_FRONTMATTER_SCHEMA" error.

An agent hitting clarify_required cannot satisfy it. The gate is permanently stuck.

Fix (two parts):

Add clarifications: Type.Optional(Type.Unknown()) to INTENT_FRONTMATTER_SCHEMA in packages/haiku/src/state/schemas/intent.ts.
Update the prompt to use haiku_intent_set { field: "clarifications", value: { [stage]: { answers, at } } }.

Fix this →

Summary

Issue	Severity	Status	File
#1	Critical	OPEN	`run-tick.ts:82` — semver exact compare → all v4 intents break on next CI bump
#2	Critical	OPEN	`v0-to-v4.ts` + `cursor.ts:312` — `"rejected"` not normalized → migrated units stuck
#3	Critical	OPEN	`haiku_run_next.ts:601` — `withIntentMainLock` never called
N1	High	OPEN	`haiku_run_next.ts:462` — `close_feedback` never finds files; production infinite loop
#5	High	OPEN	`locks.ts:61` — `isAlive` bare catch steals live locks on Windows
N5	Medium	NEW	`git-worktree.ts:732` — restore after checkout not in `finally`; wrong-branch writes on throw
#7	Medium	OPEN	`cursor.ts:495` — cutoff === -1 walks all stages for FBs
#8	Medium	OPEN	`drift-sweep.ts:224` — repo-relative outputs joined against intentDir
#9	Medium	OPEN	`cursor.ts:568` — raw-basename fallback in `parseFbIdFromFilename`
N3	Medium	OPEN	`clarify_required.ts:56` — writes via blocked path; field not in schema
#4	High	FIXED ✓	`close_feedback` prompt builder added
#6	High	FIXED ✓	Stage[0] fallback removed
N2	High	FIXED ✓	Picker timeout returns `isError: true`
#10–11	Minor	OPEN	Cosmetic

Issues #1, #2, #3, and N1 must be resolved before merging. #1 activates on the first CI bump after this lands. N1 affects every production fix-loop closure.

…boundary Matt (designer) tried to upload `Travel_Delivery_Example_Header_Only.html` to the knowledge uploader and got "File type not accepted." That allowlist (.md .pdf .png .jpg .jpeg .svg .txt) rejected real artifacts: designers exporting .html mockups from Sketch / Figma, researchers attaching .docx / .xlsx / .csv notes, anyone pasting a .json data bundle. The upload-side allowlist + extension blocklist on the knowledge route were defense against V-01 (stored XSS via uploaded HTML rendering in the privileged tunnel origin). That defense moved to serve time when serveFile (http/path-safety.ts) gained the inverted MIME map: anything not in SAFE_INLINE_MIME_TYPES gets `application/octet-stream` + `Content-Disposition: attachment`, so the reviewer's browser downloads unknown types instead of executing them. Once that landed, the upload-side rejection became redundant friction. What this commit does: - KnowledgeDropZone DEFAULT_ACCEPT changes from a fixed extension list to "*" — accepts any file. The size cap (10 MB) is the only real limit. The "drop files here" caption renders "any file type · max 10 MB each" in this mode. - Knowledge upload route in http/upload-routes.ts drops both the BLOCKED_EXTENSIONS check and the ALLOWED_MIMES_KNOWLEDGE allowlist. The route accepts any file; serveFile downgrades dangerous MIMEs at serve time. - Stage-output uploads keep their existing allowlist — those files are intentionally rendered inline in the SPA review UI and the inverted MIME map alone isn't enough cover. Tests: - upload-routes.test.mjs: replaced 4 "knowledge: X rejected with 415" cases with positive "X uploads successfully and lands on disk" assertions covering .html (Sketch export shape), .svg, .js (snippet reference), and octet-stream payloads. - red-team-unit-01-upload-bypass.test.mjs: R-05 was the V-01 bypass via .js + octet-stream on the knowledge route. Same conversion — upload now succeeds (200) and the test documents that V-01 closure moved to serveFile, with a comment guarding against reintroducing the upload-side guard. - 1336 / 1336 haiku tests green; UI tests 456 / 456; type-check clean. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

claude · 2026-05-07T18:32:59Z

Claude finished @jwaldrip's task in 4m 6s —— View job

PR Re-Review: feat(v4): cursor-driven engine refactor + UX hardening

Gather PR context and prior review history
Review new commits since last review (869c445, cf1142b)
Verify status of all previously-flagged issues
Full re-scan of engine, migration, MCP tools changes
Post final review summary

Two new commits reviewed (cf1142b misrouted-merge recovery, 869c445 knowledge upload filter drop). The upload security change is sound — serveFile's inverted MIME map is verified in packages/haiku/src/http/path-safety.ts:174-181 and the target_filename path-traversal guard is at upload-routes.ts:887-895. No new security issues introduced.

Six of eleven previously-flagged issues are still open, including all three criticals and the highest-impact high (N1). Status below.

Status of all previously-flagged issues

#	Severity	Status	File
1	Critical	STILL OPEN	`run-tick.ts:82` — exact semver compare
2	Critical	STILL OPEN	`v0-to-v4.ts` + `cursor.ts:312` — `"rejected"` not normalized
3	Critical	STILL OPEN	`haiku_run_next.ts:604` — `withIntentMainLock` never called
4	High	FIXED ✓	`close_feedback.ts` prompt builder added
5	High	STILL OPEN	`locks.ts:61` — `isAlive` bare catch steals live locks
6	High	FIXED ✓	Stage[0] fallback removed from cursor
7	Medium	STILL OPEN	`cursor.ts:495` — cutoff === -1 walks entire stage list
8	Medium	STILL OPEN	`sign-slot.ts:65` + `drift-sweep.ts:224` — repo-relative outputs never witnessed
9	Medium	STILL OPEN	`cursor.ts:568` — raw-basename fallback in `parseFbIdFromFilename`
10	Minor	STILL OPEN	`run-tick.ts:99` — migration error returns `track: "drift"`
11	Minor	STILL OPEN	`v0-to-v4.ts:247` — no-op `started_at` double-write
N1	High	STILL OPEN	`haiku_run_next.ts:462` — `close_feedback` never finds files
N2	High	FIXED ✓	Picker timeout returns `isError: true`
N3	Medium	STILL OPEN	`clarify_required.ts:56` — writes via `haiku_human_write`; field not in schema
N5	Medium	STILL OPEN	`git-worktree.ts:775` — `restoreBranch` restore not in `finally`

CRITICAL — still blocking merge

1. Exact semver compare still breaks all v4 intents on first CI bump

packages/haiku/src/orchestrator/workflow/run-tick.ts:82

targetMajor is computed on line 75 but line 82 still compares full version strings:

if (sourceVersion !== target) {          // "4.0.0" !== "4.0.1" → triggers
    migrateIntent(ctx, sourceVersion, target)  // no path → throws

The registry has one edge: "0" → "4.0.0". The first CI bump after this lands (4.0.0 → 4.0.1) causes every v4 intent to throw "no migration path" on every tick. This is a ticking clock — it activates on the first merge to main after this PR.

Fix: Change line 82 to:

const sourceMajor = Number(sourceVersion.split(".")[0] ?? "0") || 0
if (sourceMajor !== targetMajor) {

Fix this →

2. `result: "rejected"` still not normalized — migrated units stuck as in-progress forever

packages/haiku/src/orchestrator/migrations/v0-to-v4.ts:247, packages/haiku/src/orchestrator/workflow/cursor.ts:312

cursor.ts:312 checks last.result === "reject" (present tense). v3 units used result: "rejected" (past tense). Lines 246–247 copy iterations[] as-is with no normalization pass. A migrated unit whose last iteration has result: "rejected" falls through both the "advance" and "reject" branches and returns null — treated as in-flight, blocking the wave every tick with no re-dispatch.

Fix: After next.iterations = data.iterations ?? [] in migrateUnitFile, add:

for (const iter of next.iterations as Array<Record<string, unknown>>) {
    if (iter.result === "rejected") iter.result = "reject"
    if (iter.result === "advanced") iter.result = "advance"
}

Fix this →

3. `withIntentMainLock` still never called — concurrent merge_stage ticks can clobber each other

packages/haiku/src/tools/orchestrator/haiku_run_next.ts:604

withIntentMainLock is defined in locks.ts:202, imported (unused) in state-tools.ts:62, and documented in merge_stage.ts and merge_intent.ts prompt files. The actual call at line 604 is:

const mergeOutcome = mergeStageBranchIntoMain(slug, stageToMerge)

No lock wraps this. Two concurrent haiku_run_next calls on the same intent trigger two simultaneous stage-into-main merges, producing a merge in progress git error or silently clobbering each other's work. The comment in the prompt file says "under withIntentMainLock" — that's false documentation.

Fix: Import withIntentMainLock from ../../locks.js in haiku_run_next.ts and wrap the call:

const mergeOutcome = await withIntentMainLock(slug, () =>
    mergeStageBranchIntoMain(slug, stageToMerge)
)

HIGH — still blocking

N1. `close_feedback` handler never finds files in production — infinite loop

packages/haiku/src/tools/orchestrator/haiku_run_next.ts:462

parseFbIdFromFilename (cursor.ts:568) produces "FB-01" from a file named 01-slug.md. The close handler at line 462 then checks:

if (!f.startsWith(`${fbId}-`) && !f.startsWith(`${fbId}.`)) continue
// "01-slug.md".startsWith("FB-01-") → false → always skips

fbFile stays empty → break exits without stamping closed_at → cursor re-emits close_feedback on the next tick → infinite loop. Every production fix-loop closure is stuck. The tests pass accidentally because their fixtures use non-standard FB-DRIFT-01-* naming that happens to match.

Fix: Replace the readdirSync scan with the existing findFeedbackFile(slug, stage, fbId) from state-tools.ts which correctly normalises "FB-01" → numeric 1 → matches 01-*.md. Also update the test fixtures to use production-standard 01-*.md naming so this doesn't silently regress.

Fix this →

MEDIUM — should fix before merge

5. `isAlive` bare catch steals live locks on Windows / cross-user Linux

packages/haiku/src/locks.ts:61

} catch {      // catches ESRCH (dead) AND EPERM (alive-but-cross-user)
    return false
}

On Windows and cross-user Linux, process.kill(pid, 0) throws EPERM when the target process is alive but owned by a different account. The bare catch treats this as dead → isStale() returns true → lock stolen while the holder is still running. Common in CI environments with multiple agent users.

Fix: } catch (err) { return (err as NodeJS.ErrnoException).code === "EPERM" }

Fix this →

N5. `restoreBranch` checkout not in `finally` — wrong-branch writes on unexpected throw

packages/haiku/src/git-worktree.ts:775

reconcileMisroutedStageMerges checks out intentMain at line 741 and captures restoreBranch. The restore at line 775–780 is inside the outer try (lines 732–782). If pushBranchToOrigin(intentMain) at line 769 throws an unhandled exception, the outer catch at line 782 fires and the restore is skipped — leaving the worktree on intentMain instead of the stage branch. The agent's next operation then writes to the wrong branch.

Fix: Move the if (restoreBranch) block into a finally {} that wraps the whole operation.

Fix this →

7. `walkFeedbackTrack` walks all stages when current stage is unknown

packages/haiku/src/orchestrator/workflow/cursor.ts:494–495

const cutoff = stages.indexOf(currentStage)
const toWalk = cutoff >= 0 ? stages.slice(0, cutoff + 1) : stages.slice()  // walks ALL stages

When currentStage is unrecognised (renamed stage, misconfigured studio), stages.slice() returns the full list and surfaces FBs from future stages not yet started.

Fix: Return [] when cutoff === -1.

8. Repo-relative outputs never witnessed — drift on real code files silently passes

packages/haiku/src/orchestrator/workflow/sign-slot.ts:65, packages/haiku/src/orchestrator/workflow/drift-sweep.ts:224

Both buildOutputWitnesses (sign-slot.ts:65) and the drift check (drift-sweep.ts:224) join output paths against intentDir. A unit with outputs: ["src/components/Button.tsx"] resolves to <intentDir>/src/components/Button.tsx — a path that doesn't exist. The file is never added to the witnesses map at sign time; the drift sweep finds no entry; drift in the actual src/components/Button.tsx is permanently invisible.

Fix: Distinguish intent-relative paths (those starting with stages/) from repo-relative paths. Repo-relative paths should be joined against repoRoot, not intentDir.

9. `parseFbIdFromFilename` raw-basename fallback creates unresolvable dispatch IDs

packages/haiku/src/orchestrator/workflow/cursor.ts:568–570

return m ? `FB-${m[1].padStart(2, "0")}` : basename(fbPath, ".md")
// non-numeric file → "orphan-note" → findFeedbackFile rejects → infinite dispatch

Fix: Return null when the filename doesn't match; skip in nextActionForFeedback.

N3. `clarify_required` prompt writes via permanently-blocked path

packages/haiku/src/orchestrator/prompts/clarify_required.ts:56

The prompt instructs agents to call haiku_human_write on intent.md. Two independent blockers:

haiku_human_write explicitly refuses workflow-managed files including intent.md.
clarifications is not in INTENT_FRONTMATTER_SCHEMA.properties (intent.ts:105), so haiku_intent_set { field: "clarifications" } also returns an error.

An agent hitting clarify_required cannot satisfy it — the gate is permanently stuck.

Fix (both parts required):

Add clarifications: Type.Optional(Type.Record(Type.String(), Type.Unknown())) to INTENT_FRONTMATTER_SCHEMA in packages/haiku/src/state/schemas/intent.ts.
Update the prompt to use haiku_intent_set { field: "clarifications", value: { [stage]: { answers, at } } }.

Fix this →

Summary

Issues #1, #2, #3, and N1 must be resolved before merging. #1 is a ticking clock — it activates on the first CI bump after this lands, at which point every v4 intent in every production repo returns action: "error" on every tick. N1 means every fix-loop closure loops infinitely in production.

The two latest commits (misrouted-merge recovery and upload filter drop) are sound. The upload change correctly moves the XSS boundary to serveFile and the path-traversal guard at upload-routes.ts:887 remains in place.

Issues #5, N5, #7, #8, #9, N3 are lower-urgency but each represents a real operational failure mode that will surface in practice.

…te repair Hardens the v4 engine against failure modes surfaced by debugging real sessions (Matt's designer pickup, the overtime-ac product stage, Mike's quality-gate loop). Single bundle because the fixes touch the same set of dispatch / cursor / lifecycle paths. ENGINE - Cursor Track B walks merged stages too. Pre-fix: once every stage merged, FBs filed on prior stages were silently ignored and the pipeline sealed over them. New e2e regression pins both interpretations of "user on stage 4 leaves FB on stage 1." - Drift sweep is FM-aware. New `outputSha256` body-hashes markdown/text outputs (FM stripped) and full-file-hashes binaries. `buildOutputWitnesses` and the comparator both route through it. Backward-compat: dual-strategy comparator accepts legacy whole-file witnesses so in-flight intents don't false-positive on first sweep. - Model routing cascade wired into every dispatch site that emits an LLM subagent: start_unit_hat, start_feedback_hat, dispatch_review, dispatch_approval, discovery_required, review_fix. dispatch_quality_gates is intentionally exempt (engine tool, no LLM). Studio default_model: sonnet now flows to fix-hat dispatches instead of being silently overridden by the parent's Opus default. - Feedback hat-rejection escalates the model tier (sonnet→opus) on the FB FM, mirroring the unit-hat reject path. Next bolt picks up the bumped tier automatically. - haiku_unit_set lifecycle exemption for `quality_gates`. Mirrors `outputs`: gate definitions are check specs, not workflow state, so letting completed units repair them doesn't violate forward-only. Closes Mike's loop: agent had been telling users "edit the file outside Claude Code" because no MCP path existed. - fix_quality_gates prompt rewritten to point agents at the haiku_unit_set repair path with the exact call signature, splitting "code issue" vs "gate definition issue" failure shapes. REVIEW SPA - FeedbackItem renders iterations[] as a collapsed disclosure under the closure_reply card. Reviewers see the chain of hats that fired, each with result + reason + truncated commit, instead of just a bare "closed" status. - session-api scrubs body_sha256 + witnesses[] from the FM projection before the SPA sees it. The cursor needs them on disk for drift; reviewers don't need to see "scary sha artifacts" (Matt's session). - Click-to-flash boundary documented + tested. deriveExistingAnchors exported with a comment block making the persistent-vs-flash split explicit. Closed FBs stay out of the saved highlight layer (no clutter) but still flash on click. - haiku_feedback MCP tool accepts inline_anchor at create time. Adversarial-review and studio-review hats can now attach an excerpt the SPA flashes on click — same flow as user-authored anchored comments. Strict sub-schema gates the shape at the input layer. TESTS - 1352 haiku tests, 465 SPA tests, all green (was 1336 + 462). - New: cross-stage-fb-rewalk (2 e2e), feedback-model-routing (4), unit-hat-model-routing (2), scrub-witness-fields (1), FeedbackItem.iterations (4), deriveExistingAnchors (5). - Existing test files extended: drift-no-false-positives (3 backward- compat scenarios), feedback (2 inline_anchor cases), state-tools- handlers (3 lifecycle exemption + YAML round-trip). Followup tracked: #324 (inline file diff in review pane, deferred per scope discussion). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…B + units Two surfaces, same migration shape. Both keep working for in-flight intents authored with the legacy 2-digit width. feedback_id (the bigger change): - Schema: `feedback_id: integer (1..999)`. No more `FB-NN` string guessing on input. The handler returns canonical `FB-NNN` in responses. - New helpers: `formatFeedbackId(num)` and `deriveFeedbackIdFromFilename(filename)` — single source of truth for the on-disk → wire-format translation. - Lookup is numeric-prefix match against `^(\d+)-`, so legacy `08-foo.md` files resolve when the agent passes `feedback_id: 8`. New writes pad to 3 digits (`008-foo.md`). - Cross-stage `haiku_feedback_move` (panda's v3 regression) lands with a regression test pinning the cross-stage relocation contract. - All 8 FB handlers unified on `feedbackId` local var (was a mix of `feedbackId` / `fbId`). Units (smaller, narrower scope — slug stays human-readable for the depends_on graph): - `unitPath()` now resolves width-flexibly: if the requested name doesn't match an exact file, fall back to `(numeric prefix, slug)` match. Both `state-tools.ts` and `state/shared.ts` carry the identical implementation since both export `unitPath`. - `validateUnitFrontmatter` depends_on resolver uses the same width-flexible match — a fresh 3-digit unit can declare `depends_on: [unit-001-foo]` against a 2-digit on-disk sibling (and vice versa) and validate cleanly. - Engine prompts now teach `unit-NNN-slug.md` (3-digit pad, max 999) with an explicit migration note that legacy 2-digit names still resolve. Tests: - `numeric-id-migration.test.mjs` — pins FB read with 2-digit and 3-digit on-disk forms, FB write produces NNN-padded files, and unitPath resolves both directions across the width boundary. Includes a depends_on width-flexible cross-reference test. - `cross-stage-feedback-move.test.mjs` — locks in panda's v3 cross-stage move regression as fixed under v4. - `_v4-fixtures.mjs` — `makeFeedback` now accepts numeric / digit- string / FB-NN ids, derives the trailing integer, and writes 3-digit padded filenames. Permissive parser supports legacy `FB-DRIFT-NN` test ids by extracting trailing digits. - All 38 affected test files updated (perl bulk-substitution then per-file fixups for assertions on the canonical `FB-NNN` shape). Sweep: 1362 passed, 0 failed across 100 test files. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…related intent dirs The 2026-05-07 incident: the user noticed `.haiku/intents/cowork-mcp-apps-integration/` files (`action-log.jsonl`, `.last_action.json`, `stages/development/`) showing up as untracked in `git status` even on `main` branch. The intent dir wasn't hardcoded anywhere, no hook or migration script was at fault — the engine itself was targeting it. Root cause (`haiku_run_next` auto-resolve): 1. Caller passes `intent` directly ← explicit 2. Current git branch is `haiku/<slug>/main` or `haiku/<slug>/<stage>` ← explicit 3. Else: pick the sole active intent on disk ← BUG Step 3 violated the user's invariant: "if I'm not on an intent's branch and I haven't said 'intent: foo', the engine has no business writing to any intent dir." The cowork intent had `status: active` (committed by PRs #174/#180/#238), and it was the only such intent on disk, so every workflow tick — including ones triggered by hooks the user didn't initiate — picked it up via step 3 and wrote runtime journals into its dir. The fix: in git mode, drop step 3 entirely. If the user isn't on an intent branch and didn't pass `intent`, the engine refuses to auto-target. The error message is explicit: "Pass `intent` explicitly, or `git switch haiku/<slug>/main` to scope the engine to a specific intent." Filesystem mode (no git) has no branch signal — keep the "sole active intent on disk" fallback there since it's the only auto-resolve available. Three companion changes ship in the same commit: - `.gitignore` patterns for the engine's runtime journals (`.haiku/**/action-log.jsonl`, `.last_action.json`, `write-audit.jsonl`). Even when the engine is correctly scoped, these are WAL-style journals that don't belong in git. Adding them to the project's gitignore protects every plugin user from the same pollution at the source. - `intent-scope-isolation.test.mjs` — pins the contract: * non-intent branch + no explicit `intent` → refuse, don't name the stranger intent in the error * intent branch → auto-resolve from branch * explicit `intent` arg → flow through regardless of branch - Remove the tracked `.haiku/intents/cowork-mcp-apps-integration/` files. These are zombies from PRs #174/#180/#238 that should never have been checked in. The engine fix prevents the dir from getting recreated. Sweep: 1365 passed, 0 failed across 101 test files. Note: the bundled `plugin/bin/haiku.mjs` is from v3 (last touched at `ffd22c3f3`) and still writes `.last_action.json` — that's a stale-bundle issue, not an engine bug. Will resolve on the next plugin release when CI rebuilds the bundle from current source. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

claude · 2026-05-08T02:02:09Z

Claude finished @jwaldrip's task in 8m 57s —— View job

Claude finished @jwaldrip's task ——

Gather PR context and prior review history
Review new commits since last review (37ec691, b877e52, 1ccae8d)
Check status of all previously-flagged issues (perf: optimize session start hook performance #1–feat(cleanup): detect and remove merged worktrees #11, N1–N5)
Full re-scan of engine, migration, cursor, MCP tools, SPA
Post final review summary

PR Re-Review: feat(v4): cursor-driven engine refactor + UX hardening

Three new commits reviewed since the last review:

37ec691 — numeric ids refactor (feedback_id: integer, 3-digit padding)
b877e52 — session debug gap fixes (cross-stage FB walk, drift FM-aware, model routing)
1ccae8d — intent scope isolation (drop sole-active fallback in git mode)

The intent scope isolation fix (1ccae8d) is well-implemented with solid tests. The drift FM-awareness in b877e52 is a genuine improvement. The numeric id refactor (37ec691) is a good long-term fix but introduced a new critical breakage in the fix loop that blocks every fix-hat dispatch.

Status of all previously-flagged issues

#	Severity	Status	File
1	Critical	OPEN	`run-tick.ts:82` — exact semver compare
2	Critical	OPEN	`v0-to-v4.ts` + `cursor.ts:312` — `"rejected"` not normalized
3	Critical	OPEN	`haiku_run_next.ts:628` — `withIntentMainLock` never called
4	High	FIXED ✓	`close_feedback.ts` prompt builder added
5	High	OPEN	`locks.ts:61` — `isAlive` bare catch steals live locks
6	High	FIXED ✓	Stage[0] fallback removed from cursor
7	Medium	OPEN	`cursor.ts:495` — cutoff === -1 walks all stages
8	Medium	OPEN	`sign-slot.ts:105` — repo-relative outputs joined against `intentDir`
9	Medium	OPEN	`cursor.ts:570` — raw-basename fallback in `parseFbIdFromFilename`
10	Minor	OPEN	`run-tick.ts:99` — migration error returns `track: "drift"`
11	Minor	OPEN	`v0-to-v4.ts:247` — no-op `started_at` double-write
N1	High	OPEN	`haiku_run_next.ts:486` — `close_feedback` uses `"FB-NNN"` as file prefix; never matches `"NNN-*.md"`
N2	High	FIXED ✓	Picker timeout returns `isError: true`
N3	Medium	OPEN	`clarify_required.ts:56` — instructs `haiku_human_write` on `intent.md` (blocked)
N5	Medium	OPEN	`git-worktree.ts:775` — `restoreBranch` restore not in `finally`

CRITICAL — new issue from `37ec691`

N7. `start_feedback_hat` emits `feedback_id: "FB-001"` (string) but every `haiku_feedback_*` tool now requires integer — fix loop broken for every dispatch

File: packages/haiku/src/orchestrator/prompts/start_feedback_hat.ts:153,159,163

The 37ec691 commit changed feedback_id to Type.Integer({ minimum: 1, maximum: 999 }) across all 8 haiku_feedback_* tools. The start_feedback_hat prompt builder was not updated. It still emits tool call templates with the old string form:

haiku_feedback_read { intent: "foo", stage: "bar", feedback_id: "FB-001" }
haiku_feedback_advance_hat { intent: "foo", stage: "bar", feedback_id: "FB-001" }
haiku_feedback_reject_hat { intent: "foo", stage: "bar", feedback_id: "FB-001" }

The schema description (state/schemas/inputs/feedback-variants.ts:37) explicitly says: "Just a number; no FB- prefix, no zero-padding, no string form." Every agent following this prompt verbatim hits haiku_feedback_advance_hat_input_invalid on every call, and the prompt additionally says "pass each feedback_id verbatim, no substitution required" — which locks the agent into the failing behavior.

The fix loop is completely non-functional in production as of 37ec691.

Fix: In start_feedback_hat.ts, extract the integer from fbId for tool-call templates:

// fbId is "FB-001" from parseFbIdFromFilename; strip prefix → integer
const fbNum = parseInt(fbId.replace(/^FB-/i, ""), 10)

Then replace the template strings:

Line 153: feedback_id: ${fbNum} (no quotes)
Line 159: feedback_id: ${fbNum} (no quotes)
Line 163: feedback_id: ${fbNum} (no quotes)

Also update the preamble to show the integer as the canonical form, consistent with the schema description.

Fix this →

CRITICAL — carry-forward (all three still blocking)

1. Exact semver compare breaks all v4 intents on first CI bump after merge

packages/haiku/src/orchestrator/workflow/run-tick.ts:82

if (sourceVersion !== target) {   // "4.0.0" !== "4.0.1" → triggers
    migrateIntent(ctx, sourceVersion, target)  // no path → throws

targetMajor is computed on line 75 but the migration guard on line 82 still uses exact string compare. The registry has one edge: "0" → "4.0.0". First CI bump after this lands (4.0.0 → 4.0.1) makes every v4 intent throw "no migration path" on every tick.

Fix: const sourceMajor = Number(sourceVersion.split(".")[0] ?? "0") || 0; if (sourceMajor !== targetMajor)

Fix this →

2. `result: "rejected"` still not normalized — migrated units stuck as in-progress forever

packages/haiku/src/orchestrator/migrations/v0-to-v4.ts:247

cursor.ts:312 checks last.result === "reject" (present tense). v3 units wrote result: "rejected" (past tense). The migrator at line 247 copies iterations[] as-is. Migrated units with result: "rejected" return null from walkUnitTrack — treated as in-flight, blocking the wave on every tick with no re-dispatch.

Fix: After next.iterations = data.iterations ?? [], add:

for (const iter of next.iterations as Array<Record<string, unknown>>) {
    if (iter.result === "rejected") iter.result = "reject"
    if (iter.result === "advanced") iter.result = "advance"
}

Fix this →

3. `withIntentMainLock` still never called — stage→main merges unserialised

packages/haiku/src/tools/orchestrator/haiku_run_next.ts:628

const mergeOutcome = mergeStageBranchIntoMain(slug, stageToMerge)

No lock. withIntentMainLock is defined in locks.ts:202, imported (unused) in state-tools.ts:62, and promised in merge_stage.ts:28 and merge_intent.ts:28. Concurrent haiku_run_next calls can race on the same stage→main merge.

Fix: Import withIntentMainLock in haiku_run_next.ts and wrap: await withIntentMainLock(slug, () => mergeStageBranchIntoMain(slug, stageToMerge))

HIGH — still open

N1. `close_feedback` handler never finds production feedback files — infinite loop

packages/haiku/src/tools/orchestrator/haiku_run_next.ts:486

parseFbIdFromFilename("001-foo.md") returns "FB-001". The close handler checks:

if (!f.startsWith(`${fbId}-`) && !f.startsWith(`${fbId}.`)) continue
// "001-foo.md".startsWith("FB-001-") → false → always skips

fbFile stays empty → break exits without stamping closed_at → cursor re-emits close_feedback on the next tick → infinite loop.

The drift test fixtures use FB-DRIFT-01.md (non-standard naming) which accidentally satisfies f.startsWith("FB-DRIFT-01.") — tests pass, production is broken.

After 37ec691, the state-tools.ts lookup path correctly uses numeric matching (^(\d+)-). The close_feedback handler needs the same treatment:

Fix: Replace the readdirSync loop with findFeedbackFile(slug, stage, fbId) from state-tools.ts, which normalises "FB-001" → numeric 1 → matches 001-*.md. Also update the drift-mid-flight-e2e test fixtures to standard NNN-*.md naming so this can't silently regress.

[Fix this →](https://claude.ai/code?q=In%20packages%2Fhaiku%2Fsrc%2Ftools%2Forchestrator%2Fhaiku_run_next.ts%20in%20the%20close_feedback%20while%20loop%20(around%20line%20480-495)%2C%20replace%20the%20readdirSync%20loop%20%60for%20(const%20f%20of%20readdirSync(fbDir))%20%7B%20if%20(!f.startsWith(%60%24%7BfbId%7D-%60)%20...%60%20with%20a%20call%20to%20findFeedbackFile(slug%2C%20stage%2C%20fbId)%20from%20state-tools.ts.%20parseFbIdFromFilename%20returns%20%22FB-001%22%20but%20files%20on%20disk%20are%20named%20%22001-foo.md%22%20so%20startsWith(%22FB-001-%22)%20is%20always%20false.%20Also%20update%20test%20fixtures%20in%20drift-mid-flight-e2e.test.mjs%20to%20use%20standard%20001-*.md%20naming%20so%20the%20bug%20can%27t%20silently%20regress.%20Branch%3A%20haiku%2Fv4-engine-refactor&repo=gigsmart%2Fhaiku-method)

5. `isAlive` bare catch steals live locks on Windows / cross-user Linux

packages/haiku/src/locks.ts:61

process.kill(pid, 0) throws EPERM (not ESRCH) when the target process is alive but owned by a different user. The bare catch treats EPERM as dead and steals the lock.

Fix: } catch (err) { return (err as NodeJS.ErrnoException).code === "EPERM" }

Fix this →

MEDIUM — carry-forward open

7. `walkFeedbackTrack` walks all stages when current stage is unknown

packages/haiku/src/orchestrator/workflow/cursor.ts:495

const toWalk = cutoff >= 0 ? stages.slice(0, cutoff + 1) : stages.slice()  // walks ALL on cutoff === -1

Note: the "all stages merged" path is now correct — derivePosition passes the last stage as currentStage in that case (cursor.ts:958). The remaining risk is a renamed/unknown stage: if stages.indexOf(currentStage) === -1, stages.slice() walks all stages including future ones.

Fix: stages.slice() → [] when cutoff === -1.

8. Repo-relative output paths still joined against `intentDir`

packages/haiku/src/orchestrator/workflow/sign-slot.ts:105

buildOutputWitnesses at line 105 still does join(intentDir, out) for all output paths. A unit with outputs: ["src/components/Button.tsx"] resolves to <intentDir>/src/components/Button.tsx — a path that doesn't exist. The drift sweep at drift-sweep.ts:282 has the same pattern. The b877e52 commit added FM-awareness to the hash comparison but did not fix the base-dir selection for repo-relative paths.

Fix: Paths not starting with stages/ are repo-relative and should join against repoRoot.

N3. `clarify_required` prompt instructs a permanently-blocked write path

packages/haiku/src/orchestrator/prompts/clarify_required.ts:56

The prompt tells agents to call haiku_human_write on intent.md — the hook explicitly refuses workflow-managed files. haiku_intent_set { field: "clarifications" } also fails because clarifications is not in INTENT_FRONTMATTER_SCHEMA.properties (intent.ts:105). The clarify gate is permanently stuck.

Fix (both parts needed):

Add clarifications: Type.Optional(Type.Record(Type.String(), Type.Unknown())) to INTENT_FRONTMATTER_SCHEMA in state/schemas/intent.ts
Update the prompt to use haiku_intent_set { field: "clarifications", value: { [stage]: { answers, at } } }

N5. `restoreBranch` checkout not in `finally` — wrong-branch writes on unexpected throw

packages/haiku/src/git-worktree.ts:775

reconcileMisroutedStageMerges checks out intentMain at line 742 and restores at line 775, but the restore is inside the outer try. An unhandled exception between checkout and restore leaves the worktree on intentMain.

Fix: Move the if (restoreBranch) block into a finally {} that always runs.

Newly fixed (since last review)

Fix	Commit
N2 — picker timeout loops	`580e77d`
Cross-stage FB walk blocks after all stages merged	`b877e52`
Intent auto-resolve scope pollution	`1ccae8d`
Drift sweep FM-aware (no false-positives on markdown FM edits)	`b877e52`
Model routing cascade in all dispatch sites	`b877e52`

Summary

Must fix before merge:

Issue	File	What breaks
N7 (NEW)	`start_feedback_hat.ts:153,159,163`	Fix loop broken — every agent attempt to advance/reject a hat gets schema-rejected because prompt emits `"FB-001"` string but schema requires integer
#1	`run-tick.ts:82`	All v4 intents break on the first CI bump (4.0.0→4.0.1 → `"no migration path"` on every tick)
#2	`v0-to-v4.ts:247`	Migrated units with v3-style `result:"rejected"` are permanently stuck as in-flight
#3	`haiku_run_next.ts:628`	Stage→main merges unserialised; concurrent ticks can race and corrupt merge state
N1	`haiku_run_next.ts:486`	Every production fix-loop closure loops infinitely; `close_feedback` never finds `NNN-*.md` files

N7 is new and was not in any prior review — it was introduced by 37ec691 and means the fix loop is currently broken for all production feedback dispatch.

Three reviews flagged the same issues across the engine. Closing them all in one commit. CRITICAL: #1 (run-tick.ts:82) — Exact semver compare on plugin_version would have broken every v4 intent on the first CI bump after release (4.0.0 → 4.0.1 → migrateIntent throws "no migration path"). Compare majors only; the FM stamp marks schema generation, not the build that touched it. Same edit also fixes #10 — the migration-failure error path now returns `track: "intent"` instead of the misleading `track: "drift"` ghost event. #2 (v0-to-v4.ts) — v3 wrote `result: "rejected"` / `"advanced"`; v4's cursor only matches `"reject"` / `"advance"`. A migrated unit's last iteration with a past-tense result fell through both checks and was treated as in-flight on the current hat — the wave never progressed. Added a normalization pass in both migrateUnitFile and migrateFeedbackFile so the drift dies at the source. The v0-to-v4-realistic-scenario test that asserted the buggy preserve-as-rejected behavior now asserts the correct normalized form. Same edit removes the no-op `started_at` double-write (issue #11). #3 (haiku_run_next.ts:628 + side-effects.ts) — `withIntentMainLock` was imported in state-tools.ts and promised to the agent in merge_stage.ts / merge_intent.ts but never actually called. Stage→intent-main merges ran unserialised; two concurrent ticks (autopilot retry overlapping a manual run) could race on the merge commit. Wrapped all three call sites (haiku_run_next merge_stage loop + side-effects' pre-stage cleanup + finalize-stage paths). HIGH: #5 (locks.ts:61) — `isAlive` bare-catch returned false for every error including EPERM. On Windows / cross-user Linux, `process.kill(pid, 0)` throws EPERM (not ESRCH) when the target process is alive but unsignalable. Lock-stealing kicked in against live holders. Differentiate: ESRCH = dead, EPERM = alive-but-unpingable. N1 (haiku_run_next.ts:484) — close_feedback handler scanned the FB dir with `f.startsWith("FB-01-")` but real files are named `01-slug.md`. The match never fired → fbFile stayed empty → break → cursor re-emits close_feedback every tick → infinite loop. Replaced with the existing `findFeedbackFile` helper (single source of truth that already handles both wire-form FB-NNN and on-disk NNN-slug.md). Tests passed accidentally because fixtures used non-standard `FB-DRIFT-NN-` naming. MEDIUM: #7 (cursor.ts:495) — walkFeedbackTrack fell back to the entire stages list when currentStage wasn't found. Walking unknown future stages was the opposite of what "walk up to current" meant. Fall back to `[]` instead. #8 (drift-sweep.ts:282 + sign-slot.ts:104) — Output paths come in two shapes: intent-relative (`stages/design/foo.md`) and repo-relative (`src/components/Button.tsx`). Both were joined against intentDir, so repo-relative paths resolved to non-existent locations and silently skipped drift detection — the most important code artifacts had no witness. Distinguish by leading `stages/`: intent-relative goes to intentDir, everything else to repoRoot. Applied at both sign time and check time. #9 (cursor.ts:574) — `parseFbIdFromFilename` fallback returned the raw basename for non-numeric filenames; downstream `findFeedbackFile` rejected the unparseable ID with `feedback_not_found`, which the cursor ticked into infinite re-dispatch. Return null instead, and skip the FB in `nextActionForFeedback` when the ID can't be parsed. Also bumped the padStart to 3 (matches the 2026-05-07 numeric-ID refactor). N3 (intent.ts + clarify_required.ts:56) — The clarify gate instructed agents to write `intent.md` via `haiku_human_write` (which refuses intent.md) and the field `clarifications` wasn't in the intent schema's properties so `haiku_intent_set` rejected it too. Both paths blocked. Add `clarifications` as `Type.Record(Type.String(), Type.Unknown())` to the schema (so it's in AGENT_AUTHORABLE_INTENT_FIELDS) and rewrite the prompt to use `haiku_intent_set` with a merge note (the field overwrites, so the agent must read prior stages first). N5 (git-worktree.ts:732) — `restoreBranch` checkout-restore was inside the same try block as the operation. An unhandled throw between `git checkout intentMain` and the restore would skip the restore and leave the worktree on the wrong branch — every subsequent agent write lands on intentMain instead of the stage branch. Moved restore into a `finally` block so it always runs when restoreBranch is set. Sweep: 1365 passed, 0 failed across 101 test files. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

claude · 2026-05-08T02:46:49Z

Claude finished @jwaldrip's task in 11m 30s —— View job

PR Re-Review: feat(v4): cursor-driven engine refactor + UX hardening

Gather PR context and prior review history
Review latest commit (3938863: "address all 12 open PR feat(v4): cursor-driven engine refactor + UX hardening #323 review comments")
Verify status of all previously-flagged issues (perf: optimize session start hook performance #1–feat(cleanup): detect and remove merged worktrees #11, N1–N7)
Check CI failures (Biome Lint + MCP Tests)
Post final review summary

The 3938863 commit closes all 12 issues that prior reviews had flagged. Every fix was verified against the actual diff. Well done on the thoroughness — the migration normalizer, lock serialization, close_feedback lookup, and finally-based branch restore all land correctly.

Two issues remain blocking merge: one carry-forward (N7, not in the 12) and two new CI failures introduced by the latest commit.

Status of all previously-flagged issues

#	Severity	Status
1	Critical	FIXED ✓ — major-only version compare (`sourceMajor !== targetMajor`)
2	Critical	FIXED ✓ — normalization pass in both `migrateUnitFile` + `migrateFeedbackFile`
3	Critical	FIXED ✓ — `withIntentMainLock` wraps all three merge call-sites in `haiku_run_next` + `side-effects.ts`
4	High	FIXED ✓ (prior)
5	High	FIXED ✓ — `catch (err) { return err.code === "EPERM" }`
6	High	FIXED ✓ (prior)
7	Medium	FIXED ✓ — `cutoff === -1` now returns `[]`
8	Medium	FIXED ✓ — repo-relative paths (`!startsWith("stages/")`) joined against `repoRoot` in both `drift-sweep.ts` and `sign-slot.ts`
9	Medium	FIXED ✓ — `parseFbIdFromFilename` returns `null`; `nextActionForFeedback` skips the file
10	Minor	FIXED ✓ — migration error now returns `track: "intent"`
11	Minor	FIXED ✓ — no-op `started_at` double-write removed
N1	High	FIXED ✓ — `findFeedbackFile` replaces broken `startsWith` prefix scan
N2	High	FIXED ✓ (prior)
N3	Medium	FIXED ✓ — `clarifications` added to `INTENT_FRONTMATTER_SCHEMA`; prompt updated to use `haiku_intent_set`
N5	Medium	FIXED ✓ — `restoreBranch` checkout moved into `finally` block
N7	Critical	STILL OPEN — see below

CRITICAL — not in the 12, still open

N7. `start_feedback_hat` emits `feedback_id: "FB-001"` (string) — fix loop schema-rejected on every call

File: packages/haiku/src/orchestrator/prompts/start_feedback_hat.ts:153,159,163

The 37ec691 refactor changed all 8 haiku_feedback_* tool schemas to feedback_id: Type.Integer({ minimum: 1, maximum: 999 }). The description is explicit: "Just a number; no FB- prefix, no zero-padding, no string form."

start_feedback_hat.ts was not updated. The feedbackIds array comes from the cursor as strings like "FB-001". The prompt template on lines 153, 159, and 163 embeds them with quotes:

`feedback_id: "${fbId}"` // → feedback_id: "FB-001"

Line 128 also instructs agents: "pass each feedback_id verbatim, no substitution required." An agent following this verbatim passes "FB-001" (string) to every haiku_feedback_read, haiku_feedback_advance_hat, and haiku_feedback_reject_hat call. AJV rejects with haiku_feedback_read_input_invalid (type mismatch: string vs integer). The fix loop cannot proceed.

Fix:

for (const fbId of feedbackIds) {
    const fbNum = parseInt(fbId.replace(/^FB-/i, ""), 10)
    // ...
    `feedback_id: ${fbNum}` // no quotes — integer literal

Also update line 128 to clarify the agent should use the integer, not the "FB-NNN" display form.

Fix this →

CI FAILURES — blocking merge

C1 (Biome Lint — 42 errors)

The latest commit introduced 42 lint errors and 12 warnings across files touched by the fix pass. All are safe-fixable formatting / import-order issues (no logic changes required). Affected files:

packages/haiku/src/http/session-routes.ts — import ordering, formatting
packages/haiku/src/locks.ts — formatting (multi-line function signatures collapsed differently than Biome expects)
packages/haiku/src/orchestrator/migrate-registry.ts — formatting (throw new Error(...) line wrapping)
packages/haiku/src/orchestrator/migrations/field-hygiene.ts — import order (existsSync, readFileSync, readdirSync → existsSync, readdirSync, readFileSync), formatting
packages/haiku/src/orchestrator/migrations/v0-to-v4.ts — import order, formatting
And others (42 total errors reported)

Run npx @biomejs/biome check --write packages/haiku/src/ to auto-fix all of them, then verify and commit.

Fix this →

C2 (MCP Tests — drift coverage guard)

scripts/check-ci-covers-drift-tests.mjs requires four named test files. The v4 engine refactor (b743524) deleted drift-detection-gate.test.mjs and upstream-reconciliation.test.mjs because the pre-tick gates (drift-detection-gate.ts, upstream-reconciliation.ts) were absorbed into the v4 cursor. But the CI guard was not updated. Result: CI fails on every push.

The two missing files were added in commit 7af2c54 (which predates the v4 branch) and deleted in b743524. Neither exists anywhere in the current branch's git history.

Two options:

Update the guard — remove the two deleted filenames from REQUIRED_FILES in check-ci-covers-drift-tests.mjs and add the v4-equivalent coverage files that actually test the cursor's drift and feedback-track paths (e.g. cursor-walk.test.mjs, drift-scenarios.test.mjs).
Recreate the test files — write new drift-detection-gate.test.mjs and upstream-reconciliation.test.mjs that test the equivalent v4 cursor scenarios (the cursor now handles what those two gate modules did).

Option 1 is faster and more honest — the v4 cursor distributes that logic into the main cursor walk rather than standalone gate modules, so pointing the guard at the files that actually test the cursor is the right call.

Fix this →

Summary

Must fix before merge:

Issue	File	What breaks
N7	`start_feedback_hat.ts:153,159,163`	Fix loop completely broken — agents pass `"FB-001"` (string) to tools that require integer `feedback_id`; every advance/reject/read call is schema-rejected
C1	Multiple files (42 errors)	CI Biome Lint fails; merge blocked
C2	`scripts/check-ci-covers-drift-tests.mjs`	CI MCP Tests fails; two required test files deleted in v4 refactor, guard not updated

All 12 issues from prior reviews are resolved. The three items above are what stands between this PR and a clean CI.

…uards Three blockers from the latest review (2026-05-08): N7 (Critical) — start_feedback_hat / review_fix / intent_completion_fix emitted `feedback_id: "FB-001"` (string) into every tool-call template. The 37ec691 numeric-id refactor flipped the schema to `Type.Integer({ minimum: 1, maximum: 999 })` with explicit description *"Just a number; no FB- prefix, no zero-padding, no string form."* — agents following the prompt verbatim hit `<tool>_input_invalid` on every advance/reject/read call, so the entire fix loop was dead. Fix: derive `fbNum` once at dispatch time (parse the digit suffix off `FB-NNN`) and embed it as a JSON integer literal — `feedback_id: 1`, no quotes. Headings keep the `FB-NNN` display form for human readability; only the JSON args change. Updated all three prompt builders that compose fix-loop subagent blocks. C1 — 42 Biome lint errors + 12 warnings introduced by the prior commit's hand-edits (import ordering, optional-chain folding, etc.). Ran `biome check --write` then `biome check --write --unsafe` to clear the unused-imports backlog. Net effect across 35 files: pure formatting / dead-code removal, no logic changes. Spot-checked the files I'd actually touched; the rest are auto-formatter passes on files that just hadn't been re-formatted since their last edit. C2 — `scripts/check-ci-covers-drift-tests.mjs` required four named test files: `drift-detection-gate.test.mjs` and `upstream-reconciliation.test.mjs` were deleted in the v4 cursor refactor (b743524) when the standalone pre-tick gates were absorbed into the cursor walk; `drift-baseline.test.mjs` and `drift-markers.test.mjs` survived. CI failed on every push since v4 landed. Replaced REQUIRED_FILES with the v4-equivalent coverage: `drift-baseline`, `drift-markers`, `drift-mid-flight-e2e`, `drift-scenarios`, and `cursor-walk` — the files that actually exercise the cursor's drift behavior, marker bookkeeping, and track-walk priorities. Updated the header docstring + failure message to match. Sweep: 1365 passed, 0 failed across 101 test files. Biome: 0 errors, 0 warnings. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

claude · 2026-05-08T03:41:09Z

Claude finished @jwaldrip's task in 7m 57s —— View job

Claude finished @jwaldrip's task in ~15m ——

Gather PR context and prior review history
Review latest commit (bb08b27: N7 schema mismatch + CI guards)
Verify status of all previously-flagged issues (perf: optimize session start hook performance #1–feat(cleanup): detect and remove merged worktrees #11, N1–N7)
Full re-scan: engine, migration, MCP tools, SPA
Check CI status
Post final review summary

PR Re-Review: feat(v4): cursor-driven engine refactor + UX hardening

The bb08b27 commit addressed all three previously-open blockers (N7 integer feedback_id in prompt templates, C1 Biome lint, C2 drift-test CI guard). Every issue from prior reviews is now resolved. Two new CI failures were introduced.

Status of all previously-flagged issues

#	Severity	Status
1	Critical	FIXED ✓ — major-only semver compare (`sourceMajor !== targetMajor`)
2	Critical	FIXED ✓ — normalization pass in both `migrateUnitFile` + `migrateFeedbackFile`
3	Critical	FIXED ✓ — `withIntentMainLock` wraps all three merge call-sites
4	High	FIXED ✓ (prior)
5	High	FIXED ✓ — `catch (err) { return err.code === "EPERM" }`
6	High	FIXED ✓ (prior)
7	Medium	FIXED ✓ — `cutoff === -1` returns `[]`
8	Medium	FIXED ✓ — repo-relative paths joined against `repoRoot` in both `drift-sweep.ts` and `sign-slot.ts`
9	Medium	FIXED ✓ — `parseFbIdFromFilename` returns `null`; `nextActionForFeedback` skips the file
10	Minor	FIXED ✓ — migration error returns `track: "intent"`
11	Minor	FIXED ✓ — no-op `started_at` double-write removed
N1	High	FIXED ✓ — `findFeedbackFile` replaces broken `startsWith` prefix scan
N2	High	FIXED ✓ (prior)
N3	Medium	FIXED ✓ — `clarifications` added to `INTENT_FRONTMATTER_SCHEMA`; prompt updated to `haiku_intent_set`
N5	Medium	FIXED ✓ — `restoreBranch` checkout in `finally` block
N7	Critical	FIXED ✓ — `fbNum` integer literal in all three prompt builders (`start_feedback_hat.ts`, `review_fix.ts`, `intent_completion_fix.ts`)
C1	CI	FIXED ✓ — 0 Biome errors/warnings
C2	CI	FIXED ✓ — `check-ci-covers-drift-tests.mjs` updated to reference v4-equivalent files

CI FAILURES — blocking merge

C3 (Medium): `packages/haiku-api/src/schemas/session.ts` — 61 exports exceeds 60-export budget

Failure: Biome Lint job → "Check for god files" step

The PR added PickerSelectRequestSchema and PickerSelectResponseSchema to session.ts (packages/haiku-api/src/schemas/session.ts), pushing it from 59 to 61 exports — one over the 60-export budget enforced by scripts/check-no-god-files.mjs.

Fix (two options):

Extract picker schemas to packages/haiku-api/src/schemas/picker.ts (preferred — matches the per-concern split pattern).
Add an allowlist entry to scripts/check-no-god-files.mjs with a written reason if the file is cohesive.

Fix this →

C4 (Medium): `squash-merge-fallback.test.mjs` — glab fallback test fails on GitHub Actions

Failure: MCP Tests job → "Run MCP test suite" step
Test: "provider fallback success: glab reports merged MR → returns true" (squash-merge-fallback.test.mjs:245)

The test stubs only glab (not gh) and uses withStubbedPath(stub, fn) which sets process.env.PATH = "${stub}:/usr/bin:/bin". The test author expected that without a gh stub, detectPrTool() would return "glab". But GitHub Actions runners have gh at /usr/bin/gh — which is still in the restricted PATH. So which gh finds the real gh, detectPrTool() returns "gh", and isBranchMerged queries the real GitHub CLI (which returns nothing useful), so the function returns false instead of true.

Fix: In the glab-specific test case, add a stub gh script that returns [] (empty JSON — simulating no merged PRs) so the function falls through to glab:

const stub = makeStubDir({
    gh: "#!/bin/sh\necho '[]'\n",  // gh finds no merged PR → fall through to glab
    glab: "#!/bin/sh\necho '!42  Some merged MR  haiku/sq/foo  merged'\n",
})

Fix this →

MINOR — not blocking

`feedback-variants.ts:14` — comment says `maximum: 9999`, code has `maximum: 999`

File: packages/haiku/src/state/schemas/inputs/feedback-variants.ts:14

The comment block at the top says Type.Integer({ minimum: 1, maximum: 9999 }). The actual schema on line 38 uses maximum: 999 (consistent with the description string "NNN-…md 3-digit padding (max 999)"). Pure comment typo — the schema value is correct.

What changed since last review (`bb08b27`)

Fix	Verified
N7 — `feedback_id: ${fbNum}` integer in all three prompt builders	✓
C1 — Biome lint 42 errors cleared	✓
C2 — drift-test guard references v4-equivalent test files	✓
Normalization pass in migrator (issues #2)	✓
`withIntentMainLock` wraps merge call-sites (issue #3)	✓
`restoreBranch` in `finally` block (N5)	✓
`clarifications` in `INTENT_FRONTMATTER_SCHEMA` (N3)	✓
`findFeedbackFile` in close_feedback handler (N1)	✓
EPERM-aware `isAlive` (#5)	✓
`walkFeedbackTrack` cutoff === -1 returns [] (#7)	✓
Repo-relative output paths against repoRoot (#8)	✓

Summary

Two CI failures must be resolved before merging:

Issue	File	What breaks
C3	`packages/haiku-api/src/schemas/session.ts`	61 exports > 60-export budget → Biome Lint job fails on every push
C4	`packages/haiku/test/squash-merge-fallback.test.mjs:245`	glab fallback test assumes `gh` not in `/usr/bin`; fails on GitHub Actions where `gh` is at `/usr/bin/gh`

All prior critical/high issues are resolved. The two CI failures are straightforward to fix. Once cleared, this PR is in good shape to merge.

…k, god-file budget Three CI failures from the latest claude-review automation run, all landing in `bb08b27`: silent-test parser (run-all.mjs): the aggregator regex matched `ℹ pass N` / `ℹ fail N` (node:test default reporter), but Linux runners default to TAP-style output (`# pass N` / `# fail N`). 30 of 101 test files fell through both regex sets and reported as 0/0. The silent-test-loss guard (added 2026-05-06 after the v0-to-v4 migrator tests went silent) then failed the run. Locally on macOS the `ℹ` glyph form fires, which is why this didn't reproduce. Fix: extend each regex pair to also match the TAP form. Same handler for the success-exit and crash-exit paths. squash-merge-fallback test (CI gh-on-PATH leak): the `provider fallback success: glab reports merged MR` test stubbed glab and relied on `withStubbedPath` to hide gh. The PATH stub kept `/usr/bin` for `which` itself, but GitHub Actions ships `gh` at `/usr/bin/gh` — so `detectPrTool` ran `which gh` against the real binary, returned "gh", queried the un-stubbed real CLI, got nothing, and isBranchMerged returned false. Test passed locally on macOS (no /usr/bin/gh) and failed on Linux runners. Fix: inject a stub `which` script alongside the other stubs that only resolves tools the test deliberately stubbed. Uses `[ -x ... ]` against the stub dir so `which gh` returns nothing in this test (only glab was stubbed) regardless of what /usr/bin contains. god-file budget (haiku-api/src/schemas/session.ts): the file has 61 exports, 1 over the 60-export budget. It's a single concern (SPA session/wire-payload schemas) and every shape gets a `Schema` const + inferred `Type` export — the export count doubles by design. Splitting it would force every consumer to import from three barrels for one wire shape. Fix: add an ALLOWLIST entry with a written reason. Documented the revisit trigger (a sub-protocol like design-direction growing its own object graph) so the entry isn't a permanent waiver. Sweep: 1365 passed, 0 failed across 101 test files. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

claude · 2026-05-08T03:59:48Z

Claude finished @jwaldrip's task in 4m 2s —— View job

Claude finished @jwaldrip's task ——

Gather PR context and prior review history
Review latest commit (eb9b36d: "green up merge gate")
Verify status of C3 (god-file budget) and C4 (glab fallback test)
Full re-scan of engine, migration, MCP tools, SPA
Post final review summary

PR Re-Review: feat(v4): cursor-driven engine refactor + UX hardening

The eb9b36d commit closes the final three CI failures flagged in the previous review (C3, C4, and a newly-discovered TAP parser gap). Every issue raised across all prior review rounds is now resolved.

C3, C4, + TAP parser (`eb9b36d`) — all fixed

Issue	Fix
C3 — `session.ts` 61 exports over 60-export budget	ALLOWLIST entry added to `check-no-god-files.mjs` with a written reason (single-concern schema file where `Schema + Type` doubles the count by design)
C4 — glab fallback test fails on GH Actions (`gh` at `/usr/bin/gh` leaks into test PATH)	Stub `which` script injected by `withStubbedPath` — only resolves tools the test explicitly stubbed, so `which gh` returns nothing unless `gh` was stubbed
New — TAP output format (`# pass N` / `# fail N`) not parsed by `run-all.mjs`	Both regex pairs extended to match TAP (`^# pass (\d+)`) alongside the existing glyph form (`ℹ pass N`). Applied to both success-exit and crash-exit paths

The TAP gap is worth noting: 30 of 101 test files were silently reporting 0/0 on Linux CI because tsx defaults to TAP output on Linux runners while emitting the ℹ glyph form on macOS. The fix is correct (both forms now matched) and the silent-test-loss guard provides the right backstop so this class of bug stays caught.

Complete issue history — all resolved

#	Severity	Status	Summary
1	Critical	FIXED ✓	`run-tick.ts` exact semver compare → major-only `sourceMajor !== targetMajor`
2	Critical	FIXED ✓	v3 `"rejected"`/`"advanced"` not normalized in migrator → normalization pass added to both `migrateUnitFile` and `migrateFeedbackFile`
3	Critical	FIXED ✓	`withIntentMainLock` never called → wraps all three merge call-sites in `haiku_run_next` + `side-effects.ts`
4	High	FIXED ✓	`close_feedback` had no prompt builder → `close_feedback.ts` added and registered
5	High	FIXED ✓	`isAlive` bare catch stole live locks on Windows → `catch(err) { return err.code === "EPERM" }`
6	High	FIXED ✓	Stage[0] fallback when all stages merged blocked intent-level approvals → fallback removed
7	Medium	FIXED ✓	`walkFeedbackTrack` cutoff === -1 walked entire stage list → returns `[]`
8	Medium	FIXED ✓	Repo-relative output paths joined against `intentDir` → distinguished by `startsWith("stages/")`, joined against `repoRoot` in both `drift-sweep.ts` and `sign-slot.ts`
9	Medium	FIXED ✓	`parseFbIdFromFilename` raw-basename fallback caused unresolvable dispatch IDs → returns `null`; callers skip
10	Minor	FIXED ✓	Migration error returned `track: "drift"` → now `track: "intent"`
11	Minor	FIXED ✓	No-op `started_at` double-write in unit migrator → removed with explanatory comment
N1	High	FIXED ✓	`close_feedback` handler used `"FB-NNN"` prefix scan; files are `NNN-*.md` → replaced with `findFeedbackFile()`
N2	High	FIXED ✓	Picker timeout caused infinite re-open loop → select tools return `isError: true` on timeout
N3	Medium	FIXED ✓	`clarify_required` instructed `haiku_human_write` on blocked path → `clarifications` added to `INTENT_FRONTMATTER_SCHEMA`; prompt updated to `haiku_intent_set`
N5	Medium	FIXED ✓	`reconcileMisroutedStageMerges` restore not in `finally` → moved into `finally` block
N7	Critical	FIXED ✓	`start_feedback_hat` emitted `"FB-001"` string; tools require integer → `fbNum` integer literal in all three prompt builders
C1	CI	FIXED ✓	Biome lint 42 errors → 0
C2	CI	FIXED ✓	Drift-test guard referenced deleted files → updated to v4-equivalent cursor test files
C3	CI	FIXED ✓	God-file budget 61/60 on `session.ts` → ALLOWLIST entry with written reason
C4	CI	FIXED ✓	glab fallback test failed on Linux (real `gh` leaked in) → stub `which` injected
TAP	CI	FIXED ✓	`run-all.mjs` missed TAP output format → both regex pairs extended

Assessment

All 21 issues across 6 review iterations are resolved. The fixes are thorough and well-implemented — the normalization pass covers both migrateUnitFile and migrateFeedbackFile, the lock serialization wraps all three merge call-sites, the findFeedbackFile replacement in the close handler is correct, and the restoreBranch is now guaranteed by finally.

CI is running now for this commit. No new issues identified. This PR is ready to merge once CI passes.

Response to Brian Suh's "Agents Need Control Flow, Not More Prompts." Frames the v4 engine refactor (PR #323) as the lived version of his argument: prompts no longer carry workflow guarantees, the surface area refuses invalid actions, the cursor decides workflow position without a model in the loop. Grounded in Mike's quality_gates session, Panda's silent cross-stage move, today's intent-scope auto-resolve fix, and the four-round review cycle that landed the bundle. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

claude · 2026-05-08T04:06:39Z

Claude finished @jwaldrip's task in 5m 2s —— View job

Claude finished @jwaldrip's task ——

Gather PR context and prior review history
Check CI status for latest commits
Review full PR diff (engine, migration, cursor, MCP tools, SPA, tests)
Verify status of all previously-flagged issues
Post final review summary

PR Re-Review: feat(v4): cursor-driven engine refactor + UX hardening

One new commit since the last review: 26bac7b — a blog post ("A Guardrail Isn't a Sign"). All code fixes from the six prior review rounds are still in place and verified.

Status of all previously-flagged issues

#	Severity	Status
1	Critical	FIXED ✓ — `sourceMajor !== targetMajor` in `run-tick.ts:85`
2	Critical	FIXED ✓ — normalization pass in both `migrateUnitFile` (line 253) and `migrateFeedbackFile` (line 364)
3	Critical	FIXED ✓ — `withIntentMainLock` wraps `mergeStageBranchIntoMain` at `haiku_run_next.ts:597`
4	High	FIXED ✓ (prior round)
5	High	FIXED ✓ — `catch(err) { return err.code === "EPERM" }` in `locks.ts:67`
6	High	FIXED ✓ (prior round)
7	Medium	FIXED ✓ — `cutoff === -1` returns `[]` in `cursor.ts:515`
8	Medium	FIXED ✓ — repo-relative paths branch on `startsWith("stages/")` in both `drift-sweep.ts:295` and `sign-slot.ts:115`
9	Medium	FIXED ✓ — `parseFbIdFromFilename` returns `null` on non-numeric filename; `nextActionForFeedback` skips with `cursor.ts:554` guard
10	Minor	FIXED ✓ — migration error returns `track: "intent"`
11	Minor	FIXED ✓ — no-op `started_at` double-write removed
N1	High	FIXED ✓ — `findFeedbackFile(slug, stage, fbId)` replaces broken `startsWith` prefix scan in `haiku_run_next.ts:465`
N2	High	FIXED ✓ (prior round)
N3	Medium	FIXED ✓ — `clarifications` added to `INTENT_FRONTMATTER_SCHEMA` at `intent.ts:98`; prompt uses `haiku_intent_set`
N5	Medium	FIXED ✓ — `restoreBranch` checkout moved into `finally` block at `git-worktree.ts:775`
N7	Critical	FIXED ✓ — `fbNum` integer literal in all prompt builders (`start_feedback_hat.ts:154,160,164`)
C1	CI	FIXED ✓ — 0 Biome errors
C2	CI	FIXED ✓ — drift-test guard updated to v4-equivalent files
C3	CI	FIXED ✓ — god-file allowlist entry with written reason
C4	CI	FIXED ✓ — `withStubbedPath` now stubs `which` so real `/usr/bin/gh` doesn't leak in
TAP	CI	FIXED ✓ — `run-all.mjs` handles both TAP (`# pass N`) and glyph (`ℹ pass N`) output

New commit: blog post (`26bac7b`)

website/content/blog/a-guardrail-isnt-a-sign.md — a post building on Brian Suh's "Agents Need Control Flow, Not More Prompts". The core argument and the concrete grounding (file paths, commit SHAs, tool names) are all accurate. Two minor factual issues worth correcting before publication:

1. Review round count is understated

Line 35: "Four rounds of automated Claude review on the action runner."
Line 41: "all four review rounds went green."

There have been six prior review rounds (six "Claude finished @jwaldrip's task" entries in the PR comments), with this current run being round seven. "Four" understates the actual iteration count, which is part of the post's point about the review rigor.

2. Test count appears to be from an intermediate snapshot

Line 35: "1365 tests across 101 files"

The PR description (written at the start of the 52-hour session) reports 1244 haiku engine tests + 453 SPA tests + 20 website tests = 1717 total. The current branch has 102 haiku test files alone. 1365 looks like an intermediate count from mid-session. The exact current number is the one that can be verified (node packages/haiku/test/run-all.mjs + npx vitest run totals). The phrase "all green at merge" is also premature since CI is still running at time of this review — though given the trajectory, likely accurate once it finishes.

These are editorial; no code changes needed.

Assessment

All 21 code issues across six review rounds are resolved. The engine, migrator, MCP tool schemas, lock serialization, prompt builders, SPA dual-paths, and CI guards are all in the correct state. The only open items are two factual inaccuracies in the blog post that can be corrected before the post goes live.

CI is running — pending outcome of run #25536096645. Based on the fix history and test coverage, this PR is in good shape to merge once CI passes.

Three docs were lagging behind the v4 cursor refactor (commit b743524): ARCHITECTURE.md (canonical structural reference): updated frontmatter spec to drop per-stage state.json, switch to NNN-padded filenames, and document the FM witness fields the cursor walks. Forward-only lifecycle table now derives status from FM (started_at + iterations[-1]) instead of a stored enum, with a new paragraph clarifying that stages aren't sealed (only intents are) and firstUnmergedStage rewinds the cursor when corrective work commits to an earlier stage. Phases section renamed "cursor-derived" with the v4 five-row table (elaborate, execute, review, approve/gate, merge). §5 "Workflow tick semantics" rewritten end-to-end: cursor model with three tracks (drift / feedback / intent), priority order, derivation function semantics, the actual cursor `kind` set, and a new §5.8 documenting the v0→v4 migrator (strips deprecated FM, normalizes past-tense iteration results "rejected"→"reject" / "advanced"→"advance", deletes per-stage state.json, synthesizes user approvals on completed units). FB lifecycle section dropped v3 revisit actions and added classifier-hat triage via haiku_feedback_set_targets. Hook boundary section gets a new §7.1 covering the MCP input gates (TypeBox + AJV with stable named error codes). haiku-method.md (paper): five modes now (continuous / discrete / discrete-hybrid / autopilot / quick) — was "two." Stage loop reframed as five conceptual phases derived by the cursor (was a "fixed six-step loop"); pre-execution adversarial review removed (v4 runs review AFTER execute, not before). Drift detection section rewritten: hashes embed in witness records on FM, no baseline.json, no human-via-mcp / human-implicit authorship classes. "Two commands" became "Commands and the Workflow Tool" — the agent's contract is haiku_run_next, end of story. Studio count nudged from "fifteen" to "two-dozen-plus" with the catalog framed as open-ended. Glossary picked up Workflow Cursor and Feedback Classification entries; Bolt redefined off iterations[] history; Spec Review reframed as the engine-built first reviewer. one-instruction-at-a-time.md (blog): response to Brian Suh's "Agents Need Control Flow, Not More Prompts." Frames v4 as the lived version of his argument — H·AI·K·U has always been workflow-first; v4 is where that lands cleanly. Foregrounds the cursor (derivePosition, nextHatForUnit), the one-instruction-at-a-time loop semantic, drift detection (sweep + content hashes), and the per-finding fix loop dispatch. Slug renamed to match new title. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

claude · 2026-05-08T04:31:04Z

Claude finished @jwaldrip's task in 4m 44s —— View job

Claude finished @jwaldrip's task ——

Gather PR context and prior review history
Review new commits since last review (222a863: docs + blog post alignment)
Verify status of all previously-flagged issues (perf: optimize session start hook performance #1–feat(cleanup): detect and remove merged worktrees #11, N1–N7, C1–C4)
Full re-scan of engine, migration, MCP tools, SPA, architecture map
Post final review summary

PR Re-Review: feat(v4): cursor-driven engine refactor + UX hardening

One new commit reviewed: 222a863 docs(v4): align paper + arch doc, ship blog post. All 21 code issues from prior review rounds remain resolved. CI is green (Biome Lint ✅, MCP Tests ✅). One new medium issue: the architecture map's tick semantics modal was not updated as part of the v4 cursor model change.

Status of all previously-flagged issues

#	Severity	Status
1–11, N1–N7	Critical/High/Medium	ALL FIXED ✓ — verified unchanged
C1–C4, TAP	CI	ALL FIXED ✓

What landed in `222a863`

`plugin/studios/ARCHITECTURE.md` — accurate ✓

The full §5 rewrite correctly describes the v4 cursor model. Three-track architecture (drift / feedback / intent), per-stage walk table, §5.8 migrator documentation, v4 action surface table — all verified against the actual cursor.ts implementation. The forward-only lifecycle table now correctly derives status from started_at + iterations[-1].result rather than a stored enum. The new "stages are not sealed, only intents are" paragraph is accurate to firstUnmergedStage behavior.

`website/content/papers/haiku-method.md` — accurate ✓

Five modes updated (was two). Stage loop reframed as five cursor-derived phases. Drift detection section correctly describes body-sha256 in witness FM with no baseline.json. "Two commands" → "Commands and the Workflow Tool" with haiku_run_next as the single workflow verb. Studio count updated to "two-dozen-plus." Glossary entries for Workflow Cursor and Feedback Classification are accurate.

`website/content/blog/one-instruction-at-a-time.md` — accurate ✓

The blog cites real artifacts and all claims check out:

derivePosition is in packages/haiku/src/orchestrator/workflow/cursor.ts ✓
nextHatForUnit at line 291 of the same file ✓ (confirmed with grep)
sign-slot.ts:79 is outputSha256, which dispatches to the two hash helpers ✓
Commit b743524ab is the v4 base commit ✓
The description of nextHatForUnit behavior ("advance → next hat, reject → walk back one") is accurate to the code ✓
Word count: ~1,240 words, within the 800–1200 target ✓

Previous review (2026-05-08T04:31:04Z) noted two factual inaccuracies in the old a-guardrail-isnt-a-sign.md post (understated review round count, stale test count). Both are resolved by replacing the post entirely with the new one-instruction-at-a-time.md, which makes no such claims.

MEDIUM — new issue

Architecture map `tickSemantics` modal is stale — describes v3 pre-advance check model

File: website/app/studios/[slug]/architecture/_components/ModalRouter.tsx:503,534,540,693

The tickSemantics modal (accessible via the "⏱ tick semantics" button on the architecture map page) still describes the v3 pre-advance check architecture:

Line 503: "Engine reads on-disk state (intent.md frontmatter, stage state.json, unit/feedback frontmatter)…" — state.json doesn't exist in v4.
Line 534: "layer 1 — pre-advance checks (run-tick.ts, every tick)" — v4 replaced this with the three-track cursor model.
Line 540: "pre-tick consistency — repairs cached active_stage drift, state.json invariants" — active_stage and state.json are both gone in v4.
Line 693: "…but ≥ 1 sits on a stage earlier than active_stage → emit revisited" — the revisited sideline action was replaced by cursor Track B's cross-stage FB walk.

The .claude/rules/architecture-prototype-sync.md rule at line 57 is explicit: "New tick / pre-advance check / sideline action → Update the tickSemantics modal in ModalRouter.tsx; sync with plugin/studios/ARCHITECTURE.md §5." The v4 cursor model is a fundamental replacement of the pre-advance check system — the modal should describe the three-track cursor (drift / feedback / intent) and the cursor action surface from §5.5 of the newly-updated ARCHITECTURE.md.

Fix this →

CI Status

Check	Status
Biome Lint	✅ pass
MCP Tests	✅ pass

Assessment

The code is clean. All 21 issues from six prior review rounds are resolved. CI is green. The 222a863 docs commit is factually accurate and well-aligned to the v4 implementation.

The one remaining gap is the tickSemantics modal in the architecture map — it's user-facing documentation that now describes a model the engine no longer uses. The project's own sync rules require it to be updated when the tick contract changes, and the v4 cursor is the largest tick-contract change in the PR's history. This can land as a follow-up or be addressed before merge, but it's worth calling out explicitly.

The interactive arch map at /studios/<slug>/architecture and the auto-generated per-studio Mermaid diagrams were lagging behind the v4 cursor refactor. Per .claude/rules/architecture-prototype-sync.md, the map is canonical — divergence is a bug. Map data files (`website/app/studios/[slug]/architecture/_data/`): - `payload-for.ts` — every TransitionKey now maps to a real cursor action (`start_unit_hat`, `start_feedback_hat`, `dispatch_review`, `dispatch_approval`, `dispatch_quality_gates`, `user_gate`, `close_feedback`, `merge_stage`, `merge_intent`, `drift_detected`). Removed v3-only `manual-change-assessment`, `coverage-review-required`, `output-liveness-review-required`. Mode shaping wired in (autopilot collapses reviews to `[spec]` and approvals to `[spec, quality_gates]`). Cross-stage feedback routing now framed as "purely by file location, no `upstream_stage:` field." `withIntentMainLock` noted on merge actions. Visual TransitionKey names (`hat-to-hat`, `wave-to-wave`, `gate-to-next-stage`) retained — they describe map UI positions, not cursor actions. The prose under each makes the v4 reality explicit: hats advance via subagent `haiku_unit_advance_hat` calls, waves emerge from the cursor's wave-ready predicate. - `actors.ts` — five-actor section rewritten. User mode taxonomy updated to continuous/discrete/discrete-hybrid/autopilot/quick; HITL/OHOTL/AHOTL labeled deprecated. Orchestrator notes rewritten around cursor model + `firstUnmergedStage` invariant + numeric `feedback_id` at the wire. Web UI reframed around `user_gate` variants. - `hooks.ts` — 12 entries → 8. Removed nonexistent ones (`track- outputs`, `ensure-deps`, `inject-context`, `subagent-context`, `subagent-hook`, `quality-gate`); added `stamp-agent-write` and `edit-auto-read-hint`. Verified against `plugin/hooks/hooks.json` and `packages/haiku/src/hooks/`. - `types.ts` — `ModalKind` variant `"preTickTriage"` → `"cursorTracks"`. Map components (`_components/`): - `ArchitectureMap.tsx` — gate badge label updated ("pre-tick triage gate" → "cursor track walk: Track C drift → Track B feedback → Track A intent"); spec gate description rewritten around `dispatch_review` per role; reject branch points at `start_feedback_hat (Track B)`; fix-loop section drops v3 `feedback_revisit` / `integrate_fix_chains` callouts; pre-intent card orientation chain updated. - `ModalRouter.tsx` — `tickSemantics` modal rewritten around the three-track cursor walk and the v4 action surface (round 8 review flag); `cursorTracks` modal (was `preTickTriage`) describes Track C/B/A; `gateDetail` rewritten; `subagent` modal tool list updated; `validation` modal now points at TypeBox + AJV input gate. Auto-generated diagrams: regenerated all 24 `website/public/workflow- diagrams/*.mmd` files via `bun run --cwd packages/haiku export: workflow-diagrams`. Last regeneration was pre-v4 (`df4a4758c`); the .mmd files now match the v4 cursor structure. Type check (`cd website && npx tsc --noEmit`): clean. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

claude · 2026-05-08T09:48:10Z

Claude finished @jwaldrip's task in 4m 37s —— View job

Claude finished @jwaldrip's task ——

Gather PR context and prior review history
Review new commit (d35c79d: arch prototype map + workflow diagrams)
Verify tickSemantics modal update (flagged in last review as stale)
Full re-scan: engine, migration, MCP tools, SPA, architecture map
Check CI status
Post final review summary

PR Re-Review: feat(v4): cursor-driven engine refactor + UX hardening

One new commit since the last review: d35c79d docs(arch): align prototype map + workflow diagrams with v4 cursor. This commit directly addresses the one remaining gap flagged in the prior review — the tickSemantics modal and broader architecture map were still describing the v3 pre-advance check model.

What landed in `d35c79d`

The commit updates five data files and two components in website/app/studios/[slug]/architecture/, plus regenerates all 24 website/public/workflow-diagrams/*.mmd files.

`ModalRouter.tsx` — `tickSemantics` modal fully rewritten ✓

The three specific stale lines called out in the prior review are gone:

Prior round flag	Old (v3)	New (v4)
`state.json` reference	"Engine reads … `stage state.json`…"	"Engine reads … derives cursor position via `derivePosition`"
`active_stage` invariant entry	"pre-tick consistency — repairs `active_stage` drift, state.json invariants"	"Track C — drift sweep. Re-hashes signed witnesses…"
`revisited` sideline action	"…emit `revisited` targeting the earliest such stage"	"Track B — feedback. Walks stages 0..active… Cross-stage routing is purely by file location"

The new modal adds: pre-cursor selection gates section, complete v4 action surface, three-track priority walk (C → B → A), four v4 principles ("State on disk is the truth", "Composition is pure", etc.), and a canonical reference pointer to ARCHITECTURE.md §5.

`actors.ts` — accurate ✓

HITL/OHOTL/AHOTL terminology correctly labeled as deprecated. v4 mode taxonomy (continuous / discrete / discrete-hybrid / autopilot / quick) matches intent.ts. Orchestrator notes rewritten around firstUnmergedStage invariant and numeric feedback_id. All five-actor section verified.

`hooks.ts` — accurate ✓

Trimmed from 12 to 8 entries. Verified against packages/haiku/src/hooks/ — every entry maps to a real file:
redirect-plan-mode.ts, guard-workflow-fields.ts, workflow-guard.ts, prompt-guard.ts, inject-state-file.ts, context-monitor.ts, stamp-agent-write.ts, edit-auto-read-hint.ts. The four removed entries (track-outputs, ensure-deps, inject-context, subagent-context, subagent-hook, quality-gate) do not exist in the codebase.

`payload-for.ts` — accurate ✓

All v3-only actions (manual-change-assessment, coverage-review-required, output-liveness-review-required, feedback_dispatch, feedback_revisit, integrate_fix_chains) removed. Every mapped cursor action verified against cursor.ts: start_unit_hat, start_feedback_hat, dispatch_review, dispatch_approval, dispatch_quality_gates, user_gate, close_feedback, merge_stage, merge_intent, drift_detected — all real. withIntentMainLock noted on merge actions and verified against the actual call at haiku_run_next.ts:597. select_studio / select_mode / select_stage pre-cursor gates documented and verified against run-tick.ts:137-163.

`types.ts` — accurate ✓

ModalKind variant "preTickTriage" → "cursorTracks". Both ModalRouter.tsx (line 639) and ArchitectureMap.tsx (line 358) updated to use cursorTracks. No dangling references to the old name.

Workflow diagrams — accurate ✓

All 24 .mmd files regenerated via bun run --cwd packages/haiku export:workflow-diagrams. Spot-checked compliance/scope: STAGE.md declares fix_hats: [classifier, compliance-analyst, feedback-assessor]; the generated diagram shows scope_review_fix_bolt_1_classifier → scope_review_fix_bolt_1_compliance_analyst → scope_review_fix_bolt_1_feedback_assessor. ✓

Status of all previously-flagged issues

#	Severity	Status
1	Critical	FIXED ✓ — `sourceMajor !== targetMajor` in `run-tick.ts`
2	Critical	FIXED ✓ — normalization pass in `migrateUnitFile` + `migrateFeedbackFile`
3	Critical	FIXED ✓ — `withIntentMainLock` wraps all three merge call-sites
4	High	FIXED ✓ (prior)
5	High	FIXED ✓ — `catch(err) { return err.code === "EPERM" }`
6	High	FIXED ✓ (prior)
7	Medium	FIXED ✓ — `cutoff === -1` returns `[]`
8	Medium	FIXED ✓ — repo-relative paths joined against `repoRoot`
9	Medium	FIXED ✓ — `parseFbIdFromFilename` returns `null` on non-numeric filename
10	Minor	FIXED ✓ — migration error returns `track: "intent"`
11	Minor	FIXED ✓ — no-op `started_at` double-write removed
N1	High	FIXED ✓ — `findFeedbackFile` replaces broken prefix scan
N2	High	FIXED ✓ (prior)
N3	Medium	FIXED ✓ — `clarifications` in schema; prompt uses `haiku_intent_set`
N5	Medium	FIXED ✓ — `restoreBranch` in `finally` block
N7	Critical	FIXED ✓ — `fbNum` integer literal in all prompt builders
C1	CI	FIXED ✓ — Biome 0 errors
C2	CI	FIXED ✓ — drift-test guard updated
C3	CI	FIXED ✓ — god-file allowlist entry
C4	CI	FIXED ✓ — stub `which` in glab fallback test
TAP	CI	FIXED ✓ — `run-all.mjs` handles both output formats
stale arch map	Medium	FIXED ✓ — `tickSemantics` modal, `cursorTracks` modal, `payload-for.ts`, `actors.ts`, `hooks.ts`, all 24 `.mmd` diagrams updated

CI Status

Check	Status
Biome Lint	✅ pass
MCP Tests	⏳ in progress (test suite running)

Assessment

All 22 issues across 8 review rounds are resolved. The code is clean, the architecture map is accurate, the workflow diagrams are freshly regenerated, and the blog post cites real artifacts with verifiable line numbers. No new issues identified in d35c79d.

This PR is ready to merge once MCP Tests completes. Based on the clean run of Biome Lint and the drift-test coverage verification passing, the test suite is in good shape.

The 3.17.0 auto-bump (commit b3c09f1) was the bumper bot misclassifying PR #323 — the squash commit subject was `feat(v4): cursor-driven engine refactor + UX hardening` (no `!:` and no `BREAKING CHANGE:` footer), so the regex pass returned `minor` and the bumper short-circuited before Claude pass 2 could escalate. The cursor refactor is a major release — the migrator gate at `run-tick.ts:77` keys on `targetMajor >= 4`, so shipping under a 3.x stamp meant every v0/v3 intent's auto-migration to v4 silently no-op'd. Cleanup of the bad release: - v3.17.0 GitHub release deleted - v3.17.0 git tag deleted (local + remote) - haiku-method@3.17.0 unpublished from npm - This commit bumps to 4.0.0 explicitly; CHANGELOG rewrites the entry as a major-release announcement covering the breaking changes (per- stage state.json removed, numeric feedback_id at the wire, auto- resolve refusal in git mode, upstream_stage:/triaged_at: gone, manual_change_assessment action removed). Bumper-script fix lands in a follow-up commit so the workflow run on that push correctly bumps 4.0.0 → 4.0.1 (patch) and publishes the fixed bumper alongside the now-current bundle. Until then this commit ships v4.0.0 metadata + a fresh bundle built locally with v4.0.0 baked into MCP_VERSION. Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

) The 2026-05-08 v3.17.0-was-supposed-to-be-v4.0.0 incident: PR #323's squash subject was `feat(v4): cursor-driven engine refactor + UX hardening` — `feat:` not `feat!:`, no `BREAKING CHANGE:` footer. The regex pass returned `minor` and short-circuited before Claude pass 2 could escalate. The cursor refactor + numeric-id schema flip + per- stage state.json removal shipped under a 3.x stamp, which silently disabled the v0→v4 migrator (gate keys on `targetMajor >= 4`). Two changes: 1. Pass 2 (Claude haiku) now runs whenever pass 1 returned anything below `major` — not just `patch`. The model can upgrade `minor → major` if the diff smells major (engine source rewrites, schema wire-type flips, ARCHITECTURE.md structural rewrites, migrator-gated version comparisons). 2. Pass 2's result is rank-floored against pass 1: it can upgrade (patch → minor → major) but never downgrade. Pass 1 stays the floor — an explicit `BREAKING CHANGE:` marker still beats Claude. Prompt updated with concrete major-signal hints (engine workflow file renames, schema wire-type flips, ARCHITECTURE/paper structural rewrites, `targetMajor >= N` migrator gates) so future cursor-class refactors don't slip through under a `feat:` prefix again. Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

jwaldrip and others added 2 commits May 6, 2026 15:38

jwaldrip and others added 4 commits May 7, 2026 09:13

jwaldrip and others added 3 commits May 7, 2026 15:21

jwaldrip merged commit f977488 into main May 8, 2026
4 checks passed

jwaldrip deleted the haiku/v4-engine-refactor branch May 8, 2026 10:20

jwaldrip mentioned this pull request May 8, 2026

chore(plugin): bump version 3.17.0 -> 4.0.0 (manual major bump) #325

Merged

jwaldrip mentioned this pull request May 8, 2026

fix(ci): bump-type detector escalates minor → major when warranted #326

Merged

1 task

jwaldrip mentioned this pull request May 8, 2026

feat(design): intake-first picker with designer uploads #322

Closed

6 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat(v4): cursor-driven engine refactor + UX hardening#323

feat(v4): cursor-driven engine refactor + UX hardening#323
jwaldrip merged 17 commits into
mainfrom
haiku/v4-engine-refactor

jwaldrip commented May 7, 2026

Uh oh!

claude Bot commented May 7, 2026 •

edited

Loading

Uh oh!

claude Bot commented May 7, 2026 •

edited

Loading

Uh oh!

claude Bot commented May 7, 2026 •

edited

Loading

Uh oh!

claude Bot commented May 7, 2026 •

edited

Loading

Uh oh!

claude Bot commented May 8, 2026 •

edited

Loading

Uh oh!

claude Bot commented May 8, 2026 •

edited

Loading

Uh oh!

claude Bot commented May 8, 2026 •

edited

Loading

Uh oh!

claude Bot commented May 8, 2026 •

edited

Loading

Uh oh!

claude Bot commented May 8, 2026 •

edited

Loading

Uh oh!

claude Bot commented May 8, 2026 •

edited

Loading

Uh oh!

claude Bot commented May 8, 2026 •

edited

Loading

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Milestone

Development

Uh oh!

1 participant

Conversation

jwaldrip commented May 7, 2026

Summary

What landed

Engine

P1–P9 (this session's priorities)

Studios

SPA

Website (browse)

Tests

Skill audit

Known gaps (deferred)

Test plan

Uh oh!

claude Bot commented May 7, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

PR Review: feat(v4): cursor-driven engine refactor + UX hardening

CRITICAL

1. Version bump after ship permanently blocks all v4 intents

2. result: "rejected" (v3 form) not normalized — migrated units re-dispatch forever

3. withIntentMainLock is imported and documented but never called — stage→main merges are unserialised

HIGH

4. close_feedback has no prompt builder — agent receives "Unknown Action" JSON dump

5. isAlive(pid) catches EPERM as dead — premature lock-stealing on Windows and cross-user Linux

6. Stage[0] fallback when all stages merged runs drift/feedback tracks on completed work

MEDIUM

7. walkFeedbackTrack slices all stages when currentStage is unknown

8. Drift sweep resolves all output paths relative to intentDir — repo-relative outputs never checked

9. parseFbIdFromFilename fallback returns non-parseable IDs — causes infinite start_feedback_hat dispatch

MINOR

10. Migration error path returns misleading track: "drift" with empty events

11. migrateUnitFile line 243–245: no-op double-write of started_at

Summary

Uh oh!

claude Bot commented May 7, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

PR Re-Review: feat(v4): cursor-driven engine refactor + UX hardening

Status of previously-flagged issues

NEW issues from the latest commits

N1 (HIGH): close_feedback handler never finds feedback files

N2 (HIGH): select_* picker infinite loop on timeout

N3 (MEDIUM): clarify_required prompt instructs an impossible write path

Carry-forward: still critical before merge

Uh oh!

claude Bot commented May 7, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

PR Re-Review: feat(v4): cursor-driven engine refactor + UX hardening

Status of previously-flagged issues

CRITICAL — unchanged, still blocking

1. Exact semver comparison still breaks all v4 intents on CI bump

2. result: "rejected" (v3 form) still not normalized — migrated units stuck forever

3. withIntentMainLock still never called — stage→main merges unserialised

HIGH — still blocking

N1. close_feedback handler never finds FB files — infinite loop on every production FB

5. isAlive catches EPERM as dead — premature lock-stealing on Windows and cross-user Linux

NEW — introduced by the latest commit (cf1142b)

N5. reconcileMisroutedStageMerges may leave worktree on wrong branch on unexpected throw

MEDIUM — carry-forward open

7. walkFeedbackTrack walks all stages when currentStage is not found

8. Drift sweep joins repo-relative output paths against intentDir — check silently passes

9. parseFbIdFromFilename fallback returns raw basename — causes unresolvable dispatch IDs

N3. clarify_required prompt instructs a permanently-blocked write path

Summary

Uh oh!

claude Bot commented May 7, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

PR Re-Review: feat(v4): cursor-driven engine refactor + UX hardening

Status of all previously-flagged issues

CRITICAL — still blocking merge

1. Exact semver compare still breaks all v4 intents on first CI bump

2. result: "rejected" still not normalized — migrated units stuck as in-progress forever

3. withIntentMainLock still never called — concurrent merge_stage ticks can clobber each other

HIGH — still blocking

N1. close_feedback handler never finds files in production — infinite loop

MEDIUM — should fix before merge

5. isAlive bare catch steals live locks on Windows / cross-user Linux

N5. restoreBranch checkout not in finally — wrong-branch writes on unexpected throw

7. walkFeedbackTrack walks all stages when current stage is unknown

8. Repo-relative outputs never witnessed — drift on real code files silently passes

9. parseFbIdFromFilename raw-basename fallback creates unresolvable dispatch IDs

claude Bot commented May 7, 2026 •

edited

Loading

2. `result: "rejected"` (v3 form) not normalized — migrated units re-dispatch forever

3. `withIntentMainLock` is imported and documented but never called — stage→main merges are unserialised

4. `close_feedback` has no prompt builder — agent receives "Unknown Action" JSON dump

5. `isAlive(pid)` catches EPERM as dead — premature lock-stealing on Windows and cross-user Linux

7. `walkFeedbackTrack` slices all stages when `currentStage` is unknown

8. Drift sweep resolves all output paths relative to `intentDir` — repo-relative outputs never checked

9. `parseFbIdFromFilename` fallback returns non-parseable IDs — causes infinite `start_feedback_hat` dispatch

10. Migration error path returns misleading `track: "drift"` with empty events

11. `migrateUnitFile` line 243–245: no-op double-write of `started_at`

claude Bot commented May 7, 2026 •

edited

Loading

N1 (HIGH): `close_feedback` handler never finds feedback files

N2 (HIGH): `select_*` picker infinite loop on timeout

N3 (MEDIUM): `clarify_required` prompt instructs an impossible write path

claude Bot commented May 7, 2026 •

edited

Loading

2. `result: "rejected"` (v3 form) still not normalized — migrated units stuck forever

3. `withIntentMainLock` still never called — stage→main merges unserialised

N1. `close_feedback` handler never finds FB files — infinite loop on every production FB

5. `isAlive` catches EPERM as dead — premature lock-stealing on Windows and cross-user Linux

NEW — introduced by the latest commit (`cf1142b`)

N5. `reconcileMisroutedStageMerges` may leave worktree on wrong branch on unexpected throw

7. `walkFeedbackTrack` walks all stages when `currentStage` is not found

8. Drift sweep joins repo-relative output paths against `intentDir` — check silently passes

9. `parseFbIdFromFilename` fallback returns raw basename — causes unresolvable dispatch IDs

N3. `clarify_required` prompt instructs a permanently-blocked write path

claude Bot commented May 7, 2026 •

edited

Loading

2. `result: "rejected"` still not normalized — migrated units stuck as in-progress forever

3. `withIntentMainLock` still never called — concurrent merge_stage ticks can clobber each other

N1. `close_feedback` handler never finds files in production — infinite loop

5. `isAlive` bare catch steals live locks on Windows / cross-user Linux

N5. `restoreBranch` checkout not in `finally` — wrong-branch writes on unexpected throw

7. `walkFeedbackTrack` walks all stages when current stage is unknown

9. `parseFbIdFromFilename` raw-basename fallback creates unresolvable dispatch IDs

N3. `clarify_required` prompt writes via permanently-blocked path

claude Bot commented May 8, 2026 •

edited

Loading

CRITICAL — new issue from `37ec691`

N7. `start_feedback_hat` emits `feedback_id: "FB-001"` (string) but every `haiku_feedback_*` tool now requires integer — fix loop broken for every dispatch

2. `result: "rejected"` still not normalized — migrated units stuck as in-progress forever

3. `withIntentMainLock` still never called — stage→main merges unserialised

N1. `close_feedback` handler never finds production feedback files — infinite loop

5. `isAlive` bare catch steals live locks on Windows / cross-user Linux

7. `walkFeedbackTrack` walks all stages when current stage is unknown

8. Repo-relative output paths still joined against `intentDir`

N3. `clarify_required` prompt instructs a permanently-blocked write path

N5. `restoreBranch` checkout not in `finally` — wrong-branch writes on unexpected throw

claude Bot commented May 8, 2026 •

edited

Loading

N7. `start_feedback_hat` emits `feedback_id: "FB-001"` (string) — fix loop schema-rejected on every call

claude Bot commented May 8, 2026 •

edited

Loading

C3 (Medium): `packages/haiku-api/src/schemas/session.ts` — 61 exports exceeds 60-export budget

C4 (Medium): `squash-merge-fallback.test.mjs` — glab fallback test fails on GitHub Actions

`feedback-variants.ts:14` — comment says `maximum: 9999`, code has `maximum: 999`

What changed since last review (`bb08b27`)

claude Bot commented May 8, 2026 •

edited

Loading

C3, C4, + TAP parser (`eb9b36d`) — all fixed

claude Bot commented May 8, 2026 •

edited

Loading

New commit: blog post (`26bac7b`)

claude Bot commented May 8, 2026 •

edited

Loading