fix: WorkflowBuilder preset field + reviewer double-booking#566
fix: WorkflowBuilder preset field + reviewer double-booking#566khaliqgant merged 11 commits intomainfrom
Conversation
Two fixes: 1. WorkflowBuilder.agent() was not mapping options.preset to the AgentDefinition, so preset: 'worker' had no effect via the builder API. Added preset to AgentOptions interface and the agent() method. This means resolveAgentDef can now see the preset and correctly default interactive to false for worker/reviewer/analyst presets. 2. resolveAutoReviewAgent independently assigned the same agent as reviewer for multiple concurrent steps. When worker-a reviewed both task-b and task-c simultaneously, one review could starve and hit the safety backstop timeout. Now tracks active reviewers in a Set and prefers unassigned agents, falling back to busy ones only when no alternative exists. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The builder's step() method and StepOptions interface only supported
agent steps. Deterministic (shell command) steps had to be defined
in YAML. Now the builder accepts:
.step('check', {
type: 'deterministic',
command: 'npm test',
captureOutput: true,
failOnError: true,
dependsOn: ['build'],
})
Also relaxes the toConfig() validation to allow workflows with only
deterministic steps (no agents required).
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
….0.2 - Added preset field to Agent Definition section - Added Deterministic Steps subsection with builder API example - Added hub-pattern gotcha to Common Mistakes table - Bumped writing-agent-relay-workflows version to 1.0.2 in prpm.json Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Addresses Devin review feedback: when multiple concurrent steps resolve to the same reviewer agent, a Set incorrectly removes tracking on the first delete even while another step is still using that reviewer. A Map<string, number> with increment/decrement correctly handles concurrent usage of the same reviewer name. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…-owned agents For DAG/pipeline patterns, disable auto-hardening (owner+specialist+reviewer spawning) since it causes agent explosion. Auto-hardening is only appropriate for hub patterns (fan-out, hub-spoke, etc.) where cross-agent supervision is expected. Also fix self-owned agents being stuck in exit-only completion mode. The evidenceRole was always set to 'owner' in the self-owned path, triggering preserveIdleSupervisor and preventing idle-as-complete. Since self-owned agents have no dedicated owner, they should use 'specialist' role which races exit vs idle for faster completion. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
When an agent goes idle (30s silence) and has an output_contains verification, check if the verification token is present before treating idle as completion. If not yet present, continue waiting instead of prematurely releasing the agent. This prevents agents that need time to process (e.g., reading channel inbox, computing results) from being released before they complete their work. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
1. Restore evidenceRole to 'owner' for interactive self-owned agents (fixes signal attribution) while using preserveOnIdle: false for non-hub patterns to prevent exit-only mode blocking DAG agents. 2. Defer reviewer resolution to just before the review gate so activeReviewers is populated correctly for concurrent steps. 3. Track elapsed time in idle verification loop so timeoutMs is respected across iterations instead of restarting full timeout. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Only force exit-only mode (preserveOnIdle) for lead-like agents in hub-spoke patterns. Workers/spokes should use idle-as-complete so they don't hang waiting for process exit. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
… guard The idle-as-complete loop used a simple `includes()` check for output_contains verification. This caused false positives when the verification token appeared in the task text (from the prompt instruction) — the loop found the task-injected occurrence and treated idle as complete, but runVerification requires a second occurrence from the agent's actual output. Also: only resolve auto-reviewer when usesDedicatedOwner is true, preventing spurious review gates for self-owned agents. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The broker fires agent_idle only once per idle transition. When verification fails on idle, the loop's continue would hang because waitForIdle won't resolve again without new output. Now waits a 15s grace period for the agent to produce output and re-idle. If the agent exits during grace, returns that result. If still idle after grace, releases the agent and lets upstream verification handle the completion decision. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
| if (graceResult.kind === 'exit') { | ||
| return graceResult.result as 'exited' | 'timeout' | 'released'; | ||
| } |
There was a problem hiding this comment.
🔴 Grace period exit-timeout check doesn't distinguish real exits from timeouts, causing non-deterministic behavior
When an agent goes idle but output_contains verification hasn't passed, the code enters a 15-second grace period at packages/sdk/src/workflows/runner.ts:5367. Both waitForExit and waitForIdle are raced with the same 15s timeout. If the agent is permanently idle, both resolve with 'timeout' near-simultaneously. Which promise wins Promise.race is non-deterministic:
- If exit wins:
graceResult = { kind: 'exit', result: 'timeout' }→ theif (graceResult.kind === 'exit')check at line 5375 matches → returns'timeout'without releasing the agent. UpstreamspawnAndWaitthen treats this as a step timeout (packages/sdk/src/workflows/runner.ts:5146-5174), running verification and potentially throwing a timeout error. - If idle wins:
graceResult = { kind: 'idle', result: 'timeout' }→ falls through to line 5380-5383 → agent is released, returns'released'. Upstream proceeds normally with output collection.
The same scenario (permanently idle agent, grace period expired, verification not passed) produces different outcomes depending on which promise resolves first. The comment at line 5378 ("Grace period timed out — agent is permanently idle") indicates the intended behavior is always the release path, but the overly broad graceResult.kind === 'exit' check intercepts exit timeouts along with real exits.
Fix: filter out timeout results from the exit check
if (graceResult.kind === 'exit' && graceResult.result !== 'timeout') {
return graceResult.result as 'exited' | 'released';
}| if (graceResult.kind === 'exit') { | |
| return graceResult.result as 'exited' | 'timeout' | 'released'; | |
| } | |
| if (graceResult.kind === 'exit' && graceResult.result !== 'timeout') { | |
| return graceResult.result as 'exited' | 'released'; | |
| } |
Was this helpful? React with 👍 or 👎 to provide feedback.
| const idleGraceSecs = 15; | ||
| const graceResult = await Promise.race([ | ||
| agent.waitForExit(idleGraceSecs * 1000).then((r) => ({ kind: 'exit' as const, result: r })), | ||
| agent.waitForIdle(idleGraceSecs * 1000).then((r) => ({ kind: 'idle' as const, result: r })), |
There was a problem hiding this comment.
🟡 Idle verification grace period ignores remaining step timeout, potentially exceeding it by up to 15 seconds
At packages/sdk/src/workflows/runner.ts:5366-5369, the grace period uses a hardcoded idleGraceSecs = 15 without accounting for the remaining step timeout. The outer loop correctly computes remaining from timeoutMs at line 5334 and passes it to the main waitForExit/waitForIdle race. But once idle fires and verification fails, the grace period races with idleGraceSecs * 1000 (15000ms) regardless of how much timeout is left.
For example, if only 2 seconds of timeout remained when idle was detected, the main race resolves within 2s, but the grace period then waits up to 15s — exceeding the step timeout by 13 seconds. The timeout check at the top of the while loop (remaining <= 0) only runs on the next iteration, after the grace period has already blocked.
| const idleGraceSecs = 15; | |
| const graceResult = await Promise.race([ | |
| agent.waitForExit(idleGraceSecs * 1000).then((r) => ({ kind: 'exit' as const, result: r })), | |
| agent.waitForIdle(idleGraceSecs * 1000).then((r) => ({ kind: 'idle' as const, result: r })), | |
| const idleGraceSecs = 15; | |
| const elapsedSinceLoopStart = Date.now() - idleLoopStart; | |
| const graceMs = timeoutMs != null | |
| ? Math.min(idleGraceSecs * 1000, Math.max(0, timeoutMs - elapsedSinceLoopStart)) | |
| : idleGraceSecs * 1000; | |
| const graceResult = await Promise.race([ | |
| agent.waitForExit(graceMs).then((r) => ({ kind: 'exit' as const, result: r })), | |
| agent.waitForIdle(graceMs).then((r) => ({ kind: 'idle' as const, result: r })), | |
| ]); |
Was this helpful? React with 👍 or 👎 to provide feedback.
| if (usesAutoHardening && usesDedicatedOwner && !reviewDef) { | ||
| reviewDef = this.resolveAutoReviewAgent(ownerDef, agentMap); | ||
| supervised.reviewer = reviewDef; | ||
| } |
There was a problem hiding this comment.
🔴 JIT reviewer resolution drops review for self-owned steps in hub patterns
The deferred reviewer resolution at line 3448 requires usesDedicatedOwner in addition to usesAutoHardening, but the old eager resolution only required usesAutoHardening. This means hub-pattern steps where no dedicated lead/coordinator agent was found (i.e., ownerDef.name === specialistDef.name, so usesDedicatedOwner is false) no longer get a reviewer resolved. Previously, self-owned steps in hub patterns still received auto-review because resolveAutoReviewAgent was called unconditionally when usesAutoHardening was true (packages/sdk/src/workflows/runner.ts:3172). Now the review gate at line 3453 (if (usesOwnerFlow && reviewDef)) silently skips because reviewDef remains undefined — the JIT block at line 3448 never fires.
Example scenario where review is lost
A hub-spoke workflow with two agents where neither has an owner-ish name/role:
workflow('test')
.pattern('hub-spoke')
.agent('alpha', { cli: 'claude', role: 'Engineer' })
.agent('beta', { cli: 'claude', role: 'Tester' })
.step('build', { agent: 'alpha', task: 'Build feature' })resolveAutoStepOwner finds no dedicated owner → ownerDef === specialistDef → usesDedicatedOwner = false. Old code: reviewer resolved and review runs. New code: reviewer never resolved, review silently skipped.
| if (usesAutoHardening && usesDedicatedOwner && !reviewDef) { | |
| reviewDef = this.resolveAutoReviewAgent(ownerDef, agentMap); | |
| supervised.reviewer = reviewDef; | |
| } | |
| if (usesAutoHardening && !reviewDef) { | |
| reviewDef = this.resolveAutoReviewAgent(ownerDef, agentMap); | |
| supervised.reviewer = reviewDef; | |
| } |
Was this helpful? React with 👍 or 👎 to provide feedback.
1. Restore evidenceRole to 'owner' for interactive self-owned agents (fixes signal attribution) while using preserveOnIdle: false for non-hub patterns to prevent exit-only mode blocking DAG agents. 2. Defer reviewer resolution to just before the review gate so activeReviewers is populated correctly for concurrent steps. 3. Track elapsed time in idle verification loop so timeoutMs is respected across iterations instead of restarting full timeout. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Summary
presetfield:WorkflowBuilder.agent()never mappedoptions.presetto theAgentDefinition. TheAgentOptionsinterface was missing thepresetfield entirely. This meantpreset: 'worker'via the builder API had no effect —resolveAgentDefnever saw the preset, sointeractive: falsewas never applied, and agents were spawned as interactive PTY with the full owner-supervisor-reviewer pipeline.resolveAutoReviewAgentindependently assigned the same agent as reviewer for multiple concurrent steps with no coordination. When e.g.worker-awas assigned to review bothtask-bandtask-csimultaneously, one review could starve and hit the safety backstop timeout (observed in fan-out workflows).Changes
preset?: AgentPresettoAgentOptionsinterface and theagent()method inbuilder.tsactiveReviewersSet toWorkflowRunnerto track agents currently assigned as reviewersresolveAutoReviewAgentto prefer agents not currently reviewing, falling back to busy agents only when no alternative existsactiveReviewersin the finally block after review completes and in the run cleanupTest plan
preset: 'worker'via builder API should run agents as non-interactive (no PTY, no owner/reviewer pipeline)🤖 Generated with Claude Code