fix(workflows): include exitCode and exitSignal in step events#531
fix(workflows): include exitCode and exitSignal in step events#531khaliqgant merged 3 commits intomainfrom
Conversation
step:completed and step:failed events now propagate the agent's exit code and signal so consumers can inspect process-level outcomes. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
khaliqgant
left a comment
There was a problem hiding this comment.
Review: fix(workflows): include exitCode and exitSignal in step events
Overall this is a clean, well-structured change. The SpawnResult interface is a good abstraction, the event types are extended correctly, and the Python SDK types are in sync. A few issues to address:
1. Unused reviewResult variable (bug/lint)
At the reviewer spawnAndWait call (~line 2920 on the PR branch), reviewOutput was previously assigned directly:
reviewOutput = await this.spawnAndWait(reviewerDef, reviewStep, ...)Now it's:
const reviewResult = await this.spawnAndWait(reviewerDef, reviewStep, ...)But reviewResult is never read. This will trigger an no-unused-vars lint/TS error. Functionally it's fine since reviewOutput is already built up via the onChunk callback — but either:
- Remove the assignment:
await this.spawnAndWait(...)(discard result), or - Add
reviewOutput = reviewResult.output;after for consistency
2. execNonInteractive doesn't propagate exit info
execNonInteractive returns { output } without exitCode or exitSignal, even though the child process close event handler has access to the exit code and could capture signal. This means non-interactive agents will never have exit info in their step events, which partially undermines the fix.
The close handler already has code; it could also receive signal (Node's close event passes (code, signal)). Suggest threading both through to the SpawnResult:
child.on('close', (code, signal) => {
// ... existing logic ...
resolve({ stdout, exitCode: code, exitSignal: signal });
});Then unwrap accordingly in the outer function.
3. StepExecutor interface still returns string
The StepExecutor.executeAgentStep interface still returns Promise<string>, and the runner uses typeof spawnResult === 'string' to bridge the gap. This works but is a leaky abstraction — consider either:
- Updating
StepExecutorto returnPromise<string | SpawnResult>(documents the dual return explicitly), or - Updating it to return
Promise<SpawnResult>if executor implementations are internal
This is minor and could be a follow-up.
What looks good
SpawnResultinterface is clean and well-placed- The
typeof spawnResult === 'string'compatibility bridge for executor paths is pragmatic - Python SDK types are correctly updated with
exit_codeandexit_signal markStepFailedsignature change is clean with the optionalexitInfoparameter- Event type extensions are backward-compatible (optional fields)
- The supervised path correctly unwraps
.outputfrom both owner and specialist results
🤖 Generated with Claude Code
Remove unused reviewResult variable from spawnAndWait call and thread exitCode/exitSignal through execNonInteractive's close handler so non-interactive agents get exit info in step events. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Reset lastExitCode/lastExitSignal at the start of each retry loop iteration to prevent stale values from leaking across attempts. - Introduce SpawnExitError class that carries exitCode/exitSignal from failed subprocess spawns in execNonInteractive, so step:failed events include the actual exit code instead of always being undefined. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
There was a problem hiding this comment.
🟡 Supervised (dedicated-owner) path never populates lastExitCode/lastExitSignal on success
When usesDedicatedOwner is true (line 2390), the code calls executeSupervisedAgentStep which returns { specialistOutput, ownerOutput, ownerElapsed } — without any exit info. The lastExitCode and lastExitSignal variables were reset to undefined at runner.ts:2315-2316 but are never assigned on the success path through this branch. Consequently, the step:completed event emitted at runner.ts:2456 always carries exitCode: undefined, exitSignal: undefined for supervised steps, even though both the specialist and owner spawns (via spawnAndWait) do return SpawnResult objects with exit info at runner.ts:2689 and runner.ts:2693. This means the feature (propagating exit info to events) is only implemented for the non-dedicated-owner branch (else at line 2400), leaving an inconsistency in event data.
(Refers to lines 2390-2399)
Prompt for agents
In packages/sdk/src/workflows/runner.ts, inside executeAgentStep, the `usesDedicatedOwner` branch (around line 2390-2399) calls executeSupervisedAgentStep which returns { specialistOutput, ownerOutput, ownerElapsed }. The exit info from the specialist/owner spawns is never propagated back, so lastExitCode/lastExitSignal remain undefined. To fix: either extend executeSupervisedAgentStep's return type to include exitCode/exitSignal (from the specialist SpawnResult or the owner SpawnResult, depending on which is more semantically relevant), and then assign lastExitCode/lastExitSignal from that result; or capture the specialist's exit info since the specialist is the actual worker performing the task.
Was this helpful? React with 👍 or 👎 to provide feedback.
There was a problem hiding this comment.
Valid gap — executeSupervisedAgentStep returns {specialistOutput, ownerOutput, ownerElapsed} without exit info. The supervised path is new (from #511) and exit code propagation wasn't wired through yet. This is a known limitation noted in the real e2e results — it only affects the owner+specialist pattern, not the common single-agent path. Will address in a follow-up.
Remove unused reviewResult variable from spawnAndWait call and thread exitCode/exitSignal through execNonInteractive's close handler so non-interactive agents get exit info in step events. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Summary
exitCodeandexitSignalfields tostep:completedandstep:failedworkflow eventsspawnAndWaitresolves, readsagent.exitCodeandagent.exitSignalfrom the PTY agent handle and propagates them through the event systemmarkStepFailedto accept and forward exit info for failed stepsStepCompletedEventandStepFailedEventtypes to matchCloses #499
Test plan
step:completedevents containexitCodewhen an agent exits normallystep:failedevents containexitCode/exitSignalwhen an agent exits abnormallyGenerated with Claude Code