fix(acp): drain message events before returning end_turn#25422
Closed
truenorth-lj wants to merge 1 commit intoanomalyco:devfrom
Closed
fix(acp): drain message events before returning end_turn#25422truenorth-lj wants to merge 1 commit intoanomalyco:devfrom
truenorth-lj wants to merge 1 commit intoanomalyco:devfrom
Conversation
`Agent.prompt()` returns `stopReason: "end_turn"` as soon as
`sdk.session.prompt()` resolves, but `message.part.delta` events for the
final assistant message text are still queued in the SDK event stream at
that moment. They get processed by `runEventSubscription` and forwarded
to ACP as `agent_message_chunk` frames AFTER the RPC reply has already
been sent — a protocol violation visible to ACP clients as text
appearing post-end_turn.
Cause: two independent async paths share the same ACP wire.
- Path A — event subscription (`runEventSubscription`): consumes
`sdk.global.event()` and forwards `message.part.delta` as
`agent_message_chunk` via `connection.sessionUpdate(...)`.
- Path B — prompt RPC: `await sdk.session.prompt(...)` resolves when
the LLM finishes, then immediately returns `end_turn`.
Path B can return before Path A drains the trailing deltas. Order on the
wire is then: ... earlier chunks ... → end_turn reply → trailing chunk.
Fix: in `prompt()`, after `sdk.session.prompt()` resolves, await the
`message.updated` event for the response message id (i.e. `info.time.completed`
set). Because `runEventSubscription` processes events sequentially via
`for await` and awaits each `handleEvent`, the `message.updated`
(completed) event for a message is necessarily processed AFTER all
prior `message.part.delta` events for the same message — so waiting on
it guarantees every chunk has already been forwarded.
A 5s timeout fallback prevents deadlock if the upstream completion
event is never observed.
Repro:
Send a streaming prompt via ACP. Inspect the wire (DevTools → WS
Messages). Observe that the final `agent_message_chunk` (the
agent's last text delta) arrives 5–50ms AFTER the RPC reply with
`stopReason: end_turn` and matching `id`.
Affects every ACP client that gates UI / input on end_turn (e.g.
disables streaming indicator, re-enables send button) — they snap to
"done" prematurely while text is still being appended.
Contributor
|
This PR doesn't fully meet our contributing guidelines and PR template. What needs to be fixed:
Please edit this PR description to address the above within 2 hours, or it will be automatically closed. If you believe this was flagged incorrectly, please let a maintainer know. |
6 tasks
Contributor
|
This pull request has been automatically closed because it was not updated to meet our contributing guidelines within the 2-hour window. Feel free to open a new pull request that follows our guidelines. |
6 tasks
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Fixes #25421.
Summary
Agent.prompt()returnsstopReason: "end_turn"as soon assdk.session.prompt()resolves, butmessage.part.deltaevents for the assistant's final text are still queued in the SDK event stream at that moment — they get processed byrunEventSubscriptionand forwarded to ACP asagent_message_chunkframes AFTER the RPC reply has already been sent. See the linked issue for a wire trace and protocol-level discussion.Fix
In
Agent.prompt(), afterawait sdk.session.prompt(...)resolves, awaitmessage.updatedfor the response message id (i.e.info.time.completedset) before returningend_turn.Why this is sufficient:
runEventSubscriptionconsumessdk.global.event()via sequentialfor await, awaiting eachhandleEventcall.message.updated(withtime.completedset) is the SDK's "this assistant message is fully written" signal, fired AFTER allmessage.part.deltaevents for the same message.message.updatedhas been processed byhandleEvent, every prior delta for that message has already been awaited throughconnection.sessionUpdate(...)— i.e. on the ACP wire.A 5-second timeout fallback prevents deadlock if the upstream completion event is dropped or the message never reaches a completed state.
Applied to both
prompt()branches that have a response message id (the cmd-less branch atagent.ts:1471and the slash-command branch at:1497). The compact path doesn't carry an assistant message id so it's left as-is.Test plan
bun run typecheckcleantest/acp/event-subscription.test.ts(11 tests) all passagent_message_chunkfor the final delta arrives BEFORE theid:N result:end_turnreply (will follow up with results)Happy to add a regression test using the existing
createFakeAgentharness intest/acp/event-subscription.test.ts— kept out of the initial diff to stay focused, but the harness already supports event injection so it's straightforward to push timedmessage.part.delta+message.updatedevents and assert ordering. Let me know.