
fix(acp): drain message events before returning end_turn #25422

Closed
truenorth-lj wants to merge 1 commit into anomalyco:dev from truenorth-lj:fix-acp-end-turn-race

Conversation

@truenorth-lj

Fixes #25421.

Summary

Agent.prompt() returns stopReason: "end_turn" as soon as sdk.session.prompt() resolves. At that moment, message.part.delta events for the assistant's final text may still be queued in the SDK event stream. runEventSubscription then processes them and forwards them to ACP as agent_message_chunk frames AFTER the RPC reply has already been sent. See the linked issue for a wire trace and protocol-level discussion.

Fix

In Agent.prompt(), after await sdk.session.prompt(...) resolves, wait for the message.updated event for the response message id (i.e. with info.time.completed set) before returning end_turn.

Why this is sufficient:

  • runEventSubscription consumes sdk.global.event() via sequential for await, awaiting each handleEvent call.
  • message.updated (with time.completed set) is the SDK's "this assistant message is fully written" signal, fired AFTER all message.part.delta events for the same message.
  • Therefore once message.updated has been processed by handleEvent, every prior delta for that message has already been awaited through connection.sessionUpdate(...) — i.e. on the ACP wire.
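
The ordering argument above can be sketched as a small self-contained model. This is not the code from agent.ts: the event shapes, `sendChunk`, `waitForCompleted`, and `fakeStream` are hypothetical stand-ins, and only the sequential `for await` plus awaited `handleEvent` structure is taken from the description.

```typescript
// Hypothetical, simplified event shapes; the real SDK types are richer.
type SdkEvent =
  | { type: "message.part.delta"; messageID: string; text: string }
  | { type: "message.updated"; messageID: string; completed: boolean };

// Stand-in for the ACP wire: records frames in the order they are sent.
const wire: string[] = [];

// Stand-in for connection.sessionUpdate(...): asynchronous, like a real send.
async function sendChunk(text: string): Promise<void> {
  await new Promise<void>((resolve) => setTimeout(resolve, 1));
  wire.push(`chunk:${text}`);
}

// messageID -> callback fired once the completed message.updated is handled.
const completionWaiters = new Map<string, () => void>();

function waitForCompleted(messageID: string): Promise<void> {
  return new Promise<void>((resolve) => {
    completionWaiters.set(messageID, () => resolve());
  });
}

async function handleEvent(event: SdkEvent): Promise<void> {
  if (event.type === "message.part.delta") {
    await sendChunk(event.text);
  } else if (event.completed) {
    completionWaiters.get(event.messageID)?.();
    completionWaiters.delete(event.messageID);
  }
}

// Sequential consumption: the next event is not read until the previous
// handleEvent call has settled.
async function runEventSubscription(events: AsyncIterable<SdkEvent>): Promise<void> {
  for await (const event of events) {
    await handleEvent(event);
  }
}

async function* fakeStream(): AsyncGenerator<SdkEvent> {
  yield { type: "message.part.delta", messageID: "m1", text: "Hel" };
  yield { type: "message.part.delta", messageID: "m1", text: "lo" };
  yield { type: "message.updated", messageID: "m1", completed: true };
}

async function demo(): Promise<string[]> {
  const completed = waitForCompleted("m1"); // register before events flow
  const subscription = runEventSubscription(fakeStream());
  await completed; // every earlier delta has already been sent
  wire.push("reply:end_turn");
  await subscription;
  return wire;
}
```

In this model the reply is recorded only after both chunks, because the waiter cannot fire until the loop has finished awaiting every earlier delta.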

A 5-second timeout fallback prevents deadlock if the upstream completion event is dropped or the message never reaches a completed state.
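
A minimal sketch of such a timeout fallback, assuming the completion signal is exposed as a promise. The helper name `withTimeout` and the `WaitOutcome` type are illustrative and not taken from the diff.

```typescript
type WaitOutcome = "completed" | "timeout";

// Resolves with "completed" if `signal` settles first, or with "timeout"
// once `ms` milliseconds have elapsed. A rejection of `signal` is passed on.
function withTimeout(signal: Promise<void>, ms: number): Promise<WaitOutcome> {
  return new Promise<WaitOutcome>((resolve, reject) => {
    const timer = setTimeout(() => resolve("timeout"), ms);
    signal.then(
      () => {
        clearTimeout(timer); // do not leave a pending timer behind
        resolve("completed");
      },
      (err: unknown) => {
        clearTimeout(timer);
        reject(err);
      },
    );
  });
}
```

With a helper like this, the wait described above would read roughly as `await withTimeout(completed, 5000)`. On "timeout" the caller still returns end_turn, so a dropped completion event costs at most the timeout delay.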

Applied to both prompt() branches that have a response message id (the cmd-less branch at agent.ts:1471 and the slash-command branch at :1497). The compact path doesn't carry an assistant message id so it's left as-is.

Test plan

  • bun run typecheck clean
  • Existing test/acp/event-subscription.test.ts (11 tests) all pass
  • No new compiler warnings
  • Manual wire trace verification: re-run a streaming prompt against a built binary and confirm agent_message_chunk for the final delta arrives BEFORE the id:N result:end_turn reply (will follow up with results)

Happy to add a regression test using the existing createFakeAgent harness in test/acp/event-subscription.test.ts — kept out of the initial diff to stay focused, but the harness already supports event injection so it's straightforward to push timed message.part.delta + message.updated events and assert ordering. Let me know.

`Agent.prompt()` returns `stopReason: "end_turn"` as soon as
`sdk.session.prompt()` resolves, but `message.part.delta` events for the
final assistant message text are still queued in the SDK event stream at
that moment. They get processed by `runEventSubscription` and forwarded
to ACP as `agent_message_chunk` frames AFTER the RPC reply has already
been sent — a protocol violation visible to ACP clients as text
appearing post-end_turn.

Cause: two independent async paths share the same ACP wire.

  - Path A — event subscription (`runEventSubscription`): consumes
    `sdk.global.event()` and forwards `message.part.delta` as
    `agent_message_chunk` via `connection.sessionUpdate(...)`.
  - Path B — prompt RPC: `await sdk.session.prompt(...)` resolves when
    the LLM finishes, then immediately returns `end_turn`.

Path B can return before Path A drains the trailing deltas. Order on the
wire is then: ... earlier chunks ... → end_turn reply → trailing chunk.
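
The interleaving can be reproduced with a toy model of the two paths. The delays, `pathA`, `pathB`, and `simulateRace` are made up for illustration; the point is only that nothing orders Path B's reply after Path A's backlog.

```typescript
// Shared "wire": both paths append frames to the same ordered log.
const frames: string[] = [];

const sleep = (ms: number) =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Path A: forwards each delta, taking some time per event.
async function pathA(deltas: string[]): Promise<void> {
  for (const text of deltas) {
    await sleep(10);
    frames.push(`agent_message_chunk:${text}`);
  }
}

// Path B: replies as soon as its own await resolves, unaware of Path A.
async function pathB(): Promise<void> {
  await sleep(15);
  frames.push("result:end_turn");
}

async function simulateRace(): Promise<string[]> {
  await Promise.all([pathA(["a", "b"]), pathB()]);
  return frames;
}
```

With these delays the reply is logged between the two chunks, which is the ordering described above.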

Fix: in `prompt()`, after `sdk.session.prompt()` resolves, await the
`message.updated` event for the response message id (i.e. `info.time.completed`
set). Because `runEventSubscription` processes events sequentially via
`for await` and awaits each `handleEvent`, the `message.updated`
(completed) event for a message is necessarily processed AFTER all
prior `message.part.delta` events for the same message — so waiting on
it guarantees every chunk has already been forwarded.

A 5s timeout fallback prevents deadlock if the upstream completion
event is never observed.

Repro:

  Send a streaming prompt via ACP. Inspect the wire (DevTools → WS
  Messages). Observe that the final `agent_message_chunk` (the
  agent's last text delta) arrives 5–50ms AFTER the RPC reply with
  `stopReason: end_turn` and matching `id`.
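
  A captured trace can also be checked mechanically rather than by eye.
  The sketch below assumes frames have already been parsed from JSON and
  that the trace covers a single prompt turn. The `Frame` shape is a
  simplified assumption, and `chunksAfterEndTurn` is a hypothetical
  helper, not part of the repository.

```typescript
// Simplified, assumed frame shape; real ACP frames carry more fields.
interface Frame {
  id?: number;
  method?: string;
  params?: { update?: { sessionUpdate?: string } };
  result?: { stopReason?: string };
}

// Returns the indices of agent_message_chunk frames that appear after the
// end_turn reply for `promptId`. An empty array means the ordering is fine
// (or the reply is not present in the trace at all).
function chunksAfterEndTurn(trace: Frame[], promptId: number): number[] {
  const replyIndex = trace.findIndex(
    (f) => f.id === promptId && f.result?.stopReason === "end_turn",
  );
  if (replyIndex === -1) return [];

  const late: number[] = [];
  trace.forEach((frame, index) => {
    const kind = frame.params?.update?.sessionUpdate;
    if (index > replyIndex && kind === "agent_message_chunk") {
      late.push(index);
    }
  });
  return late;
}
```

  A non-empty result for the prompt's `id` indicates the race.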

Affects every ACP client that gates UI / input on end_turn (e.g.
disables streaming indicator, re-enables send button) — they snap to
"done" prematurely while text is still being appended.
github-actions Bot added the needs:compliance label (this means the issue will auto-close after 2 hours) May 2, 2026
@github-actions
Contributor

github-actions Bot commented May 2, 2026

This PR doesn't fully meet our contributing guidelines and PR template.

What needs to be fixed:

  • PR description is missing required template sections. Please use the PR template.

Please edit this PR description to address the above within 2 hours, or it will be automatically closed.

If you believe this was flagged incorrectly, please let a maintainer know.

@github-actions
Contributor

github-actions Bot commented May 2, 2026

This pull request has been automatically closed because it was not updated to meet our contributing guidelines within the 2-hour window.

Feel free to open a new pull request that follows our guidelines.

github-actions Bot removed the needs:compliance label (this means the issue will auto-close after 2 hours) May 2, 2026
github-actions Bot closed this May 2, 2026
Development

Successfully merging this pull request may close these issues.

ACP agent_message_chunk frames land after end_turn RPC reply due to event-subscription / prompt-RPC race
