fix(acp): flush straggler chunks promptly after session/prompt returns#730
Conversation
After session/prompt returns, HAPI drains buffered agentMessageChunk text and marks the turn complete, but leaves the message handler alive. Models with long streaming tails (DeepSeek, GPT-5.5) continue to push chunks after that drain, causing text to accumulate in the buffer and only appear when the next user prompt triggers the pre-prompt drain — showing up in the wrong turn with broken markdown. Start a 50ms interval timer after the post-prompt drain that keeps calling drainBuffers() on the live handler for up to 6 seconds, so straggler chunks are emitted within one poll tick instead of waiting for the next prompt. The timer is cancelled when the next prompt starts (pre-prompt drain replaces the handler) or on disconnect. Fixes tiann#609. Also applies to Gemini and Kimi which share the same AcpSdkBackend code path. via [HAPI](https://hapi.run) Co-Authored-By: HAPI <noreply@hapi.run>
7e992be to
c25e001
Compare
There was a problem hiding this comment.
Findings
- [Major] Late flush runs after the turn is reported idle —
startLateFlushTimer()is fire-and-forget, soprompt()still emitsturn_complete, clearsisProcessingMessage, resolves waiters, and the launchers sendreadywhile the old handler can continue emitting assistant text for up to 6 seconds. A user can send the next prompt during that window; the nextprompt()stops the timer and drains the old handler only after the new user message has already been queued/displayed, and a tail longer than the pre-prompt drain timeout can still land in the new turn. Evidence:cli/src/agent/backends/acp/AcpSdkBackend.ts:295, with completion immediately after in the samefinallyblock.
Suggested fix:private async drainLateBuffers(): Promise<void> { const deadline = Date.now() + AcpSdkBackend.LATE_FLUSH_WINDOW_MS; while (Date.now() < deadline) { await new Promise<void>((resolve) => setTimeout(resolve, AcpSdkBackend.LATE_FLUSH_INTERVAL_MS)); this.messageHandler?.drainBuffers(); } } // In prompt() before usage/turn_complete and before isProcessingMessage=false: this.messageHandler?.drainBuffers(); await this.drainLateBuffers();
Summary
- Review mode: initial
- The diff improves late chunk flushing, but the timer currently operates outside the turn lifecycle. Keep the late-drain window inside
prompt()completion, or otherwise delayready/notifyResponseCompleteuntil the old handler is no longer accepting tail chunks.
Testing
- Not run (automation)
HAPI Bot
| AcpSdkBackend.UPDATE_DRAIN_TIMEOUT_MS | ||
| ); | ||
| this.messageHandler?.drainBuffers(); | ||
| this.startLateFlushTimer(); |
There was a problem hiding this comment.
[MAJOR] This timer is started but not awaited, so prompt() continues to emit turn_complete, clear isProcessingMessage, notify waiters, and let the launchers send ready while the old handler can still flush assistant text for up to 6 seconds. If the user sends the next prompt during that window, late text can still be emitted after the new user message or captured by the new handler once the pre-prompt drain times out.
Suggested fix:
private async drainLateBuffers(): Promise<void> {
const deadline = Date.now() + AcpSdkBackend.LATE_FLUSH_WINDOW_MS;
while (Date.now() < deadline) {
await new Promise<void>((resolve) => setTimeout(resolve, AcpSdkBackend.LATE_FLUSH_INTERVAL_MS));
this.messageHandler?.drainBuffers();
}
}
this.messageHandler?.drainBuffers();
await this.drainLateBuffers();Addresses the github-actions review on tiann#730: the fire-and-forget late-flush timer let `prompt()` resolve while stragglers were still possibly arriving, so a rapid follow-up prompt could either drop those chunks (during the old null-handler gap) or leak them into the new turn's onUpdate. Pre-prompt phase now keeps the previous turn's handler alive across the quiet wait (bounded by LATE_FLUSH_WINDOW_MS) and swaps in a single phase immediately before sending the new session/prompt. The post-prompt late flush timer is unchanged — it still emits idle-window stragglers promptly without delaying the ready signal or `setModel` / `setConfigOption`. Adds three regression tests: late-chunk flushing within the window, pre-prompt straggler attribution to the previous turn's onUpdate, and disconnect cancelling the timer. Removes now-unused PRE_PROMPT_UPDATE_DRAIN_TIMEOUT_MS. via [HAPI](https://hapi.run) Co-Authored-By: HAPI <noreply@hapi.run>
There was a problem hiding this comment.
Findings
- [Major] Late flush still runs after the turn is reported idle —
startLateFlushTimer()is started atcli/src/agent/backends/acp/AcpSdkBackend.ts:304, butprompt()then emitsturn_complete, clearsisProcessingMessage, resolves waiters, and returns atcli/src/agent/backends/acp/AcpSdkBackend.ts:320. The remote launchers flip thinking off and emitreadyimmediately after that return when the queue is empty (cli/src/opencode/opencodeRemoteLauncher.ts:256,cli/src/agent/runners/runAgentSession.ts:192), so any tail chunk flushed by the background interval can still appear after the ready notification or after the user has already queued the next turn. The added tests cover that a late chunk eventually flushes, but not that no assistant text arrives after turn completion/readiness.
Suggested fix:private async drainLateBuffers(): Promise<void> { const deadline = Date.now() + AcpSdkBackend.LATE_FLUSH_WINDOW_MS; while (Date.now() < deadline) { await new Promise<void>((resolve) => setTimeout(resolve, AcpSdkBackend.LATE_FLUSH_INTERVAL_MS)); this.messageHandler?.drainBuffers(); } } // In prompt(), before usage/turn_complete and before isProcessingMessage=false: this.messageHandler?.drainBuffers(); await this.drainLateBuffers();
Questions
- None
Summary
- Review mode: follow-up after new commits
- The previous lifecycle issue is only partially addressed: straggler chunks are retained for the prior handler, but the turn is still marked complete and the session can still announce ready while that handler is flushing in the background.
Testing
- Not run:
bun test cli/src/agent/backends/acp/AcpSdkBackend.test.tsfailed becausebunis not installed in this runner.
HAPI Bot
| AcpSdkBackend.UPDATE_DRAIN_TIMEOUT_MS | ||
| ); | ||
| this.messageHandler?.drainBuffers(); | ||
| this.startLateFlushTimer(); |
There was a problem hiding this comment.
[MAJOR] This still starts the late flush as background work. Immediately afterward, prompt() emits turn_complete, clears isProcessingMessage, resolves response-complete waiters, and returns; the launchers then flip thinking off and emit ready when the queue is empty. A tail chunk flushed by this interval can therefore appear after the ready notification or after the user has queued the next turn.
Suggested fix:
private async drainLateBuffers(): Promise<void> {
const deadline = Date.now() + AcpSdkBackend.LATE_FLUSH_WINDOW_MS;
while (Date.now() < deadline) {
await new Promise<void>((resolve) => setTimeout(resolve, AcpSdkBackend.LATE_FLUSH_INTERVAL_MS));
this.messageHandler?.drainBuffers();
}
}
// Before usage/turn_complete and before isProcessingMessage=false:
this.messageHandler?.drainBuffers();
await this.drainLateBuffers();Follow-up to bot's repeated MAJOR on tiann#730: even with the pre-prompt gate, the fire-and-forget late-flush timer let `prompt()` resolve before slow tails finished, so the launcher's `ready` signal (and any user follow-up queued against it) raced with text still being emitted to the current turn's onUpdate. Replace the setInterval timer with a synchronous `drainLateBuffers()` awaited in `prompt()`'s finally before turn_complete is sent. It polls drainBuffers every LATE_FLUSH_INTERVAL_MS so the UI keeps streaming smoothly during the wait, and exits early once the model has been quiet for LATE_FLUSH_QUIET_PERIOD_MS (250 ms — adds negligible latency to fast models like Claude whose tail is typically <100 ms) or the LATE_FLUSH_WINDOW_MS upper bound (6 s) elapses. Side effects: - Restore PRE_PROMPT_UPDATE_DRAIN_TIMEOUT_MS (1200 ms): the pre-prompt drain is now just a safety net since the post-prompt wait guarantees the previous turn is quiet by the time the next prompt starts. - Drop the `lateFlushTimer` field, `startLateFlushTimer`, `stopLateFlushTimer`, and their disconnect/pre-prompt cleanup calls. - Update the "emits straggler chunks" test to assert ordering before turn_complete, and add a fast-path test confirming the drain exits promptly when the model is quiet. via [HAPI](https://hapi.run) Co-Authored-By: HAPI <noreply@hapi.run>
There was a problem hiding this comment.
Findings
- [Major] Late-flush quiet window can be skipped for post-return tails —
drainLateBuffers()compares against the oldlastSessionUpdateAtbefore it waits. If the ACP agent pauses for more thanLATE_FLUSH_QUIET_PERIOD_MSbeforesession/promptresolves, this method returns immediately atcli/src/agent/backends/acp/AcpSdkBackend.ts:453, thenturn_completeis emitted and the launcher can sendready; any chunk that arrives shortly after the response returns is buffered until a later boundary, which is the straggler behavior this PR is meant to fix.
Suggested fix:private async drainLateBuffers(): Promise<void> { const quietBaseline = Date.now(); const deadline = quietBaseline + AcpSdkBackend.LATE_FLUSH_WINDOW_MS; while (Date.now() < deadline) { const latestActivityAt = Math.max(this.lastSessionUpdateAt, quietBaseline); const elapsedSinceUpdate = Date.now() - latestActivityAt; if (elapsedSinceUpdate >= AcpSdkBackend.LATE_FLUSH_QUIET_PERIOD_MS) { return; } const remainingBudget = deadline - Date.now(); const waitMs = Math.max(1, Math.min(AcpSdkBackend.LATE_FLUSH_INTERVAL_MS, remainingBudget)); await new Promise<void>((resolve) => setTimeout(resolve, waitMs)); this.messageHandler?.drainBuffers(); } }
Questions
- None
Summary
- Review mode: follow-up after new commits
- The previous background-timer issue is mostly addressed by awaiting the late flush, but the new quiet check can still skip the post-return observation window when the last update is already old.
Testing
- Not run (automation):
bun test cli/src/agent/backends/acp/AcpSdkBackend.test.tscould not run becausebunis not installed in this runner.
HAPI Bot
| private async drainLateBuffers(): Promise<void> { | ||
| const deadline = Date.now() + AcpSdkBackend.LATE_FLUSH_WINDOW_MS; | ||
| while (Date.now() < deadline) { | ||
| const elapsedSinceUpdate = Date.now() - this.lastSessionUpdateAt; |
There was a problem hiding this comment.
[MAJOR] drainLateBuffers() can return before it observes the period after session/prompt resolves. Because lastSessionUpdateAt may already be older than LATE_FLUSH_QUIET_PERIOD_MS when this method starts, the first loop iteration exits immediately; a chunk emitted just after the prompt response is then buffered after turn_complete/ready and is not flushed until a later boundary.
Suggested fix:
private async drainLateBuffers(): Promise<void> {
const quietBaseline = Date.now();
const deadline = quietBaseline + AcpSdkBackend.LATE_FLUSH_WINDOW_MS;
while (Date.now() < deadline) {
const latestActivityAt = Math.max(this.lastSessionUpdateAt, quietBaseline);
const elapsedSinceUpdate = Date.now() - latestActivityAt;
if (elapsedSinceUpdate >= AcpSdkBackend.LATE_FLUSH_QUIET_PERIOD_MS) {
return;
}
const remainingBudget = deadline - Date.now();
const waitMs = Math.max(1, Math.min(AcpSdkBackend.LATE_FLUSH_INTERVAL_MS, remainingBudget));
await new Promise<void>((resolve) => setTimeout(resolve, waitMs));
this.messageHandler?.drainBuffers();
}
}…ionUpdateAt Bot's third MAJOR on tiann#730: drainLateBuffers() compared elapsed time against lastSessionUpdateAt, which can already be older than LATE_FLUSH_QUIET_PERIOD_MS by the time the method starts — e.g. when the model emits chunks early in the turn, pauses, then sends stopReason. In that case the first loop iteration sees a stale "quiet" reading and returns immediately, missing any straggler that arrives just after session/prompt resolves; the chunk then sits in the buffer until the next prompt's pre-prompt drain. Anchor the quiet check to max(lastSessionUpdateAt, entry time) so we always observe at least one quiet period from method entry regardless of when the last chunk was. Adds a regression test that fires a chunk early, awaits a 200ms pause, schedules a post-resolution straggler, and asserts it lands before turn_complete. via [HAPI](https://hapi.run) Co-Authored-By: HAPI <noreply@hapi.run>
There was a problem hiding this comment.
Findings
- None
Questions
- None
Summary
- Review mode: follow-up after new commits
- No issues found in the latest diff. Previous late-flush quiet-window finding appears addressed by anchoring the quiet baseline at
drainLateBuffers()entry and adding a regression test. Residual risk: timing-based behavior depends on real ACP tail cadence and could not be verified against live OpenCode/Gemini in this runner.
Testing
- Not run (automation):
bun test cli/src/agent/backends/acp/AcpSdkBackend.test.tsandbun typecheckcould not run becausebunis not installed in this runner.
HAPI Bot
…hor fix The previous note claimed the 250ms quiet check "exits early for fast models, adding negligible latency". That was true before commit 512d6a4 when the check compared against lastSessionUpdateAt; with the entry-time anchor, drainLateBuffers always observes at least one full quiet period. Document that this minimum wait is the price of catching post-resolution stragglers from paused-mid-turn models. via [HAPI](https://hapi.run) Co-Authored-By: HAPI <noreply@hapi.run>
There was a problem hiding this comment.
Findings
- None
Questions
- None
Summary
Review mode: follow-up after new commits
No issues found in the latest PR diff. The current ACP late-flush path keeps the active handler installed through post-prompt draining, emits late chunks before turn_complete, and includes regression coverage for post-return stragglers and stale lastSessionUpdateAt quiet-window anchoring. Residual risk: real ACP tail cadence could not be validated against live OpenCode/Gemini/Kimi in this runner.
Testing
- Not run (automation):
bun test cli/src/agent/backends/acp/AcpSdkBackend.test.tsandbun typecheckfailed becausebunis not installed in this runner.
HAPI Bot
Summary
Fixes #609. Also applies to Gemini and Kimi which share the same
AcpSdkBackendcode path.Root cause: After
session/promptreturns,AcpSdkBackenddrains bufferedagentMessageChunktext and marks the turn complete, but leavesmessageHandleralive. Models with long streaming tails (GPT-5.5, DeepSeek V4 Pro) keep sending chunks after that drain. Those chunks accumulate in the buffer with nothing to flush them — they only appear when the next user prompt's pre-prompt drain fires the oldonUpdatecallback, showing up in the wrong turn with broken markdown.Fix: After the post-prompt drain, start a 50 ms interval timer (
startLateFlushTimer) that keeps callingdrainBuffers()on the live handler for up to 6 seconds. Straggler chunks are emitted within one poll tick instead of waiting for the next user message. The timer is cancelled at the start of the nextprompt()call (before the handler is replaced) and ondisconnect().Changes
Single file:
cli/src/agent/backends/acp/AcpSdkBackend.tslateFlushTimerfield +LATE_FLUSH_INTERVAL_MS(50 ms) /LATE_FLUSH_WINDOW_MS(6 s) constantsstartLateFlushTimer()— starts the interval, auto-stops after the window expiresstopLateFlushTimer()— clears the interval; called at prompt start and disconnectstartLateFlushTimer()afterdrainBuffers()in thefinallyblock ofprompt()stopLateFlushTimer()at the top ofprompt()and indisconnect()Test plan
🤖 Generated with Claude Code