Skip to content

fix(acp): flush straggler chunks promptly after session/prompt returns#730

Merged
tiann merged 5 commits into
tiann:mainfrom
swear01:fix/acp-straggler-chunks-stuck-in-buffer
May 31, 2026
Merged

fix(acp): flush straggler chunks promptly after session/prompt returns#730
tiann merged 5 commits into
tiann:mainfrom
swear01:fix/acp-straggler-chunks-stuck-in-buffer

Conversation

@swear01
Copy link
Copy Markdown
Contributor

@swear01 swear01 commented May 29, 2026

Summary

Fixes #609. Also applies to Gemini and Kimi which share the same AcpSdkBackend code path.

Root cause: After session/prompt returns, AcpSdkBackend drains buffered agentMessageChunk text and marks the turn complete, but leaves messageHandler alive. Models with long streaming tails (GPT-5.5, DeepSeek V4 Pro) keep sending chunks after that drain. Those chunks accumulate in the buffer with nothing to flush them — they only appear when the next user prompt's pre-prompt drain fires the old onUpdate callback, showing up in the wrong turn with broken markdown.

Fix: After the post-prompt drain, start a 50 ms interval timer (startLateFlushTimer) that keeps calling drainBuffers() on the live handler for up to 6 seconds. Straggler chunks are emitted within one poll tick instead of waiting for the next user message. The timer is cancelled at the start of the next prompt() call (before the handler is replaced) and on disconnect().

Changes

Single file: cli/src/agent/backends/acp/AcpSdkBackend.ts

  • Add lateFlushTimer field + LATE_FLUSH_INTERVAL_MS (50 ms) / LATE_FLUSH_WINDOW_MS (6 s) constants
  • startLateFlushTimer() — starts the interval, auto-stops after the window expires
  • stopLateFlushTimer() — clears the interval; called at prompt start and disconnect
  • Wire startLateFlushTimer() after drainBuffers() in the finally block of prompt()
  • Wire stopLateFlushTimer() at the top of prompt() and in disconnect()

Test plan

  • Start a session with OpenCode + a model that has a long streaming tail (GPT-5.5, DeepSeek V4 Pro)
  • Ask for a long response; confirm the full output appears before the next prompt without needing a follow-up message
  • Confirm code blocks / markdown formatting is intact
  • Verify normal sessions (Claude, fast-streaming models) are unaffected

🤖 Generated with Claude Code

After session/prompt returns, HAPI drains buffered agentMessageChunk text
and marks the turn complete, but leaves the message handler alive. Models
with long streaming tails (DeepSeek, GPT-5.5) continue to push chunks
after that drain, causing text to accumulate in the buffer and only appear
when the next user prompt triggers the pre-prompt drain — showing up in
the wrong turn with broken markdown.

Start a 50ms interval timer after the post-prompt drain that keeps calling
drainBuffers() on the live handler for up to 6 seconds, so straggler
chunks are emitted within one poll tick instead of waiting for the next
prompt. The timer is cancelled when the next prompt starts (pre-prompt
drain replaces the handler) or on disconnect.

Fixes tiann#609. Also applies to Gemini and Kimi which share the same
AcpSdkBackend code path.

via [HAPI](https://hapi.run)

Co-Authored-By: HAPI <noreply@hapi.run>
@swear01 swear01 force-pushed the fix/acp-straggler-chunks-stuck-in-buffer branch from 7e992be to c25e001 Compare May 29, 2026 07:54
Copy link
Copy Markdown

@github-actions github-actions Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Findings

  • [Major] Late flush runs after the turn is reported idle — startLateFlushTimer() is fire-and-forget, so prompt() still emits turn_complete, clears isProcessingMessage, resolves waiters, and the launchers send ready while the old handler can continue emitting assistant text for up to 6 seconds. A user can send the next prompt during that window; the next prompt() stops the timer and drains the old handler only after the new user message has already been queued/displayed, and a tail longer than the pre-prompt drain timeout can still land in the new turn. Evidence: cli/src/agent/backends/acp/AcpSdkBackend.ts:295, with completion immediately after in the same finally block.
    Suggested fix:
    private async drainLateBuffers(): Promise<void> {
        const deadline = Date.now() + AcpSdkBackend.LATE_FLUSH_WINDOW_MS;
        while (Date.now() < deadline) {
            await new Promise<void>((resolve) => setTimeout(resolve, AcpSdkBackend.LATE_FLUSH_INTERVAL_MS));
            this.messageHandler?.drainBuffers();
        }
    }
    
    // In prompt() before usage/turn_complete and before isProcessingMessage=false:
    this.messageHandler?.drainBuffers();
    await this.drainLateBuffers();

Summary

  • Review mode: initial
  • The diff improves late chunk flushing, but the timer currently operates outside the turn lifecycle. Keep the late-drain window inside prompt() completion, or otherwise delay ready/notifyResponseComplete until the old handler is no longer accepting tail chunks.

Testing

  • Not run (automation)

HAPI Bot

AcpSdkBackend.UPDATE_DRAIN_TIMEOUT_MS
);
this.messageHandler?.drainBuffers();
this.startLateFlushTimer();
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[MAJOR] This timer is started but not awaited, so prompt() continues to emit turn_complete, clear isProcessingMessage, notify waiters, and let the launchers send ready while the old handler can still flush assistant text for up to 6 seconds. If the user sends the next prompt during that window, late text can still be emitted after the new user message or captured by the new handler once the pre-prompt drain times out.

Suggested fix:

private async drainLateBuffers(): Promise<void> {
    const deadline = Date.now() + AcpSdkBackend.LATE_FLUSH_WINDOW_MS;
    while (Date.now() < deadline) {
        await new Promise<void>((resolve) => setTimeout(resolve, AcpSdkBackend.LATE_FLUSH_INTERVAL_MS));
        this.messageHandler?.drainBuffers();
    }
}

this.messageHandler?.drainBuffers();
await this.drainLateBuffers();

Addresses the github-actions review on tiann#730: the fire-and-forget late-flush
timer let `prompt()` resolve while stragglers were still possibly arriving,
so a rapid follow-up prompt could either drop those chunks (during the
old null-handler gap) or leak them into the new turn's onUpdate.

Pre-prompt phase now keeps the previous turn's handler alive across the
quiet wait (bounded by LATE_FLUSH_WINDOW_MS) and swaps in a single phase
immediately before sending the new session/prompt. The post-prompt late
flush timer is unchanged — it still emits idle-window stragglers promptly
without delaying the ready signal or `setModel` / `setConfigOption`.

Adds three regression tests: late-chunk flushing within the window,
pre-prompt straggler attribution to the previous turn's onUpdate, and
disconnect cancelling the timer. Removes now-unused
PRE_PROMPT_UPDATE_DRAIN_TIMEOUT_MS.

via [HAPI](https://hapi.run)

Co-Authored-By: HAPI <noreply@hapi.run>
Copy link
Copy Markdown

@github-actions github-actions Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Findings

  • [Major] Late flush still runs after the turn is reported idle — startLateFlushTimer() is started at cli/src/agent/backends/acp/AcpSdkBackend.ts:304, but prompt() then emits turn_complete, clears isProcessingMessage, resolves waiters, and returns at cli/src/agent/backends/acp/AcpSdkBackend.ts:320. The remote launchers flip thinking off and emit ready immediately after that return when the queue is empty (cli/src/opencode/opencodeRemoteLauncher.ts:256, cli/src/agent/runners/runAgentSession.ts:192), so any tail chunk flushed by the background interval can still appear after the ready notification or after the user has already queued the next turn. The added tests cover that a late chunk eventually flushes, but not that no assistant text arrives after turn completion/readiness.
    Suggested fix:
    private async drainLateBuffers(): Promise<void> {
        const deadline = Date.now() + AcpSdkBackend.LATE_FLUSH_WINDOW_MS;
        while (Date.now() < deadline) {
            await new Promise<void>((resolve) => setTimeout(resolve, AcpSdkBackend.LATE_FLUSH_INTERVAL_MS));
            this.messageHandler?.drainBuffers();
        }
    }
    
    // In prompt(), before usage/turn_complete and before isProcessingMessage=false:
    this.messageHandler?.drainBuffers();
    await this.drainLateBuffers();

Questions

  • None

Summary

  • Review mode: follow-up after new commits
  • The previous lifecycle issue is only partially addressed: straggler chunks are retained for the prior handler, but the turn is still marked complete and the session can still announce ready while that handler is flushing in the background.

Testing

  • Not run: bun test cli/src/agent/backends/acp/AcpSdkBackend.test.ts failed because bun is not installed in this runner.

HAPI Bot

AcpSdkBackend.UPDATE_DRAIN_TIMEOUT_MS
);
this.messageHandler?.drainBuffers();
this.startLateFlushTimer();
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[MAJOR] This still starts the late flush as background work. Immediately afterward, prompt() emits turn_complete, clears isProcessingMessage, resolves response-complete waiters, and returns; the launchers then flip thinking off and emit ready when the queue is empty. A tail chunk flushed by this interval can therefore appear after the ready notification or after the user has queued the next turn.

Suggested fix:

private async drainLateBuffers(): Promise<void> {
    const deadline = Date.now() + AcpSdkBackend.LATE_FLUSH_WINDOW_MS;
    while (Date.now() < deadline) {
        await new Promise<void>((resolve) => setTimeout(resolve, AcpSdkBackend.LATE_FLUSH_INTERVAL_MS));
        this.messageHandler?.drainBuffers();
    }
}

// Before usage/turn_complete and before isProcessingMessage=false:
this.messageHandler?.drainBuffers();
await this.drainLateBuffers();

Follow-up to bot's repeated MAJOR on tiann#730: even with the pre-prompt gate,
the fire-and-forget late-flush timer let `prompt()` resolve before slow
tails finished, so the launcher's `ready` signal (and any user follow-up
queued against it) raced with text still being emitted to the current
turn's onUpdate.

Replace the setInterval timer with a synchronous `drainLateBuffers()`
awaited in `prompt()`'s finally before turn_complete is sent. It polls
drainBuffers every LATE_FLUSH_INTERVAL_MS so the UI keeps streaming
smoothly during the wait, and exits early once the model has been quiet
for LATE_FLUSH_QUIET_PERIOD_MS (250 ms — adds negligible latency to fast
models like Claude whose tail is typically <100 ms) or the
LATE_FLUSH_WINDOW_MS upper bound (6 s) elapses.

Side effects:
- Restore PRE_PROMPT_UPDATE_DRAIN_TIMEOUT_MS (1200 ms): the pre-prompt
  drain is now just a safety net since the post-prompt wait guarantees
  the previous turn is quiet by the time the next prompt starts.
- Drop the `lateFlushTimer` field, `startLateFlushTimer`,
  `stopLateFlushTimer`, and their disconnect/pre-prompt cleanup calls.
- Update the "emits straggler chunks" test to assert ordering before
  turn_complete, and add a fast-path test confirming the drain exits
  promptly when the model is quiet.

via [HAPI](https://hapi.run)

Co-Authored-By: HAPI <noreply@hapi.run>
Copy link
Copy Markdown

@github-actions github-actions Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Findings

  • [Major] Late-flush quiet window can be skipped for post-return tails — drainLateBuffers() compares against the old lastSessionUpdateAt before it waits. If the ACP agent pauses for more than LATE_FLUSH_QUIET_PERIOD_MS before session/prompt resolves, this method returns immediately at cli/src/agent/backends/acp/AcpSdkBackend.ts:453, then turn_complete is emitted and the launcher can send ready; any chunk that arrives shortly after the response returns is buffered until a later boundary, which is the straggler behavior this PR is meant to fix.
    Suggested fix:
    private async drainLateBuffers(): Promise<void> {
        const quietBaseline = Date.now();
        const deadline = quietBaseline + AcpSdkBackend.LATE_FLUSH_WINDOW_MS;
    
        while (Date.now() < deadline) {
            const latestActivityAt = Math.max(this.lastSessionUpdateAt, quietBaseline);
            const elapsedSinceUpdate = Date.now() - latestActivityAt;
            if (elapsedSinceUpdate >= AcpSdkBackend.LATE_FLUSH_QUIET_PERIOD_MS) {
                return;
            }
    
            const remainingBudget = deadline - Date.now();
            const waitMs = Math.max(1, Math.min(AcpSdkBackend.LATE_FLUSH_INTERVAL_MS, remainingBudget));
            await new Promise<void>((resolve) => setTimeout(resolve, waitMs));
            this.messageHandler?.drainBuffers();
        }
    }

Questions

  • None

Summary

  • Review mode: follow-up after new commits
  • The previous background-timer issue is mostly addressed by awaiting the late flush, but the new quiet check can still skip the post-return observation window when the last update is already old.

Testing

  • Not run (automation): bun test cli/src/agent/backends/acp/AcpSdkBackend.test.ts could not run because bun is not installed in this runner.

HAPI Bot

private async drainLateBuffers(): Promise<void> {
const deadline = Date.now() + AcpSdkBackend.LATE_FLUSH_WINDOW_MS;
while (Date.now() < deadline) {
const elapsedSinceUpdate = Date.now() - this.lastSessionUpdateAt;
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[MAJOR] drainLateBuffers() can return before it observes the period after session/prompt resolves. Because lastSessionUpdateAt may already be older than LATE_FLUSH_QUIET_PERIOD_MS when this method starts, the first loop iteration exits immediately; a chunk emitted just after the prompt response is then buffered after turn_complete/ready and is not flushed until a later boundary.

Suggested fix:

private async drainLateBuffers(): Promise<void> {
    const quietBaseline = Date.now();
    const deadline = quietBaseline + AcpSdkBackend.LATE_FLUSH_WINDOW_MS;

    while (Date.now() < deadline) {
        const latestActivityAt = Math.max(this.lastSessionUpdateAt, quietBaseline);
        const elapsedSinceUpdate = Date.now() - latestActivityAt;
        if (elapsedSinceUpdate >= AcpSdkBackend.LATE_FLUSH_QUIET_PERIOD_MS) {
            return;
        }

        const remainingBudget = deadline - Date.now();
        const waitMs = Math.max(1, Math.min(AcpSdkBackend.LATE_FLUSH_INTERVAL_MS, remainingBudget));
        await new Promise<void>((resolve) => setTimeout(resolve, waitMs));
        this.messageHandler?.drainBuffers();
    }
}

…ionUpdateAt

Bot's third MAJOR on tiann#730: drainLateBuffers() compared elapsed time
against lastSessionUpdateAt, which can already be older than
LATE_FLUSH_QUIET_PERIOD_MS by the time the method starts — e.g. when the
model emits chunks early in the turn, pauses, then sends stopReason. In
that case the first loop iteration sees a stale "quiet" reading and
returns immediately, missing any straggler that arrives just after
session/prompt resolves; the chunk then sits in the buffer until the
next prompt's pre-prompt drain.

Anchor the quiet check to max(lastSessionUpdateAt, entry time) so we
always observe at least one quiet period from method entry regardless of
when the last chunk was. Adds a regression test that fires a chunk
early, awaits a 200ms pause, schedules a post-resolution straggler, and
asserts it lands before turn_complete.

via [HAPI](https://hapi.run)

Co-Authored-By: HAPI <noreply@hapi.run>
Copy link
Copy Markdown

@github-actions github-actions Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Findings

  • None

Questions

  • None

Summary

  • Review mode: follow-up after new commits
  • No issues found in the latest diff. Previous late-flush quiet-window finding appears addressed by anchoring the quiet baseline at drainLateBuffers() entry and adding a regression test. Residual risk: timing-based behavior depends on real ACP tail cadence and could not be verified against live OpenCode/Gemini in this runner.

Testing

  • Not run (automation): bun test cli/src/agent/backends/acp/AcpSdkBackend.test.ts and bun typecheck could not run because bun is not installed in this runner.

HAPI Bot

…hor fix

The previous note claimed the 250ms quiet check "exits early for fast
models, adding negligible latency". That was true before commit 512d6a4
when the check compared against lastSessionUpdateAt; with the entry-time
anchor, drainLateBuffers always observes at least one full quiet period.
Document that this minimum wait is the price of catching post-resolution
stragglers from paused-mid-turn models.

via [HAPI](https://hapi.run)

Co-Authored-By: HAPI <noreply@hapi.run>
Copy link
Copy Markdown

@github-actions github-actions Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Findings

  • None

Questions

  • None

Summary
Review mode: follow-up after new commits
No issues found in the latest PR diff. The current ACP late-flush path keeps the active handler installed through post-prompt draining, emits late chunks before turn_complete, and includes regression coverage for post-return stragglers and stale lastSessionUpdateAt quiet-window anchoring. Residual risk: real ACP tail cadence could not be validated against live OpenCode/Gemini/Kimi in this runner.

Testing

  • Not run (automation): bun test cli/src/agent/backends/acp/AcpSdkBackend.test.ts and bun typecheck failed because bun is not installed in this runner.

HAPI Bot

@tiann tiann merged commit 4b24528 into tiann:main May 31, 2026
2 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Output stopps in the middle and only shown after next prompt (OpenCode w GPT 5.5)

2 participants