Updated scope
This issue was filed as a duplicate-content-at-continuation-boundary bug. Based on observed behavior across multiple sessions, the streaming pipeline has a wider set of failure modes that all trace to the same structural source. Expanding scope here.
Failure modes (all observed in production)
1. Duplicate / repeated content at chunk boundary
A long reply (≥2200 chars after tool work) splits into a stream bubble + continuation post. Content near the split boundary appears to repeat or restart because:
shouldAutoStartStreaming fires at deltaCount >= 2 on provisional pre-tool prose
onToolCall arrives after streaming has already started — cannot retract
awaitingPostToolAssistantMessage is NOT set, so onAssistantMessageStart after tool completion does NOT reset streamedReplyState
Final reply appends to the same accumulator as the provisional prefix
Combined length (provisional + final) misaligns the split point vs. what the user expects to see
Slack renders the bubble ending mid-content and the continuation starting at what looks like a duplicated section
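The sequence above can be reproduced with a small model. This is a minimal sketch, not the runtime's code: the state shape, the handler bodies, and the `alwaysMarkAwaiting` switch are assumptions made for illustration. Only the identifier names and the `deltaCount >= 2` threshold come from this issue.

```typescript
// Simplified, hypothetical model of the reply accumulator described above.
interface StreamedReplyState {
  text: string;
  deltaCount: number;
  streaming: boolean;
  awaitingPostToolAssistantMessage: boolean;
}

function createState(): StreamedReplyState {
  return { text: "", deltaCount: 0, streaming: false, awaitingPostToolAssistantMessage: false };
}

// Threshold taken from the issue text.
function shouldAutoStartStreaming(state: StreamedReplyState): boolean {
  return state.deltaCount >= 2;
}

function onDelta(state: StreamedReplyState, delta: string): void {
  state.text += delta;
  state.deltaCount += 1;
  if (!state.streaming && shouldAutoStartStreaming(state)) {
    state.streaming = true; // from here on the bubble is visible to the user
  }
}

// Assumption: the flag is skipped once streaming has started.
// `alwaysMarkAwaiting` is a hypothetical switch used to compare both behaviors.
function onToolCall(state: StreamedReplyState, alwaysMarkAwaiting: boolean): void {
  if (!state.streaming || alwaysMarkAwaiting) {
    state.awaitingPostToolAssistantMessage = true;
  }
}

function onAssistantMessageStart(state: StreamedReplyState): void {
  if (state.awaitingPostToolAssistantMessage) {
    // Drop the provisional prefix so the final reply starts from a clean accumulator.
    state.text = "";
    state.deltaCount = 0;
    state.awaitingPostToolAssistantMessage = false;
  }
}

function simulateTurn(alwaysMarkAwaiting: boolean): string {
  const state = createState();
  onDelta(state, "Let me ");
  onDelta(state, "check that. "); // second delta: streaming starts here
  onToolCall(state, alwaysMarkAwaiting);
  onAssistantMessageStart(state);
  onDelta(state, "Here is the answer.");
  return state.text;
}
```

With `simulateTurn(false)` the provisional prose stays in the accumulator ahead of the final reply, so any split offset computed on the accumulated text is shifted by the length of the prefix. With `simulateTurn(true)` the accumulator holds only the final reply, although text already shown in the bubble would still need its own retraction path.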
2. Cut-off / truncated replies
Replies that overflow the continuation budget are split via splitSlackReplyText. The split is computed on accumulated raw delta text, but normalizeForSlack (runs ensureBlockSpacing) expands the rendered text after the fact by inserting blank lines between content blocks. The result:
The rendered content in the stream bubble is longer than the raw character count suggests
The continuation starts at a raw-offset boundary that doesn't correspond to a clean semantic break
To the user: the first message appears cut off mid-sentence; the second message re-starts in the middle of content they've already seen
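The mismatch between raw offsets and rendered text can be illustrated with a toy example. In this sketch, `ensureBlockSpacingSketch` is a crude hypothetical stand-in for the real normalization (it treats every line as a block), and `splitAtBlockBoundary` is an illustrative helper, not the actual `splitSlackReplyText`. It shows one way a split could be computed on the rendered text at a blank-line boundary instead of at a raw character offset.

```typescript
// Hypothetical stand-in for ensureBlockSpacing: puts a blank line between every line.
function ensureBlockSpacingSketch(raw: string): string {
  return raw.replace(/\n+/g, "\n\n");
}

// Illustrative helper: split rendered text at the last blank line that fits the budget,
// falling back to a hard cut when no block boundary is available.
function splitAtBlockBoundary(rendered: string, budget: number): [string, string] {
  if (rendered.length <= budget) return [rendered, ""];
  const cut = rendered.lastIndexOf("\n\n", budget);
  if (cut <= 0) return [rendered.slice(0, budget), rendered.slice(budget)];
  return [rendered.slice(0, cut), rendered.slice(cut + 2)];
}

const raw = "First block.\nSecond block.\nThird block.";
const rendered = ensureBlockSpacingSketch(raw);

// Splitting on the raw text at a fixed offset lands mid-word.
const rawHead = raw.slice(0, 30);

// Splitting on the rendered text at a block boundary keeps both parts whole.
const [head, tail] = splitAtBlockBoundary(rendered, 30);
```

Here the rendered text is two characters longer than the raw text, the raw split ends in the middle of "Third", and the block-boundary split yields two complete parts.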
Additionally: the continuation post has no guaranteed delivery. If RetryableTurnError fires (JUNIOR-1D, still active post-0.23.0) or the post fails, the overflow content is silently lost — no truncation marker, no user-visible signal.
3. Blank message / stream stall
The pendingStreamText redundancy gate in reply-executor.ts holds all deltas until the accumulated text doesn't match any known ack prefix ("ok", "sure", "let me", "on it", partial emoji). Simultaneously, createNormalizingStream in the streaming path holds content until a newline is seen. Both gates compound:
The stream message post is created by Slack before any content lands
The user sees a blank bubble for 15–60+ seconds
For tool-heavy turns, this window extends further because the LLM emits no text deltas during tool execution
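A small simulation shows how the two gates stack. This is a sketch under assumptions: the matching rule in `couldStillBeAck` (hold while the text could still turn out to be a bare ack phrase) and the line-based release in `releasedByNewlineGate` are guesses at the behavior described above, and the emoji case is left out. The ack phrases are the ones listed in this issue.

```typescript
// Ack phrases from the issue; the matching rule below is an assumption.
const ACK_PHRASES = ["ok", "sure", "let me", "on it"];

// Gate 1 (assumed rule): hold while the accumulated text is still a prefix of an ack phrase.
function couldStillBeAck(accumulated: string): boolean {
  const t = accumulated.trim().toLowerCase();
  return ACK_PHRASES.some((ack) => ack.startsWith(t));
}

// Gate 2 (assumed rule): release only complete lines, up to and including the last newline.
function releasedByNewlineGate(accumulated: string): string {
  return accumulated.slice(0, accumulated.lastIndexOf("\n") + 1);
}

// Text that would actually reach the bubble after both gates.
function visibleText(accumulated: string): string {
  return couldStillBeAck(accumulated) ? "" : releasedByNewlineGate(accumulated);
}

// Index of the first delta after which the bubble shows any content, or -1 if it never does.
function firstVisibleDeltaIndex(deltas: string[]): number {
  let accumulated = "";
  for (let i = 0; i < deltas.length; i++) {
    accumulated += deltas[i];
    if (visibleText(accumulated).length > 0) return i;
  }
  return -1;
}

const deltas = ["Let", " me", " look into that", " for you.", "\nHere is what I found."];
```

For this input the ack gate alone would release text at the third delta, but with the newline gate stacked on top nothing becomes visible until the fifth delta (`firstVisibleDeltaIndex(deltas)` is `4`). If a tool call sits between the fourth and fifth delta, the bubble stays blank for the whole tool execution.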
4. Missing or stale thread context
In multi-turn threads, the runtime injects thread transcript context at the start of each turn. Several confirmed failure modes:
The injected transcript lags the live thread. If a turn fails (RetryableTurnError, delivery error), the transcript still reflects the last successful state — the bot "doesn't know" about its own failed or partial reply in the previous turn and cannot correct for it.
Provisional pre-tool text in a failed stream doesn't get cleaned up. If a stream starts (bubble created), tool execution fails, and the turn errors out, the partial bubble stays in Slack. On the next turn, the transcript doesn't include that orphaned bubble, creating a permanent disconnect between what the user sees and what the bot knows about the thread.
Split replies are seen as one message by the runtime but two by the user. The bot tracks that it sent "the answer" but doesn't know the user saw it split mid-sentence. This makes the bot confidently describe a complete reply when the user experienced a truncated one.
5. Output/streaming disconnect (general)
The structural problem across all four failure modes: the pipeline commits a visible Slack artifact (stream bubble) before it knows whether the content is final, complete, or correctly budgeted. There is no retraction path and no delivery confirmation signal fed back into the runtime's thread model. Concrete gaps:
No span/log when streamOverflowed triggers — no way to correlate "overflow happened" with "user saw duplicate"
No TTL or orphan cleanup for stream bubbles that were opened but whose turns failed
No signal from Slack delivery back to the transcript/context layer — the bot's view of the thread is permanently optimistic
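One way to close these gaps is to record every stream bubble the pipeline opens and what happened to it. The sketch below is hypothetical throughout: `StreamBubbleLedger`, its methods, and the record fields do not exist in the codebase as far as this issue shows, and the in-memory `Map` stands in for whatever storage the runtime would actually use.

```typescript
type BubbleStatus = "open" | "finalized" | "failed";

interface BubbleRecord {
  bubbleId: string;     // hypothetical identifier for the stream bubble
  openedAtMs: number;
  status: BubbleStatus;
  overflowed: boolean;  // set when the overflow path triggers for this bubble
}

// Hypothetical in-memory ledger of stream bubbles and their outcomes.
class StreamBubbleLedger {
  private readonly records = new Map<string, BubbleRecord>();

  open(bubbleId: string, nowMs: number): void {
    this.records.set(bubbleId, { bubbleId, openedAtMs: nowMs, status: "open", overflowed: false });
  }

  markOverflowed(bubbleId: string): void {
    const record = this.records.get(bubbleId);
    if (record) record.overflowed = true;
  }

  close(bubbleId: string, status: "finalized" | "failed"): void {
    const record = this.records.get(bubbleId);
    if (record) record.status = status;
  }

  // Bubbles whose turn failed, or that have stayed open longer than the TTL.
  orphans(nowMs: number, ttlMs: number): BubbleRecord[] {
    return Array.from(this.records.values()).filter(
      (r) => r.status === "failed" || (r.status === "open" && nowMs - r.openedAtMs > ttlMs),
    );
  }
}
```

A cleanup job could call `orphans(Date.now(), ttlMs)` periodically, and the same records could feed the transcript layer so that it reflects what the user actually saw. The `overflowed` flag gives a place to correlate overflow events with reports of duplicated content.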
Root cause summary
All four failure modes share one upstream source: shouldAutoStartStreaming fires too early (at deltaCount >= 2) and there is no retraction path once a stream bubble is opened.
Fix hierarchy:
Remove createNormalizingStream from the streaming delivery path (normalization at finalize time only). This eliminates the compound buffering that causes the blank-bubble stall.
Add overflow observability: emit a span when streamOverflowed triggers, and log stream bubble IDs so orphaned bubbles can be detected and cleaned up.
Add a truncation/delivery guarantee: if the continuation post fails, post a short fallback marker rather than silently dropping content.
Thread context sync: investigate whether orphaned stream bubbles and failed turns can be tracked so the transcript reflects actual user-visible state, not just what was successfully posted.
Related
#200 (1. Duplicate / repeated content at chunk boundary)
#187 (2. Cut-off / truncated replies)
#97 (3. Blank message / stream stall)
JUNIOR-1D (RetryableTurnError, active post-0.23.0)
JUNIOR-1G (message_not_in_streaming_state, pre-0.23.0, monitoring for recurrence)
Action taken on behalf of David Cramer.