Skip to content

Extract shared "segment separator" helper to prevent fused assistant text across SDK integrations #319871

@ulugbekna

Description

@ulugbekna

Background

We've now shipped the same fix to the same bug pattern twice:

In both cases the symptom in the Agents window chat was the same: consecutive assistant text items fused into one run-on paragraph (e.g. "...wiring:Now add..." instead of two paragraphs).

The Claude Code SDK path (extensions/copilot/src/extension/chatSessions/claude/common/claudeMessageDispatch.ts) likely has the same latent multiple text content blocks in one SDKAssistantMessage (or consecutive messages in one turn) are forwarded via stream.markdown(item.text) with no but no user-visible report has come in yet.separator bug

Why this keeps happening

  1. Impedance mismatch. Every model SDK models a turn as discrete items (content_blocks, output_index-tagged events, messageId-tagged messages). vscode.ChatResponseStream.markdown(...) is a purely concatenative sink. The natural way to write the handler (for (item) stream.markdown(item.text)) silently fuses items.
  2. Hand-rolled each time. No shared helper exists, so each new integration reinvents the same ~5-line state or forgets to.machine
  3. Hides in dev testing. Only surfaces when the model emits multiple text items in one turn; short answers don't trigger it. There is no invariant test that would catch the omission at PR review or CI time.

Proposal

Extract a tiny, sink-agnostic segment-boundary helper, then refactor both existing call sites onto it. Sketch:

// e.g. extensions/copilot/src/util/common/markdownSegmentSeparator.ts
export class MarkdownSegmentSeparator {
    private lastSegmentKey: string | number | undefined;
    constructor(private readonly emitSeparator: () => void) { }

    /** Call before emitting text from `segmentKey`. Emits `\n\n` if the
     *  segment has changed and this is not the first emission. Defined-on-both
     *  keys are compared; otherwise no separator is emitted (legacy fallback). */
    onSegment(segmentKey: string | number | undefined): void {
        if (
            segmentKey !== undefined &&
            this.lastSegmentKey !== undefined &&
            segmentKey !== this.lastSegmentKey
        ) {
            this.emitSeparator();
        }
        if (segmentKey !== undefined) {
            this.lastSegmentKey = segmentKey;
        }
    }

    /** Call at request/turn boundaries to avoid spurious separators on the next turn. */
    reset(): void { this.lastSegmentKey = undefined; }
}

Call sites become:

// Copilot CLI (assistant.message_delta / assistant.message)
const sep = new MarkdownSegmentSeparator(() => requestStream?.markdown('\n\n'));
// ... in handler:
sep.onSegment(event.data.messageId);
requestStream?.markdown(event.data.deltaContent);

// OpenAI Responses API (response.output_text.delta)
const sep = new MarkdownSegmentSeparator(() => onProgress({ text: '\n\n' }));
// ... in handler:
sep.onSegment(capiChunk.output_index);
return onProgress({ text: capiChunk.delta, ... });

The value isn't saving 5 lines per it's:site

  • Naming the operation so the intent (onSegment(...)) is visible at every call site, making a missing call easier to spot during review.
  • One place for the invariant test that locks in the behavior so the pattern can't regress silently.
  • Lower friction for the next SDK integration to do the right thing by default.

Scope of work

  1. Add MarkdownSegmentSeparator (location probably extensions/copilot/src/util/common/) with focused unit tests covering: first emission (no separator), same key repeated (no separator), key change (separator), undefined-key legacy fallback (no separator), and reset().TBD
  2. Refactor responsesApi.ts OpenAIResponsesProcessor to use the helper (replace lastTextDeltaOutputIndex).
  3. Refactor copilotcliSession.ts to use the helper (replace lastEmittedAssistantMessageId + maybeEmitMessageSeparator).
  4. Audit and fix the Claude Code SDK path (claudeMessageDispatch.ts handleAssistantMessage) using the same helper, keyed on <message.uuid>:<contentIndex>. This is the third site likely affected; fixing it preemptively is cheap once the helper exists.
  5. Migrate the existing tests in responsesApi.spec.ts and copilotcliSession.spec.ts to remain passing; add a Claude Code SDK test.

Non-goals

  • Changing the separator string (\n\n is correct and markdown-safe between blocks).
  • Touching the chunkMessageIds/assistantMessageChunks dedup logic in copilotcliSession. that's pre-existing and out of scope.ts
  • Auto-detecting fusion at the chat-rendering layer (heuristic would break legitimate intra-word streaming like "writ" + "ing").

Acceptance

  • All three integrations use MarkdownSegmentSeparator.
  • Existing tests still pass; new tests assert no regression on each site.
  • A future new SDK integration that forgets to call onSegment(...) is caught either by review (intent is now explicit) or by a written test pattern this issue establishes.

References

Metadata

Metadata

Assignees

Labels

copilot-cli-agentBackground Agent related features/bugsdebtCode quality issues

Type

No type
No fields configured for issues without a type.

Projects

No projects

Relationships

None yet

Development

No branches or pull requests

Issue actions