Skip to content

Extract shared MarkdownSegmentSeparator helper (fixes #319871)#319923

Closed
ulugbekna wants to merge 4 commits into
mainfrom
agents/please-fix-https-github-com-microsoft-vscode-iss-f08ea8e8
Closed

Extract shared MarkdownSegmentSeparator helper (fixes #319871)#319923
ulugbekna wants to merge 4 commits into
mainfrom
agents/please-fix-https-github-com-microsoft-vscode-iss-f08ea8e8

Conversation

@ulugbekna
Copy link
Copy Markdown
Contributor

Fixes #319871.

Why

The same fused-assistant-text bug has now shipped first in the OpenAI Responses API integration (fixed in #312173) and then again in the Copilot CLI integration (fixed in #319727). In both cases the natural way to write the handlertwice

for (const item of message.content) {
    stream.markdown(item.text);
}

silently fuses adjacent text items into a single run-on paragraph (e.g. "...wiring:Now add..."). Each fix has been a hand-rolled ~5-line state machine sprinkled into the call site, which means the next SDK integration is almost guaranteed to reintroduce the same bug. Spot check: the Claude Code SDK integration was missing the separator entirely.

What

Extract a tiny, sink-agnostic MarkdownSegmentSeparator helper that makes the segment-boundary an explicit operation at every call site, and refactor all three integrations onto it.

const sep = new MarkdownSegmentSeparator(() => stream.markdown('\n\n'));
// for each text fragment from logical segment `segmentKey`:
sep.onSegment(segmentKey);
stream.markdown(text);

Semantics:

  • First emission: no separator.
  • Same key as previous: no separator.
  • Different key from previous: emit separator.
  • undefined key: no-op (preserves legacy fallback when an SDK event has no id).
  • reset() for turn/request boundaries.

Call sites

Integration File Segment key
OpenAI Responses API responsesApi.ts (OpenAIResponsesProcessor) output_index from response.output_text.delta
Copilot CLI copilotcliSession.ts messageId from assistant.message_delta / assistant.message
Claude Code SDK claudeMessageDispatch.ts (handleAssistantMessage) `` `${message.uuid}:${ covers both multi-block messages within a turn and consecutive messages

For Claude the helper is stateful across handleAssistantMessage invocations, so it lives on MessageHandlerState and is constructed once per _processMessages loop in ClaudeCodeSession.

The Copilot CLI change re-lands the #319727 fix on this branch via the new helper.

Tests

  • 7 focused unit tests for MarkdownSegmentSeparator (first emission, repeated key, key change, multi-transition, numeric keys, undefined no-op, reset()).
  • New "assistant text segment separation" suite in claudeMessageDispatch.spec.ts (within-message, cross-message, no leading separator).
    no separator, mixed message_delta / message, first emission no separator).
  • The existing OpenAI Responses "phase commentary followed by phase final_answer" test continues to pass against the refactored processor.

Verification

  • npx tsgo --noEmit --project tsconfig. cleanjson
  • ESLint on touched files (--max-warnings= clean0)
  • Pre-commit cleanhygiene
  • 222 tests in the directly-affected spec files pass; a broader run over util/common/test/, chatSessions/claude/, chatSessions/copilotcli/, platform/endpoint/node/test/ shows 1609 passing / 4 skipped / 0 failing.

The same fused-assistant-text bug had shipped twice (OpenAI Responses
API in #312173, Copilot CLI in #319727). Each fix was a hand-rolled
~5-line state machine, and the natural way to write the next SDK
integration ('for (item) stream.markdown(item.text)') silently fuses
distinct text items into a single run-on paragraph
(e.g. '...wiring:Now add...').

Extract a tiny, sink-agnostic 'MarkdownSegmentSeparator' helper that
names the operation explicitly at each call site, then refactor all
three integrations onto it:

- OpenAI Responses API: replaces lastTextDeltaOutputIndex bookkeeping
  in OpenAIResponsesProcessor; keyed on output_index.
- Copilot CLI: applies the helper to assistant.message_delta /
  assistant.message handlers in CopilotCLISession; keyed on
  messageId. This is the same fix as #319727, re-landed via the helper
  in this branch.
- Claude Code SDK: applies the helper to handleAssistantMessage in
  claudeMessageDispatch; keyed on '<message.uuid>:<contentIndex>' so
  both multi-block messages and consecutive messages in a turn are
  covered. State lives on MessageHandlerState and is constructed once
  per _processMessages loop in ClaudeCodeSession.

Tests:
- 7 focused unit tests for the helper (first emission, repeated key,
  key change, undefined-key legacy fallback, numeric keys, reset()).
- New 'assistant text segment separation' suite in
  claudeMessageDispatch.spec.ts (within-message and cross-message
  separators, no leading separator).
- New 'assistant message text separation' suite in
  copilotcliSession.spec.ts mirroring the patterns from #319727.
- responsesApi.spec.ts 'phase commentary followed by phase
  final_answer' continues to pass against the refactored processor.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Copilot AI review requested due to automatic review settings June 4, 2026 13:57
…-vscode-iss-f08ea8e8

Resolve conflicts in copilotcli session + spec: keep MarkdownSegmentSeparator
helper version (functionally equivalent to #319727 inline impl). Kept the new
'legacy undefined messageId' test from  passes against the helpermain
because onSegment(undefined) is a no-op.
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR extracts a small shared MarkdownSegmentSeparator utility into the Copilot extension and refactors multiple SDK integrations (OpenAI Responses API, Copilot CLI, Claude Code SDK) to explicitly insert \n\n between logical assistant text segments, preventing “fused” run-on output in the chat stream. It also adds focused unit tests for the helper and regression tests for affected integrations.

Changes:

  • Added a sink-agnostic MarkdownSegmentSeparator helper with unit test coverage.
  • Refactored OpenAI Responses API and Copilot CLI streaming handlers to use the shared separator.
  • Refactored Claude Code SDK assistant message dispatch to insert separators between content blocks / consecutive messages and added targeted tests.
Show a summary per file
File Description
extensions/copilot/src/util/common/markdownSegmentSeparator.ts Introduces the shared segment-separation helper.
extensions/copilot/src/util/common/test/markdownSegmentSeparator.spec.ts Adds unit tests covering segment transition semantics and reset().
extensions/copilot/src/platform/endpoint/node/responsesApi.ts Switches Responses API text delta handling to use the shared separator keyed by output_index.
extensions/copilot/src/extension/chatSessions/copilotcli/node/copilotcliSession.ts Uses the shared separator keyed by messageId to prevent fused assistant output in the stream.
extensions/copilot/src/extension/chatSessions/copilotcli/node/test/copilotcliSession.spec.ts Adds regression tests for message separation behavior in CLI integration.
extensions/copilot/src/extension/chatSessions/claude/node/claudeCodeAgent.ts Wires a shared separator instance into Claude’s per-message dispatch state.
extensions/copilot/src/extension/chatSessions/claude/common/claudeMessageDispatch.ts Inserts separator calls before emitting assistant text blocks (keyed by <uuid>:<contentIndex>).
extensions/copilot/src/extension/chatSessions/claude/common/test/claudeMessageDispatch.spec.ts Adds regression tests ensuring no fused text within/between Claude assistant messages.

Copilot's findings

  • Files reviewed: 7/7 changed files
  • Comments generated: 3

Comment on lines +636 to 640
const assistantTextSegmentSeparator = new MarkdownSegmentSeparator(() => {
const currentRequest = this._currentRequest;
currentRequest?.stream.markdown('\n\n');
});
try {
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good addressed in 1419b7d. _processMessages now calls assistantTextSegmentSeparator.reset() on requestComplete, alongside the existing subagentTraceContexts.clear(). Regression coverage: dispatch-level contract test in claudeMessageDispatch.spec.ts and an integration test in claudeCodeAgent.spec.ts that spies on MarkdownSegmentSeparator.prototype.reset (confirmed to fail without the fix).

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good addressed in 1419b7d. _processMessages now calls assistantTextSegmentSeparator.reset() on requestComplete, alongside the existing subagentTraceContexts.clear(). Regression coverage: a dispatch-level contract test in claudeMessageDispatch.spec.ts and an integration test in claudeCodeAgent.spec.ts that spies on MarkdownSegmentSeparator.prototype.reset (confirmed to fail without the fix).catch

// phases). Without this the markdown chunks fuse into a run-on
// paragraph like `"...wiring:Now add..."`. Streaming deltas that
// share a `messageId` are unaffected.
const assistantTextSegmentSeparator = new MarkdownSegmentSeparator(() => requestStream?.markdown('\n\n'));
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good fixed in c09163b. The separator callback now writes to both requestStream?.markdown('\n\n') and assistantMessageChunks.push('\n\n'), so logs and UI agree. Regression test in copilotcliSession.spec.ts captures requestLogger.addEntry calls and asserts the completion log's markdown contains First.\n\nSecond..point

Comment on lines 1444 to 1448
assistantMessageChunks.push(event.data.content);
flushPendingInvocationMessages();
maybeEmitMessageSeparator(event.data.messageId);
assistantTextSegmentSeparator.onSegment(event.data.messageId);
wroteResponseContent = true;
requestStream?.markdown(event.data.content);
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good fixed in the same commit (c09163b). Reordered the assistant.message handler so onSegment(...) runs before assistantMessageChunks.push(content), matching the order already used in assistant.message_delta. With the chunks-mirroring fix above, this keeps the separator in the correct position: [..., '\n\n', content] rather than [..., content, '\n\n'].catch

ulugbekna and others added 2 commits June 4, 2026 16:10
…ages

The separator is constructed once per session-long _processMessages
loop. Without resetting it on requestComplete, lastSegmentKey persists
from one turn into the next, and the first text emission of turn N+1
 against a fresh  would be prefixed with aChatResponseStream
spurious '\n\n'.

Regression tests:
- common/test/claudeMessageDispatch.spec.ts: dispatch-level contract
  test verifying that calling .reset() between two turns (distinct
  message.uuids) keeps the next turn's first markdown call clean.
- node/test/claudeCodeAgent.spec.ts: integration test that spies on
  MarkdownSegmentSeparator.prototype.reset and verifies _processMessages
  actually calls it on request completion. Confirmed to fail without
  the fix.

Reported by code-review subagent on PR #319923.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…sistant.message

Addresses two Copilot reviewer comments on PR #319923:

1. The separator was only being written to the request stream, not to
   `assistantMessageChunks`. Since the joined chunks are passed to
   `_logConversation` as the 'Assistant Response', the log showed fused
   output (`"First.Second."`) while the UI stream showed the separator
   (`"First.\n\nSecond."`). The separator callback now writes to both.

2. In the `assistant.message` handler, `onSegment(...)` ran AFTER
   `assistantMessageChunks.push(content)`, which (now that the separator
   also feeds the chunks) would produce `[content, '\n\n']` in the log
   rather than `['\n\n', content]`. Reordered so `onSegment` runs first,
   matching the order already used in `assistant.message_delta`.

Regression test added in `copilotcliSession.spec.ts` that captures
`requestLogger.addEntry` calls and asserts the final completion log's
markdown contains `'First.\n\nSecond.'`. Verified to fail without the
fix and pass with it.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@ulugbekna ulugbekna closed this Jun 4, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Extract shared "segment separator" helper to prevent fused assistant text across SDK integrations

2 participants