fix(translate): route OpenAI-compat reasoning to Anthropic thinking blocks #237

Merged
steventohme merged 1 commit into main from steven/fix-qwen-reasoning-leak on May 23, 2026

@steventohme
Collaborator

Problem

Qwen models routed through OpenRouter (and DeepSeek native) emit reasoning traces in delta.reasoning / delta.reasoning_content. The AnthropicSSETranslator only read delta.content, so reasoning was either silently dropped or, when the model embedded its planning narration inline, bled into the visible text channel. Observed symptom: Claude Code clients saw repeated "Let me check..." narration mixed into the answer instead of a clean response.
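To make the two field names concrete, here is a minimal sketch of decoding an OpenAI-compatible streaming delta and picking out its reasoning text. The `chatDelta` type and the `parseDelta` and `reasoningText` helpers are hypothetical names for illustration, not code from this PR; only the JSON field names (`content`, `reasoning`, `reasoning_content`) come from the description above.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// chatDelta is a hypothetical subset of an OpenAI-compatible streaming delta.
// Only the three fields discussed in this PR are modeled.
type chatDelta struct {
	Content          string `json:"content"`
	Reasoning        string `json:"reasoning"`         // OpenRouter normalized
	ReasoningContent string `json:"reasoning_content"` // DeepSeek/Qwen native
}

// parseDelta decodes one raw JSON delta object.
func parseDelta(raw string) (chatDelta, error) {
	var d chatDelta
	err := json.Unmarshal([]byte(raw), &d)
	return d, err
}

// reasoningText returns the reasoning carried by a delta, checking
// reasoning_content first and falling back to reasoning.
func reasoningText(d chatDelta) string {
	if d.ReasoningContent != "" {
		return d.ReasoningContent
	}
	return d.Reasoning
}

func main() {
	samples := []string{
		`{"reasoning":"Let me check the file."}`,
		`{"reasoning_content":"Let me check the file."}`,
		`{"content":"Here is the answer."}`,
	}
	for _, raw := range samples {
		d, err := parseDelta(raw)
		if err != nil {
			panic(err)
		}
		fmt.Printf("reasoning=%q content=%q\n", reasoningText(d), d.Content)
	}
}
```

A translator that reads only `Content` would see an empty string for the first two samples, which is the "silently dropped" case described above.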

Fix

internal/translate/stream.go

  • New state field thinkingOpen plus emitContentBlockStartThinking / emitContentBlockDeltaThinking helpers.
  • emitDelta reads reasoning_content (DeepSeek/Qwen native) then reasoning (OpenRouter normalized) and emits Anthropic thinking content blocks. The block-open state machine closes thinking before opening text / tool_use (and vice versa), with finishStream closing any still-open thinking block.
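The block-open state machine described above can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the code in stream.go: the `translator`, `event`, `ensureOpen`, and `closeOpen` names are hypothetical, events are collected in a slice rather than written as SSE, and tool_use blocks are left out. The event type strings (`content_block_start`, `content_block_delta`, `content_block_stop`) follow Anthropic's streaming event names.

```go
package main

import "fmt"

// blockKind identifies which content block, if any, is currently open.
type blockKind string

const (
	blockNone     blockKind = ""
	blockThinking blockKind = "thinking"
	blockText     blockKind = "text"
)

// event is a simplified stand-in for one Anthropic SSE event.
type event struct {
	Type  string
	Index int
	Kind  blockKind
	Text  string
}

// translator is a hypothetical, stripped-down streaming translator.
type translator struct {
	open      blockKind // kind of the currently open block
	index     int       // index of the currently open block
	nextIndex int       // index to assign to the next opened block
	events    []event
}

// closeOpen emits content_block_stop for the open block, if there is one.
func (t *translator) closeOpen() {
	if t.open == blockNone {
		return
	}
	t.events = append(t.events, event{Type: "content_block_stop", Index: t.index, Kind: t.open})
	t.open = blockNone
}

// ensureOpen closes a block of a different kind before opening the requested one.
func (t *translator) ensureOpen(kind blockKind) {
	if t.open == kind {
		return
	}
	t.closeOpen()
	t.index = t.nextIndex
	t.nextIndex++
	t.open = kind
	t.events = append(t.events, event{Type: "content_block_start", Index: t.index, Kind: kind})
}

// emitDelta routes reasoning to a thinking block and content to a text block.
func (t *translator) emitDelta(reasoning, content string) {
	if reasoning != "" {
		t.ensureOpen(blockThinking)
		t.events = append(t.events, event{Type: "content_block_delta", Index: t.index, Kind: blockThinking, Text: reasoning})
	}
	if content != "" {
		t.ensureOpen(blockText)
		t.events = append(t.events, event{Type: "content_block_delta", Index: t.index, Kind: blockText, Text: content})
	}
}

// finishStream closes whatever block is still open at end of stream.
func (t *translator) finishStream() { t.closeOpen() }

func main() {
	t := &translator{}
	t.emitDelta("Let me check the config.", "")
	t.emitDelta("", "The config is valid.")
	t.finishStream()
	for _, e := range t.events {
		fmt.Printf("%-19s index=%d kind=%-8s %q\n", e.Type, e.Index, e.Kind, e.Text)
	}
}
```

Tracking a single "currently open" kind keeps the invariant that at most one block is open at a time, so reasoning text cannot land inside a text block.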

internal/translate/translate.go

  • writeAnthropicContentFromOpenAI emits a thinking block before the text block when the buffered (non-streaming) response carries reasoning / reasoning_content.
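For the buffered path, the ordering rule can be illustrated with a small sketch. The `openAIMessage` and `contentBlock` types and the `contentBlocks` function are hypothetical and are not the actual writeAnthropicContentFromOpenAI implementation; the sketch also omits tool calls and any signature field that a real thinking block may carry.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// openAIMessage is a hypothetical subset of a buffered OpenAI-compatible message.
type openAIMessage struct {
	Content          string `json:"content"`
	Reasoning        string `json:"reasoning"`
	ReasoningContent string `json:"reasoning_content"`
}

// contentBlock is a simplified Anthropic-style content block.
type contentBlock struct {
	Type     string `json:"type"`
	Thinking string `json:"thinking,omitempty"`
	Text     string `json:"text,omitempty"`
}

// contentBlocks places a thinking block (when reasoning is present)
// ahead of the text block.
func contentBlocks(m openAIMessage) []contentBlock {
	var blocks []contentBlock
	reasoning := m.ReasoningContent
	if reasoning == "" {
		reasoning = m.Reasoning
	}
	if reasoning != "" {
		blocks = append(blocks, contentBlock{Type: "thinking", Thinking: reasoning})
	}
	if m.Content != "" {
		blocks = append(blocks, contentBlock{Type: "text", Text: m.Content})
	}
	return blocks
}

func main() {
	m := openAIMessage{
		Reasoning: "Let me check the config.",
		Content:   "The config is valid.",
	}
	out, err := json.MarshalIndent(contentBlocks(m), "", "  ")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```

A message without reasoning yields only the text block, so responses from models that never emit reasoning are unchanged in this sketch.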

Tests

Added in translate_test.go:

  • TestAnthropicSSETranslator_StreamingReasoningEmitsThinkingBlock: reasoning deltas followed by content produce a thinking block then a text block in that order; reasoning prose is asserted to NOT appear in the text channel.
  • TestAnthropicSSETranslator_StreamingReasoningContentField: reasoning_content (DeepSeek convention) maps to the same thinking block.
  • TestAnthropicSSETranslator_NonStreamingReasoningEmitsThinkingBlock: buffered responses emit a thinking block before the text block.

go test ./internal/translate/... and go test ./internal/proxy/... both pass.

Test plan

  • CI green
  • Manual check: route a Qwen3 (or DeepSeek-V4-Pro) turn via the router and confirm Claude Code renders reasoning in its thinking pane rather than as inline assistant text.

…locks

Qwen models routed via OpenRouter (and DeepSeek native) emit reasoning
in delta.reasoning / delta.reasoning_content. The AnthropicSSE
translator only read delta.content, so reasoning was either silently
dropped or, when the model embedded its planning narration inline, bled
into the visible text channel.

Convert both reasoning fields into Anthropic thinking content blocks on
the streaming path (emit*Thinking helpers plus an open/close state machine
that interleaves with text and tool_use blocks) and the non-streaming
path (writeAnthropicContentFromOpenAI emits a thinking block before the
text block).

Tests cover: reasoning-then-text ordering on streaming, reasoning_content
parity with reasoning, and non-streaming reasoning emission.
@steventohme merged commit 4a66d50 into main on May 23, 2026
7 checks passed
@steventohme deleted the steven/fix-qwen-reasoning-leak branch on May 23, 2026 00:49