fix(translate): route OpenAI-compat reasoning to Anthropic thinking blocks #237
Merged
…locks Qwen models routed via OpenRouter (and DeepSeek native) emit reasoning in delta.reasoning / delta.reasoning_content. The AnthropicSSE translator only read delta.content, so reasoning was either silently dropped or, when the model embedded its planning narration inline, bled into the visible text channel. Convert both reasoning fields into Anthropic thinking content blocks on the streaming path (emit_*Thinking helpers + open/close state machine that interleaves with text and tool_use blocks) and the non-streaming path (writeAnthropicContentFromOpenAI emits a thinking block before the text block). Tests cover: reasoning-then-text ordering on streaming, reasoning_content parity with reasoning, and non-streaming reasoning emission.
Problem
Qwen models routed through OpenRouter (and DeepSeek native) emit reasoning traces in `delta.reasoning` / `delta.reasoning_content`. The `AnthropicSSETranslator` only read `delta.content`, so reasoning was either silently dropped or, when the model embedded its planning narration inline, bled into the visible text channel. Observed symptom: Claude Code clients saw repeated "Let me check..." narration mixed into the answer instead of a clean response.

Fix
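A minimal sketch of the streaming side of this change, under simplifying assumptions. The `translator` type, `reasoningText`, `closeOpen`, and the event label strings are hypothetical stand-ins rather than the code in `stream.go`. Treating `reasoning_content` as preferred with `reasoning` as the fallback is one reading of the description, and tool_use handling is omitted. The sketch only shows the idea of closing an open thinking block before a text block starts:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// delta mirrors the OpenAI-compatible streaming fields named in this PR.
type delta struct {
	Content          string `json:"content"`
	Reasoning        string `json:"reasoning"`         // OpenRouter normalized
	ReasoningContent string `json:"reasoning_content"` // DeepSeek/Qwen native
}

// reasoningText returns reasoning_content when present, else reasoning.
// The fallback order is an assumption drawn from the PR description.
func reasoningText(d delta) string {
	if d.ReasoningContent != "" {
		return d.ReasoningContent
	}
	return d.Reasoning
}

// translator is a hypothetical stand-in for AnthropicSSETranslator. It records
// simplified event labels instead of writing real SSE frames.
type translator struct {
	thinkingOpen bool
	textOpen     bool
	index        int
	events       []string
}

// closeOpen stops whichever block is currently open and advances the index.
func (t *translator) closeOpen() {
	if t.thinkingOpen || t.textOpen {
		t.events = append(t.events, fmt.Sprintf("content_block_stop[%d]", t.index))
		t.index++
		t.thinkingOpen, t.textOpen = false, false
	}
}

// emitDelta routes reasoning to a thinking block and content to a text block,
// closing the other kind of block first so the two never overlap.
func (t *translator) emitDelta(d delta) {
	if r := reasoningText(d); r != "" {
		if !t.thinkingOpen {
			t.closeOpen()
			t.events = append(t.events, fmt.Sprintf("content_block_start[%d]:thinking", t.index))
			t.thinkingOpen = true
		}
		t.events = append(t.events, fmt.Sprintf("thinking_delta[%d]:%s", t.index, r))
	}
	if d.Content != "" {
		if !t.textOpen {
			t.closeOpen()
			t.events = append(t.events, fmt.Sprintf("content_block_start[%d]:text", t.index))
			t.textOpen = true
		}
		t.events = append(t.events, fmt.Sprintf("text_delta[%d]:%s", t.index, d.Content))
	}
}

// translate feeds JSON-encoded deltas through the state machine and closes
// any block still open at the end of the stream.
func translate(chunks []string) ([]string, error) {
	t := &translator{}
	for _, c := range chunks {
		var d delta
		if err := json.Unmarshal([]byte(c), &d); err != nil {
			return nil, err
		}
		t.emitDelta(d)
	}
	t.closeOpen()
	return t.events, nil
}

func main() {
	events, err := translate([]string{
		`{"reasoning":"Let me check..."}`,
		`{"reasoning_content":" the config."}`,
		`{"content":"Done."}`,
	})
	if err != nil {
		panic(err)
	}
	for _, e := range events {
		fmt.Println(e)
	}
}
```

In this sketch the reasoning text only ever appears in `thinking_delta` labels, which is the property the streaming tests in this PR assert for the real translator.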
- `internal/translate/stream.go`: `thinkingOpen` plus the `emitContentBlockStartThinking` / `emitContentBlockDeltaThinking` helpers. `emitDelta` reads `reasoning_content` (DeepSeek/Qwen native), then `reasoning` (OpenRouter normalized), and emits Anthropic `thinking` content blocks. The block-open state machine closes thinking before opening text / tool_use (and vice versa), with `finishStream` closing any still-open thinking block.
- `internal/translate/translate.go`: `writeAnthropicContentFromOpenAI` emits a `thinking` block before the text block when the buffered (non-streaming) response carries `reasoning` / `reasoning_content`.

Tests
Added in `translate_test.go`:
- `TestAnthropicSSETranslator_StreamingReasoningEmitsThinkingBlock`: `reasoning` deltas followed by `content` produce a thinking block and then a text block, in that order; reasoning prose is asserted NOT to appear in the text channel.
- `TestAnthropicSSETranslator_StreamingReasoningContentField`: `reasoning_content` (DeepSeek convention) maps to the same thinking block.
- `TestAnthropicSSETranslator_NonStreamingReasoningEmitsThinkingBlock`: buffered responses emit a thinking block before the text block.

`go test ./internal/translate/...` and `go test ./internal/proxy/...` both pass.

Test plan
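The ordering the non-streaming test checks can be illustrated with a small self-contained sketch. The `message` and `block` types and the `contentBlocks` function are hypothetical and do not reproduce `writeAnthropicContentFromOpenAI`. The `signature` field that Anthropic thinking blocks can carry is left out, and preferring `reasoning_content` over `reasoning` is an assumption:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// message holds the fields of a buffered OpenAI-compatible response that
// matter for this sketch.
type message struct {
	Content          string `json:"content"`
	Reasoning        string `json:"reasoning"`
	ReasoningContent string `json:"reasoning_content"`
}

// block is a simplified Anthropic content block (thinking or text).
type block struct {
	Type     string `json:"type"`
	Thinking string `json:"thinking,omitempty"`
	Text     string `json:"text,omitempty"`
}

// contentBlocks emits a thinking block first (if any reasoning is present),
// then the text block, so reasoning never lands in the text channel.
func contentBlocks(m message) []block {
	var blocks []block
	reasoning := m.ReasoningContent
	if reasoning == "" {
		reasoning = m.Reasoning // assumed fallback order
	}
	if reasoning != "" {
		blocks = append(blocks, block{Type: "thinking", Thinking: reasoning})
	}
	if m.Content != "" {
		blocks = append(blocks, block{Type: "text", Text: m.Content})
	}
	return blocks
}

func main() {
	m := message{Reasoning: "Let me check...", Content: "Done."}
	out, err := json.Marshal(contentBlocks(m))
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```

A check along these lines would assert that the first block has type `thinking`, the second has type `text`, and the text block does not contain the reasoning prose.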