fix: pass reasoning_content back in thinking mode to avoid HTTP 400 #324
Conversation
For thinking mode models (e.g., Kimi-thinking-preview, QwQ), the API requires the `reasoning_content` to be passed back in subsequent requests. Without this, the API returns `invalid_request_error: invalid_request_error`.

Changes:
- Add `ReasoningContent` field to `SerializedChatMessage` for snapshot persistence
- Capture `reasoning_content` during streaming updates
- Restore `reasoning_content` when deserializing history for API calls
- Use reflection to access OpenAI SDK internal properties

This ensures multi-turn conversations with thinking mode models work correctly.
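The shape of the change can be sketched as follows. This is a minimal sketch: `SerializedChatMessage` and the helper name `GetStreamingReasoningContent` come from this PR, while the surrounding members and the commented streaming loop are illustrative.

```csharp
using System.Text;

// Snapshot model: the new optional field preserves thinking-mode reasoning
// so it can be echoed back to the provider on later requests.
public class SerializedChatMessage {
    public string Role { get; set; } = "";
    public string? Content { get; set; }
    public string? ReasoningContent { get; set; }  // new in this PR
}

// Capture during streaming (illustrative; GetStreamingReasoningContent is
// the PR's reflection helper and returns null when the SDK update type
// exposes no reasoning property):
// var reasoningContentBuilder = new StringBuilder();
// await foreach (var update in chatClient.CompleteChatStreamingAsync(messages)) {
//     var chunk = GetStreamingReasoningContent(update);
//     if (chunk != null) reasoningContentBuilder.Append(chunk);
// }
```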
📝 Walkthrough

The changes extend the serialized chat message model with a new optional field to store LLM thinking-mode reasoning content. The OpenAI service is updated to capture reasoning from streaming responses and propagate it through serialization and deserialization layers using reflection helpers.

Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~20 minutes

🚥 Pre-merge checks: ✅ Passed checks (5 passed)
Actionable comments posted: 2
Caution: Some comments are outside the diff and can't be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
TelegramSearchBot.LLM/Service/AI/LLM/OpenAIService.cs (2)
1371-1463: ⚠️ Potential issue | 🔴 Critical

Reflection-based round-trip won't fix the HTTP 400 — OpenAI SDK 2.10.0 does not expose these properties.

The official OpenAI .NET SDK (version 2.10.0, as pinned in this repo) does not define public `Reasoning` or `ReasoningContentUpdate` properties on `AssistantChatMessage` or `StreamingChatCompletionUpdate`. As a result, this entire reflection-based approach is non-functional:

- `GetStreamingReasoningContent` will always return `null` since the properties don't exist, so `reasoningContentBuilder` never accumulates anything.
- `SetAssistantReasoningContent` silently fails when the property check returns false. Even if a property of that name were found, the SDK's request serializer won't emit `reasoning_content` in outgoing JSON (it's not part of the Chat Completions API request schema).

The HTTP 400 in thinking-mode multi-turn calls remains unfixed. For providers requiring `reasoning_content` to be echoed back, use one of:

- A custom `PipelinePolicy` on `OpenAIClientOptions.Transport` to mutate outgoing request JSON and inject `reasoning_content` onto assistant messages before send.
- Calling the endpoint directly with `HttpClient` for these models instead of using `ChatClient`.

Verify end-to-end (not just "build passes") on a thinking-mode model that:

- Streaming reasoning is captured (`reasoningContent` non-empty at Line 1016).
- The next request includes `reasoning_content` in the assistant message JSON.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@TelegramSearchBot.LLM/Service/AI/LLM/OpenAIService.cs` around lines 1371 - 1463, The reflection-based getters/setters (GetAssistantReasoningContent, GetStreamingReasoningContent, SetAssistantReasoningContent and the DeserializeProviderHistory usage) won’t produce or serialize reasoning_content with OpenAI .NET SDK v2.10.0; replace this approach by mutating outgoing request JSON (injecting reasoning_content into assistant messages) via a custom PipelinePolicy on OpenAIClientOptions.Transport or by sending requests directly with HttpClient to the thinking-mode endpoint, and remove/stop relying on the reflection helpers — then verify end-to-end that streaming reasoning is captured and the subsequent request JSON includes reasoning_content in assistant messages.
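A sketch of the `PipelinePolicy` approach suggested above, assuming System.ClientModel's pipeline primitives. `LookupReasoning` is a hypothetical callback mapping an assistant message's content to its stored reasoning, and the exact `BinaryContent`/`AddPolicy` signatures should be verified against the SDK version pinned in the repo:

```csharp
using System;
using System.ClientModel;
using System.ClientModel.Primitives;
using System.Collections.Generic;
using System.IO;
using System.Text;
using System.Text.Json.Nodes;
using System.Threading;
using System.Threading.Tasks;

// Rewrites outgoing chat.completions bodies so each assistant message
// carries its stored reasoning_content before the request is sent.
public sealed class ReasoningContentInjectionPolicy : PipelinePolicy {
    private readonly Func<string, string?> _lookupReasoning; // hypothetical: content -> reasoning

    public ReasoningContentInjectionPolicy(Func<string, string?> lookupReasoning)
        => _lookupReasoning = lookupReasoning;

    public override void Process(PipelineMessage message,
        IReadOnlyList<PipelinePolicy> pipeline, int currentIndex) {
        InjectReasoning(message);
        ProcessNext(message, pipeline, currentIndex);
    }

    public override async ValueTask ProcessAsync(PipelineMessage message,
        IReadOnlyList<PipelinePolicy> pipeline, int currentIndex) {
        InjectReasoning(message);
        await ProcessNextAsync(message, pipeline, currentIndex).ConfigureAwait(false);
    }

    private void InjectReasoning(PipelineMessage message) {
        if (message.Request?.Content is null) return;
        if (message.Request.Uri?.AbsolutePath.EndsWith("/chat/completions") != true) return;

        using var ms = new MemoryStream();
        message.Request.Content.WriteTo(ms, CancellationToken.None);
        var root = JsonNode.Parse(Encoding.UTF8.GetString(ms.ToArray()));
        if (root?["messages"] is not JsonArray messages) return;

        foreach (var node in messages) {
            if (node is not JsonObject msg || (string?)msg["role"] != "assistant") continue;
            var reasoning = _lookupReasoning((string?)msg["content"] ?? "");
            if (reasoning != null) msg["reasoning_content"] = reasoning;
        }
        message.Request.Content = BinaryContent.Create(BinaryData.FromString(root!.ToJsonString()));
    }
}

// Registration (OpenAIClientOptions derives from ClientPipelineOptions):
// var options = new OpenAIClientOptions();
// options.AddPolicy(new ReasoningContentInjectionPolicy(lookup), PipelinePosition.PerCall);
```

Because the policy runs per call, it covers both the first request and every subsequent multi-turn request without touching the typed `ChatMessage` model.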
1244: ⚠️ Potential issue | 🟠 Major

`ResumeFromSnapshotAsync` doesn't restore tool-call assistant messages from snapshot.

`DeserializeProviderHistory` only reconstructs plain text assistant messages using `new AssistantChatMessage(msg.Content ?? "")`. Snapshots taken mid tool-cycle contain `AssistantChatMessage(chatToolCalls)` entries whose tool-call structure is lost on resume. Tool result messages (`ToolChatMessage` with the "tool" role) are also not serialized or restored. This re-introduces HTTP 400 errors once a snapshot is resumed against a thinking-mode provider that expects the full tool-calling context. `ReasoningContent` is properly handled via `SetAssistantReasoningContent()`, so it survives snapshot restore.

If snapshot resume should support thinking-mode + tool-calling flows, `SerializedChatMessage` needs fields for tool-call structure (tool IDs, names, arguments) and support for "tool" role messages, not just `Role`/`Content`/`ReasoningContent`.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@TelegramSearchBot.LLM/Service/AI/LLM/OpenAIService.cs` at line 1244, ResumeFromSnapshotAsync currently calls DeserializeProviderHistory which recreates only plain AssistantChatMessage instances and drops tool-call structure and ToolChatMessage ("tool" role) entries from snapshots; update SerializedChatMessage to include tool-call fields (tool id/name/args and any serialized tool response) and extend DeserializeProviderHistory to reconstruct AssistantChatMessage instances with tool-call metadata and ToolChatMessage entries so thinking-mode providers get full context; ensure ResumeFromSnapshotAsync uses the new deserialization and preserve ReasoningContent via SetAssistantReasoningContent as before.
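The extended schema described above could look roughly like this. It is a sketch: `SerializedToolCall` and its property names are illustrative, and the `ChatToolCall.CreateFunctionToolCall` / `ToolChatMessage` signatures should be checked against the pinned OpenAI .NET SDK version:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using OpenAI.Chat;

// Illustrative tool-call snapshot record.
public class SerializedToolCall {
    public string Id { get; set; } = "";
    public string FunctionName { get; set; } = "";
    public string FunctionArguments { get; set; } = "";  // raw JSON args
}

public class SerializedChatMessage {
    public string Role { get; set; } = "";               // "user" | "assistant" | "tool"
    public string? Content { get; set; }
    public string? ReasoningContent { get; set; }
    public List<SerializedToolCall>? ToolCalls { get; set; }  // assistant tool-call turns
    public string? ToolCallId { get; set; }                   // "tool" result messages
}

public static class SnapshotRestore {
    // Restore side: rebuild tool-call assistant turns and tool results,
    // not just plain-text assistant messages.
    public static ChatMessage Restore(SerializedChatMessage msg) => msg.Role switch {
        "assistant" when msg.ToolCalls is { Count: > 0 } =>
            new AssistantChatMessage(msg.ToolCalls.Select(tc =>
                ChatToolCall.CreateFunctionToolCall(
                    tc.Id, tc.FunctionName, BinaryData.FromString(tc.FunctionArguments)))),
        "assistant" => new AssistantChatMessage(msg.Content ?? ""),
        "tool" => new ToolChatMessage(msg.ToolCallId!, msg.Content ?? ""),
        _ => new UserChatMessage(msg.Content ?? "")
    };
}
```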
🧹 Nitpick comments (2)
TelegramSearchBot.LLM/Service/AI/LLM/OpenAIService.cs (2)
1375-1419: Cache `PropertyInfo` lookups; called per streaming chunk.

`GetStreamingReasoningContent` is invoked for every `StreamingChatCompletionUpdate` (potentially hundreds of times per response), and each call does two `Type.GetProperty` lookups. Cache the resolved `PropertyInfo` (or a typed `Func<,>`) in `static readonly` fields keyed by `Type` to avoid the repeated reflection cost on the hot streaming path. The same applies to the get/set helpers on `AssistantChatMessage`.

Also applies to: 1454-1463
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@TelegramSearchBot.LLM/Service/AI/LLM/OpenAIService.cs` around lines 1375 - 1419, GetStreamingReasoningContent and GetAssistantReasoningContent are doing Type.GetProperty on every call (hot path) — cache the resolved PropertyInfo or compiled accessors to avoid repeated reflection. Add static readonly ConcurrentDictionary<Type, PropertyInfo?> (or ConcurrentDictionary<Type, Func<object, string?>> for typed getters) for the ReasoningContentUpdate and Reasoning properties and use those caches in GetStreamingReasoningContent and GetAssistantReasoningContent (look up by update.GetType() or assistantMsg.GetType(), retrieve cached PropertyInfo/Func, and invoke it if present). Apply the same caching pattern to the corresponding helper code referenced around the other block (the get/set helpers at the 1454–1463 region) so reflection is resolved once per Type instead of per streaming chunk.
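The suggested caching can be sketched like this. The wrapper class is illustrative; the property names (`ReasoningContentUpdate`, `Reasoning`) are the ones the existing helpers already probe for:

```csharp
using System;
using System.Collections.Concurrent;
using System.Reflection;

public static class ReasoningReflectionCache {
    // Resolve each (type, property) pair once instead of on every
    // streaming chunk; GetProperty returning null is cached too.
    private static readonly ConcurrentDictionary<(Type, string), PropertyInfo?> Props = new();

    public static string? GetReasoning(object instance, string propertyName) {
        var prop = Props.GetOrAdd((instance.GetType(), propertyName),
            key => key.Item1.GetProperty(key.Item2));
        try {
            return prop?.GetValue(instance) as string;
        } catch (Exception ex) {
            System.Diagnostics.Debug.WriteLine($"Reflection read of {propertyName} failed: {ex}");
            return null;
        }
    }
}

// Usage in the streaming hot path (illustrative):
// var chunk = ReasoningReflectionCache.GetReasoning(update, "ReasoningContentUpdate");
```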
1342-1369: Tool-call assistant messages serialize as empty `Content`.

For an `AssistantChatMessage` constructed from `chatToolCalls` (Line 1032), `assistantMsg.Content` is typically empty, so this serializer writes `Content = ""` and drops the `ChatToolCall` list entirely. Combined with the resume-path issue above, snapshots of mid tool-cycle states are lossy: on restore the model sees a no-op assistant turn followed by orphan `tool` messages, which most providers reject with 400 (a `tool` message must follow an assistant message containing `tool_calls`).

This isn't introduced by this PR, but the new `ReasoningContent` plumbing makes the gap more visible (reasoning is preserved, tool calls aren't). Worth scoping a follow-up to extend `SerializedChatMessage` with `ToolCalls`/`ToolCallId` so resume actually round-trips.

Want me to draft the extended `SerializedChatMessage` schema plus matching serialize/deserialize logic and open a follow-up issue?

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@TelegramSearchBot.LLM/Service/AI/LLM/OpenAIService.cs` around lines 1342 - 1369, SerializeProviderHistory currently builds SerializedChatMessage.Content from AssistantChatMessage.Content, but AssistantChatMessage instances created from chatToolCalls have empty Content and instead carry tool call data, so tool calls are lost on serialization; update SerializedChatMessage to include tool call metadata (e.g., ToolCalls and/or ToolCallId) and modify SerializeProviderHistory to extract and populate those fields from AssistantChatMessage (and preserve existing ReasoningContent via GetAssistantReasoningContent) so mid-tool-cycle assistant turns round-trip correctly; ensure the new fields are set when msg is AssistantChatMessage (and include any ChatToolCall list from assistantMsg or related properties) and adjust deserialization to restore assistant messages plus their tool_calls.
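On the serialize side, the fix amounts to capturing the tool-call list alongside `Content`. A sketch, assuming a hypothetical `SerializedToolCall` record and a `ToolCalls` field added to `SerializedChatMessage`; the SDK property names (`ToolCalls`, `FunctionName`, `FunctionArguments`) should be verified against the pinned version:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using OpenAI.Chat;

// Hypothetical snapshot record (names illustrative).
public class SerializedToolCall {
    public string Id { get; set; } = "";
    public string FunctionName { get; set; } = "";
    public string FunctionArguments { get; set; } = "";
}

public static class SnapshotSerialize {
    // Capture tool-call structure instead of writing an empty Content for
    // tool-call assistant turns; SerializedChatMessage is assumed to have
    // gained a ToolCalls field as suggested above.
    public static SerializedChatMessage Serialize(AssistantChatMessage assistantMsg) => new() {
        Role = "assistant",
        Content = assistantMsg.Content.Count > 0 ? assistantMsg.Content[0].Text : "",
        // In the PR this would be GetAssistantReasoningContent(assistantMsg).
        ReasoningContent = null,
        ToolCalls = assistantMsg.ToolCalls?.Count > 0
            ? assistantMsg.ToolCalls.Select(tc => new SerializedToolCall {
                Id = tc.Id,
                FunctionName = tc.FunctionName,
                FunctionArguments = tc.FunctionArguments.ToString()
            }).ToList()
            : null
    };
}
```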
ℹ️ Review info

⚙️ Run configuration
- Configuration used: defaults
- Review profile: CHILL
- Plan: Pro
- Run ID: faa699ed-b425-41fa-8700-a054d6e82283

📒 Files selected for processing (2)
- TelegramSearchBot.Common/Model/AI/LlmContinuationSnapshot.cs
- TelegramSearchBot.LLM/Service/AI/LLM/OpenAIService.cs
```csharp
string responseText = contentBuilder.ToString().Trim();
string reasoningContent = reasoningContentBuilder.ToString().Trim();
```
🧩 Analysis chain
🌐 Web query:
Does Kimi-thinking-preview or QwQ require the reasoning_content echoed back in subsequent chat.completions requests to match exactly (including leading/trailing whitespace) the streamed value, or is whitespace-trimmed content accepted?
💡 Result:
Kimi-thinking-preview (likely referring to Moonshot AI's Kimi K2 Thinking models like kimi-k2-thinking or kimi-k2.6 with thinking enabled) and QwQ (likely QwQ-32B-Preview, a reasoning model comparable to Kimi K2 Thinking) require the reasoning_content echoed back in subsequent chat.completions requests to match exactly the streamed value, including leading/trailing whitespace. Official Moonshot AI/Kimi docs explicitly state to "keep the reasoning_content from every historical assistant message in messages as-is. The simplest way is to append the assistant message returned from the previous API call directly back into messages." No sources indicate whitespace trimming is accepted; "as-is" implies exact preservation is necessary for proper preserved thinking and chain-of-thought continuity in multi-turn conversations. This applies to both official API and compatible implementations like vLLM, where field handling bugs highlight the need for precise input matching.
Citations:
- 1: https://platform.kimi.ai/docs/guide/use-kimi-k2-thinking-model
- 2: https://platform.moonshot.ai/docs/api/chat
- 3: https://platform.kimi.ai/docs/api/overview
- 4: https://anotherwrapper.com/tools/llm-pricing/kimi-k2-thinking/qwq-32b-preview
- 5: https://artificialanalysis.ai/models/comparisons/kimi-k2-5-vs-QwQ-32B-Preview
Remove `Trim()` on `reasoningContent` to preserve provider compatibility.
Kimi-thinking-preview and QwQ require the `reasoning_content` field to be echoed back in subsequent chat.completions requests with exact preservation, including leading/trailing whitespace. Official Moonshot AI documentation explicitly states to "keep the reasoning_content from every historical assistant message in messages as-is." The current `Trim()` operation violates this requirement and may cause validation or continuity issues in multi-turn conversations.
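A minimal sketch of the suggested change, applied to the two lines quoted above:

```csharp
string responseText = contentBuilder.ToString().Trim();
// reasoning_content must be echoed back exactly as streamed; do not trim it.
string reasoningContent = reasoningContentBuilder.ToString();
```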
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@TelegramSearchBot.LLM/Service/AI/LLM/OpenAIService.cs` around lines 1015 -
1016, The code trims reasoningContent which breaks provider compatibility;
change the assignment that uses reasoningContentBuilder.ToString().Trim() so it
uses the raw string (reasoningContentBuilder.ToString()) without calling Trim(),
leaving responseText (responseText = contentBuilder.ToString().Trim())
unchanged; ensure the variable reasoningContent (and any downstream use in
OpenAIService.cs / methods that send chat.completions) preserves
leading/trailing whitespace exactly as produced by reasoningContentBuilder.
```csharp
} catch {
    // Reflection failed, return null
}
return null;
```
Don't silently swallow reflection failures — at least log at Debug.
All three reflection helpers swallow every exception with an empty `catch { }`. Combined with the concern above, that means if the SDK changes property names, or the property exists but isn't writable, or the cast fails, you'll get zero diagnostics — the feature will simply appear not to work in production. Capture the exception and log it at Debug/Trace level (and ideally only once, via a static flag) so this is actually observable.
🛡️ Suggested change (illustrative, applied to `SetAssistantReasoningContent`)

```diff
-private static void SetAssistantReasoningContent(AssistantChatMessage msg, string reasoningContent) {
-    try {
-        var prop = msg.GetType().GetProperty("Reasoning");
-        if (prop != null && prop.CanWrite) {
-            prop.SetValue(msg, reasoningContent);
-        }
-    } catch {
-        // Reflection failed, ignore
-    }
-}
+private static int _reasoningReflectionWarned;
+private static void SetAssistantReasoningContent(AssistantChatMessage msg, string reasoningContent) {
+    try {
+        var prop = msg.GetType().GetProperty("Reasoning");
+        if (prop != null && prop.CanWrite) {
+            prop.SetValue(msg, reasoningContent);
+        } else if (System.Threading.Interlocked.Exchange(ref _reasoningReflectionWarned, 1) == 0) {
+            System.Diagnostics.Debug.WriteLine(
+                "AssistantChatMessage has no writable 'Reasoning' property; reasoning_content round-trip is a no-op.");
+        }
+    } catch (Exception ex) {
+        System.Diagnostics.Debug.WriteLine($"SetAssistantReasoningContent failed: {ex}");
+    }
+}
```

(The same pattern applies to `GetAssistantReasoningContent` and `GetStreamingReasoningContent`. An injected `ILogger` would be even better, but these methods are static.)
Also applies to: 1415-1418, 1460-1462
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@TelegramSearchBot.LLM/Service/AI/LLM/OpenAIService.cs` around lines 1387 -
1390, The three static reflection helpers (SetAssistantReasoningContent,
GetAssistantReasoningContent, GetStreamingReasoningContent) currently swallow
all exceptions; change each empty catch to catch Exception ex and emit a
Debug/Trace-level log containing the exception message/stack (e.g.,
Debug.WriteLine or Trace.TraceInformation/TraceEvent) and make the logging occur
only once by guarding with a static bool flag per method so you don’t flood
logs; keep the methods static (no DI) and ensure the original behavior still
returns null/false after logging.
🔍 PR Check Report

📋 Check overview
🧪 Test results
📊 Code quality
📁 Test artifacts
🔗 Related links

This report was generated automatically by GitHub Actions.