
.Net: Fix: Surface Ollama reasoning ("thinking") content in ChatMessageContent#13888

Open
PrathamAditya wants to merge 3 commits into microsoft:main from PrathamAditya:19April2026

Conversation

@PrathamAditya
Contributor

Summary

Fixes #13860
Ollama responses can include a thinking (reasoning) field, but this information is currently not exposed through ChatMessageContent in Semantic Kernel.

This PR adds support to surface reasoning content by extracting it from the underlying RawRepresentation.


Changes

  • Added a unit test demonstrating that reasoning (thinking) content is currently not propagated.
  • Implemented a minimal fix to extract thinking from RawRepresentation and include it in ChatMessageContent.Items as TextContent.
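In outline, the fix amounts to the following sketch. The Message and Thinking property names, and the surrounding variable names, are assumptions drawn from this PR's description, not a verified OllamaSharp API surface; the extension-method context is elided.

```csharp
// Sketch only: pull Ollama's "thinking" text out of RawRepresentation via
// reflection, so SemanticKernel.Abstractions needs no Ollama type reference.
// Property names (Message, Thinking) are assumptions from the PR description.
object? raw = chatMessage.RawRepresentation;
object? message = raw?.GetType().GetProperty("Message")?.GetValue(raw);
string? thinking = message?.GetType().GetProperty("Thinking")?.GetValue(message) as string;

if (!string.IsNullOrEmpty(thinking))
{
    result.Items.Add(new TextContent($"[THINKING]\n{thinking}"));
}
```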

Behavior

Before:

  • Only the final response (content) is available.

After:

  • Reasoning (thinking) is included alongside the final response.
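Illustratively (the item contents shown are hypothetical):

```csharp
// Hypothetical Items contents for one response:
//
//   Before: [ TextContent("Final answer") ]
//   After:  [ TextContent("[THINKING]\nreasoning..."), TextContent("Final answer") ]
//
// Note that ChatMessageContent.Content returns the text of the first
// TextContent item, so the position of the thinking item is visible to any
// consumer that reads .Content directly.
foreach (TextContent item in result.Items.OfType<TextContent>())
{
    Console.WriteLine(item.Text);
}
```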

Design Notes

  • The fix uses RawRepresentation to avoid introducing a dependency on Ollama-specific types in SemanticKernel.Abstractions.
  • This is intentionally a minimal and non-breaking change.

Limitations / Future Work

  • This highlights a broader gap in the current abstraction:
      • The Microsoft.Extensions.AI model does not currently support reasoning/thinking as a first-class concept.
      • Modern LLMs (Ollama, OpenAI reasoning models, etc.) increasingly return structured reasoning.

Potential improvements:

  • Introduce a dedicated ReasoningContent type
  • Or expose reasoning via standardized metadata
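As a rough illustration of the first option, a dedicated type might look like the sketch below. The name and shape are purely illustrative, not an agreed design.

```csharp
// Purely illustrative shape for a first-class reasoning content type;
// name and members are not an agreed design.
public sealed class ReasoningContent : KernelContent
{
    public ReasoningContent(string text) => this.Text = text;

    /// <summary>The model's reasoning ("thinking") text.</summary>
    public string Text { get; }
}
```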

Happy to align with maintainers on the preferred direction.

Testing

Added unit test: GetChatMessageContent_ShouldIncludeThinking_WhenPresentInResponseAsync
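In outline, the test asserts roughly the following; the arrange step with the mocked Ollama response is elided, and the service/prompt names are placeholders.

```csharp
[Fact]
public async Task GetChatMessageContent_ShouldIncludeThinking_WhenPresentInResponseAsync()
{
    // Arrange: a chat service whose underlying client returns a raw Ollama
    // response carrying Thinking = "This is reasoning content" (setup elided).

    // Act
    var result = await service.GetChatMessageContentAsync("prompt");

    // Assert: the reasoning text is surfaced through the content.
    Assert.Contains("This is reasoning content", result.Content);
}
```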

@PrathamAditya PrathamAditya requested a review from a team as a code owner April 19, 2026 12:33
@moonbox3 moonbox3 added .NET Issue or Pull requests regarding .NET code kernel Issues or pull requests impacting the core kernel labels Apr 19, 2026
@github-actions github-actions bot changed the title Fix: Surface Ollama reasoning ("thinking") content in ChatMessageContent .Net: Fix: Surface Ollama reasoning ("thinking") content in ChatMessageContent Apr 19, 2026

@github-actions github-actions bot left a comment


Automated Code Review

Reviewers: 4 | Confidence: 91%

✗ Correctness

The change adds Ollama thinking/reasoning content extraction via reflection in a shared extension method (ChatMessageExtensions.cs). The main correctness concern is that the thinking TextContent item is added to result.Items before the regular content items (which are added by the subsequent foreach loop). Since ChatMessageContent.Content returns Items.OfType<TextContent>().FirstOrDefault()?.Text (ChatMessageContent.cs line 40), this causes .Content to return the thinking text (e.g. [THINKING]\nThis is reasoning content) instead of the actual assistant response (e.g. Final answer). This silently breaks any consumer that reads .Content to get the model's answer. The test only asserts Contains("This is reasoning content", result.Content) which passes due to this ordering, but doesn't verify the actual response text is still accessible.

✗ Security Reliability

The PR adds Ollama-specific reflection-based thinking extraction to ChatMessageExtensions.cs, a shared internal utility compiled into SemanticKernel.Abstractions via MEAIUtilities.props (confirmed at SemanticKernel.Abstractions.csproj:18). This means the reflection code runs for ALL providers (OpenAI, Gemini, Mistral, Bedrock, etc.) on every call to ToChatMessageContent, not just Ollama. While the null-conditional chain prevents crashes for non-Ollama types, this is fragile: any future provider whose RawRepresentation accidentally exposes 'Message' and 'Thinking' properties would get unexpected content injected into results. The code should include a type guard to ensure it only runs for Ollama's ChatDoneResponseStream type, or this logic should be moved to the Ollama connector layer rather than the shared abstraction.

✗ Test Coverage

The new test verifies that thinking content appears in the response, but has significant coverage gaps. The thinking TextContent is inserted into Items before the regular content, which means result.Content (which returns the first TextContent) now returns thinking text instead of the actual response. The test doesn't verify the Items structure, doesn't check that 'Final answer' is accessible, has no negative test for responses without thinking, no streaming test, and no direct unit test for the shared extension method's thinking extraction logic.

✗ Design Approach

The change extracts Ollama's 'thinking' field from chat responses, but does so in the wrong architectural layer using reflection, and stores it in the wrong content type. ChatMessageExtensions.cs is a shared M.E.AI utility used by all connectors via ChatClientChatCompletionService; embedding Ollama-specific reflection logic there leaks a connector implementation detail into a generic abstraction. More critically, using TextContent("[THINKING]\n..") rather than the existing ReasoningContent type causes ChatMessageContent.Content (which returns the text of the first TextContent item) to surface the thinking string instead of the actual assistant answer whenever thinking precedes regular content in Items.

Flagged Issues

  • Ollama-specific reflection runs unguarded for all providers. ChatMessageExtensions.cs is compiled into SemanticKernel.Abstractions (via MEAIUtilities.props) and called from ChatClientChatCompletionService.cs:71,75 for every IChatClient. The reflection on Message and Thinking properties executes against OpenAI, Gemini, Mistral, Bedrock, etc. on every chat completion call. Without a type guard, any provider whose raw representation coincidentally exposes matching property names would get unexpected content injected. This logic should either check the concrete type before reflecting or be moved to the Ollama connector layer.
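One way to add the flagged type guard is sketched below. The namespace check against OllamaSharp is an assumption; the exact type or namespace should be confirmed against the connector before relying on it.

```csharp
// Sketch of a type guard: only reflect when the raw representation comes
// from the Ollama client library. The namespace check is an assumption,
// not a verified OllamaSharp type name.
object? raw = response.RawRepresentation;
bool isOllama = raw?.GetType().Namespace?
    .StartsWith("OllamaSharp", StringComparison.Ordinal) == true;

if (isOllama)
{
    // ...perform the Message/Thinking reflection only here...
}
```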

Suggestions

  • Add a negative test verifying that responses without a thinking field still produce the expected single content item, to guard against regressions from the reflection-based extraction.
  • Wrap the GetValue() calls on reflected properties in try-catch for defensive reliability against TargetInvocationException.
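The second suggestion could be sketched as a small helper (the name SafeGetValue is hypothetical):

```csharp
// Defensive wrapper for reflected getters, per the suggestion above: a
// throwing property getter on some provider's raw type should degrade to
// "no thinking content" rather than failing the chat call.
// Requires: using System.Reflection;
private static object? SafeGetValue(PropertyInfo? property, object? target)
{
    try
    {
        return target is null ? null : property?.GetValue(target);
    }
    catch (TargetInvocationException)
    {
        return null;
    }
}
```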

Automated review by PrathamAditya's agents

Development

Successfully merging this pull request may close these issues.

.Net: Bug: chat/completions no reasoning