
Python: fix(python/google): filter thinking text parts from chat completion responses#13711

Open
dariusFTOS wants to merge 2 commits into microsoft:main from dariusFTOS:fix/python-google-filter-thought-parts

Conversation


@dariusFTOS dariusFTOS commented Mar 26, 2026

Motivation and Context

Fixes #13710

When using Gemini 3 Pro (preview) with thinking enabled, the API returns text parts with part.thought = True containing the model's internal reasoning. These thinking parts are incorrectly included in ChatMessageContent.items alongside the actual response text, causing the model's chain-of-thought to leak into application-visible responses.

This breaks downstream processing (e.g. JSON parsing of structured agent responses) because the response contains thinking text instead of the actual answer. The fix in #13609 correctly handled thought_signature on function call parts, but did not filter thinking text parts from the response content.

Description

Response parsing (filter thinking text parts):

  • google_ai_chat_completion.py: In _create_chat_message_content(), skip parts where part.thought is True before adding them as TextContent
  • google_ai_chat_completion.py: Same filter in _create_streaming_chat_message_content() for the streaming path

Backward compatible: When part.thought is None or False (thinking disabled or older models), behavior is identical to before. The raw GenerateContentResponse is still available via inner_content for consumers who need access to thinking parts.

Test Coverage

TODO: Add tests for:

  • test_create_chat_message_content_filters_thought_parts — verifies thinking parts are excluded from response items
  • test_create_chat_message_content_without_thought_parts — verifies backward compatibility when no thinking parts present
  • test_create_streaming_chat_message_content_filters_thought_parts — same for streaming path
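The first two TODO tests could be sketched along these lines. `parse_items` is a hypothetical stand-in for the parsing done in `_create_chat_message_content`; the real tests would build a `GenerateContentResponse` and inspect `ChatMessageContent.items`.

```python
from types import SimpleNamespace


def parse_items(parts):
    # Stand-in for the connector's parsing: keep only non-thought text parts.
    return [p.text for p in parts if p.text and not getattr(p, "thought", False)]


def test_create_chat_message_content_filters_thought_parts():
    parts = [
        SimpleNamespace(text="internal reasoning", thought=True),
        SimpleNamespace(text='{"answer": 42}', thought=None),
    ]
    assert parse_items(parts) == ['{"answer": 42}']


def test_create_chat_message_content_without_thought_parts():
    parts = [SimpleNamespace(text="plain answer", thought=None)]
    assert parse_items(parts) == ["plain answer"]
```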

Contribution Checklist

  • The code builds clean without any errors or warnings
  • The PR follows the SK Contribution Guidelines
  • All unit tests pass, and I have added new tests where possible
  • I didn't break anyone 😄

Gemini models with thinking enabled return text parts with part.thought=True.
These thinking/reasoning parts were being included in ChatMessageContent
alongside the actual response, causing thinking text to leak into responses.

This adds a check to skip parts where part.thought is True in both
_create_chat_message_content and _create_streaming_chat_message_content.

Fixes microsoft#13710
@dariusFTOS dariusFTOS requested a review from a team as a code owner March 26, 2026 15:54
@dariusFTOS dariusFTOS changed the title Python: Filter thought parts from Google AI chat completion responses Python: fix(python/google): filter thinking text parts from chat completion responses Mar 26, 2026

@github-actions github-actions bot left a comment


Automated Code Review

Reviewers: 4 | Confidence: 89%

✓ Correctness

The diff adds filtering for Google AI 'thought' parts in both non-streaming and streaming chat completion paths. When the Gemini model returns parts with thought=True (internal chain-of-thought reasoning), these are now skipped before being converted to TextContent or other content types. The Part.thought attribute is present in google-genai ~1.51.0 (the pinned SDK version) and defaults to None for non-thought parts, so the truthiness check is safe. The guard is correctly placed before the if part.text: check, since thought parts carry text that should not be surfaced. The Vertex AI connector does not have the same filtering, but it uses a different SDK (vertexai) and may have different behavior for thought parts — this is outside the scope of this PR.

✓ Security Reliability

The diff adds filtering of 'thought' parts (thinking/reasoning tokens) from Google AI chat completion responses, both in streaming and non-streaming paths. The primary reliability concern is that part.thought is accessed via direct attribute access, which is inconsistent with the defensive getattr(part, "thought_signature", None) pattern already used in the same functions for SDK compatibility. If the google-genai SDK version constraint is ever relaxed or the thought attribute is removed/renamed, direct access would raise an unhandled AttributeError, crashing response parsing. The existing test for SDK attribute guards (test_create_chat_message_content_getattr_guard_on_missing_attribute) doesn't cover this new attribute since MagicMock auto-creates attributes on access.

✗ Test Coverage

The PR adds logic to skip 'thought' parts in both _create_chat_message_content and _create_streaming_chat_message_content, but no tests cover this new behavior. The existing test suite only uses parts with thought=None/False (via Part.from_text() and Part.from_function_call()), so the new if part.thought: continue branches have zero test coverage. Tests should verify that thought-only parts are filtered out, that mixed thought/non-thought responses retain only the non-thought items, and that the streaming path behaves identically.

✗ Design Approach

The PR silently discards part.thought (reasoning/thinking) content from Google AI responses by skipping those parts entirely. This is a symptom-level fix that treats thought content as noise to be suppressed, when Semantic Kernel already has a purpose-built ReasoningContent (and StreamingReasoningContent) type that is part of CMC_ITEM_TYPES and STREAMING_ITEM_TYPES. In Gemini's thinking API, part.thought == True with part.text carrying the actual reasoning text — the correct design is to surface these as ReasoningContent items rather than drop them. The Vertex AI connector has the same gap and would need the same fix. Silent discard also breaks any caller that wants to inspect model reasoning or pass it back in multi-turn conversations.

Flagged Issues

  • Thought parts are silently discarded instead of being surfaced as ReasoningContent / StreamingReasoningContent, which already exist in SK and are part of CMC_ITEM_TYPES. For Gemini thinking models, part.thought == True and part.text holds the reasoning text — the fix should wrap these as ReasoningContent (and the streaming equivalent) rather than dropping them. Silent discard breaks callers that need to inspect model reasoning for display, logging, or multi-turn context.
  • No tests cover the new part.thought filtering logic in either _create_chat_message_content or _create_streaming_chat_message_content. Add tests that verify: (1) a Part with thought=True and text produces the correct content type, (2) a response mixing thought and non-thought parts returns both appropriately, and (3) the streaming path mirrors the same behavior.

Suggestions

  • Use getattr(part, "thought", False) instead of direct part.thought access for consistency with the existing getattr(part, "thought_signature", None) pattern, providing resilience against SDK version mismatches where the thought attribute may not exist on Part.
  • Apply the same thought-part handling to the Vertex AI chat completion connector (vertex_ai_chat_completion.py), which has a parallel code structure and the same gap.
  • Add an SDK guard test simulating a Part without the thought attribute, similar to the existing test_create_chat_message_content_getattr_guard_on_missing_attribute.

Automated review by dariusFTOS's agents

- Use ReasoningContent for non-streaming and StreamingReasoningContent
  for streaming thought parts (part.thought == True)
- Use getattr(part, "thought", False) for SDK compatibility
- Thought parts are now properly typed rather than silently dropped
@dariusFTOS

@dariusFTOS please read the following Contributor License Agreement(CLA). If you agree with the CLA, please reply with the following information.

@microsoft-github-policy-service agree [company="{your company}"]

Options:

  • (default - no company specified) I have sole ownership of intellectual property rights to my Submissions and I am not making Submissions in the course of work for my employer.
@microsoft-github-policy-service agree
  • (when company given) I am making Submissions in the course of work for my employer (or my employer has intellectual property rights in my Submissions by contract or applicable law). I have permission from my employer to make Submissions and enter into this Agreement on behalf of my employer. By signing below, the defined term “You” includes me and my employer.
@microsoft-github-policy-service agree company="Microsoft"

Contributor License Agreement

@microsoft-github-policy-service agree company="FintechOS"



Development

Successfully merging this pull request may close these issues.

Python: Bug: Google AI connector leaks thinking/thought text parts into ChatMessageContent
