fix: parse Gemma 4 <thought> reasoning tags alongside <think>#324
fix: parse Gemma 4 <thought> reasoning tags alongside <think>#324sagidM wants to merge 3 commits into
Conversation
📝 WalkthroughWalkthroughTagMatcher now accepts multiple tag names and tracks active tag(s) during streaming. Providers pass both "think" and "thought" to TagMatcher. Tests validate streamed parsing for ChangesMulti-Tag Reasoning Support
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Warning There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure. 🔧 ESLint
src/api/providers/__tests__/base-openai-compatible-provider.spec.tsESLint skipped: missing config or dependency (missing-dependency). The ESLint configuration references a package that is not available in the sandbox. src/api/providers/__tests__/openai.spec.tsESLint skipped: the ESLint configuration for this file references a package that is not available in the sandbox. src/api/providers/base-openai-compatible-provider.tsESLint skipped: the ESLint configuration for this file references a package that is not available in the sandbox.
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@src/utils/tag-matcher.ts`:
- Around line 75-76: The code currently uses a single activeTagName which is
overwritten by nested opens; change this to a stack (e.g., tagStack) so nested
tags are tracked: when an open tag is matched push its name onto tagStack and
set matched (or matched state) based on the top of the stack; when a close tag
is seen, compare it to the stack top and pop only if it matches (handle
mismatches gracefully), and update activeTagName/matched derived from
tagStack.peek() instead of a single variable; apply this stack pattern in the
logic around activeTagName/matched and the corresponding open/close handling
(the blocks around the current activeTagName usage and the similar code at the
later section referenced) so outer closes are recognized correctly.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro Plus
Run ID: 2023dbbb-6f09-4629-bd89-d0a559731db9
📒 Files selected for processing (4)
src/api/providers/__tests__/base-openai-compatible-provider.spec.tssrc/api/providers/base-openai-compatible-provider.tssrc/api/providers/openai.tssrc/utils/tag-matcher.ts
taltas
left a comment
There was a problem hiding this comment.
Thanks for the PR, I messed around with Gemma 4 in roo and had the same thing, thanks for the fix! Left a couple of comments, we should be good to merge after you address these
| ]) | ||
| }) | ||
|
|
||
| it("should handle reasoning tags (<thought>) from stream", async () => { |
There was a problem hiding this comment.
These tests are great, but given you modified openai.ts, you should also update the relevant test file in openai.spec.ts, we need some tests to cover
<think>/<thought>streaming regression tests- assertions around streamed reasoning extraction and any provider specific sequencing behavior.
There was a problem hiding this comment.
I will try.
Note though, I changed src/api/providers/base-openai-compatible-provider.ts purely for tests! Without those changes, the code worked, changes in openai.ts were sufficient. But the tests failed.
And, if you ask me, I would probably make a map for different models as gemma-4-* outputs <thought> but other models can output other values. But since there was already TagMatcher used with <think> tags, I choose the easier and safer-to-be-approved path.
However, mapping the model name with the output tags they produce would eliminate the problem of overlapping tags. Probably.
Although, keeping that info (startingWith("gemma-4-*" or "gemma-*") or direct 2 keys of full names in a map) in the code base seems kind of dirty too.
sagidM
left a comment
There was a problem hiding this comment.
Before pushing, I was struggling to decide, how TagMatcher should work.
Consider these AI responses and the reasonings verdicts I implemented:
Gemma4 produced: <think>outer<thought>inner</thought> middle</think>final
"reasoning_content": "outer<thought>inner</thought> middle",
"content": "final"
Gemma4 produced: <think>first</thought>second</think>final
"reasoning_content": "first</thought>second",
"content": "final"
Gemma4 produced: <think>User asks about <think>these</think>tags...</think>final
"reasoning_content": "User asks about <think>these</think>tags...",
"content": "final"
All those tests cases are covered. I am push in a minute.
Backtick problem.
For the prompt: What does </thought> mean? Give me the short answer
Here is the result (replaced \n inside json for readability):
<thought>* Target phrase: `</thought>`
* Goal: Explain what it means.
* Constraint: \"Give me the short answer.\"
* The user is interacting with an AI.
* The AI often uses a `<thought>` block (Chain-of-Thought) to reason before providing the final response.
* `</thought>` is the closing tag of that reasoning block.
* *Detailed version:* It's a closing XML/HTML-style tag used by AI models to signify the end of their internal reasoning process (Chain-of-Thought) and the beginning of the final answer.
* *Short version:* It marks the end of the AI's internal reasoning process.</thought>It is a closing tag used by AI models to signal the **end of their internal reasoning process** (Chain-of-Thought) and the beginning of the final response.
Google wraps both tags into backticks.
However, this is a 1% use case. For the majority of responses, I don't think a user would see this problem. And even if it appears, the chances are that they expand the Thinking.
However, the request to
https://generativelanguage.googleapis.com/v1beta/models/gemma-4-31b-it:generateContent?key=<API_KEY>
Actually returns a normal result and has
`</thought>`
(wrapped into backticks) many times in parts[0].text.
Content is in parts[1].text.
Conclusion. While this PR would solve 99% of the leaking reasoning cases, the complete solution would be to support gemma-4-* models for the Google API Provider.
Backticks Solution.
Regardless, I make another commit but I may wait until you guys confirm that this is really a problem.
Because right now, looking at the code, I have doubts whether it is worth to solve this problem. Because it brings extra complexity layer that TagMatcher should solve for some reason. There are may be other models which have a different way of escaping it and these changes will collide.
So I think it is better to publish this as-is and maybe focus on supporting Gemma for Google provider instead.
| ]) | ||
| }) | ||
|
|
||
| it("should handle reasoning tags (<thought>) from stream", async () => { |
There was a problem hiding this comment.
I will try.
Note though, I changed src/api/providers/base-openai-compatible-provider.ts purely for tests! Without those changes, the code worked, changes in openai.ts were sufficient. But the tests failed.
And, if you ask me, I would probably make a map for different models as gemma-4-* outputs <thought> but other models can output other values. But since there was already TagMatcher used with <think> tags, I choose the easier and safer-to-be-approved path.
However, mapping the model name with the output tags they produce would eliminate the problem of overlapping tags. Probably.
Although, keeping that info (startingWith("gemma-4-*" or "gemma-*") or direct 2 keys of full names in a map) in the code base seems kind of dirty too.
6966807 to
b3b3dcc
Compare
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@src/utils/tag-matcher.ts`:
- Around line 112-116: Guard the closing-tag handling so we never decrement
this.depth or pop this.activeTagNames when there is no matching opener: change
the condition around the block that checks char === ">" && this.index ===
tagName.length to additionally require this.depth > 0 (and/or
this.activeTagNames.length > 0) before doing this.depth-- and
this.activeTagNames.pop(); if there is no opener, still set this.state = "TEXT"
but skip the decrement/pop to avoid underflow.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro Plus
Run ID: ec8bf8e4-7031-4d56-9521-9b52c582992c
📒 Files selected for processing (5)
src/api/providers/__tests__/base-openai-compatible-provider.spec.tssrc/api/providers/__tests__/openai.spec.tssrc/api/providers/base-openai-compatible-provider.tssrc/api/providers/openai.tssrc/utils/tag-matcher.ts
Codecov Report❌ Patch coverage is
📢 Thoughts on this report? Let us know! |
Gemma 4 streams reasoning inside <thought>...</thought> instead of
<think>...</think>. Without this the content leaks into chat text and
the agent triggers a retry on the first turn.
- TagMatcher: support multiple tag names - string[], track activeTagName so
<think> is never closed by </thought> (and vice-versa).
- base-openai-compatible-provider and openai handler: match both tags.
- Tests: <thought> parsing, cross-tag isolation, and invariants.
…e streaming tests
Add two regression tests that verify depth never goes negative: 1. stray closer with no opener "final</think>text" → stays text 2. duplicate closer after a proper close "<think>thinking</think>final</think>text" → second </think> stays text Both cases ensure we only decrement depth and pop activeTagNames when depth > 0, preventing underflow and treating the extra tag as plain text.
b3b3dcc to
80b7dbb
Compare
There was a problem hiding this comment.
♻️ Duplicate comments (1)
src/utils/tag-matcher.ts (1)
113-116:⚠️ Potential issue | 🟠 Major | ⚡ Quick winGuard closing-tag unwind to prevent depth underflow.
At Line 115,
this.depth--runs even when no opener is active. A leading stray close tag can push depth negative and corrupt subsequent state.Suggested fix
- if (char === ">" && this.index === tagName.length) { + if (char === ">" && this.index === tagName.length) { this.state = "TEXT" - this.depth-- - this.activeTagNames.pop() + if (this.depth > 0 && this.activeTagNames.length > 0) { + this.depth-- + this.activeTagNames.pop() + } this.matched = this.depth > 0 if (!this.matched) { this.cached = [] } }🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/utils/tag-matcher.ts` around lines 113 - 116, Guard the closing-tag unwind so depth can't go negative: when detecting the end of a tag in the block that checks if (char === ">" && this.index === tagName.length), only decrement this.depth and pop this.activeTagNames if there is an open tag to close (e.g., this.depth > 0 and this.activeTagNames.length > 0); still set this.state = "TEXT" but avoid running this.depth-- or this.activeTagNames.pop() when no opener exists to prevent underflow/corruption.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Duplicate comments:
In `@src/utils/tag-matcher.ts`:
- Around line 113-116: Guard the closing-tag unwind so depth can't go negative:
when detecting the end of a tag in the block that checks if (char === ">" &&
this.index === tagName.length), only decrement this.depth and pop
this.activeTagNames if there is an open tag to close (e.g., this.depth > 0 and
this.activeTagNames.length > 0); still set this.state = "TEXT" but avoid running
this.depth-- or this.activeTagNames.pop() when no opener exists to prevent
underflow/corruption.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro Plus
Run ID: 84d620aa-bc19-4ba7-ae28-849edb1f74b7
📒 Files selected for processing (5)
src/api/providers/__tests__/base-openai-compatible-provider.spec.tssrc/api/providers/__tests__/openai.spec.tssrc/api/providers/base-openai-compatible-provider.tssrc/api/providers/openai.tssrc/utils/tag-matcher.ts
Related GitHub Issue
Closes #323
Closes a similar issue: #7615 from Roo Code
Similar PR to #7617 from Roo Code
Description
Gemma 4 streams reasoning inside
<thought>...</thought>instead of<think>...</think>. Without this the content leaks into chat text and the agent triggers a retry on the first turn.string[], trackactiveTagNameso<think>is never closed by</thought>(and vice-versa).<thought>parsing, cross-tag isolation, and invariants.Test Procedure
To Reproduce
Steps to reproduce the behavior:
Add an OpenAI-Compatible provider pointing to a Gemma 4 model.
Base URLset to https://generativelanguage.googleapis.com/v1beta/openaiSet API Key obtained from https://aistudio.google.com/api-keys
Choose a reasoning Gemma4 model. There are only 2 options:
Expected behavior
models/gemma-4-26b-a4b-itandmodels/gemma-4-31b-it. Optionally, you can set Context Window Size to256000below.Save. Make sure it is selected down below.
Send any simple prompt (e.g., "2+2? Give only the answer")
Observe raw ... in output
Version: 3.55.0 (d63e7bd)
The tags
<thought></thought>wrapped around the reasoning are not expected to show up. Moreover, the whole reasoning should be wrapped into the expandable Thinking item.Expected
The reasoning/thinking content should be wrapped into an expandable menu option.
Version: 3.55.0 (6966807)
Pre-Submission Checklist
Documentation Updates
Additional Notes
I specified some additional notes in the issue but they are not related to this PR.
Get in Touch
Telegram: sagidM
Discord: sagidm
Summary by CodeRabbit
New Features
Tests