Skip to content

fix: parse Gemma 4 <thought> reasoning tags alongside <think>#324

Open
sagidM wants to merge 3 commits into
Zoo-Code-Org:mainfrom
sagidM:fix/gemma4-fix-thought-tags
Open

fix: parse Gemma 4 <thought> reasoning tags alongside <think>#324
sagidM wants to merge 3 commits into
Zoo-Code-Org:mainfrom
sagidM:fix/gemma4-fix-thought-tags

Conversation

@sagidM
Copy link
Copy Markdown

@sagidM sagidM commented May 25, 2026

Related GitHub Issue

Closes #323
Closes a similar issue: #7615 from Roo Code
Similar PR to #7617 from Roo Code

Description

Gemma 4 streams reasoning inside <thought>...</thought> instead of <think>...</think>. Without this the content leaks into chat text and the agent triggers a retry on the first turn.

  • TagMatcher: support multiple tag names - string[], track activeTagName so <think> is never closed by </thought> (and vice-versa).
  • base-openai-compatible-provider and openai handler: match both tags.
  • Tests: <thought> parsing, cross-tag isolation, and invariants.

Test Procedure

To Reproduce
Steps to reproduce the behavior:

  1. Add an OpenAI-Compatible provider pointing to a Gemma 4 model.

  2. Base URL set to https://generativelanguage.googleapis.com/v1beta/openai

  3. Set API Key obtained from https://aistudio.google.com/api-keys

  4. Choose a reasoning Gemma4 model. There are only 2 options: models/gemma-4-26b-a4b-it and models/gemma-4-31b-it. Optionally, you can set Context Window Size to 256000 below.
    ImageExpected behavior

  5. Save. Make sure it is selected down below.

  6. Send any simple prompt (e.g., "2+2? Give only the answer")

  7. Observe raw ... in output

Image

Version: 3.55.0 (d63e7bd)

The tags <thought></thought> wrapped around the reasoning are not expected to show up. Moreover, the whole reasoning should be wrapped into the expandable Thinking item.

Expected

The reasoning/thinking content should be wrapped into an expandable menu option.

Image

Version: 3.55.0 (6966807)

Pre-Submission Checklist

  • Issue Linked: This PR is linked to an approved GitHub Issue (see "Related GitHub Issue" above).
  • Scope: My changes are focused on the linked issue (one major feature/fix per PR).
  • Self-Review: I have performed a thorough self-review of my code.
  • Testing: New and/or updated tests have been added to cover my changes (if applicable).
  • Documentation Impact: I have considered if my changes require documentation updates (see "Documentation Updates" section below).
  • Contribution Guidelines: I have read and agree to the Contributor Guidelines.

Documentation Updates

  • No documentation updates are required.

Additional Notes

I specified some additional notes in the issue but they are not related to this PR.

Get in Touch

Telegram: sagidM
Discord: sagidm

Summary by CodeRabbit

  • New Features

    • Streaming parsing now recognizes both and tags, handles mixed/nested tags, mismatched closings, and flushes partial reasoning at stream end; reasoning output is separated from normal text.
  • Tests

    • Added comprehensive streaming tests for multi-tag recognition, nesting/unwinding, mismatched closes, partial/complete flushes, and combined tag-based plus provider reasoning output.

Review Change Stack

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented May 25, 2026

📝 Walkthrough

Walkthrough

TagMatcher now accepts multiple tag names and tracks active tag(s) during streaming. Providers pass both "think" and "thought" to TagMatcher. Tests validate streamed parsing for <think> and <thought>, mismatched closing tags, nesting, start/end-of-stream, and combined native reasoning deltas.

Changes

Multi-Tag Reasoning Support

Layer / File(s) Summary
TagMatcher multi-tag state machine
src/utils/tag-matcher.ts
Constructor now accepts `string
Provider reasoning tag configuration
src/api/providers/base-openai-compatible-provider.ts, src/api/providers/openai.ts
Providers updated to initialize TagMatcher with both ["think","thought"], enabling recognition of either reasoning marker in streamed content.
Streaming tests for reasoning tags
src/api/providers/__tests__/base-openai-compatible-provider.spec.ts, src/api/providers/__tests__/openai.spec.ts
New test cases covering <thought> parsing into reasoning chunks, mismatched closing-tag behavior (e.g., </thought> during <think>), start/end-of-stream flushing, single-chunk complete tags, nested/mixed tags handling, and combined native reasoning_content plus tag-based reasoning.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Suggested reviewers

  • navedmerchant
  • hannesrudolph
  • edelauna
  • JamesRobert20

Poem

🐰 I nibble tags both big and small,
Think and Thought — I catch them all,
Streams no longer leak the mind,
Reasoning bits are tucked behind,
Hooray — a tidy chat for all! 🎉

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Title check ✅ Passed The title clearly identifies the main change: adding support for tags in Gemma 4 models alongside the existing tag support.
Description check ✅ Passed The PR description covers the linked issue, implementation approach (TagMatcher changes, provider updates, tests), test procedure with reproduction steps, and includes a pre-submission checklist.
Linked Issues check ✅ Passed The PR directly addresses issue #323 by implementing support for tags in Gemma 4 models, preventing reasoning leaks and enabling proper grouping into the Thinking UI.
Out of Scope Changes check ✅ Passed All changes are directly related to supporting tags: TagMatcher enhancements to handle multiple tag names, provider updates to recognize both tags, and comprehensive test coverage for the new functionality.
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Warning

There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure.

🔧 ESLint

If the error stems from missing dependencies, add them to the package.json file. For unrecoverable errors (e.g., due to private dependencies), disable the tool in the CodeRabbit configuration.

src/api/providers/__tests__/base-openai-compatible-provider.spec.ts

ESLint skipped: missing config or dependency (missing-dependency). The ESLint configuration references a package that is not available in the sandbox.

src/api/providers/__tests__/openai.spec.ts

ESLint skipped: the ESLint configuration for this file references a package that is not available in the sandbox.

src/api/providers/base-openai-compatible-provider.ts

ESLint skipped: the ESLint configuration for this file references a package that is not available in the sandbox.

  • 2 others

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/utils/tag-matcher.ts`:
- Around line 75-76: The code currently uses a single activeTagName which is
overwritten by nested opens; change this to a stack (e.g., tagStack) so nested
tags are tracked: when an open tag is matched push its name onto tagStack and
set matched (or matched state) based on the top of the stack; when a close tag
is seen, compare it to the stack top and pop only if it matches (handle
mismatches gracefully), and update activeTagName/matched derived from
tagStack.peek() instead of a single variable; apply this stack pattern in the
logic around activeTagName/matched and the corresponding open/close handling
(the blocks around the current activeTagName usage and the similar code at the
later section referenced) so outer closes are recognized correctly.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: 2023dbbb-6f09-4629-bd89-d0a559731db9

📥 Commits

Reviewing files that changed from the base of the PR and between 9d022d4 and 6966807.

📒 Files selected for processing (4)
  • src/api/providers/__tests__/base-openai-compatible-provider.spec.ts
  • src/api/providers/base-openai-compatible-provider.ts
  • src/api/providers/openai.ts
  • src/utils/tag-matcher.ts

Comment thread src/utils/tag-matcher.ts Outdated
Copy link
Copy Markdown
Contributor

@taltas taltas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR, I messed around with Gemma 4 in roo and had the same thing, thanks for the fix! Left a couple of comments, we should be good to merge after you address these

])
})

it("should handle reasoning tags (<thought>) from stream", async () => {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These tests are great, but given you modified openai.ts, you should also update the relevant test file in openai.spec.ts, we need some tests to cover

  • <think>/<thought> streaming regression tests
  • assertions around streamed reasoning extraction and any provider specific sequencing behavior.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will try.
Note though, I changed src/api/providers/base-openai-compatible-provider.ts purely for tests! Without those changes, the code worked, changes in openai.ts were sufficient. But the tests failed.

And, if you ask me, I would probably make a map for different models as gemma-4-* outputs <thought> but other models can output other values. But since there was already TagMatcher used with <think> tags, I choose the easier and safer-to-be-approved path.
However, mapping the model name with the output tags they produce would eliminate the problem of overlapping tags. Probably.
Although, keeping that info (startingWith("gemma-4-*" or "gemma-*") or direct 2 keys of full names in a map) in the code base seems kind of dirty too.

Comment thread src/utils/tag-matcher.ts Outdated
Copy link
Copy Markdown
Author

@sagidM sagidM left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Before pushing, I was struggling to decide, how TagMatcher should work.
Consider these AI responses and the reasonings verdicts I implemented:
Gemma4 produced: <think>outer<thought>inner</thought> middle</think>final

"reasoning_content": "outer<thought>inner</thought> middle",
"content": "final"

Gemma4 produced: <think>first</thought>second</think>final

"reasoning_content": "first</thought>second",
"content": "final"

Gemma4 produced: <think>User asks about <think>these</think>tags...</think>final

"reasoning_content": "User asks about <think>these</think>tags...",
"content": "final"

All those tests cases are covered. I am push in a minute.


Backtick problem.

For the prompt: What does </thought> mean? Give me the short answer
Here is the result (replaced \n inside json for readability):

<thought>*   Target phrase: `</thought>`
    *   Goal: Explain what it means.
    *   Constraint: \"Give me the short answer.\"

    *   The user is interacting with an AI.
    *   The AI often uses a `<thought>` block (Chain-of-Thought) to reason before providing the final response.
    *   `</thought>` is the closing tag of that reasoning block.

    *   *Detailed version:* It's a closing XML/HTML-style tag used by AI models to signify the end of their internal reasoning process (Chain-of-Thought) and the beginning of the final answer.
    *   *Short version:* It marks the end of the AI's internal reasoning process.</thought>It is a closing tag used by AI models to signal the **end of their internal reasoning process** (Chain-of-Thought) and the beginning of the final response.

Google wraps both tags into backticks.
However, this is a 1% use case. For the majority of responses, I don't think a user would see this problem. And even if it appears, the chances are that they expand the Thinking.

However, the request to
https://generativelanguage.googleapis.com/v1beta/models/gemma-4-31b-it:generateContent?key=<API_KEY>

Actually returns a normal result and has

`</thought>`

(wrapped into backticks) many times in parts[0].text.
Content is in parts[1].text.

Conclusion. While this PR would solve 99% of the leaking reasoning cases, the complete solution would be to support gemma-4-* models for the Google API Provider.

Backticks Solution.

Regardless, I make another commit but I may wait until you guys confirm that this is really a problem.
Because right now, looking at the code, I have doubts whether it is worth to solve this problem. Because it brings extra complexity layer that TagMatcher should solve for some reason. There are may be other models which have a different way of escaping it and these changes will collide.
So I think it is better to publish this as-is and maybe focus on supporting Gemma for Google provider instead.

])
})

it("should handle reasoning tags (<thought>) from stream", async () => {
Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will try.
Note though, I changed src/api/providers/base-openai-compatible-provider.ts purely for tests! Without those changes, the code worked, changes in openai.ts were sufficient. But the tests failed.

And, if you ask me, I would probably make a map for different models as gemma-4-* outputs <thought> but other models can output other values. But since there was already TagMatcher used with <think> tags, I choose the easier and safer-to-be-approved path.
However, mapping the model name with the output tags they produce would eliminate the problem of overlapping tags. Probably.
Although, keeping that info (startingWith("gemma-4-*" or "gemma-*") or direct 2 keys of full names in a map) in the code base seems kind of dirty too.

@sagidM sagidM force-pushed the fix/gemma4-fix-thought-tags branch from 6966807 to b3b3dcc Compare May 26, 2026 13:45
Copy link
Copy Markdown
Author

@sagidM sagidM left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I accidentally send the previous message as Comment instead of Submitting review

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/utils/tag-matcher.ts`:
- Around line 112-116: Guard the closing-tag handling so we never decrement
this.depth or pop this.activeTagNames when there is no matching opener: change
the condition around the block that checks char === ">" && this.index ===
tagName.length to additionally require this.depth > 0 (and/or
this.activeTagNames.length > 0) before doing this.depth-- and
this.activeTagNames.pop(); if there is no opener, still set this.state = "TEXT"
but skip the decrement/pop to avoid underflow.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: ec8bf8e4-7031-4d56-9521-9b52c582992c

📥 Commits

Reviewing files that changed from the base of the PR and between 6966807 and b3b3dcc.

📒 Files selected for processing (5)
  • src/api/providers/__tests__/base-openai-compatible-provider.spec.ts
  • src/api/providers/__tests__/openai.spec.ts
  • src/api/providers/base-openai-compatible-provider.ts
  • src/api/providers/openai.ts
  • src/utils/tag-matcher.ts

Comment thread src/utils/tag-matcher.ts
@codecov
Copy link
Copy Markdown

codecov Bot commented May 26, 2026

Codecov Report

❌ Patch coverage is 83.63636% with 9 lines in your changes missing coverage. Please review.

Files with missing lines Patch % Lines
src/utils/tag-matcher.ts 83.01% 7 Missing and 2 partials ⚠️

📢 Thoughts on this report? Let us know!

Sagid Magomedov added 3 commits May 26, 2026 15:44
Gemma 4 streams reasoning inside <thought>...</thought> instead of
<think>...</think>.  Without this the content leaks into chat text and
the agent triggers a retry on the first turn.

- TagMatcher: support multiple tag names - string[], track activeTagName so
    <think> is never closed by </thought> (and vice-versa).
- base-openai-compatible-provider and openai handler: match both tags.
- Tests: <thought> parsing, cross-tag isolation, and invariants.
Add two regression tests that verify depth never goes negative:

1. stray closer with no opener
   "final</think>text" → stays text

2. duplicate closer after a proper close
   "<think>thinking</think>final</think>text" → second </think> stays text

Both cases ensure we only decrement depth and pop activeTagNames
when depth > 0, preventing underflow and treating the extra tag
as plain text.
@sagidM sagidM force-pushed the fix/gemma4-fix-thought-tags branch from b3b3dcc to 80b7dbb Compare May 26, 2026 14:44
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

♻️ Duplicate comments (1)
src/utils/tag-matcher.ts (1)

113-116: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Guard closing-tag unwind to prevent depth underflow.

At Line 115, this.depth-- runs even when no opener is active. A leading stray close tag can push depth negative and corrupt subsequent state.

Suggested fix
-				if (char === ">" && this.index === tagName.length) {
+				if (char === ">" && this.index === tagName.length) {
 					this.state = "TEXT"
-					this.depth--
-					this.activeTagNames.pop()
+					if (this.depth > 0 && this.activeTagNames.length > 0) {
+						this.depth--
+						this.activeTagNames.pop()
+					}
 					this.matched = this.depth > 0
 					if (!this.matched) {
 						this.cached = []
 					}
 				}
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/utils/tag-matcher.ts` around lines 113 - 116, Guard the closing-tag
unwind so depth can't go negative: when detecting the end of a tag in the block
that checks if (char === ">" && this.index === tagName.length), only decrement
this.depth and pop this.activeTagNames if there is an open tag to close (e.g.,
this.depth > 0 and this.activeTagNames.length > 0); still set this.state =
"TEXT" but avoid running this.depth-- or this.activeTagNames.pop() when no
opener exists to prevent underflow/corruption.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Duplicate comments:
In `@src/utils/tag-matcher.ts`:
- Around line 113-116: Guard the closing-tag unwind so depth can't go negative:
when detecting the end of a tag in the block that checks if (char === ">" &&
this.index === tagName.length), only decrement this.depth and pop
this.activeTagNames if there is an open tag to close (e.g., this.depth > 0 and
this.activeTagNames.length > 0); still set this.state = "TEXT" but avoid running
this.depth-- or this.activeTagNames.pop() when no opener exists to prevent
underflow/corruption.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: 84d620aa-bc19-4ba7-ae28-849edb1f74b7

📥 Commits

Reviewing files that changed from the base of the PR and between b3b3dcc and 80b7dbb.

📒 Files selected for processing (5)
  • src/api/providers/__tests__/base-openai-compatible-provider.spec.ts
  • src/api/providers/__tests__/openai.spec.ts
  • src/api/providers/base-openai-compatible-provider.ts
  • src/api/providers/openai.ts
  • src/utils/tag-matcher.ts

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[BUG] <thought> tag is displayed as plain text instead of hidden reasoning in OpenAI Compatible provider for Gemma4

2 participants