Fix ReAct agent parsing failures with reasoning models (`<think>` tags) by yczhang-nv · Pull Request #1667 · NVIDIA/NeMo-Agent-Toolkit

yczhang-nv · 2026-02-26T06:06:46Z

Description

Reasoning models (e.g., nemotron-3-nano-30b-a3b) wrap their chain-of-thought in ... tags, which breaks the ReAct output parser. This PR adds multi-layered handling to recover usable content from reasoning model outputs.
Preserve reasoning_content from streamed LLM chunks in _stream_llm, which previously discarded additional_kwargs when constructing the final AIMessage
Strip tags from LLM output before parsing, with fallbacks: extract content from inside tags, then fall back to reasoning_content from additional_kwargs
Accept direct (non-ReAct-formatted) answers from reasoning models instead of failing with ReActAgentParsingFailedError
Improve retry hints when content is empty, guiding the model toward producing a Final Answer:

By Submitting this PR I confirm:

I am familiar with the Contributing Guidelines.
We require that all contributors "sign-off" on their commits. This certifies that the contribution is your original work, or you have rights to submit it under the same license, or a compatible license.
- Any contribution which contains commits that are not Signed-Off will not be accepted.
When the PR is ready for review, new or existing tests cover these changes.
When the PR is ready for review, the documentation is up to date with these changes.

Summary by CodeRabbit

Bug Fixes
- Streaming responses from reasoning models now capture and combine main content and separate reasoning streams reliably.
- Agents more reliably recognize direct final answers even when not in strict ReAct format, reducing missed responses.
- Improved fallback behavior that adds clear guidance prompts when parsing fails.
Improvements
- More consistent normalization of agent outputs across initial and subsequent interaction flows for steadier results.

Signed-off-by: Yuchen Zhang <yuchenz@nvidia.com>

coderabbitai · 2026-02-26T06:07:17Z

Walkthrough

Base agent separates streaming LLM content and reasoning content and returns both (reasoning via additional_kwargs); ReAct agent normalizes outputs by stripping R1 think tags, falls back to extracting ... or reasoning_content, treats certain non-ReAct outputs as direct final answers, and adds fallback prompts on parse failures.

Changes

Cohort / File(s)	Summary
Base Streaming Handler `packages/nvidia_nat_langchain/src/nat/plugins/langchain/agent/base.py`	Reworked `_stream_llm` to collect `content_parts` and `reasoning_parts` from streaming events, return concatenated `content`, and attach `reasoning_content` in `AIMessage.additional_kwargs` when present.
ReAct Agent Logic `packages/nvidia_nat_langchain/src/nat/plugins/langchain/agent/react_agent/agent.py`	Added `remove_r1_think_tags` normalization and a fallback chain: normalized content → extract content inside `<think>...</think>` → use `reasoning_content` from `additional_kwargs`. Handles non-ReAct reasoning-model outputs as final answers when appropriate and appends fallback guidance prompts on parsing failures.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 66.67% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (2 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title is fully related to the main changes in the PR, which focus on fixing ReAct agent parsing failures with reasoning models that use `<think>` tags.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests
Post copyable unit tests in a comment

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

🧹 Nitpick comments (1)

packages/nvidia_nat_langchain/src/nat/plugins/langchain/agent/react_agent/agent.py (1)

183-193: Extract duplicated content normalization logic into a helper method.

This normalization logic is duplicated at lines 217-227. Consider extracting it into a private method to improve maintainability.

♻️ Proposed refactor

Add a helper method to the class:

def _normalize_reasoning_model_output(self, output_message: AIMessage) -> None:
    """Normalize output from reasoning models by stripping think tags and extracting content."""
    if not isinstance(output_message.content, str):
        return
    raw_content = output_message.content
    output_message.content = remove_r1_think_tags(raw_content)
    if not output_message.content.strip():
        think_match = re.search(r'<think>(.*?)</think>', raw_content, re.DOTALL)
        if think_match:
            output_message.content = think_match.group(1).strip()
    if not output_message.content.strip():
        reasoning = output_message.additional_kwargs.get('reasoning_content', '')
        if reasoning:
            output_message.content = reasoning

Then replace both occurrences with:

-                    if isinstance(output_message.content, str):
-                        raw_content = output_message.content
-                        output_message.content = remove_r1_think_tags(raw_content)
-                        if not output_message.content.strip():
-                            think_match = re.search(r'<think>(.*?)</think>', raw_content, re.DOTALL)
-                            if think_match:
-                                output_message.content = think_match.group(1).strip()
-                        if not output_message.content.strip():
-                            reasoning = output_message.additional_kwargs.get('reasoning_content', '')
-                            if reasoning:
-                                output_message.content = reasoning
+                    self._normalize_reasoning_model_output(output_message)

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In
`@packages/nvidia_nat_langchain/src/nat/plugins/langchain/agent/react_agent/agent.py`
around lines 183 - 193, Extract the duplicated content-normalization logic into
a private helper on the class (e.g., def _normalize_reasoning_model_output(self,
output_message: AIMessage) -> None) that returns early if output_message.content
is not a str, calls remove_r1_think_tags(raw_content), falls back to
re.search(r'<think>(.*?)</think>', raw_content, re.DOTALL) to extract inner text
if the stripped content is empty, and finally uses
output_message.additional_kwargs.get('reasoning_content', '') as the last
fallback; then replace both inline blocks (the one around output_message
handling at lines shown and the duplicate at 217-227) with calls to
self._normalize_reasoning_model_output(output_message) and add a short docstring
to the helper for clarity.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In
`@packages/nvidia_nat_langchain/src/nat/plugins/langchain/agent/react_agent/agent.py`:
- Around line 183-193: Extract the duplicated content-normalization logic into a
private helper on the class (e.g., def _normalize_reasoning_model_output(self,
output_message: AIMessage) -> None) that returns early if output_message.content
is not a str, calls remove_r1_think_tags(raw_content), falls back to
re.search(r'<think>(.*?)</think>', raw_content, re.DOTALL) to extract inner text
if the stripped content is empty, and finally uses
output_message.additional_kwargs.get('reasoning_content', '') as the last
fallback; then replace both inline blocks (the one around output_message
handling at lines shown and the duplicate at 217-227) with calls to
self._normalize_reasoning_model_output(output_message) and add a short docstring
to the helper for clarity.

ℹ️ Review info

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 3892d75 and 063612d.

📒 Files selected for processing (2)

packages/nvidia_nat_langchain/src/nat/plugins/langchain/agent/base.py
packages/nvidia_nat_langchain/src/nat/plugins/langchain/agent/react_agent/agent.py

Signed-off-by: Yuchen Zhang <yuchenz@nvidia.com>

coderabbitai

Actionable comments posted: 2

🧹 Nitpick comments (1)

packages/nvidia_nat_langchain/src/nat/plugins/langchain/agent/react_agent/agent.py (1)

182-192: Extract duplicated normalization/fallback logic into one helper.

The same block appears twice; centralizing it avoids drift and keeps first/subsequent cycle behavior consistent.

Refactor sketch

+    `@staticmethod`
+    def _normalize_agent_output_content(output_message: AIMessage) -> None:
+        if not isinstance(output_message.content, str):
+            return
+        raw_content = output_message.content
+        output_message.content = remove_r1_think_tags(raw_content)
+        if not output_message.content.strip():
+            think_match = re.search(r"<think>(.*?)</think>", raw_content, re.DOTALL)
+            if think_match:
+                output_message.content = think_match.group(1).strip()
+        if not output_message.content.strip():
+            reasoning = output_message.additional_kwargs.get("reasoning_content", "")
+            if isinstance(reasoning, str) and reasoning.strip():
+                output_message.content = reasoning.strip()

-                    if isinstance(output_message.content, str):
-                        ...
+                    self._normalize_agent_output_content(output_message)

Also applies to: 216-226

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In
`@packages/nvidia_nat_langchain/src/nat/plugins/langchain/agent/react_agent/agent.py`
around lines 182 - 192, Extract the duplicated normalization/fallback logic that
processes output_message.content (calls remove_r1_think_tags, falls back to
<think> tag content, then to additional_kwargs['reasoning_content']) into a
single helper function (e.g., normalize_output_message_content(output_message)
or similar) and replace both code blocks (the one around output_message handling
at the current location and the other at the later occurrence) with calls to
that helper; ensure the helper accepts an OutputMessage-like object, performs
the remove_r1_think_tags -> check empty -> extract <think> -> check empty -> use
reasoning_content flow, and preserves trimming/strip behavior so first and
subsequent cycle behavior stays identical.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In
`@packages/nvidia_nat_langchain/src/nat/plugins/langchain/agent/react_agent/agent.py`:
- Around line 182-192: The code assigns output_message.content = reasoning from
output_message.additional_kwargs['reasoning_content'] without validating its
type; guard that value in the blocks inside the remove_r1_think_tags flow and
the similar block later (the code around output_message, additional_kwargs and
reasoning_content) by checking isinstance(reasoning, str) before assigning, and
if it is not a str either skip the assignment or coerce safely (e.g., convert to
str(reasoning) only if that behavior is acceptable), ensuring downstream text
parsing always receives a string.
- Around line 281-290: The fallback that accepts parse-failed outputs as a final
answer (the block using ex.missing_action, content_str, AGENT_LOG_PREFIX,
state.messages, and state.final_answer) is too permissive because it only
rejects content that starts with "Thought:"; update the gating so you only
accept content as a final answer when it contains no ReAct markers anywhere.
Specifically, modify the condition around ex.missing_action to scan content_str
(case-insensitive) for any occurrences of ReAct tokens such as "thought:",
"action:", "observation:", "action input:" (anchored at line starts or
standalone tokens) and only proceed to log, append AIMessage, set
state.final_answer and return state when none of those markers are present;
otherwise continue allowing tools to run or raise the parse error path. Ensure
you use the existing variables (content_str, ex.missing_action, state,
AGENT_LOG_PREFIX) so the change is localized to that block.

---

Nitpick comments:
In
`@packages/nvidia_nat_langchain/src/nat/plugins/langchain/agent/react_agent/agent.py`:
- Around line 182-192: Extract the duplicated normalization/fallback logic that
processes output_message.content (calls remove_r1_think_tags, falls back to
<think> tag content, then to additional_kwargs['reasoning_content']) into a
single helper function (e.g., normalize_output_message_content(output_message)
or similar) and replace both code blocks (the one around output_message handling
at the current location and the other at the later occurrence) with calls to
that helper; ensure the helper accepts an OutputMessage-like object, performs
the remove_r1_think_tags -> check empty -> extract <think> -> check empty -> use
reasoning_content flow, and preserves trimming/strip behavior so first and
subsequent cycle behavior stays identical.

ℹ️ Review info

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 063612d and 5a62c65.

📒 Files selected for processing (1)

packages/nvidia_nat_langchain/src/nat/plugins/langchain/agent/react_agent/agent.py

packages/nvidia_nat_langchain/src/nat/plugins/langchain/agent/react_agent/agent.py

Signed-off-by: Yuchen Zhang <yuchenz@nvidia.com>

coderabbitai

♻️ Duplicate comments (2)

packages/nvidia_nat_langchain/src/nat/plugins/langchain/agent/react_agent/agent.py (2)

189-192: ⚠️ Potential issue | 🟡 Minor

Guard reasoning_content type before assignment.

The additional_kwargs dictionary is provider-defined and may contain non-string values. Add a type check to prevent downstream parsing failures.

Suggested fix

                         if not output_message.content.strip():
                             reasoning = output_message.additional_kwargs.get('reasoning_content', '')
-                            if reasoning:
+                            if isinstance(reasoning, str) and reasoning.strip():
                                 output_message.content = reasoning

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In
`@packages/nvidia_nat_langchain/src/nat/plugins/langchain/agent/react_agent/agent.py`
around lines 189 - 192, The code assigns output_message.content = reasoning from
output_message.additional_kwargs['reasoning_content'] without validating type;
update the branch in the block handling output_message so you first fetch
reasoning = output_message.additional_kwargs.get('reasoning_content', None) and
only assign it if it is a non-empty string (e.g., isinstance(reasoning, str) and
reasoning.strip()), otherwise leave output_message.content unchanged (or coerce
safely to string only after explicit validation) to avoid downstream parsing
errors.

278-291: ⚠️ Potential issue | 🟡 Minor

Consider stricter validation for direct-answer fallback.

The current check only rejects content starting with Thought:, Question:, or Previous conversation. Content with Action: or Observation: markers on later lines could be incorrectly accepted as a final answer.

Since ex.missing_action is already true (meaning no valid action was parsed), this may be acceptable. However, if you want stricter validation to avoid accepting malformed ReAct outputs:

Suggested stricter check

                     content_str = str(output_message.content).strip()
-                    if (ex.missing_action and content_str and not re.match(
-                            r'\s*(thought\s*:|question\s*:|previous\s+conversation)', content_str, re.IGNORECASE)):
+                    react_marker_pattern = re.compile(
+                        r'^\s*(thought|action|action\s*input|observation)\s*:',
+                        re.IGNORECASE | re.MULTILINE,
+                    )
+                    if ex.missing_action and content_str and not react_marker_pattern.search(content_str):

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In
`@packages/nvidia_nat_langchain/src/nat/plugins/langchain/agent/react_agent/agent.py`
around lines 278 - 291, The direct-answer fallback currently accepts any
non-empty content when ex.missing_action is true as long as it doesn't start
with Thought:/Question:/Previous..., which can still let ReAct markers like
"Action:" or "Observation:" appear later and be accepted; update the validation
in the block that builds content_str (used with ex.missing_action,
AGENT_LOG_PREFIX, state.messages, AIMessage, state.final_answer) to perform a
stricter check—either reject when any line (not just the first) matches common
ReAct markers (e.g., Action:, Observation:, Input:, Answer:) using a regex
search (re.search) or require the first non-empty line to be free of ReAct
markers—so malformed ReAct outputs are not treated as final answers.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Duplicate comments:
In
`@packages/nvidia_nat_langchain/src/nat/plugins/langchain/agent/react_agent/agent.py`:
- Around line 189-192: The code assigns output_message.content = reasoning from
output_message.additional_kwargs['reasoning_content'] without validating type;
update the branch in the block handling output_message so you first fetch
reasoning = output_message.additional_kwargs.get('reasoning_content', None) and
only assign it if it is a non-empty string (e.g., isinstance(reasoning, str) and
reasoning.strip()), otherwise leave output_message.content unchanged (or coerce
safely to string only after explicit validation) to avoid downstream parsing
errors.
- Around line 278-291: The direct-answer fallback currently accepts any
non-empty content when ex.missing_action is true as long as it doesn't start
with Thought:/Question:/Previous..., which can still let ReAct markers like
"Action:" or "Observation:" appear later and be accepted; update the validation
in the block that builds content_str (used with ex.missing_action,
AGENT_LOG_PREFIX, state.messages, AIMessage, state.final_answer) to perform a
stricter check—either reject when any line (not just the first) matches common
ReAct markers (e.g., Action:, Observation:, Input:, Answer:) using a regex
search (re.search) or require the first non-empty line to be free of ReAct
markers—so malformed ReAct outputs are not treated as final answers.

ℹ️ Review info

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 5a62c65 and 222e57e.

📒 Files selected for processing (2)

packages/nvidia_nat_langchain/src/nat/plugins/langchain/agent/base.py
packages/nvidia_nat_langchain/src/nat/plugins/langchain/agent/react_agent/agent.py

🚧 Files skipped from review as they are similar to previous changes (1)

packages/nvidia_nat_langchain/src/nat/plugins/langchain/agent/base.py

yczhang-nv · 2026-02-26T19:56:59Z

/merge

…s) (NVIDIA#1667) - Reasoning models (e.g., nemotron-3-nano-30b-a3b) wrap their chain-of-thought in <think>...</think> tags, which breaks the ReAct output parser. This PR adds multi-layered handling to recover usable content from reasoning model outputs. - Preserve reasoning_content from streamed LLM chunks in _stream_llm, which previously discarded additional_kwargs when constructing the final AIMessage - Strip <think> tags from LLM output before parsing, with fallbacks: extract content from inside tags, then fall back to reasoning_content from additional_kwargs - Accept direct (non-ReAct-formatted) answers from reasoning models instead of failing with ReActAgentParsingFailedError - Improve retry hints when content is empty, guiding the model toward producing a Final Answer: ## By Submitting this PR I confirm: - I am familiar with the [Contributing Guidelines](https://github.com/NVIDIA/NeMo-Agent-Toolkit/blob/develop/docs/source/resources/contributing/index.md). - We require that all contributors "sign-off" on their commits. This certifies that the contribution is your original work, or you have rights to submit it under the same license, or a compatible license. - Any contribution which contains commits that are not Signed-Off will not be accepted. - When the PR is ready for review, new or existing tests cover these changes. - When the PR is ready for review, the documentation is up to date with these changes. ## Summary by CodeRabbit * **Bug Fixes** * Streaming responses from reasoning models now capture and combine main content and separate reasoning streams reliably. * Agents more reliably recognize direct final answers even when not in strict ReAct format, reducing missed responses. * Improved fallback behavior that adds clear guidance prompts when parsing fails. * **Improvements** * More consistent normalization of agent outputs across initial and subsequent interaction flows for steadier results. Authors: - Yuchen Zhang (https://github.com/yczhang-nv) Approvers: - Will Killian (https://github.com/willkill07) URL: NVIDIA#1667

fix ReAct agent parsing logic

063612d

Signed-off-by: Yuchen Zhang <yuchenz@nvidia.com>

yczhang-nv self-assigned this Feb 26, 2026

yczhang-nv requested a review from a team as a code owner February 26, 2026 06:06

yczhang-nv added bug Something isn't working non-breaking Non-breaking change labels Feb 26, 2026

coderabbitai bot reviewed Feb 26, 2026

View reviewed changes

willkill07 approved these changes Feb 26, 2026

View reviewed changes

fix CI

5a62c65

Signed-off-by: Yuchen Zhang <yuchenz@nvidia.com>

coderabbitai bot reviewed Feb 26, 2026

View reviewed changes

packages/nvidia_nat_langchain/src/nat/plugins/langchain/agent/react_agent/agent.py Show resolved Hide resolved

packages/nvidia_nat_langchain/src/nat/plugins/langchain/agent/react_agent/agent.py Show resolved Hide resolved

fix pytest

222e57e

Signed-off-by: Yuchen Zhang <yuchenz@nvidia.com>

coderabbitai bot reviewed Feb 26, 2026

View reviewed changes

rapids-bot bot merged commit 7bb10aa into NVIDIA:release/1.5 Feb 26, 2026
15 checks passed

yczhang-nv deleted the yuchen-fix-5928823 branch February 26, 2026 19:58

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Fix ReAct agent parsing failures with reasoning models (`<think>` tags)#1667

Fix ReAct agent parsing failures with reasoning models (`<think>` tags)#1667
rapids-bot[bot] merged 3 commits intoNVIDIA:release/1.5from
yczhang-nv:yuchen-fix-5928823

yczhang-nv commented Feb 26, 2026 •

edited by coderabbitai bot

Loading

Uh oh!

coderabbitai bot commented Feb 26, 2026 •

edited

Loading

❌ Failed checks (1 warning)

Uh oh!

coderabbitai bot left a comment

Uh oh!

coderabbitai bot left a comment

Uh oh!

Uh oh!

Uh oh!

coderabbitai bot left a comment

Uh oh!

yczhang-nv commented Feb 26, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

yczhang-nv commented Feb 26, 2026 • edited by coderabbitai bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Description

By Submitting this PR I confirm:

Summary by CodeRabbit

Uh oh!

coderabbitai bot commented Feb 26, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Estimated code review effort

❌ Failed checks (1 warning)

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

yczhang-nv commented Feb 26, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

yczhang-nv commented Feb 26, 2026 •

edited by coderabbitai bot

Loading

coderabbitai bot commented Feb 26, 2026 •

edited

Loading