Fix ReAct agent parsing failures with reasoning models (<think> tags)#1667
Conversation
Signed-off-by: Yuchen Zhang <yuchenz@nvidia.com>
WalkthroughBase agent separates streaming LLM content and reasoning content and returns both (reasoning via additional_kwargs); ReAct agent normalizes outputs by stripping R1 think tags, falls back to extracting ... or reasoning_content, treats certain non-ReAct outputs as direct final answers, and adds fallback prompts on parse failures. Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes 🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
🧹 Nitpick comments (1)
packages/nvidia_nat_langchain/src/nat/plugins/langchain/agent/react_agent/agent.py (1)
183-193: Extract duplicated content normalization logic into a helper method.This normalization logic is duplicated at lines 217-227. Consider extracting it into a private method to improve maintainability.
♻️ Proposed refactor
Add a helper method to the class:
def _normalize_reasoning_model_output(self, output_message: AIMessage) -> None: """Normalize output from reasoning models by stripping think tags and extracting content.""" if not isinstance(output_message.content, str): return raw_content = output_message.content output_message.content = remove_r1_think_tags(raw_content) if not output_message.content.strip(): think_match = re.search(r'<think>(.*?)</think>', raw_content, re.DOTALL) if think_match: output_message.content = think_match.group(1).strip() if not output_message.content.strip(): reasoning = output_message.additional_kwargs.get('reasoning_content', '') if reasoning: output_message.content = reasoningThen replace both occurrences with:
- if isinstance(output_message.content, str): - raw_content = output_message.content - output_message.content = remove_r1_think_tags(raw_content) - if not output_message.content.strip(): - think_match = re.search(r'<think>(.*?)</think>', raw_content, re.DOTALL) - if think_match: - output_message.content = think_match.group(1).strip() - if not output_message.content.strip(): - reasoning = output_message.additional_kwargs.get('reasoning_content', '') - if reasoning: - output_message.content = reasoning + self._normalize_reasoning_model_output(output_message)🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/nvidia_nat_langchain/src/nat/plugins/langchain/agent/react_agent/agent.py` around lines 183 - 193, Extract the duplicated content-normalization logic into a private helper on the class (e.g., def _normalize_reasoning_model_output(self, output_message: AIMessage) -> None) that returns early if output_message.content is not a str, calls remove_r1_think_tags(raw_content), falls back to re.search(r'<think>(.*?)</think>', raw_content, re.DOTALL) to extract inner text if the stripped content is empty, and finally uses output_message.additional_kwargs.get('reasoning_content', '') as the last fallback; then replace both inline blocks (the one around output_message handling at lines shown and the duplicate at 217-227) with calls to self._normalize_reasoning_model_output(output_message) and add a short docstring to the helper for clarity.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In
`@packages/nvidia_nat_langchain/src/nat/plugins/langchain/agent/react_agent/agent.py`:
- Around line 183-193: Extract the duplicated content-normalization logic into a
private helper on the class (e.g., def _normalize_reasoning_model_output(self,
output_message: AIMessage) -> None) that returns early if output_message.content
is not a str, calls remove_r1_think_tags(raw_content), falls back to
re.search(r'<think>(.*?)</think>', raw_content, re.DOTALL) to extract inner text
if the stripped content is empty, and finally uses
output_message.additional_kwargs.get('reasoning_content', '') as the last
fallback; then replace both inline blocks (the one around output_message
handling at lines shown and the duplicate at 217-227) with calls to
self._normalize_reasoning_model_output(output_message) and add a short docstring
to the helper for clarity.
ℹ️ Review info
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
packages/nvidia_nat_langchain/src/nat/plugins/langchain/agent/base.pypackages/nvidia_nat_langchain/src/nat/plugins/langchain/agent/react_agent/agent.py
There was a problem hiding this comment.
Actionable comments posted: 2
🧹 Nitpick comments (1)
packages/nvidia_nat_langchain/src/nat/plugins/langchain/agent/react_agent/agent.py (1)
182-192: Extract duplicated normalization/fallback logic into one helper.The same block appears twice; centralizing it avoids drift and keeps first/subsequent cycle behavior consistent.
Refactor sketch
+ `@staticmethod` + def _normalize_agent_output_content(output_message: AIMessage) -> None: + if not isinstance(output_message.content, str): + return + raw_content = output_message.content + output_message.content = remove_r1_think_tags(raw_content) + if not output_message.content.strip(): + think_match = re.search(r"<think>(.*?)</think>", raw_content, re.DOTALL) + if think_match: + output_message.content = think_match.group(1).strip() + if not output_message.content.strip(): + reasoning = output_message.additional_kwargs.get("reasoning_content", "") + if isinstance(reasoning, str) and reasoning.strip(): + output_message.content = reasoning.strip()- if isinstance(output_message.content, str): - ... + self._normalize_agent_output_content(output_message)Also applies to: 216-226
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/nvidia_nat_langchain/src/nat/plugins/langchain/agent/react_agent/agent.py` around lines 182 - 192, Extract the duplicated normalization/fallback logic that processes output_message.content (calls remove_r1_think_tags, falls back to <think> tag content, then to additional_kwargs['reasoning_content']) into a single helper function (e.g., normalize_output_message_content(output_message) or similar) and replace both code blocks (the one around output_message handling at the current location and the other at the later occurrence) with calls to that helper; ensure the helper accepts an OutputMessage-like object, performs the remove_r1_think_tags -> check empty -> extract <think> -> check empty -> use reasoning_content flow, and preserves trimming/strip behavior so first and subsequent cycle behavior stays identical.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In
`@packages/nvidia_nat_langchain/src/nat/plugins/langchain/agent/react_agent/agent.py`:
- Around line 182-192: The code assigns output_message.content = reasoning from
output_message.additional_kwargs['reasoning_content'] without validating its
type; guard that value in the blocks inside the remove_r1_think_tags flow and
the similar block later (the code around output_message, additional_kwargs and
reasoning_content) by checking isinstance(reasoning, str) before assigning, and
if it is not a str either skip the assignment or coerce safely (e.g., convert to
str(reasoning) only if that behavior is acceptable), ensuring downstream text
parsing always receives a string.
- Around line 281-290: The fallback that accepts parse-failed outputs as a final
answer (the block using ex.missing_action, content_str, AGENT_LOG_PREFIX,
state.messages, and state.final_answer) is too permissive because it only
rejects content that starts with "Thought:"; update the gating so you only
accept content as a final answer when it contains no ReAct markers anywhere.
Specifically, modify the condition around ex.missing_action to scan content_str
(case-insensitive) for any occurrences of ReAct tokens such as "thought:",
"action:", "observation:", "action input:" (anchored at line starts or
standalone tokens) and only proceed to log, append AIMessage, set
state.final_answer and return state when none of those markers are present;
otherwise continue allowing tools to run or raise the parse error path. Ensure
you use the existing variables (content_str, ex.missing_action, state,
AGENT_LOG_PREFIX) so the change is localized to that block.
---
Nitpick comments:
In
`@packages/nvidia_nat_langchain/src/nat/plugins/langchain/agent/react_agent/agent.py`:
- Around line 182-192: Extract the duplicated normalization/fallback logic that
processes output_message.content (calls remove_r1_think_tags, falls back to
<think> tag content, then to additional_kwargs['reasoning_content']) into a
single helper function (e.g., normalize_output_message_content(output_message)
or similar) and replace both code blocks (the one around output_message handling
at the current location and the other at the later occurrence) with calls to
that helper; ensure the helper accepts an OutputMessage-like object, performs
the remove_r1_think_tags -> check empty -> extract <think> -> check empty -> use
reasoning_content flow, and preserves trimming/strip behavior so first and
subsequent cycle behavior stays identical.
ℹ️ Review info
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
packages/nvidia_nat_langchain/src/nat/plugins/langchain/agent/react_agent/agent.py
packages/nvidia_nat_langchain/src/nat/plugins/langchain/agent/react_agent/agent.py
Show resolved
Hide resolved
packages/nvidia_nat_langchain/src/nat/plugins/langchain/agent/react_agent/agent.py
Show resolved
Hide resolved
Signed-off-by: Yuchen Zhang <yuchenz@nvidia.com>
There was a problem hiding this comment.
♻️ Duplicate comments (2)
packages/nvidia_nat_langchain/src/nat/plugins/langchain/agent/react_agent/agent.py (2)
189-192:⚠️ Potential issue | 🟡 MinorGuard
reasoning_contenttype before assignment.The
additional_kwargsdictionary is provider-defined and may contain non-string values. Add a type check to prevent downstream parsing failures.Suggested fix
if not output_message.content.strip(): reasoning = output_message.additional_kwargs.get('reasoning_content', '') - if reasoning: + if isinstance(reasoning, str) and reasoning.strip(): output_message.content = reasoning🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/nvidia_nat_langchain/src/nat/plugins/langchain/agent/react_agent/agent.py` around lines 189 - 192, The code assigns output_message.content = reasoning from output_message.additional_kwargs['reasoning_content'] without validating type; update the branch in the block handling output_message so you first fetch reasoning = output_message.additional_kwargs.get('reasoning_content', None) and only assign it if it is a non-empty string (e.g., isinstance(reasoning, str) and reasoning.strip()), otherwise leave output_message.content unchanged (or coerce safely to string only after explicit validation) to avoid downstream parsing errors.
278-291:⚠️ Potential issue | 🟡 MinorConsider stricter validation for direct-answer fallback.
The current check only rejects content starting with
Thought:,Question:, orPrevious conversation. Content withAction:orObservation:markers on later lines could be incorrectly accepted as a final answer.Since
ex.missing_actionis already true (meaning no valid action was parsed), this may be acceptable. However, if you want stricter validation to avoid accepting malformed ReAct outputs:Suggested stricter check
content_str = str(output_message.content).strip() - if (ex.missing_action and content_str and not re.match( - r'\s*(thought\s*:|question\s*:|previous\s+conversation)', content_str, re.IGNORECASE)): + react_marker_pattern = re.compile( + r'^\s*(thought|action|action\s*input|observation)\s*:', + re.IGNORECASE | re.MULTILINE, + ) + if ex.missing_action and content_str and not react_marker_pattern.search(content_str):🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/nvidia_nat_langchain/src/nat/plugins/langchain/agent/react_agent/agent.py` around lines 278 - 291, The direct-answer fallback currently accepts any non-empty content when ex.missing_action is true as long as it doesn't start with Thought:/Question:/Previous..., which can still let ReAct markers like "Action:" or "Observation:" appear later and be accepted; update the validation in the block that builds content_str (used with ex.missing_action, AGENT_LOG_PREFIX, state.messages, AIMessage, state.final_answer) to perform a stricter check—either reject when any line (not just the first) matches common ReAct markers (e.g., Action:, Observation:, Input:, Answer:) using a regex search (re.search) or require the first non-empty line to be free of ReAct markers—so malformed ReAct outputs are not treated as final answers.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Duplicate comments:
In
`@packages/nvidia_nat_langchain/src/nat/plugins/langchain/agent/react_agent/agent.py`:
- Around line 189-192: The code assigns output_message.content = reasoning from
output_message.additional_kwargs['reasoning_content'] without validating type;
update the branch in the block handling output_message so you first fetch
reasoning = output_message.additional_kwargs.get('reasoning_content', None) and
only assign it if it is a non-empty string (e.g., isinstance(reasoning, str) and
reasoning.strip()), otherwise leave output_message.content unchanged (or coerce
safely to string only after explicit validation) to avoid downstream parsing
errors.
- Around line 278-291: The direct-answer fallback currently accepts any
non-empty content when ex.missing_action is true as long as it doesn't start
with Thought:/Question:/Previous..., which can still let ReAct markers like
"Action:" or "Observation:" appear later and be accepted; update the validation
in the block that builds content_str (used with ex.missing_action,
AGENT_LOG_PREFIX, state.messages, AIMessage, state.final_answer) to perform a
stricter check—either reject when any line (not just the first) matches common
ReAct markers (e.g., Action:, Observation:, Input:, Answer:) using a regex
search (re.search) or require the first non-empty line to be free of ReAct
markers—so malformed ReAct outputs are not treated as final answers.
ℹ️ Review info
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
packages/nvidia_nat_langchain/src/nat/plugins/langchain/agent/base.pypackages/nvidia_nat_langchain/src/nat/plugins/langchain/agent/react_agent/agent.py
🚧 Files skipped from review as they are similar to previous changes (1)
- packages/nvidia_nat_langchain/src/nat/plugins/langchain/agent/base.py
|
/merge |
…s) (NVIDIA#1667) - Reasoning models (e.g., nemotron-3-nano-30b-a3b) wrap their chain-of-thought in <think>...</think> tags, which breaks the ReAct output parser. This PR adds multi-layered handling to recover usable content from reasoning model outputs. - Preserve reasoning_content from streamed LLM chunks in _stream_llm, which previously discarded additional_kwargs when constructing the final AIMessage - Strip <think> tags from LLM output before parsing, with fallbacks: extract content from inside tags, then fall back to reasoning_content from additional_kwargs - Accept direct (non-ReAct-formatted) answers from reasoning models instead of failing with ReActAgentParsingFailedError - Improve retry hints when content is empty, guiding the model toward producing a Final Answer: ## By Submitting this PR I confirm: - I am familiar with the [Contributing Guidelines](https://github.com/NVIDIA/NeMo-Agent-Toolkit/blob/develop/docs/source/resources/contributing/index.md). - We require that all contributors "sign-off" on their commits. This certifies that the contribution is your original work, or you have rights to submit it under the same license, or a compatible license. - Any contribution which contains commits that are not Signed-Off will not be accepted. - When the PR is ready for review, new or existing tests cover these changes. - When the PR is ready for review, the documentation is up to date with these changes. ## Summary by CodeRabbit * **Bug Fixes** * Streaming responses from reasoning models now capture and combine main content and separate reasoning streams reliably. * Agents more reliably recognize direct final answers even when not in strict ReAct format, reducing missed responses. * Improved fallback behavior that adds clear guidance prompts when parsing fails. * **Improvements** * More consistent normalization of agent outputs across initial and subsequent interaction flows for steadier results. Authors: - Yuchen Zhang (https://github.com/yczhang-nv) Approvers: - Will Killian (https://github.com/willkill07) URL: NVIDIA#1667
Description
By Submitting this PR I confirm:
Summary by CodeRabbit
Bug Fixes
Improvements