Fix: Improve tool call parser to handle nested tool tags #4426

@daniel-lxs

Description

Problem Statement

The current tool call parser in the Roo extension does not reliably handle tool calls whose content contains text that looks like another tool call. When a parameter value includes XML-style tags that match a tool name, the parser may interpret those nested tags as real tool calls, leading to parsing errors or incorrect tool execution.

Technical Details

The main parsing logic is implemented in two files:

  • src/core/assistant-message/parseAssistantMessage.ts: The original parser
  • src/core/assistant-message/parseAssistantMessageV2.ts: An optimized version

Both parsers convert the assistant's message text into structured content blocks, identifying tool calls by looking for XML-style tags. The parsers work by:

  1. Scanning through the assistant's message character by character
  2. Identifying opening tags like <tool_name> to start a tool call
  3. Collecting parameter values between parameter tags like <param_name>value</param_name>
  4. Identifying closing tags like </tool_name> to complete a tool call
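To make the failure mode concrete, here is a deliberately naive sketch of tag detection. It is not the code from either parser: the function name naiveFindToolStarts and the TOOL_NAMES list are hypothetical, and the scan is reduced to step 2 only. It shows how a scanner that checks for opening tags at every position, without regard to context, reports a tool start inside a parameter value.

```typescript
// Illustrative tool names only; the real list lives in the extension.
const TOOL_NAMES: string[] = ["write_to_file", "read_file"];

interface ToolStart {
  tool: string;
  index: number;
}

// Naive detection: report every position where a tool opening tag appears,
// even when that position is inside another tool's parameter value.
function naiveFindToolStarts(message: string): ToolStart[] {
  const found: ToolStart[] = [];
  for (let i = 0; i < message.length; i++) {
    for (const tool of TOOL_NAMES) {
      if (message.startsWith(`<${tool}>`, i)) {
        found.push({ tool, index: i });
      }
    }
  }
  return found;
}

const exampleMessage =
  "<write_to_file><path>notes.md</path>" +
  "<content>Call <read_file> like this</content></write_to_file>";

// Logs two tool starts, although only write_to_file is a real call here.
console.log(naiveFindToolStarts(exampleMessage).map((s) => s.tool));
```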

The current implementation handles nested tags only in a few specific cases:

  • In parseAssistantMessageV2.ts (lines 145-146), there's special handling for the write_to_file tool that uses lastIndexOf to find the last closing tag
  • Test cases in parseAssistantMessage.test.ts show that the parser can handle certain cases where tool tags appear within parameter values (lines 215-227), but this is limited
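The lastIndexOf idea mentioned above can be sketched as follows. This is an assumption about the general technique, not a copy of the code in parseAssistantMessageV2.ts; the helper name extractUpToLastClose is hypothetical. Searching from the end means a closing tag that appears inside the value does not end the parameter early.

```typescript
// Returns the text between the first opening tag and the LAST closing tag
// of `param`, or undefined when no well-ordered pair is present.
function extractUpToLastClose(body: string, param: string): string | undefined {
  const open = `<${param}>`;
  const close = `</${param}>`;
  const start = body.indexOf(open);
  if (start === -1) {
    return undefined;
  }
  const valueStart = start + open.length;
  const end = body.lastIndexOf(close);
  if (end < valueStart) {
    return undefined; // closing tag missing or located before the value
  }
  return body.slice(valueStart, end);
}

// The inner "</content>" is kept as part of the value.
console.log(extractUpToLastClose("<content>a </content> b</content>", "content"));
```

The trade-off is that this only works when the parameter is the last one of its name in the tool body, which is presumably why it is applied to a single tool rather than to every parameter.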

Proposed Solution

The solution would involve modifying the parsing logic to:

  1. Track the "depth" of tool call parsing
  2. Ignore opening tags for new tools when already inside a tool call
  3. Only look for the closing tag of the current tool being parsed

Files to Modify

  • src/core/assistant-message/parseAssistantMessage.ts
  • src/core/assistant-message/parseAssistantMessageV2.ts

Implementation Approach

In parseAssistantMessageV2.ts, the changes would focus on:

  1. Adding a flag to track when we're inside a tool call
  2. Modifying the tool detection logic (around line 174) to only look for new tool tags when not already inside a tool
  3. Ensuring that when inside a tool, we only look for the specific closing tag of that tool

The implementation would be similar to how the XmlMatcher class in src/utils/xml-matcher.ts handles nested tags, which uses a depth counter to track nesting levels.
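A depth counter of the kind described can be sketched like this. The function findMatchingClose is hypothetical and does not reflect the actual XmlMatcher API; it only illustrates the technique of incrementing on a same-named opening tag and decrementing on a closing tag until the depth returns to zero.

```typescript
// Given the index just after an opening <tag>, return the index of the
// closing </tag> that balances it, or -1 if the input ends first.
function findMatchingClose(text: string, tag: string, bodyStart: number): number {
  const open = `<${tag}>`;
  const close = `</${tag}>`;
  let depth = 1; // the opening tag before bodyStart is already counted
  let i = bodyStart;
  while (i < text.length) {
    if (text.startsWith(open, i)) {
      depth++;
      i += open.length;
    } else if (text.startsWith(close, i)) {
      depth--;
      if (depth === 0) {
        return i;
      }
      i += close.length;
    } else {
      i++;
    }
  }
  return -1;
}

const nested = "<a>x<a>y</a>z</a>tail";
const closeIndex = findMatchingClose(nested, "a", 3);
// Logs the full outer body, including the inner <a>...</a> pair.
console.log(nested.slice(3, closeIndex));
```

A depth counter assumes nested tags of the same name are balanced. Content that merely mentions an opening tag without closing it would leave the depth above zero, so a simple flag may be the safer choice when nested text is arbitrary.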

Specific Code Areas for Modification

In parseAssistantMessageV2.ts, around line 174 where it checks for new tool starts:

// Check if starting a new tool use.
let startedNewTool = false;

// Only look for new tool starts if we're not already inside a tool
if (!currentToolUse) {
  for (const [tag, toolName] of toolUseOpenTags.entries()) {
    // Existing logic to detect tool start
  }
}

The tool closing detection logic around line 118 is already correctly looking for the specific closing tag of the current tool. The parameter parsing logic should remain largely unchanged as it's already scoped to the current tool.
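Putting the pieces together, a minimal, self-contained sketch of the proposed behavior might look like the following. It is a simplified stand-in rather than the real parser: the Block type, the TOOL_NAMES list, and parseOutermostTools are all hypothetical, parameters are not parsed, and the tool body is kept as a raw string. The point is only the control flow: opening tags are checked while no tool is active, and only the active tool's closing tag is checked otherwise.

```typescript
type Block =
  | { type: "text"; content: string }
  | { type: "tool_use"; name: string; body: string; partial: boolean };

// Illustrative tool names only.
const TOOL_NAMES: string[] = ["write_to_file", "read_file"];

function parseOutermostTools(message: string): Block[] {
  const blocks: Block[] = [];
  let currentTool: string | undefined; // doubles as the "inside a tool" flag
  let segmentStart = 0;

  for (let i = 0; i < message.length; i++) {
    if (currentTool === undefined) {
      // Only look for new tool starts when not already inside a tool.
      const opened = TOOL_NAMES.find((name) => message.startsWith(`<${name}>`, i));
      if (opened !== undefined) {
        const text = message.slice(segmentStart, i).trim();
        if (text.length > 0) {
          blocks.push({ type: "text", content: text });
        }
        currentTool = opened;
        segmentStart = i + opened.length + 2; // skip "<", name, ">"
        i = segmentStart - 1;
      }
    } else {
      // Inside a tool: only the matching closing tag is significant.
      const closeTag = `</${currentTool}>`;
      if (message.startsWith(closeTag, i)) {
        const body = message.slice(segmentStart, i);
        blocks.push({ type: "tool_use", name: currentTool, body, partial: false });
        segmentStart = i + closeTag.length;
        i = segmentStart - 1;
        currentTool = undefined;
      }
    }
  }

  const rest = message.slice(segmentStart);
  if (currentTool !== undefined) {
    // Input ended mid-tool, as happens while a response is still streaming.
    blocks.push({ type: "tool_use", name: currentTool, body: rest, partial: true });
  } else if (rest.trim().length > 0) {
    blocks.push({ type: "text", content: rest.trim() });
  }
  return blocks;
}

const sample =
  "Writing file.<write_to_file><path>a.md</path>" +
  "<content>Use <read_file><path>x</path></read_file> here</content>" +
  "</write_to_file>Done.";

// Logs the block types in order: text, tool_use, text.
console.log(parseOutermostTools(sample).map((b) => b.type));
```

In this sketch a nested closing tag with the same name as the active tool would still end the tool early; handling that case would need something like the lastIndexOf or depth-counter approaches discussed earlier.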

Reproduction Steps

  1. Create a task that requires the assistant to use a tool that contains content with XML-like tags
  2. Observe that the parser incorrectly interprets nested tags as actual tool calls

Expected Behavior

The parser should correctly identify the outermost tool call and treat any content within it (including XML-like tags) as part of the parameter values, not as separate tool calls.

Metadata

Assignees

No one assigned

    Labels

    Issue - Needs Scoping (Valid, but needs effort estimate or design input before work can start.)
    bug (Something isn't working)
    parser

    Type

    No type

    Projects

    Status

    Done

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests
