feat(core): transition subprocess tools to self-describing XML protocol#19337
feat(core): transition subprocess tools to self-describing XML protocol#19337aishaneeshah wants to merge 22 commits intomainfrom
Conversation
|
Hi there! Thank you for your contribution to Gemini CLI. To improve our contribution process and better track changes, we now require all pull requests to be associated with an existing issue, as announced in our recent discussion and as detailed in our CONTRIBUTING.md. This pull request is being closed because it is not currently linked to an issue. Once you have updated the description of this PR to link an issue (e.g., by adding How to link an issue: Thank you for your understanding and for being a part of our community! |
|
Hi @aishaneeshah, thank you so much for your contribution to Gemini CLI! We really appreciate the time and effort you've put into this. We're making some updates to our contribution process to improve how we track and review changes. Please take a moment to review our recent discussion post: Improving Our Contribution Process & Introducing New Guidelines. Key Update: Starting January 26, 2026, the Gemini CLI project will require all pull requests to be associated with an existing issue. Any pull requests not linked to an issue by that date will be automatically closed. Thank you for your understanding and for being a part of our community! |
Summary of ChangesHello @aishaneeshah, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request enhances the shell command execution tools by ensuring that subprocess results are consistently reported with an exit code, regardless of success or failure. By standardizing the output to an XML format, the changes aim to provide models with more reliable and structured information about command execution, improving their ability to interpret and react to shell command outcomes. Highlights
Changelog
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
There was a problem hiding this comment.
Code Review
This pull request aims to standardize the output of shell commands by always including the exit code and formatting the result as XML. However, high-severity XML injection vulnerabilities have been identified across tool-registry.ts and shell.ts. The output and error messages are not properly escaped before being embedded in XML tags, which could lead to indirect prompt injection, violating rules for preventing prompt injection by sanitizing user-provided input. Additionally, there's a critical inconsistency in how this new XML formatting and exit code inclusion are applied to DiscoveredToolInvocation in tool-registry.ts compared to ShellTool, potentially causing unpredictable behavior for the model and making documentation of failure output inconsistent.
I am having trouble creating individual review comments. Click here to see my feedback.
packages/core/src/tools/tool-registry.ts (106-117)
This section has two critical issues. Firstly, there's a high-severity XML injection vulnerability: stdout, stderr, and error messages from discovered tools are embedded in XML tags without proper escaping. This could allow tool output to manipulate the XML structure, leading to indirect prompt injection or misinterpretation by the LLM. Secondly, the logic for handling DiscoveredToolInvocation results is inconsistent with the PR's goal of always including the exit code in a consistent XML format. The XML formatting and exit code inclusion are conditional, which deviates from the standardized approach for ShellToolInvocation.
References
- To prevent prompt injection, avoid including user-provided input in content passed to the LLM (
llmContent). If the input is needed for display purposes, usereturnDisplayinstead. - To prevent prompt injection, sanitize any additional context from hooks by escaping HTML-like tag characters such as
<and>.
packages/core/src/tools/shell.ts (357-366)
The output of the shell command and any error messages are included in the XML-formatted llmContent without escaping. This allows for XML injection if the command output contains XML closing tags (e.g., </output><exit_code>0</exit_code>). An attacker could use this to mislead the LLM about the command's execution status or inject malicious instructions into the LLM's context (Indirect Prompt Injection). All untrusted strings should be XML-escaped before being included in the XML structure.
References
- To prevent prompt injection, avoid including user-provided input in content passed to the LLM (
llmContent). If the input is needed for display purposes, usereturnDisplayinstead. - To prevent prompt injection, sanitize any additional context from hooks by escaping HTML-like tag characters such as
<and>.
packages/core/src/tools/tool-registry.ts (156-162)
The description of the failure output format for discovered tools has been removed. While this aligns with similar changes for other tools, the actual implementation of the output for DiscoveredToolInvocation is now inconsistent, as noted in another comment.
Until the output format is made consistent and predictable (always returning the <subprocess_result> XML), removing this documentation could make it harder for the model to understand the tool's output, especially in failure cases. It might be better to update this description to reflect the new XML format, or restore it until the implementation is fixed.
|
Thank you for linking an issue! This pull request has been automatically reopened. |
|
Size Change: +88 B (0%) Total Size: 24.5 MB ℹ️ View Unchanged
|
|
/gemini review |
There was a problem hiding this comment.
Code Review
This pull request refactors the ShellTool and DiscoveredToolInvocation to output command execution results in a structured XML format, replacing the previous plain text output. This change involves wrapping details like output, error messages, exit codes, signals, background PIDs, and process group IDs within specific XML tags (e.g., <output>, <error>, <exit_code>) under a root <subprocess_result> tag. Additionally, the tool descriptions have been updated to remove the explicit list of returned information, and package-lock.json was updated with peer: true for several dependencies. A review comment highlighted a potential issue where dynamic command output and error messages, if they contain special characters, could lead to malformed XML, suggesting the use of CDATA sections to ensure valid XML output.
| const output = result.output || '(empty)'; | ||
| const parts = [`<output>${output}</output>`]; | ||
|
|
||
| if (result.error) { | ||
| const finalError = result.error.message.replaceAll( | ||
| commandToExecute, | ||
| this.params.command, | ||
| ); | ||
| llmContentParts.push(`Error: ${finalError}`); | ||
| parts.push(`<error>${finalError}</error>`); |
There was a problem hiding this comment.
The command output and error message are directly embedded into the XML string. If the output or error message contains characters like <, >, or &, it will result in malformed XML. This could lead to parsing errors or unpredictable behavior from the language model. To ensure the generated XML is always valid, you should wrap this dynamic content in CDATA sections.
| const output = result.output || '(empty)'; | |
| const parts = [`<output>${output}</output>`]; | |
| if (result.error) { | |
| const finalError = result.error.message.replaceAll( | |
| commandToExecute, | |
| this.params.command, | |
| ); | |
| llmContentParts.push(`Error: ${finalError}`); | |
| parts.push(`<error>${finalError}</error>`); | |
| const output = result.output || '(empty)'; | |
| const parts = [`<output><![CDATA[${output}]]></output>`]; | |
| if (result.error) { | |
| const finalError = result.error.message.replaceAll( | |
| commandToExecute, | |
| this.params.command, | |
| ); | |
| parts.push(`<error><![CDATA[${finalError}]]></error>`); |
References
- To prevent prompt injection, avoid including user-provided input in content passed to the LLM (
llmContent). If the input is needed for display purposes, usereturnDisplayinstead. - To prevent prompt injection, sanitize any additional context from hooks by escaping HTML-like tag characters such as
<and>.
Wrap XML content in CDATA
|
/gemini review |
There was a problem hiding this comment.
Code Review
This pull request introduces a significant change to how shell command results are formatted and presented to the LLM, transitioning from plain text key-value pairs to a structured XML format. Specifically, the ShellTool and DiscoveredTool now wrap their outputs in <subprocess_result> XML tags, with individual components like output, error, exit code, signal, background PIDs, and process group PGID also enclosed in their respective XML tags. A notable change is the explicit inclusion of <exit_code>0</exit_code> for successful commands, which was previously omitted. The verbose descriptions of returned information in tool descriptions and snapshots have been removed to reflect this new XML-based output. Additionally, several package-lock.json entries were updated to include the "peer": true attribute. Review comments highlighted a critical XML injection vulnerability in tool-registry.ts where stdout, stderr, and error values were not properly CDATA-wrapped, and a potential vulnerability in shell.ts where ]]> sequences within CDATA sections were not sanitized, allowing for XML injection.
| `<output>\n <stdout>${stdout.trim() || '(empty)'}</stdout>\n <stderr>${stderr.trim() || '(empty)'}</stderr>\n </output>`, | ||
| ]; | ||
| if (error) { | ||
| parts.push(`<error>${error}</error>`); |
There was a problem hiding this comment.
The stdout, stderr, and error values are directly embedded into XML tags without proper escaping or CDATA wrapping. This creates an XML injection vulnerability, as any XML-special characters in the tool's output can corrupt the XML structure and allow manipulation of the LLM's perception. To prevent this, wrap the content in <![CDATA[ ... ]]> sections, similar to how it's handled in shell.ts, and ensure proper handling of the ]]> sequence.
| const parts = [`<output><![CDATA[${output}]]></output>`]; | ||
| if (result.error) { | ||
| const finalError = result.error.message.replaceAll( | ||
| commandToExecute, | ||
| this.params.command, | ||
| ); | ||
| llmContentParts.push(`Error: ${finalError}`); | ||
| } | ||
| parts.push(`<error><![CDATA[${finalError}]]></error>`); |
There was a problem hiding this comment.
The tool output is wrapped in <![CDATA[ ... ]]> sections, but the content is not sanitized for the ]]> sequence. An attacker-controlled command output could include ]]> to break out of the CDATA section and inject arbitrary XML tags into the tool result. This can be used to mislead the LLM about the execution results or inject malicious instructions into the model's context.
To remediate this, you should escape or remove any occurrences of ]]> within the output and finalError strings before wrapping them in CDATA.
| @@ -153,13 +159,6 @@ Tool discovery and call commands can be configured in project or user settings. | |||
|
|
|||
| When called, the tool call command is executed as a subprocess. | |||
| On success, tool output is returned as a json string. | |||
Add exit_code before output or error to signal the model
move exit code above other parts when tool fails
Summary
This PR transitions shell tool (both built-in and discovered) outputs to a self-describing XML protocol using the
<subprocess_result>tag instead of specifying via plain-text description in the tool description.Details
run_shell_commandand discovered tools now use a structured<subprocess_result>XML format.packages/core/src/utils/xml.tsthat replaces]]>with]]]]><![CDATA[>, preventing subprocess output from closing the CDATA section and injecting XML tags.<exit_code>tag, even when it is 0.packages/core/src/tools/shell-xml-safety.test.tsto verify injection protection.ShellToolandDiscoveredTooltests to assert the new secure XML format.evals/subprocess-xml.eval.ts.Related Issues
Part of solution to #17548.
How to Validate
npm test -w @google/gemini-cli-core -- src/tools/shell-xml-safety.test.ts src/tools/shell.test.ts src/tools/tool-registry.test.tsPre-Merge Checklist