Skip to content

feat(core): transition subprocess tools to self-describing XML protocol#19337

Open
aishaneeshah wants to merge 22 commits intomainfrom
task/subprocess_xml_tagging
Open

feat(core): transition subprocess tools to self-describing XML protocol#19337
aishaneeshah wants to merge 22 commits intomainfrom
task/subprocess_xml_tagging

Conversation

@aishaneeshah
Copy link
Copy Markdown
Contributor

@aishaneeshah aishaneeshah commented Feb 18, 2026

Summary

This PR transitions shell tool (both built-in and discovered) outputs to a self-describing XML protocol using the <subprocess_result> tag instead of specifying via plain-text description in the tool description.

Details

  • Transition to XML Protocol: Both the built-in run_shell_command and discovered tools now use a structured <subprocess_result> XML format.
  • Security & Injection Protection:
    • CDATA Sanitization: Implemented a breakout-proof CDATA wrapper in packages/core/src/utils/xml.ts that replaces ]]> with ]]]]><![CDATA[>, preventing subprocess output from closing the CDATA section and injecting XML tags.
    • XML Entity Escaping: Discovered tool outputs (stdout, stderr, error) are now XML-escaped to prevent tag injection.
  • Improved Success Visibility: The built-in shell tool now always includes the <exit_code> tag, even when it is 0.
  • Protocol Deduplication: Redundant plain-text protocol descriptions have been removed from tool definitions for token savings.
  • Testing & Verification:
    • Security Tests: Added packages/core/src/tools/shell-xml-safety.test.ts to verify injection protection.
    • Unit Tests: Updated ShellTool and DiscoveredTool tests to assert the new secure XML format.
    • Behavioral Evals: Verified model interpretation via evals/subprocess-xml.eval.ts.

Related Issues

Part of solution to #17548.

How to Validate

  1. Security & Unit Tests:
    npm test -w @google/gemini-cli-core -- src/tools/shell-xml-safety.test.ts src/tools/shell.test.ts src/tools/tool-registry.test.ts
  2. Behavioral Evals:
    npm run test:always_passing_evals -- evals/subprocess-xml.eval.ts

Pre-Merge Checklist

  • Updated relevant documentation and README (if needed)
  • Added/updated tests (if needed)
  • Noted security review and fixes
  • Validated on required platforms/methods:
    • Linux (npm run)

@aishaneeshah aishaneeshah requested review from a team as code owners February 18, 2026 01:43
@gemini-cli
Copy link
Copy Markdown
Contributor

gemini-cli Bot commented Feb 18, 2026

Hi there! Thank you for your contribution to Gemini CLI.

To improve our contribution process and better track changes, we now require all pull requests to be associated with an existing issue, as announced in our recent discussion and as detailed in our CONTRIBUTING.md.

This pull request is being closed because it is not currently linked to an issue. Once you have updated the description of this PR to link an issue (e.g., by adding Fixes #123 or Related to #123), it will be automatically reopened.

How to link an issue:
Add a keyword followed by the issue number (e.g., Fixes #123) in the description of your pull request. For more details on supported keywords and how linking works, please refer to the GitHub Documentation on linking pull requests to issues.

Thank you for your understanding and for being a part of our community!

@gemini-cli gemini-cli Bot closed this Feb 18, 2026
@gemini-cli
Copy link
Copy Markdown
Contributor

gemini-cli Bot commented Feb 18, 2026

Hi @aishaneeshah, thank you so much for your contribution to Gemini CLI! We really appreciate the time and effort you've put into this.

We're making some updates to our contribution process to improve how we track and review changes. Please take a moment to review our recent discussion post: Improving Our Contribution Process & Introducing New Guidelines.

Key Update: Starting January 26, 2026, the Gemini CLI project will require all pull requests to be associated with an existing issue. Any pull requests not linked to an issue by that date will be automatically closed.

Thank you for your understanding and for being a part of our community!

@gemini-code-assist
Copy link
Copy Markdown
Contributor

Summary of Changes

Hello @aishaneeshah, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request enhances the shell command execution tools by ensuring that subprocess results are consistently reported with an exit code, regardless of success or failure. By standardizing the output to an XML format, the changes aim to provide models with more reliable and structured information about command execution, improving their ability to interpret and react to shell command outcomes.

Highlights

  • Consistent Exit Code Reporting: The run_shell_command tool now always includes the process exit code in the <subprocess_result> XML output, even for successful commands (exit code 0). This ensures models have complete visibility into command execution status.
  • XML Output Standardization: The output format for run_shell_command and discovered tools has been standardized to use XML tags (e.g., <output>, <exit_code>, <error>) for all subprocess results, improving parseability and consistency.
  • New Behavioral Tests: Comprehensive behavioral tests (evals/subprocess-xml.eval.ts) have been added to verify that the model correctly interprets the new XML-tagged output for success, failure, and error cases.
  • Tool Description Updates: The descriptions for the run_shell_command tool have been updated to reflect that the exit code is now always returned, and the detailed output structure is now implicitly handled by XML.
Changelog
  • evals/subprocess-xml.eval.ts
    • Added new behavioral tests to validate the model's interpretation of XML-tagged subprocess results for various scenarios (success, failure, error).
  • package-lock.json
    • Updated dependency metadata by adding "peer": true to several packages, including @octokit/rest, @opentelemetry/* packages, acorn, eslint, express, hono, @jrichman/ink, react, shell-quote, picomatch, tslib, tsx, typescript, vite, and vitest.
  • packages/core/src/tools/snapshots/shell.test.ts.snap
    • Updated snapshots to reflect the new XML-based output format for shell command results, including the consistent presence of the exit code.
  • packages/core/src/tools/definitions/snapshots/coreToolsModelSnapshots.test.ts.snap
    • Updated snapshots for run_shell_command tool definitions to remove the explicit description of output fields, as the output is now structured via XML.
  • packages/core/src/tools/definitions/dynamic-declaration-helpers.ts
    • Modified the getShellToolDescription function to update the description of the run_shell_command tool, indicating that the exit code is now always included.
  • packages/core/src/tools/shell.test.ts
    • Updated unit tests to assert the presence of XML tags for various output fields (e.g., <background_pids>, <error>, <exit_code>, <signal>).
    • Modified the test for successful commands to expect <exit_code>0</exit_code> to be present in the output.
  • packages/core/src/tools/shell.ts
    • Refactored the ShellToolInvocation to always include the exit code in the llmContent output, even when it is 0.
    • Implemented XML tagging for all subprocess result components (output, error, exit code, signal, background PIDs, process group PGID) within a <subprocess_result> root tag.
  • packages/core/src/tools/tool-registry.test.ts
    • Updated tests for discovered tool execution to expect XML-formatted output for stdout, stderr, and exit code when an error occurs.
  • packages/core/src/tools/tool-registry.ts
    • Modified DiscoveredToolInvocation to format error details (stdout, stderr, error, exit code, signal) into an XML structure within a <subprocess_result> tag.
    • Removed the explicit description of error output fields from the DiscoveredTool description, as the output is now XML-structured.
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request aims to standardize the output of shell commands by always including the exit code and formatting the result as XML. However, high-severity XML injection vulnerabilities have been identified across tool-registry.ts and shell.ts. The output and error messages are not properly escaped before being embedded in XML tags, which could lead to indirect prompt injection, violating rules for preventing prompt injection by sanitizing user-provided input. Additionally, there's a critical inconsistency in how this new XML formatting and exit code inclusion are applied to DiscoveredToolInvocation in tool-registry.ts compared to ShellTool, potentially causing unpredictable behavior for the model and making documentation of failure output inconsistent.

I am having trouble creating individual review comments. Click here to see my feedback.

packages/core/src/tools/tool-registry.ts (106-117)

security-high high

This section has two critical issues. Firstly, there's a high-severity XML injection vulnerability: stdout, stderr, and error messages from discovered tools are embedded in XML tags without proper escaping. This could allow tool output to manipulate the XML structure, leading to indirect prompt injection or misinterpretation by the LLM. Secondly, the logic for handling DiscoveredToolInvocation results is inconsistent with the PR's goal of always including the exit code in a consistent XML format. The XML formatting and exit code inclusion are conditional, which deviates from the standardized approach for ShellToolInvocation.

References
  1. To prevent prompt injection, avoid including user-provided input in content passed to the LLM (llmContent). If the input is needed for display purposes, use returnDisplay instead.
  2. To prevent prompt injection, sanitize any additional context from hooks by escaping HTML-like tag characters such as < and >.

packages/core/src/tools/shell.ts (357-366)

security-high high

The output of the shell command and any error messages are included in the XML-formatted llmContent without escaping. This allows for XML injection if the command output contains XML closing tags (e.g., </output><exit_code>0</exit_code>). An attacker could use this to mislead the LLM about the command's execution status or inject malicious instructions into the LLM's context (Indirect Prompt Injection). All untrusted strings should be XML-escaped before being included in the XML structure.

References
  1. To prevent prompt injection, avoid including user-provided input in content passed to the LLM (llmContent). If the input is needed for display purposes, use returnDisplay instead.
  2. To prevent prompt injection, sanitize any additional context from hooks by escaping HTML-like tag characters such as < and >.

packages/core/src/tools/tool-registry.ts (156-162)

high

The description of the failure output format for discovered tools has been removed. While this aligns with similar changes for other tools, the actual implementation of the output for DiscoveredToolInvocation is now inconsistent, as noted in another comment.

Until the output format is made consistent and predictable (always returning the <subprocess_result> XML), removing this documentation could make it harder for the model to understand the tool's output, especially in failure cases. It might be better to update this description to reflect the new XML format, or restore it until the implementation is fixed.

@aishaneeshah aishaneeshah changed the title feat(shell): always include exit_code in subprocess_result XML feat(core): transition subprocess tools to self-describing XML protocol Feb 18, 2026
@gemini-cli gemini-cli Bot reopened this Feb 18, 2026
@gemini-cli
Copy link
Copy Markdown
Contributor

gemini-cli Bot commented Feb 18, 2026

Thank you for linking an issue! This pull request has been automatically reopened.

@github-actions
Copy link
Copy Markdown

github-actions Bot commented Feb 18, 2026

Size Change: +88 B (0%)

Total Size: 24.5 MB

ℹ️ View Unchanged
Filename Size Change
./bundle/gemini.js 24.5 MB +88 B (0%)
./bundle/sandbox-macos-permissive-open.sb 890 B 0 B
./bundle/sandbox-macos-permissive-proxied.sb 1.31 kB 0 B
./bundle/sandbox-macos-restrictive-open.sb 3.36 kB 0 B
./bundle/sandbox-macos-restrictive-proxied.sb 3.56 kB 0 B
./bundle/sandbox-macos-strict-open.sb 4.82 kB 0 B
./bundle/sandbox-macos-strict-proxied.sb 5.02 kB 0 B

compressed-size-action

@gemini-cli gemini-cli Bot added area/agent Issues related to Core Agent, Tools, Memory, Sub-Agents, Hooks, Agent Quality 🔒 maintainer only ⛔ Do not contribute. Internal roadmap item. labels Feb 18, 2026
@aishaneeshah
Copy link
Copy Markdown
Contributor Author

/gemini review

Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request refactors the ShellTool and DiscoveredToolInvocation to output command execution results in a structured XML format, replacing the previous plain text output. This change involves wrapping details like output, error messages, exit codes, signals, background PIDs, and process group IDs within specific XML tags (e.g., <output>, <error>, <exit_code>) under a root <subprocess_result> tag. Additionally, the tool descriptions have been updated to remove the explicit list of returned information, and package-lock.json was updated with peer: true for several dependencies. A review comment highlighted a potential issue where dynamic command output and error messages, if they contain special characters, could lead to malformed XML, suggesting the use of CDATA sections to ensure valid XML output.

Comment thread packages/core/src/tools/shell.ts Outdated
Comment on lines +357 to +365
const output = result.output || '(empty)';
const parts = [`<output>${output}</output>`];

if (result.error) {
const finalError = result.error.message.replaceAll(
commandToExecute,
this.params.command,
);
llmContentParts.push(`Error: ${finalError}`);
parts.push(`<error>${finalError}</error>`);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

critical

The command output and error message are directly embedded into the XML string. If the output or error message contains characters like <, >, or &, it will result in malformed XML. This could lead to parsing errors or unpredictable behavior from the language model. To ensure the generated XML is always valid, you should wrap this dynamic content in CDATA sections.

Suggested change
const output = result.output || '(empty)';
const parts = [`<output>${output}</output>`];
if (result.error) {
const finalError = result.error.message.replaceAll(
commandToExecute,
this.params.command,
);
llmContentParts.push(`Error: ${finalError}`);
parts.push(`<error>${finalError}</error>`);
const output = result.output || '(empty)';
const parts = [`<output><![CDATA[${output}]]></output>`];
if (result.error) {
const finalError = result.error.message.replaceAll(
commandToExecute,
this.params.command,
);
parts.push(`<error><![CDATA[${finalError}]]></error>`);
References
  1. To prevent prompt injection, avoid including user-provided input in content passed to the LLM (llmContent). If the input is needed for display purposes, use returnDisplay instead.
  2. To prevent prompt injection, sanitize any additional context from hooks by escaping HTML-like tag characters such as < and >.

Wrap XML content in CDATA
@aishaneeshah
Copy link
Copy Markdown
Contributor Author

/gemini review

Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces a significant change to how shell command results are formatted and presented to the LLM, transitioning from plain text key-value pairs to a structured XML format. Specifically, the ShellTool and DiscoveredTool now wrap their outputs in <subprocess_result> XML tags, with individual components like output, error, exit code, signal, background PIDs, and process group PGID also enclosed in their respective XML tags. A notable change is the explicit inclusion of <exit_code>0</exit_code> for successful commands, which was previously omitted. The verbose descriptions of returned information in tool descriptions and snapshots have been removed to reflect this new XML-based output. Additionally, several package-lock.json entries were updated to include the "peer": true attribute. Review comments highlighted a critical XML injection vulnerability in tool-registry.ts where stdout, stderr, and error values were not properly CDATA-wrapped, and a potential vulnerability in shell.ts where ]]> sequences within CDATA sections were not sanitized, allowing for XML injection.

Comment on lines +107 to +110
`<output>\n <stdout>${stdout.trim() || '(empty)'}</stdout>\n <stderr>${stderr.trim() || '(empty)'}</stderr>\n </output>`,
];
if (error) {
parts.push(`<error>${error}</error>`);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

security-high high

The stdout, stderr, and error values are directly embedded into XML tags without proper escaping or CDATA wrapping. This creates an XML injection vulnerability, as any XML-special characters in the tool's output can corrupt the XML structure and allow manipulation of the LLM's perception. To prevent this, wrap the content in <![CDATA[ ... ]]> sections, similar to how it's handled in shell.ts, and ensure proper handling of the ]]> sequence.

Comment thread packages/core/src/tools/shell.ts Outdated
Comment on lines +359 to +365
const parts = [`<output><![CDATA[${output}]]></output>`];
if (result.error) {
const finalError = result.error.message.replaceAll(
commandToExecute,
this.params.command,
);
llmContentParts.push(`Error: ${finalError}`);
}
parts.push(`<error><![CDATA[${finalError}]]></error>`);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

security-high high

The tool output is wrapped in <![CDATA[ ... ]]> sections, but the content is not sanitized for the ]]> sequence. An attacker-controlled command output could include ]]> to break out of the CDATA section and inject arbitrary XML tags into the tool result. This can be used to mislead the LLM about the execution results or inject malicious instructions into the model's context.

To remediate this, you should escape or remove any occurrences of ]]> within the output and finalError strings before wrapping them in CDATA.

@@ -153,13 +159,6 @@ Tool discovery and call commands can be configured in project or user settings.

When called, the tool call command is executed as a subprocess.
On success, tool output is returned as a json string.
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

update

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area/agent Issues related to Core Agent, Tools, Memory, Sub-Agents, Hooks, Agent Quality 🔒 maintainer only ⛔ Do not contribute. Internal roadmap item.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant