Skip to content

fix(shell): force UTF-8 encoding for PowerShell commands on Windows#25262

Open
enjoykumawat wants to merge 2 commits intogoogle-gemini:mainfrom
enjoykumawat:fix/powershell-utf8-encoding-v2
Open

fix(shell): force UTF-8 encoding for PowerShell commands on Windows#25262
enjoykumawat wants to merge 2 commits intogoogle-gemini:mainfrom
enjoykumawat:fix/powershell-utf8-encoding-v2

Conversation

@enjoykumawat
Copy link
Copy Markdown
Contributor

Fixes #20661

Problem

On Windows with a non-UTF-8 system code page (e.g. CP936 for Chinese locale), PowerShell uses the system default encoding for console output. When gemini-cli captures this output through Node.js pipes, non-ASCII characters (e.g. Chinese text) are decoded incorrectly and appear garbled.

Two issues combine to cause this:

  1. PowerShell's [Console]::OutputEncoding defaults to the system code page, so bytes written to stdout use that encoding.
  2. getCachedEncodingForBuffer detects the system code page via chcp and uses it for the TextDecoder, but since we now force UTF-8 output, using the system code page decoder produces garbled text.

Fix

  • Add a POWERSHELL_UTF8_GUARD preamble prepended to every PowerShell command:

    [Console]::OutputEncoding = [System.Text.Encoding]::UTF8; $OutputEncoding = [System.Text.Encoding]::UTF8;

    This forces PowerShell to write UTF-8 bytes to stdout/stderr before the user's command runs.

  • Rename ensurePromptvarsDisabledapplyShellPreamble to reflect its broader responsibility (it now handles both bash shopt options and PowerShell UTF-8 setup).

  • In handleOutput, skip getCachedEncodingForBuffer when shell === 'powershell' and always construct the TextDecoder with 'utf-8', ensuring the forced UTF-8 output is decoded correctly regardless of the system code page.

The guard is idempotent — if the command already starts with it, it is not prepended again.

Add POWERSHELL_UTF8_GUARD preamble to force [Console]::OutputEncoding
and $OutputEncoding to UTF-8 before each PowerShell command, ensuring
non-ASCII output is correctly captured through Node.js pipes.

Also rename ensurePromptvarsDisabled to applyShellPreamble to reflect
its broader responsibility, and skip getCachedEncodingForBuffer when
shell is powershell so the TextDecoder always uses UTF-8 regardless of
the system code page.

Fixes google-gemini#20661
@enjoykumawat enjoykumawat requested a review from a team as a code owner April 13, 2026 08:43
@gemini-code-assist
Copy link
Copy Markdown
Contributor

Summary of Changes

Hello, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request addresses character encoding issues encountered when capturing PowerShell output on Windows systems with non-UTF-8 default code pages. By forcing UTF-8 encoding at the shell level and updating the output decoding logic, the changes ensure that non-ASCII characters are correctly captured and displayed.

Highlights

  • PowerShell UTF-8 Enforcement: Introduced a preamble that forces PowerShell to use UTF-8 encoding for stdout and stderr, preventing garbled text on non-UTF-8 Windows systems.
  • Refactored Shell Preamble Logic: Renamed 'ensurePromptvarsDisabled' to 'applyShellPreamble' to support both bash-specific shopt configurations and the new PowerShell UTF-8 guard.
  • Encoding Decoder Update: Updated 'handleOutput' to bypass system encoding detection and explicitly use 'utf-8' when executing PowerShell commands.
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces a shell preamble to force UTF-8 output encoding in PowerShell, ensuring consistent non-ASCII character handling across different system locales. It also updates the output decoder to skip system encoding detection when PowerShell is used. A critical issue was identified where the shell variable is referenced in the handleOutput closure but is not defined within that scope, which would lead to a ReferenceError at runtime.

const encoding = getCachedEncodingForBuffer(data);
// PowerShell preamble forces UTF-8 output; skip system encoding detection.
const encoding =
shell === 'powershell' ? 'utf-8' : getCachedEncodingForBuffer(data);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

security-high high

A ReferenceError is introduced in the childProcessFallback method. The variable shell is used on line 632 within the handleOutput closure, but it is not defined in the scope of childProcessFallback or the closure itself. This will cause the application to crash with a ReferenceError whenever handleOutput is invoked, leading to a reliable Denial of Service (DoS) for any shell execution that falls back to childProcessFallback. To remediate, ensure the shell variable is available in the scope of childProcessFallback, for example, by calling getShellConfiguration() here. Additionally, please note that executeWithPty (around line 1127) appears to be missing this logic, which means PTY-based executions might still suffer from encoding issues on Windows.

          const { shell } = getShellConfiguration();
          const encoding =
            shell === 'powershell' ? 'utf-8' : getCachedEncodingForBuffer(data);

@gemini-cli gemini-cli bot added priority/p2 Important but can be addressed in a future release. area/core Issues related to User Interface, OS Support, Core Functionality help wanted We will accept PRs from all issues marked as "help wanted". Thanks for your support! labels Apr 13, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area/core Issues related to User Interface, OS Support, Core Functionality help wanted We will accept PRs from all issues marked as "help wanted". Thanks for your support! priority/p2 Important but can be addressed in a future release.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Bug] run_shell_command returns garbled Chinese text for the Agent, while referenced files (@filename) work correctly

1 participant