Windows PowerShell UTF-8 Markdown files can render as mojibake in agent context, causing patch mismatch and unreliable edits #15422

@Gyropilot2

Problem

On Windows with PowerShell 5.1, UTF-8 Markdown files sometimes showed up as mojibake in the agent/tool context even though the files themselves were valid UTF-8 on disk.

Typical example:

  • — (an em dash) showing up as a garbled multi-character sequence

This made patch-based edits unreliable, because the tool seemed to be matching against the misdecoded text it was seeing rather than the real file contents.
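
To illustrate the kind of misdecoding involved, here is a minimal Python sketch (Python is used only for illustration and is not part of the tool). It decodes the UTF-8 bytes of an em dash with code page 437, the initial code page listed under Environment, and with Windows-1252, which is only an assumption about another common legacy code page. Which decoder the tool path actually uses is not known from this report.

```python
# Illustration only: the same bytes, decoded three different ways.
EM_DASH = "\u2014"  # the em dash character

utf8_bytes = EM_DASH.encode("utf-8")      # three bytes: e2 80 94

as_cp437 = utf8_bytes.decode("cp437")     # OEM code page 437 (from Environment)
as_cp1252 = utf8_bytes.decode("cp1252")   # Windows-1252 (assumed, for comparison)
as_utf8 = utf8_bytes.decode("utf-8")      # the correct decoding

# ascii() keeps the output printable even on a non-UTF-8 console.
print(utf8_bytes.hex(" "))
print(ascii(as_cp437), ascii(as_cp1252), ascii(as_utf8))
```

In both legacy decodings the single character becomes three unrelated characters, so any text matched against that view no longer equals the text on disk.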

What I observed

The strange part was that the file was not actually corrupted.

  • Normal agent/shell context could show mojibake
  • Get-Content -Encoding utf8 <file> showed the file correctly
  • Hex inspection confirmed the file bytes were valid UTF-8

So the problem looked less like “the file was broken” and more like “the tool/context path was decoding it incorrectly.”
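
The "valid UTF-8 on disk" check can also be done without involving the console at all. The following is a sketch, not the method used in this report; `is_valid_utf8` is a hypothetical helper name.

```python
from pathlib import Path


def is_valid_utf8(path):
    """Return True if the file's raw bytes decode as strict UTF-8."""
    data = Path(path).read_bytes()  # raw bytes, no console or locale involved
    try:
        data.decode("utf-8", errors="strict")
        return True
    except UnicodeDecodeError:
        return False
```

If this returns True while the agent context still shows mojibake, the file is fine and the decoding step in between is the likely culprit.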

Environment

  • Windows
  • PowerShell 5.1
  • Initial code page: 437
  • Initial console input encoding: IBM437
  • Initial $OutputEncoding: us-ascii

Why this matters

When the displayed context is wrong, patch/edit operations can fail even though the file itself is fine.

This was especially painful with large .md files, because the agent could end up seeing mojibaked context, fail to match hunks, and behave as if the file content were different from what was really on disk.
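
A simplified sketch of why hunks stop matching: if the patch context is built from the misdecoded view, it cannot be found in the correctly decoded file. The matcher below (`hunk_applies`, a hypothetical name) is a deliberately naive stand-in and is not the tool's actual patch logic. Code page 437 is taken from the Environment section.

```python
def hunk_applies(file_text, context_lines):
    """Return True if context_lines appear consecutively in file_text."""
    lines = file_text.splitlines()
    n = len(context_lines)
    return any(lines[i:i + n] == context_lines
               for i in range(len(lines) - n + 1))


# Bytes on disk: valid UTF-8 containing one em dash.
raw = "# Notes\n\nFirst point \u2014 second point\n".encode("utf-8")

seen_by_agent = raw.decode("cp437")  # misdecoded view (assumed failure mode)
on_disk = raw.decode("utf-8")        # what the file really contains

# Context line copied from the misdecoded view.
hunk_context = seen_by_agent.splitlines()[2:3]

matches_misdecoded = hunk_applies(seen_by_agent, hunk_context)  # True
matches_real_file = hunk_applies(on_disk, hunk_context)         # False
print(matches_misdecoded, matches_real_file)
```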

How I worked around it

These local environment changes helped a lot:

  • chcp 65001
  • set console input/output encoding to UTF-8 (for example via [Console]::InputEncoding and [Console]::OutputEncoding)
  • set $OutputEncoding to UTF-8

After that, UTF-8 reads and edit verification behaved much more reliably.

Expected behavior

The tool should read UTF-8 files correctly for context and patch matching regardless of Windows shell/codepage defaults, or at least use a deterministic UTF-8 file-reading path instead of inheriting console encoding behavior.
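
As a rough sketch of what a deterministic UTF-8 file-reading path could look like, the helper below reads raw bytes and decodes them explicitly, so the result does not depend on the console code page or the platform default encoding. `read_text_utf8` is a hypothetical name, and this is not a description of how the tool is implemented.

```python
from pathlib import Path


def read_text_utf8(path):
    """Read a file as UTF-8 regardless of console code page or locale."""
    data = Path(path).read_bytes()
    # "utf-8-sig" drops a leading BOM if present and otherwise behaves
    # like plain UTF-8; strict mode raises instead of guessing.
    return data.decode("utf-8-sig", errors="strict")
```

Because the bytes are decoded directly, line endings are returned exactly as stored. Raising on invalid bytes is a design choice here: a loud failure is easier to diagnose than silently substituted characters.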

Possible related issue

There may also be a separate Windows issue where editing Markdown can normalize line endings in touched regions. That was much less serious than the mojibake problem, but it showed up during the same debugging session.
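
One way to check for this, offered as a sketch rather than something done in the original debugging session, is to count line endings in the raw bytes before and after an edit. `count_line_endings` is a hypothetical helper.

```python
def count_line_endings(data: bytes):
    """Count CRLF, bare LF, and bare CR sequences in raw file bytes."""
    crlf = data.count(b"\r\n")
    lf = data.count(b"\n") - crlf   # LF not preceded by CR
    cr = data.count(b"\r") - crlf   # CR not followed by LF
    return {"crlf": crlf, "lf": lf, "cr": cr}


before = count_line_endings(b"one\r\ntwo\r\nthree\r\n")
after = count_line_endings(b"one\r\ntwo\nthree\r\n")
print(before, after)
```

A file that starts as pure CRLF and ends up with a mix of CRLF and LF points to normalization in the touched region.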

Labels: CLI, bug, tool-calls, windows-os
