[Repo Assist] fix(i18n): repair Windows-1252-misread-as-UTF-8 mojibake in zh-cn, zh-tw, fr-fr resources#585
Conversation
…-tw, fr-fr resources The zh-cn and zh-tw Resources.resw files (1012 and 1023 strings respectively) and fr-fr (17 strings) were double-encoded: the original UTF-8 bytes were interpreted as Windows-1252 code points then re-encoded as UTF-8. This caused mojibake in the UI on systems where the ANSI code page is not 936 (GBK). Fix: reverse the encoding by mapping each code unit back through the Windows-1252 → byte mapping (handling the cp1252-specific 0x80–0x9F range), reassembling the original UTF-8 byte sequence, and writing the correctly decoded string. Root cause: the translation tool used to generate these files decoded source UTF-8 as Windows-1252 before writing the XML. Closes #583 Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
|
Codex review: needs maintainer review before merge. Reviewed May 29, 2026, 9:54 AM ET / 13:54 UTC. Summary Reproducibility: yes. Source inspection on current main shows mojibake in the shipped zh-cn and fr-fr resource values, and the linked user report includes visible before-fix corruption. Review metrics: 3 noteworthy metrics.
Merge readiness Overall follows the weaker of proof and patch quality, so missing proof can cap an otherwise strong patch. Rank-up moves:
Proof guidance:
Mantis proof suggestion Risk before merge
Maintainer options:
Next step before merge
Security Review detailsBest possible solution: Land the resource repair after redacted Windows UI proof, required validation, and a focused localization corruption check are provided or explicitly tracked. Do we have a high-confidence way to reproduce the issue? Yes. Source inspection on current main shows mojibake in the shipped zh-cn and fr-fr resource values, and the linked user report includes visible before-fix corruption. Is this the best way to solve the issue? Yes, the data-only reversal is the narrowest maintainable fix for the encoding corruption, but it still needs Windows runtime proof, required validation, and recurrence coverage before merge. AGENTS.md: found and applied where relevant. Codex review notes: model gpt-5.5, reasoning high; reviewed against 32e6025d00c6. Label changesLabel changes:
Label justifications:
Evidence reviewedWhat I checked:
Likely related people:
What the crustacean ranks mean
Shiny media proof means a screenshot, video, or linked artifact directly shows the changed behavior. Runtime, network, CSP, and security claims still need visible diagnostics. How this review workflow works
|
There was a problem hiding this comment.
Pull request overview
This PR repairs mojibake in WinUI localization resource files so affected Chinese and French UI strings render as intended instead of garbled Windows-1252/UTF-8 misdecoding artifacts.
Changes:
- Restores Simplified Chinese resource values across
zh-cn. - Restores Traditional Chinese resource values across
zh-tw. - Fixes French
œ/Œmojibake in node-related strings.
Show a summary per file
| File | Description |
|---|---|
src/OpenClaw.Tray.WinUI/Strings/zh-cn/Resources.resw |
Replaces widespread garbled Simplified Chinese resource values with readable Chinese text. |
src/OpenClaw.Tray.WinUI/Strings/zh-tw/Resources.resw |
Restores corresponding Traditional Chinese resource values. |
src/OpenClaw.Tray.WinUI/Strings/fr-fr/Resources.resw |
Corrects French nœud / NŒUD strings that were encoded as nÅ“ud / NÅ’UD. |
Copilot's findings
Tip
Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
- Files reviewed: 1/3 changed files
- Comments generated: 0
|
Audit verdict: NO LONGER NEEDED. Current
Targeted byte-level scans of the current master Git objects show the issue is no longer present:
The PR is also stale relative to current master. PR HEAD has 1500 resource values per file, while master has 1509, including newer keys such as Checks performed:
Closing this PR because the underlying issue has already been fixed on master and merging this stale branch could regress newer localization entries. |
🤖 This PR was created by Repo Assist, an automated AI assistant.
Summary
Repairs widespread mojibake (garbled characters) in three localization resource files.
Closes #583
Root Cause
The
zh-cn,zh-tw, andfr-frResources.reswfiles were double-encoded: their original UTF-8 byte sequences were decoded as Windows-1252 code points, then re-encoded as UTF-8 and written into the XML file. On a machine where the ANSI code page is 936 (GBK), the OS would re-interpret those code units correctly by accident — but on any other system (en-US, etc.) the UI renders raw mojibake.Concrete example (zh-cn, 启动 → "启动"):
E5 90 AF E5 8A A8å \x90 ̄ å Š ̈C3 A5 C2 90 C2 AF C3 A5 C5 A0 C2 A8å ̄åŠ ̈on most systems.Fix
For each string value, if every code unit maps cleanly back through the Windows-1252 reverse table (including the
0x80–0x9Fcode-page-specific range), reassemble the original byte sequence and decode it as UTF-8. Strings that are already correct (e.g. ASCII, or genuine Latin text) are left unchanged.Files changed:
zh-cn/Resources.resw— 1,012 strings fixed (out of ~1,700 total)zh-tw/Resources.resw— 1,023 strings fixedfr-fr/Resources.resw— 17 strings fixed (accented characters: é, ê, ç, etc.)nl-nl/Resources.resw— no changes needed (already correct)en-us/Resources.resw— no changes needed (ASCII only)Trade-offs
Test Status
dotnet buildfails with a GitVersionDirectoryNotFoundExceptionfor/home/runner/work/_temp/_runner_file_commands/set_env_*. This is a pre-existing environment issue unrelated to this change (resource.reswfiles are not compiled on Linux). The fix is a pure data change to XML resource files; no C# code was modified.Localization correctness can be verified on Windows with:
pwsh .\scripts\Test-Localization.ps1