Skip to content

[Repo Assist] fix(i18n): repair Windows-1252-misread-as-UTF-8 mojibake in zh-cn, zh-tw, fr-fr resources#585

Closed
github-actions[bot] wants to merge 1 commit into
masterfrom
repo-assist/fix-issue-583-mojibake-resources-cfc632e1db8aec9e
Closed

[Repo Assist] fix(i18n): repair Windows-1252-misread-as-UTF-8 mojibake in zh-cn, zh-tw, fr-fr resources#585
github-actions[bot] wants to merge 1 commit into
masterfrom
repo-assist/fix-issue-583-mojibake-resources-cfc632e1db8aec9e

Conversation

@github-actions
Copy link
Copy Markdown
Contributor

🤖 This PR was created by Repo Assist, an automated AI assistant.

Summary

Repairs widespread mojibake (garbled characters) in three localization resource files.

Closes #583

Root Cause

The zh-cn, zh-tw, and fr-fr Resources.resw files were double-encoded: their original UTF-8 byte sequences were decoded as Windows-1252 code points, then re-encoded as UTF-8 and written into the XML file. On a machine where the ANSI code page is 936 (GBK), the OS would re-interpret those code units correctly by accident — but on any other system (en-US, etc.) the UI renders raw mojibake.

Concrete example (zh-cn, 启动 → "启动"):

  • Original UTF-8 bytes: E5 90 AF E5 8A A8
  • Misinterpreted as Windows-1252: å \x90 ̄ å Š ̈
  • Re-encoded as UTF-8 and stored: C3 A5 C2 90 C2 AF C3 A5 C5 A0 C2 A8
  • This is what the file contained; it renders as å ̄åŠ ̈ on most systems.

Fix

For each string value, if every code unit maps cleanly back through the Windows-1252 reverse table (including the 0x80–0x9F code-page-specific range), reassemble the original byte sequence and decode it as UTF-8. Strings that are already correct (e.g. ASCII, or genuine Latin text) are left unchanged.

Files changed:

  • zh-cn/Resources.resw — 1,012 strings fixed (out of ~1,700 total)
  • zh-tw/Resources.resw — 1,023 strings fixed
  • fr-fr/Resources.resw — 17 strings fixed (accented characters: é, ê, ç, etc.)
  • nl-nl/Resources.resw — no changes needed (already correct)
  • en-us/Resources.resw — no changes needed (ASCII only)

Trade-offs

  • Pure data change; no code logic modified.
  • The fix is deterministic and invertible. The algorithm is conservative: any string that cannot be round-tripped through the Windows-1252 map is left untouched.

Test Status

⚠️ Infrastructure failuredotnet build fails with a GitVersion DirectoryNotFoundException for /home/runner/work/_temp/_runner_file_commands/set_env_*. This is a pre-existing environment issue unrelated to this change (resource .resw files are not compiled on Linux). The fix is a pure data change to XML resource files; no C# code was modified.

Localization correctness can be verified on Windows with:

pwsh .\scripts\Test-Localization.ps1

Generated by 🌈 Repo Assist, see workflow run. Learn more.

To install this agentic workflow, run

gh aw add githubnext/agentics/workflows/repo-assist.md@97143ac59cb3a13ef2a77581f929f06719c7402a

…-tw, fr-fr resources

The zh-cn and zh-tw Resources.resw files (1012 and 1023 strings respectively)
and fr-fr (17 strings) were double-encoded: the original UTF-8 bytes were
interpreted as Windows-1252 code points then re-encoded as UTF-8. This caused
mojibake in the UI on systems where the ANSI code page is not 936 (GBK).

Fix: reverse the encoding by mapping each code unit back through the Windows-1252
→ byte mapping (handling the cp1252-specific 0x80–0x9F range), reassembling the
original UTF-8 byte sequence, and writing the correctly decoded string.

Root cause: the translation tool used to generate these files decoded source
UTF-8 as Windows-1252 before writing the XML.

Closes #583

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@clawsweeper
Copy link
Copy Markdown

clawsweeper Bot commented May 29, 2026

Codex review: needs maintainer review before merge. Reviewed May 29, 2026, 9:54 AM ET / 13:54 UTC.

Summary
The PR replaces mojibake in the fr-fr, zh-cn, and zh-tw WinUI Resources.resw values with UTF-8 text recovered from the Windows-1252-misread strings.

Reproducibility: yes. Source inspection on current main shows mojibake in the shipped zh-cn and fr-fr resource values, and the linked user report includes visible before-fix corruption.

Review metrics: 3 noteworthy metrics.

  • Resource values rewritten: 2,052 values across 3 files. This is a large data-only localization rewrite, so maintainers should require runtime proof even though no C# code changed.
  • Required validation shown: 0 of 3 commands reported complete. AGENTS.md requires build, shared tests, and tray tests, while the PR body reports validation was blocked.
  • Static consistency check: 3 files parsed; 1,500 keys each; 0 placeholder mismatches. The changed resources appear structurally consistent, but this does not replace Windows UI proof.

Merge readiness
Overall: 🧂 unranked krab
Proof: 🧂 unranked krab
Patch quality: 🐚 platinum hermit
Result: blocked until real behavior proof is added.

Overall follows the weaker of proof and patch quality, so missing proof can cap an otherwise strong patch.

Rank-up moves:

  • [P1] Add redacted after-fix Windows UI screenshots or a short recording for zh-CN, zh-TW, and fr-FR rendering without mojibake.
  • [P1] Run and report ./build.ps1 plus the required shared and tray test commands, or provide a maintainer-verified environment blocker.
  • [P1] Add or explicitly track a focused localization regression check for mojibake-like resource corruption.

Proof guidance:

  • [P1] Needs real behavior proof before merge: No after-fix Windows UI screenshot, recording, terminal output, linked artifact, or logs prove the repaired locales render correctly at runtime; proof should be redacted and added to the PR body for re-review. After adding proof, update the PR body; ClawSweeper should re-review automatically. If it does not, the PR author or someone with repository write access can comment @clawsweeper re-review.

Mantis proof suggestion
A visible Windows desktop proof would materially help review because the change is localized UI rendering. A maintainer can ask Mantis to capture proof by posting a new PR comment that starts with the OpenClaw Mantis account mention, followed by:

visual task: verify the Windows tray app renders zh-CN, zh-TW, and fr-FR resource strings without mojibake after this PR.

Risk before merge

  • [P1] No after-fix Windows UI proof shows that the generated PRI/WinUI runtime renders zh-CN, zh-TW, and fr-FR without mojibake.
  • [P1] The PR body reports build validation was blocked, so AGENTS.md-required build, shared tests, and tray tests have not been shown for this branch.
  • [P1] The deterministic round-trip check supports the encoding repair, but it does not prove visual layout, runtime resource loading, or native-language translation quality.
  • [P1] There is still no focused mojibake regression check, so the same resource corruption could recur in translation tooling.

Maintainer options:

  1. Require Windows runtime proof (recommended)
    Have the contributor or a maintainer post redacted screenshots or a short recording for zh-CN, zh-TW, and fr-FR, then report the required validation before merge.
  2. Accept the data-only risk
    Maintainers may intentionally merge after inspecting the deterministic round-trip and owning the lack of contributor runtime proof.
  3. Replace with a verified repair
    If runtime proof and validation cannot be produced, keep the linked bug open and replace this bot branch with a narrower proof-backed repair.

Next step before merge

  • [P2] The remaining blocker is maintainer or contributor proof and validation, not a narrow code repair ClawSweeper can safely make.

Security
Cleared: The diff only changes existing .resw localization values and introduces no new code execution, dependency, workflow, or secrets surface.

Review details

Best possible solution:

Land the resource repair after redacted Windows UI proof, required validation, and a focused localization corruption check are provided or explicitly tracked.

Do we have a high-confidence way to reproduce the issue?

Yes. Source inspection on current main shows mojibake in the shipped zh-cn and fr-fr resource values, and the linked user report includes visible before-fix corruption.

Is this the best way to solve the issue?

Yes, the data-only reversal is the narrowest maintainable fix for the encoding corruption, but it still needs Windows runtime proof, required validation, and recurrence coverage before merge.

AGENTS.md: found and applied where relevant.

Codex review notes: model gpt-5.5, reasoning high; reviewed against 32e6025d00c6.

Label changes

Label changes:

  • add merge-risk: 🚨 other: Merging a large automated localization rewrite without runtime proof could visibly degrade localized UI even if normal static checks pass.
  • add rating: 🧂 unranked krab: Overall readiness is 🧂 unranked krab; proof is 🧂 unranked krab and patch quality is 🐚 platinum hermit.
  • remove rating: 🦪 silver shellfish: Current PR rating is rating: 🧂 unranked krab, so this older rating label is no longer current.

Label justifications:

  • P2: This fixes a real localized UI bug with limited blast radius and a clear resource-file scope, but it is not an emergency runtime outage.
  • merge-risk: 🚨 other: Merging a large automated localization rewrite without runtime proof could visibly degrade localized UI even if normal static checks pass.
  • rating: 🧂 unranked krab: Overall readiness is 🧂 unranked krab; proof is 🧂 unranked krab and patch quality is 🐚 platinum hermit.
  • status: 📣 needs proof: The PR needs real behavior proof before ClawSweeper can clear the contributor ask. Needs real behavior proof before merge: No after-fix Windows UI screenshot, recording, terminal output, linked artifact, or logs prove the repaired locales render correctly at runtime; proof should be redacted and added to the PR body for re-review. After adding proof, update the PR body; ClawSweeper should re-review automatically. If it does not, the PR author or someone with repository write access can comment @clawsweeper re-review.
Evidence reviewed

What I checked:

Likely related people:

  • Scott Hanselman: Blame for the current mojibake lines points to the merge commit that added the affected locale resource files. (role: introduced affected resources / merger; confidence: medium; commits: 8714381248a2; files: src/OpenClaw.Tray.WinUI/Strings/zh-cn/Resources.resw, src/OpenClaw.Tray.WinUI/Strings/zh-tw/Resources.resw, src/OpenClaw.Tray.WinUI/Strings/fr-fr/Resources.resw)
  • the99missedcalls: Recent work touched the same localization resource files while improving voice input readiness UX. (role: recent area contributor; confidence: medium; commits: 6a1429c75cfa; files: src/OpenClaw.Tray.WinUI/Strings/zh-cn/Resources.resw, src/OpenClaw.Tray.WinUI/Strings/zh-tw/Resources.resw, src/OpenClaw.Tray.WinUI/Strings/fr-fr/Resources.resw)
  • Ranjesh: Recent config page editor work touched the same affected resource files. (role: recent adjacent contributor; confidence: low; commits: 9de9b5ba0f8a; files: src/OpenClaw.Tray.WinUI/Strings/zh-cn/Resources.resw, src/OpenClaw.Tray.WinUI/Strings/zh-tw/Resources.resw, src/OpenClaw.Tray.WinUI/Strings/fr-fr/Resources.resw)
What the crustacean ranks mean
  • 🦀 challenger crab: rare, exceptional readiness with strong proof, clean implementation, and convincing validation.
  • 🦞 diamond lobster: very strong readiness with only minor maintainer review expected.
  • 🐚 platinum hermit: good normal PR, likely mergeable with ordinary maintainer review.
  • 🦐 gold shrimp: useful signal, but proof or patch confidence is still limited.
  • 🦪 silver shellfish: thin signal; proof, validation, or implementation needs work.
  • 🧂 unranked krab: not merge-ready because proof is missing/unusable or there are serious correctness or safety concerns.
  • 🌊 off-meta tidepool: rating does not apply to this item.

Shiny media proof means a screenshot, video, or linked artifact directly shows the changed behavior. Runtime, network, CSP, and security claims still need visible diagnostics.

How this review workflow works
  • ClawSweeper keeps one durable marker-backed review comment per issue or PR.
  • Re-runs edit this comment so the latest verdict, findings, and automation markers stay together instead of adding duplicate bot comments.
  • A fresh review can be triggered by eligible @clawsweeper re-review comments, exact-item GitHub events, scheduled/background review runs, or manual workflow dispatch.
  • PR/issue authors and users with repository write access can comment @clawsweeper re-review or @clawsweeper re-run on an open PR or issue to request a fresh review only.
  • Maintainers can also comment @clawsweeper review to request a fresh review only.
  • Fresh-review commands do not start repair, autofix, rebase, CI repair, or automerge.
  • Maintainer-only repair and merge flows require explicit commands such as @clawsweeper autofix, @clawsweeper automerge, @clawsweeper fix ci, or @clawsweeper address review.
  • Maintainers can comment @clawsweeper explain to ask for more context, or @clawsweeper stop to stop active automation.

@clawsweeper clawsweeper Bot added rating: 🦪 silver shellfish Thin PR readiness signal; proof, validation, or implementation needs work. status: 📣 needs proof The PR needs real behavior proof before ClawSweeper can clear the contributor ask. P2 Normal priority bug or improvement with limited blast radius. rating: 🧂 unranked krab Not merge-ready due to missing proof or serious correctness/safety concerns. merge-risk: 🚨 other 🚨 Merging this PR has meaningful risk outside the owned taxonomy. and removed rating: 🦪 silver shellfish Thin PR readiness signal; proof, validation, or implementation needs work. labels May 29, 2026
@shanselman shanselman requested a review from Copilot May 30, 2026 00:15
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR repairs mojibake in WinUI localization resource files so affected Chinese and French UI strings render as intended instead of garbled Windows-1252/UTF-8 misdecoding artifacts.

Changes:

  • Restores Simplified Chinese resource values across zh-cn.
  • Restores Traditional Chinese resource values across zh-tw.
  • Fixes French œ / Œ mojibake in node-related strings.
Show a summary per file
File Description
src/OpenClaw.Tray.WinUI/Strings/zh-cn/Resources.resw Replaces widespread garbled Simplified Chinese resource values with readable Chinese text.
src/OpenClaw.Tray.WinUI/Strings/zh-tw/Resources.resw Restores corresponding Traditional Chinese resource values.
src/OpenClaw.Tray.WinUI/Strings/fr-fr/Resources.resw Corrects French nœud / NŒUD strings that were encoded as nÅ“ud / NÅ’UD.

Copilot's findings

Tip

Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

  • Files reviewed: 1/3 changed files
  • Comments generated: 0

@ranjeshj
Copy link
Copy Markdown
Collaborator

ranjeshj commented Jun 3, 2026

Audit verdict: NO LONGER NEEDED.

Current origin/master already contains 6d14b00 Fix mojibake in localized tray resources, which fixes the Windows-1252-misread-as-UTF-8 mojibake in the same localized resource files targeted by this PR:

  • src/OpenClaw.Tray.WinUI/Strings/fr-fr/Resources.resw
  • src/OpenClaw.Tray.WinUI/Strings/zh-cn/Resources.resw
  • src/OpenClaw.Tray.WinUI/Strings/zh-tw/Resources.resw

Targeted byte-level scans of the current master Git objects show the issue is no longer present:

  • fr-fr/Resources.resw: 1509 values; 0 replacement chars, 0 box-drawing chars, 0 typical Latin mojibake sequences.
  • zh-cn/Resources.resw: 1509 values; 1267 values contain CJK; 0 replacement chars, 0 box-drawing chars, 0 typical Latin mojibake sequences.
  • zh-tw/Resources.resw: 1509 values; 1267 values contain CJK; 0 replacement chars, 0 box-drawing chars, 0 typical Latin mojibake sequences.

The PR is also stale relative to current master. PR HEAD has 1500 resource values per file, while master has 1509, including newer keys such as Chat_Permission_* and SettingsPage_LocalGatewaySetup_*. Some SSH-related localized values are also better on master than on the PR branch.

Checks performed:

Closing this PR because the underlying issue has already been fixed on master and merging this stale branch could regress newer localization entries.

@ranjeshj ranjeshj closed this Jun 3, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

automation merge-risk: 🚨 other 🚨 Merging this PR has meaningful risk outside the owned taxonomy. P2 Normal priority bug or improvement with limited blast radius. rating: 🧂 unranked krab Not merge-ready due to missing proof or serious correctness/safety concerns. repo-assist status: 📣 needs proof The PR needs real behavior proof before ClawSweeper can clear the contributor ask.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Bug: Chinese Simplified text shows garbled characters (mojibake) in UI — v0.6.0-alpha.5

2 participants