fix(ask-user-format): forbid \uXXXX escaping of CJK chars in AskUserQuestion #1203

@joe51317-dotcom

Problem

In tier-2+ skills (plan-eng-review, plan-ceo-review, office-hours,
autoplan, investigate, retro, codex, ...) the model frequently
corrupts CJK characters when emitting AskUserQuestion. Users see garbled
questions.

Concrete repro (Traditional Chinese):

  • Intended: 管理工具 (U+7BA1 U+7406 U+5DE5 U+5177)
  • Rendered: ㄃3用箱. The model wrote \u3103 thinking it was 管 (U+7BA1),
    but U+3103 is ㄃, in the Bopomofo block. The escape does not even
    land in the right Unicode block.

The trigger is long, multi-line question strings, exactly the
decision-brief format gstack mandates (D# + ELI10 + Stakes + Pros/Cons
+ Net), where question bodies routinely run hundreds of CJK characters.
At that length, reflexive \uXXXX escaping kicks in and the model
miscodes from memory.
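
To make the repro concrete, here is a minimal TypeScript sketch (not part of gstack; `codePoints` is a hypothetical helper) that lists the code points of the intended string and decodes both the correct escape and the misremembered one:

```typescript
// Hypothetical helper: format each character's code point as a U+XXXX label.
function codePoints(text: string): string[] {
  return Array.from(text, (ch) =>
    "U+" + (ch.codePointAt(0) ?? 0).toString(16).toUpperCase().padStart(4, "0"),
  );
}

const intended = "管理工具";
const labels = codePoints(intended);

// Decode the correct escape and the misremembered one from the repro.
const correct = JSON.parse('"\\u7BA1"') as string;
const miscoded = JSON.parse('"\\u3103"') as string;

console.log(labels.join(" "));                    // U+7BA1 U+7406 U+5DE5 U+5177
console.log(correct === "管", miscoded === "管"); // true false
```

A single misremembered escape is still valid JSON, so nothing fails at parse time; the corruption only becomes visible when the user reads the question.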

Root cause

scripts/resolvers/preamble/generate-ask-user-format.ts — the single
source of truth for the AskUserQuestion contract injected into every
tier-2+ SKILL.md — has no guidance on character encoding. The model
defaults to escaping non-ASCII in JSON tool params, but tool param
transport (Claude Code, OpenAI, MCP) is UTF-8 native; the escape was
never necessary and is unreliable for long CJK runs.
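
The claim that escaping is unnecessary can be illustrated with a short sketch. It assumes only standard `JSON.stringify` and `JSON.parse`. `escapeNonAscii` is a hypothetical helper that mimics an ASCII-only encoder and is not anything in the gstack codebase:

```typescript
// Hypothetical helper: rewrite every non-ASCII UTF-16 code unit as \uXXXX,
// the way an ASCII-only JSON encoder would.
function escapeNonAscii(json: string): string {
  return json.replace(/[^\x00-\x7F]/g, (unit) =>
    "\\u" + unit.charCodeAt(0).toString(16).padStart(4, "0"),
  );
}

const payload = { question: "請選擇管理工具" };

// JSON.stringify keeps the CJK characters as literal text.
const literalJson = JSON.stringify(payload);

// The escaped form is computed mechanically here, not recalled from memory.
const escapedJson = escapeNonAscii(literalJson);

// Both forms parse to the same string, so the escape adds no information.
const sameAfterParse =
  (JSON.parse(literalJson) as typeof payload).question ===
  (JSON.parse(escapedJson) as typeof payload).question;

console.log(literalJson);    // {"question":"請選擇管理工具"}
console.log(sameAfterParse); // true
```

The two forms are equivalent only when every escape is computed correctly. A program can guarantee that; a model recalling codepoints one by one cannot, which is why the literal form is the safer default.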

Fix

Add one element rule (after current rule 11) and one self-check item
inside generateAskUserFormat(). Both must live in the returned template
string so they land in the skill body, where salience is highest
when the model is about to call the tool.

--- a/scripts/resolvers/preamble/generate-ask-user-format.ts
+++ b/scripts/resolvers/preamble/generate-ask-user-format.ts
@@ rule 11 ends here @@
     labels (A, B, C).

+12. **Non-ASCII characters — write directly, never \\u-escape.** When any
+    string field (question, option label, option description) contains
+    Chinese (繁體/簡體), Japanese, Korean, or other non-ASCII text, emit
+    the literal UTF-8 characters in the JSON string. **Never escape them
+    as \`\\uXXXX\`.** Tool param transport is UTF-8 native; manual
+    escaping requires recalling each codepoint from training, which is
+    unreliable for long CJK strings — the model regularly emits the
+    wrong codepoint (e.g. writes \`\\u3103\` thinking it is 管 U+7BA1,
+    but \`\\u3103\` is actually ㄃, so the user sees \`管理工具\`
+    rendered as \`㄃3用箱\`). The trigger is long, multi-line questions
+    with hundreds of CJK characters: exactly when reflexive escaping
+    kicks in and exactly when miscoding is most damaging. Long ≠ escape.
+
+    Wrong: \`"question": "請選擇\\uXXXX\\uXXXX\\uXXXX\\uXXXX"\`
+    Right: \`"question": "請選擇管理工具"\`
+
+    Only JSON-mandatory escapes remain allowed: \`\\n\`, \`\\t\`, \`\\"\`, \`\\\\\`.
+
 ### Self-check before emitting
 ...
 - [ ] You are calling the tool, not writing prose
+- [ ] Non-ASCII characters (CJK / accents) written directly, NOT \\u-escaped

Then recompile:

bun run gen:skill-docs --host all

Acceptance criteria

  • Rule 12 appears in compiled .agents/skills/gstack-*/SKILL.md
    across all 30 tier-2+ skills
  • New self-check item appears in the same files
  • Manual repro: in a Traditional/Simplified Chinese context, run
    /plan-eng-review on a multi-step plan; AskUserQuestion options
    render intact CJK (no ㄃3用箱-style codepoint substitution)
  • Tier-1 skills unaffected (they don't load generateAskUserFormat)
  • No new test failures in bun test
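
The manual repro could be backed by an automated check on captured tool-call payloads. The sketch below assumes the raw JSON text of an AskUserQuestion call is available as a string. `findUnicodeEscapes` is a hypothetical helper, not an existing gstack function:

```typescript
// Hypothetical check: list \uXXXX escapes for non-ASCII code units found in
// the raw JSON text of a tool call. Escaped backslashes (\\) are skipped so
// that a literal backslash followed by "u1234" is not reported.
function findUnicodeEscapes(rawJson: string): string[] {
  const found: string[] = [];
  for (let i = 0; i < rawJson.length; i++) {
    if (rawJson[i] !== "\\") continue;
    const hex = rawJson.slice(i + 2, i + 6);
    if (rawJson[i + 1] === "u" && /^[0-9a-fA-F]{4}$/.test(hex)) {
      // Report only escapes that stand for non-ASCII code units.
      if (parseInt(hex, 16) > 0x7f) found.push(rawJson.slice(i, i + 6));
      i += 5; // skip the rest of the \uXXXX sequence
    } else {
      i += 1; // skip the escaped character, e.g. the second backslash or "n"
    }
  }
  return found;
}

const good = '{"question":"請選擇管理工具"}';
const bad = '{"question":"請選擇\\u3103\\u7406"}';

console.log(findUnicodeEscapes(good)); // []
console.log(findUnicodeEscapes(bad));  // [ '\\u3103', '\\u7406' ]
```

This only detects that escaping happened. It cannot tell whether an escape is the intended codepoint, so it complements the manual rendering check and does not replace it.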

Impact

Currently breaks gstack's marquee plan-review skills (/plan-eng-review,
/plan-ceo-review, /plan-design-review, /office-hours, /autoplan)
for the entire CJK userbase — TW/CN/JP/KR users see corrupted decision
briefs and can't reliably pick options. One file, ~25 lines, fixes all
30 tier-2+ skills via the shared preamble.

I've validated the patch locally; happy to send a PR if preferred over
this prompt-style issue.
