… \u escapes

The landing page at heznpc.github.io/skillBridge/ has been displaying literal escape sequences for non-Latin language names since the lang-tag list was first auto-generated:

Korean: \ud55c\uad6d\uc5b4 (should be 한국어)
Japanese: \u65e5\u672c\u8a9e (should be 日本語)
Chinese: \u4e2d\u6587(\u7b80\u4f53) (should be 中文(简体))
Russian: \u0420\u0443\u0441\u0441\u043a\u0438\u0439 (should be Русский)
... and 5 more

Verified live at https://heznpc.github.io/skillBridge/ before this fix. Languages whose names are pure Latin (Spanish, French, German, Vietnamese post-decoding via the JS engine in popup) rendered fine.

Root cause: src/lib/constants.js stored each language label as a string literal containing \uXXXX escape sequences. At runtime this is fine: the JS engine decodes them when parsing the source, so popup.js, header-controls.js, and the in-extension language picker all show the right characters. But scripts/generate-docs.js reads constants.js as a TEXT FILE and uses a regex to extract the label between single quotes. The regex captures the raw bytes \, u, d, 5, 5, c, ... which then get written verbatim into docs/index.html. Browsers don't decode \uXXXX in HTML text content, so the live page shows the escape sequences literally.

Fix: convert every \uXXXX escape in PREMIUM_LANGUAGES and AVAILABLE_LANGUAGES to its literal UTF-8 character. constants.js is already a UTF-8 source file, every other label was already non-escaped (Deutsch, Italiano, etc.), and prettier/eslint accept either form, so there was no reason to use escapes in the first place. Once the source has real characters, the script's regex captures them and the HTML gets correct UTF-8 output.

Why this is safe at runtime:
- The JS engine produces an identical string from '\ud55c\uad6d\uc5b4' and '한국어'.
- All runtime consumers (popup.js:112-128, header-controls.js:78-178, translator.js:33, content.js:455) use lang.label as text and don't string-compare against escape literals.
- tests/constants.test.js only checks length, code presence, and label truthiness; there are no string content assertions. Tests pass unchanged.

Verification:
- npm test: 309/309 passing (same as baseline)
- npm run lint: clean
- npm run format:check: clean (Prettier accepts UTF-8 in source)
- npm run docs: idempotent; second run produces no diff
- docs/index.html: confirmed via raw byte read that the new content is valid UTF-8 with real Unicode characters

Out of scope (separate fix candidate):
- README.md L7 links to https://heznpc.github.io/skillbridge/ (lowercase b) but the actual GitHub Pages URL is https://heznpc.github.io/skillBridge/ (capital B). Lowercase returns 404. Same project area but a separate concern.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Summary
The landing page at https://heznpc.github.io/skillBridge/ has been displaying literal \uXXXX escape sequences for non-Latin language names since the lang-tag list was first auto-generated. Verified live before this fix.

- Korean: \ud55c\uad6d\uc5b4 (should be 한국어)
- Japanese: \u65e5\u672c\u8a9e (should be 日本語)
- Chinese (Simplified): \u4e2d\u6587(\u7b80\u4f53) (should be 中文(简体))
- Chinese (Traditional): \u4e2d\u6587(\u7e41\u9ad4) (should be 中文(繁體))
- Russian: \u0420\u0443\u0441\u0441\u043a\u0438\u0439 (should be Русский)
- (+ N more): \u...

Languages whose names are pure Latin (Español, Français, Deutsch, Português (BR), etc.) rendered fine because they were already stored with literal characters.

Root cause
src/lib/constants.js stored each language label as a string literal containing \uXXXX escape sequences. At runtime this is fine: the JS engine decodes \uXXXX when parsing the source, so popup.js, header-controls.js, translator.js, and the in-extension language picker all see the correct characters and display them properly.

But scripts/generate-docs.js reads constants.js as a text file and uses a regex to extract the label between single quotes. The regex captures the raw characters \, u, d, 5, 5, c, ... which are then written verbatim into docs/index.html. Browsers don't decode \uXXXX in HTML text content, so users see the escape sequences literally on the live page.

Fix
Convert every \uXXXX escape in PREMIUM_LANGUAGES and AVAILABLE_LANGUAGES to its literal UTF-8 character. constants.js is already a UTF-8 source file, every other label was already non-escaped (Deutsch, Italiano, Polski...), and prettier/eslint accept either form, so there was no reason to use escapes.

Once the source has real characters, the script's regex captures them and the HTML output is correct UTF-8.

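The difference between what the JS engine sees and what a text-level regex sees can be reproduced in a few lines. This is a minimal sketch, not the project's code: the shape of the constants.js entry, the `LABEL_PATTERN` regex, and the `extractLabel` helper are all assumptions made for illustration, and the actual pattern in scripts/generate-docs.js may differ.

```javascript
// Hypothetical stand-ins for one entry of src/lib/constants.js as it sits on
// disk, before and after the fix. String.raw keeps the backslashes, so
// `escapedSource` holds the same characters a text-file read would return.
const escapedSource = String.raw`{ code: 'ko', label: '\ud55c\uad6d\uc5b4' },`;
const literalSource = "{ code: 'ko', label: '한국어' },";

// Assumed extraction pattern: whatever sits between the single quotes after
// `label:`. The real regex in scripts/generate-docs.js may look different.
const LABEL_PATTERN = /label:\s*'([^']*)'/;

function extractLabel(sourceText) {
  const match = LABEL_PATTERN.exec(sourceText);
  return match ? match[1] : null;
}

// Runtime view: the engine decodes the escapes while parsing the literal,
// so both spellings yield the same 3-character string.
const runtimeEscaped = '\ud55c\uad6d\uc5b4';
const runtimeLiteral = '한국어';
const sameAtRuntime = runtimeEscaped === runtimeLiteral;

// Text-file view: the regex never runs the JS parser, so it captures
// whatever characters are physically present in the source text.
const beforeFix = extractLabel(escapedSource); // backslash-u sequences, 18 characters
const afterFix = extractLabel(literalSource); // real Hangul, 3 characters

console.log({ sameAtRuntime, beforeFix, afterFix });
```

Storing literal characters in the source keeps the regex-based extraction simple; the alternative of decoding escapes inside the docs script would also work but adds a second place where string parsing has to be kept correct.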
Why this is safe at runtime
- '\ud55c\uad6d\uc5b4' and '한국어' produce identical strings when the JS engine parses them
- All runtime consumers (popup.js:112-128, header-controls.js:78-178, translator.js:33, content.js:455) use lang.label as text and don't string-compare against escape literals
- tests/constants.test.js only checks length, code presence, and label truthiness; it makes no string content assertions

Verification
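- npm test: 309/309 passing (same as baseline)
- npm run lint: clean
- npm run format:check: clean (Prettier accepts UTF-8 in source)
- npm run docs: idempotent; second run produces no diff
- docs/index.html: confirmed via raw byte read that the new content is valid UTF-8 with real Unicode characters

Out of scope (separate fix candidate)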
README.md:7 links to https://heznpc.github.io/skillbridge/ (lowercase b) but the actual GitHub Pages URL is https://heznpc.github.io/skillBridge/ (capital B). Lowercase returns 404. Same project area but a separate concern; will track separately if you'd like.

Test plan

test+validate)

🤖 Generated with Claude Code