docs: opt technical terms out of auto-translation (#318) #323
Merged
Browser/machine translators mangle crypto primitive names, brand names, and protocol names. Add a rehype plugin that marks all <code>/<pre> with translate="no" and wraps a curated glossary of bare-prose terms (SHA-256, AES-256-GCM, Ed25519, MLS, Capsule, GraphQL, ...) in <span translate="no">, across all 46 docs with zero per-file edits. The glossary is a single source of truth in no-translate-terms.js; the plugin uses longest-first matching with token-boundary lookarounds and is covered by vitest unit tests. Keeps all files as .md (no MDX conversion) and leaves Starlight i18n as a separate future TODO.
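The longest-first matching with token-boundary lookarounds mentioned above can be sketched roughly as follows. This is a minimal illustration, not the code in `rehype-notranslate.mjs`: the `TERMS` subset, the helper names (`escapeRegExp`, `buildTermPattern`, `findTerms`), and the choice of "letter or digit" as the token boundary are all assumptions made for the example.

```javascript
// Hypothetical glossary subset; the real list lives in no-translate-terms.js.
const TERMS = ["SHA-2", "SHA-256", "ML-KEM", "ML-KEM-768", "Ed25519", "GraphQL"];

// Escape regex metacharacters so a term is always matched literally.
const escapeRegExp = (s) => s.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");

function buildTermPattern(terms) {
  // Longest-first: regex alternation picks the first alternative that matches,
  // so longer terms must be listed before their own prefixes.
  const alternation = [...terms]
    .sort((a, b) => b.length - a.length)
    .map(escapeRegExp)
    .join("|");
  // Assumed token boundary: a term may not touch a letter or digit on either side.
  return new RegExp(`(?<![A-Za-z0-9])(?:${alternation})(?![A-Za-z0-9])`, "g");
}

function findTerms(text, terms = TERMS) {
  return [...text.matchAll(buildTermPattern(terms))].map((m) => m[0]);
}

console.log(findTerms("ML-KEM-768 and SHA-256 hashing"));
// → [ 'ML-KEM-768', 'SHA-256' ]
```

Sorting handles the prefix case (`ML-KEM-768` is tried before `ML-KEM`), while the lookahead is what stops `SHA-2` from matching inside `SHA-256`, because the character after `SHA-2` is a digit.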
Deploying capsule with
| Latest commit: | d6a5142 |
| Status: | ✅ Deploy successful! |
| Preview URL: | https://2924c326.capsule-22k.pages.dev |
| Branch Preview URL: | https://docs-auto-translation-friend.capsule-22k.pages.dev |
Closes #318.
Problem
Browser and machine auto-translators (Chrome/Google Translate, Edge, Safari, Firefox) translate technical content unless told otherwise, mangling cryptographic primitive names (`SHA-256`, `AES-256-GCM`), brand names (Capsule), and protocol names across the docs.

Approach
Issue #318 asked how to apply `translate="no"` across 46 docs / ~4,200 lines without drowning the source in markup. Rather than manual MDX spans, this adds one centralized rehype plugin plus a single glossary: zero per-file content edits, and all files stay `.md`.

The plugin runs over each page's HTML AST and does two passes:

1. Marks every `<code>`/`<pre>` with `translate="no"` + `class="notranslate"`, covering all backticked identifiers and code blocks automatically.
2. Wraps glossary terms found in bare prose in `<span translate="no" class="notranslate">`, skipping text already inside `<code>`/`<pre>`/`<a>`. Uses longest-first matching with token-boundary lookarounds so `ML-KEM-768` beats `ML-KEM` and `SHA-2` never matches inside `SHA-256`.

The glossary (`no-translate-terms.js`) is a single source of truth covering crypto primitives & algorithms, brand/product names, and protocol/infra names.

Files
- `capsule-docs/src/lib/no-translate-terms.js`: glossary (SSoT)
- `capsule-docs/src/lib/rehype-notranslate.mjs`: the plugin
- `capsule-docs/src/lib/rehype-notranslate.test.mjs`: 8 vitest unit tests
- `capsule-docs/astro.config.mjs`: registers the plugin via `markdown.rehypePlugins`
- `capsule-docs/package.json`: adds `vitest` + `rehype` devDeps and a `test` script

Verification

- `bun run test` → 8/8 pass (code marking, prose wrapping, table cells, no double-wrap in code/links, longest-match precedence, token boundaries)
- `bun run build` → 47 pages built, all internal links valid
- `dist/.../primitives/index.html` carries 91 `translate="no"` markers (28 `<code>` + correctly-wrapped prose terms)
- `biome check` clean

Out of scope

- Starlight i18n (left as a future `astro.config.mjs` TODO)
- `.md` → `.mdx` conversion (unnecessary with this approach)

One manual check left for the reviewer: load a built page in Chrome → Translate → confirm protected terms stay in English.
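For reviewers who want a concrete picture of the two passes described under Approach, here is a dependency-free sketch over a hast-style tree (`element` nodes with `tagName`/`properties`/`children`, `text` nodes with `value`). It is an illustration under stated assumptions, not the contents of `rehype-notranslate.mjs`: the hand-rolled walker, the `TERMS` subset, and every function name below are hypothetical, and the real plugin may well use a tree-visiting utility instead.

```javascript
// Hypothetical glossary subset; the real list lives in no-translate-terms.js.
const TERMS = ["AES-256-GCM", "SHA-256", "Capsule"];
const escapeRegExp = (s) => s.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
const PATTERN = new RegExp(
  "(?<![A-Za-z0-9])(?:" +
    [...TERMS].sort((a, b) => b.length - a.length).map(escapeRegExp).join("|") +
    ")(?![A-Za-z0-9])",
  "g"
);

// Pass 1 helper: add translate="no" and the notranslate class to an element.
function markNoTranslate(node) {
  const classes = new Set([...(node.properties?.className ?? []), "notranslate"]);
  node.properties = { ...node.properties, translate: "no", className: [...classes] };
}

// Pass 2 helper: split one text node into plain text and protected <span> nodes.
function splitText(node) {
  const out = [];
  let last = 0;
  for (const m of node.value.matchAll(PATTERN)) {
    if (m.index > last) out.push({ type: "text", value: node.value.slice(last, m.index) });
    out.push({
      type: "element",
      tagName: "span",
      properties: { translate: "no", className: ["notranslate"] },
      children: [{ type: "text", value: m[0] }],
    });
    last = m.index + m[0].length;
  }
  if (last === 0) return [node]; // no glossary term found, keep the node as-is
  if (last < node.value.length) out.push({ type: "text", value: node.value.slice(last) });
  return out;
}

function transform(node, isProtected = false) {
  if (node.type === "element") {
    if (node.tagName === "code" || node.tagName === "pre") markNoTranslate(node);
    // Text inside <code>/<pre>/<a>, or anything already marked, is never wrapped.
    if (node.tagName === "a" || node.properties?.translate === "no") isProtected = true;
  }
  if (!Array.isArray(node.children)) return;
  node.children = node.children.flatMap((child) => {
    if (child.type === "text") return isProtected ? [child] : splitText(child);
    transform(child, isProtected);
    return [child];
  });
}

// Plugin-shaped wrapper: a function returning a tree transformer.
function rehypeNoTranslateSketch() {
  return (tree) => transform(tree);
}
```

A function of this shape is what would be listed under `markdown.rehypePlugins` in `astro.config.mjs`. Treating any element that already carries `translate="no"` as protected is an assumption added here so that running the transformer twice does not double-wrap terms.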