docs: opt technical terms out of auto-translation (#318) #323

Merged: justin13888 merged 1 commit into master from docs/auto-translation-friendly-318 on Jun 3, 2026

@justin13888 (Collaborator)

Closes #318.

Problem

Browser and machine auto-translators (Chrome/Google Translate, Edge, Safari, Firefox) translate technical content unless told otherwise, mangling cryptographic primitive names (SHA-256, AES-256-GCM), brand names (Capsule), and protocol names across the docs.

Approach

Issue #318 asked how to apply translate="no" across 46 docs / ~4,200 lines without drowning the source in markup. Rather than manual MDX spans, this adds one centralized rehype plugin plus a single glossary: zero per-file content edits, and all files stay .md.

The plugin runs over each page's HTML AST and does two passes:

  1. Code protection — marks every <code>/<pre> with translate="no" + class="notranslate", covering all backticked identifiers and code blocks automatically.
  2. Glossary protection — wraps bare-prose occurrences of curated terms in <span translate="no" class="notranslate">, skipping text already inside <code>/<pre>/<a>. Uses longest-first matching with token-boundary lookarounds so ML-KEM-768 beats ML-KEM and SHA-2 never matches inside SHA-256.
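The two passes can be sketched over a hast-style tree of plain objects. This is a minimal illustration and not the code in rehype-notranslate.mjs: the function names (`protect`, `wrapTerms`, `markNoTranslate`) and the hard-coded two-term `TERM_RE` are assumptions made for the example, and the real plugin builds its matcher from the glossary.

```javascript
// Stand-in matcher for two glossary terms (assumption: the real plugin
// derives its pattern from no-translate-terms.js).
const TERM_RE = /(?<![A-Za-z0-9-])(?:SHA-256|Capsule)(?![A-Za-z0-9-])/g;

const CODE_TAGS = new Set(["code", "pre"]); // pass 1: always marked
const SKIP_TAGS = new Set(["code", "pre", "a"]); // pass 2: never wrapped inside

// Set translate="no" and add the "notranslate" class on an element node.
function markNoTranslate(el) {
  el.properties = el.properties ?? {};
  el.properties.translate = "no";
  const classes = new Set(el.properties.className ?? []);
  classes.add("notranslate");
  el.properties.className = [...classes];
}

// Split one text value into text nodes and protected <span> elements.
function wrapTerms(value) {
  const out = [];
  let last = 0;
  for (const m of value.matchAll(TERM_RE)) {
    if (m.index > last) out.push({ type: "text", value: value.slice(last, m.index) });
    const span = {
      type: "element",
      tagName: "span",
      properties: {},
      children: [{ type: "text", value: m[0] }],
    };
    markNoTranslate(span);
    out.push(span);
    last = m.index + m[0].length;
  }
  if (last < value.length || out.length === 0) {
    out.push({ type: "text", value: value.slice(last) });
  }
  return out;
}

// Walk the tree once: mark code elements, wrap prose terms elsewhere.
function protect(node, insideSkipped = false) {
  const isElement = node.type === "element";
  if (isElement && CODE_TAGS.has(node.tagName)) markNoTranslate(node);
  const skip = insideSkipped || (isElement && SKIP_TAGS.has(node.tagName));
  if (!node.children) return node;
  node.children = node.children.flatMap((child) => {
    if (child.type === "text") return skip ? [child] : wrapTerms(child.value);
    return [protect(child, skip)];
  });
  return node;
}

// Usage: one paragraph mixing bare prose and inline code.
const tree = {
  type: "root",
  children: [
    {
      type: "element",
      tagName: "p",
      properties: {},
      children: [
        { type: "text", value: "Capsule hashes with " },
        {
          type: "element",
          tagName: "code",
          properties: {},
          children: [{ type: "text", value: "SHA-256" }],
        },
        { type: "text", value: " and SHA-256." },
      ],
    },
  ],
};
protect(tree);
```

Newly created spans are returned without being walked again, which is what keeps a term from being wrapped twice within a single run.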

The glossary (no-translate-terms.js) is a single source of truth covering crypto primitives & algorithms, brand/product names, and protocol/infra names.
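As a rough sketch of how a glossary like this could feed the matcher: the grouped object shape, the names `NO_TRANSLATE_TERMS`, `buildMatcher`, and `findTerms`, and the exact boundary character class are all assumptions for illustration. Only the terms themselves come from this PR.

```javascript
// Hypothetical excerpt; the real list lives in no-translate-terms.js.
const NO_TRANSLATE_TERMS = {
  crypto: ["SHA-2", "SHA-256", "AES-256-GCM", "Ed25519", "ML-KEM", "ML-KEM-768"],
  brands: ["Capsule"],
  protocols: ["MLS", "GraphQL"],
};

// Escape characters that have special meaning inside a regular expression.
function escapeRegExp(s) {
  return s.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Build one global pattern: longest terms first, with lookarounds so a term
// only matches when it is not glued to a letter, digit, or hyphen.
function buildMatcher(glossary) {
  const terms = [...new Set(Object.values(glossary).flat())];
  const alternation = terms
    .sort((a, b) => b.length - a.length)
    .map(escapeRegExp)
    .join("|");
  return new RegExp(`(?<![A-Za-z0-9-])(?:${alternation})(?![A-Za-z0-9-])`, "g");
}

// Return the protected terms found in a piece of prose, in order.
function findTerms(text, matcher = buildMatcher(NO_TRANSLATE_TERMS)) {
  return [...text.matchAll(matcher)].map((m) => m[0]);
}

// Usage
const found = findTerms("ML-KEM-768 wraps the key; SHA-256 hashes it.");
const none = findTerms("SHA-2567 and XMLS are not glossary terms.");
```

With a hyphen in the boundary class, the lookahead alone already stops ML-KEM from matching inside ML-KEM-768. Sorting longest-first keeps the result independent of that detail if the boundary class is ever loosened.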

Files

  • capsule-docs/src/lib/no-translate-terms.js — glossary (SSoT)
  • capsule-docs/src/lib/rehype-notranslate.mjs — the plugin
  • capsule-docs/src/lib/rehype-notranslate.test.mjs — 8 vitest unit tests
  • capsule-docs/astro.config.mjs — registers the plugin via markdown.rehypePlugins
  • capsule-docs/package.json — adds vitest + rehype devDeps and a test script

Verification

  • bun run test → 8/8 pass (code marking, prose wrapping, table cells, no double-wrap in code/links, longest-match precedence, token boundaries)
  • bun run build → 47 pages built, all internal links valid
  • Generated dist/.../primitives/index.html carries 91 translate="no" markers (28 <code> + correctly-wrapped prose terms)
  • biome check clean
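To reproduce a marker count like the one above on any built page, a small helper along these lines may be enough. It is a sketch that assumes attributes are serialized with double quotes and that `translate` appears inside the opening tag; `countNoTranslate` is a hypothetical name, and the sample HTML is made up for the example.

```javascript
// Count translate="no" markers in an HTML string, split into <code> elements
// and everything else (spans, <pre>, ...).
function countNoTranslate(html) {
  const total = (html.match(/translate="no"/g) ?? []).length;
  const code = (html.match(/<code\b[^>]*\btranslate="no"/g) ?? []).length;
  return { total, code, other: total - code };
}

// Usage on a made-up fragment; for a real check, read the built HTML file
// with fs.readFileSync and pass its contents in.
const sample =
  '<p><span translate="no" class="notranslate">Capsule</span> uses ' +
  '<code translate="no" class="notranslate">SHA-256</code>.</p>' +
  '<pre translate="no" class="notranslate">' +
  '<code translate="no" class="notranslate">x</code></pre>';
const counts = countNoTranslate(sample);
```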

Out of scope

  • Starlight i18n localized builds (existing astro.config.mjs TODO)
  • .md → .mdx conversion (unnecessary with this approach)

One manual check left for the reviewer: load a built page in Chrome → Translate → confirm protected terms stay in English.

Browser/machine translators mangle crypto primitive names, brand names,
and protocol names. Add a rehype plugin that marks all <code>/<pre> with
translate="no" and wraps a curated glossary of bare-prose terms (SHA-256,
AES-256-GCM, Ed25519, MLS, Capsule, GraphQL, ...) in <span translate="no">,
across all 46 docs with zero per-file edits.

The glossary is a single source of truth in no-translate-terms.js; the
plugin uses longest-first matching with token-boundary lookarounds and is
covered by vitest unit tests. Keeps all files as .md (no MDX conversion)
and leaves Starlight i18n as a separate future TODO.

@cloudflare-workers-and-pages

Deploying capsule with Cloudflare Pages

Latest commit: d6a5142
Status: ✅  Deploy successful!
Preview URL: https://2924c326.capsule-22k.pages.dev
Branch Preview URL: https://docs-auto-translation-friend.capsule-22k.pages.dev

@justin13888 merged commit 76220a8 into master on Jun 3, 2026
2 checks passed
@justin13888 deleted the docs/auto-translation-friendly-318 branch on June 3, 2026 03:22

Linked issue: Make Capsule docs more friendly for auto-translations
