[docs] sanitize JSX attribute quotes in auto-translated MDX #247
The German translator periodically emits `<Tab title="Tab „Richtlinien"">`, where it intends `„…“` typographic quotes but uses ASCII `"` for the closing quote. The inner straight `"` terminates the JSX attribute, and the real attribute close becomes a stray `"` before `>`, which trips `mintlify validate` with `Unexpected character "`.

PR #229 fixed this once by hand on `docs/de/dashboard.mdx`. The next auto-translation run regenerated the same broken markup, so the same parse error landed on `main` again after #246.

Make it stick:

- `scripts/translate-docs/mdx-translator.ts` adds `sanitizeJsxAttributes`, which strips stray trailing ASCII `"` after a JSX attribute close and drops unmatched typographic opening quotes (`„`, `“`, `«`, `‹`, `「`, `『`) inside the same value. Matched pairs (e.g. `「ポリシー」`) are preserved. Wired into `translateMdxPage` ahead of `rewriteInternalLinks`.
- `scripts/translate-docs/translator.ts` extends rule #2 of the system prompt to forbid ASCII `"` inside JSX attribute values entirely, so the LLM is less likely to produce the pattern in the first place.
- `__tests__/scripts/translate-docs/mdx-translator.test.ts` covers the exact `de/dashboard.mdx` failure plus self-close, multi-attribute, matched typographic pairs, empty-value, and multiple-on-one-line cases.
- `docs/de/dashboard.mdx` drops the inner German quotes from the two `<Tab title>` attributes (mirrors #229) so CI on `main` goes green immediately rather than waiting for the next translation cycle.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@__tests__/scripts/translate-docs/mdx-translator.test.ts`:
- Around line 101-112: Add a new unit test that exercises sanitizeJsxAttributes
with an attribute value containing one correctly matched typographic quote pair
plus an extra stray opening typographic quote (e.g., a Japanese or German opener
alongside an ASCII or different closer) so the sanitizer must preserve the
matched pair and remove the stray opener; call sanitizeJsxAttributes with that
mixed input (use the same <Tab title="..."> pattern as existing tests) and
assert the returned string keeps the balanced pair intact while stripping only
the unmatched opener (i.e., expected string equals the attribute with the
matched quote pair preserved and the stray opener removed).
In `@scripts/translate-docs/mdx-translator.ts`:
- Around line 47-52: The loop that currently removes all occurrences of an
opener when opens > closes (using cleaned = cleaned.split(open).join("")) should
instead remove only the surplus unmatched opener(s); compute surplus = opens -
closes and remove that many occurrences (e.g., run a loop that calls cleaned =
cleaned.replace(open, "") surplus times or remove the last/first N instances
based on desired behavior), keeping matched pairs intact; update the block that
iterates over openings/open/close and use the variables opens, closes, cleaned
and prefix to perform limited removals (or replace this pass with a proper
JSX/MDX parse-based transform if you prefer).
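The difference between the two removal strategies in the comment above can be sketched as follows. This is a minimal illustration, not the code in `mdx-translator.ts`: the helper names `dropAllOpeners` and `dropSurplusOpeners` are hypothetical, and the right-to-left scan is one of the orderings the comment leaves open.

```typescript
// Illustrative only: helper names here are hypothetical, not the repository's.
const count = (s: string, ch: string): number => s.split(ch).length - 1;

// Earlier behaviour described in the review: when openers outnumber closers,
// every opener is removed, including the ones that belong to matched pairs.
function dropAllOpeners(value: string, open: string, close: string): string {
  return count(value, open) > count(value, close)
    ? value.split(open).join("")
    : value;
}

// Suggested behaviour: remove only the surplus openers, starting from the
// right so that matched pairs further left survive.
function dropSurplusOpeners(value: string, open: string, close: string): string {
  let cleaned = value;
  let surplus = count(cleaned, open) - count(cleaned, close);
  while (surplus > 0) {
    const idx = cleaned.lastIndexOf(open);
    if (idx === -1) break;
    cleaned = cleaned.slice(0, idx) + cleaned.slice(idx + open.length);
    surplus -= 1;
  }
  return cleaned;
}

const mixed = "„Foo“ und „Bar";
console.log(dropAllOpeners(mixed, "„", "“"));     // Foo“ und Bar
console.log(dropSurplusOpeners(mixed, "„", "“")); // „Foo“ und Bar
```

Scanning from the right assumes the dangling opener is the last one in the value, which holds for the translator output quoted in this PR but is not guaranteed in general.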
📒 Files selected for processing (5)
- `CHANGELOG.md`
- `__tests__/scripts/translate-docs/mdx-translator.test.ts`
- `docs/de/dashboard.mdx`
- `scripts/translate-docs/mdx-translator.ts`
- `scripts/translate-docs/translator.ts`
Two findings on PR #247:

1. `mdx-translator.ts:50`: `cleaned.split(open).join("")` removed *every* occurrence of an opener when `opens > closes`, so a value containing one matched typographic pair plus one stray opener (e.g. `„Foo“ und „Bar`) lost the matched pair too. Fix: drop only the `surplus = opens - closes` openers, scanning from the right with `lastIndexOf` so the leftmost matched pair is preserved.
2. `mdx-translator.test.ts`: add a regression test for that mixed case (one matched `„…“` pair + one dangling `„`) so the bug above can't recur.

Also drop the English curly `“…”` pair from the openings list. U+201C is both the German closer and the English-curly opener, so processing the English pair after the German pair would strip the very German closer we just preserved. The remaining pairs (German, French ×2, Japanese ×2) all have unambiguous openers.

1177 unit tests pass (was 1176; the new mixed-case test is the +1).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
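The U+201C ambiguity can be shown with a small count-based check. This is a sketch under assumptions: `surplusOpeners` and the pair constants are hypothetical names, and counting openers against closers is taken from the `opens`/`closes` wording in the commit message rather than from the actual source.

```typescript
// Hypothetical names; only the code points come from the commit message.
type QuotePair = readonly [string, string];

const GERMAN: QuotePair = ["\u201E", "\u201C"];  // „ … “
const ENGLISH: QuotePair = ["\u201C", "\u201D"]; // “ … ”

const count = (s: string, ch: string): number => s.split(ch).length - 1;

// Number of openers in `value` that have no closer of the same pair.
function surplusOpeners(value: string, [open, close]: QuotePair): number {
  return Math.max(0, count(value, open) - count(value, close));
}

const balancedGerman = "\u201EFoo\u201C"; // „Foo“

console.log(surplusOpeners(balancedGerman, GERMAN));  // 0
console.log(surplusOpeners(balancedGerman, ENGLISH)); // 1
```

Under the English pair, the German closer is counted as an opener with no closer, so a cleanup pass over that pair would remove it from a value that was already balanced.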
…ion-bumps policy (#285)

* [luv-cut-0.0.10-beta.0] chore: cut 0.0.10-beta.0 release

Bumps package.json from 0.0.9-beta.3 to 0.0.10-beta.0 and rolls the `## Unreleased` changelog section into `## 0.0.10-beta.0 — 2026-05-04`.

Why 0.0.10-beta.0 and not 0.0.9-beta.3: 0.0.9 is already published as `latest` on npm. Per semver, 0.0.9-beta.3 < 0.0.9, so publishing it would point the `beta` dist-tag at a version semver-older than the released 0.0.9, while shipping *more* features than 0.0.9 ever had. The next pre-release after a shipped 0.0.9 must live in the 0.0.10 line.

Why the version had drifted to 0.0.13-beta.1 before #284 reset it: PRs #266 (OpenCode) and #267 (Pi) each speculatively bumped package.json in their feature branches even though no release was being cut. When unified into #270, the bumps stacked (0.0.10-beta.1 → .2 → 0.0.11-beta.1 → 0.0.12-beta.1 → 0.0.13-beta.1). Going forward, feature PRs should leave package.json alone; only release-cut PRs touch the version.

Adds since v0.0.9:

Features:
- Add Gemini CLI integration (beta) (#277)
- Add OpenCode (sst/opencode) integration (beta) (#270)
- Add Pi (pi-coding-agent) integration (beta) (#270)
- Add GitHub Copilot CLI integration (beta) (#236)
- Add Cursor Agent CLI integration (beta) (#245)
- Project page lists Copilot and Cursor sessions (#245)

Fixes:
- Pi integration: cache sessionId in shim (#284)
- Cursor integration: support cursor-agent 2026-04+ layout (#283)
- block-read-outside-cwd: deny message for all 6 CLIs (#270)
- require-ci-green-before-stop: scope to current HEAD (#266)
- failproofai policies --uninstall: correct selector wording (#236)
- README: replace broken Copilot and Cursor logos (#236, #257)
- Auto-translated MDX: sanitize JSX attribute quotes (#247)

Docs:
- README: drop "more coming soon" tagline (#281)
- README: add Gemini, Pi, Cursor to supported-CLIs list (#277, #264, #245)

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* chore: add block-version-bumps custom policy

Prevents the kind of drift that caused this very release. PRs #266 (OpenCode) and #267 (Pi) each speculatively bumped package.json in their feature branches, and when unified into #270 the bumps stacked all the way to 0.0.13-beta.1. PR #284 then over-corrected to 0.0.9-beta.3, which is older than the already-published 0.0.9.

The policy lives at .failproofai/policies/block-version-bumps.mjs (auto-loaded by failproofai's project-scope hooks). It blocks:
- Edit/Write/MultiEdit on package.json that touches the "version" key
- Bash: npm|yarn|pnpm|bun (pm) version <args>
- Bash: sed|awk|jq mutating package.json referencing "version"

Allowed when on a `luv-cut-*` branch, the established release-cut branch convention. Branch detection is a best-effort `git rev-parse` that fails open (returns false) so a missing/unusable git tree never blocks a legitimate edit.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* chore: address CodeRabbit review on block-version-bumps

Three valid findings, all fixed:

1. sed/awk/jq detection (line 26): regex required `package.json` to appear before `version`, missing forms like `jq '.version="x"' package.json`. Switched to two non-consuming lookaheads so either ordering matches within a shell segment.
2. Value-only Edit/MultiEdit bypass (lines 74-84): an agent could issue `Edit { old_string: '"0.0.9-beta.3"', new_string: '"0.0.10-beta.0"' }`. Neither string contains the literal `"version"` key, so the previous check let it through. Added STANDALONE_SEMVER_VALUE_RE plus an editTouchesVersion() helper that catches a value-only swap when both sides are bare semver-quoted strings that differ. The anchors (^ / $) and leading-digit requirement intentionally exclude range-prefixed dep entries (`"^1.2.3"`) and key-prefixed ones (`"react": "18.2.0"`), so dep-version Edits aren't false-positive.
3. Loose cut-branch match (line 36): `^luv-cut-/` allowed any suffix (e.g. `luv-cut-feature`). Tightened to require a semver-shape suffix: `^luv-cut-\d+\.\d+\.\d+(?:-[0-9A-Za-z.-]+)?$`.

Verified via 16 regex test cases (sed orderings, dep edits with keys, range prefixes, cut branch shapes).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
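The three findings above can be illustrated with a rough sketch. Only the cut-branch pattern is quoted from the commit message; the sed/awk/jq pattern, the semver-value pattern, and every name ending in `Sketch` (or `PKG_VERSION_MUTATION_RE`) are assumptions made for illustration and may differ from what `block-version-bumps.mjs` actually contains.

```typescript
// Assumed pattern: matches sed/awk/jq when the same shell segment (no ; & |)
// mentions both package.json and version, in either order. The lookaheads
// check the rest of the segment without consuming it.
const PKG_VERSION_MUTATION_RE =
  /\b(?:sed|awk|jq)\b(?=[^;&|]*package\.json)(?=[^;&|]*version)/;

// Quoted from the commit message: a cut branch needs a semver-shaped suffix.
const CUT_BRANCH_RE = /^luv-cut-\d+\.\d+\.\d+(?:-[0-9A-Za-z.-]+)?$/;

// Assumed pattern: a bare quoted semver such as "0.0.10-beta.0". The anchors
// and the leading digit reject "^1.2.3" and "react": "18.2.0".
const STANDALONE_SEMVER_VALUE_SKETCH_RE =
  /^"\d+\.\d+\.\d+(?:-[0-9A-Za-z.-]+)?"$/;

// Hypothetical stand-in for the editTouchesVersion() helper named above.
function editTouchesVersionSketch(oldString: string, newString: string): boolean {
  if (oldString.includes('"version"') || newString.includes('"version"')) {
    return true;
  }
  return (
    STANDALONE_SEMVER_VALUE_SKETCH_RE.test(oldString) &&
    STANDALONE_SEMVER_VALUE_SKETCH_RE.test(newString) &&
    oldString !== newString
  );
}

console.log(PKG_VERSION_MUTATION_RE.test(`jq '.version="x"' package.json`)); // true
console.log(CUT_BRANCH_RE.test("luv-cut-feature"));                          // false
console.log(editTouchesVersionSketch('"0.0.9-beta.3"', '"0.0.10-beta.0"'));  // true
console.log(editTouchesVersionSketch('"^1.2.3"', '"^1.2.4"'));               // false
```

Excluding `;`, `&`, and `|` from the lookahead span is one simple way to approximate "within a shell segment"; it would also stop at a `|` inside a quoted sed expression, which a real policy may need to handle differently.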
Summary
- The `CI / docs / Validate docs` job on `main` is red after #246 (`docs: update translations for changed English sources`) because the German translator emitted `<Tab title="Tab „Richtlinien"">` and `<Tab title="Tab „Aktivität"">` in `docs/de/dashboard.mdx`. The opening `„` (U+201E) is fine, but the LLM closed the typographic pair with an ASCII `"` (U+0022) instead of the proper `“` (U+201C). That ASCII `"` terminates the JSX attribute, leaving the real attribute close as a stray character before `>`, which trips `mintlify validate` with `Unexpected character "`.
- This is a repeat failure on `main`: PR #229 fixed it once by hand-editing the file, but the next auto-translation cycle regenerated the same broken markup. User directive: "this is failing a lot... lets fix this for once all".
- `scripts/translate-docs/mdx-translator.ts`: new `sanitizeJsxAttributes(content)` strips stray trailing ASCII `"` after a JSX attribute close, and drops only the surplus unmatched typographic opening quotes (`„`, `«`, `‹`, `「`, `『`) inside the same value, scanning from the right so a matched pair earlier in the same value is preserved (e.g. `„Foo“ und „Bar` → `„Foo“ und Bar`). Wired into `translateMdxPage` ahead of `rewriteInternalLinks` so every translated page is sanitized before write.
- `scripts/translate-docs/translator.ts`: rule #2 of the system prompt now explicitly forbids ASCII `"` inside JSX attribute values, so the LLM is less likely to produce the pattern in the first place. Cached as part of the system-prompt prefix.
- `__tests__/scripts/translate-docs/mdx-translator.test.ts`: 9 new tests covering the exact `de/dashboard.mdx:65` failure, self-close, multi-attribute, matched typographic pairs, empty values, multiple malformed attributes on one line, and the mixed `matched-pair + stray-opener` case CodeRabbit flagged.
- `docs/de/dashboard.mdx`: strip the inner German quotes from the two `<Tab title>` attributes (mirrors #229) so the failing `docs / Validate docs` check passes immediately on this PR rather than waiting for the next cache-invalidating translation run.
- `“…”` is intentionally not in the openings list: U+201C is the German closer and the English-curly opener, so processing the English pair after German would strip the German closer. The remaining pairs (German, French ×2, Japanese ×2) all have unambiguous openers.
- `CHANGELOG.md`: entry under `## Unreleased` → `### Fixes`.

Fixes failing run: https://github.com/exospherehost/failproofai/actions/runs/25147733926
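The sanitizer pass described in the summary can be sketched end to end as below. This is a minimal sketch, not the implementation in `mdx-translator.ts`: the function name `sanitizeJsxAttributesSketch`, the `MALFORMED_ATTR` regular expression, and the closing characters paired with each opener are assumptions; only the openers and the expected input/output pairs come from the PR text.

```typescript
// Openers as listed in the summary; the closers are assumed counterparts.
const QUOTE_PAIRS: ReadonlyArray<readonly [string, string]> = [
  ["„", "“"], // German
  ["«", "»"], // French
  ["‹", "›"], // French (single)
  ["「", "」"], // Japanese
  ["『", "』"], // Japanese (double)
];

// Assumed detection rule: name="value"" followed by whitespace, ">" or "/>".
// The second of the two closing quotes is the stray one.
const MALFORMED_ATTR = /(\s[A-Za-z_][\w-]*=)"([^"\n]*)""(?=\s|\/?>)/g;

const count = (s: string, ch: string): number => s.split(ch).length - 1;

function sanitizeJsxAttributesSketch(content: string): string {
  return content.replace(
    MALFORMED_ATTR,
    (_match: string, prefix: string, value: string): string => {
      let cleaned = value;
      for (const [open, close] of QUOTE_PAIRS) {
        // Remove only the unmatched surplus, scanning from the right.
        let surplus = count(cleaned, open) - count(cleaned, close);
        while (surplus > 0) {
          const idx = cleaned.lastIndexOf(open);
          cleaned = cleaned.slice(0, idx) + cleaned.slice(idx + open.length);
          surplus -= 1;
        }
      }
      // Re-emit the attribute with a single closing quote.
      return `${prefix}"${cleaned}"`;
    },
  );
}

console.log(sanitizeJsxAttributesSketch('<Tab title="Tab „Richtlinien"">'));
// <Tab title="Tab Richtlinien">
console.log(sanitizeJsxAttributesSketch('<Tab title="„Foo“ und „Bar"">'));
// <Tab title="„Foo“ und Bar">
```

A regex pass like this only touches attributes that already show the doubled closing quote, so well-formed attributes, including empty values and matched pairs such as `「ポリシー」`, pass through unchanged. A parser-based transform would be more robust but is outside what this sketch attempts.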
CodeRabbit follow-up (commit 897ee50)

CodeRabbit raised two issues on the first commit and both are addressed:

- The opener cleanup used `cleaned.split(open).join("")`, which removed all occurrences of an opener whenever `opens > closes`, breaking matched pairs. Fixed: it now drops only `surplus = opens - closes` openers, scanning from the right with `lastIndexOf` so the leftmost matched pair is preserved.
- Added a regression test, `drops only the surplus opener when a matched pair is also present`, covering `<Tab title="„Foo“ und „Bar"">` → `<Tab title="„Foo“ und Bar">`.

Test plan
- `bun run test:run __tests__/scripts/translate-docs/mdx-translator.test.ts`: 18 tests pass (9 existing + 9 new sanitizer cases including the CodeRabbit mixed case).
- `bun run test:run`: full unit suite stays green (1177 passed). The `[failproofai:hook] WARN policy "thrower" threw: …` lines in the output are intentional fixture coverage from `__tests__/hooks/policy-evaluator.test.ts:146-161` (testing the evaluator's fail-open behavior), not regressions from this change.
- `bun run lint`: only the pre-existing `<img>` warning in `app/components/log-viewer/tool-input-output.tsx`.
- `bunx tsc --noEmit`: clean.
- `grep -n "Tab title" docs/de/dashboard.mdx` shows `Tab Richtlinien` / `Tab Aktivität` (no stray quotes).
- CI checks: `quality`, `build`, `docs`, `test (default)`, `test (log-debug, debug)`, `test (hook-log-file, 1)`, `test-e2e`, `CodeRabbit`; 1 skipped (Mintlify Deployment).
- `gh run watch` until the second commit's `docs / Validate docs` job goes green too.

🤖 Generated with Claude Code