feat(pm-adapter): typed direction resolver chain (SD-2776)#3184
Conversation
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 3f38b84217
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
|
FYI - here is the new PR. |
dfe29c9 to
cdae1fa
Compare
2a7cb6a to
6869011
Compare
cdae1fa to
2e967df
Compare
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 2e967dff44
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
Introduces orthogonal direction context types so future RTL work cannot accidentally collapse axes that ECMA-376 keeps separate: - BaseDirection, WritingMode (enums) - SectionDirectionContext (page direction, gutter — chrome only) - TableDirectionContext (visual cell ordering only) - CellDirectionContext (cell writing mode) - ParagraphDirectionContext (paragraph inline base direction + writing mode) - RunBidiContext, RunScriptContext (run-level signals; consumed in 1b/1c) Adds `directionContext` field to ParagraphAttrs alongside the existing `direction` scalar. Both are populated by pm-adapter from the same source; consumers can migrate gradually. Per ECMA-376 §17.6.1 / §17.3.1.6 / §17.4.1 / §17.3.1.41, each axis stays separate: section bidi is chrome only, paragraph bidi is paragraph-local, table visual direction is cell ordering, writing mode is the one inheriting axis. No behavior change. Resolver chain and migration follow in subsequent commits.
New module pm-adapter/src/direction/ with:
- resolveSectionDirection / resolveTableDirection / resolveCellDirection /
resolveParagraphDirection — context propagation chain mirroring OOXML
containment hierarchy
- logicalSides helpers (resolveLogicalAlignment, resolveLogicalIndent,
physicalSide, isRtl, toBaseDirection) — direction-aware logical→physical
mapping
- 12 non-collapse tests enforcing the four ECMA spec rules:
1. Section w:bidi MUST NOT make paragraphs RTL (§17.6.1)
2. Table w:bidiVisual MUST NOT make cell paragraphs RTL (§17.4.1)
3. Run-level w:rtl MUST NOT bubble up to paragraph
4. Paragraph w:bidi DOES produce paragraph RTL (§17.3.1.6, including
style cascade through docDefaults per §17.7.2)
- Writing mode IS the one inheriting axis (§17.3.1.41) — paragraph→cell→
section→default
Co-located README documenting the spec rules, a worked-example for
downstream consumers, and explicit non-goals (script classifier and bidi
controls deferred to Wave 1b / 1c).
No production call sites consume the resolver yet; migration follows.
…lver Replaces the cascade in resolveEffectiveParagraphDirection + inferDirectionFromRuns with the typed resolver chain from pm-adapter/src/direction/. The cascade had three issues identified by the audit at .tmp/rtl-audit-findings.md: 1. Section→paragraph fallback (§17.6.1 violation) — section bidi propagated to paragraph inline direction. Latin paragraphs in RTL sections rendered right-aligned; Word renders them left-aligned. 2. Majority-of-runs heuristic (UAX #9 P2/P3 disagreement) — base direction came from counting runs whose w:rtl flag was set, not the first strong character of the text content. 3. docDefaultsDirection parameter (redundant) — the style-engine cascade in style-engine/src/ooxml/index.ts:165 already resolves docDefaults.paragraphProperties.rightToLeft into the paragraph's resolved properties before this resolver runs. Now: paragraph direction comes from paragraph w:bidi (or its style cascade); when absent, inlineDirection is undefined and the browser applies UBA via the missing dir attribute. Output corrected for documents that today render incorrectly; unchanged for documents that were already correct. Tests updated: - paragraph.test.ts: removed cascade/heuristic tests that codified the spec violations - paragraph.test.ts: section-fallback test flipped to assert no inheritance - index.test.ts: two integration tests flipped to expect undefined paragraph direction when only section bidi is set Validation: - 1,765 pm-adapter unit tests pass - 211 contracts unit tests pass - 12,374 super-editor unit tests pass (incl. footer w:rtl roundtrip) - 51 RTL Playwright behavior tests pass across Chromium/Firefox/WebKit Closes SD-2776, SD-2778. The legacy attrs.direction scalar remains populated for backwards compatibility; consumers should migrate to attrs.directionContext over time.
The resolved TableProperties type from the style-engine uses `rightToLeft` for the bidiVisual flag (matching the existing importer convention). The resolver previously checked only `bidiVisual`, so passing real resolved table properties would leave visualDirection undefined for RTL tables. Now accepts both `rightToLeft` (style-engine name) and `bidiVisual` (OOXML name) for safety. Test added to cover the alias.
Per ECMA-376 §17.18.93, ST_TextDirection has 12 enumeration values across the strict and Word-transitional vocabularies. The V-suffix variants are glyph rotation, which CSS expresses through text-orientation, so they share the writing-mode of their non-V sibling. Before this commit the three resolvers (paragraph/section/cell) handled 6 of the 12 values; lrTbV, tbRlV, tbV, lrV, rlV all fell through to undefined and the resolver silently used the inherited/default writing-mode instead. The repo's ST_TEXT_DIRECTION contract (registry.ts:18) publishes lrTbV and tbRlV as accepted values, so this was a contract violation - documents that imported one of these would lose their writing-mode override. Adds an exhaustive test that exercises all 12 values on paragraph, section, and cell.
447cacd to
38afb53
Compare
Codecov Report✅ All modified and coverable lines are covered by tests. 📢 Thoughts on this report? Let us know! |
|
🎉 This PR is included in @superdoc-dev/mcp v0.3.0-next.72 The release is available on GitHub release |
|
🎉 This PR is included in vscode-ext v2.3.0-next.116 |
|
🎉 This PR is included in @superdoc-dev/react v1.2.0-next.114 The release is available on GitHub release |
|
🎉 This PR is included in superdoc v1.30.0-next.69 The release is available on GitHub release |
|
🎉 This PR is included in superdoc-cli v0.8.0-next.88 The release is available on GitHub release |
|
🎉 This PR is included in superdoc-sdk v1.8.0-next.70 |
|
🎉 This PR is included in superdoc-cli v0.9.0 The release is available on GitHub release |
|
🎉 This PR is included in superdoc v1.32.0 The release is available on GitHub release |
|
🎉 This PR is included in @superdoc-dev/mcp v0.4.0 The release is available on GitHub release |
|
🎉 This PR is included in @superdoc-dev/react v1.3.0 The release is available on GitHub release |
|
🎉 This PR is included in vscode-ext v2.4.0 |
…(SD-2781) (#3203) * feat(pm-adapter): preserve run-level bidi/script metadata on TextRun (SD-2781) Adds two preservation-only fields to the layout text-run contract, kept on separate axes per ECMA's own categorization: - TextRun.bidi (RunBidiContext): direction signals only - run rtl flag now, embedding (w:dir) and override (w:bdo) wired in Wave 1c. - TextRun.script (RunScriptContext): complex-script flag + per-script language metadata (default / complexScript / eastAsian) per §17.3.2.20. Both populated by pm-adapter from raw run properties when present. Wave 1a does not render either; Wave 1b will gate the formatting-stack selection on script.complexScript, Wave 1c will read bidi.embedding/override. Why now: ECMA puts direction (rtl, dir, bdo) and script formatting (cs, lang/@bidi) in different categories. Lumping them under one bidi field would collapse the axes and lie about the schema. Adding both fields now means Wave 1b/1c don't have to introduce both the data path and the rendering at once. The RunScriptContext.language field expanded from a single string to a structured object with three optional tags. No production consumer reads the field yet (preservation-only since #3184), so the shape change is safe. Tests: - bidi populated only on rtl, script populated only on cs/lang - explicit rtl=false preserved (a meaningful override) - the three lang tags land on separate fields - axis non-collapse: rtl never leaks into script, cs never leaks into bidi * fix(contracts): import RunBidiContext + RunScriptContext locally for TextRun The previous commit added `bidi?: RunBidiContext` and `script?: RunScriptContext` to TextRun in contracts/src/index.ts but only re-exported the type names; they were not in local type scope, so `tsc --project` failed: src/index.ts(324,10): error TS2304: Cannot find name 'RunBidiContext'. src/index.ts(330,12): error TS2304: Cannot find name 'RunScriptContext'. Vitest's transform doesn't enforce type-only imports the way tsc does, so the unit tests passed even though the build was broken. CI build would have caught it; the missing piece was running `pnpm build` locally before pushing. Adds the two names to the existing local `import type` statement alongside ParagraphDirectionContext (same pattern, same line 19). Also adds two integration tests in pm-adapter/src/integration.test.ts that exercise the full PM -> FlowBlock conversion through the unmocked applyInlineRunProperties pipeline. The previous unit tests in common.test.ts mock computeRunAttrs, which is why they couldn't catch shape-level regressions in the contracts package. The integration tests prove a real PM doc with raw runProperties (rtl/cs/lang on a run-wrapper node) produces a TextRun with populated bidi/script, and that runs without signals don't gain empty objects. * docs: drop duplicate ticket prefix on SD-2781 test block comment Repo convention puts the ticket reference in the `describe` label only (e.g., `describe('SD-1333: ...')`). Trims the redundant `SD-2781:` prefix from the block comment, keeps the non-obvious "Wave 1a preserves these signals; nothing renders them yet" note. * fix(pm-adapter): read bidi/script from raw inline runProperties + reassign on token runs Two codex findings on PR #3203: 1. Cascade-leak: applyInlineRunProperties was receiving cascade-resolved runProperties from runNodeChildrenToRuns and populating bidi/script from them. Style-inherited runs ended up with bidi/script metadata they did not have inline, making preservation indistinguishable from direct formatting and bloating the layout tree on every styled run. Fix: thread the raw inline runProperties through InlineConverterParams (alongside the existing cascade-resolved runProperties). applyInlineRunProperties gains a fourth parameter that bidi/script populate from. When the caller doesn't opt in (no inline parameter), no metadata is attached - safer default than reading the cascaded view. 2. Token-run drop: generic-token.ts called applyInlineRunProperties without reassigning the return value. Since the helper builds a new object via spread, all merged fields (including the SD-2781 bidi/script) were lost for page-number / total-page-count token runs inside an rtl run wrapper. Fix: change the local to `let` and reassign. Also forwards inlineRunProperties through. no-break-hyphen.ts (another caller) updated for consistency. Tests added: - 3 unit tests in common.test.ts proving bidi/script populate from inlineRunProperties only, never from cascade-resolved runProperties - Existing 7 SD-2781 unit tests updated to pass inlineRunProperties (the new opt-in source) - 1 integration test proving bidi/script propagate to page-number tokens inside an rtl run wrapper All 1800 pm-adapter tests, 12644 super-editor tests, and 1201 layout-bridge tests pass. Contracts + pm-adapter both build clean. * fix(pm-adapter): cs absence != false + forward inlineRunProperties through nested converters Two round-3 findings on PR #3203: 1. RunScriptContext.complexScript was required (boolean), so a run with only w:lang and no w:cs ended up with { complexScript: false, language: {...} }. Per ECMA §17.3.2.7, absent w:cs inherits from the style hierarchy and ultimately falls back to Unicode-based script detection - it is NOT the same as explicit false. Misrepresenting absence as false would mislead Wave 1b CS-formatting selection. Fix: make complexScript optional. Only set it when raw cs is explicitly present (true OR false - both are meaningful toggle states per §17.17.4). Leave undefined otherwise so consumers can fall back correctly. 2. Three nested inline converters (page-reference, bookmark-start, structured-content) called visitNode without forwarding the new activeInlineRunProperties arg, so text inside a PAGEREF result, a bookmark span, or an SDT wrapper lost run-level bidi/script metadata even when the enclosing run wrapper had it. Fix: thread inlineRunProperties through all three converters. In page-reference's bookmark path, also pass the locally-scanned run's raw runProperties (not the cascade-resolved version) as inlineRunProperties for the synthesized token run. Tests: - cs absent + only lang -> script exists, complexScript is undefined - cs=false explicit -> script.complexScript === false (meaningful off) - existing 'three lang tags' test unchanged - new integration test: text inside structuredContent wrapper preserves bidi/script from the enclosing run wrapper All 1803 pm-adapter tests pass (+3 new), 12644 super-editor pass, contracts + pm-adapter both build clean. * docs: drop round-counter prefix from SD-2781 integration test header Per project CLAUDE.md, code comments shouldn't reference iteration / review state ("round-3 (codex finding)") - those belong in the PR description and rot once the PR merges. The load-bearing content (which converters, what invariant) stays. --------- Co-authored-by: Caio Pizzol <caiopizzol@gmail.com>
Stacked on
artem/rtl-v3(PR #3162). Artem's PRs merge first; this follows.Artem's RTL bug fixes — caret geometry, list nesting, run-level w:rtl export preservation, header/footer direction — are the base of this branch and ship unchanged. This PR replaces only the direction-inference cascade in
resolveEffectiveParagraphDirectionwith a typed resolver chain.What it does
Introduces
pm-adapter/src/direction/— a small module that computes a typedParagraphDirectionContextonce per paragraph during conversion. Downstream consumers read from that context instead of re-deriving direction from raw attributes.The typed context types (in
@superdoc/contracts) keep orthogonal axes separate so future RTL features cannot accidentally collapse them:SectionDirectionContext— page direction, writing mode, gutterTableDirectionContext— visual cell ordering onlyCellDirectionContext— cell writing modeParagraphDirectionContext— paragraph inline base direction + writing modeRunBidiContext/RunScriptContext— run-level signals (preserved here, consumed in 1b/1c)computeParagraphAttrscalls the resolver chain and writesdirectionContextontoParagraphAttrs. The legacyattrs.directionscalar is still populated so existing consumers don't break.What changes for users
Three behaviors were spec-incorrect in the cascade and are corrected here:
Section
w:bidino longer makes Latin paragraphs render RTL. Per ECMA-376 §17.6.1, section bidi affects section chrome only. Previously a Latin paragraph in an RTL section rendered right-aligned in SuperDoc but left-aligned in Word. Now both render left-aligned.Paragraph base direction without explicit
w:bidiis left undefined. Previously a majority-of-runs heuristic produced'ltr'or'rtl'. The browser's Unicode Bidi Algorithm (UAX minimal core infra for existing functionality #9 P2/P3) handles this natively whendiris omitted, and matches Word.The
docDefaultsDirectionparameter is removed as redundant. The style-engine cascade already resolvesdocDefaults.paragraphProperties.rightToLeftinto the paragraph's resolved properties before this resolver runs. docDefaults still inherits — through the existing cascade.Validation
footer-run-rtl-roundtrip)direction/non-collapse.test.tsenforce the four spec rules by constructionWhat it does NOT do
Explicit non-goals — separate PRs:
textDirection/directionconflation fixrtl-paragraph→direction)Spec rationale
Full ECMA citations and design notes live in
pm-adapter/src/direction/README.md.