Skip to content

Fix Arabic ligature bidi reordering in textkit#3407

Open
e-Naeim wants to merge 1 commit intodiegomura:masterfrom
e-Naeim:fix-arabic-bidi-ligature-reordering
Open

Fix Arabic ligature bidi reordering in textkit#3407
e-Naeim wants to merge 1 commit intodiegomura:masterfrom
e-Naeim:fix-arabic-bidi-ligature-reordering

Conversation

@e-Naeim
Copy link
Copy Markdown

@e-Naeim e-Naeim commented Apr 28, 2026

Fixes #3406.

What changed

This updates textkit bidi reordering so reordered runs are built from the source run that owns each visual character index.

Ligature continuations are now deduped by source glyph-array index within the owning run, instead of by glyph.id.

Why

Arabic fonts can reuse the same ligature glyph id for different logical clusters. Cairo, for example, can emit the same شر ligature glyph id in separate words such as شراكة and شركات.

The previous addedGlyphs.has(glyph.id) logic treated the second logical cluster as a duplicate and omitted it. With mixed bidi text, slicing reordered visual indices by the original logical run boundaries can also assign glyphs to the wrong visual run.

Examples

Reproduction text:

توجد شراكة استراتيجية مع شركات متعددة. تتكون هيكلية ملكية الشركة من ثلاثة مساهمين، حيث يمتلك المساهم الأول (saudi lend gate) نسبة 50% من الأسهم.

Before:

Before: Arabic ligature dropped by glyph-id dedupe

After:

After: repeated ligatures preserved

Validation

  • Added regression coverage for repeated ligature glyph ids from different source clusters.
  • Added regression coverage for Arabic text mixed with an LTR span.
  • Ran a standalone smoke test of the updated bidiReordering behavior with both regression inputs.
  • Generated before/after PDFs through a downstream @react-pdf/renderer stack using Cairo to verify the visible Arabic output.

I could not run the full upstream workspace locally on Windows because the repository contains an existing test fixture filename under packages/image/tests/assets that cannot be checked out on NTFS. The PR branch itself is created directly against master and only changes packages/textkit.

Copilot AI review requested due to automatic review settings April 28, 2026 22:06
@changeset-bot
Copy link
Copy Markdown

changeset-bot Bot commented Apr 28, 2026

⚠️ No Changeset found

Latest commit: cf52a38

Merging this PR will not cause a version bump for any packages. If these changes should not result in a new version, you're good to go. If these changes should result in a version bump, you need to add a changeset.

This PR includes no changesets

When changesets are added to this PR, you'll see the packages that this PR includes changesets for and the associated semver types

Click here to learn what changesets are, and how to add one.

Click here if you're a maintainer who wants to add a changeset to this PR

Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR fixes incorrect bidi reordering behavior in @react-pdf/textkit that could drop Arabic ligatures (when multiple logical clusters share the same glyph id) and could mis-assign glyph clusters to the wrong visual run in mixed bidi text.

Changes:

  • Rebuilds reordered runs based on the source run that owns each visual character index, rather than slicing visual indices by original logical run boundaries.
  • Dedupes ligature continuations by (owning run, source glyph-array index) instead of by glyph.id, avoiding false dedupes when glyph ids repeat across clusters.
  • Adds regression tests for repeated ligature glyph ids and for Arabic text mixed with an LTR span.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.

File Description
packages/textkit/src/layout/bidiReordering.ts Updates run reconstruction + ligature dedupe logic during bidi reordering to preserve repeated ligatures and keep clusters in the correct visual run.
packages/textkit/tests/layout/bidiReordering.test.ts Adds targeted regression coverage for repeated ligature glyph ids and mixed bidi run ownership.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Arabic ligatures can be dropped during textkit bidi reordering

2 participants