Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Apply visual run resets to line range. #55

Merged
merged 1 commit into from
Apr 6, 2021
Merged

Conversation

laurmaedje
Copy link
Contributor

The reset logic in visual runs doesn't take the line offset into account. The for loop iterates over the line's char indices, but the line-local index i is then used to index into the original_classes for the whole paragraph.

Consider the string "aa טֶ", which consists of five chars with the following subranges:

  • 0..1: a
  • 1..2: a
  • 2..3: space
  • 3..5: ט
  • 5..7: \u{5b6}

What happens in detail: When resolving the line 3..7, the code previously iterated over both chars, and for the last char i=2 resolved to the original class of the space, setting reset_from to Some(2). Then, all bytes from 2 to line_str.len() (which was 4), were reset to paragraph level. As a result, the first half of ט was reset to 0.

The result of all this is that the reordered visual runs now slice the ט in half because its level changes halfway through. In addition to a fix for the problem, I added the above example as a test case.

@mbrubeck
Copy link
Contributor

mbrubeck commented Apr 6, 2021

Thanks!

It looks like this increases the minimum Rust requirement to 1.36 because it uses non-lexical lifetimes. I think that's fine, because our main downstream crate (url) already requires 1.36 or later. I'll merge this now, and open a separate PR to fix up the CI configuration.

@mbrubeck mbrubeck merged commit 6f815ba into servo:master Apr 6, 2021
@mbrubeck
Copy link
Contributor

mbrubeck commented Apr 6, 2021

Published unicode-bidi 0.3.5 with this fix.

mbrubeck added a commit to mbrubeck/rust-unic that referenced this pull request Apr 22, 2021
This fixes a bug where indices within a line are incorrectly used to
index within the paragraph.

For more details, see:
servo/unicode-bidi#55
mbrubeck added a commit to mbrubeck/rust-unic that referenced this pull request Apr 22, 2021
This fixes a bug where indices within a line are incorrectly used to
index within the paragraph.

For more details, see:
servo/unicode-bidi#55
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants