Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use byte indices instead of char indices for text runs #10895

Merged
merged 4 commits into from Apr 29, 2016

Conversation

@mbrubeck
Copy link
Contributor

mbrubeck commented Apr 28, 2016

Replace character indices with UTF-8 byte offsets throughout all code dealing with text runs. This eliminates a lot of complexity when converting from one to the other, and interoperates better with the rest of the Rust ecosystem.

For most code this is just a simple replacement of char indices with byte indices. In a few places like glyph storage and text fragment scanning, it also lets us get rid of code that existed only to map between bytes and chars.

Also includes some related fixes to text shaping, discovered while working on this conversion. See the commit messages for details.

r? @pcwalton


This change is Reviewable

mbrubeck added 2 commits Apr 28, 2016
Shaper::save_glyph_results incorrectly starts its loop by setting glyph_span
to a length of 1.  This means that the `if glyph_span.len() == 0` test in the
inner loop will never succeed.

Instead the glyph span should start out empty, and a glyph should be added only
as the corresponding char is found.  For comparison, see the Gecko code this
was ported from:

https://hg.mozilla.org/mozilla-central/file/ab0044bf/gfx/thebes/gfxHarfBuzzShaper.cpp#l1682
This is a no-op, since a "non-glyph" is simply `GlyphEntry(0)`. This is the
same as `GlyphEntry::initial()`, which all the entries are already initialized
to.
@highfive
Copy link

highfive commented Apr 28, 2016

Heads up! This PR modifies the following files:

  • @KiChjang: components/script/dom/htmltextareaelement.rs, components/script/dom/htmlinputelement.rs
@highfive
Copy link

highfive commented Apr 28, 2016

warning Warning warning

  • These commits modify unsafe code. Please review it carefully!
@mbrubeck
Copy link
Contributor Author

mbrubeck commented Apr 28, 2016

bors-servo added a commit that referenced this pull request Apr 28, 2016
Use byte indices instead of char indices for text runs

Replace character indices with UTF-8 byte offsets throughout all code dealing with text runs.  This eliminates a lot of complexity when converting from one to the other, and interoperates better with the rest of the Rust ecosystem.

For most code this is just a simple replacement of char indices with byte indices.  In a few places like glyph storage and text fragment scanning, it also lets us get rid of code that existed only to map between bytes and chars.

Also includes some related fixes to text shaping, discovered while working on this conversion.  See the commit messages for details.

r? @pcwalton

<!-- Reviewable:start -->
---
This change is [<img src="https://reviewable.io/review_button.svg" height="35" align="absmiddle" alt="Reviewable"/>](https://reviewable.io/reviews/servo/servo/10895)
<!-- Reviewable:end -->
@bors-servo
Copy link
Contributor

bors-servo commented Apr 28, 2016

Trying commit 5e49a92 with merge 58e7865...

@bors-servo
Copy link
Contributor

bors-servo commented Apr 28, 2016

💔 Test failed - linux-rel

@highfive
Copy link

highfive commented Apr 28, 2016

  ▶ FAIL [expected PASS] /_mozilla/css/submit_focus_a.html
  └   → /_mozilla/css/submit_focus_a.html 3d3190e5b4a800f4418abbb2c0f9a440b124a4bc
/_mozilla/css/submit_focus_b.html 69459c6973cf8bf615a79d90b7e480d60981b795
Testing 3d3190e5b4a800f4418abbb2c0f9a440b124a4bc == 69459c6973cf8bf615a79d90b7e480d60981b795
@mbrubeck mbrubeck force-pushed the mbrubeck:byteindex branch from 5e49a92 to d58422a Apr 28, 2016
@mbrubeck
Copy link
Contributor Author

mbrubeck commented Apr 28, 2016

▶ FAIL [expected PASS] /_mozilla/css/submit_focus_a.html

Fixed. (I accidentally removed an input_type check in HtmlInputElement::selection_for_layout.)

@mbrubeck mbrubeck force-pushed the mbrubeck:byteindex branch from d58422a to 5e8585c Apr 28, 2016
Replace character indices with UTF-8 byte offsets throughout the code dealing
with text shaping and breaking.  This eliminates a lot of complexity when
converting from one to the other, and interoperates better with the rest of
the Rust ecosystem.
@mbrubeck mbrubeck force-pushed the mbrubeck:byteindex branch from 5e8585c to 659305f Apr 28, 2016
@mbrubeck mbrubeck force-pushed the mbrubeck:byteindex branch from 72c6995 to c4872d9 Apr 28, 2016
@pcwalton
Copy link
Contributor

pcwalton commented Apr 28, 2016

@bors-servo: r+

This is awesome!

@bors-servo
Copy link
Contributor

bors-servo commented Apr 28, 2016

📌 Commit c4872d9 has been approved by pcwalton

1 similar comment
@bors-servo
Copy link
Contributor

bors-servo commented Apr 28, 2016

📌 Commit c4872d9 has been approved by pcwalton

@bors-servo
Copy link
Contributor

bors-servo commented Apr 29, 2016

Testing commit c4872d9 with merge 4e15586...

bors-servo added a commit that referenced this pull request Apr 29, 2016
Use byte indices instead of char indices for text runs

Replace character indices with UTF-8 byte offsets throughout all code dealing with text runs.  This eliminates a lot of complexity when converting from one to the other, and interoperates better with the rest of the Rust ecosystem.

For most code this is just a simple replacement of char indices with byte indices.  In a few places like glyph storage and text fragment scanning, it also lets us get rid of code that existed only to map between bytes and chars.

Also includes some related fixes to text shaping, discovered while working on this conversion.  See the commit messages for details.

r? @pcwalton

<!-- Reviewable:start -->
---
This change is [<img src="https://reviewable.io/review_button.svg" height="35" align="absmiddle" alt="Reviewable"/>](https://reviewable.io/reviews/servo/servo/10895)
<!-- Reviewable:end -->
@bors-servo
Copy link
Contributor

bors-servo commented Apr 29, 2016

💔 Test failed - windows

@mbrubeck
Copy link
Contributor Author

mbrubeck commented Apr 29, 2016

@bors-servo retry

Infrastructure

@bors-servo
Copy link
Contributor

bors-servo commented Apr 29, 2016

Previous build results for android, arm32, arm64, linux-dev, mac-dev-unit, mac-rel-css, mac-rel-wpt are reusable. Rebuilding only linux-rel, windows...

@bors-servo
Copy link
Contributor

bors-servo commented Apr 29, 2016

💔 Test failed - windows

@mbrubeck
Copy link
Contributor Author

mbrubeck commented Apr 29, 2016

@bors-servo retry

@larsbergstrom
Copy link
Contributor

larsbergstrom commented Apr 29, 2016

cc me

@bors-servo
Copy link
Contributor

bors-servo commented Apr 29, 2016

Testing commit c4872d9 with merge cf121ad...

bors-servo added a commit that referenced this pull request Apr 29, 2016
Use byte indices instead of char indices for text runs

Replace character indices with UTF-8 byte offsets throughout all code dealing with text runs.  This eliminates a lot of complexity when converting from one to the other, and interoperates better with the rest of the Rust ecosystem.

For most code this is just a simple replacement of char indices with byte indices.  In a few places like glyph storage and text fragment scanning, it also lets us get rid of code that existed only to map between bytes and chars.

Also includes some related fixes to text shaping, discovered while working on this conversion.  See the commit messages for details.

r? @pcwalton

<!-- Reviewable:start -->
---
This change is [<img src="https://reviewable.io/review_button.svg" height="35" align="absmiddle" alt="Reviewable"/>](https://reviewable.io/reviews/servo/servo/10895)
<!-- Reviewable:end -->
@bors-servo
Copy link
Contributor

bors-servo commented Apr 29, 2016

@bors-servo bors-servo merged commit c4872d9 into servo:master Apr 29, 2016
3 checks passed
3 checks passed
continuous-integration/appveyor/pr AppVeyor build succeeded
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details
homu Test successful
Details
@mbrubeck mbrubeck deleted the mbrubeck:byteindex branch May 11, 2016
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked issues

Successfully merging this pull request may close these issues.

None yet

5 participants
You can’t perform that action at this time.