Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Detect adjoining text fragments with no line break opportunity between them #19688

Merged
merged 2 commits into from Jan 25, 2018

Conversation

@Manishearth
Copy link
Member

Manishearth commented Jan 4, 2018

First attempt at #874

This change is Reviewable

@highfive
Copy link

highfive commented Jan 4, 2018

Heads up! This PR modifies the following files:

  • @emilio: components/layout/fragment.rs, components/layout/inline.rs
@highfive
Copy link

highfive commented Jan 4, 2018

warning Warning warning

  • These commits modify layout code, but no tests are modified. Please consider adding a test!
@Manishearth
Copy link
Member Author

Manishearth commented Jan 4, 2018

I'm very unfamiliar with the code here, and I figured that instead of trying to get a million questions answered over a 12 hour time difference, it would make more sense to try a simple attempt at a solution and be told what's wrong with it 😄

feedback? @mbrubeck

@mbrubeck
Copy link
Contributor

mbrubeck commented Jan 4, 2018

Testing for whitespace alone will not suffice, since we also want to break on characters like hyphens, in cases like foo-<b>bar</b>. Instead we need to use the UAX 14 line breaking algorithm to determine whether a breaking opportunity exists.

Currently we run UAX 14 in TextRun::break_and_shape, when constructing each TextRun. The text run is split on line-break opportunities, and the segments are stored as a Vec<GlyphRun> inside the TextRun. This is done by the TextRunScanner during flow construction. Since it runs on one TextRun at a time, it can't determine whether a line break exists between TextRuns.

(A text run is used in shaping and must be a contiguous slice of text with the same shaping options, so in the general case we can't simply combine all the text of a "word" into a single TextRun. The overview has some additional high-level description.)

At a conceptual level, we need to run UAX 14 on the entire inline flow as a single string of text. Maybe we could do that by moving line-break scanning out of TextRun construction, to some other place that can process all the fragments together. An alternative would be to keep the current code for finding intra-run breaks, and do a separate check to determine whether to allow breaks between runs (similar to the approach in this PR). This would be less intrusive, but I don't know how feasible or efficient it could be.

xi-unicode has a LineBreakLeafIter type looks useful for finding line breaks without having a contiguous str for the entire flow. It still requires starting from the beginning of the flow, though; it doesn't offer any way to test for a break at an arbitrary point in the middle of a string/rope.

@mbrubeck
Copy link
Contributor

mbrubeck commented Jan 4, 2018

I guess we wouldn't need to scan the entire flow for line breaks all at once, as long as we can store and re-use the same LineBreakLeafIter for all the text runs in a flow.

@pcwalton pcwalton assigned mbrubeck and unassigned pcwalton Jan 5, 2018
@Manishearth
Copy link
Member Author

Manishearth commented Jan 5, 2018

So it seems like the problem is not actually stuff between textruns? For the testcase at hand the entire thing is a single textrun. I can make it into two text runs if I do <span style="font-size:10px"> (which still does not break in Firefox), and I can fix that case (for some reason LineBreakLeafIter doesn't seem to help distinguish between boundary breaks and I need to figure out why that's happening first)

@mbrubeck
Copy link
Contributor

mbrubeck commented Jan 5, 2018

Ah. So, in the original test case there is a single TextRun, but it is split into multiple Fragments, each pointing to a separate range of the TextRun. We use TextRun::natural_word_slices_in_range to find breaking opportunities within the range for each fragment, but we don't check whether the start of a new fragment is at a breaking opportunity or not.

For a fragment that starts in the middle of a TextRun we can check whether it starts within a GlyphRun, or between GlyphRuns. For the first fragment in a TextRun, we hit the case that I mentioned previously, and we'll need to use LineBreakLeafIter during TextRun construction to record whether a break opportunity exists between this TextRun and the previous one.

@Manishearth
Copy link
Member Author

Manishearth commented Jan 10, 2018

For a fragment that starts in the middle of a TextRun we can check whether it starts within a GlyphRun, or between GlyphRuns.

I'm not clear on where we'd do this. During fragment construction? Or in scan_for_lines? It feels like the text run info isn't accessible from the latter, but for the former I feel like we'd have unnecessary effects.

@mbrubeck
Copy link
Contributor

mbrubeck commented Jan 10, 2018

I think we'll want to add a "can break before" flag to ScannedTextFragmentInfo. We could determine this by doing a linear or binary search of the GlyphRuns in its TextRun, to find where the fragment boundary falls (or by looking at a similar flag stored on the TextRun, for the first fragment in a run), but it's probably more efficient to track this during text scanning and record it as each scanned text fragment is constructed. (There might still be cases where we need to search the GlyphRuns to initialize the flag when splitting a fragment...)

@Manishearth
Copy link
Member Author

Manishearth commented Jan 10, 2018

I think we'll want to add a "can break before" flag to ScannedTextFragmentInfo.

Right, that's the solution for the foo<strong>bar</strong> case (which I've almost got working), but not the foo<span>bar</span> case, right?

Maybe I've misunderstood your point.

@Manishearth
Copy link
Member Author

Manishearth commented Jan 10, 2018

Oh, wait, i see your point. Yeah, we can use the same flag for both.

@Manishearth
Copy link
Member Author

Manishearth commented Jan 10, 2018

Though, last I checked the creation of the fragment break in foo<span>bar</span> was not in flush_clump_to_list, where I'd expect it to be. I'll recheck, but are there other places where we split such fragments?

@Manishearth Manishearth force-pushed the Manishearth:linebreak branch from 90a757e to 3743be1 Jan 12, 2018
@Manishearth
Copy link
Member Author

Manishearth commented Jan 12, 2018

Updated code. r? @mbrubeck

Going to try some edge cases with mixed scripts, including Arabic ZWJ-ing (e.g. الـ iPhone, though I need to bidi-override it to work correctly)

@mbrubeck
Copy link
Contributor

mbrubeck commented Jan 12, 2018

@bors-servo
Copy link
Contributor

bors-servo commented Jan 12, 2018

Trying commit 3743be1 with merge 5dc0a96...

bors-servo added a commit that referenced this pull request Jan 12, 2018
Detect adjoining text fragments with no whitespace between them and avoid linebreaking there

First attempt at #874

<!-- Reviewable:start -->
This change is [<img src="https://reviewable.io/review_button.svg" height="34" align="absmiddle" alt="Reviewable"/>](https://reviewable.io/reviews/servo/servo/19688)
<!-- Reviewable:end -->
finished = true;
}
if idx == 0 {
break_at_zero = false;

This comment has been minimized.

Copy link
@mbrubeck

mbrubeck Jan 12, 2018

Contributor

Should this be true?

This comment has been minimized.

Copy link
@Manishearth

Manishearth Jan 13, 2018

Author Member

yep.

@bors-servo
Copy link
Contributor

bors-servo commented Jan 12, 2018

💔 Test failed - linux-rel-css

@Manishearth Manishearth force-pushed the Manishearth:linebreak branch from 3743be1 to 1026c8a Jan 13, 2018
@Manishearth
Copy link
Member Author

Manishearth commented Jan 13, 2018

@bors-servo retry try

@bors-servo
Copy link
Contributor

bors-servo commented Jan 13, 2018

Trying commit 1026c8a with merge ef5e71d...

bors-servo added a commit that referenced this pull request Jan 13, 2018
Detect adjoining text fragments with no whitespace between them and avoid linebreaking there

First attempt at #874

<!-- Reviewable:start -->
This change is [<img src="https://reviewable.io/review_button.svg" height="34" align="absmiddle" alt="Reviewable"/>](https://reviewable.io/reviews/servo/servo/19688)
<!-- Reviewable:end -->
@Manishearth
Copy link
Member Author

Manishearth commented Jan 13, 2018

Hm, even after fixing that it doesn't seem to want to line break anymore

bors-servo added a commit that referenced this pull request Jan 25, 2018
Detect adjoining text fragments with no line break opportunity between them

First attempt at #874

<!-- Reviewable:start -->
This change is [<img src="https://reviewable.io/review_button.svg" height="34" align="absmiddle" alt="Reviewable"/>](https://reviewable.io/reviews/servo/servo/19688)
<!-- Reviewable:end -->
@bors-servo
Copy link
Contributor

bors-servo commented Jan 25, 2018

💔 Test failed - linux-rel-css

@Manishearth
Copy link
Member Author

Manishearth commented Jan 25, 2018

Nothing changed between this and the try, so it's an intermittent. Had a quick look and doesn't seem related, but might investigate further if it keeps cropping up.

@bors-servo retry

@bors-servo
Copy link
Contributor

bors-servo commented Jan 25, 2018

Testing commit fc97f0b with merge e53663e...

bors-servo added a commit that referenced this pull request Jan 25, 2018
Detect adjoining text fragments with no line break opportunity between them

First attempt at #874

<!-- Reviewable:start -->
This change is [<img src="https://reviewable.io/review_button.svg" height="34" align="absmiddle" alt="Reviewable"/>](https://reviewable.io/reviews/servo/servo/19688)
<!-- Reviewable:end -->
@bors-servo
Copy link
Contributor

bors-servo commented Jan 25, 2018

💔 Test failed - arm64

@Manishearth
Copy link
Member Author

Manishearth commented Jan 25, 2018

@bors-servo retry

  • infra, "unable to read response"
@bors-servo
Copy link
Contributor

bors-servo commented Jan 25, 2018

@bors-servo
Copy link
Contributor

bors-servo commented Jan 25, 2018

💔 Test failed - arm64

@Manishearth
Copy link
Member Author

Manishearth commented Jan 25, 2018

arm64 is permafail on

error: failed to execute compile
caused by: error reading compile response from server
caused by: Failed to read response header
caused by: failed to fill whole buffer
error: Could not compile `script`.

for both this PR and another. Trying to ssh and fix

@Manishearth
Copy link
Member Author

Manishearth commented Jan 25, 2018

@bors-servo retry try-

(rehighstated the builder, cleaned up processes)

@bors-servo
Copy link
Contributor

bors-servo commented Jan 25, 2018

Testing commit fc97f0b with merge 9c13073...

bors-servo added a commit that referenced this pull request Jan 25, 2018
Detect adjoining text fragments with no line break opportunity between them

First attempt at #874

<!-- Reviewable:start -->
This change is [<img src="https://reviewable.io/review_button.svg" height="34" align="absmiddle" alt="Reviewable"/>](https://reviewable.io/reviews/servo/servo/19688)
<!-- Reviewable:end -->
@bors-servo
Copy link
Contributor

bors-servo commented Jan 25, 2018

@bors-servo bors-servo merged commit fc97f0b into servo:master Jan 25, 2018
2 of 3 checks passed
2 of 3 checks passed
continuous-integration/appveyor/pr AppVeyor build failed
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details
homu Test successful
Details
@Manishearth
Copy link
Member Author

Manishearth commented Jan 25, 2018

@Manishearth
Copy link
Member Author

Manishearth commented Jan 25, 2018

followups : #19862 , #19863

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked issues

Successfully merging this pull request may close these issues.

None yet

6 participants
You can’t perform that action at this time.