an odd HB_GLYPH_FLAG_UNSAFE_TO_BREAK case #3824
I’m not sure I follow what the C code is comparing, but if I try to shape the text with
Note that HarfBuzz advances are relative, not absolute, positions. To measure the full width of a sequence of glyphs, you need to add up the advances.
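As a rough illustration of adding up advances, here is a minimal sketch in C. The `glyph_pos` struct and `run_width` helper are hypothetical stand-ins so the snippet compiles without HarfBuzz; in real code the values would come from the `x_advance` field of the `hb_glyph_position_t` array returned by `hb_buffer_get_glyph_positions()`.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for hb_glyph_position_t; only the horizontal
   advance is modelled here. */
typedef struct {
    int32_t x_advance;
} glyph_pos;

/* Each advance is relative to the previous pen position, so the width of
   a run is the sum of its per-glyph advances. */
static int32_t run_width(const glyph_pos *pos, size_t count)
{
    int32_t width = 0;
    for (size_t i = 0; i < count; i++)
        width += pos[i].x_advance;
    return width;
}
```

With the two advances from the report (1235 and 460), `run_width` would return 1695 for the full sequence.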
Scratch that, I mis-decoded the UTF-8 string.
More relevant output (I skipped the morx output for brevity):
So there is a GPOS lookup that gets applied only when the string does not start with a space, which is why the substring has different glyph advances.
That should have resulted in an unsafe-to-break somewhere... I'll debug. Thanks.
It looks like a simple case of a PairPos table with non-zero valueFormat2, which causes kerning to be applied only to every other glyph. If you add a second space char at the beginning, then the kern is correctly applied again...
From the spec: "valueFormat2 applies to the ValueRecords for the second glyph in each pair. The single ValueFormat field applies to ValueRecords for all second glyphs. If valueFormat2 is set to 0, then the ValueRecords for the second glyph of the pair will be empty, the second glyph is not repositioned, and it becomes the “next” glyph for which a lookup is performed."
@jfkthame do you have any input here?
I suppose when valueFormat2 is non-zero, we should record it as "unsafe" to break between the pair even if the values applied were zero?
The problem arises between the second glyph and the glyph after it, which are currently not tried for kerning... I'm inclined to go against the spec and always use the second glyph as the next glyph. The current behavior doesn't make sense. If a font kerns 'oo' and does that half-and-half using a non-zero valueFormat2, then a sequence of 'oooooooooooooo' will get only every other pair kerned.
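To make the "every other pair" effect concrete, here is a small model in C. It is only a sketch, not HarfBuzz's actual PairPos code: `kerned_pairs` is a hypothetical helper that assumes a run of identical glyphs in which every adjacent pair has a kerning entry, and it only counts which pairs would be visited.

```c
#include <stdbool.h>
#include <stddef.h>

/* Counts how many adjacent pairs get kerned in a run of `count` identical
   glyphs, assuming the font has a kerning entry for that pair.
   skip_second == true  models the spec rule for non-zero valueFormat2:
                        after a match, matching resumes after the pair.
   skip_second == false models the proposal: the second glyph always
                        becomes the first glyph of the next pair. */
static size_t kerned_pairs(size_t count, bool skip_second)
{
    size_t kerned = 0;
    size_t i = 0;
    while (i + 1 < count) {
        kerned++;                 /* pair (i, i+1) matches and is kerned */
        i += skip_second ? 2 : 1; /* where the next candidate pair starts */
    }
    return kerned;
}
```

For a run of 14 glyphs, the spec rule kerns 7 of the 13 adjacent pairs in this model, while the proposed rule kerns all 13.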
Proposed WIP patch in #3825
Yeah, I'm pretty sure I remember seeing that sort of effect in a font. Whether it's OK to deviate here... I'm not sure. I guess it'll be interesting to see if it affects any tests, for a start.
It only broke two AOTS tests that specifically test this behavior.
Since a non-zero valueFormat2 seems to be rare, I suppose we can mark unsafe-to-break between the second glyph and the glyph after that...
With the patch:
In LibreOffice we use HB_GLYPH_FLAG_UNSAFE_TO_BREAK to cache layout results, but we have found an odd edge case in one test document (of 250,000), which I've attempted to boil down to the test case here. When verifying that we could reuse the old layout, I see different results on shaping one side of the original string, even though no glyph in it had HB_GLYPH_FLAG_UNSAFE_TO_BREAK set.
The output I get is:
full sequence: advance is 1235
full sequence: advance is 460
is unsafe to break set on any glyph? no, (expect no)
shape again without space at edge, expect same results as previous iteration
sub sequence: advance is 1135
sub sequence: advance is 460
whereas what I expect is:
full sequence: advance is 1235
full sequence: advance is 460
is unsafe to break set on any glyph? no, (expect no)
shape again without space at edge, expect same results as previous iteration
sub sequence: advance is 1235
sub sequence: advance is 460
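For context, the kind of check a layout cache relies on can be sketched roughly as follows. This is not LibreOffice's code: `can_cut_before` is a hypothetical helper, the flag constant is redefined locally so the snippet compiles without HarfBuzz, and the sketch assumes one glyph per cluster. In real code the flags would come from `hb_glyph_info_get_glyph_flags()`.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Local copy of the HB_GLYPH_FLAG_UNSAFE_TO_BREAK bit (0x1), redefined
   here only so the sketch is self-contained. */
#define UNSAFE_TO_BREAK 0x00000001u

/* Hypothetical helper: returns true if a cached run may be split before
   glyph `i` and both halves reused without reshaping. Assumes one glyph
   per cluster. */
static bool can_cut_before(const uint32_t *flags, size_t count, size_t i)
{
    if (i == 0 || i >= count)
        return true; /* the edges of the run are always valid cut points */
    return (flags[i] & UNSAFE_TO_BREAK) == 0;
}
```

The report above is the case where such a check says the cut is fine (no flag set on any glyph), yet reshaping the substring yields a different advance.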
harfbuzz-demo.zip