an odd HB_GLYPH_FLAG_UNSAFE_TO_BREAK case #3824

caolanm · 2022-09-21T14:15:06Z

In LibreOffice we're making use of HB_GLYPH_FLAG_UNSAFE_TO_BREAK to cache layout results but have found an odd edge case in one test document (of 250,000) which I've attempted to boil down to the test case here where on verifying that we could reuse the old layout I see different results on shaping one side of the original string which had no HB_GLYPH_FLAG_UNSAFE_TO_BREAK flags.

The output I get is:

full sequence: advance is 1235
full sequence: advance is 460
is unsafe to break set on any glyph? no, (expect no)
shape again without space at edge, expect same results as previous iteration
sub sequence: advance is 1135
sub sequence: advance is 460

which I expect is:

full sequence: advance is 1235
full sequence: advance is 460
is unsafe to break set on any glyph? no, (expect no)
shape again without space at edge, expect same results as previous iteration
sub sequence: advance is 1235
sub sequence: advance is 460

harfbuzz-demo.zip

khaledhosny · 2022-09-21T16:06:47Z

I’m not sure I follow what the C code is comparing, but if I try to shape the text with hb-shape, I get the same glyph advances:

$ hb-shape XBRoya34.ttf -u "20,D8,A7,D9,84,D8,A3" --direction=r --script=arab
[sterling=6+1134|Oslash=5+1588|.notdef=4+1239|Ugrave=3+1474|section=2+1134|Oslash=1+1588|space=0+600]
$ hb-shape XBRoya34.ttf -u "D8,A7,D9,84,D8,A3" --direction=r --script=arab
[sterling=5+1134|Oslash=4+1588|.notdef=3+1239|Ugrave=2+1474|section=1+1134|Oslash=0+1588]
$ hb-shape XBRoya34.ttf -u "20" --direction=r --script=arab
[space=0+600]

Note that HarfBuzz advances are relative not absolute positions, to measure the full with of a sequence of glyphs, you need to add the advances.

khaledhosny · 2022-09-21T16:17:49Z

Scratch that, I mis-decoded the UTF-8 string.

khaledhosny · 2022-09-21T16:28:23Z

More relevant output (I skipped the morx output for brevity):

$ hb-shape XBRoya34.ttf " الأ" --direction=r --script=arab --language=ar --show-flags -V
trace: start table GPOS	buffer: [space=0+600|afii57415=1+460|uniFEF7=2+1235]
trace: start lookup 0	buffer: [space=0+600|afii57415=1+460|uniFEF7=2+1235]
trace: kerning glyphs at 0,1	buffer: [space=0+600|afii57415=1+460|uniFEF7=2+1235]
trace: kerned glyphs at 0,1	buffer: [space=0+600|afii57415=1+460|uniFEF7=2+1235]
trace: end lookup 0	buffer: [space=0+600|afii57415=1+460|uniFEF7=2+1235]
trace: end table GPOS	buffer: [space=0+600|afii57415=1+460|uniFEF7=2+1235]
[uniFEF7=2+1235|afii57415=1+460|space=0+600]

$ hb-shape XBRoya34.ttf "الأ" --direction=r --script=arab --language=ar --show-flags  -V
trace: start table GPOS	buffer: [afii57415=0+460|uniFEF7=1+1235]
trace: start lookup 0	buffer: [afii57415=0+460|uniFEF7=1+1235]
trace: kerning glyphs at 0,1	buffer: [afii57415=0+460|uniFEF7=1+1235]
trace: kerned glyphs at 0,1	buffer: [afii57415=0+460|uniFEF7=1+1135]
trace: end lookup 0	buffer: [afii57415=0+460|uniFEF7=1+1135#3]
trace: end table GPOS	buffer: [afii57415=0+460|uniFEF7=1+1135#3]
[uniFEF7=1+1135#1|afii57415=0+460]

So there is a GPOS lookup that gets applied only when the string does not start with space, so that is why the substring has different glyph advances.

behdad · 2022-09-21T17:17:02Z

there is a GPOS lookup that gets applied only when the string does not start with space

That should have resulted in a unsafe-to-break somewhere... I'll debug. Thanks.

behdad · 2022-09-21T17:23:21Z

there is a GPOS lookup that gets applied only when the string does not start with space

It looks like a simple case of PairPos table with non-zero ValueFormat2, which causes kerning to only be applied to every other glyph. If you add a second space char at the beginning then the kern is correctly applied again...

behdad · 2022-09-21T17:24:24Z

From the spec:

"valueFormat2 applies to the ValueRecords for the second glyph in each pair. The single ValueFormat field applies to ValueRecords for all second glyphs. If valueFormat2 is set to 0, then the ValueRecords for the second glyph of the pair will be empty, the second glyph is not repositioned, and it becomes the “next” glyph for which a lookup is performed."

behdad · 2022-09-21T17:28:03Z

@jfkthame do you have any input here?

jfkthame · 2022-09-21T18:07:07Z

I suppose when valueFormat2 is non-zero, we should record it as "unsafe" to break between the pair even if the values applied were zero?

behdad · 2022-09-21T18:16:36Z

I suppose when valueFormat2 is non-zero, we should record it as "unsafe" to break between the pair even if the values applied were zero?

The problem arises between the second glyph and the glyph after it, which are not tried for kerning now...

I'm inclined to go against the spec and always use second glyph as next glyph. The current behavior doesn't make sense. If a font kerns 'oo' and does that half-and-half using valueFormat2 non-zero, then a sequence of 'oooooooooooooo' will get every other pair kerned only.

This goes against the spec. #3824

behdad · 2022-09-21T18:23:42Z

Proposed WIP patch in #3825

jfkthame · 2022-09-21T18:24:04Z

Yeah, I'm pretty sure I remember seeing that sort of effect in a font. Whether it's OK to deviate here.... I'm not sure. I guess it'll be interesting to see if it affects any tests, for a start.

behdad · 2022-09-21T18:25:52Z

I guess it'll be interesting to see if it affects any tests, for a start.

It only broke two AOTS tests that specifically test this behavior.

behdad · 2022-11-11T19:38:11Z

I suppose when valueFormat2 is non-zero, we should record it as "unsafe" to break between the pair even if the values applied were zero?

The problem arises between the second glyph and the glyph after it, which are not tried for kerning now...

Since valueFormat2 of nonzero seems to be rare, I suppose we can mark unsafe-to-break between the second glyph and the glyph after that...

Fixes #3824

behdad · 2022-11-11T19:48:18Z

Since valueFormat2 of nonzero seems to be rare, I suppose we can mark unsafe-to-break between the second glyph and the glyph after that...

#3874

behdad · 2022-11-11T19:51:06Z

$ hb-shape XBRoya34.ttf " الأ" --direction=r --script=arab --language=ar --show-flags -V
[uniFEF7=2+1235|afii57415=1+460|space=0+600]

With the patch:

$ hb-shape XBRoya34.ttf " الأ" --direction=r --script=arab --language=ar --show-flags
[uniFEF7=2+1235#1|afii57415=1+460|space=0+600]

Fixes #3824

behdad added a commit that referenced this issue Sep 21, 2022

[gpos] in PairPos, always advance next-glyph by one only

7a76222

This goes against the spec. #3824

behdad mentioned this issue Sep 21, 2022

[gpos] in PairPos, always advance next-glyph by one only #3825

Closed

behdad mentioned this issue Nov 11, 2022

[GPOS] Change how PairPos is applied when ValueFormat2 is non-zero MicrosoftDocs/typography-issues#989

Closed

behdad added a commit that referenced this issue Nov 11, 2022

[GPOS.PairPos] Adjust unsafe-to-break for non-zero ValueFormat2

26efece

Fixes #3824

behdad mentioned this issue Nov 11, 2022

[GPOS.PairPos] Adjust unsafe-to-break for non-zero ValueFormat2 #3874

Merged

behdad closed this as completed in #3874 Nov 14, 2022

behdad added a commit that referenced this issue Nov 14, 2022

[GPOS.PairPos] Adjust unsafe-to-break for non-zero ValueFormat2

ea63e95

Fixes #3824

caolanm mentioned this issue Nov 26, 2022

a HB_GLYPH_FLAG_UNSAFE_TO_BREAK and --features="-kern" case #3898

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

an odd HB_GLYPH_FLAG_UNSAFE_TO_BREAK case #3824

an odd HB_GLYPH_FLAG_UNSAFE_TO_BREAK case #3824

caolanm commented Sep 21, 2022

khaledhosny commented Sep 21, 2022 •

edited

khaledhosny commented Sep 21, 2022

khaledhosny commented Sep 21, 2022 •

edited

behdad commented Sep 21, 2022

behdad commented Sep 21, 2022

behdad commented Sep 21, 2022

behdad commented Sep 21, 2022

jfkthame commented Sep 21, 2022

behdad commented Sep 21, 2022

behdad commented Sep 21, 2022

jfkthame commented Sep 21, 2022

behdad commented Sep 21, 2022

behdad commented Nov 11, 2022

behdad commented Nov 11, 2022

behdad commented Nov 11, 2022

an odd HB_GLYPH_FLAG_UNSAFE_TO_BREAK case #3824

an odd HB_GLYPH_FLAG_UNSAFE_TO_BREAK case #3824

Comments

caolanm commented Sep 21, 2022

khaledhosny commented Sep 21, 2022 • edited

khaledhosny commented Sep 21, 2022

khaledhosny commented Sep 21, 2022 • edited

behdad commented Sep 21, 2022

behdad commented Sep 21, 2022

behdad commented Sep 21, 2022

behdad commented Sep 21, 2022

jfkthame commented Sep 21, 2022

behdad commented Sep 21, 2022

behdad commented Sep 21, 2022

jfkthame commented Sep 21, 2022

behdad commented Sep 21, 2022

behdad commented Nov 11, 2022

behdad commented Nov 11, 2022

behdad commented Nov 11, 2022

khaledhosny commented Sep 21, 2022 •

edited

khaledhosny commented Sep 21, 2022 •

edited