Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

OT shaping fails to form mixed skin tone emoji sequence, CoreText succeeds #2967

Closed
drott opened this issue May 4, 2021 · 13 comments · Fixed by #2968
Closed

OT shaping fails to form mixed skin tone emoji sequence, CoreText succeeds #2967

drott opened this issue May 4, 2021 · 13 comments · Fixed by #2968
Labels
Chrome Chrome/Chromium project related issues and requests

Comments

@drott
Copy link
Collaborator

drott commented May 4, 2021

For the WOMAN MAN HOLDING HANDS sequence,

With ToT build:

build/util/hb-shape /System/Library/Fonts/Apple\ Color\ Emoji.ttc -u 1F469,1F3FD,200D,1F91D,200D,1F468,1F3FE

I get

[u1F469.3.L=0+800|space=0+0|space=0+0|u1F468.4.RA=0+800]

Expected, with CoreText:

build/util/hb-shape --shaper=coretext /System/Library/Fonts/Apple\ Color\ Emoji.ttc -u 1F469,1F3FD,200D,1F91D,200D,1F468,1F3FE

[u1F469.3.L=0+0|u1F468.4.RA=5+1066]

CC @jfkthame

Edit: For reference, original Chrome issue https://bugs.chromium.org/p/chromium/issues/detail?id=1205016

@drott drott added the Chrome Chrome/Chromium project related issues and requests label May 4, 2021
@jfkthame
Copy link
Collaborator

jfkthame commented May 4, 2021

Huh, seems odd. I was going to suggest that the zero-width space glyphs harfbuzz leaves in the buffer must be blocking a GPOS rule that's supposed to adjust the spacing -- but there is no such GPOS in the font, the table is empty (nor is there any kerx that could be relevant).

The two "real" glyphs we get are the expected ones, but something about spacing/positioning is totally off. The hmtx metrics match what we're getting from harfbuzz shaping, so I wonder where Core Text is getting its results from?

Ah, one interesting thing: the u1F469.3.L glyph (and those like it) has GDEF class 3 (mark glyph):

  <GDEF>
    <Version value="0x00010000"/>
    <GlyphClassDef Format="2">
      <ClassDef glyph="silhouette.ML" class="3"/>
      <ClassDef glyph="silhouette.WL" class="3"/>
      <ClassDef glyph="u1F468.1.L" class="3"/>
      <ClassDef glyph="u1F468.2.L" class="3"/>
      <ClassDef glyph="u1F468.3.L" class="3"/>
      <ClassDef glyph="u1F468.4.L" class="3"/>
      <ClassDef glyph="u1F468.5.L" class="3"/>
      <ClassDef glyph="u1F469.1.L" class="3"/>
      <ClassDef glyph="u1F469.2.L" class="3"/>
      <ClassDef glyph="u1F469.3.L" class="3"/>
      <ClassDef glyph="u1F469.4.L" class="3"/>
      <ClassDef glyph="u1F469.5.L" class="3"/>
    </GlyphClassDef>
  </GDEF>

That seems pretty illogical to me; if anything, the first (left-hand) glyph would be a base, and the second member of the couple could be an attached mark. But maybe this prompts Core Text to treat it specially (zeroing its width?), though I still don't see where the 1066 would come from.

Ohhhh.... the advance values reported when using --shaper coretext are dependent on the font size, rather than being reported in font units. So running

./util/hb-shape --font-size 600 --shaper coretext /System/Library/Fonts/Apple\ Color\ Emoji.ttc -u 1F469,1F3FD,200D,1F91D,200D,1F468,1F3FE

gives us

[u1F469.3.L=0+0|u1F468.4.RA=5+800]

(Why is 600 the appropriate size to use? I don't know...)

This at least looks promisingly similar to the harfbuzz result. Apparently all we'd need to do is to zero the advance of the glyph that has class 3 (mark), but we're not doing so.

In _hb_ot_complex_shaper_default, we have HB_OT_SHAPE_ZERO_WIDTH_MARKS_BY_GDEF_LATE, so it seems like this ought to work. But apparently the u1F469.3.L glyph has lost its mark-ness by the time we do that.

No; actually, this buffer location never had mark-ness set, because the glyph originally mapped from U+1F469 was not one of these skin-toned people, it was the basic WOMAN emoji, which is not a "mark" by any stretch of the imagination. So at the point where we did _hb_ot_layout_set_glyph_props, it didn't get marked as a mark. And later, when a morx substitution replaced it with u1F469.3.L, we don't re-check the GDEF table and update the class.

I guess we could update the glyph_info flags when a substitution happens, or alternatively we could look up classes in the GDEF table when doing the "late mark zeroing" (whereas currently this relies on the glyph_info that was set earlier).

A further problem is that if we do get zero_mark_widths_by_gdef to recognize the u1F469.3.L as a mark, we'll end up shifting it leftwards when we zero its width, which is not desired in this case. I guess a possible heuristic would be to disable the offset-adjustment when the glyph is first in its cluster.

Putting all this together, I have a patch that seems to give the desired result here, though it feels a bit hacky. @behdad might want to take a look and see where he feels is the best place to make the various decisions/adjustments.

jfkthame added a commit that referenced this issue May 4, 2021
When zeroing by GDEF late, we need to re-check the GDEF table
(if it is the source of glyph classes) because substitutions may
mean that we now have mark glyphs in slots that were not flagged
as such initially.

Fixes #2967.
@jfkthame
Copy link
Collaborator

jfkthame commented May 4, 2021

@drott if you could test this patch and confirm if it fixes the issue for you, that'd be great - thanks.

@jfkthame
Copy link
Collaborator

jfkthame commented May 4, 2021

Also, if something like this does look like a useful way forward, the question arises whether we should do it in other shapers where GDEF_LATE zeroing is used. Though substitutions that replace non-marks with marks (or vice versa) probably aren't exactly common....

@jfkthame
Copy link
Collaborator

jfkthame commented May 4, 2021

Opened a PR for review .... and let's see how badly things break.

@behdad
Copy link
Member

behdad commented May 4, 2021

Thanks for the analysis Jonathan! I haven't fully studied what you found and the PR. But just wanted to say:

I guess we could update the glyph_info flags when a substitution happens, or alternatively we could look up classes in the GDEF table when doing the "late mark zeroing" (whereas currently this relies on the glyph_info that was set earlier).

In gsubgpos.hh we do that already; that is, whenever adding new glyph to buffer, if GDEF classes are present, we consult those for the new gid. Only if there are no GDEF classes, we carry information from before the substitution. It seems like I forgot to implement similar logic in morx impl.

@behdad
Copy link
Member

behdad commented May 4, 2021

(Why is 600 the appropriate size to use? I don't know...)

Humm? This sounds like a serious bug. hb-view bug or library?

@behdad
Copy link
Member

behdad commented May 4, 2021

These three places:

buffer->info[mark].codepoint = *replacement;

buffer->info[idx].codepoint = *replacement;

info[i].codepoint = *replacement;

What GSUB does instead:

void replace_glyph_inplace (hb_codepoint_t glyph_index) const
{
_set_glyph_props (glyph_index);
buffer->cur().codepoint = glyph_index;
}

@behdad
Copy link
Member

behdad commented May 4, 2021

(Why is 600 the appropriate size to use? I don't know...)

Humm? This sounds like a serious bug. hb-view bug or library?

Ah, try with --font-funcs=ot; that should fix it? Sounds like a know hb-view bug with FreeType's bitmap fonts.

@drott
Copy link
Collaborator Author

drott commented May 5, 2021

Thank you for the analysis and fix proposal, Jonathan.

@drott if you could test this patch and confirm if it fixes the issue for you, that'd be great - thanks.

Unfortunately with a clean build with mark-zeroing-for-apple-emoji applied I still get

$ build/util/hb-shape --font-funcs=ot /System/Library/Fonts/Apple\ Color\ Emoji.ttc -u 1F469,1F3FD,200D,1F91D,200D,1F468,1F3FE
[u1F469.3.L=0+0|space=0+0|space=0+0|u1F468.4.RA=0+800]

@jfkthame
Copy link
Collaborator

jfkthame commented May 5, 2021

I believe that's a "correct" result, which will render identically to the Core Text version -- the zero-width space glyphs are unimportant. The key point is that we have the two glyphs u1F469.3.L and u1F468.4.RA rendered at the same (zero) position, and then advance by the width of just one glyph.

(I also checked the actual rendering in a Firefox build with this patch, and confirmed it looks correct.)

@jfkthame
Copy link
Collaborator

jfkthame commented May 5, 2021

(Why is 600 the appropriate size to use? I don't know...)

Humm? This sounds like a serious bug. hb-view bug or library?

Ah, try with --font-funcs=ot; that should fix it? Sounds like a know hb-view bug with FreeType's bitmap fonts.

That doesn't seem to make any difference when using --shaper=coretext.

@drott
Copy link
Collaborator Author

drott commented May 5, 2021

I believe that's a "correct" result, which will render identically to the Core Text version -- the zero-width space glyphs are unimportant. The key point is that we have the two glyphs u1F469.3.L and u1F468.4.RA rendered at the same (zero) position, and then advance by the width of just one glyph.

Right, sorry, I missed the difference in that u1F469.3.L=0+800 had now changed to u1F469.3.L=0+0. LGTM then, except Behdad's comment on the PR suggests a different approach, IIUC?

@jfkthame
Copy link
Collaborator

jfkthame commented May 5, 2021

Thanks for the analysis Jonathan! I haven't fully studied what you found and the PR. But just wanted to say:

I guess we could update the glyph_info flags when a substitution happens, or alternatively we could look up classes in the GDEF table when doing the "late mark zeroing" (whereas currently this relies on the glyph_info that was set earlier).

In gsubgpos.hh we do that already; that is, whenever adding new glyph to buffer, if GDEF classes are present, we consult those for the new gid. Only if there are no GDEF classes, we carry information from before the substitution. It seems like I forgot to implement similar logic in morx impl.

OK, sounds like the cleanest fix will be to do the same during morx substitutions. I'll try that and check that it works as expected.

blueboxd pushed a commit to blueboxd/chromium-legacy that referenced this issue Jun 2, 2021
Fixes issues with mixed skin-tone man/woman holding hands emoji shaping
with Apple Color Emoji. See bug below and [1].

[1] harfbuzz/harfbuzz#2967

https://chromium.googlesource.com/external/github.com/harfbuzz/harfbuzz.git/+log/b37f03f16b39..1dffb553613d

$ git log b37f03f16..1dffb5536 --date=short --no-merges --format='%ad %ae %s'
2021-05-18 drott Chromium build fixes for C++ 17 warning and missing _remap_indexes
2021-05-13 jfkthame [aat] Add testcase for Apple Color Emoji couple-with-skin-tones sequence.
2021-05-13 jfkthame [aat] If shaping via morx, don't adjust mark positioning when zeroing widths.
2021-05-05 jfkthame [aat] Update glyph properties from GDEF if available when doing a replacement.
2021-05-12 grieger Error when link width not in [2, 4]
2021-04-17 qxliu [subset] Add subset () method for COLRv1 Paint tables, BaseGlyphV1List and LayerV1List
2021-05-12 grieger Add hb-ot-color-colrv1-closure.hh to sources list.
2021-05-12 grieger Remove array for visited_paint.
2021-04-01 qxliu [subset] COLRv1 layer/palette indices closure
2021-05-04 grieger [subset] fix failing colrv0 subsetting when font has composite glyphs.
2021-05-06 thomas.stuefe start
2021-03-29 grieger [subset] Add more Noto Nastaliq test cases.

Created with:
  roll-dep src/third_party/harfbuzz-ng/src
R=bashi@chromium.org,behdad@chromium.org,bungeman@chromium.org,drott@chromium.org,jshin@chromium.org,kojii@chromium.org

Fixed: 1205016
Cq-Include-Trybots: luci.chromium.try:mac10.15-blink-rel,mac10.13-blink-rel,mac10.12-blink-rel,mac-rel
Change-Id: Ie37e8357fa821c85d246c8fe2bb9ab6fa47ca8f2
Reviewed-on: https://chromium-review.googlesource.com/c/chromium/src/+/2902809
Commit-Queue: Dominik Röttsches <drott@chromium.org>
Reviewed-by: Koji Ishii <kojii@chromium.org>
Cr-Commit-Position: refs/heads/master@{#888400}
mjfroman pushed a commit to mjfroman/moz-libwebrtc-third-party that referenced this issue Oct 14, 2022
Fixes issues with mixed skin-tone man/woman holding hands emoji shaping
with Apple Color Emoji. See bug below and [1].

[1] harfbuzz/harfbuzz#2967

https://chromium.googlesource.com/external/github.com/harfbuzz/harfbuzz.git/+log/b37f03f16b39..1dffb553613d

$ git log b37f03f16..1dffb5536 --date=short --no-merges --format='%ad %ae %s'
2021-05-18 drott Chromium build fixes for C++ 17 warning and missing _remap_indexes
2021-05-13 jfkthame [aat] Add testcase for Apple Color Emoji couple-with-skin-tones sequence.
2021-05-13 jfkthame [aat] If shaping via morx, don't adjust mark positioning when zeroing widths.
2021-05-05 jfkthame [aat] Update glyph properties from GDEF if available when doing a replacement.
2021-05-12 grieger Error when link width not in [2, 4]
2021-04-17 qxliu [subset] Add subset () method for COLRv1 Paint tables, BaseGlyphV1List and LayerV1List
2021-05-12 grieger Add hb-ot-color-colrv1-closure.hh to sources list.
2021-05-12 grieger Remove array for visited_paint.
2021-04-01 qxliu [subset] COLRv1 layer/palette indices closure
2021-05-04 grieger [subset] fix failing colrv0 subsetting when font has composite glyphs.
2021-05-06 thomas.stuefe start
2021-03-29 grieger [subset] Add more Noto Nastaliq test cases.

Created with:
  roll-dep src/third_party/harfbuzz-ng/src
R=bashi@chromium.org,behdad@chromium.org,bungeman@chromium.org,drott@chromium.org,jshin@chromium.org,kojii@chromium.org

Fixed: 1205016
Cq-Include-Trybots: luci.chromium.try:mac10.15-blink-rel,mac10.13-blink-rel,mac10.12-blink-rel,mac-rel
Change-Id: Ie37e8357fa821c85d246c8fe2bb9ab6fa47ca8f2
Reviewed-on: https://chromium-review.googlesource.com/c/chromium/src/+/2902809
Commit-Queue: Dominik Röttsches <drott@chromium.org>
Reviewed-by: Koji Ishii <kojii@chromium.org>
Cr-Commit-Position: refs/heads/master@{#888400}
NOKEYCHECK=True
GitOrigin-RevId: a44afba0d407aa46b871a6e13b9db1ad85566236
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Chrome Chrome/Chromium project related issues and requests
Projects
None yet
3 participants