Emoji ZWJ Sequences are not grouped as single cluster #217

@drott

Emoji ZWJ sequences are not recognized or grouped into a single cluster when they are shaped with a font that has no glyphs for them. This conflicts with the shaper-driven segmentation approach we take in Chrome. For example, if the page specifies Arial as its first font, the ZWJs are shaped as zero-width space glyphs from Arial, while the individual person emoji remain unresolved and are later shaped with the fallback emoji font. By then the run has already been broken up by the zero-width spaces taken from Arial. Example:


$ ./hb-shape /Library/Fonts/Arial.ttf "👩‍👩‍👧‍👧 "
[.notdef=0+1536|space=1+0|.notdef=2+1536|space=3+0|.notdef=4+1536|space=5+0|.notdef=6+1536|space=7+569]

So here the emoji glyphs are not found (each one becomes .notdef), but the ZWJs are shaped and rendered as zero-width space glyphs from Arial.
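
To make the fragmentation easier to see, here is a minimal Python sketch that parses the default `hb-shape` text output shown above into (glyph, cluster, advance) tuples. The helper name `parse_hb_shape` and the regular expression are my own, written against the `glyph=cluster+advance` form in this output (with an optional `@x,y` offset); they are not part of HarfBuzz.

```python
import re

# One item looks like ".notdef=0+1536" or "space=1@10,0+0" (offset is optional).
ITEM_RE = re.compile(r"([^=]+)=(\d+)(?:@(-?\d+),(-?\d+))?\+(-?\d+)")

def parse_hb_shape(output):
    """Parse a single line of default hb-shape output into tuples.

    Returns a list of (glyph_name, cluster, x_advance).
    """
    glyphs = []
    for item in output.strip().strip("[]").split("|"):
        m = ITEM_RE.fullmatch(item)
        if m is None:
            raise ValueError(f"unrecognized hb-shape item: {item!r}")
        glyphs.append((m.group(1), int(m.group(2)), int(m.group(5))))
    return glyphs

# The output reported above for Arial.
ARIAL_OUTPUT = (
    "[.notdef=0+1536|space=1+0|.notdef=2+1536|space=3+0|"
    ".notdef=4+1536|space=5+0|.notdef=6+1536|space=7+569]"
)

glyphs = parse_hb_shape(ARIAL_OUTPUT)
distinct_clusters = sorted({cluster for _, cluster, _ in glyphs})
```

Every code point of the sequence ends up in its own cluster here, which is what lets the zero-width spaces from Arial split the run before fallback happens.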

At the moment, this could only be overridden in Blink by always explicitly prioritizing the system emoji font over the fonts specified on the page for emoji sequences.

Instead, I propose fixing this by placing the whole emoji ZWJ sequence in a single cluster and emitting one .notdef for it. We probably need to treat keycap sequences and double regional indicator sequences (as in emoji flags) similarly.
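
A rough sketch of the grouping I have in mind, in Python for brevity. This is not HarfBuzz code and the function names are hypothetical. It is also deliberately simplified: it does not consult the Unicode Emoji property, so it would merge around any ZWJ, and it ignores skin-tone modifiers and tag sequences.

```python
ZWJ = 0x200D     # ZERO WIDTH JOINER
VS16 = 0xFE0F    # VARIATION SELECTOR-16 (emoji presentation)
KEYCAP = 0x20E3  # COMBINING ENCLOSING KEYCAP

def is_regional_indicator(cp):
    """True for REGIONAL INDICATOR SYMBOL LETTER A..Z."""
    return 0x1F1E6 <= cp <= 0x1F1FF

def emoji_clusters(text):
    """Return one cluster value per code point.

    Code points belonging to the same ZWJ sequence, keycap sequence,
    or regional indicator pair share the index of their first code point.
    """
    clusters = []
    ri_run = 0  # regional indicators seen so far in the current run
    for i, ch in enumerate(text):
        cp = ord(ch)
        join = False
        if i > 0:
            prev = ord(text[i - 1])
            if cp in (ZWJ, VS16, KEYCAP) or prev == ZWJ:
                # Joiners and combining marks attach to what precedes them;
                # whatever follows a ZWJ attaches as well.
                join = True
            elif is_regional_indicator(cp) and ri_run % 2 == 1:
                # Second half of a flag pair.
                join = True
        ri_run = ri_run + 1 if is_regional_indicator(cp) else 0
        clusters.append(clusters[-1] if join else i)
    return clusters

# The family sequence from the example above, followed by a space.
FAMILY = "\U0001F469\u200D\U0001F469\u200D\U0001F467\u200D\U0001F467 "
family_clusters = emoji_clusters(FAMILY)
```

With this grouping, the seven code points of the family sequence share cluster 0 and the trailing space gets cluster 7, so a font without the emoji glyphs could produce a single .notdef for the whole sequence instead of four .notdef glyphs interleaved with zero-width spaces.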

Related to, but not a duplicate of, issue #179, I believe. CC @roozbehp
