Consider providing a way to keep emoji sequences joined #1159

@drott

(Spun off from https://crbug.com/774302)

Chrome tries fonts in the following order: first the fonts in the font-family: font stack; then, emoji segmentation determines a fallback priority, so that an emoji font is the first fallback font attempted for an emoji sequence; finally, other system fallback is performed. This allows specifying a custom emoji font in the font stack.
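To make the ordering concrete, here is a minimal sketch of it. This is not Chrome's actual code; the function name `fallback_order` and the default font names are hypothetical and only illustrate the priority described above.

```python
def fallback_order(font_stack, is_emoji_sequence,
                   emoji_font="Noto Color Emoji",
                   system_fallback=("DejaVu Sans",)):
    """Return the fonts to try, in order, for one run of text.

    All names and defaults here are illustrative assumptions.
    """
    order = list(font_stack)  # the author's font-family list always comes first
    # For emoji sequences, the emoji font is the first fallback attempted.
    candidates = ([emoji_font] if is_emoji_sequence else []) + list(system_fallback)
    for font in candidates:
        if font not in order:  # a font already in the stack keeps its earlier position
            order.append(font)
    return order


print(fallback_order(["Arial"], is_emoji_sequence=True))
```

Because the font stack always comes first, a custom emoji font listed there takes priority over the default emoji fallback.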

If a font in the font stack has glyph coverage for symbols that are part of an emoji sequence, those symbols get shaped with the font that appears earliest in the font stack. For example, Arial has coverage for the male and female sign characters.

As a result, emoji sequences are broken up, because HarfBuzz does not treat those sequences as a single grapheme cluster. In Chrome, we rely on HarfBuzz clusters as the unit of fallback, so when HarfBuzz breaks up an emoji sequence, only part of the sequence falls back to the emoji font and the rest stays in the earlier font.

For example:

$ ./hb-shape /usr/share/fonts/truetype/msttcorefonts/Arial.ttf \
`../test/shaping/hb-unicode-encode U+1F481,U+1F3FB,U+200D,U+2642,U+FE0F`
[.notdef=0+1536|.notdef=0+1536|space=0+0|male=3+1536|space=3+0]

So, HarfBuzz returns two clusters, starting at character index 0 and at character index 3.
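A small sketch of how a cluster-based fallback would read the Arial output above. It assumes the `name=cluster+advance` form printed by hb-shape here (with optional offsets), and that a missing glyph shows up as `.notdef`; the helper names are hypothetical.

```python
import re

# One entry looks like "male=3+1536"; offsets ("@dx,dy") and a y-advance are optional.
ENTRY = re.compile(
    r"(?P<glyph>[^=|]+)=(?P<cluster>\d+)"
    r"(?:@-?\d+,-?\d+)?"
    r"\+(?P<advance>-?\d+)(?:,-?\d+)?"
)


def parse_hb_shape(output):
    """Parse one line of hb-shape text output into (glyph, cluster, advance) tuples."""
    glyphs = []
    for entry in output.strip().strip("[]").split("|"):
        match = ENTRY.fullmatch(entry)
        if match is None:
            raise ValueError(f"unrecognized entry: {entry!r}")
        glyphs.append((match["glyph"], int(match["cluster"]), int(match["advance"])))
    return glyphs


def clusters_needing_fallback(glyphs):
    """Map each cluster start to True if any of its glyphs is .notdef."""
    needs = {}
    for name, cluster, _advance in glyphs:
        needs[cluster] = needs.get(cluster, False) or name == ".notdef"
    return needs


arial = "[.notdef=0+1536|.notdef=0+1536|space=0+0|male=3+1536|space=3+0]"
print(clusters_needing_fallback(parse_hb_shape(arial)))  # {0: True, 3: False}
```

Only cluster 0 is sent to fallback; cluster 3 (the male sign plus the variation selector) is considered done in Arial, which is what splits the sequence.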

Whereas, when shaped with an emoji font:

$ ./hb-shape ~/.local/share/fonts/NotoColorEmoji.ttf  \
`../test/shaping/hb-unicode-encode U+1F481,U+1F3FB,U+200D,U+2642,U+FE0F`
[gid2101=0+2550]

The shaping result comes back with one cluster, starting at character index 0.

Keeping in mind what the user expects to see, for shaping fallback we should consider emoji sequences as a whole unit of fallback, even though they are not necessarily defined as grapheme clusters by Unicode (to be clarified).

We should probably not break sequences that are defined as an xpicto-sequence in UAX #29 (Unicode Text Segmentation) into multiple HarfBuzz clusters.
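A rough sketch of what keeping such sequences joined could look like, as a post-processing step on cluster starts. This is not HarfBuzz code. The property sets below are tiny hand-picked samples that I assume match the Extended_Pictographic and Extend values for these code points, not the real Unicode tables, and the pattern is a simplified reading of the xpicto-sequence idea (a pictograph, optional Extend characters, then repeated ZWJ + pictograph).

```python
ZWJ = 0x200D
# Tiny hand-picked samples, NOT the real Unicode property tables.
EXTENDED_PICTOGRAPHIC = {0x1F481, 0x2640, 0x2642}
EXTEND = {0xFE0F} | set(range(0x1F3FB, 0x1F400))  # VS16 and skin tone modifiers


def xpicto_spans(text):
    """Return (start, end) code point ranges of emoji sequences in text."""
    cps = [ord(ch) for ch in text]
    spans, i, n = [], 0, len(cps)
    while i < n:
        if cps[i] not in EXTENDED_PICTOGRAPHIC:
            i += 1
            continue
        j = i + 1
        while True:
            while j < n and cps[j] in EXTEND:  # modifiers, variation selectors
                j += 1
            # Continue only across "ZWJ + another pictograph".
            if j + 1 < n and cps[j] == ZWJ and cps[j + 1] in EXTENDED_PICTOGRAPHIC:
                j += 2
            else:
                break
        spans.append((i, j))
        i = j
    return spans


def merge_clusters(cluster_starts, spans):
    """Drop cluster starts that fall strictly inside an emoji sequence."""
    return [c for c in cluster_starts
            if not any(start < c < end for start, end in spans)]


text = "\U0001F481\U0001F3FB\u200D\u2642\uFE0F"
print(merge_clusters([0, 3], xpicto_spans(text)))  # [0]
```

With the Arial result from above, the cluster starting at index 3 is folded into the one starting at 0, so the whole sequence would go to fallback as one unit.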

Labels: Chrome (Chrome/Chromium project related issues and requests)
