missing emoji substitutions #700

forresto · 2024-04-15T07:54:10Z

Expected Behavior

There are a handful of emoji substitutions that are not found, even after #688 landed.

❤️‍🩹 should render as one glyph. (1433)

Current Behavior

❤️‍🩹 is rendering as 3 glyphs, ([ 169, 18, 1345 ])

Possible Solution

I can make a PR with failing test cases, if that's helpful.

Steps to Reproduce (for bugs)

#️⃣ found sub [ 4, 22 ] 1520
*️⃣ found sub [ 5, 22 ] 1521
0️⃣ found sub [ 6, 22 ] 1531
1️⃣ found sub [ 7, 22 ] 1522
⛹️‍♀️ found sub [ 140, 18, 81 ] 140
⛹️‍♂️ found sub [ 140, 18, 82 ] 140
❤️‍🔥 found sub [ 169, 18, 794 ] 1432
❤️‍🩹 found sub [ 169, 18, 1345 ] 1433

I'm manually looking for substitutions to find these, like this...

  const substitutions = font.substitution.getFeature("ccmp");

  let opentypeOptions = {
    kerning: true,
    language: "dflt",
    features: [{ script: "DFLT", tags: ["ccmp", "liga"] }],
  };

  for (const emoji of emojiData) {
    const { unicode } = emoji;
    const glyphs = font.stringToGlyphs(unicode, opentypeOptions);
    let glyph;
    if (glyphs.length === 1) {
      glyph = glyphs[0];
    } else {
      const indexes = glyphs.map((glyph) => glyph.index);
      const sub = substitutions.find((substitution) => equals(substitution.sub, indexes));

      if (sub) {
        glyph = font.glyphs.get(sub.by);
        console.log(unicode, "found sub", indexes, sub.by);
      } else {
        console.log(unicode, "no ccmp sub", indexes);
      }
    }
  }

/** Custom equals function that can also check lists. */
function equals(a, b) {
  if (a === b) {
    return true;
  } else if (Array.isArray(a) && Array.isArray(b)) {
    if (a.length !== b.length) {
      return false;
    }
    for (let i = 0; i < a.length; i += 1) {
      if (!equals(a[i], b[i])) {
        return false;
      }
    }
    return true;
  } else {
    return false;
  }
}

Context

Using noto-emoji in our CAD app, https://cuttle.xyz

Your Environment

Version used: be0d441
Font used: noto-emoji v47
Browser Name and version: Node
Operating System and version (desktop or mobile):
Link to your project:


Connum · 2024-04-15T08:17:52Z

@TonyJR would you be available to have a look at this, as you implemented the ccmp feature?

TonyJR · 2024-04-15T08:35:04Z

> @TonyJR would you be available to have a look at this, as you implemented the ccmp feature?

Yes, I'm looking into this.
I found the rule for "#️⃣ found sub [ 4, 22 ] 1520"

sub numbersign uni20E3 by keycap_hash;

It should be a GSUB lookup type 4, format 1 (ligature substitution). I will find the cause.

forresto · 2024-04-15T09:53:25Z

Here are the ones that should result in one glyph, but return multiple.

[
{"string":"#️⃣","indexes":[4,23],"expected":1548},
{"string":"*️⃣","indexes":[5,23],"expected":1549},
{"string":"0️⃣","indexes":[6,23],"expected":1559},
{"string":"1️⃣","indexes":[7,23],"expected":1550},
{"string":"2️⃣","indexes":[8,23],"expected":1551},
{"string":"3️⃣","indexes":[9,23],"expected":1552},
{"string":"4️⃣","indexes":[10,23],"expected":1553},
{"string":"5️⃣","indexes":[11,23],"expected":1554},
{"string":"6️⃣","indexes":[12,23],"expected":1555},
{"string":"7️⃣","indexes":[13,23],"expected":1556},
{"string":"8️⃣","indexes":[14,23],"expected":1557},
{"string":"9️⃣","indexes":[15,23],"expected":1558},
{"string":"🏋️‍♀️","indexes":[447,18,82],"expected":447},
{"string":"🏋️‍♂️","indexes":[447,18,83],"expected":447},
{"string":"🏌️‍♀️","indexes":[448,18,82],"expected":448},
{"string":"🏌️‍♂️","indexes":[448,18,83],"expected":448},
{"string":"🏳️‍🌈","indexes":[485,18,256],"expected":1871},
{"string":"🏳️‍⚧️","indexes":[485,18,116],"expected":1872},
{"string":"👁️‍🗨️","indexes":[566,18,886],"expected":1432},
{"string":"👨‍❤️‍👨","indexes":[605,18,170,18,605],"expected":646},
{"string":"👨‍❤️‍💋‍👨","indexes":[605,18,170,18,640,18,605],"expected":644},
{"string":"👩‍❤️‍👨","indexes":[606,18,170,18,605],"expected":646},
{"string":"👩‍❤️‍👩","indexes":[606,18,170,18,606],"expected":646},
{"string":"👩‍❤️‍💋‍👨","indexes":[606,18,170,18,640,18,605],"expected":644},
{"string":"👩‍❤️‍💋‍👩","indexes":[606,18,170,18,640,18,606],"expected":644},
{"string":"🕵️‍♀️","indexes":[855,18,82],"expected":855},
{"string":"🕵️‍♂️","indexes":[855,18,83],"expected":855},
{"string":"⛹️‍♀️","indexes":[141,18,82],"expected":141},
{"string":"⛹️‍♂️","indexes":[141,18,83],"expected":141},
{"string":"❤️‍🔥","indexes":[170,18,795],"expected":1433},
{"string":"❤️‍🩹","indexes":[170,18,1346],"expected":1434},
]

TonyJR · 2024-04-15T10:03:25Z

I found the reason!
You have input a "fully-qualified" emoji, and the font does not support it.
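To make "fully-qualified" concrete, here is a minimal sketch that lists the code points of the mending-heart emoji from the original report, with and without the variation selector U+FE0F. The helper name codePoints is made up for illustration and is not part of opentype.js.

```javascript
// Hypothetical helper: list the code points of a string as "U+XXXX" labels.
// Array.from iterates by code point, so astral characters stay in one piece.
function codePoints(str) {
  return Array.from(str, (ch) =>
    "U+" + ch.codePointAt(0).toString(16).toUpperCase().padStart(4, "0")
  );
}

// Fully-qualified form: heart + VS16 (U+FE0F) + ZWJ (U+200D) + adhesive bandage.
const fullyQualified = "\u2764\uFE0F\u200D\u{1FA79}";
// The same sequence without the variation selector.
const withoutSelector = "\u2764\u200D\u{1FA79}";

console.log(codePoints(fullyQualified).join(" "));
console.log(codePoints(withoutSelector).join(" "));
```

The only difference between the two strings is U+FE0F, so a substitution rule written against the shorter sequence will not match unless the selector is handled first.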

TonyJR · 2024-04-15T11:10:11Z

WTF! Figma draws it correctly. I'm going to find out why.

Connum · 2024-04-24T18:40:39Z

@TonyJR any progress on this?

TonyJR · 2024-04-25T10:56:23Z

> @TonyJR any progress on this?

Sorry, I've been a bit busy lately. \uFE00-\uFE0F are variation selectors, which should be dealt with in cmap.
I have tested HarfBuzz, and it skips these characters.
I see two ways to fix the bug:

1. Process cmap before processing GSUB, then remove the variation selectors.
2. Skip them when processing GSUB.

I prefer the first option. @Connum, are you familiar with cmap?
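Until one of these lands in the library, a caller-side stopgap might look like the sketch below: strip U+FE00 to U+FE0F from the string before shaping. This is an assumption about a possible workaround, not the fix proposed for opentype.js, and stripVariationSelectors is a hypothetical helper name. It also throws away any glyph variant that the font's cmap would have selected through those code points.

```javascript
// Hypothetical helper, not part of opentype.js.
// Matches the variation selectors U+FE00 through U+FE0F.
const VARIATION_SELECTORS = /[\uFE00-\uFE0F]/g;

// Return the text with all variation selectors removed.
function stripVariationSelectors(text) {
  return text.replace(VARIATION_SELECTORS, "");
}

// Keycap "#": number sign + VS16 + combining enclosing keycap.
const keycapHash = "#\uFE0F\u20E3";
const stripped = stripVariationSelectors(keycapHash);

// Two UTF-16 code units remain: "#" and U+20E3.
console.log(stripped.length);
```

In the reproduction script above, this would be applied as font.stringToGlyphs(stripVariationSelectors(unicode), opentypeOptions). Whether that yields the expected single glyphs with noto-emoji has not been verified here.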

Connum · 2024-04-25T11:34:48Z

I implemented special handling of variation selectors some time ago; maybe that's interfering? The order of processing should be stated in the docs. As far as I remember, cmap should be handled before any layout is applied.

TonyJR · 2024-04-25T12:10:41Z

Yes, you are right. I'm trying to find the order, but I'd prefer to refer directly to the HarfBuzz source code.
I also found that HarfBuzz actually merges and processes the functions in gsub/gpos. Perhaps we should follow its approach too, but that may be a big project...

Connum added bug Needs Investigation font features labels Apr 15, 2024
