Normalize Emoji variant selectors in identifiers? #31588

Keno · 2019-04-02T21:06:03Z

Today I came across the existence of emoji variant selectors. Basically, the Unicode standard already had a bunch of symbols that were a little like emoji but not colorful, so they added a combining character to make those colorful, and similarly one to make emoji non-colorful. At the moment, we consider these significant in identifiers, so we allow things like:

We should make a decision on whether we want to normalize out this distinction in our identifier normalization.
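To make the distinction concrete, here is a minimal Python sketch. It assumes the telephone symbol that comes up later in the thread (U+260E) as the example character, and uses Python's `unicodedata` module only as a stand-in for a generic NFC implementation. The names `TEXT_PHONE`, `EMOJI_PHONE`, and `code_points` are made up for illustration.

```python
import unicodedata

# Assumed example: BLACK TELEPHONE with and without the emoji selector.
TEXT_PHONE = "\u260E"          # base symbol only
EMOJI_PHONE = "\u260E\uFE0F"   # base symbol + VARIATION SELECTOR-16

def code_points(s):
    """Return the code points of s in U+XXXX notation."""
    return [f"U+{ord(c):04X}" for c in s]

print(code_points(TEXT_PHONE))    # ['U+260E']
print(code_points(EMOJI_PHONE))   # ['U+260E', 'U+FE0F']

# The two strings are distinct, so they would name distinct identifiers.
print(TEXT_PHONE == EMOJI_PHONE)  # False

# NFC leaves the selector in place, so it does not unify the two forms.
print(unicodedata.normalize("NFC", EMOJI_PHONE) == EMOJI_PHONE)  # True
```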

Keno · 2019-04-03T02:11:08Z

A related issue: Should we change the \:emoji: completions to include the emoji variant selectors? E.g. \:phone: completes to the non-emoji variant.

stevengj · 2019-04-03T16:14:54Z

I feel like emoji normalization is something we should leave to the Unicode consortium — it seems like they should really fix this in NFC, and it's not worth the effort for us to use a custom normalization here.

For tab completion we can do whatever we want, of course.
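For a sense of what a custom normalization would involve, here is a rough Python sketch. It is not Julia's actual identifier normalization: `normalize_identifier` is a hypothetical helper that applies NFC and then drops the two presentation selectors (U+FE0E and U+FE0F).

```python
import unicodedata

# Text-presentation (VS15) and emoji-presentation (VS16) selectors.
VARIATION_SELECTORS = {"\uFE0E", "\uFE0F"}

def normalize_identifier(name: str) -> str:
    """Hypothetical helper: apply NFC, then strip VS15/VS16."""
    nfc = unicodedata.normalize("NFC", name)
    return "".join(ch for ch in nfc if ch not in VARIATION_SELECTORS)

# Both spellings of the telephone collapse to the bare base symbol.
print(normalize_identifier("\u260E\uFE0F") == normalize_identifier("\u260E"))  # True
```

Stripping the selector is only one possible choice. Mapping every affected symbol to a single preferred presentation would need a table of which code points accept selectors, which is the kind of maintenance cost the comment above argues against.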

StefanKarpinski · 2019-04-03T20:41:37Z

I agree. The main concern for identifier normalization is when two different identifiers are both easy to input and hard to distinguish. That doesn’t seem to be the case here. If the Unicode consortium decides to normalize these then we can follow suit.

Keno · 2019-04-03T20:47:52Z

> That doesn’t seem to be the case here.

Isn't it? iTerm2 has decent Unicode support, but half the other software I tried (including all editors we support) either renders them the same or renders one or the other as a replacement character. As for inputting them, if I google "telephone emoji" I get to https://emojipedia.org/black-telephone/ and if I copy that, I get the emoji variant, which is different from what you get by doing \:phone:<tab>. Worse, if I accidentally backspace on one of the emoji variant identifiers in Sublime, I get the non-emoji variant (which may or may not be rendered the same).

StefanKarpinski · 2019-04-03T20:55:31Z

The most conservative option would be to reject modified emoji altogether.
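As a rough illustration of the "reject" option, here is a Python sketch. `check_identifier` and `has_presentation_selector` are hypothetical names, and treating only U+FE0E and U+FE0F as "modifiers" is an assumption of this sketch, not something settled in the thread.

```python
# Text-presentation (VS15) and emoji-presentation (VS16) selectors.
VARIATION_SELECTORS = ("\uFE0E", "\uFE0F")

def has_presentation_selector(name: str) -> bool:
    """True if name contains a text or emoji presentation selector."""
    return any(vs in name for vs in VARIATION_SELECTORS)

def check_identifier(name: str) -> str:
    """Hypothetical validator: refuse identifiers carrying a selector."""
    if has_presentation_selector(name):
        raise ValueError(f"identifier {name!r} contains a variation selector")
    return name

check_identifier("\u260E")              # accepted: bare symbol
try:
    check_identifier("\u260E\uFE0F")    # rejected: symbol + VS16
except ValueError as err:
    print(err)
```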

Keno · 2019-04-03T20:58:42Z

True, but for a number of emoji (☎️ being an example), the rendering that people identify with the emoji is the one that has the variant selector.

stevengj · 2019-04-03T21:26:41Z

These are hardly the confusable characters that I would worry about most (as opposed to, say, Α vs. A), given that the emoji tab completion was added as an April Fool's joke (#10709) and emoji variables are mostly a party trick in Julia rather than a practical programming style. Let the ~~Unicode Consortium~~ Emoji Emporium worry about this.

StefanKarpinski · 2019-04-03T21:31:23Z

> Emoji Emporium

😂 too true

JeffBezanson added the domain:unicode label Apr 3, 2019
