Emoji "eye in speech bubble" is not interpreted correctly #46

Open
yantene opened this Issue Jul 24, 2017 · 2 comments

Collaborator

yantene commented Jul 24, 2017

There is an emoji named "eye in speech bubble" in the Unicode Emoji List.

eye in speech bubble - Emoji List, v5.0

According to this list, the emoji consists of the codepoints U+1F441 U+FE0F U+200D U+1F5E8 U+FE0F,
but the following code returns nil.

Twemoji.find_by(unicode: "\u{1f441}\u{fe0f}\u{200d}\u{1f5e8}\u{fe0f}")

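This is because the gem stores this emoji as U+1F441 U+200D U+1F5E8, without the two U+FE0F codepoints.

Codepoint U+FE0F is VARIATION SELECTOR-16, the "emoji presentation selector": a variation selector that requests emoji-style rendering of the preceding character.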

Similar issues are also found in keycap emojis (:hash:, :asterisk:, etc.).
On the other hand, the :{man,woman}-{kiss,heart}-{man,woman}: emojis do contain U+FE0F in this gem.
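
To make the mismatch concrete, here is a minimal sketch that uses only core Ruby String methods, not the gem itself. The helper `hex_codepoints` is a made-up name for illustration; it prints codepoints in the same dash-separated hex style that appears in this thread. Whether the gem stores the keycap entries exactly in the stripped form shown is an assumption.

```ruby
# Hypothetical helper (not part of the twemoji gem): render a string's
# codepoints as dash-separated lowercase hex.
def hex_codepoints(str)
  str.codepoints.map { |cp| cp.to_s(16) }.join("-")
end

VS16 = "\u{fe0f}" # VARIATION SELECTOR-16, the emoji presentation selector

# "eye in speech bubble" as given in the Emoji List
eye = "\u{1f441}\u{fe0f}\u{200d}\u{1f5e8}\u{fe0f}"
# keycap "#": U+0023 U+FE0F U+20E3
keycap_hash = "\u{23}\u{fe0f}\u{20e3}"

puts hex_codepoints(eye)                      # 1f441-fe0f-200d-1f5e8-fe0f
puts hex_codepoints(eye.delete(VS16))         # 1f441-200d-1f5e8
puts hex_codepoints(keycap_hash)              # 23-fe0f-20e3
puts hex_codepoints(keycap_hash.delete(VS16)) # 23-20e3
```

The second line of output is the sequence this gem has for the emoji, which is why a lookup with the full sequence finds nothing.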

I think this inconsistency is a problem and should be fixed, but I have no idea how.
What do you think?

renechz commented Jul 27, 2017

Having a related issue, I think:

Twemoji.find_by(unicode: "☝️") # => nil
Twemoji.find_by(unicode: "✌️") # => nil

This seems to happen consistently with emojis that end with U+FE0F as a trailing presentation selector. For example, :point_up: has the codepoint U+261D in this gem, but the input string is parsed as "261d-fe0f".

Twemoji.unicode_to_str("☝️") # => "261d-fe0f"

As far as I can tell from the emoji variation sequences, the presentation selector is there to say "this is a graphic (i.e. non-text) emoji". Wouldn't it be safe to add the U+FE0F suffix and assume all emojis are "non-text"?
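
A related option, sketched here purely as an assumption and not as how the gem works, is to drop U+FE0F on both sides before comparing, so entries stored with or without the selector behave the same. `SAMPLE_CODES`, `normalize_key`, `key_for` and `lookup` are hypothetical names, and the table is a small stand-in for the gem's data, not a copy of it.

```ruby
VS16 = "\u{fe0f}" # VARIATION SELECTOR-16

# Hypothetical stand-in for the gem's table. The entries deliberately mix
# codes with and without "fe0f", as described in this issue.
SAMPLE_CODES = {
  ":point_up:" => "261d",
  ":v:" => "270c",
  ":man-heart-man:" => "1f468-200d-2764-fe0f-200d-1f468"
}.freeze

# Remove every "fe0f" part from a dash-separated hex code.
def normalize_key(hex)
  hex.split("-").reject { |part| part == "fe0f" }.join("-")
end

# Turn an input string into the same normalized form.
def key_for(unicode)
  unicode.delete(VS16).codepoints.map { |cp| cp.to_s(16) }.join("-")
end

# Index the table by normalized code.
INDEX = SAMPLE_CODES.each_with_object({}) do |(name, hex), idx|
  idx[normalize_key(hex)] = name
end

def lookup(unicode)
  INDEX[key_for(unicode)]
end

p lookup("\u{261d}\u{fe0f}") # ":point_up:"
p lookup("\u{261d}")         # ":point_up:"
p lookup("\u{1f468}\u{200d}\u{2764}\u{200d}\u{1f468}") # ":man-heart-man:"
```

The trade-off is that this treats the text and emoji presentations of a character as the same thing, which may or may not be acceptable for the gem.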

This gem uses emoji-unicode.yml to build a Regexp that matches the emojis' unicode and text. Maybe the fe0f codepoint could be added as an optional group? e.g. ":point-up:": 261d(-fe0f)?. (I'm not good with regex, so I don't really have a concrete idea.)

That would likely bring some other issues of its own, and I'm not completely familiar with the gem, but it could be a starting point.
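
As a rough sketch of the optional-group idea, assuming a table of dash-separated hex codes like the ones quoted in this thread: each codepoint is followed by an optional U+FE0F in the generated pattern. `SAMPLE_CODES`, `tolerant_pattern` and `find_name` are hypothetical names, the ":eye-in-speech-bubble:" key is a guess at the entry name, and none of this is the gem's actual Regexp-building code.

```ruby
VS16 = "\u{fe0f}" # VARIATION SELECTOR-16

# Hypothetical stand-in for a few emoji-unicode.yml entries.
SAMPLE_CODES = {
  ":point_up:" => "261d",
  ":v:" => "270c",
  ":eye-in-speech-bubble:" => "1f441-200d-1f5e8"
}.freeze

# Build an anchored pattern in which every codepoint may be followed by
# an optional U+FE0F. Any "fe0f" already in the code is dropped first so
# it is not required twice.
def tolerant_pattern(code)
  body = code.split("-")
             .reject { |part| part == "fe0f" }
             .map { |part| Regexp.escape([part.to_i(16)].pack("U")) + "#{VS16}?" }
             .join
  Regexp.new("\\A#{body}\\z")
end

PATTERNS = SAMPLE_CODES.transform_values { |code| tolerant_pattern(code) }

# Return the first name whose pattern matches the whole input, or nil.
def find_name(unicode)
  match = PATTERNS.find { |_name, pattern| pattern.match?(unicode) }
  match && match.first
end

p find_name("\u{261d}\u{fe0f}")                               # ":point_up:"
p find_name("\u{261d}")                                       # ":point_up:"
p find_name("\u{1f441}\u{fe0f}\u{200d}\u{1f5e8}\u{fe0f}")     # ":eye-in-speech-bubble:"
```

This version is deliberately permissive: it also accepts a selector after the zero-width joiner, which is not a valid sequence, so a real implementation would probably want to restrict where the optional group is allowed.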

Collaborator

yantene commented Aug 9, 2017

@renechz Sorry for the delayed response.
Thank you for your comment.

I think your proposal is reasonable and should be implemented.
@JuanitoFatas What do you think?
