emoji-regex/text match string contains number #14

Closed
roderickhsiao opened this issue Mar 16, 2017 · 8 comments

@roderickhsiao

roderickhsiao commented Mar 16, 2017

When a string contains a number, emoji-regex/text matches the digits.

Version: 6.4.0

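const emojiRegex = require('emoji-regex/text');

// No emoji in the string: no match, as expected.
const matchExpected = emojiRegex().exec('foo');
console.log(matchExpected);
// null

// Digits in the string: the first digit is matched, which was unexpected.
const matchUnexpected = emojiRegex().exec('foo123');
console.log(matchUnexpected);
// [ '1', index: 3, input: 'foo123' ]
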
@mathiasbynens
Owner

That’s the expected behavior. E.g. 1 is a text emoji. See http://unicode.org/Public/emoji/5.0/emoji-data.txt:

0030..0039    ; Emoji                # 1.1 [10] (0️..9️)    digit zero..digit nine

Per spec, they’re only supposed to be rendered in emoji form when followed by a variation selector: http://unicode.org/Public/emoji/5.0/emoji-sequences.txt. But since you’re using the text regex, you opt in to matching them anyway.
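
To make the variation-selector point concrete, here is a minimal sketch that does not use emoji-regex at all. The helper name `hasEmojiPresentation` is hypothetical and only checks the digit, `#`, and `*` cases discussed in this thread; it is not how the library decides anything.

```javascript
// U+FE0F (VARIATION SELECTOR-16) requests emoji presentation.
const VS16 = '\uFE0F';
// U+20E3 (COMBINING ENCLOSING KEYCAP) completes a keycap sequence.
const KEYCAP = '\u20E3';

const plainDigit = '1';                // text presentation
const keycapOne = '1' + VS16 + KEYCAP; // keycap sequence 1️⃣

// Hypothetical helper: true if str starts with #, * or a digit
// that is immediately followed by VS16.
function hasEmojiPresentation(str) {
  return /^[#*0-9]\uFE0F/.test(str);
}

console.log(hasEmojiPresentation(plainDigit)); // false
console.log(hasEmojiPresentation(keycapOne));  // true
```

A bare `1` is still listed as `Emoji` in emoji-data.txt, which is why the text regex matches it; the check above only separates the two presentations.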

@roderickhsiao
Author

Thanks @mathiasbynens, I guess we will need to handle this on our side then 👍

cheers

@mathiasbynens
Owner

emoji-regex matches emoji according to the Unicode Standard. It sounds like you want to do something else — how do you determine what’s an emoji and what isn’t?

@roderickhsiao
Author

roderickhsiao commented Mar 17, 2017

Yes, we are parsing a string and trying to extract the emoji. Currently, after parsing, we also get the numbers, which probably shouldn't be presented as emoji in our case.

@roderickhsiao
Author

We basically just split the sentence and check each piece individually with emojiRegex().test(c) to find the emoji in the sentence.

@mathiasbynens
Owner

You didn’t answer the question — for your use case, how do you decide what constitutes an emoji and what isn’t?

@roderickhsiao
Author

roderickhsiao commented Mar 17, 2017

We flag parsed emoji that match the Unicode spec, and then check whether the browser renders purely an emoji (icon) for them.

@roderickhsiao
Author

I checked the spec; we probably want to exclude:

0023 ; Emoji # 1.1 [1] (#️) number sign
002A ; Emoji # 1.1 [1] (*️) asterisk
0030..0039 ; Emoji # 1.1 [10] (0️..9️) digit zero..digit nine

But you are absolutely correct, those are considered valid emoji.
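
One way the exclusion above could be handled on the caller's side is to post-filter the matches. This is only a sketch under assumptions: `BARE_TEXT_EMOJI` and `dropBareTextEmoji` are hypothetical names, and the sample array stands in for whatever matches a text-emoji regex returned.

```javascript
// Matches the code points proposed for exclusion when they appear on
// their own: U+0023 (#), U+002A (*), and U+0030..U+0039 (0-9).
const BARE_TEXT_EMOJI = /^[#*0-9]$/;

// Hypothetical helper: keep every match except a lone #, * or digit.
function dropBareTextEmoji(matches) {
  return matches.filter((match) => !BARE_TEXT_EMOJI.test(match));
}

// Assumed sample input: matches as a text-emoji regex might return them.
const sample = ['1', '#', '👍', '1\uFE0F\u20E3'];
console.log(dropBareTextEmoji(sample)); // [ '👍', '1️⃣' ]
```

Keycap sequences such as `1️⃣` survive the filter because they are longer than a single character; only the bare text forms are dropped.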
