Some emojis ending with \ufe0f are not completely matched
#28
Comments
That’s not a standard emoji sequence AFAICT; U+1F575 U+FE0F is not listed in |
Apple appears to have a very loose idea of conformance to the standard set of codepoints. While working on other fixes for There were 49 |
Any real downsides to adding this control character to the regex? Besides bloating the regex just to work around a possible macOS bug. |
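To make the trade-off concrete, here is a minimal sketch of what tolerating a trailing `U+FE0F` could look like. Both patterns are hypothetical, heavily simplified stand-ins that only know about U+1F575; they are not the pattern that `emoji-regex` actually generates.

```javascript
// Hypothetical stand-in for the real pattern: knows only U+1F575 (detective).
const strictPattern = /\u{1F575}/u;

// The same stand-in with an optional trailing U+FE0F (variation selector-16).
const lenientPattern = /\u{1F575}\uFE0F?/u;

const typedOnMac = '\u{1F575}\uFE0F';

const strictMatch = typedOnMac.match(strictPattern)[0];
const lenientMatch = typedOnMac.match(lenientPattern)[0];

// Lengths are in UTF-16 code units: U+1F575 is a surrogate pair (2 units).
console.log(strictMatch.length);          // 2, U+FE0F is left unmatched
console.log(lenientMatch.length);         // 3, U+FE0F is consumed as well
console.log(lenientMatch === typedOnMac); // true
```

The cost is one extra optional code point per affected alternative, which is the "bloat" the question refers to.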
Excerpt from http://unicode.org/Public/emoji/5.0/emoji-test.txt:
So the sequence in question is rather conformant. |
Thanks for the pointer, @artyom! Per http://unicode.org/reports/tr51/#Emoji_Implementation_Notes, emoji ZWJ sequences “may have an emoji presentation selector”. |
Hacky solution: just add |
For my own project's use I ended up going with that same hacky solution. I figured it wasn't right to submit a PR back to this project for it, so I just left it on a custom branch of my fork. |
Are there any plans to integrate this into the project? It seems that the consensus is that this is a legitimate use case...sorry if I'm off base here |
@fredvollmer #28 (comment) answers your question. I’d welcome a patch :) |
@mathiasbynens how can this issue be solved? I ran into it, too |
Hi @mathiasbynens, is it possible to add rules for those that do not fall within the sequence, e.g.:
|
Try again using the latest release!
const emojiRegex = require('emoji-regex');
const string = '\u{1F575}\uFE0F'; // '🕵️'
console.log(
  string.match(emojiRegex())
);
// → [ '🕵️' ]
Closing as fixed. Feel free to reopen or file a new bug in case I missed anything. |
Summary: Turns out that [macOS](mathiasbynens/emoji-regex#28 (comment)) [appends](mathiasbynens/emoji-regex#68) the `U+FE0F` character to some Unicode emojis when you select them from the native OS emoji selector. It's not clear why Apple does this, or why it only happens for a certain set of emoji. This still counts as [valid emoji Unicode](mathiasbynens/emoji-regex#28 (comment)). However, our `onlyOneEmojiRegex` thinks it's two emojis.

Our implementation of `onlyOneEmojiRegex` involves introspecting into the RegExp string that `emoji-regex` uses, and is not an officially supported approach by that package. `emoji-regex` supports matching emojis in text, and checking if the text includes only emoji. But checking for precisely one emoji is more complicated, and our approach (which is basically just extracting the raw RegExp and putting it inside of `/^()$/`) doesn't work in some scenarios where `U+FE0F` is suffixed.

Luckily we don't use the native macOS emoji selector in any of our UIs, but it does look like @Ginsu used it to select some of the emojis. The diff adds a unit test to make sure all of the default emojis pass `onlyOneEmojiRegex`, and fixes all failing emojis.

Test Plan: I noticed that a test username of `at4` would match up with an anchor emoji as the default, and the anchor emoji was failing to be set. After this diff everything worked.

Reviewers: ginsu, atul
Reviewed By: atul
Subscribers: tomek, ginsu
Differential Revision: https://phab.comm.dev/D8145
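The anchoring approach described in the summary can be sketched roughly as follows. This is an illustration under assumptions, not the actual `onlyOneEmojiRegex` code: `toExactlyOneRegex` is a hypothetical helper name, and `baseOnlyPattern` is a simplified stand-in for a pattern that does not account for a trailing `U+FE0F`.

```javascript
// Hypothetical helper: wrap a pattern's source in ^(...)$ so that it must
// match the entire string. The global flag is dropped so that test() stays
// stateless across calls.
function toExactlyOneRegex(regex) {
  const flags = regex.flags.replace('g', '');
  return new RegExp(`^(${regex.source})$`, flags);
}

// Simplified stand-in pattern (assumption): matches U+1F575 only.
const baseOnlyPattern = /\u{1F575}/gu;
const exactlyOne = toExactlyOneRegex(baseOnlyPattern);

const plainEmoji = '\u{1F575}';
const emojiWithSelector = '\u{1F575}\uFE0F';

console.log(exactlyOne.test(plainEmoji));        // true
console.log(exactlyOne.test(emojiWithSelector)); // false, the suffix breaks the anchored match
```

Because the wrapped pattern must span the whole string, any code point the inner pattern does not consume, such as a suffixed `U+FE0F`, makes the check fail even though the user sees a single emoji.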
Male detective emoji, 🕵️ "\u{1f575}\ufe0f": when matched with emoji regex, not all of its code points are consumed, leaving \ufe0f behind. The emoji was typed with the Control+Cmd+Space shortcut on a Mac.
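A small sketch of the reported symptom, using a simplified stand-in pattern (an assumption for illustration, not the library's real regex) that matches only the base code point:

```javascript
const reported = '\u{1F575}\uFE0F';

// List the code points that make up the string.
const codePoints = Array.from(
  reported,
  (ch) => 'U+' + ch.codePointAt(0).toString(16).toUpperCase()
);
console.log(codePoints); // [ 'U+1F575', 'U+FE0F' ]

// Stand-in pattern (assumption): matches the base code point only.
const baseCodePointOnly = /\u{1F575}/gu;

// Removing every match shows what the pattern failed to consume.
const leftover = reported.replace(baseCodePointOnly, '');
console.log(leftover === '\uFE0F'); // true
```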