Regex does not match many objects from Apple emoji palette, but does match the same emojis from Android #68
Comments
I've noticed previously that Apple has a loose conformance with the Unicode spec around emoji characters, specifically which ones do and don't need the special `U+FE0F` variation selector. Regarding your specific example, there's definitely a difference in the output between the platforms, even though they have the same visual appearance on my machine.

    // Map each string to a space-separated list of its code points (e.g. "U+231B U+FE0F").
    const listCodePoints = (arr) =>
      arr.map(
        (e) => [...e].map(
          (cp) => `U+${cp.codePointAt(0).toString(16).toUpperCase()}`
        ).join(' ')
      );
    console.log(listCodePoints(ios));
    // [ "U+23F1", "U+23F2", "U+1F570", "U+231B U+FE0F", "U+23F3", "U+1F39B" ]
    console.log(listCodePoints(android));
    // [ "U+23F1 U+FE0F", "U+23F2 U+FE0F", "U+1F570 U+FE0F", "U+231B", "U+23F3", "U+1F39B U+FE0F" ]
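For anyone who wants to reproduce the comparison without the original input, here is a minimal sketch that rebuilds both arrays from the code points printed above and checks that they differ only in where `U+FE0F` appears. The arrays are reconstructed from that output rather than copied from the original picker input, and `stripVS16` is a hypothetical helper name.

```javascript
// Code points copied from the two outputs above (reconstructed, not the original input).
const iosCodePoints = [
  [0x23f1], [0x23f2], [0x1f570], [0x231b, 0xfe0f], [0x23f3], [0x1f39b],
];
const androidCodePoints = [
  [0x23f1, 0xfe0f], [0x23f2, 0xfe0f], [0x1f570, 0xfe0f], [0x231b], [0x23f3], [0x1f39b, 0xfe0f],
];

// Turn each list of code points back into a string.
const toStrings = (lists) => lists.map((cps) => String.fromCodePoint(...cps));

// Hypothetical helper: drop every U+FE0F (variation selector-16) from a string.
const stripVS16 = (s) => s.replace(/\uFE0F/g, '');

const ios = toStrings(iosCodePoints);
const android = toStrings(androidCodePoints);

// How many entries differ between the platforms as raw strings?
const differing = ios.filter((e, i) => e !== android[i]).length;
// Are all entries identical once U+FE0F is ignored?
const sameBase = ios.every((e, i) => stripVS16(e) === stripVS16(android[i]));

console.log(differing); // 5
console.log(sameBase); // true
```

Five of the six entries differ as raw strings, yet all six are identical once the variation selector is ignored, which matches the observation that they look the same on screen.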
@gilmoreorless Does it make sense to make
@scottnonnenberg-signal It does make sense as a potential variation. I pondered a "loose" variant in #33 (comment), but that was about a slightly different problem. The simple answer is that no-one has yet done the work to add one; @mathiasbynens pointed out it's not a straightforward task in #28 (comment).
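To make the idea of a "loose" variant concrete, here is a rough sketch of what such a pattern might look like for simple single-code-point emoji. It is an assumption for illustration only, not this library's pattern or API: it uses the built-in `\p{Extended_Pictographic}` Unicode property escape with an optional trailing `U+FE0F`, and it does not handle ZWJ sequences, skin-tone modifiers, flags, or keycaps. The names `looseEmoji` and `countLoose` are hypothetical.

```javascript
// Hypothetical "loose" pattern: one pictographic code point, optionally
// followed by U+FE0F. This is NOT emoji-regex's pattern.
const looseEmoji = /\p{Extended_Pictographic}\uFE0F?/gu;

// Count how many loose matches a string contains.
const countLoose = (text) => (text.match(looseEmoji) || []).length;

console.log(countLoose('\u23F1'));        // 1 (stopwatch without U+FE0F)
console.log(countLoose('\u23F1\uFE0F'));  // 1 (stopwatch with U+FE0F)
console.log(countLoose('plain text'));    // 0
```

Both spellings of the stopwatch count as one match here, which is the behaviour a loose variant would be after; the gaps listed above are a large part of why a complete version is not a straightforward task.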
The best long-term solution is for Apple to respect the Unicode Standard and not deviate from it. In recent macOS updates it seems like emoji input has improved in terms of spec compliance, so I'm hopeful. @kenpowers-signal Do you have an up-to-date iOS device handy? Could you try inputting those emoji again on the latest iOS? I wonder if the variation selectors are still missing.
iOS 14.2.1: apparently all the same as on earlier iOS versions.
v10.0.0 now leverages
Summary: Turns out that [macOS](mathiasbynens/emoji-regex#28 (comment)) [appends](mathiasbynens/emoji-regex#68) the `U+FE0F` character to some Unicode emojis when you select them from the native OS emoji selector. It's not clear why Apple does this, or why it only happens for a certain set of emoji. This still counts as [valid emoji Unicode](mathiasbynens/emoji-regex#28 (comment)). However, our `onlyOneEmojiRegex` thinks it's two emojis.

Our implementation of `onlyOneEmojiRegex` involves introspecting into the RegExp string that `emoji-regex` uses, and is not an officially supported approach by that package. `emoji-regex` supports matching emojis in text, and checking if the text includes only emoji. But checking for precisely one emoji is more complicated, and our approach (which is basically just extracting the raw RegExp and putting it inside of `/^()$/`) doesn't work in some scenarios where `U+FE0F` is suffixed.

Luckily we don't use the native macOS emoji selector in any of our UIs, but it does look like @Ginsu used it to select some of the emojis. The diff adds a unit test to make sure all of the default emojis pass `onlyOneEmojiRegex`, and fixes all failing emojis.

Test Plan: I noticed that a test username of `at4` would match up with an anchor emoji as the default, and the anchor emoji was failing to be set. After this diff everything worked.

Reviewers: ginsu, atul
Reviewed By: atul
Subscribers: tomek, ginsu
Differential Revision: https://phab.comm.dev/D8145
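The anchoring approach described in that summary, and the way a suffixed `U+FE0F` can break it, can be sketched roughly as follows. Everything here is an assumption for illustration: `standInEmoji` is a tiny stand-in for the real (much larger) `emoji-regex` pattern, `anchorWhole` and `tolerant` are hypothetical names, and the tolerant variant is just one possible workaround, not necessarily what the diff did.

```javascript
// Stand-in pattern for illustration only: matches the anchor emoji (U+2693)
// and, like the failing case described above, does not allow a trailing U+FE0F.
const standInEmoji = /\u2693/;

// Wrap a pattern's source in ^(?:...)$ so it must match the whole string.
// The global flag is dropped so that .test() stays stateless.
const anchorWhole = (re) =>
  new RegExp(`^(?:${re.source})$`, re.flags.replace('g', ''));

const strict = anchorWhole(standInEmoji);

// One possible workaround (an assumption): allow an optional trailing U+FE0F.
const tolerant = new RegExp(`^(?:${standInEmoji.source})\\uFE0F?$`, 'u');

console.log(strict.test('\u2693'));          // true
console.log(strict.test('\u2693\uFE0F'));    // false: the suffixed U+FE0F breaks the match
console.log(tolerant.test('\u2693\uFE0F'));  // true
console.log(tolerant.test('\u2693\u2693'));  // false: still rejects two emoji
```

An alternative with the same effect is to strip the trailing `U+FE0F` from the stored emoji before testing it against the strict pattern.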
Several emojis that can be inserted from the iOS / macOS emoji pickers are not recognized by the regex provided by this library, but the same emojis inserted from the Android emoji picker are recognized.
Runkit output:
I haven't dug into the Unicode to see what's happening just yet.