RegEx not matched due to extended Unicode characters #38

Closed
kapfab opened this issue Jul 26, 2017 · 2 comments

kapfab commented Jul 26, 2017

When a string contains extended Unicode characters (such as emoji), the RegEx is not applied to the entire string.
This usually goes unnoticed, but when you try to match something at the very end of a string containing emoji, you can spend hours hunting for a problem in your RegEx…

This is probably linked to the NSString/String mapping used for Swift RegEx functions, explained here:
https://stackoverflow.com/questions/29756530/swift-regex-matching-fails-when-source-contains-unicode-characters

Replacing the few occurrences of

    let range = GroupRange(location: 0, length: source.characters.count)

with

    let range = GroupRange(location: 0, length: (source as NSString).length)

seems to make everything work fine.

@kengruven

My understanding is that .characters (a String.CharacterView) counts extended grapheme clusters, while NSRegularExpression expects an NSRange measured in UTF-16 code units.

Using source.utf16.count should also work.
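
To make the mismatch concrete, here is a minimal sketch of the two range lengths side by side. It assumes Swift 4 or later, where `source.count` is the equivalent of `source.characters.count`, and it uses a plain `NSRange` rather than the library's `GroupRange`. The helper `matches(of:in:length:)` and the sample string are hypothetical, for illustration only.

```swift
import Foundation

// Hypothetical helper (not part of the library discussed in this issue):
// returns the substrings matched by `pattern` when the search range
// covers `length` UTF-16 code units starting at offset 0.
func matches(of pattern: String, in source: String, length: Int) -> [String] {
    guard let regex = try? NSRegularExpression(pattern: pattern) else { return [] }
    let range = NSRange(location: 0, length: length)
    return regex.matches(in: source, options: [], range: range).compactMap { result in
        // Convert the UTF-16 based NSRange back to String indices.
        Range(result.range, in: source).map { String(source[$0]) }
    }
}

let source = "🎉🎉🎉 done"

// Each 🎉 is one grapheme cluster but two UTF-16 code units.
let graphemeCount = source.count       // 8
let utf16Count = source.utf16.count    // 11

// With the grapheme count, the search range stops after the "d" of "done",
// so a pattern anchored at the end of the string finds nothing.
let truncated = matches(of: "done$", in: source, length: graphemeCount)

// With the UTF-16 count, the range covers the whole string.
let full = matches(of: "done$", in: source, length: utf16Count)

print(truncated)  // []
print(full)       // ["done"]
```

The further the emoji count grows, the more of the tail is cut off, which is why the problem only shows up for matches near the end of the string.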

@ypopovych
Member

Should be fixed now.
