Support \u and \U escapes in regexes #47915
Comments
Since \u and \U aren't interpolated in raw strings anymore, the re
(In the last sentence, I meant UCS2. Sorry.)
These concerns indeed must be handled: On narrow unicode builds, chars > Additionally, this should at least raise an error too:
>>> re.compile("[\U00100000]").match("\U00100000").group()
'\udbc0'
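For comparison, here is a minimal sketch of the same character-class match on a current interpreter. It assumes Python 3.3 or later, where PEP 393 removed the narrow/wide build distinction; the variable names are illustrative only and do not come from the patch.

```python
import re

# Assumption: Python 3.3+ (PEP 393 strings, no narrow builds).
ch = "\U00100000"

# Same expression as in the report above, with the result kept in a variable.
matched = re.compile("[\U00100000]").match(ch).group()

# The whole code point is matched, not just a leading surrogate.
print(len(matched), matched == ch)
```

On such an interpreter the match covers the full character, so the half-surrogate result shown in the report should not occur.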
Here's an updated patch for the py3k branch.
FYI, this can be written simply as:
+ raise error("bogus escape: %r" % escape)
I don't think it is worth targeting 2.7 and 3.2 (it's a new feature, not a bug fix), but it will be very useful for 3.3. Since PEP 393, conversion to surrogate pairs is no longer relevant.
Georg, Atsuo, how are you?
Here is an updated patch (conforming to PEP 393). In addition, the handling of octal and hexadecimal escapes has been cleaned up, and the incorrect error message for hexadecimal escapes has been fixed. New tests for octal and hexadecimal escapes have been added.
I forgot about byte patterns. Here is an updated patch.
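To illustrate why byte patterns need separate handling, here is a small sketch of how current interpreters treat the escapes in each pattern type. It assumes Python 3.6 or later, where unknown letter escapes are errors; the helper `compiles` is hypothetical and exists only for this example.

```python
import re

def compiles(pattern):
    """Return True if re.compile accepts the pattern, False on re.error."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False

# Assumption: Python 3.6+.
text_ok = compiles(r"\u0041")        # str pattern: \u is recognized
bytes_ok = compiles(br"\u0041")      # bytes pattern: \u is rejected
hex_bytes_ok = compiles(br"\x41")    # bytes pattern: \x remains valid

print(text_ok, bytes_ok, hex_bytes_ok)
```

The `\x` escape stays available for bytes patterns because it always denotes a single byte value, whereas `\u` and `\U` name code points that have no meaning in a byte string.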
Is there any chance of committing the patch today and getting this feature into Python 3.3?
New changeset b1dbd8827e79 by Antoine Pitrou in branch 'default':
Thanks for reminding us! It's now in 3.3.
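As a brief illustration of the feature as it behaves on Python 3.3 and later, the sketch below uses both escape forms in raw-string patterns. The variable names are illustrative and are not taken from the committed patch.

```python
import re

# In a raw string the backslash sequence reaches the re module unchanged,
# so it is the regex parser that interprets the escape.
euro = re.compile(r"\u20ac")            # four hex digits
clef = re.compile(r"[\U0001d11e]")      # eight hex digits, outside the BMP

euro_match = euro.match("\u20ac")
clef_match = clef.match("\U0001d11e")

print(euro_match is not None, clef_match is not None)
```

Before this change, a raw-string pattern such as `r"\u20ac"` would not have been read as the euro sign, which is the gap the issue title describes.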
Thank you for the quick response.
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.