str.isidentifier() does not work with non-BMP non-canonicalized strings on Windows #84776
Comments
>>> import _testcapi
>>> u = '\U0001d580\U0001d593\U0001d58e\U0001d588\U0001d594\U0001d589\U0001d58a'
>>> u.isidentifier()
True
>>> _testcapi.unicode_legacy_string(u).isidentifier()
False
Maybe it's time to speed up the deprecation of the legacy C API that uses Py_UNICODE...
My previous change on this function: commit f3e7ea5
I am not sure that the changes in bpo-39500 were correct. It is easier to catch a bug if the code crashes consistently when you pass a non-canonicalized string than if it silently returns a wrong result for specific input on a particular platform. Alternatively, you could reimplement correct handling of surrogate pairs in PyUnicode_IsIdentifier().
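For readers unfamiliar with the surrogate-pair handling mentioned here, a minimal sketch in Python (not the C code in PyUnicode_IsIdentifier(), and not CPython's actual implementation) might look like the following. The helper names `utf16_units` and `combine_surrogates` are hypothetical and exist only for this illustration. The idea is that a legacy string on Windows stores non-BMP characters as two 16-bit code units, so each high/low pair has to be joined into one code point before the identifier check is applied.

```python
import struct


def utf16_units(s):
    """Return the UTF-16 code units of s as a list of ints (hypothetical helper)."""
    data = s.encode("utf-16-le")
    return list(struct.unpack("<%dH" % (len(data) // 2), data))


def combine_surrogates(units):
    """Yield code points, joining each valid high/low surrogate pair.

    Lone surrogates are passed through unchanged.
    """
    i = 0
    n = len(units)
    while i < n:
        ch = units[i]
        i += 1
        # A high surrogate followed by a low surrogate encodes one
        # code point above U+FFFF.
        if 0xD800 <= ch <= 0xDBFF and i < n and 0xDC00 <= units[i] <= 0xDFFF:
            ch = 0x10000 + ((ch - 0xD800) << 10) + (units[i] - 0xDC00)
            i += 1
        yield ch


# The same string as in the report above.
u = '\U0001d580\U0001d593\U0001d58e\U0001d588\U0001d594\U0001d589\U0001d58a'
units = utf16_units(u)                      # two 16-bit units per character
code_points = list(combine_surrogates(units))
joined = "".join(map(chr, code_points))     # canonical form, one char per code point
```

Checking the raw 16-bit units one by one would treat each surrogate as its own character, and a surrogate is not a valid identifier character; checking the joined code points gives the same answer as the canonical string.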
Thanks for the fix, Serhiy!