Unicode control characters are not allowed as identifiers #49608
Comments
I tried to use a zero-width joiner (U+200D) as part of an identifier, which fails with: SyntaxError: invalid character in identifier. I have attached the Python file that produces this error. Zero-width joiner (U+200D) is a Unicode control character: |
Why do you think this is a bug? |
On a further look at this issue, I understood Python cannot use all [1] http://en.wikipedia.org/wiki/Zero-width_joiner |
I think RFC-3454 [1] can be used as a base for selecting the control |
Valid identifiers should begin with a letter or '_' and contain only Some examples:
>>> a－b = 5 # U+FF0D, Cat: Pd, FULLWIDTH HYPHEN-MINUS
SyntaxError: invalid character in identifier
>>> a＃ = 5 # U+FF03, Cat: Po, FULLWIDTH NUMBER SIGN
SyntaxError: invalid character in identifier
>>> a）b = 5 # U+FF09, Cat: Pe, FULLWIDTH RIGHT PARENTHESIS
SyntaxError: invalid character in identifier
>>> a＿b = 5 # U+FF3F, Cat: Pc, FULLWIDTH LOW LINE
>>> a＿b
5
>>> a﹍b﹎c﹏d = 5 # U+FE4D, U+FE4E, U+FE4F, Cat: Pc
>>> a﹍b﹎c﹏d
5 |
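The examples above can also be checked without typing the characters into the interpreter. This is a minimal sketch, assuming Python 3 and only the standard library (`unicodedata` and `str.isidentifier()`); the helper name `describe` is hypothetical and not part of the original report. Each character is appended to an ASCII letter so that it is tested as an identifier continuation, not as a start character.

```python
import unicodedata


def describe(ch):
    """Return (code point, general category, usable after 'a' in an identifier)."""
    # Prefix with an ASCII letter so the character is tested in a
    # continuation position rather than as the first character.
    return (f"U+{ord(ch):04X}", unicodedata.category(ch), ("a" + ch).isidentifier())


# Characters from the examples above, plus the zero-width joiner from the
# original report.
for ch in "\uFF0D\uFF03\uFF09\uFF3F\uFE4D\uFE4E\uFE4F\u200D":
    print(*describe(ch))
```

One detail this makes visible: `unicodedata` reports U+200D with general category Cf (format), not Cc (control).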
The definition of a word in the new re module (actually targeted at I suppose ideally we want the definitions of a word and an identifier to |
See PEP 3131 for the specification of what constitutes an identifier in Python. Closing this as "won't fix". |
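For readers who want to see roughly what that rule looks like, here is a simplified sketch of the category-based definition in PEP 3131, assuming Python 3 and the standard `unicodedata` module. It is an approximation, not the interpreter's implementation: it ignores the Other_ID_Start and Other_ID_Continue properties and the XID closure adjustments, and it normalizes before checking. The function name `looks_like_identifier` and the two set names are hypothetical. For real code, `str.isidentifier()` is the authoritative check.

```python
import unicodedata

# General categories PEP 3131 allows at the start of an identifier
# (the underscore is handled separately below).
START_CATEGORIES = {"Lu", "Ll", "Lt", "Lm", "Lo", "Nl"}
# Additional categories allowed after the first character.
CONTINUE_CATEGORIES = START_CATEGORIES | {"Mn", "Mc", "Nd", "Pc"}


def looks_like_identifier(name):
    """Approximate the PEP 3131 identifier rule using general categories only."""
    # Identifiers are compared in NFKC form, so normalize first
    # (a simplification of what the parser does).
    normalized = unicodedata.normalize("NFKC", name)
    if not normalized:
        return False
    first, rest = normalized[0], normalized[1:]
    if first != "_" and unicodedata.category(first) not in START_CATEGORIES:
        return False
    return all(unicodedata.category(c) in CONTINUE_CATEGORIES for c in rest)


for candidate in ["a\uFF3Fb", "a\uFF0Db", "a\u200Db", "_x", "1a"]:
    # Print the approximation next to the built-in check for comparison.
    print(ascii(candidate), looks_like_identifier(candidate), candidate.isidentifier())
```

Under this rule the zero-width joiner is rejected because its category (Cf) is in neither set, which matches the behavior reported in this issue.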