Some Unicode in identifiers improperly rejected #60214

Closed
JoshuaLandau mannequin opened this issue Sep 23, 2012 · 2 comments
Labels
interpreter-core (Objects, Python, Grammar, and Parser dirs)

Comments

@JoshuaLandau

JoshuaLandau mannequin commented Sep 23, 2012

BPO 16010
Nosy @bitdancer

Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.


GitHub fields:

assignee = None
closed_at = <Date 2012-09-24.01:40:36.990>
created_at = <Date 2012-09-23.23:45:22.252>
labels = ['interpreter-core', 'invalid']
title = 'Some Unicode in identifiers improperly rejected'
updated_at = <Date 2012-09-25.21:08:15.260>
user = 'https://bugs.python.org/JoshuaLandau'

bugs.python.org fields:

activity = <Date 2012-09-25.21:08:15.260>
actor = 'terry.reedy'
assignee = 'none'
closed = True
closed_date = <Date 2012-09-24.01:40:36.990>
closer = 'r.david.murray'
components = ['Interpreter Core']
creation = <Date 2012-09-23.23:45:22.252>
creator = 'Joshua.Landau'
dependencies = []
files = []
hgrepos = []
issue_num = 16010
keywords = []
message_count = 2.0
messages = ['171082', '171089']
nosy_count = 2.0
nosy_names = ['r.david.murray', 'Joshua.Landau']
pr_nums = []
priority = 'normal'
resolution = 'not a bug'
stage = 'resolved'
status = 'closed'
superseder = None
type = None
url = 'https://bugs.python.org/issue16010'
versions = ['Python 3.2']

@JoshuaLandau

JoshuaLandau mannequin commented Sep 23, 2012

"a¹ = None" is not valid, even though unicodedata.normalize("NFKC", "¹") == "1".

One would expect "a¹ = None" and "a1 = None" to be equivalent in this case, as with "aⁱ = None" and "ai = None".

I am not sure how many other characters exhibit the same problem.

References:
http://docs.python.org/py3k/reference/lexical_analysis.html#identifiers
http://mail.python.org/pipermail/python-list/2012-September/631420.html

"¹" == "\u00b9"
"ⁱ" == "\u2071"
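
For anyone wanting to reproduce the report, here is a minimal sketch using only the standard library. The helper name `compiles` and the variable `namespace` are made up for illustration and are not part of the original report.

```python
import unicodedata

SUPERSCRIPT_ONE = "\u00b9"  # ¹
SUPERSCRIPT_I = "\u2071"    # ⁱ

# Both characters fold to a plain ASCII character under NFKC.
print(unicodedata.normalize("NFKC", SUPERSCRIPT_ONE) == "1")  # True
print(unicodedata.normalize("NFKC", SUPERSCRIPT_I) == "i")    # True


def compiles(source):
    """Return True if `source` parses as Python code, False on SyntaxError."""
    try:
        compile(source, "<example>", "exec")
        return True
    except SyntaxError:
        return False


# The assignment with ¹ is rejected, the one with ⁱ is accepted.
print(compiles("a\u00b9 = None"))  # False
print(compiles("a\u2071 = None"))  # True

# The accepted identifier is stored under its NFKC-normalized name.
namespace = {}
exec("a\u2071 = 5", namespace)
print(namespace["ai"])  # 5
```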

@JoshuaLandau JoshuaLandau mannequin added the interpreter-core (Objects, Python, Grammar, and Parser dirs) label Sep 23, 2012
@bitdancer

I find it unexpected that aⁱ and ai name the same variable, but I suppose that is a consequence of the Unicode normalization rules (meaning that what I really find surprising is the normalization itself).

As for '¹', its Unicode general category is No (Number, other), which does not appear in the list of allowed categories in the identifiers section you link to, while 'ⁱ' is Lm (Letter, modifier), which does.

So there is no bug here.
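
The category check described above can be sketched as follows. The set `ALLOWED_CONTINUE` is a simplified copy of the general categories named in the identifiers section of the language reference; it leaves out the underscore and the `Other_ID_Start` / `Other_ID_Continue` properties, so treat it as an approximation and not as the interpreter's actual rule. The helper `describe` is a hypothetical name used only here.

```python
import unicodedata

# Simplified: general categories the reference lists for identifier
# continuation characters (assumption: extra properties are ignored).
ALLOWED_CONTINUE = {"Lu", "Ll", "Lt", "Lm", "Lo", "Nl", "Mn", "Mc", "Nd", "Pc"}


def describe(ch):
    """Return (category, category is in the allowed set, 'a' + ch is an identifier)."""
    category = unicodedata.category(ch)
    return category, category in ALLOWED_CONTINUE, ("a" + ch).isidentifier()


print(describe("\u00b9"))  # ('No', False, False)
print(describe("\u2071"))  # ('Lm', True, True)
```

`str.isidentifier()` applies the interpreter's own identifier rules, so it is the more reliable of the two checks; the category lookup only shows why the two characters are treated differently.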

@ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022