Some Unicode in identifiers improperly rejected #60214

JoshuaLandau · 2012-09-23T23:45:22Z

BPO	16010
Nosy	@bitdancer

Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.


GitHub fields:

assignee = None
closed_at = <Date 2012-09-24.01:40:36.990>
created_at = <Date 2012-09-23.23:45:22.252>
labels = ['interpreter-core', 'invalid']
title = 'Some Unicode in identifiers improperly rejected'
updated_at = <Date 2012-09-25.21:08:15.260>
user = 'https://bugs.python.org/JoshuaLandau'

bugs.python.org fields:

activity = <Date 2012-09-25.21:08:15.260>
actor = 'terry.reedy'
assignee = 'none'
closed = True
closed_date = <Date 2012-09-24.01:40:36.990>
closer = 'r.david.murray'
components = ['Interpreter Core']
creation = <Date 2012-09-23.23:45:22.252>
creator = 'Joshua.Landau'
dependencies = []
files = []
hgrepos = []
issue_num = 16010
keywords = []
message_count = 2.0
messages = ['171082', '171089']
nosy_count = 2.0
nosy_names = ['r.david.murray', 'Joshua.Landau']
pr_nums = []
priority = 'normal'
resolution = 'not a bug'
stage = 'resolved'
status = 'closed'
superseder = None
type = None
url = 'https://bugs.python.org/issue16010'
versions = ['Python 3.2']

JoshuaLandau · 2012-09-23T23:45:21Z

"a¹ = None" is not valid, even though unicodedata.normalize("NFKC", "¹") == "1".

One would expect "a¹ = None" and "a1 = None" to be equivalent in this case, as with "aⁱ = None" and "ai = None".

I am not sure how many other characters exhibit the same problem.
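
The behaviour described above can be reproduced with a short sketch. It uses only standard-library calls (`unicodedata.normalize`, `compile`, `exec`); the helper name `accepts` is made up for illustration, and a Python 3 interpreter is assumed.

```python
import unicodedata

SUPERSCRIPT_ONE = "\u00b9"  # ¹
SUPERSCRIPT_I = "\u2071"    # ⁱ


def accepts(source):
    """Return True if `source` compiles (hypothetical helper, for illustration only)."""
    try:
        compile(source, "<example>", "exec")
    except SyntaxError:
        return False
    return True


# Both characters normalize to a plain ASCII character under NFKC.
print(unicodedata.normalize("NFKC", SUPERSCRIPT_ONE))
print(unicodedata.normalize("NFKC", SUPERSCRIPT_I))

# Only the second character is accepted inside an identifier.
print(accepts("a" + SUPERSCRIPT_ONE + " = None"))
print(accepts("a" + SUPERSCRIPT_I + " = None"))

# The accepted identifier is stored under its NFKC-normalized name.
namespace = {}
exec("a" + SUPERSCRIPT_I + " = None", namespace)
print("ai" in namespace)
```

The `exec` step is what shows that the two spellings name the same variable: the assignment is written with "ⁱ" but the key that ends up in the namespace is the normalized "ai".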

References:
http://docs.python.org/py3k/reference/lexical_analysis.html#identifiers
http://mail.python.org/pipermail/python-list/2012-September/631420.html

"¹" === "\u00b9"
"ⁱ" === "\u2071"

bitdancer · 2012-09-24T01:40:37Z

I find it unexpected that aⁱ and ai name the same variable, but I suppose that is a consequence of the Unicode normalization rules (meaning what I really find surprising is the normalization itself).

As for the '¹', its Unicode general category is No (Number, other), which does not appear in the list of categories in the identifiers section you link to, while 'ⁱ' is Lm (Letter, modifier), which does.

So there is no bug here.
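
As a rough illustration of the category check, here is a minimal sketch using `unicodedata.category`. The set `ID_CONTINUE_CATEGORIES` and the function `category_allows_continue` are names invented for this example. The set copies the general categories that the identifiers section lists for `id_continue`, and it deliberately ignores the `Other_ID_Start` / `Other_ID_Continue` properties and the NFKC closure requirement, so it is only an approximation of what the tokenizer does.

```python
import unicodedata

# General categories listed for id_continue in the language reference.
# Simplification (assumption of this sketch): Other_ID_Start,
# Other_ID_Continue and the NFKC closure check are not modelled.
ID_CONTINUE_CATEGORIES = {
    "Lu", "Ll", "Lt", "Lm", "Lo", "Nl",  # letters and letter numbers
    "Mn", "Mc", "Nd", "Pc",              # marks, decimal digits, connectors
}


def category_allows_continue(ch):
    """Return True if the general category of `ch` is in the listed set."""
    return unicodedata.category(ch) in ID_CONTINUE_CATEGORIES


# Compare the two superscripts with their plain ASCII counterparts.
for ch in ("\u00b9", "\u2071", "1", "i"):
    print(repr(ch), unicodedata.category(ch), category_allows_continue(ch))
```

The check is applied to the character as written in the source, not to its NFKC form, which is why '¹' is rejected even though it normalizes to '1'.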

JoshuaLandau mannequin added the interpreter-core (Objects, Python, Grammar, and Parser dirs) label Sep 23, 2012

bitdancer closed this as completed Sep 24, 2012

terryjreedy added the invalid label Sep 25, 2012

ezio-melotti transferred this issue from another repository Apr 10, 2022
