unicode name accepts a punctuation glyph #60453

Closed
julientayon mannequin opened this issue Oct 16, 2012 · 4 comments
Labels
interpreter-core (Objects, Python, Grammar, and Parser dirs) type-bug An unexpected behavior, bug, or error

Comments

@julientayon
Mannequin

julientayon mannequin commented Oct 16, 2012

BPO 16249
Nosy @ezio-melotti, @bitdancer

Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.


GitHub fields:

assignee = 'https://github.com/ezio-melotti'
closed_at = <Date 2012-10-16.16:02:24.509>
created_at = <Date 2012-10-16.15:32:35.472>
labels = ['interpreter-core', 'type-bug', 'invalid']
title = 'unicode name accepts a punctuation glyph'
updated_at = <Date 2012-10-16.16:02:24.507>
user = 'https://bugs.python.org/julientayon'

bugs.python.org fields:

activity = <Date 2012-10-16.16:02:24.507>
actor = 'ezio.melotti'
assignee = 'ezio.melotti'
closed = True
closed_date = <Date 2012-10-16.16:02:24.509>
closer = 'ezio.melotti'
components = ['Interpreter Core']
creation = <Date 2012-10-16.15:32:35.472>
creator = 'julien.tayon'
dependencies = []
files = []
hgrepos = []
issue_num = 16249
keywords = []
message_count = 4.0
messages = ['173049', '173052', '173055', '173056']
nosy_count = 3.0
nosy_names = ['ezio.melotti', 'r.david.murray', 'julien.tayon']
pr_nums = []
priority = 'normal'
resolution = 'not a bug'
stage = 'resolved'
status = 'closed'
superseder = None
type = 'behavior'
url = 'https://bugs.python.org/issue16249'
versions = ['Python 3.2']

@julientayon
Mannequin Author

julientayon mannequin commented Oct 16, 2012

I assume Unicode variable names are restricted to letters, and that symbols and punctuation should be rejected (except _).

I have tested other dots (punctuation); they don't work.

Only this one, oddly enough, has worked so far:
http://www.fileformat.info/info/unicode/char/00b7/index.htm

$ python3.2
Python 3.2.3 (default, Sep 10 2012, 18:14:40) 
[GCC 4.6.3] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> foo⋅bar=42
  File "<stdin>", line 1
    foo⋅bar=42
            ^
SyntaxError: invalid character in identifier
>>> print(ord("foo⋅bar"[3]))
8901
>>> foo·bar = 42
>>> print(ord("foo·bar"[3]))
183

I have sampled other characters at random from the same block as MIDDLE DOT, and they seem to behave correctly.
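
The session above can also be reproduced without typing the characters at the prompt. This is a minimal sketch, assuming Python 3, that uses `str.isidentifier()` to ask the running interpreter whether each name is acceptable; the constant names are only illustrative.

```python
import unicodedata

DOT_OPERATOR = "\u22c5"  # the character with code point 8901 in the session above
MIDDLE_DOT = "\u00b7"    # the character with code point 183 in the session above

for ch in (DOT_OPERATOR, MIDDLE_DOT):
    name = "foo" + ch + "bar"
    # Print the code point, the official character name, and whether
    # the interpreter accepts the resulting string as an identifier.
    print(ord(ch), unicodedata.name(ch), name.isidentifier())
```

Only the second name should be reported as a valid identifier, matching the SyntaxError seen for the first one.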

@julientayon julientayon mannequin added interpreter-core (Objects, Python, Grammar, and Parser dirs) type-bug An unexpected behavior, bug, or error labels Oct 16, 2012
@bitdancer
Member

The rules for Python identifiers are documented here:

http://docs.python.org/dev/reference/lexical_analysis.html#identifiers

Are you saying that the behavior does not match the documentation?

@julientayon
Mannequin Author

julientayon mannequin commented Oct 16, 2012

http://www.fileformat.info/info/unicode/char/b7/index.htm

The Unicode category is Po (Other Punctuation).

Empirically, it cannot start a variable name, so according to the rules given in the lexical analysis documentation its category should be one of: Mn, Mc, Nd, Pc.

That is not the case: Po is not in [Mn, Mc, Nd, Pc].

Modulo my weak brain, it does not seem right.
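
The reasoning in this comment can be checked with the standard `unicodedata` module. This is a sketch only: `CONTINUE_CATEGORIES` is a hypothetical name for the four categories listed above, not the complete rule from the documentation.

```python
import unicodedata

# The four general categories quoted in the comment above (illustrative name).
CONTINUE_CATEGORIES = {"Mn", "Mc", "Nd", "Pc"}
middle_dot = "\u00b7"

category = unicodedata.category(middle_dot)
print(category)                               # general category of MIDDLE DOT
print(category in CONTINUE_CATEGORIES)        # would the category alone allow it?
print((middle_dot + "foo").isidentifier())    # MIDDLE DOT as the first character
print(("foo" + middle_dot).isidentifier())    # MIDDLE DOT after the first character
```

If the category list were the whole rule, the last line would be expected to print False.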

@ezio-melotti
Member

The characters with the Other_ID_Continue property are also included, i.e.:

00B7 ; Other_ID_Continue # Po MIDDLE DOT
0387 ; Other_ID_Continue # Po GREEK ANO TELEIA
1369..1371 ; Other_ID_Continue # No [9] ETHIOPIC DIGIT ONE..ETHIOPIC DIGIT NINE
19DA ; Other_ID_Continue # No NEW TAI LUE THAM DIGIT ONE

See http://unicode.org/Public/UNIDATA/PropList.txt
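
As far as I can tell, the `unicodedata` module does not expose the Other_ID_Continue property directly, so the sketch below hard-codes the code points quoted above (the name `OTHER_ID_CONTINUE` is an assumption for illustration; newer Unicode versions may list more entries) and checks how each one behaves in a name.

```python
import unicodedata

# Code points copied from the PropList.txt excerpt above; 1369..1371 is inclusive.
OTHER_ID_CONTINUE = [0x00B7, 0x0387, *range(0x1369, 0x1372), 0x19DA]

for cp in OTHER_ID_CONTINUE:
    ch = chr(cp)
    print(
        f"U+{cp:04X}",
        unicodedata.category(ch),   # general category (Po or No in the excerpt)
        ("x" + ch).isidentifier(),  # allowed after the first character?
        ch.isidentifier(),          # allowed as the first character?
    )
```

Each character should be accepted in a continuation position and rejected as a leading character, which is the behaviour reported for MIDDLE DOT.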

@ezio-melotti ezio-melotti self-assigned this Oct 16, 2012
@ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022