"utf8" not always a synonym for "utf-8" in lib2to3 #83335

Closed
PeterLudemann mannequin opened this issue Dec 29, 2019 · 2 comments
Labels
3.7 (EOL): end of life; 3.8: only security fixes; 3.9: only security fixes; topic-2to3; type-bug: An unexpected behavior, bug, or error

Comments

PeterLudemann mannequin commented Dec 29, 2019

BPO 39154
Nosy @benjaminp, @ezio-melotti

Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

GitHub fields:

assignee = None
closed_at = <Date 2021-10-20.23:06:02.911>
created_at = <Date 2019-12-29.17:42:10.169>
labels = ['3.7', '3.8', 'type-bug', 'expert-2to3', '3.9']
title = '"utf8" not always a synonym for "utf-8" in lib2to3'
updated_at = <Date 2021-10-20.23:06:02.910>
user = 'https://bugs.python.org/PeterLudemann'

bugs.python.org fields:

activity = <Date 2021-10-20.23:06:02.910>
actor = 'iritkatriel'
assignee = 'none'
closed = True
closed_date = <Date 2021-10-20.23:06:02.911>
closer = 'iritkatriel'
components = ['2to3 (2.x to 3.x conversion tool)']
creation = <Date 2019-12-29.17:42:10.169>
creator = 'Peter Ludemann'
dependencies = []
files = []
hgrepos = []
issue_num = 39154
keywords = []
message_count = 2.0
messages = ['358995', '359024']
nosy_count = 3.0
nosy_names = ['benjamin.peterson', 'ezio.melotti', 'Peter Ludemann']
pr_nums = []
priority = 'normal'
resolution = 'wont fix'
stage = 'resolved'
status = 'closed'
superseder = None
type = 'behavior'
url = 'https://bugs.python.org/issue39154'
versions = ['Python 3.7', 'Python 3.8', 'Python 3.9']

@PeterLudemann PeterLudemann mannequin added topic-unicode 3.7 (EOL) end of life 3.8 only security fixes type-bug An unexpected behavior, bug, or error labels Dec 29, 2019
PeterLudemann mannequin commented Dec 29, 2019

lib2to3.tokenize should accept 'utf8' and 'utf-8' interchangeably, to be consistent with the rest of the Python library. (I looked through the library source: there seems to be no consistent preference, and many, but not all, checks for 'utf-8' also check for 'utf8'.) In particular, tokenize.detect_encoding should handle both forms, since the encoding can be set by the user. The code should also allow 'UTF8' and 'UTF-8'.

See also https://bugs.python.org/issue39154

(This is probably a larger issue than just lib2to3, as a quick grep through /usr/lib/python3.7 showed, but I am not sure how best to address that.)
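
For illustration, here is a minimal sketch of what "interchangeable" means at the codec-registry level. It uses only the standard `codecs.lookup` call; the helper name `same_codec` is hypothetical and is not part of lib2to3 or of any proposed patch.

```python
import codecs


def same_codec(name_a: str, name_b: str) -> bool:
    """Return True if both spellings resolve to the same registered codec.

    Hypothetical helper for illustration only.
    """
    return codecs.lookup(name_a).name == codecs.lookup(name_b).name


# All of these spellings are resolved by the codec registry to one codec.
spellings = ["utf8", "utf-8", "UTF8", "UTF-8", "utf_8"]
canonical = {codecs.lookup(s).name for s in spellings}

print(canonical)
print(same_codec("latin1", "latin-1"))
```

Comparing the names returned by the registry, rather than the raw strings supplied by the user, avoids having to enumerate every accepted spelling by hand.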

@PeterLudemann PeterLudemann mannequin added topic-2to3 and removed topic-unicode labels Dec 29, 2019
@PeterLudemann PeterLudemann mannequin changed the title "utf8-sig" missing from codecs (inconsistency) "utf8" not always a synonym for "utf-8" in lib2to3 Dec 29, 2019
PeterLudemann mannequin commented Dec 30, 2019

To clarify and fix a typo: lib2to3.pgen2.tokenize.detect_encoding checks for 'utf-8' (and 'utf_8') but not 'utf8' in various places. The same applies to 'latin-1' and 'latin1'. (The codecs documentation page allows 'utf8' and 'latin1' as codec names.)

['UTF-8' is taken care of in _get_normal_name]

See also https://bugs.python.org/issue39155
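
The gap described above can be sketched as follows. This is not the lib2to3 source: `naive_is_utf8` is a hypothetical, simplified stand-in for a check that folds case and underscores and then compares strings, and `lookup_is_utf8` is one possible alternative that asks the codec registry instead.

```python
import codecs


def naive_is_utf8(name: str) -> bool:
    """Hypothetical stand-in: fold case and underscores, then compare strings.

    This catches 'UTF-8' and 'utf_8' but misses the hyphen-less 'utf8'.
    """
    return name.lower().replace("_", "-") == "utf-8"


def lookup_is_utf8(name: str) -> bool:
    """Alternative sketch: let the codec registry resolve the alias."""
    try:
        return codecs.lookup(name).name == "utf-8"
    except LookupError:
        # Unknown encoding names are simply treated as "not UTF-8" here.
        return False


for spelling in ("utf-8", "UTF-8", "utf_8", "utf8", "UTF8"):
    print(f"{spelling!r}: naive={naive_is_utf8(spelling)} "
          f"lookup={lookup_is_utf8(spelling)}")
```

Under these assumptions, the string comparison and the registry lookup agree on every spelling except the ones without a separator, which is the inconsistency this issue reports.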

@terryjreedy terryjreedy added 3.9 only security fixes labels Jan 4, 2020
@ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022