csv.DictReader.fieldnames interprets unicode as ascii #84057

Closed
sparan mannequin opened this issue Mar 6, 2020 · 2 comments

@sparan

sparan mannequin commented Mar 6, 2020

BPO 39876
Nosy @serhiy-storchaka

Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.


GitHub fields:

assignee = None
closed_at = <Date 2020-03-06.15:14:24.024>
created_at = <Date 2020-03-06.13:41:31.913>
labels = ['3.8', 'build', 'library', 'invalid']
title = 'csv.DictReader.fieldnames interprets unicode as ascii'
updated_at = <Date 2020-03-06.15:14:24.022>
user = 'https://bugs.python.org/sparan'

bugs.python.org fields:

activity = <Date 2020-03-06.15:14:24.022>
actor = 'serhiy.storchaka'
assignee = 'none'
closed = True
closed_date = <Date 2020-03-06.15:14:24.024>
closer = 'serhiy.storchaka'
components = ['Library (Lib)']
creation = <Date 2020-03-06.13:41:31.913>
creator = 'sparan'
dependencies = []
files = []
hgrepos = []
issue_num = 39876
keywords = []
message_count = 2.0
messages = ['363506', '363522']
nosy_count = 2.0
nosy_names = ['serhiy.storchaka', 'sparan']
pr_nums = []
priority = 'normal'
resolution = 'not a bug'
stage = 'resolved'
status = 'closed'
superseder = None
type = 'compile error'
url = 'https://bugs.python.org/issue39876'
versions = ['Python 3.8']

@sparan

sparan mannequin commented Mar 6, 2020

import csv

with open(filename, "rt") as csvfile:
    csv_reader = csv.DictReader(csvfile, delimiter=csv_delimiter)
    fieldnames = csv_reader.fieldnames

In Python 3.8, csv expects UTF-8 encoded files but apparently doesn't read the header as UTF-8.
If the csv file has a header named 'Französisch', it will be saved as 'FranzÃ¶sisch'.

@sparan sparan mannequin added 3.8 only security fixes stdlib Python modules in the Lib dir build The build process and cross-build labels Mar 6, 2020
@serhiy-storchaka
Member

csv.DictReader does not work with encodings. It works with already decoded strings.

You have to specify the correct encoding in open() (the "t" in the mode is the default and can be omitted):

with open(filename, "r", encoding="utf-8") as csvfile:

By default, open() uses the locale encoding, which is likely 'cp1252' on American and Western European Windows.
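
To illustrate the point, here is a minimal sketch that reproduces the garbled header and the fix. It assumes the file on disk is UTF-8 encoded and that the locale encoding is cp1252, which is passed explicitly here so the result does not depend on the machine. The helper name read_fieldnames and the sample data are hypothetical, not taken from the report.

```python
import csv
import locale
import os
import tempfile


def read_fieldnames(path, encoding, delimiter=","):
    """Return the CSV header row, decoding the file with the given encoding."""
    # newline="" is what the csv module documentation recommends for file objects.
    with open(path, "r", encoding=encoding, newline="") as csvfile:
        return csv.DictReader(csvfile, delimiter=delimiter).fieldnames


# Write a small UTF-8 encoded sample file (hypothetical data).
with tempfile.NamedTemporaryFile(
    "w", encoding="utf-8", newline="", suffix=".csv", delete=False
) as tmp:
    tmp.write("Französisch,Englisch\r\nbonjour,hello\r\n")
    sample_path = tmp.name

try:
    # Decoding UTF-8 bytes as cp1252 mimics open() without an encoding argument
    # on a system whose locale encoding is cp1252.
    wrong = read_fieldnames(sample_path, "cp1252")
    # Passing the file's real encoding yields the intended header.
    right = read_fieldnames(sample_path, "utf-8")
finally:
    os.remove(sample_path)

print(wrong)  # ['FranzÃ¶sisch', 'Englisch']
print(right)  # ['Französisch', 'Englisch']

# The encoding open() falls back to when none is given (machine-dependent).
print(locale.getpreferredencoding(False))
```

csv.DictReader only ever sees strings that open() has already decoded, so the encoding has to be chosen at the open() call rather than in the csv module.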

@ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022