Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.
assignee = None
closed_at = <Date 2021-08-06.20:34:41.107>
created_at = <Date 2016-08-13.06:49:56.128>
labels = ['3.11', 'type-bug', 'library', 'docs']
title = "[doc] CSV DictReader default dialect name 'excel' is misleading, as MS Excel doesn't actually use ',' as a separator."
updated_at = <Date 2021-08-06.20:34:41.106>
user = 'https://github.com/lockywolf'
I wanted to use the csv module to load CSV files, and the documentation says that the default dialect for reading them is 'excel'.
However, the delimiter used with this dialect in Python is a comma (','), whereas in fact (even though it's called _comma_-separated values) MS Excel (2016) uses a semicolon (';') as a delimiter.
Therefore, Python's 'excel' dialect doesn't actually read Excel-generated files.
Excel's behaviour has always been locale-dependent. If the user's locale uses , as the decimal mark, then ; is used as the column separator in "C"SV. However, even if you use autodetection with sniff, the delimiter cannot be detected with 100% accuracy; e.g., is the following CSV row comma- or semicolon-separated:
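To make the ambiguity concrete, here is a minimal sketch. The row below is a hypothetical one picked for illustration, not necessarily the row from the comment above, and `parse` is just a helper name I made up. Both delimiters give a perfectly valid parse, so nothing in the data itself tells you which one was intended.

```python
import csv
import io

# Hypothetical row, chosen only to illustrate the ambiguity.
row = "1,5;2,5\n"


def parse(text, delimiter):
    """Parse a single CSV line with the given delimiter (illustrative helper)."""
    return next(csv.reader(io.StringIO(text), delimiter=delimiter))


as_comma = parse(row, ",")      # treats ',' as the column separator
as_semicolon = parse(row, ";")  # treats ';' as the separator, ',' as a decimal mark

print(as_comma)      # ['1', '5;2', '5']
print(as_semicolon)  # ['1,5', '2,5']
```

Either result could be what the file's author meant, which is why no heuristic can be fully reliable here.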
The dialect could be documented better though, as currently it simply says:
The excel class defines the usual properties of an Excel-generated CSV file. It is registered with the dialect name 'excel'.
And there really should be a separate dialect for Excel's semicolon-separated values, as a couple of billion people see ; in their CSV files.
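Until something like that exists in the standard library, a user can register such a dialect themselves. A rough sketch, assuming the name 'excel-semicolon' and the class name `ExcelSemicolon` (both invented here; the csv module does not ship them):

```python
import csv
import io


class ExcelSemicolon(csv.excel):
    """Hypothetical dialect: identical to csv.excel except for the delimiter."""
    delimiter = ";"


# 'excel-semicolon' is an assumed name, not a dialect provided by the csv module.
csv.register_dialect("excel-semicolon", ExcelSemicolon)

# Sample data as a semicolon-locale Excel export might look (made up for this example).
data = "name;price\r\nwidget;1,50\r\n"
rows = list(csv.DictReader(io.StringIO(data), dialect="excel-semicolon"))

print(rows[0]["price"])  # 1,50
```

The decimal comma stays a plain string; converting it to a number is a separate, locale-specific step.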
If you need semicolon delimiters, can't you just pass delimiter=';' to the reader or writer? I don't think there's a need for a separate dialect class for that, since dialect classes should only provide a baseline for the most broad use cases. Users have plenty of options for extending or customizing behavior without adding more dialect classes.
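For reference, a minimal sketch of that approach. The sample data is made up; the keyword argument overrides only the delimiter, and everything else still comes from the default 'excel' dialect.

```python
import csv
import io

# Made-up sample data with ';' as the column separator.
data = "name;price\r\nwidget;1,50\r\n"

# Reading: override just the delimiter on top of the default 'excel' dialect.
rows = list(csv.DictReader(io.StringIO(data), delimiter=";"))

# Writing: the same keyword works for writers.
out = io.StringIO()
writer = csv.writer(out, delimiter=";")
writer.writerow(["name", "price"])
writer.writerow(["widget", "1,50"])
written = out.getvalue()

print(rows)
print(repr(written))
```

When working with real files rather than `io.StringIO`, the csv documentation recommends opening them with `newline=''`.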
I also think the docs around dialects are confusing. I remember being confused by them when I was learning! I made quite a few changes to try to add clarity around dialects to the documentation. Let me know if anybody has feedback!