Description
Feature or enhancement
Proposal:
I am not a lawyer, and I know it would be strange to adopt something that is not a formal standard, but please read on.
Background
Around 2003-2004, the csv module was introduced into Python's standard library with PEP 305. When PEP 305 was proposed, the module's default CSV dialect, "excel," was defined as the CSV format exported by Excel 97 and Excel 2000. It was one of the module's two predefined dialects; the other was "excel-tab."
Things have changed a lot since then. In 2005, RFC 4180, a specification that is not a formal standard, was published. Around 2006, new software that would later be called "Google Sheets" was released. About one year later, Apple released new software called "Numbers."
Description
Today, the use of "excel" as the default dialect of Python's csv module, despite its historical origin, may be seen as non-neutral. In a more open and competitive world, there seems to be no reason to favor one specific product over Numbers, Google Sheets, LibreOffice Calc, or a publicly available specification.
Although excel is indeed a common English word that can be found in dictionaries, Python's use of it, as described above, and in PEP 305, is highly associated with a product or products of Microsoft.
Google, Apple, and users of their products could view it as a non-neutral act that favors or promotes a Microsoft product, or at least as a sign that this module is intended to be used with such a product, or that the CSV format is closely tied to such a product.
For ordinary users, the name could read as a false guarantee that this module is and always will be compatible with such a product.
Finally, it might be seen as not universal or not portable enough. Even if it is identical to RFC 4180, people would still think that it is specific to Excel rather than cross-platform. We have only three predefined dialects, with two of them being "excel" and one being "unix." Today, people would say, "It's so good. I can export and import data from Excel." Someday in the future, people may instead say, "What is an excel?"
When the csv module was introduced, it may have seemed logical to name the default dialect after a well-known product; twenty years later, this decision should be reviewed.
Twenty years later, which is more common, Python or Excel? Did Microsoft standardize the CSV format? Did they (Microsoft) publish a formal specification (of CSV) for us to follow? As developers of open source projects, should we link our projects to the name of a proprietary software, or that of a publicly available specification? Do governments of this world use RFC 4180, or "excel," or "unix," as their official CSV formats? Will Python continue to support the current and future versions of Microsoft products? (I mean Excel, not Windows.) If so, is the predefined "excel" dialect subject to changes, if Microsoft changes it tomorrow?
Solution
Create a distinct dialect object, called rfc4180, that strictly follows RFC 4180, and then make it the default. The specification, despite not being a standard, is the closest thing to a universal standard. There should be essentially no compatibility issues, as the new object would be almost identical to the excel dialect. This is more of a naming issue.
Alternatively, it can be renamed to default, which is more neutral and can mean anything.
Do the same with excel-tab. For excel and excel-tab, it would be better if the supported Excel versions were specified (and tested against).
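To make the first option concrete, here is a minimal sketch of what such a dialect might look like if defined in user code today. The class name rfc4180 and the registered name "rfc4180" are hypothetical and do not exist in the standard library; the attribute values are my reading of RFC 4180 (comma delimiter, double-quote escaping by doubling, CRLF line endings) and are only an assumption about what the final definition would be.

```python
import csv
import io


class rfc4180(csv.Dialect):
    """Hypothetical dialect based on RFC 4180; not part of the stdlib."""
    delimiter = ","
    quotechar = '"'
    doublequote = True          # a quote inside a field is written as ""
    skipinitialspace = False    # spaces are part of the field
    lineterminator = "\r\n"     # records end with CRLF
    quoting = csv.QUOTE_MINIMAL


# Register under a hypothetical name so it can be selected by string.
csv.register_dialect("rfc4180", rfc4180)

# Write one row with the new dialect.
buf = io.StringIO(newline="")
csv.writer(buf, dialect="rfc4180").writerow(["a", 'say "hi"', "x,y"])
output = buf.getvalue()

# Compare the formatting attributes against the existing excel dialect.
attrs = ("delimiter", "quotechar", "doublequote",
         "skipinitialspace", "lineterminator", "quoting")
same_as_excel = all(getattr(rfc4180, a) == getattr(csv.excel, a) for a in attrs)
```

In this sketch the formatting attributes match those of the existing excel dialect, which is why I expect the change to be mostly about naming rather than behavior.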
Has this already been discussed elsewhere?
No response given
Links to previous discussion of this feature:
No response