Uniform file IO API and consolidated codebase #15008
There are at least three things that many of the IO methods must deal with: reading from a URL, reading from or writing to a compressed format, and handling different text encodings. It would be great if all IO functions where these factors are relevant could use the same code (consolidated codebase) and expose the same options (uniform API).
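To make the idea concrete, here is a minimal sketch of what a single shared entry point could look like. It is not pandas' actual implementation: the names `infer_compression` and `open_text`, their parameters, and the set of supported formats are all hypothetical and chosen only for illustration.

```python
import bz2
import gzip
import io
from urllib.parse import urlparse
from urllib.request import urlopen

# Hypothetical lookup tables; a real implementation would cover more formats.
_OPENERS = {"gzip": gzip.open, "bz2": bz2.open}
_EXTENSIONS = {".gz": "gzip", ".bz2": "bz2"}
_URL_SCHEMES = ("http", "https", "ftp")


def infer_compression(path, compression="infer"):
    """Return a compression name, or None, for the given path or URL."""
    if compression != "infer":
        return compression
    for extension, name in _EXTENSIONS.items():
        if path.endswith(extension):
            return name
    return None


def open_text(path_or_url, mode="r", compression="infer", encoding="utf-8"):
    """Open a local path or URL as text; mode is "r" or "w" (hypothetical helper)."""
    compression = infer_compression(path_or_url, compression)
    is_url = urlparse(path_or_url).scheme in _URL_SCHEMES

    if is_url:
        if mode != "r":
            raise ValueError("URLs can only be opened for reading")
        # Download the whole payload so it can be decompressed like a file.
        source = io.BytesIO(urlopen(path_or_url).read())
    else:
        source = path_or_url

    if compression is not None:
        # gzip.open and bz2.open accept either a path or a binary file object.
        binary = _OPENERS[compression](source, mode + "b")
    elif is_url:
        binary = source
    else:
        binary = open(source, mode + "b")

    # Text decoding is layered on last, so it is identical for every source.
    return io.TextIOWrapper(binary, encoding=encoding)
```

With a helper along these lines, every reader and writer could accept the same `compression` and `encoding` options and delegate to one code path, for example `open_text("data.csv.gz", encoding="latin-1")`.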
In #14576, we consolidated the codebase but more consolidation is possible. In
Currently, pandas supports the following io methods. First for reading:
And then for writing:
Some of these should definitely use the consolidated/uniform API, such as
Some functions perhaps should be kept separate, such as
referenced this issue
Dec 29, 2016
Here are my thoughts on the API.
Regarding the consolidated codebase:
That sounds great!
Regarding the py2/py3 separation, I think we should just do what is most practical here (some separation makes the code clearer, but too much separation can make it more complex again). In any case, having a few but scattered
One more consolidation that would be possible for
Let's wait for #13317 and any other IO PRs that I don't know about to be merged. I'm hesitant to commit since I know it will cut into my other obligations. But if no one else is interested in implementing, I'll consider.
Totally agree. There are still a few things I need to understand before I can make that call. One issue is
It seems better to split _get_handle into two or more functions so that each individual function is simpler.