[Python] Add read_parquet, to_parquet and to_csv #6129
…ead_csv' changes to properly test roundtripping
…quet' - also add 'compression' option to this, and add it to accepted named parameter in 'parquet-extension.cpp' - in 'read_csv.cpp' it's also accepted but never checked for?
Looks like there are some non-std::moves in this PR - could you patch them out?
Somewhat related: I see a pattern where there are a bunch of read_csv, read_parquet, read_??? functions. Would it not make sense to have, at some point, generic file-ingestion functionality, something like a read() that takes the kind of file as a parameter? Then, on the other side, say you implement read_yourformat: it would be nice to be able to register that (either at compile time or at INSTALL time) into the generic read() handler. I was discussing something similar regarding [de]compress with @samansmink. I am unsure whether this has enough value to be considered (after some more thinking), probably at the library level so that it can then also be exposed to SQL / Python / other bindings.
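The registration idea in the comment above could be sketched roughly as follows. This is only an illustration in plain Python, not anything that exists in this PR or in the library: the names `register_reader`, `read`, and `_READERS` are hypothetical, and the two readers use the standard library rather than the real CSV or Parquet code paths.

```python
import csv
import json
from typing import Callable, Dict, List

# Hypothetical registry: maps a format name to a function that reads a path.
_READERS: Dict[str, Callable[[str], List[dict]]] = {}


def register_reader(fmt: str, reader: Callable[[str], List[dict]]) -> None:
    """Register a reader for a format name (case-insensitive)."""
    _READERS[fmt.lower()] = reader


def read(path: str, fmt: str) -> List[dict]:
    """Generic entry point: dispatch to whichever reader is registered for fmt."""
    try:
        reader = _READERS[fmt.lower()]
    except KeyError:
        raise ValueError(f"no reader registered for format {fmt!r}") from None
    return reader(path)


def read_csv(path: str) -> List[dict]:
    # Each row becomes a dict keyed by the header line; values stay strings.
    with open(path, newline="") as f:
        return list(csv.DictReader(f))


def read_json_lines(path: str) -> List[dict]:
    # One JSON object per non-empty line.
    with open(path) as f:
        return [json.loads(line) for line in f if line.strip()]


# Format-specific readers stay available and are also reachable through read().
register_reader("csv", read_csv)
register_reader("jsonl", read_json_lines)
```

With this shape, a new format only needs one `register_reader("yourformat", read_yourformat)` call, while the per-format functions remain usable directly.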
@carlopi I think there is value in that, but definitely in addition to methods like these.
Thanks!
This PR adds some methods ported from pandas to our Python API. It also deprecates the old write_csv method, making it instead an alias to to_csv. Also, write_parquet is added as an alias to to_parquet.
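As a rough illustration of the alias arrangement described above, here is a minimal pure-Python sketch. The `Relation` class and the module-level `read_csv` helper are hypothetical stand-ins built on the standard library. They are not the actual bindings, which are implemented in C++, and whether the real deprecated method emits a warning is not stated here, so none is shown.

```python
import csv
from typing import Iterable, List, Sequence, Tuple


class Relation:
    """Hypothetical stand-in for a query result; not the real API class."""

    def __init__(self, columns: Sequence[str], rows: Iterable[Sequence[str]]):
        self.columns: List[str] = list(columns)
        self.rows: List[Tuple[str, ...]] = [tuple(r) for r in rows]

    def to_csv(self, path: str) -> None:
        """Write the header and all rows to a CSV file (pandas-style name)."""
        with open(path, "w", newline="") as f:
            writer = csv.writer(f)
            writer.writerow(self.columns)
            writer.writerows(self.rows)

    # The old name is kept as a plain alias, so existing callers keep working.
    write_csv = to_csv


def read_csv(path: str) -> Relation:
    """Read a CSV file back into a Relation; all values come back as strings."""
    with open(path, newline="") as f:
        reader = csv.reader(f)
        header = next(reader)
        return Relation(header, reader)
```

Writing with either name and reading the file back gives a simple roundtrip check, in the spirit of the roundtripping tests mentioned in the commit messages.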