Unify `read` and `scan` functions
#13040
Labels
A-io
Area: reading and writing data
accepted
Ready for implementation
enhancement
New feature or an improvement of an existing feature
Comments
This was referenced Dec 14, 2023
Here are a few open, unaccepted issues that should be addressed during the harmonization of |
👍 - was about to ask why there is no On that note - a small documentation recommendation - I think it's time to add sub-headers to the IO functions, grouped by type:
csv:
- [polars.read_csv](https://docs.pola.rs/py-polars/html/reference/api/polars.read_csv.html)
- [polars.read_csv_batched](https://docs.pola.rs/py-polars/html/reference/api/polars.read_csv_batched.html)
- [polars.scan_csv](https://docs.pola.rs/py-polars/html/reference/api/polars.scan_csv.html)
- [polars.DataFrame.write_csv](https://docs.pola.rs/py-polars/html/reference/api/polars.DataFrame.write_csv.html)
- [polars.LazyFrame.sink_csv](https://docs.pola.rs/py-polars/html/reference/api/polars.LazyFrame.sink_csv.html)
ipc:
- [polars.read_ipc](https://docs.pola.rs/py-polars/html/reference/api/polars.read_ipc.html)
- [polars.read_ipc_stream](https://docs.pola.rs/py-polars/html/reference/api/polars.read_ipc_stream.html)
- [polars.scan_ipc](https://docs.pola.rs/py-polars/html/reference/api/polars.scan_ipc.html)
- [polars.read_ipc_schema](https://docs.pola.rs/py-polars/html/reference/api/polars.read_ipc_schema.html)
- [polars.DataFrame.write_ipc](https://docs.pola.rs/py-polars/html/reference/api/polars.DataFrame.write_ipc.html)
- [polars.DataFrame.write_ipc_stream](https://docs.pola.rs/py-polars/html/reference/api/polars.DataFrame.write_ipc_stream.html)
- [polars.LazyFrame.sink_ipc](https://docs.pola.rs/py-polars/html/reference/api/polars.LazyFrame.sink_ipc.html)
parquet:
- [polars.read_parquet](https://docs.pola.rs/py-polars/html/reference/api/polars.read_parquet.html)
- [polars.scan_parquet](https://docs.pola.rs/py-polars/html/reference/api/polars.scan_parquet.html)
- [polars.read_parquet_schema](https://docs.pola.rs/py-polars/html/reference/api/polars.read_parquet_schema.html)
- [polars.DataFrame.write_parquet](https://docs.pola.rs/py-polars/html/reference/api/polars.DataFrame.write_parquet.html)
- [polars.LazyFrame.sink_parquet](https://docs.pola.rs/py-polars/html/reference/api/polars.LazyFrame.sink_parquet.html)
database:
- [polars.read_database](https://docs.pola.rs/py-polars/html/reference/api/polars.read_database.html)
- [polars.read_database_uri](https://docs.pola.rs/py-polars/html/reference/api/polars.read_database_uri.html)
- [polars.DataFrame.write_database](https://docs.pola.rs/py-polars/html/reference/api/polars.DataFrame.write_database.html)
json:
- [polars.read_json](https://docs.pola.rs/py-polars/html/reference/api/polars.read_json.html)
- [polars.read_ndjson](https://docs.pola.rs/py-polars/html/reference/api/polars.read_ndjson.html)
- [polars.scan_ndjson](https://docs.pola.rs/py-polars/html/reference/api/polars.scan_ndjson.html)
- [polars.DataFrame.write_json](https://docs.pola.rs/py-polars/html/reference/api/polars.DataFrame.write_json.html)
- [polars.DataFrame.write_ndjson](https://docs.pola.rs/py-polars/html/reference/api/polars.DataFrame.write_ndjson.html)
- [polars.LazyFrame.sink_ndjson](https://docs.pola.rs/py-polars/html/reference/api/polars.LazyFrame.sink_ndjson.html)
avro:
- [polars.read_avro](https://docs.pola.rs/py-polars/html/reference/api/polars.read_avro.html)
- [polars.DataFrame.write_avro](https://docs.pola.rs/py-polars/html/reference/api/polars.DataFrame.write_avro.html)
excel:
- [polars.read_excel](https://docs.pola.rs/py-polars/html/reference/api/polars.read_excel.html)
- [polars.read_ods](https://docs.pola.rs/py-polars/html/reference/api/polars.read_ods.html)
- [polars.DataFrame.write_excel](https://docs.pola.rs/py-polars/html/reference/api/polars.DataFrame.write_excel.html#)
iceberg:
- [polars.scan_iceberg](https://docs.pola.rs/py-polars/html/reference/api/polars.scan_iceberg.html)
delta:
- [polars.scan_delta](https://docs.pola.rs/py-polars/html/reference/api/polars.scan_delta.html)
- [polars.read_delta](https://docs.pola.rs/py-polars/html/reference/api/polars.read_delta.html)
- [polars.DataFrame.write_delta](https://docs.pola.rs/py-polars/html/reference/api/polars.DataFrame.write_delta.html)
... |
`read` functions should behave exactly like `scan` functions followed by `collect`. There may be some added or removed parameters for functionality that is (not) relevant in eager mode.
We should take a look at our existing scan functions and make sure they conform to these expectations:
- scan_parquet
- scan_ipc
- scan_csv
- scan_ndjson
- scan_delta
- scan_iceberg (has no `read` equivalent yet)