[Python] Refine higher level dataset API #24184

Closed
asfimport opened this issue Feb 28, 2020 · 2 comments

@asfimport

Provide a more intuitive way to construct nested datasets:

1. Instead of using the confusing factory function:
   dataset([
       factory("s3://old-taxi-data", format="parquet"),
       factory("local/path/to/new/data", format="csv")
   ])

2. Let the user construct a new dataset directly from dataset objects:
   dataset([
       dataset("s3://old-taxi-data", format="parquet"),
       dataset("local/path/to/new/data", format="csv")
   ])
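
To make the proposed call shape concrete, here is a minimal pure-Python sketch of a `dataset()` function that accepts either a single location or a list of already constructed datasets. It is a toy model, not the Arrow implementation: the `FileDataset` and `UnionDataset` classes below are hypothetical stand-ins defined only for this illustration, and nothing is actually read from S3 or disk.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FileDataset:
    """Hypothetical stand-in for a dataset discovered from one location."""
    source: str
    format: str


@dataclass(frozen=True)
class UnionDataset:
    """Hypothetical stand-in for a dataset composed of child datasets."""
    children: tuple


def dataset(source, format=None):
    """Build a toy dataset from a location string or a list of datasets."""
    # A string is treated as a single path or URI to discover.
    if isinstance(source, str):
        if format is None:
            raise ValueError("format is required for a path or URI")
        return FileDataset(source, format)
    # A list or tuple must contain already constructed dataset objects.
    if isinstance(source, (list, tuple)):
        if not all(isinstance(c, (FileDataset, UnionDataset)) for c in source):
            raise TypeError("expected a list of dataset objects")
        return UnionDataset(tuple(source))
    raise TypeError(f"unsupported source type: {type(source).__name__}")


# Same call shape as the second form proposed above.
taxi = dataset([
    dataset("s3://old-taxi-data", format="parquet"),
    dataset("local/path/to/new/data", format="csv"),
])
print([child.format for child in taxi.children])  # ['parquet', 'csv']
```

The point of the sketch is that one entry point covers both the leaf and the nested case, so the user never has to reach for a separate factory function.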

In the future we might want to introduce a new Dataset class that wraps the functionality of both the dataset factory and the materialized dataset, enabling optimizations that avoid rediscovering already materialized datasets.
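
One way such a wrapper could avoid rediscovery is to run discovery lazily and cache the result. The sketch below is purely illustrative: `LazyDataset`, its `materialize()` method, and the file names are hypothetical and do not correspond to any existing Arrow class or data.

```python
class LazyDataset:
    """Hypothetical wrapper that runs discovery at most once."""

    def __init__(self, discover):
        # `discover` is any zero-argument callable that performs the
        # (potentially expensive) discovery step and returns its result.
        self._discover = discover
        self._done = False
        self._materialized = None
        self.discovery_count = 0  # kept only to make the caching visible

    def materialize(self):
        # Run discovery on first use, then reuse the cached result.
        if not self._done:
            self.discovery_count += 1
            self._materialized = self._discover()
            self._done = True
        return self._materialized


# Made-up fragment names standing in for a real discovery result.
lazy = LazyDataset(lambda: ["part-0.parquet", "part-1.parquet"])
first = lazy.materialize()
second = lazy.materialize()
print(lazy.discovery_count)  # 1
```

Composing several such wrappers into a union would then reuse each child's cached result instead of scanning its location again.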

Reporter: Krisztian Szucs / @kszucs
Assignee: Krisztian Szucs / @kszucs

PRs and other links:

Note: This issue was originally created as ARROW-7965. Please see the migration documentation for further details.

@asfimport

Joris Van den Bossche / @jorisvandenbossche:
This depends on ARROW-8164, but if that gets merged quickly, it would be nice to tackle this issue for 0.17, since that would enable us to remove factory() from the high-level user API (it wasn't there yet in 0.16, so this would avoid it ever being in a released version).

@asfimport

Krisztian Szucs / @kszucs:
Issue resolved by pull request 6505
#6505

@asfimport asfimport added this to the 0.17.0 milestone Jan 11, 2023