[C++][Dataset] Support async streaming interface for getting fragments in Dataset #32595

Closed
Tracked by #24365
asfimport opened this issue Aug 5, 2022 · 1 comment

Comments

@asfimport

Add GetFragmentsAsync() and GetFragmentsAsyncImpl() functions to the generic Dataset interface, allowing fragments to be produced in a streamed fashion.

This is one of the prerequisites for making FileSystemDataset support lazy fragment processing, which in turn can be used to start scan operations without waiting for the entire dataset to be discovered.

To ease the transition to an async implementation in the Dataset/AsyncScanner code, a default implementation of GetFragmentsAsyncImpl() should be provided. It would yield a VectorGenerator over the fragments vector, which every implementation of the Dataset interface currently stores.
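
The default implementation described above could look roughly like the following sketch. It is a simplified model built only on the C++17 standard library, not Arrow's actual code: `Fragment`, `FragmentGenerator`, `MakeVectorGenerator`, `InMemoryDataset` and `CollectPaths` are hypothetical stand-ins defined here for illustration, and `std::future` stands in for Arrow's own future type. Only the names `GetFragmentsAsync()` and `GetFragmentsAsyncImpl()` come from this issue; the synchronous `GetFragmentsImpl()` hook is an assumption.

```cpp
#include <cstddef>
#include <functional>
#include <future>
#include <memory>
#include <optional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a dataset fragment (not Arrow's Fragment class).
struct Fragment {
  std::string path;
};
using FragmentPtr = std::shared_ptr<Fragment>;

// Simplified async generator: each call returns a future for the next
// fragment, and std::nullopt marks the end of the stream.
using FragmentGenerator =
    std::function<std::future<std::optional<FragmentPtr>>()>;

// Wraps an already materialized vector in a generator whose futures are
// ready immediately (a stand-in for the VectorGenerator mentioned above).
FragmentGenerator MakeVectorGenerator(std::vector<FragmentPtr> fragments) {
  struct State {
    std::vector<FragmentPtr> items;
    std::size_t next = 0;
  };
  auto state = std::make_shared<State>();
  state->items = std::move(fragments);
  return [state]() {
    std::promise<std::optional<FragmentPtr>> promise;
    if (state->next < state->items.size()) {
      promise.set_value(state->items[state->next++]);
    } else {
      promise.set_value(std::nullopt);  // stream exhausted
    }
    return promise.get_future();
  };
}

class Dataset {
 public:
  virtual ~Dataset() = default;

  // Public entry point: hands out a stream of fragments.
  FragmentGenerator GetFragmentsAsync() { return GetFragmentsAsyncImpl(); }

 protected:
  // Assumed synchronous hook returning the stored fragments vector.
  virtual std::vector<FragmentPtr> GetFragmentsImpl() = 0;

  // Default async implementation: a generator over the existing vector.
  // A lazily discovering dataset would override this instead.
  virtual FragmentGenerator GetFragmentsAsyncImpl() {
    return MakeVectorGenerator(GetFragmentsImpl());
  }
};

// Hypothetical dataset that relies entirely on the default async path.
class InMemoryDataset : public Dataset {
 public:
  explicit InMemoryDataset(std::vector<FragmentPtr> fragments)
      : fragments_(std::move(fragments)) {}

 protected:
  std::vector<FragmentPtr> GetFragmentsImpl() override { return fragments_; }

 private:
  std::vector<FragmentPtr> fragments_;
};

// Drains a generator, blocking on each future, and collects fragment paths.
std::vector<std::string> CollectPaths(FragmentGenerator gen) {
  std::vector<std::string> paths;
  while (true) {
    std::optional<FragmentPtr> next = gen().get();
    if (!next.has_value()) break;
    paths.push_back((*next)->path);
  }
  return paths;
}
```

With this shape, existing dataset implementations keep working unchanged through the default path, while a dataset that discovers fragments lazily only needs to override `GetFragmentsAsyncImpl()` with a generator that produces fragments as they are found. A caller would drain the stream with something like `CollectPaths(dataset.GetFragmentsAsync())`.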

Reporter: Pavel Solodovnikov / @ManManson
Assignee: Pavel Solodovnikov / @ManManson

PRs and other links:

Note: This issue was originally created as ARROW-17318. Please see the migration documentation for further details.

@asfimport

Antoine Pitrou / @pitrou:
Issue resolved by pull request #13804.
