[R] Refactor map_batches() so it can stream results #32046

Closed
asfimport opened this issue May 31, 2022 · 1 comment

Comments

@asfimport

As part of ARROW-15271, map_batches() was modified to return a RecordBatchReader, but the implementation still collects all results as a list of record batches and then converts that list to a reader. In theory, if we push the implementation down to C++, we should be able to make a proper streaming RecordBatchReader (RBR).

We won't know the output schema ahead of time. We could optionally accept it as an argument, which would allow the function to be fully lazy. Or we could eagerly evaluate just the first batch to determine the schema and stream the rest.
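
To make the two schema options concrete, here is a minimal sketch in plain Python, not Arrow code. Everything in it is hypothetical and for illustration only: batches are modelled as dicts of column name to list, a "schema" is just the tuple of column names, and `map_batches_stream()` and `infer_schema()` are made-up helpers, not part of any Arrow API. The real fix would presumably live in C++ and look quite different.

```python
from itertools import chain
from typing import Callable, Dict, Iterable, Iterator, List, Optional, Tuple

# Hypothetical stand-ins: a "batch" is a dict of column -> values,
# and a "schema" is simply the ordered tuple of column names.
Batch = Dict[str, List]
Schema = Tuple[str, ...]


def infer_schema(batch: Batch) -> Schema:
    """Derive the toy schema (column names) from one batch."""
    return tuple(batch.keys())


def _checked(mapped: Iterator[Batch], schema: Schema) -> Iterator[Batch]:
    """Yield batches one at a time, failing if one does not match the schema."""
    for batch in mapped:
        if infer_schema(batch) != schema:
            raise ValueError(f"batch columns {infer_schema(batch)} != schema {schema}")
        yield batch


def map_batches_stream(
    batches: Iterable[Batch],
    fn: Callable[[Batch], Batch],
    schema: Optional[Schema] = None,
) -> Tuple[Schema, Iterator[Batch]]:
    """Apply fn to each batch lazily and return (schema, iterator of results).

    - schema given: nothing is evaluated until the iterator is consumed.
    - schema omitted: only the first batch is evaluated, to learn the schema.
    """
    mapped = (fn(b) for b in batches)  # generator: no work done yet
    if schema is None:
        try:
            first = next(mapped)  # eagerly evaluate just the first batch
        except StopIteration:
            raise ValueError("cannot infer a schema from empty input; pass schema=")
        schema = infer_schema(first)
        mapped = chain([first], mapped)  # put the first result back in front
    return schema, _checked(mapped, schema)
```

With `schema=` supplied, `fn` is not called until the caller pulls the first batch. Without it, `fn` runs exactly once up front and the remaining batches are still produced on demand, so neither variant needs to hold all results in memory at once.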

Reporter: Will Jones / @wjones127
Assignee: Dewey Dunnington / @paleolimbot

PRs and other links:

Note: This issue was originally created as ARROW-16703. Please see the migration documentation for further details.

@asfimport
Author

Dewey Dunnington / @paleolimbot:
Issue resolved by pull request 13650
#13650
