As part of ARROW-15271, `map_batches()` was modified to return a `RecordBatchReader`, but the implementation collects all results as a list of record batches and then converts that list to a reader. In theory, if we push the implementation down to C++, we should be able to make a proper streaming `RecordBatchReader`.
We won't know the schema ahead of time. We could optionally accept it, which would allow the function to be lazy. Or we could eagerly evaluate just the first batch to determine the schema.
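To make the two schema options concrete, here is a minimal sketch in plain Python rather than the R/C++ code the issue is about. Everything in it is an illustrative assumption: `map_batches_streaming`, `infer_schema`, and the `Batch`/`Schema` stand-ins (a dict of columns and a tuple of column names) are hypothetical and not part of the Arrow API. It only shows how a lazy reader could either accept a schema up front or evaluate just the first batch to discover one.

```python
from itertools import chain
from typing import Callable, Dict, Iterable, Iterator, List, Optional, Tuple

# Hypothetical stand-ins, NOT Arrow types:
Batch = Dict[str, List]    # a "record batch": column name -> values
Schema = Tuple[str, ...]   # a "schema": ordered column names


def infer_schema(batch: Batch) -> Schema:
    """Derive the stand-in schema from one batch."""
    return tuple(batch.keys())


def map_batches_streaming(
    batches: Iterable[Batch],
    fn: Callable[[Batch], Batch],
    schema: Optional[Schema] = None,
) -> Tuple[Schema, Iterator[Batch]]:
    """Return (schema, lazy iterator of mapped batches).

    If `schema` is given, no batch is evaluated until the caller iterates.
    If it is omitted, only the first batch is evaluated eagerly.
    """
    mapped = (fn(b) for b in batches)  # generator: nothing runs yet

    if schema is None:
        # Eagerly evaluate just the first batch to learn the schema.
        try:
            first = next(mapped)
        except StopIteration:
            raise ValueError("empty input: pass `schema` explicitly") from None
        schema = infer_schema(first)
        # Put the already-computed batch back at the front of the stream.
        mapped = chain([first], mapped)

    def checked() -> Iterator[Batch]:
        # Every batch must match the schema that was announced up front.
        for out in mapped:
            if infer_schema(out) != schema:
                raise ValueError(f"batch schema {infer_schema(out)} != {schema}")
            yield out

    return schema, checked()
```

With an explicit schema the function stays fully lazy. Without one, it costs exactly one up-front call to `fn`, and the remaining batches are still produced one at a time as the consumer pulls them.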
Reporter: Will Jones / @wjones127
Assignee: Dewey Dunnington / @paleolimbot
PRs and other links:
Note: This issue was originally created as ARROW-16703. Please see the migration documentation for further details.