arrow notes #215

Open
mdsumner opened this issue Oct 3, 2023 · 0 comments

mdsumner commented Oct 3, 2023

DD

I'm pretty sure you can set the batch size with a GDAL property before you get the stream. If you have a pyarrow.RecordBatchReader, which I'm assuming is what GetArrowStreamAsPyArrow gives you, you can consume it one batch at a time or call https://arrow.apache.org/docs/python/generated/pyarrow.RecordBatchReader.html#pyarrow.RecordBatchReader.read_all to read it all into a Table (probably what you want unless you're engineering some streaming yourself).
In R you might have a nanoarrow_array_stream, on which you call convert_array_stream() to get a data.frame. I think you can also call as_arrow_table() on the array stream (or as_arrow_table(as_record_batch_reader(array_stream)) if that doesn't work).