[Doc][Python] Improve documentation regarding dealing with memory mapped files #28401

Closed
asfimport opened this issue May 4, 2021 · 1 comment

Comments

@asfimport

While one of Arrow's promises is that it makes it easy to read/write data bigger than memory, it's not immediately obvious from the pyarrow documentation how to deal with memory-mapped files.

The doc hints that you can open files as memory mapped ( https://arrow.apache.org/docs/python/memory.html?highlight=memory_map#on-disk-and-memory-mapped-files ), but it doesn't explain how to read/write Arrow Arrays or Tables from a memory-mapped file.

While most high-level functions to read/write formats (Parquet, Feather, ...) have an easy-to-guess memory_map=True option, the doc doesn't seem to have any example of how that is meant to work for the Arrow format itself. For example, how can you do that using RecordBatchFile*?

An addition to the memory mapping section with a more meaningful example that reads/writes actual Arrow data (instead of plain bytes) would probably be helpful.

Reporter: Alessandro Molina / @amol-
Assignee: Alessandro Molina / @amol-

PRs and other links:

Note: This issue was originally created as ARROW-12650. Please see the migration documentation for further details.

@asfimport

Antoine Pitrou / @pitrou:
Issue resolved by pull request 10266
#10266

@asfimport asfimport added this to the 6.0.0 milestone Jan 11, 2023