[Python] Expose ORC metadata() in Python ORCFile #17254
Comments
Caleb Winston: Edit: Didn't see who was posting this haha.
Caleb Winston: [1] arrow/cpp/src/arrow/adapters/orc/adapter.cc Line 411 in d542482
Jeremy Dyer: [1] arrow/cpp/src/arrow/adapters/orc/adapter.cc Lines 235 to 242 in d542482
Ian Alexander Joiner / @iajoiner:
Ian Alexander Joiner / @iajoiner:
Ian Alexander Joiner / @iajoiner:
Antoine Pitrou / @pitrou:
There is currently no way for a user to directly access the underlying ORC metadata of a given file. The C++ functions and objects appear to exist already; what is missing is the Cython/Python plumbing and potentially a few C++ shims. Giving users the ability to retrieve the metadata without first reading the entire file could help many applications improve query performance by letting them determine which ORC stripes actually need to be read.
This would allow for something like
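As a minimal sketch of the stripe selection described above, assuming per-stripe min/max statistics were available from the metadata: the `StripeStats` record, its fields, and the `stripes_to_read` helper are hypothetical stand-ins for illustration and are not part of the pyarrow API.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class StripeStats:
    """Hypothetical per-stripe statistics for one column (not a pyarrow type)."""
    index: int    # stripe number within the file
    minimum: int  # smallest value of the column in this stripe
    maximum: int  # largest value of the column in this stripe


def stripes_to_read(stats: List[StripeStats], low: int, high: int) -> List[int]:
    """Return indices of stripes whose [minimum, maximum] range overlaps [low, high]."""
    return [s.index for s in stats if s.maximum >= low and s.minimum <= high]


# Made-up statistics for a three-stripe file, for illustration only.
example_stats = [
    StripeStats(index=0, minimum=0, maximum=99),
    StripeStats(index=1, minimum=100, maximum=199),
    StripeStats(index=2, minimum=200, maximum=299),
]

# A query for values between 150 and 220 only overlaps stripes 1 and 2.
selected = stripes_to_read(example_stats, low=150, high=220)
print(selected)
```

The selected indices could then be passed one at a time to the existing `ORCFile.read_stripe` method, so that stripes outside the requested range are never read.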
Reporter: Jeremy Dyer
Assignee: Ian Alexander Joiner / @iajoiner
PRs and other links:
Note: This issue was originally created as ARROW-9299. Please see the migration documentation for further details.