[Python] Read ORC metadata #35304
Comments
In C++, there is a I see that the ORC C++ …
I think @Fokko is looking for a way to expose …
### Rationale for this change

Apache ORC has a per-column attribute map, and Apache Iceberg depends on it to encode its field metadata. However, the C++ ORC adapter is not aware of these attributes, which makes it difficult to support pyarrow and pyiceberg.

### What changes are included in this PR?

Both the reader and the writer support converting ORC attributes from/to Arrow field metadata.

### Are these changes tested?

Added two test cases to make sure the ORC adapter preserves the attributes.

### Are there any user-facing changes?

No.

* Closes: #35304

Authored-by: Gang Wu <ustcwg@gmail.com>
Signed-off-by: Antoine Pitrou <antoine@python.org>
Describe the enhancement requested
When reading an ORC schema, the metadata of the fields isn't exposed. For more information, see apache/iceberg#6973 (comment).
Component(s)
Python