Fixes and improvements for Iceberg storage #55695
This is an automated comment for commit be9e84c with a description of existing statuses. It's updated for the latest CI running. ❌
TODO: add tests and docs about unsupported Iceberg features
Half of the code is not new and was copied from `IcebergMetadataParser`.
e5f349e to 4892618
Users faced issues with Iceberg in cloud, so let's backport this PR.
Changelog category (leave one):
Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):
Throw an exception when the Iceberg table schema has evolved instead of returning a possibly incorrect result. Throw an exception if positional or equality deletes are used in an Iceberg table instead of returning a possibly incorrect result. Fix incorrect metadata file selection (lexicographic order was used, so `v9.metadata.json` could be selected instead of `v10.metadata.json`). Use the schema from the metadata file for schema inference instead of listing data files and inferring the schema from them. Don't list data files on each read if the metadata hasn't changed. Fix parsing of Iceberg decimals and timestamptz for ORC and Avro formats.

Documentation entry for user-facing changes
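To illustrate the metadata file selection issue from the changelog entry, here is a minimal Python sketch (not the code from this PR) contrasting lexicographic and numeric ordering of metadata file names. The helper names `metadata_version` and `latest_metadata_file` are hypothetical, and the sketch assumes only the `vN.metadata.json` naming pattern mentioned above.

```python
import re

# Assumption: file names follow the "vN.metadata.json" pattern from the
# changelog entry. Other naming schemes are out of scope for this sketch.
_VERSION_RE = re.compile(r"^v(\d+)\.metadata\.json$")


def metadata_version(name: str) -> int:
    """Hypothetical helper: return the numeric version encoded in the name."""
    match = _VERSION_RE.match(name)
    if match is None:
        raise ValueError(f"unexpected metadata file name: {name}")
    return int(match.group(1))


def latest_metadata_file(names):
    """Hypothetical helper: pick the file with the highest numeric version."""
    return max(names, key=metadata_version)


files = ["v1.metadata.json", "v9.metadata.json", "v10.metadata.json"]

# Plain string comparison ranks "v9..." above "v10..." because '9' > '1'.
lexicographic_pick = max(files)

# Comparing the parsed integers selects the actual latest version.
numeric_pick = latest_metadata_file(files)

print(lexicographic_pick)  # v9.metadata.json
print(numeric_pick)        # v10.metadata.json
```

Parsing the version as an integer is one straightforward way to avoid the ordering problem; the actual fix in the PR may differ in detail.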