Apache Iceberg version
0.11.0 (latest release)
Please describe the bug 🐞
Hello everyone,
iceberg-python/pyiceberg/io/pyarrow.py defines data_file_statistics_from_parquet_metadata() to extract statistics from Parquet metadata. There are several ways to obtain Parquet metadata, but the ideal way is to use pyarrow.parquet.read_metadata().
In iceberg-python/pyiceberg/io/pyarrow.py, and in PyIceberg as a whole, most calls to data_file_statistics_from_parquet_metadata() pass the result of pyarrow.parquet.read_metadata() as the parquet_metadata argument. For example:

But at one point in the same file, the parquet_metadata argument does not come from pyarrow.parquet.read_metadata(). For example:

This works fine as long as the underlying PyArrow isn't encrypting Parquet files. However, we are building a custom PyArrow that creates file_encryption_properties and file_decryption_properties at runtime and passes them to the Parquet writers and readers (including read_metadata()). In this case:

fails with:
Cannot decrypt ColumnMetadata. FileDecryption is not setup correctly
Using pyarrow.parquet.read_metadata() consistently, with a change along these lines:
Before:

After:

solves the problem.
Looking forward to your opinions.
Willingness to contribute