Description
Apache Iceberg version
0.9.0 (latest release)
Please describe the bug 🐞
I have an Iceberg V1 table and I am trying to load the table using pyiceberg.table.StaticTable:
from pyiceberg.table import StaticTable
t = StaticTable.from_metadata(
"gs://<project_id>/<table-name>/metadata/v1.metadata.json"
)
When I run t.inspect.manifests(), it raises the following error:
ResolveError: 504: added_files_count: required int is non-optional, and not part of the file schema
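One way to confirm that the table really is V1 is to read the format-version field from the metadata JSON itself. The following is a minimal sketch, assuming the metadata file content is already available as a string; the helper name read_format_version and the sample document are hypothetical and are not part of pyiceberg.

```python
import json


def read_format_version(metadata_json: str) -> int:
    """Return the Iceberg format version recorded in a table metadata document."""
    metadata = json.loads(metadata_json)
    # "format-version" is the top-level field of the table metadata file.
    return int(metadata["format-version"])


# Hypothetical, heavily trimmed metadata document, used for illustration only.
sample_metadata = '{"format-version": 1, "location": "gs://example-bucket/example-table"}'

version = read_format_version(sample_metadata)
print(f"Table format version: {version}")
```

If this reports 1 while the reader expects the V2 schema, that would be consistent with the error above.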
I believe this is because pyiceberg.manifest.DEFAULT_READ_VERSION is set to 2 but my table is V1. Patching it with manifest.DEFAULT_READ_VERSION = 1 gives me another error:
AttributeError: 'pyiceberg.manifest.ManifestFile' object has no attribute 'content'
I managed to resolve this error temporarily by adding the content attribute to pyiceberg.manifest.MANIFEST_LIST_FILE_SCHEMAS[1]. More errors are raised as I resolve each one:
For pyiceberg.manifest.MANIFEST_ENTRY_SCHEMAS[1]:
AttributeError: 'pyiceberg.manifest.ManifestEntry' object has no attribute 'sequence_number'
When running t.inspect.files(), there is an error originating from pyiceberg.manifest.DATA_FILE_TYPE[1]:
AttributeError: 'pyiceberg.manifest.DataFile' object has no attribute 'content'
I am able to load the table after adding all of the missing attributes above. Is there a supported way to read a V1 table, or is this a bug in how V1 tables are loaded?
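The failure pattern can be reproduced without pyiceberg. The sketch below is a simplified model, not pyiceberg's implementation: a record built from V1 fields lacks the V2-only attributes, so code that reads them needs defaults. The names V2_ONLY_DEFAULTS and with_v2_defaults are hypothetical, and the default value 0 is an assumption based on my reading of the Iceberg spec's guidance for reading V1 metadata as V2.

```python
from types import SimpleNamespace

# Assumed defaults for the V2-only fields named in the errors above.
# The keys are hypothetical labels, not pyiceberg identifiers.
V2_ONLY_DEFAULTS = {
    "manifest_file": {"content": 0},
    "manifest_entry": {"sequence_number": 0},
    "data_file": {"content": 0},
}


def with_v2_defaults(kind: str, v1_fields: dict) -> SimpleNamespace:
    """Build a record from V1 fields, filling in V2-only fields with defaults."""
    merged = dict(V2_ONLY_DEFAULTS[kind])
    merged.update(v1_fields)  # values present in the V1 record take precedence
    return SimpleNamespace(**merged)


# A record built only from V1 fields has no sequence_number attribute.
raw_v1_entry = SimpleNamespace(status=1, snapshot_id=123)
missing = not hasattr(raw_v1_entry, "sequence_number")

# The same fields with defaults applied can be read by V2-style code.
entry = with_v2_defaults("manifest_entry", {"status": 1, "snapshot_id": 123})
data_file = with_v2_defaults("data_file", {"record_count": 10})
```

This is roughly what my manual patches to the three V1 schemas achieve, which suggests the V1 read path could supply these defaults itself.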
Willingness to contribute
- I can contribute a fix for this bug independently
- I would be willing to contribute a fix for this bug with guidance from the Iceberg community
- I cannot contribute a fix for this bug at this time