Open
Description
Apache Iceberg version
0.10.0 (latest release)
Please describe the bug 🐞
The upsert worked fine until I needed to add a new field to the table.
Add the new column to the table
# Add a column to an existing table
from pyiceberg.types import TimestamptzType, TimestampType

table = catalog.load_table(table_identifier)
(
    table.update_schema()
    .add_column("created_at", TimestamptzType(), doc="UTC created time", required=False)
    .commit()
)
print("New schema:", table.schema())

Upsert the records
import pyarrow as pa

# Batch the records in 1000s
for rb in arrow_table_fixed.to_batches(max_chunksize=1000):
    batch_tbl = pa.Table.from_batches([rb])
    # Upsert the data into the Iceberg table
    try:
        upd = iceberg_table.upsert(batch_tbl)
        print("Upserted data into the Iceberg table.")
        print(upd)
    except Exception as e:
        print(f"An error occurred during upsert: {e}")

Error message saying that the target schema doesn't have the new column
An error occurred during upsert: Target schema's field names are not matching the table's field names: ['cik_str', 'ticker', 'title', 'created_at'], ['cik_str', 'ticker', 'title']
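To make the mismatch in that message easier to read, the two name lists can be diffed directly. The following is only a minimal sketch: `diff_field_names` is a hypothetical helper (not part of PyIceberg), and the two lists are copied from the error message above. In a real session they would presumably come from something like `batch_tbl.schema.names` and the loaded table's schema, which is an assumption here.

```python
def diff_field_names(left, right):
    """Return (names only in left, names only in right), preserving order.

    Hypothetical helper for illustration; not a PyIceberg API.
    """
    only_left = [name for name in left if name not in right]
    only_right = [name for name in right if name not in left]
    return only_left, only_right


# The two lists exactly as they appear in the error message above.
left = ["cik_str", "ticker", "title", "created_at"]
right = ["cik_str", "ticker", "title"]

only_left, only_right = diff_field_names(left, right)
print("Only in first list:", only_left)
print("Only in second list:", only_right)
```

For the lists in this report, the only difference is `created_at`, which appears in the first list and not the second.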
I checked the target schema on Iceberg and the column is definitely there
# Get the schema from the Iceberg table
iceberg_table = catalog.load_table(table_identifier)
# Get the PyArrow schema directly from the Iceberg schema
arrow_schema = iceberg_table.schema().as_arrow()
print(arrow_schema)

Output
cik_str: large_string not null
  -- field metadata --
  PARQUET:field_id: '1'
ticker: large_string not null
  -- field metadata --
  PARQUET:field_id: '2'
title: large_string
  -- field metadata --
  PARQUET:field_id: '3'
created_at: timestamp[us, tz=UTC]
  -- field metadata --
  doc: 'UTC created time'
  PARQUET:field_id: '5'
Willingness to contribute
- I can contribute a fix for this bug independently
- I would be willing to contribute a fix for this bug with guidance from the Iceberg community
- I cannot contribute a fix for this bug at this time
AlexWendland