Apache Iceberg version
1.11.0 (latest release)
Query engine
None
Please describe the bug 🐞
Description
SerializableTable.sortOrders() throws a ValidationException when a table has a historical sort order that references a column that has since been dropped from the schema.
This was introduced in #15150, which added serialization of all sort orders (not just the default) to SerializableTable. During deserialization the new code binds every sort order against the current schema using strict validation, but historical sort orders may legitimately reference fields that no longer exist.
Root cause
Steps to reproduce
@Test
public void testSerializableTableSortOrdersWithDroppedColumn() {
  // add an extra column and establish sort order 1 on id + ts
  table.updateSchema().addColumn("ts", Types.LongType.get()).commit();
  table.replaceSortOrder().asc("id").asc("ts").commit();
  table.newAppend().appendFile(FILE_A).commit();

  // switch to sort order 2 using only "id", then drop "ts"
  // historical sort order 1 still references the dropped "ts" field
  table.replaceSortOrder().asc("id").commit();
  table.updateSchema().deleteColumn("ts").commit();

  // SerializableTable.copyOf captures all sort orders at construction time
  Table serialized = SerializableTable.copyOf(table);

  // sortOrders() must return all historical orders without throwing a ValidationException,
  // even though sort order 1 references the now-dropped "ts" field
  assertThat(serialized.sortOrders()).hasSize(3); // unsorted(0), order-1(id+ts), order-2(id)
  assertThat(serialized.sortOrder().fields()).hasSize(1); // current default is order-2
}
Expected behavior
sortOrders() should return all historical sort orders without throwing, binding non-default sort orders unchecked (as TableMetadataParser and PartitionSpec already do for the equivalent case).
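To illustrate the expected behavior, here is a minimal, self-contained sketch of the binding rule: validate only the default sort order against the current schema and bind every other order without validation. It does not use Iceberg classes. `SortOrderBindingSketch`, `Order`, `bindStrict`, `bindUnchecked`, and `bindAll` are hypothetical names made up for this example, and the real fix would presumably go through the existing unchecked-binding path rather than anything shown here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical model of the proposed rule; none of these types are Iceberg APIs.
final class SortOrderBindingSketch {

  // Stand-in for a sort order: an ID plus the names of the columns it sorts by.
  static final class Order {
    final int id;
    final List<String> columns;

    Order(int id, List<String> columns) {
      this.id = id;
      this.columns = columns;
    }
  }

  // Strict binding: every referenced column must exist in the current schema.
  static Order bindStrict(Order order, Set<String> schemaColumns) {
    for (String column : order.columns) {
      if (!schemaColumns.contains(column)) {
        throw new IllegalStateException(
            "Sort order " + order.id + " references missing column: " + column);
      }
    }
    return order;
  }

  // Unchecked binding: accept the order as-is, even if a column was dropped.
  static Order bindUnchecked(Order order) {
    return order;
  }

  // Proposed rule: validate only the default order; bind historical orders unchecked.
  static List<Order> bindAll(List<Order> orders, int defaultOrderId, Set<String> schemaColumns) {
    List<Order> bound = new ArrayList<>();
    for (Order order : orders) {
      if (order.id == defaultOrderId) {
        bound.add(bindStrict(order, schemaColumns));
      } else {
        bound.add(bindUnchecked(order));
      }
    }
    return bound;
  }
}
```

With the table state from the reproduction (schema containing only `id`, orders 0, 1 on `id` + `ts`, and 2 on `id`, default order 2), `bindAll` returns all three orders, while strictly binding order 1 on its own fails.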
Willingness to contribute