Fix partition column projection with schema evolution #2685

KevinJiao · 2025-11-03T21:03:06Z

Rationale for this change

When performing column projection on partitioned tables with schema evolution, PyIceberg incorrectly uses the projected schema (containing only the selected columns) instead of the full table schema when building partition types in _get_column_projection_values(). This raises ValueError: Could not find field with id: X when:

Reading from partitioned Iceberg tables
Using column projection (selecting specific columns, not SELECT *)
Selected columns do NOT include the partition field(s)
The table has undergone schema evolution (fields added/removed after initial creation)
Reading files that are missing some of the selected columns (written before schema evolution)

The root cause is the call partition_spec.partition_type(projected_schema), which fails because the projected schema may be missing source fields that the partition specification references.
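
To make the failure mode concrete, here is a minimal, stdlib-only sketch. The classes `Field`, `Schema`, and `PartitionSpec` below are simplified stand-ins invented for illustration, not PyIceberg's real classes; only the shape of the lookup and the error message mirror the description above.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Field:
    field_id: int
    name: str


class Schema:
    """Hypothetical stand-in for a schema: fields looked up by id."""

    def __init__(self, *fields: Field) -> None:
        self._by_id = {f.field_id: f for f in fields}

    def find_field(self, field_id: int) -> Field:
        try:
            return self._by_id[field_id]
        except KeyError:
            raise ValueError(f"Could not find field with id: {field_id}") from None

    def select(self, *names: str) -> "Schema":
        # Column projection: keep only the named columns.
        return Schema(*(f for f in self._by_id.values() if f.name in names))


@dataclass(frozen=True)
class PartitionSpec:
    """Hypothetical stand-in: each partition field points at a source field id."""

    source_ids: Tuple[int, ...]

    def partition_type(self, schema: Schema) -> List[Field]:
        # Every source id must be resolvable in the schema that is passed in.
        return [schema.find_field(i) for i in self.source_ids]


table_schema = Schema(Field(1, "id"), Field(2, "region"), Field(3, "added_later"))
spec = PartitionSpec(source_ids=(2,))                  # partitioned by "region"
projected = table_schema.select("id", "added_later")   # partition column not selected

try:
    spec.partition_type(projected)
    error = None
except ValueError as exc:
    error = str(exc)

print(error)                                                # Could not find field with id: 2
print([f.name for f in spec.partition_type(table_schema)])  # ['region']
```

The same partition spec resolves cleanly against the full table schema, which is why the choice of schema matters more than the spec itself.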

The fix passes the full table schema from ArrowScan._table_metadata.schema() through _task_to_record_batches() to _get_column_projection_values(), ensuring all fields are available when building partition accessors.
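
The shape of the fix can be sketched as follows. This is a simplified model under stated assumptions, not the actual PyIceberg code: schemas are plain dicts, every partition field is treated as an identity transform, and `scan_file_sketch`, `projection_values_sketch`, and `partition_type_sketch` are hypothetical names standing in for the real call chain.

```python
from typing import Dict, List, Set, Tuple

SchemaSketch = Dict[int, str]        # field id -> column name
SpecSketch = List[Tuple[int, str]]   # (source field id, partition field name)


def partition_type_sketch(spec: SpecSketch, schema: SchemaSketch) -> SpecSketch:
    """Resolve each partition field's source column; fail if the schema lacks it."""
    for source_id, _ in spec:
        if source_id not in schema:
            raise ValueError(f"Could not find field with id: {source_id}")
    return list(spec)


def projection_values_sketch(
    spec: SpecSketch,
    partition_record: Tuple[object, ...],
    table_schema: SchemaSketch,
    projected_schema: SchemaSketch,
    file_field_ids: Set[int],
) -> Dict[str, object]:
    """Values for projected columns that the file lacks but the partition supplies."""
    missing = set(projected_schema) - file_field_ids
    if not missing:
        return {}
    # The fix: resolve partition fields against the full table schema,
    # not the projected schema, which may omit the partition source column.
    resolved = partition_type_sketch(spec, table_schema)
    return {
        projected_schema[source_id]: partition_record[position]
        for position, (source_id, _) in enumerate(resolved)
        if source_id in missing
    }


def scan_file_sketch(
    table_schema: SchemaSketch,
    selected_names: Set[str],
    spec: SpecSketch,
    partition_record: Tuple[object, ...],
    file_field_ids: Set[int],
) -> Dict[str, object]:
    """The caller owns the table schema and threads it down with the projection."""
    projected_schema = {i: n for i, n in table_schema.items() if n in selected_names}
    return projection_values_sketch(
        spec, partition_record, table_schema, projected_schema, file_field_ids
    )


table_schema = {1: "id", 2: "region", 3: "added_later"}
spec = [(2, "region")]

# File written before "added_later" existed; the projection skips the partition column.
old_file_result = scan_file_sketch(table_schema, {"id", "added_later"}, spec, ("eu",), {1, 2})
# File that does not store the partition column; the value comes from partition metadata.
partition_only_result = scan_file_sketch(table_schema, {"id", "region"}, spec, ("eu",), {1})

print(old_file_result)        # {}
print(partition_only_result)  # {'region': 'eu'}
```

In this model, the first call is the case that previously failed: a projected column is missing from the file, so partition fields get resolved, and resolving them against the projected schema would raise because field 2 is absent from it.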

Are these changes tested?

Yes. Added a test test_partition_column_projection_with_schema_evolution that:

Creates a partitioned table with initial schema
Writes data with the initial schema
Evolves the schema by adding a new column
Writes data with the evolved schema
Performs column projection that excludes the partition field

Are there any user-facing changes?

No. Only internal helpers are changed.

kevinjqliu

LGTM. Thanks for the fix!

KevinJiao force-pushed the fix-partition-column-projection-schema-evolution branch from f0f9fa6 to 5508ed2 on November 3, 2025 21:21

Core: Fix partition column projection with schema evolution

f658044

Use table schema instead of projected schema when building partition type to avoid 'Could not find field with id' errors during column projection on partitioned tables with schema evolution.

KevinJiao force-pushed the fix-partition-column-projection-schema-evolution branch from 5508ed2 to f658044 on November 3, 2025 21:31

kevinjqliu approved these changes Nov 3, 2025

kevinjqliu merged commit 2d549a9 into apache:main Nov 3, 2025
8 checks passed
