
[SPARK-55626][SQL][4.1] Don't load metadata columns on Table unless needed in V2TableUtil #54909

Closed
szehon-ho wants to merge 1 commit into apache:branch-4.1 from szehon-ho:spark-55626-v2tableutil-4.1


@szehon-ho

What changes were proposed in this pull request?

Rebase of #54433 (by @aokolnychyi).

This PR prevents loading metadata columns on Table unless needed in V2TableUtil.

Why are the changes needed?

These changes are needed to prevent unnecessary loading of metadata columns. In some cases, accessing the metadata columns can cause exceptions on the connector side, as detected in Iceberg. Spark should request metadata columns only if they are projected.
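
The idea can be illustrated with a minimal, self-contained sketch. This is not the code in this PR and does not use Spark's connector API: `SketchTable`, `FragileTable`, and `resolve` are hypothetical names invented for illustration. It assumes a table that exposes data columns cheaply but may throw when asked for metadata columns, and shows that deferring the metadata lookup until a projected name actually requires it avoids the failure for data-only projections.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class LazyMetadataColumnsSketch {

    /** Hypothetical stand-in for a connector table; not Spark's Table interface. */
    interface SketchTable {
        List<String> dataColumns();

        /** May be expensive, or may throw on the connector side. */
        List<String> metadataColumns();
    }

    /** Simulates a connector that fails whenever metadata columns are requested. */
    static final class FragileTable implements SketchTable {
        public List<String> dataColumns() {
            return Arrays.asList("id", "name");
        }

        public List<String> metadataColumns() {
            throw new IllegalStateException("metadata columns unavailable");
        }
    }

    /** Resolves projected names, loading metadata columns only when a name is not a data column. */
    static List<String> resolve(SketchTable table, List<String> projected) {
        List<String> data = table.dataColumns();
        List<String> metadata = null; // loaded on first need, never eagerly
        List<String> resolved = new ArrayList<>();
        for (String name : projected) {
            if (data.contains(name)) {
                resolved.add(name);
                continue;
            }
            if (metadata == null) {
                metadata = table.metadataColumns();
            }
            if (!metadata.contains(name)) {
                throw new IllegalArgumentException("Unknown column: " + name);
            }
            resolved.add(name);
        }
        return resolved;
    }

    public static void main(String[] args) {
        SketchTable table = new FragileTable();
        // Only data columns are projected, so metadataColumns() is never called.
        System.out.println(resolve(table, Arrays.asList("id", "name")));
    }
}
```

With an eager lookup, the data-only projection above would fail even though no metadata column was requested; with the lazy lookup, the connector is asked for metadata columns only when the projection references one.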

Does this PR introduce any user-facing change?

No.

How was this patch tested?

This patch comes with tests.

Was this patch authored or co-authored using generative AI tooling?

No.

Closes #54416 from aokolnychyi/spark-55626.

Authored-by: Anton Okolnychyi <aokolnychyi@apache.org>

… in V2TableUtil

### What changes were proposed in this pull request?

This PR prevents loading metadata columns on Table unless needed in `V2TableUtil`.

### Why are the changes needed?

These changes are needed to prevent unnecessary loading of metadata columns. In some cases, accessing the metadata columns can cause exceptions on the connector side, as detected in Iceberg. Spark should request metadata columns only if they are projected.

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

This patch comes with tests.

### Was this patch authored or co-authored using generative AI tooling?

No.

Closes apache#54416 from aokolnychyi/spark-55626.

Authored-by: Anton Okolnychyi <aokolnychyi@apache.org>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>

@dongjoon-hyun dongjoon-hyun left a comment


+1, LGTM. Thank you so much for your swift help, @szehon-ho.

dongjoon-hyun pushed a commit that referenced this pull request Mar 20, 2026
…eeded in V2TableUtil

### What changes were proposed in this pull request?

Rebase of #54433 (by aokolnychyi).

This PR prevents loading metadata columns on Table unless needed in `V2TableUtil`.

### Why are the changes needed?

These changes are needed to prevent unnecessary loading of metadata columns. In some cases, accessing the metadata columns can cause exceptions on the connector side, as detected in Iceberg. Spark should request metadata columns only if they are projected.

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

This patch comes with tests.

### Was this patch authored or co-authored using generative AI tooling?

No.

Closes #54416 from aokolnychyi/spark-55626.

Authored-by: Anton Okolnychyi <aokolnychyi@apache.org>

Closes #54909 from szehon-ho/spark-55626-v2tableutil-4.1.

Authored-by: Anton Okolnychyi <aokolnychyi@apache.org>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
@dongjoon-hyun

Merged to branch-4.1.
