Skip to content

Spark: Add spark custom property for delete#15588

Closed
yingjianwu98 wants to merge 3 commits intoapache:mainfrom
yingjianwu98:yingjianw/add_spark_custom_property_for_delete
Closed

Spark: Add spark custom property for delete#15588
yingjianwu98 wants to merge 3 commits intoapache:mainfrom
yingjianwu98:yingjianw/add_spark_custom_property_for_delete

Conversation

@yingjianwu98
Copy link
Contributor

In Spark, writes that go through SparkWrite (append, overwrite, dynamic overwrite) correctly populate custom snapshot properties via extraSnapshotMetadata. However, deletes that go through SparkTable.deleteWhere() (partition-level deletes on both v1 and v2 tables) do not apply these properties.

This PR fixes the inconsistency by having deleteWhere() also read and apply extraSnapshotMetadata from the session configuration.

@github-actions github-actions bot added the spark label Mar 11, 2026
@yingjianwu98 yingjianwu98 changed the title add spark custom property for delete Spark: Add spark custom property for delete Mar 11, 2026
String propertyKey = "test-key";
String propertyValue = "session-value";

// use a v1 table so that DELETE goes through SparkTable.deleteWhere()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this comment correct? The deleteWhere method is called even with format-version 2 & 3 as far as I tested locally.

Copy link
Contributor Author

@yingjianwu98 yingjianwu98 Mar 11, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry was not clear here.

For v2 and v3, a full partition delete indeed go through this but a row level delete I believe will go through SparkPositionDeltaWrite, which doesn't have the same issue.

@yingjianwu98 yingjianwu98 force-pushed the yingjianw/add_spark_custom_property_for_delete branch from c408411 to 2db879f Compare March 12, 2026 01:05
@amogh-jahagirdar
Copy link
Contributor

Hey @yingjianwu98 sorry I think there was actually an older PR (though not that much older) https://github.com/apache/iceberg/pull/15061/changes which addresses this. I think we should probably just get that one in? Would you be open to reviewing that?

@yingjianwu98
Copy link
Contributor Author

Thanks @amogh-jahagirdar

Good call out, I will review that one instead!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants