[spark] Add OVERWRITE_BY_FILTER capabilities to PartitionedFormatTable for Spark 4 compatibility #7517

Merged
YannByron merged 2 commits into apache:master from kerwin-zk:overwrite_by_filter
Mar 25, 2026

Conversation

@kerwin-zk
Contributor

Purpose

Spark 4 changed ReplaceTableAsSelect to use OverwriteByExpression instead of AppendData.
OverwriteByExpression requires the target table to declare the TRUNCATE or OVERWRITE_BY_FILTER capability.
External format tables (those with a location) go through PartitionedCSVTable (a Spark FileTable),
whose default capabilities {BATCH_READ, BATCH_WRITE} do not include OVERWRITE_BY_FILTER.
The fix adds OVERWRITE_BY_FILTER to the PartitionedFormatTable trait.
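
To illustrate the capability gap described above, here is a minimal, self-contained sketch. It is a simplified model, not Spark's or Paimon's actual code: the `Capability` enum is a local stand-in for Spark's `TableCapability`, and `CapabilitySketch` and its methods are hypothetical names introduced only for this example.

```java
import java.util.EnumSet;
import java.util.Set;

// Local stand-in for Spark's TableCapability; only the values named in this PR.
enum Capability { BATCH_READ, BATCH_WRITE, TRUNCATE, OVERWRITE_BY_FILTER }

// Hypothetical helper class, for illustration only.
class CapabilitySketch {

    // Default capability set described above for the file-based table.
    static Set<Capability> defaultCapabilities() {
        return EnumSet.of(Capability.BATCH_READ, Capability.BATCH_WRITE);
    }

    // Capability set after the fix: the defaults plus OVERWRITE_BY_FILTER.
    static Set<Capability> patchedCapabilities() {
        Set<Capability> caps = EnumSet.copyOf(defaultCapabilities());
        caps.add(Capability.OVERWRITE_BY_FILTER);
        return caps;
    }

    // Assumed shape of the requirement: an overwrite-by-expression plan
    // needs TRUNCATE or OVERWRITE_BY_FILTER on the target table.
    static boolean supportsOverwriteByExpression(Set<Capability> caps) {
        return caps.contains(Capability.TRUNCATE)
            || caps.contains(Capability.OVERWRITE_BY_FILTER);
    }

    public static void main(String[] args) {
        // Before the fix: neither required capability is present.
        System.out.println(supportsOverwriteByExpression(defaultCapabilities())); // prints false
        // After the fix: OVERWRITE_BY_FILTER satisfies the requirement.
        System.out.println(supportsOverwriteByExpression(patchedCapabilities())); // prints true
    }
}
```

In this model the default set fails the check and the patched set passes, which is the behavior change the PR description attributes to adding OVERWRITE_BY_FILTER to the trait.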

Tests

CI

@kerwin-zk kerwin-zk force-pushed the overwrite_by_filter branch from a0559a8 to e4ca1b8 Compare March 25, 2026 02:32
@kerwin-zk kerwin-zk closed this Mar 25, 2026
@kerwin-zk kerwin-zk reopened this Mar 25, 2026
@kerwin-zk kerwin-zk force-pushed the overwrite_by_filter branch from ec8f786 to cac8974 Compare March 25, 2026 08:17
@YannByron
Contributor

+1

@YannByron YannByron merged commit c37503a into apache:master Mar 25, 2026
12 checks passed

2 participants