[WIP] Replace transactions rebase onto refreshed metadata #15904
Draft
smaheshwar-pltr wants to merge 2 commits into apache:main from
smaheshwar-pltr
commented
Apr 7, 2026
    private TableMetadata startingMetadataFor(TableMetadata refreshed) {
      return switch (type) {
        case REPLACE_TABLE, CREATE_OR_REPLACE_TABLE ->
            refreshed.buildReplacement(
Contributor
Author
Note: new field IDs will be assigned here; need to think about this.
Supersedes #15092.
Motivation
There are a few issues related to table replaces.
`BaseTransaction.commitReplaceTransaction()` does not re-apply replacement and transaction updates onto refreshed metadata. When concurrent changes occur, the transaction therefore commits stale metadata.

When a `REPLACE` transaction commits after concurrent changes (appends, snapshot expiration, other replaces), it overwrites those changes with stale metadata. This can lead to snapshot history loss, and concurrent snapshot expiration can even cause table corruption. (#15090)

V3 tables require that `snapshot.first-row-id >= table.next-row-id` when adding a snapshot. The snapshot's `first-row-id` is set from `base.nextRowId()` when the snapshot is produced.

With REST catalogs, updates are sent to the server, which generally applies them to its current metadata. If a concurrent commit advanced the server's `next-row-id`, the snapshot's `first-row-id` (based on stale metadata) will be behind it.

The resulting failure is returned as `CommitFailedException` so the client can retry, but `commitReplaceTransaction` retries with the same stale `current`: the snapshot still has the old `first-row-id`, so it fails every time. Therefore, I believe that in V3, any concurrent snapshot change (append, compaction, other replace) causes the replace to fail entirely. (#15905)

Less severe, but there are currently behaviour differences in concurrent replaces for REST vs non-REST catalogs due to this. E.g. for REST catalogs, properties are sent as a `SetProperties` delta and the server generally merges them via `putAll`, so concurrent property additions that have succeeded survive a concurrent table replace. For non-REST catalogs they don't: the full `TableMetadata` object is committed directly, so the stale `current` overwrites all concurrent property changes.

This PR
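As a rough illustration of the V3 row-ID failure described under Motivation, and of why rebasing onto refreshed metadata avoids it, here is a minimal, self-contained sketch. Everything in it (`RowIdRebaseSketch`, `Server`, `CommitFailed`, `commitWithStaleBase`, `commitWithRebase`) is a hypothetical stand-in rather than Iceberg code, and the server is simplified to track only `next-row-id`.

```java
// Hypothetical stand-ins for illustration only; none of these are Iceberg classes.
final class RowIdRebaseSketch {

  /** Thrown when the simulated server rejects a commit. */
  static final class CommitFailed extends RuntimeException {
    CommitFailed(String message) {
      super(message);
    }
  }

  /** Simulated catalog server that tracks only the table's next-row-id. */
  static final class Server {
    private long nextRowId;

    Server(long nextRowId) {
      this.nextRowId = nextRowId;
    }

    long nextRowId() {
      return nextRowId;
    }

    /** Enforces the rule snapshot.first-row-id >= table.next-row-id. */
    void addSnapshot(long firstRowId, long addedRows) {
      if (firstRowId < nextRowId) {
        throw new CommitFailed(
            "first-row-id " + firstRowId + " is behind next-row-id " + nextRowId);
      }
      // Simplification: the new next-row-id is just past the rows of this snapshot.
      nextRowId = firstRowId + addedRows;
    }
  }

  /** Retries with the first-row-id captured when the transaction started. */
  static boolean commitWithStaleBase(Server server, long staleFirstRowId, long rows, int attempts) {
    for (int i = 0; i < attempts; i++) {
      try {
        server.addSnapshot(staleFirstRowId, rows);
        return true;
      } catch (CommitFailed e) {
        // Nothing is refreshed here, so the next attempt is identical to this one.
      }
    }
    return false;
  }

  /** Re-reads next-row-id before every attempt, i.e. rebases onto refreshed state. */
  static boolean commitWithRebase(Server server, long rows, int attempts) {
    for (int i = 0; i < attempts; i++) {
      long firstRowId = server.nextRowId();
      try {
        server.addSnapshot(firstRowId, rows);
        return true;
      } catch (CommitFailed e) {
        // A concurrent commit won the race; loop to refresh and try again.
      }
    }
    return false;
  }

  public static void main(String[] args) {
    Server server = new Server(100);
    long stale = server.nextRowId();            // the replace transaction starts here
    server.addSnapshot(server.nextRowId(), 50); // a concurrent append advances next-row-id
    System.out.println("stale retries succeeded: " + commitWithStaleBase(server, stale, 10, 4));
    System.out.println("rebased commit succeeded: " + commitWithRebase(server, 10, 4));
  }
}
```

In this sketch the stale retry loop never succeeds, no matter how many attempts it gets, because each attempt resubmits the same `first-row-id`. The rebasing loop succeeds once it has re-read the server's state.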
This PR makes replace (and createOrReplace) transactions rebase their changes onto refreshed table metadata, using the same `applyUpdates` mechanism that `commitSimpleTransaction` already uses.

The `start` metadata (the initial `buildReplacement` result) is stored on `BaseTransaction` so that the replacement can be rebuilt.

Also: in `RESTTableOperations`, the `replaceBase` field previously used to generate requirements is removed. Requirements are now generated from `base` and kept in sync via `applyUpdates`.

Noting: