[HUDI-7183] Fix static insert overwrite partitions issue #10254
cc @boneanxs for the review
}
val overwriteMode = if (isOverWriteTable) SaveMode.Overwrite else SaveMode.Append
val staticPartitions = if (isStaticOverwrite && !isOverWriteTable) {
  val fileIndex = HoodieFileIndex(sparkSession, catalogTable.metaClient, None, combinedOpts, FileStatusCache.getOrCreate(sparkSession))
cc @beyond1920: this PR aligns with Spark, where insert overwrite with an empty dataset overwrites the old data only in static overwrite mode. This is a breaking change relative to #9811. Does that work for you?
The behavior makes sense only with static partitioning. #9811 covers the empty input DataFrame case, which is relevant only with static partitioning.
Sorry for the late response. @boneanxs I think the behavior is reasonable.
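To make the semantics discussed in this thread concrete, here is a minimal sketch in plain Scala. It is a toy model, not Hudi or Spark code: the `OverwriteSemantics` object, the `Table` alias, and both function names are hypothetical, and a table is modeled as a map from partition path to rows. It assumes that static overwrite replaces every targeted partition, while dynamic overwrite replaces only partitions that receive rows.

```scala
object OverwriteSemantics {
  // Toy model (assumption): a table maps a partition path to the rows stored in it.
  type Table = Map[String, Seq[String]]

  // Static overwrite: every targeted partition is dropped first, so an empty
  // `incoming` leaves the targeted partitions empty.
  // Assumes all keys of `incoming` are among `targeted`.
  def staticOverwrite(table: Table, targeted: Set[String], incoming: Table): Table =
    (table -- targeted) ++ incoming

  // Dynamic overwrite: only partitions that actually receive rows are replaced,
  // so an empty `incoming` leaves the table unchanged.
  def dynamicOverwrite(table: Table, incoming: Table): Table =
    table ++ incoming
}
```

With an empty input, `staticOverwrite` removes the old data in the targeted partitions, whereas `dynamicOverwrite` returns the table as it was.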
}
val staticOverwritePartitionPathOpt = getStaticOverwritePartitionPath(catalogTable, partitionSpec, isOverWritePartition)
Can the old method be deleted?
Done.
@wecharyu: can you look into the test case failures?
@hudi-bot run azure
LGTM
Change Logs
Record matching static partitions for static insert_overwrite operation;
Read replaced partition fileIds through write status for dynamic insert overwrite.
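As an illustration of what "matching static partitions" can mean for a table with several partition columns, the following sketch filters existing partition paths against a static partition spec. It is not the PR's implementation (the diff above appears to build a `HoodieFileIndex` for this step); the `StaticPartitionMatch` object and its methods are hypothetical, and hive-style paths such as `dt=2023-01-01/hh=10` are assumed.

```scala
object StaticPartitionMatch {
  // Parse a hive-style path like "dt=2023-01-01/hh=10" into column -> value pairs.
  // Assumption: every path segment has the form "column=value".
  def parse(path: String): Map[String, String] =
    path.split("/").iterator.map { seg =>
      val Array(k, v) = seg.split("=", 2)
      k -> v
    }.toMap

  // Keep the paths whose values agree with every column fixed in the static spec.
  // Columns absent from the spec are unconstrained.
  def matching(existing: Seq[String], staticSpec: Map[String, String]): Seq[String] =
    existing.filter { p =>
      val cols = parse(p)
      staticSpec.forall { case (k, v) => cols.get(k).contains(v) }
    }
}
```

For example, a spec that fixes only `dt` selects every `hh` sub-partition under that date, which is the multi-level case a single partition path cannot express.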
Impact
Fix the issue where static insert overwrite does not work for tables with multiple partition levels.
Risk level (write none, low medium or high below)
low.
Documentation Update
none.
Contributor's checklist