[HUDI-5955] fix incremental clean not work caused by archive #8232
Closed
hbgstc123 wants to merge 1 commit into apache:master
danny0405
reviewed
Mar 20, 2023
        .orElse(true)
    ).filter(s ->
        latestCleanRetainedInstant.map(instant ->
            compareTimestamps(s.getTimestamp(), LESSER_THAN, instant))
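Read in isolation, the filter in this diff appears to keep an instant as an archive candidate only when its timestamp is strictly earlier than the instant retained by the latest clean. The following is a minimal standalone sketch of that rule, not the Hudi implementation: the class `ArchiveFilterSketch` and the method `archivable` are hypothetical, instants are modeled as plain timestamp strings, and timestamps are assumed to compare lexicographically.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;
import java.util.stream.Collectors;

// Hypothetical illustration only; none of these names are Hudi APIs.
final class ArchiveFilterSketch {

    // Returns the candidates that may be archived: those strictly earlier than
    // the retained instant. If no retained instant is known, nothing is held back.
    // Assumption: instant timestamps are fixed-width strings, so String.compareTo
    // orders them chronologically.
    static List<String> archivable(List<String> candidates,
                                   Optional<String> earliestCommitToRetain) {
        return candidates.stream()
            .filter(ts -> earliestCommitToRetain
                .map(retained -> ts.compareTo(retained) < 0)
                .orElse(true))
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> candidates = Arrays.asList("20230301", "20230310", "20230320");
        // Instants at or after the retained one stay in the active timeline.
        System.out.println(archivable(candidates, Optional.of("20230310")));
    }
}
```

With this rule, an instant equal to or later than the retained one is never archived, which is the behavior the PR description states.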
Contributor
I understand the failure case for incremental cleaning, but blocking the archiving of instants because of cleaning is kind of risky; it would be better to fix the incremental cleaning procedure itself.
Contributor
Author
We can fall back to full cleaning if any retained commits have been archived.
Contributor
Author
Or just use the archived timeline to continue the incremental clean.
Contributor
Using the archived timeline should also be fine, as long as it is not loaded at high frequency.
Contributor
Author
I submitted a new PR that falls back to a full clean if an instant needed for the incremental clean has been archived.
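The fallback described in this comment can be sketched as a small decision function. This is an illustration under stated assumptions, not the code of the follow-up PR: `CleanModeSketch`, `CleanMode`, and `chooseCleanMode` are hypothetical names, instants are modeled as timestamp strings that compare lexicographically, and "archived" is simplified to mean "earlier than the first instant still in the active timeline".

```java
import java.util.Optional;

// Hypothetical illustration only; none of these names are Hudi APIs.
final class CleanModeSketch {

    enum CleanMode { INCREMENTAL, FULL }

    // Chooses incremental cleaning only when the instant it must start from
    // (the earliestCommitToRetain of the last completed clean) is still present
    // in the active timeline. Otherwise it falls back to a full clean.
    static CleanMode chooseCleanMode(Optional<String> lastEarliestCommitToRetain,
                                     Optional<String> firstActiveInstant) {
        // Without a previous clean or an active timeline there is nothing to
        // increment from, so a full clean is the safe choice.
        if (!lastEarliestCommitToRetain.isPresent() || !firstActiveInstant.isPresent()) {
            return CleanMode.FULL;
        }
        // Simplifying assumption: an instant earlier than the first active
        // instant has been archived.
        boolean archived =
            lastEarliestCommitToRetain.get().compareTo(firstActiveInstant.get()) < 0;
        return archived ? CleanMode.FULL : CleanMode.INCREMENTAL;
    }

    public static void main(String[] args) {
        // The retained instant predates the active timeline, so it was archived.
        System.out.println(chooseCleanMode(Optional.of("20230301"), Optional.of("20230310")));
    }
}
```

Compared with blocking the archiver, this approach leaves archiving untouched and pays the cost of a full clean only in the case where the incremental starting point is no longer available.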
Change Logs
Currently, the incremental clean action may miss data files that should be cleaned if the commit instants of those data files have been archived.
This PR makes sure that, when incremental clean is enabled, HoodieTimelineArchiver won't archive commit instants later than or equal to the earliestCommitToRetain of the last completed clean instant, so that the clean executor can find those files in the active timeline.
Impact
no
Risk level (write none, low medium or high below)
low
Documentation Update
no