Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
KAFKA-15107: Support custom metadata for remote log segment #13984
KAFKA-15107: Support custom metadata for remote log segment #13984
Changes from 1 commit
387c64d
621a41b
b988c9a
42abc8c
5d8a53f
34b05ad
9289d8a
a16156f
27fc3a0
6869dc9
cad75d2
473567f
3c8bac2
d564728
File filter
Filter by extension
Conversations
Jump to
There are no files selected for viewing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do not we need to add respective delete_segment_started and delete_segment finished events?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here's my reasoning why not: #13984 (comment)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That looks reasonable to me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure if we can leave the error here and not to do anything. If we failed to delete the segment and failed the RLMTask here, then how does the operator knows if we need to delete the latest segment or not? If we don't even care if the segment is deleted or not at all, why did we have to delete it anyway? Thoughts?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If this error message doesn't appear, it means the deletion was successful. However, it may be a good idea to make this explicit, so I added info logging for successful deletion and a suggestion to clean manually in case of an error.
On the high level, you're right, we don't care: the operator will fix the issue by increasing the custom metadata size limit or more radically by deleting the topic or disabling remote storage for it. In any case, there shouldn't be any garbage on the remote storage.C
leaning may matter in the case when the placement on the remote storage is nondeterministic. In this case, the subsequent write attempt will not overwrite the files but will write new ones thus leaving some garbage. So to combat this and generally as a measure of hygiene, the best-effort attempt for deletion is made.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The approach taken looks good. Even if we fail to delete the last uploaded segment on error, it will be marked as unreferenced segment. And, when the RLM task is enabled for the topic, it will be removed in the regular segment cleanup cycle.