Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

KAFKA-15802: validate remote segment state before fetching index #14727

Merged
merged 3 commits into from Nov 14, 2023

Conversation

jeqo
Copy link
Contributor

@jeqo jeqo commented Nov 9, 2023

See commits for more details

[KAFKA-15802]

Committer Checklist (excluded from commit message)

  • Verify design and implementation
  • Verify test coverage and CI build status
  • Verify documentation (including upgrade notes)

Copy link
Contributor

@showuon showuon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@satishd satishd self-requested a review November 10, 2023 08:09
Copy link
Collaborator

@kamalcph kamalcph left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@hudeqi hudeqi self-requested a review November 13, 2023 02:46
@divijvaidya divijvaidya added the tiered-storage Pull requests associated with KIP-405 (Tiered Storage) label Nov 13, 2023
Copy link
Contributor

@divijvaidya divijvaidya left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. You will need a similar filter in cleanupExpiredRemoteLogSegments() method as well where call listRemoteLogSegments() to fetch the metadata. Over there we want to iterate through segments which are in copy_finished state OR delete_started state (otherwise same error you faced in this JIRA will surface there).

  2. Please add integration test which fails prior to this change for both the changes, one which you have added and another which I suggested in point 1 above.

@divijvaidya divijvaidya added the backport-candidate This pull request is a candidate to be backported to previous versions label Nov 13, 2023
@jeqo jeqo force-pushed the jeqo/kafka-15802 branch 2 times, most recently from ec765c7 to cd6b1f8 Compare November 13, 2023 14:34
@jeqo
Copy link
Contributor Author

jeqo commented Nov 13, 2023

@divijvaidya good catch!
Although, in the case of delete there's no exception as deleting a segment that is not present on remote does not throw an exception as it's idempotent. Regardless, it make sense to add this validation to save us a remote call and only call delete segment when appropriate.

// since the checkpoint file was already truncated.
boolean shouldDeleteSegment = remoteLogRetentionHandler.isSegmentBreachByLogStartOffset(

if (SEGMENT_DELETION_VALID_STATES.contains(metadata.state())) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

since there is no "else" part to it, perhaps short circuiting the if condition with

if (!SEGMENT_DELETION_VALID_STATES.contains(metadata.state())) {
   continue;
}

would provide more readability.

(I don't have a strong opinion on this, hence, let me know if you wish to keep it as as it)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree, applying this fix.

@@ -133,6 +134,10 @@ public class RemoteLogManager implements Closeable {

private static final Logger LOGGER = LoggerFactory.getLogger(RemoteLogManager.class);
private static final String REMOTE_LOG_READER_THREAD_NAME_PREFIX = "remote-log-reader";
private static final Set<RemoteLogSegmentState> SEGMENT_DELETION_VALID_STATES = Collections.unmodifiableSet(EnumSet.of(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In this diagram:

copy_started -> delete_started is also a valid transition. Although, I can't think of a scenario where this would be valid because we always complete copy before calling expiration. Are we missing something here by not including copy_started?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great catch. I guess the transition hasn't considered the current implementation where copying and deletion are sequential. If the implementation changes, it may be possible to have this scenario.
I'd say it should be considered a valid transition. I will include it in the valid states and adapt the test case.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

COPY_SEGMENT_STARTED segments are eligible for deletion when those segments were not able to be copied by the leader as the leader went through ungraceful shutdown or for any oher reasons. New leader may pickup the resepctive segments for the targeted offsets that need to be copied and the earlier failed segment will remain in the COPY_SEGMENT_STARTED state and it will eventually be deleted by retention cleanup logic.

So, COPY_SEGMENT_STARTED is a valid transition even now when copy and deletion are happening sequentially.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@satishd in that case wouldn't the new leader retry to upload the segment? If a segment is left on copy_started state, then it'd risk some segment not being uploaded. Is this expected?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jeqo Leader will upload the required segment with a new UUID by finding the offset that needs to be copied as mentioned here. It will not change the state of the failed copy of an earlier segment.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Aha! I can see a new segment id being generated here.
Got it, so there may be segment that will stay on copy_started until eventually deleted. Then, the way to assess if this is not an issue is to check that another topic-id-partition-offset combination with a different segment id has copy_finished, isn't it?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, one way to check is whether there is a segment associated with the target offset or not by using RLMM#remoteLogSegmentMetadata().

Copy link
Member

@satishd satishd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @jeqo for the PR, Left a couple of comment/clarifications.

@@ -133,6 +134,10 @@ public class RemoteLogManager implements Closeable {

private static final Logger LOGGER = LoggerFactory.getLogger(RemoteLogManager.class);
private static final String REMOTE_LOG_READER_THREAD_NAME_PREFIX = "remote-log-reader";
private static final Set<RemoteLogSegmentState> SEGMENT_DELETION_VALID_STATES = Collections.unmodifiableSet(EnumSet.of(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

COPY_SEGMENT_STARTED segments are eligible for deletion when those segments were not able to be copied by the leader as the leader went through ungraceful shutdown or for any oher reasons. New leader may pickup the resepctive segments for the targeted offsets that need to be copied and the earlier failed segment will remain in the COPY_SEGMENT_STARTED state and it will eventually be deleted by retention cleanup logic.

So, COPY_SEGMENT_STARTED is a valid transition even now when copy and deletion are happening sequentially.

Currently, the RLMM is fetching the index even when remote segment metadata state says upload has not complete yet.
Workaround using current APIs to only fetch remote indexes when segment is properly uploaded.
Also, the RLMM is deleting segments even when remote segment metadata state says upload has not complete yet.
Only consider segments where copy has started or completed, or where delete has started.

[KAFKA-15802](https://issues.apache.org/jira/browse/KAFKA-15802)
@satishd
Copy link
Member

satishd commented Nov 14, 2023

@jeqo Please maintain the commit history if possible as it helps in reviewing the updated commits in the PR. It is fine if you need to rebase and merge the commits for valid reasons.

Copy link
Collaborator

@kamalcph kamalcph left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall LGTM, left minor comments.

verifyDeleteLogSegment(segmentMetadataList, deletableSegmentCount, currentLeaderEpoch);
}

@ParameterizedTest(name = "testDeleteLogSegmentDueToRetentionTimeBreach segmentCount={0} deletableSegmentCount={1}")
@CsvSource(value = {"50, 0", "50, 1", "50, 23", "50, 50"})
@CsvSource(value = {"50, 0, DELETE_SEGMENT_STARTED", "50, 1, COPY_SEGMENT_FINISHED", "50, 23, DELETE_SEGMENT_STARTED", "50, 50, COPY_SEGMENT_FINISHED"})
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we skip updating the existing tests?

There is already a test (testDeleteRetentionMsBeforeSegmentReady) to verify that if the state is DELETE_SEGMENT_FINISHED, no interaction is expected with the RemoteStorageManager.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we also rename the test?

testDeleteRetentionMsBeforeSegmentReady to testDeleteRetentionMsOnExpiredSegment (expired/invalid/something-similar)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree, applying suggestion

@@ -988,6 +998,10 @@ void cleanupExpiredRemoteLogSegments() throws RemoteStorageException, ExecutionE
return;
}
RemoteLogSegmentMetadata metadata = segmentsIterator.next();

if (!SEGMENT_DELETION_VALID_STATES.contains(metadata.state())) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can this condition be simplified to?

if (RemoteLogSegmentState.DELETE_SEGMENT_FINISHED.equals(metadata.state())) {
    continue;
}

If we want to add/remove state, we can change the logic later.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea, applying change.

@jeqo jeqo requested a review from kamalcph November 14, 2023 09:49
Copy link
Collaborator

@kamalcph kamalcph left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

Copy link
Member

@satishd satishd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @jeqo for addressing the review comments. LGTM.

@satishd
Copy link
Member

satishd commented Nov 14, 2023

There are a few unrelated test failures, merging the changes to 3.6 branch.

@jeqo Please raise PR against trunk by picking this change and let me know, will review and merge it when there are no related failures in Jenkins job.

@satishd satishd merged commit 5fa2a24 into apache:3.6 Nov 14, 2023
1 check failed
@jeqo jeqo deleted the jeqo/kafka-15802 branch November 14, 2023 17:18
@jeqo jeqo restored the jeqo/kafka-15802 branch November 14, 2023 17:19
jeqo added a commit to jeqo/kafka that referenced this pull request Nov 14, 2023
…che#14727)

Reviewers: Satish Duggana <satishd@apache.org>, Divij Vaidya <diviv@amazon.com>, Christo Lolov <lolovc@amazon.com>, Luke Chen <showuon@gmail.com>, Kamal Chandraprakash<kamal.chandraprakash@gmail.com>
@jeqo jeqo deleted the jeqo/kafka-15802 branch November 14, 2023 19:24
jeqo added a commit to aiven/kafka that referenced this pull request Nov 15, 2023
…che#14727)

Reviewers: Satish Duggana <satishd@apache.org>, Divij Vaidya <diviv@amazon.com>, Christo Lolov <lolovc@amazon.com>, Luke Chen <showuon@gmail.com>, Kamal Chandraprakash<kamal.chandraprakash@gmail.com>
satishd pushed a commit that referenced this pull request Nov 16, 2023
) (#14759)

Reviewers: Satish Duggana <satishd@apache.org>, Divij Vaidya <diviv@amazon.com>, Christo Lolov <lolovc@amazon.com>, Luke Chen <showuon@gmail.com>, Kamal Chandraprakash<kamal.chandraprakash@gmail.com>
mjsax pushed a commit to confluentinc/kafka that referenced this pull request Nov 22, 2023
…che#14727)

Reviewers: Satish Duggana <satishd@apache.org>, Divij Vaidya <diviv@amazon.com>, Christo Lolov <lolovc@amazon.com>, Luke Chen <showuon@gmail.com>, Kamal Chandraprakash<kamal.chandraprakash@gmail.com>
rreddy-22 pushed a commit to rreddy-22/kafka-rreddy that referenced this pull request Jan 2, 2024
…che#14727) (apache#14759)

Reviewers: Satish Duggana <satishd@apache.org>, Divij Vaidya <diviv@amazon.com>, Christo Lolov <lolovc@amazon.com>, Luke Chen <showuon@gmail.com>, Kamal Chandraprakash<kamal.chandraprakash@gmail.com>
anurag-harness pushed a commit to anurag-harness/kafka that referenced this pull request Feb 9, 2024
…che#14727) (apache#14759)

Reviewers: Satish Duggana <satishd@apache.org>, Divij Vaidya <diviv@amazon.com>, Christo Lolov <lolovc@amazon.com>, Luke Chen <showuon@gmail.com>, Kamal Chandraprakash<kamal.chandraprakash@gmail.com>
anurag-harness added a commit to anurag-harness/kafka that referenced this pull request Feb 9, 2024
…che#14727) (apache#14759) (#244)

Reviewers: Satish Duggana <satishd@apache.org>, Divij Vaidya <diviv@amazon.com>, Christo Lolov <lolovc@amazon.com>, Luke Chen <showuon@gmail.com>, Kamal Chandraprakash<kamal.chandraprakash@gmail.com>

Co-authored-by: Jorge Esteban Quilcate Otoya <jorge.quilcate@aiven.io>
yyu1993 pushed a commit to yyu1993/kafka that referenced this pull request Feb 15, 2024
…che#14727) (apache#14759)

Reviewers: Satish Duggana <satishd@apache.org>, Divij Vaidya <diviv@amazon.com>, Christo Lolov <lolovc@amazon.com>, Luke Chen <showuon@gmail.com>, Kamal Chandraprakash<kamal.chandraprakash@gmail.com>
clolov pushed a commit to clolov/kafka that referenced this pull request Apr 5, 2024
…che#14727) (apache#14759)

Reviewers: Satish Duggana <satishd@apache.org>, Divij Vaidya <diviv@amazon.com>, Christo Lolov <lolovc@amazon.com>, Luke Chen <showuon@gmail.com>, Kamal Chandraprakash<kamal.chandraprakash@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport-candidate This pull request is a candidate to be backported to previous versions tiered-storage Pull requests associated with KIP-405 (Tiered Storage)
Projects
None yet
5 participants