Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

KAFKA-15479: Remote log segments should be considered once for retention breach #14407

Merged
merged 5 commits into from Sep 25, 2023

Conversation

kamalcph
Copy link
Collaborator

When a remote log segment contains multiple epoch, then it gets considered for multiple times during breach by retention size/time/start-offset. This will affect the deletion by remote log retention size as it deletes the number of segments lesser than expected. This is a follow-up of KAFKA-15352

Committer Checklist (excluded from commit message)

  • Verify design and implementation
  • Verify test coverage and CI build status
  • Verify documentation (including upgrade notes)

@kamalcph
Copy link
Collaborator Author

@clolov @divijvaidya @showuon @satishd

Call for review. PTAL.

@@ -993,7 +988,9 @@ private void cleanupExpiredRemoteLogSegments() throws RemoteStorageException, Ex
return;
}
RemoteLogSegmentMetadata metadata = segmentsIterator.next();

if (segmentsToDelete.contains(metadata)) {
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the main fix.

@divijvaidya
Copy link
Contributor

I had a comment to fix this/similar problem in original PR - #13561 (comment)

I am curious, why didn't the test which was added to resolve the comment fail? Is it because the test only checked for eligibility of a segment for calculation of size and didn't actually check if same segment is being counted twice?

@divijvaidya divijvaidya added the tiered-storage Pull requests associated with KIP-405 (Tiered Storage) label Sep 19, 2023
@kamalcph
Copy link
Collaborator Author

kamalcph commented Sep 19, 2023

I had a comment to fix this/similar problem in original PR - #13561 (comment)

I am curious, why didn't the test which was added to resolve the comment fail? Is it because the test only checked for eligibility of a segment for calculation of size and didn't actually check if same segment is being counted twice?

We don't have a test for the scenario mentioned in the comment. And, there was a regression after #14349. Added a couple of tests to verify the deleted segment count and log-start-offset.

@clolov
Copy link
Collaborator

clolov commented Sep 20, 2023

I will review this in a couple of hours, apologies for the delay

@divijvaidya divijvaidya self-requested a review September 20, 2023 10:59
@satishd satishd self-requested a review September 20, 2023 15:20
…ion breach

When a remote log segment contains multiple epoch, then it gets considered for multiple times during breach by retention size/time/start-offset. This will affect the deletion by remote log retention size as it deletes the number of segments lesser than expected. This is a follow-up of KAFKA-15352
@kamalcph
Copy link
Collaborator Author

@clolov @divijvaidya @showuon @satishd

Test failures are unrelated. Please take another look!

Copy link
Contributor

@divijvaidya divijvaidya left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me! Test failures are unrelated.

[Build / JDK 11 and Scala 2.13 / kafka.api.ConsumerBounceTest.testSeekAndCommitWithBrokerFailures()](https://ci-builds.apache.org/job/Kafka/job/kafka-pr/job/PR-14407/6/testReport/junit/kafka.api/ConsumerBounceTest/Build___JDK_11_and_Scala_2_13___testSeekAndCommitWithBrokerFailures__/)
[Build / JDK 11 and Scala 2.13 / org.apache.kafka.streams.integration.NamedTopologyIntegrationTest.shouldRemoveOneNamedTopologyWhileAnotherContinuesProcessing()](https://ci-builds.apache.org/job/Kafka/job/kafka-pr/job/PR-14407/6/testReport/junit/org.apache.kafka.streams.integration/NamedTopologyIntegrationTest/Build___JDK_11_and_Scala_2_13___shouldRemoveOneNamedTopologyWhileAnotherContinuesProcessing__/)
[Build / JDK 20 and Scala 2.13 / org.apache.kafka.common.network.SslTransportLayerTest.[2] tlsProtocol=TLSv1.2, useInlinePem=true](https://ci-builds.apache.org/job/Kafka/job/kafka-pr/job/PR-14407/6/testReport/junit/org.apache.kafka.common.network/SslTransportLayerTest/Build___JDK_20_and_Scala_2_13____2__tlsProtocol_TLSv1_2__useInlinePem_true/)
[Build / JDK 20 and Scala 2.13 / kafka.api.TransactionsTest.testBumpTransactionalEpoch(String).quorum=kraft](https://ci-builds.apache.org/job/Kafka/job/kafka-pr/job/PR-14407/6/testReport/junit/kafka.api/TransactionsTest/Build___JDK_20_and_Scala_2_13___testBumpTransactionalEpoch_String__quorum_kraft/)
[Build / JDK 20 and Scala 2.13 / org.apache.kafka.streams.integration.NamedTopologyIntegrationTest.shouldBackOffTaskAndEmitDataWithinSameTopology()](https://ci-builds.apache.org/job/Kafka/job/kafka-pr/job/PR-14407/6/testReport/junit/org.apache.kafka.streams.integration/NamedTopologyIntegrationTest/Build___JDK_20_and_Scala_2_13___shouldBackOffTaskAndEmitDataWithinSameTopology__/)
[Build / JDK 17 and Scala 2.13 / kafka.server.DescribeClusterRequestTest.testDescribeClusterRequestExcludingClusterAuthorizedOperations(String).quorum=kraft](https://ci-builds.apache.org/job/Kafka/job/kafka-pr/job/PR-14407/6/testReport/junit/kafka.server/DescribeClusterRequestTest/Build___JDK_17_and_Scala_2_13___testDescribeClusterRequestExcludingClusterAuthorizedOperations_String__quorum_kraft/)
[Build / JDK 17 and Scala 2.13 / org.apache.kafka.trogdor.coordinator.CoordinatorTest.testTaskRequestWithOldStartMsGetsUpdated()](https://ci-builds.apache.org/job/Kafka/job/kafka-pr/job/PR-14407/6/testReport/junit/org.apache.kafka.trogdor.coordinator/CoordinatorTest/Build___JDK_17_and_Scala_2_13___testTaskRequestWithOldStartMsGetsUpdated__/)

@divijvaidya
Copy link
Contributor

I will wait for other reviewers on this PR (cc: @satishd @clolov) to provide an approval in the next few days before merging this.

@divijvaidya
Copy link
Contributor

@satishd do you have any additional comments on this PR? @clolov (the other reviewer in this PR) has already approved and we are waiting for you.

Copy link
Member

@satishd satishd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @kamalcph for the PR and addressing the comments. LGTM.

@satishd
Copy link
Member

satishd commented Sep 25, 2023

Merging it to trunk as the failed tests are not related to the changes introduced in this PR.

@satishd satishd merged commit a150120 into apache:trunk Sep 25, 2023
1 check failed
@kamalcph kamalcph deleted the KAFKA-15479 branch September 25, 2023 16:50
jeqo pushed a commit to aiven/kafka that referenced this pull request Sep 27, 2023
…ion breach (apache#14407)

When a remote log segment contains multiple epoch, then it gets considered for multiple times during breach by retention size/time/start-offset. This will affect the deletion by remote log retention size as it deletes the number of segments less than expected. This is a follow-up of KAFKA-15352

Reviewers: Divij Vaidya <diviv@amazon.com>, Christo Lolov <lolovc@amazon.com>, Satish Duggana <satishd@apache.org>
rreddy-22 pushed a commit to rreddy-22/kafka-rreddy that referenced this pull request Oct 3, 2023
…ion breach (apache#14407)

When a remote log segment contains multiple epoch, then it gets considered for multiple times during breach by retention size/time/start-offset. This will affect the deletion by remote log retention size as it deletes the number of segments less than expected. This is a follow-up of KAFKA-15352

Reviewers: Divij Vaidya <diviv@amazon.com>, Christo Lolov <lolovc@amazon.com>, Satish Duggana <satishd@apache.org>
jeqo pushed a commit to aiven/kafka that referenced this pull request Oct 4, 2023
…ion breach (apache#14407)

When a remote log segment contains multiple epoch, then it gets considered for multiple times during breach by retention size/time/start-offset. This will affect the deletion by remote log retention size as it deletes the number of segments less than expected. This is a follow-up of KAFKA-15352

Reviewers: Divij Vaidya <diviv@amazon.com>, Christo Lolov <lolovc@amazon.com>, Satish Duggana <satishd@apache.org>
showuon pushed a commit that referenced this pull request Oct 19, 2023
…ion breach (#14407)

When a remote log segment contains multiple epoch, then it gets considered for multiple times during breach by retention size/time/start-offset. This will affect the deletion by remote log retention size as it deletes the number of segments less than expected. This is a follow-up of KAFKA-15352

Reviewers: Divij Vaidya <diviv@amazon.com>, Christo Lolov <lolovc@amazon.com>, Satish Duggana <satishd@apache.org>
k-wall pushed a commit to k-wall/kafka that referenced this pull request Nov 21, 2023
…ion breach (apache#14407)

When a remote log segment contains multiple epoch, then it gets considered for multiple times during breach by retention size/time/start-offset. This will affect the deletion by remote log retention size as it deletes the number of segments less than expected. This is a follow-up of KAFKA-15352

Reviewers: Divij Vaidya <diviv@amazon.com>, Christo Lolov <lolovc@amazon.com>, Satish Duggana <satishd@apache.org>
mjsax pushed a commit to confluentinc/kafka that referenced this pull request Nov 22, 2023
…ion breach (apache#14407)

When a remote log segment contains multiple epoch, then it gets considered for multiple times during breach by retention size/time/start-offset. This will affect the deletion by remote log retention size as it deletes the number of segments less than expected. This is a follow-up of KAFKA-15352

Reviewers: Divij Vaidya <diviv@amazon.com>, Christo Lolov <lolovc@amazon.com>, Satish Duggana <satishd@apache.org>
anurag-harness pushed a commit to anurag-harness/kafka that referenced this pull request Feb 9, 2024
…ion breach (apache#14407)

When a remote log segment contains multiple epoch, then it gets considered for multiple times during breach by retention size/time/start-offset. This will affect the deletion by remote log retention size as it deletes the number of segments less than expected. This is a follow-up of KAFKA-15352

Reviewers: Divij Vaidya <diviv@amazon.com>, Christo Lolov <lolovc@amazon.com>, Satish Duggana <satishd@apache.org>
anurag-harness added a commit to anurag-harness/kafka that referenced this pull request Feb 9, 2024
…ion breach (apache#14407) (#33)

When a remote log segment contains multiple epoch, then it gets considered for multiple times during breach by retention size/time/start-offset. This will affect the deletion by remote log retention size as it deletes the number of segments less than expected. This is a follow-up of KAFKA-15352

Reviewers: Divij Vaidya <diviv@amazon.com>, Christo Lolov <lolovc@amazon.com>, Satish Duggana <satishd@apache.org>

Co-authored-by: Kamal Chandraprakash <kchandraprakash@uber.com>
Cerchie pushed a commit to Cerchie/kafka that referenced this pull request Feb 22, 2024
…ion breach (apache#14407)

When a remote log segment contains multiple epoch, then it gets considered for multiple times during breach by retention size/time/start-offset. This will affect the deletion by remote log retention size as it deletes the number of segments less than expected. This is a follow-up of KAFKA-15352

Reviewers: Divij Vaidya <diviv@amazon.com>, Christo Lolov <lolovc@amazon.com>, Satish Duggana <satishd@apache.org>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
tiered-storage Pull requests associated with KIP-405 (Tiered Storage)
Projects
None yet
4 participants