Skip to content

KAFKA-20522: Refine test coverage for consumer close-on-interrupt best-effort behavior.#22138

Open
chickenchickenlove wants to merge 3 commits intoapache:trunkfrom
chickenchickenlove:KAFKA-20522
Open

KAFKA-20522: Refine test coverage for consumer close-on-interrupt best-effort behavior.#22138
chickenchickenlove wants to merge 3 commits intoapache:trunkfrom
chickenchickenlove:KAFKA-20522

Conversation

@chickenchickenlove
Copy link
Copy Markdown
Contributor

Description

In the previous, PlaintextConsumerTest.testClassicConsumerCloseLeavesGroupOnInterrupt were flaky.

The reason is that LeaveGroup on consumer close has best-effort semantics, but those tests were written under the assumption that the consumer deterministically leaves the consumer group. In practice, however, the consumer may fail to leave the group within the bounded timeout, for example due to metadata expiration. (#21332)

This PR splits the flaky test into two separate tests.

  1. The existing integration test now verifies that callsToRevoked is invoked when interrupt() is called. This behavior is still deterministic even under best-effort semantics.
  2. A unit test verifies that an ApiKeys.LEAVE_GROUP request is recorded in MockClient.requests() when interrupt() is called.

Previously, even if an ApiKeys.LEAVE_GROUP request was created, request processing could be delayed due to metadata expiration. As a result, the exact point at which the consumer leaves the group was non-deterministic. Therefore, taking the best-effort semantics into account, the unit test only verifies that the ApiKeys.LEAVE_GROUP request is recorded.

Related

@github-actions github-actions Bot added triage PRs from the community core Kafka Broker consumer tests Test fixes (including flaky tests) clients small Small PRs labels Apr 24, 2026
@chickenchickenlove
Copy link
Copy Markdown
Contributor Author

chickenchickenlove commented Apr 30, 2026

@chia7712 @lianetm Hi!
When you get a chance, could you take a look?

Copy link
Copy Markdown
Member

@lianetm lianetm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! Just minor comments.

}

@Test
public void testClassicConsumerCloseAttemptsLeaveGroupWhenInterrupted() {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we test it for the AsyncConsumer too please? We can check that it add the LeaveGroupOnCloseEvent (even though it cannot wait for it because it's Interrupted)

--update
actually, I noticed we already have testCloseLeavesGroupDespiteInterrupt for the Async, nice

Thread.interrupted();
}

//THEN
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not sure this kind of comments add much value?? (given, when, then) What do you think?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are right!
I've addressed it.

@github-actions github-actions Bot removed the triage PRs from the community label May 2, 2026
@chickenchickenlove
Copy link
Copy Markdown
Contributor Author

@lianetm
Thanks for the review!
I've addressed it based on your comments.
When you get a chance, please take another look. 🙇‍♂️

import java.util.concurrent.ExecutionException

@Timeout(60)
class PlaintextConsumerTest extends AbstractConsumerTest {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can safely delete this test class directly, since it is fully covered by testAsyncConsumerCloseRunsRevocationCallbackOnInterrupt and testClassicConsumerCloseRunsRevocationCallbackOnInterrupt. The only reason it was kept before was to preserve the original code that was causing the flakiness

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@chia7712
Thanks for the time to take a look into this!
Yes, you are right. I removed PlaintextConsumerTest.scala class.

@github-actions github-actions Bot removed the small Small PRs label May 4, 2026
@chickenchickenlove
Copy link
Copy Markdown
Contributor Author

@chia7712
Thanks for the review.
I've addressed it based on your comments!
When you have bandwidth, could you please take another look? 🙇‍♂️

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

ci-approved clients consumer core Kafka Broker tests Test fixes (including flaky tests)

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants