Skip to content

KAFKA-20586: Fix flaky LeaderElectionCommandTest.testPreferredReplicaElection#22303

Open
majialoong wants to merge 2 commits into
apache:trunkfrom
majialoong:KAFKA-20586
Open

KAFKA-20586: Fix flaky LeaderElectionCommandTest.testPreferredReplicaElection#22303
majialoong wants to merge 2 commits into
apache:trunkfrom
majialoong:KAFKA-20586

Conversation

@majialoong
Copy link
Copy Markdown
Contributor

@majialoong majialoong commented May 17, 2026

LeaderElectionCommand success only indicates the controller has
written the metadata change, but brokers may not have synced the
corresponding update yet.

The previous helper fetchOrWaitForLeader only waited for any leader
(not the expected one), so assertions could still observe stale leader
state and become flaky.

This change adds waitForExpectedLeader to poll until the expected
leader is observed, eliminating the race in the test path.

@github-actions github-actions Bot added triage PRs from the community tools tests Test fixes (including flaky tests) labels May 17, 2026
private void assertLeader(Admin client, TopicPartition topicPartition, int expectedLeader) throws Exception {
int leader = AdminUtils.fetchOrWaitForLeader(client, topicPartition.topic(), topicPartition.partition(), 30000);
int leader = AdminUtils.fetchOrWaitForExpectedLeader(client, topicPartition.topic(), topicPartition.partition(), expectedLeader, 30000);
assertEquals(expectedLeader, leader);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fetchOrWaitForExpectedLeader method already returns when leader == expectedLEader. This assertion seems redundancy

/**
* Fetch the partition leader or wait until the expected leader is observed using the provided admin client.
*/
public static int fetchOrWaitForExpectedLeader(Admin admin,
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This new method and existing method fetchOrWaitForLeader

are almost identical. There seems to be scope for reducing the duplicate code.

Copy link
Copy Markdown
Contributor

@muralibasani muralibasani left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR. Have a couple of minor comments.

@github-actions github-actions Bot removed the triage PRs from the community label May 18, 2026
@majialoong
Copy link
Copy Markdown
Contributor Author

@muralibasani Thanks for the review. Your suggestions make sense. Updated!

Copy link
Copy Markdown
Contributor

@muralibasani muralibasani left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks for addressing the feedback.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

ci-approved tests Test fixes (including flaky tests) tools

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants