
KAFKA-19354: KRaft observer should fetch from leader after rediscovery #19854

Open: ahuang98 wants to merge 5 commits into base: trunk

@ahuang98 ahuang98 commented May 30, 2025

Observers may get stuck fetching from bootstrap servers even after discovering a leader from a fetch response.

This change allows observers to start fetching from the leader again by resetting their fetch timeout when they discover a leader from a fetch response. (Observers fetch from bootstrap servers once their fetch timeout expires.)

testObserverSendDiscoveryFetchAfterFetchTimeoutAndResumesFetchingFromLeader demonstrates the issue: after a fetch timeout, observers get stuck fetching from random bootstrap servers (the test fails without the change).

testUnattachedWithLeaderCanBecomeFollowerAfterFindingLeader shows that voters are not susceptible to getting stuck fetching from bootstrap servers.
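
The behavior described above can be illustrated with a toy model. This is only a sketch under stated assumptions: `ObserverFetchSketch`, `nextFetchTarget`, `onLeaderDiscovered`, and the `resetOnLeaderDiscovery` flag are hypothetical names made up for illustration and are not part of `KafkaRaftClient` or this PR's diff. It assumes the observer picks its fetch destination purely from whether the fetch timeout has expired.

```java
// Toy model only: all names are hypothetical, not KafkaRaftClient APIs.
final class ObserverFetchSketch {
    enum Target { LEADER, BOOTSTRAP }

    private final long fetchTimeoutMs;
    private final boolean resetOnLeaderDiscovery; // true models the fix, false the old behavior
    private long deadlineMs;

    ObserverFetchSketch(long fetchTimeoutMs, boolean resetOnLeaderDiscovery, long nowMs) {
        this.fetchTimeoutMs = fetchTimeoutMs;
        this.resetOnLeaderDiscovery = resetOnLeaderDiscovery;
        this.deadlineMs = nowMs + fetchTimeoutMs;
    }

    // Once the fetch timeout has expired, fetches go to a bootstrap server.
    Target nextFetchTarget(long nowMs) {
        return nowMs >= deadlineMs ? Target.BOOTSTRAP : Target.LEADER;
    }

    // Called when a fetch response names a leader.
    void onLeaderDiscovered(long nowMs) {
        if (resetOnLeaderDiscovery) {
            deadlineMs = nowMs + fetchTimeoutMs; // refresh the timeout so the leader is used again
        }
    }

    public static void main(String[] args) {
        for (boolean reset : new boolean[] {false, true}) {
            ObserverFetchSketch observer = new ObserverFetchSketch(1000L, reset, 0L);
            long nowMs = 1000L; // the fetch timeout has just expired
            Target before = observer.nextFetchTarget(nowMs);
            observer.onLeaderDiscovered(nowMs);
            Target after = observer.nextFetchTarget(nowMs + 1);
            System.out.println("reset=" + reset + " before=" + before + " after=" + after);
        }
    }
}
```

In this model, an observer that does not reset the deadline keeps choosing a bootstrap server after discovery, while one that resets it returns to the leader.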

@github-actions github-actions bot added triage PRs from the community tests Test fixes (including flaky tests) kraft small Small PRs labels May 30, 2025

context.pollUntilRequest();
fetchRequest = context.assertSentFetchRequest();
assertEquals(leaderId, fetchRequest.destination().id());

@ahuang98 ahuang98 May 30, 2025


this check would fail without the fix

@github-actions github-actions bot removed the triage PRs from the community label May 30, 2025

@jsancio jsancio left a comment


@ahuang98, the tests didn't execute because the code has a compilation error.

@ahuang98 ahuang98 changed the title from "[Do not merge] Test which shows observers stuck fetching from bootstrap servers" to "KAFKA-19354: KRaft observer should fetch from leader after rediscovery" May 30, 2025

@jsancio jsancio left a comment


Does this mean that voters can never rediscover the leader from the bootstrap server? Or does this not impact voters because they transition to prospective and back to follower, which resets the fetch timer?


ahuang98 commented Jun 2, 2025

Thanks for the review, Jose.

Does this mean that voters can never rediscover the leader from the bootstrap server? Or does this not impact voters because they transition to prospective and back to follower, which resets the fetch timer?

Yes, this doesn't impact voters, because prospectives that have a leaderId transition back to follower, which refreshes the fetch timeout.

} else if (leaderId.isPresent()) {
    if (quorum.isFollowerObserver()) {
        // This allows observers to resume fetching from leader after discovering the leader
        // transitionToFollower(epoch, leaderId.getAsInt(), leaderEndpoints, currentTimeMs);

@ahuang98 ahuang98 Jun 9, 2025


This is an alternative solution; it would need to allow follower-to-follower transitions for observers only.
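
A rough sketch of what such an allowance could look like is given below. It is a toy model, not the real `QuorumState` code: `FollowerTransitionSketch` and its members are hypothetical names, and the sketch assumes that a repeated transition to the same leader is otherwise rejected, which the conversation implies but does not show.

```java
// Toy model only: hypothetical names, not the real QuorumState API.
final class FollowerTransitionSketch {
    private final boolean observer;
    private final long fetchTimeoutMs;
    private Integer followedLeaderId; // null until a leader is followed
    private long fetchDeadlineMs;

    FollowerTransitionSketch(boolean observer, long fetchTimeoutMs) {
        this.observer = observer;
        this.fetchTimeoutMs = fetchTimeoutMs;
    }

    // Assumption: a follower-to-follower transition to the same leader is
    // rejected for voters and allowed only for observers.
    void transitionToFollower(int leaderId, long nowMs) {
        boolean sameLeader = followedLeaderId != null && followedLeaderId == leaderId;
        if (sameLeader && !observer) {
            throw new IllegalStateException("Already following leader " + leaderId);
        }
        followedLeaderId = leaderId;
        fetchDeadlineMs = nowMs + fetchTimeoutMs; // the transition refreshes the fetch timeout
    }

    long fetchDeadlineMs() {
        return fetchDeadlineMs;
    }
}
```

Compared with resetting the timeout directly, this variant reuses the transition path to refresh the timer, at the cost of relaxing a state-transition check for one role.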

Labels
ci-approved kraft small Small PRs tests Test fixes (including flaky tests)

3 participants