Fix Stream coordinator leader selection bug #3967

Merged
merged 3 commits into master from tomyouyou-stream_select_leader on Jan 7, 2022

@kjnilsson kjnilsson commented Jan 7, 2022

Fix leader selection issue and make stream coordinator versioned so that it remains deterministic during live upgrades.

Each candidate in the list is a tuple {node, tail}, where the tail is {epoch, offset}.
However, 'select_leader' treats the tail as {offset, epoch}.

Suppose there are two candidates:
[{node1,{1,100}},{node2,{2,99}}] 

'select_leader' picks node1 as the leader instead of node2, which has the larger epoch.
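
To make the mismatch concrete, here is a minimal sketch in Python rather than the Erlang from this PR. It models each candidate as `(node, (epoch, offset))` and compares a selector that misreads the tail as `(offset, epoch)` with one that reads it correctly. The function names `select_leader_misread` and `select_leader_by_epoch` are hypothetical, and the sketch assumes the intended rule is "highest epoch first, then highest offset"; it does not reproduce the actual comparison function in `rabbit_stream_coordinator`.

```python
from typing import List, Tuple

Tail = Tuple[int, int]          # (epoch, offset), as described above
Candidate = Tuple[str, Tail]    # (node, tail)


def select_leader_misread(candidates: List[Candidate]) -> str:
    """Illustrative only: treats the tail as (offset, epoch)."""
    # The second element is wrongly taken to be the epoch, so it dominates.
    return max(candidates, key=lambda c: (c[1][1], c[1][0]))[0]


def select_leader_by_epoch(candidates: List[Candidate]) -> str:
    """Illustrative only: treats the tail as (epoch, offset)."""
    # Tuples compare element by element, so epoch is compared before offset.
    return max(candidates, key=lambda c: c[1])[0]


candidates = [("node1", (1, 100)), ("node2", (2, 99))]
misread_choice = select_leader_misread(candidates)
correct_choice = select_leader_by_epoch(candidates)
print(misread_choice, correct_choice)
```

With the two candidates from the example, the misreading selector returns node1 because it compares 100 against 99, while the epoch-first selector returns node2 because epoch 2 is larger than epoch 1.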
@kjnilsson kjnilsson changed the title from "Fix rabbit_stream_coordinator:select_leader' runs with wrong comparison" to "Fix Stream coordinator leader selection bug" Jan 7, 2022
In order to retain deterministic results of state machine applications
during upgrades we need to make the stream coordinator versioned such
that we only use the new logic once the stream coordinator switches to
machine version 1.
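
The versioning idea in the commit message above can be sketched as follows, again in Python and only as an illustration. Every replica applying the same command must reach the same result, so the old comparison is kept for machine version 0 and the corrected one is used only from machine version 1 onwards. The names `select_leader`, `_select_v0`, and `_select_v1` and the explicit `machine_version` argument are assumptions made for this sketch, not the module's actual API.

```python
from typing import List, Tuple

Tail = Tuple[int, int]          # (epoch, offset)
Candidate = Tuple[str, Tail]    # (node, tail)


def _select_v0(candidates: List[Candidate]) -> str:
    # Legacy behaviour, kept as-is: the tail is misread as (offset, epoch).
    return max(candidates, key=lambda c: (c[1][1], c[1][0]))[0]


def _select_v1(candidates: List[Candidate]) -> str:
    # Corrected behaviour: highest epoch wins, offset breaks ties.
    return max(candidates, key=lambda c: c[1])[0]


def select_leader(machine_version: int, candidates: List[Candidate]) -> str:
    """Dispatch on the machine version so all replicas agree on the result."""
    if machine_version >= 1:
        return _select_v1(candidates)
    return _select_v0(candidates)


candidates = [("node1", (1, 100)), ("node2", (2, 99))]
before_upgrade = select_leader(0, candidates)
after_upgrade = select_leader(1, candidates)
print(before_upgrade, after_upgrade)
```

During a rolling upgrade, nodes running the new code still take the version 0 branch until the coordinator switches to version 1, so old and new nodes compute the same leader for the same input.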
@kjnilsson kjnilsson force-pushed the tomyouyou-stream_select_leader branch from d9a232c to 9a5d0f9 Compare January 7, 2022 12:11
@michaelklishin michaelklishin merged commit 4a3d926 into master Jan 7, 2022
@michaelklishin michaelklishin deleted the tomyouyou-stream_select_leader branch January 7, 2022 23:34
@michaelklishin michaelklishin added this to the 3.9.13 milestone Jan 7, 2022
michaelklishin added a commit that referenced this pull request Jan 8, 2022
Fix Stream coordinator leader selection bug (backport #3967)