Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

raft: don't promote to voter if previous config is not committed #17675

Merged
merged 5 commits into from
Apr 11, 2024

Conversation

ztlpn
Copy link
Contributor

@ztlpn ztlpn commented Apr 5, 2024

Otherwise we may add several new voters in quick succession, that the old voters will not know of, resulting in a possibility of non-intersecting quorums.

Example scenario:

  1. we start with leader id 1, committed configuration: voters:1,2,3; learners:4,5
  2. 1,4,5 are partitioned from 2,3
  3. 1 finishes recovery of 4 and 5 and adds them as voters
  4. now 1 can remain the leader getting responses from 1,4,5 (that form a quorum in the new voter set 1,2,3,4,5) and 2,3 can elect a leader among themselves, say 2 (because 2,3 is a quorum in the old voter set 1,2,3)
  5. now 1 and 2 both think that they are legitimate leaders and can commit new entries, resulting in divergent logs.

To prevent this, in step 3 we don't add 5 as voter until the configuration with voter 4 is committed (or vice versa).

Backports Required

  • none - not a bug fix
  • none - this is a backport
  • none - issue does not exist in previous branches
  • none - papercut/not impactful enough to backport
  • v23.3.x
  • v23.2.x

Release Notes

Bug Fixes

  • Fix a bug that could lead to raft log inconsistencies when 2 out of 3 nodes in a configuration are changed.

mmaslankaprv
mmaslankaprv previously approved these changes Apr 5, 2024
bharathv
bharathv previously approved these changes Apr 5, 2024
@piyushredpanda
Copy link
Contributor

/ci-repeat

Comment on lines -69 to +86
consistency_level>>
consistency_level,
isolated_t>>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔥

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

love seeing unit tests!

@piyushredpanda
Copy link
Contributor

Build seems to be failing here, @ztlpn

@ztlpn ztlpn dismissed stale reviews from bharathv and mmaslankaprv via 195feac April 8, 2024 10:16
@ztlpn ztlpn force-pushed the raft-voter-add-divergence branch from 44f1272 to 195feac Compare April 8, 2024 10:16
@ztlpn ztlpn force-pushed the raft-voter-add-divergence branch from 195feac to 4e7a69a Compare April 8, 2024 13:22
mmaslankaprv
mmaslankaprv previously approved these changes Apr 8, 2024
@ztlpn ztlpn force-pushed the raft-voter-add-divergence branch from 4e7a69a to 86c526f Compare April 10, 2024 21:37
Otherwise we may add several new voters in quick succession, that the
old voters will not know of, resulting in a possibility of non-intersecting
quorums.

Example scenario:
1. we start with leader id 1, committed configuration: voters:1,2,3; learners:4,5
2. 1,4,5 are partitioned from 2,3
3. 1 finishes recovery of 4 and 5 and adds them as voters
4. now 1 can remain the leader getting responses from 1,4,5
   (that form a quorum in the new voter set 1,2,3,4,5)
   and 2,3 can elect a leader among themselves, say 2
   (because 2,3 is a quorum in the old voter set 1,2,3)
5. now 1 and 2 both think that they are legitimate leaders and can
   commit new entries, resulting in divergent logs.

To prevent this, in step 3 we don't add 5 as voter until the configuration with
voter 4 is committed (or vice versa).
If we test reconfig with faulty network, nodes leaving the raft group
are not guaranteed to be up to date because they might have left the
group before they have been fully recovered. So we restrict the
after-reconfig checks to target nodes.
@ztlpn ztlpn merged commit 8242d48 into redpanda-data:dev Apr 11, 2024
17 checks passed
@vbotbuildovich
Copy link
Collaborator

/backport v23.3.x

@vbotbuildovich
Copy link
Collaborator

/backport v23.2.x

@ztlpn ztlpn deleted the raft-voter-add-divergence branch April 11, 2024 12:16
@vbotbuildovich
Copy link
Collaborator

Failed to create a backport PR to v23.2.x branch. I tried:

git remote add upstream https://github.com/redpanda-data/redpanda.git
git fetch --all
git checkout -b backport-pr-17675-v23.2.x-644 remotes/upstream/v23.2.x
git cherry-pick -x b1e1e0298b4e2da7985bd43f15739852191a46ff ccb5f640f10a373f99e95daeab42d801ceefd469 1085f755a575f0baffb072059c22859e0040142a eb8218558514bba1c0533a5f028d93c7d32d6a6c cea0c6c85f7a56eaef603d8fbc7a69d3ebb923c0

Workflow run logs.

@vbotbuildovich
Copy link
Collaborator

Failed to create a backport PR to v23.3.x branch. I tried:

git remote add upstream https://github.com/redpanda-data/redpanda.git
git fetch --all
git checkout -b backport-pr-17675-v23.3.x-536 remotes/upstream/v23.3.x
git cherry-pick -x b1e1e0298b4e2da7985bd43f15739852191a46ff ccb5f640f10a373f99e95daeab42d801ceefd469 1085f755a575f0baffb072059c22859e0040142a eb8218558514bba1c0533a5f028d93c7d32d6a6c cea0c6c85f7a56eaef603d8fbc7a69d3ebb923c0

Workflow run logs.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants