Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Raft's reuse of match_idx can lead to unsafely advancing commit index #5325

Closed
eddyashton opened this issue Jun 2, 2023 · 0 comments · Fixed by #5326
Closed

Raft's reuse of match_idx can lead to unsafely advancing commit index #5325

eddyashton opened this issue Jun 2, 2023 · 0 comments · Fixed by #5326

Comments

@eddyashton
Copy link
Member

As raised by @lemmy in #5321/#5324, we allow match_idx in raft.h to go backwards in the implementation.

As I suspected in this comment, this actually raises a larger safety concern. It goes backwards because we're using it to store a different concept, the probe index that reduces the number of NACKs to realign with a divergent log. But using that probe index to populate match_idx means the primary may overstate how far another node's log matches, and then use that overstatement to unsafely advance commit.

Raft scenario repro proof here:

https://github.com/eddyashton/CCF/blob/match_idx_safety/tests/raft_scenarios/suffix_collision.3

In simpler terms (trying to ignore the surrounding partition complexity and additional nodes we need to reach this state, and almost certainly with some off-by-1s in doing this by-hand), we can have something like this:

Node 0:
 1.1  1.2  3.3  3.4  3.5  5.6  5.7

Node 1:
 1.1  2.2  2.3  2.4  2.5  4.6

Node 2:
 1.1  1.2
  1. Node 0 sends an AE starting at 5.6 to Node 1.
  2. Node 1 says "that doesn't match my log, but maybe we agree from seqno 5?"
  3. Node 0 sees that NACK, and stores match_idx[1] = 5.
  4. Node 0 gets an ACK from node 2, which causes them to try to advance commit.
  5. Node 0 looks at all of their match indices, and sees both it and node 1 (a majority) appear to have seqno 5.
  6. Node 0 advances commit to 5, despite 3.5 only being present on a single node.

I believe the fix is straightforward - store this probe index in a separate variable, and only set match_idx to safe matching values (received in ACKs).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants