Fix deadlock in 10-node cluster convergence (#2467)
This PR fixes #2286.

- CheckQuorum was causing multiple issues. During a 5-node Zero cluster bootstrap, it would cause the leader to step down when the cluster size was 2, which then blocked all the remaining joins indefinitely. It would also cause the leader to step down in a seemingly healthy cluster that was processing proposals. CheckQuorum was mandated by raft.ReadOnlyLeaseBased, which is a less safe option for linearizable reads. Switch ReadOnlyOption back to raft.ReadOnlySafe. Moreover, we don't need quorum-based linearizable reads in the Alpha servers, because of the switch to proposing and then applying transaction updates.
- raft.ReadIndex is not working for some reason, so its usage is commented out in Zero (and removed from Alpha permanently). This needs to be fixed once etcd-io/etcd#9893 is resolved.
- The logic for linearizable reads was duplicated in both Zero and Alpha. Refactor it into one place in conn/node.go.
- Retry conf change proposals if they time out. This mechanism is similar to the one introduced for normal proposals in a previous commit, 06ea4c.
- Use a lock to allow only one JoinCluster call at a time. Block JoinCluster until node.AddToCluster is successful (or return the error).
- Set the raft library to 3.2.23. Before the upgrade, we were at 3.2.6.

Commit log:

* Trying to understand why JoinCluster doesn't work properly.
* Fucking works. Fucking works.
* It all works now.
* More Dgraph servers. Found a new issue where requesting read quorum doesn't respond.
* Refactor wait lin read code and move it to conn/node.go
* Remove lin read wait for server, because txn timestamp should be sufficient for waiting. Also, for the time being, comment out lin read wait from Zero as well.
1 parent 3b5bc66, commit eb3910c
Showing 17 changed files with 450 additions and 666 deletions.