Is it normal if only one node is alive, the leader id becomes -1? #84

Closed
xuluna opened this issue Dec 20, 2019 · 5 comments


@xuluna

xuluna commented Dec 20, 2019

I have a two-node cluster {1, 2} in which node 1 is the leader. When node 2 is disconnected, node 1 logs the following errors:
Error: raft_server.cxx:check_leadership_validity:856: 1 nodes (out of 2, 2 including learners) are not responding longer than 2500 ms, at least 2 nodes (including leader) should be alive to proceed commit
Error: raft_server.cxx:check_leadership_validity:858: will yield the leadership of this node
Is it by design that Raft cannot work with only one node alive?

@greensky00
Contributor

Hi @xuluna,

In the original Raft (and in all other quorum-based consensus protocols), that's right. A cluster that loses quorum (a majority of its nodes) cannot make any decisions and becomes read-only. In a 2-node cluster the minimum quorum size is 2, so if either node goes down the cluster can no longer serve write operations. (In a 3-node cluster, one node going down is tolerated safely.)

But NuRaft supports a custom quorum size. If you manually reduce the quorum size to 1, the 2-node cluster stays available even when one node goes down.
https://github.com/eBay/NuRaft/blob/master/docs/custom_quorum_size.md

Please note, however, that this does not guarantee safety against data loss or log divergence.
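
To make the quorum arithmetic above concrete, here is a minimal sketch. The helper names (`majority_quorum`, `tolerated_failures`, `quorums_always_overlap`) are hypothetical and are not part of NuRaft; they only model the counting argument. The overlap check hints at why a quorum of 1 in a 2-node cluster is risky: each node can form a quorum on its own, so the two logs may diverge.

```cpp
// Illustrative arithmetic only. These helpers are hypothetical
// and do not call or mirror any NuRaft API.

// Smallest majority of an n-node cluster.
constexpr int majority_quorum(int n) { return n / 2 + 1; }

// Number of nodes that may fail while a majority is still alive.
constexpr int tolerated_failures(int n) { return n - majority_quorum(n); }

// Any two quorums of size q drawn from n nodes are guaranteed to
// share at least one node exactly when 2 * q > n.
constexpr bool quorums_always_overlap(int n, int q) { return 2 * q > n; }

// 2-node cluster: quorum is 2, so no failure is tolerated.
static_assert(majority_quorum(2) == 2, "2-node quorum");
static_assert(tolerated_failures(2) == 0, "2-node fault tolerance");

// 3-node cluster: quorum is 2, so one failure is tolerated.
static_assert(majority_quorum(3) == 2, "3-node quorum");
static_assert(tolerated_failures(3) == 1, "3-node fault tolerance");

// With the default quorum, two quorums always intersect.
static_assert(quorums_always_overlap(2, 2), "default quorum overlaps");

// With the quorum reduced to 1, each node alone is a quorum,
// so two disjoint quorums can exist at the same time.
static_assert(!quorums_always_overlap(2, 1), "quorum of 1 can split");
```

The checks are evaluated at compile time, so the snippet compiles cleanly only if the numbers quoted in the comment above hold.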

@xuluna
Author

xuluna commented Dec 20, 2019

That's very awesome. Thank you!

@xuluna xuluna closed this as completed Dec 20, 2019
@xuluna
Author

xuluna commented Dec 25, 2019

@greensky00 If I set the custom quorum size and custom election size to 1 and node 2 later comes back, will it be detected? Or do I have to manually set the sizes back and call add_srv again? Will the log be synced automatically as well?

@xuluna xuluna reopened this Dec 25, 2019
@greensky00
Contributor

greensky00 commented Dec 26, 2019

@xuluna
Once you add a server, the membership info (the list of all members in the cluster) is preserved unless you remove the server from the cluster (removing a server is different from the server going offline). Hence the leader automatically detects its status (online/offline), and you don't need to call add_srv again. Logs are also synced automatically as soon as it comes back online.

Regarding the custom quorum size, you need to manually set it back to 0 (i.e., the default size).
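
As a rough model of why the reset matters, the sketch below treats a custom size of 0 as "use the default majority", as described above. The functions `effective_quorum` and `can_commit` are hypothetical stand-ins and not NuRaft calls; the actual parameters are documented in the custom_quorum_size.md page linked earlier in this thread.

```cpp
// Hypothetical model, not NuRaft code. Assumption: a custom size of 0
// means "fall back to the default majority quorum".
constexpr int effective_quorum(int cluster_size, int custom_size) {
    return custom_size > 0 ? custom_size : cluster_size / 2 + 1;
}

// Whether `alive` responding nodes are enough to commit.
constexpr bool can_commit(int alive, int cluster_size, int custom_size) {
    return alive >= effective_quorum(cluster_size, custom_size);
}

// Node 2 offline, custom size 1: the leader alone can commit.
static_assert(can_commit(1, 2, 1), "leader alone commits with custom size 1");

// Node 2 back online but custom size still 1: one node is still
// enough, so the cluster keeps running without majority protection.
static_assert(effective_quorum(2, 1) == 1, "override stays until reset");

// Custom size reset to 0: the default quorum of 2 applies again.
static_assert(effective_quorum(2, 0) == 2, "default quorum restored");
static_assert(!can_commit(1, 2, 0), "one node is no longer enough");
static_assert(can_commit(2, 2, 0), "both nodes together can commit");
```

In this model, membership and log sync are unaffected by the override; only the number of acknowledgements required changes, which is why the size has to be reset by hand once node 2 is back.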

@xuluna
Author

xuluna commented Dec 26, 2019

@greensky00 Thank you and happy new year!

@xuluna xuluna closed this as completed Dec 26, 2019