ReadReplica role (former Non Promotable Clone) #1931
Conversation
…e, the node does not participate in the election process as a candidate (completed, but the ElectionService unit tests still need fixing)
```shell
// Quick commands (for linux add mono)
EventStore.ClusterNode.exe --int-ip 127.0.0.1 --ext-ip 127.0.0.1 --int-tcp-port=2111 --ext-tcp-port=2112 --int-http-port=2113 --ext-http-port=2114 --cluster-size=3 --discover-via-dns=false --gossip-seed=127.0.0.1:1113,127.0.0.1:3113 --structured-log=false
EventStore.ClusterNode.exe --int-ip 127.0.0.1 --ext-ip 127.0.0.1 --int-tcp-port=3111 --ext-tcp-port=3112 --int-http-port=3113 --ext-http-port=3114 --cluster-size=3 --discover-via-dns=false --gossip-seed=127.0.0.1:1113,127.0.0.1:2113 --structured-log=false
// Start a ReadReplica node
```
…ther nodes are back up
A scenario to consider is when a ReadReplica node is running and all other nodes in the cluster are down. The ReadReplica's writer checkpoint still moves forward even without client writes, as it is writing internal stats and data. For that reason, when the other nodes come back up and the ReadReplica rejoins the cluster, the elected master will subscribe to it at a less recent position, and an offline truncation will therefore be performed on the ReadReplica.
A ReadReplica node is like a Clone that can never be elected. When you attach a ReadReplica node to an existing cluster, you keep the same cluster size. In a cluster of 4 nodes where one of them is a ReadReplica, network partitions can still happen: when 2 nodes are running side by side with the other 2 nodes, you have a network partition. The difference is that, with a ReadReplica node, the partition it belongs to cannot form a quorum, and therefore clients can't write data to it. When the 2 partitions join back together, the nodes in the partition with less data will be automatically stopped, and their log will be truncated at the next start. Using a ReadReplica, you can be sure that what is truncated is only the tail of the log containing stats and internal data. To reproduce the network partition running the nodes locally:
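The original commands for this step are not in the thread, but a rough sketch of one way to simulate the partition locally is to suspend half of the node processes rather than killing them. The `pgrep` patterns below are illustrative and assume the nodes were started with the port flags from the quick commands earlier in this thread:

```shell
# Hypothetical sketch, not the commands from this PR: with four local nodes
# running (one of them the ReadReplica), suspend two non-replica nodes to
# split the cluster 2/2.
NODE3_PID=$(pgrep -f "int-tcp-port=3111")   # pattern assumes the quick commands above
NODE4_PID=$(pgrep -f "int-tcp-port=4111")   # illustrative fourth node

# Suspend two nodes: the ReadReplica's side now cannot form a quorum,
# so clients cannot write to it.
kill -STOP "$NODE3_PID" "$NODE4_PID"

# Resume the nodes to heal the partition and let the cluster reconcile.
kill -CONT "$NODE3_PID" "$NODE4_PID"
```

Suspending with `SIGSTOP`/`SIGCONT` keeps the processes alive while making them unreachable, which is closer to a real partition than a clean shutdown.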
During the next start, the tail of the log will be truncated. If you are using a Clone, there is a risk of losing client data (the split-brain problem: 2 masters accepting writes in the same cluster). If you are using a ReadReplica node, there is no risk of losing client data, as the quorum can't be formed.
Closing in favor of #1976
There is a new ReadReplica config setting (default false)
When it is set to true, the node does not participate in the election process as a candidate, and it does not allow the cluster to stay up if there are not enough (other) nodes to form a quorum
Added a new test for the ElectionService changes (more tests can be added... as usual)
To test this PR, run a cluster of 3 nodes. The nodes could be any previous version.
Once the cluster is up and running, run a node using this PR's code with the ReadReplica setting set to true. Remember to keep the cluster size at 3 and use the same gossip port as the other nodes in order to join the existing cluster.
You can verify from the logs that the node is recognized as a ReadReplica, and that it does not participate as a candidate in the election process when you shut down the other nodes for testing.
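The attach step above might look like the following, reusing the port layout from the quick commands earlier in this thread. The `--read-replica` flag name is an assumption derived from the new ReadReplica setting; check the PR diff for the actual option name:

```shell
// Hypothetical sketch: attach a ReadReplica node to the existing 3-node
// cluster. --read-replica is an assumed flag name, not confirmed by this PR.
// Note: --cluster-size stays at 3 and the gossip seeds point at the
// existing nodes, so the replica joins without changing the quorum.
EventStore.ClusterNode.exe --int-ip 127.0.0.1 --ext-ip 127.0.0.1 --int-tcp-port=4111 --ext-tcp-port=4112 --int-http-port=4113 --ext-http-port=4114 --cluster-size=3 --discover-via-dns=false --gossip-seed=127.0.0.1:1113,127.0.0.1:2113,127.0.0.1:3113 --structured-log=false --read-replica=true
```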
Closes #1751