
ReadReplica role (former Non Promotable Clone) #1931

Closed
wants to merge 5 commits into from

Conversation

riccardone
Contributor

There is a new ReadReplica config setting (default: false).
When it is set to true, the node does not participate in the election process as a candidate, and it does not count towards quorum, so it cannot keep the cluster up if there are not enough other nodes to form a quorum.
Added a new test for the ElectionService changes (more tests can be added, as usual).
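For reference, a minimal sketch of how this could look in a YAML config file, assuming the usual mapping between command-line flags and config keys (the `ReadReplica` key name is inferred from the `--read-replica` flag shown later in this thread and is an assumption, not confirmed by this PR):

```yaml
# Hypothetical config fragment -- key names assumed from the CLI flags.
ClusterSize: 3
DiscoverViaDns: false
GossipSeed: 127.0.0.1:1113,127.0.0.1:2113,127.0.0.1:3113
ReadReplica: true
```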

To test this PR, run a cluster of 3 nodes. The nodes can be any previous version.
Once the cluster is up and running, start a node built from this PR with the ReadReplica setting set to true. Remember to keep the cluster size at 3 and use the same gossip port as the other nodes in order to join the existing cluster.
You can verify from the logs that the node is recognised as a ReadReplica, and that it does not participate as a candidate in the election process when you shut down the other nodes for testing.

Closes #1751

@riccardone added the kind/enhancement label May 17, 2019
@riccardone riccardone requested a review from jageall May 17, 2019 12:01
@riccardone
Contributor Author

```shell
# Quick commands (for Linux, prefix with mono)
# Start a 3-node cluster
EventStore.ClusterNode.exe --int-ip 127.0.0.1 --ext-ip 127.0.0.1 --int-tcp-port=1111 --ext-tcp-port=1112 --int-http-port=1113 --ext-http-port=1114 --cluster-size=3 --discover-via-dns=false --gossip-seed=127.0.0.1:2113,127.0.0.1:3113 --structured-log=false

EventStore.ClusterNode.exe --int-ip 127.0.0.1 --ext-ip 127.0.0.1 --int-tcp-port=2111 --ext-tcp-port=2112 --int-http-port=2113 --ext-http-port=2114 --cluster-size=3 --discover-via-dns=false --gossip-seed=127.0.0.1:1113,127.0.0.1:3113 --structured-log=false

EventStore.ClusterNode.exe --int-ip 127.0.0.1 --ext-ip 127.0.0.1 --int-tcp-port=3111 --ext-tcp-port=3112 --int-http-port=3113 --ext-http-port=3114 --cluster-size=3 --discover-via-dns=false --gossip-seed=127.0.0.1:1113,127.0.0.1:2113 --structured-log=false

# Start a ReadReplica node
EventStore.ClusterNode.exe --int-ip 127.0.0.1 --ext-ip 127.0.0.1 --int-tcp-port=4111 --ext-tcp-port=4112 --int-http-port=4113 --ext-http-port=4114 --cluster-size=3 --discover-via-dns=false --gossip-seed=127.0.0.1:1113,127.0.0.1:2113,127.0.0.1:3113 --structured-log=false --read-replica=true
```

@riccardone
Contributor Author

A scenario to consider is when a ReadReplica node is running and all the other nodes in the cluster are down. The ReadReplica's writer checkpoint keeps moving forward even without client writes, because it is writing internal stats and data. For that reason, when the other nodes come back up and the ReadReplica rejoins the cluster, the elected master will subscribe to it at an earlier position, and an offline truncation of the ReadReplica will therefore be performed.
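The subscription rule described above can be sketched as a tiny helper (an illustrative sketch, not the actual EventStore implementation; the assumption, taken from this comment, is that a follower whose log position is ahead of the elected master's must truncate back to the master's position before subscribing):

```python
# Sketch of the truncation decision described in the comment above.
# Assumption: positions are simple integer log offsets.
def truncation_point(replica_pos: int, master_pos: int):
    """Return the position the replica must truncate to, or None if no
    truncation is needed (replica is at or behind the master)."""
    return master_pos if replica_pos > master_pos else None

# ReadReplica advanced its checkpoint (stats/internal data) while alone:
print(truncation_point(1_500, 1_200))  # 1200 -- truncate the stats-only tail
# Replica is behind the master: it just catches up, no truncation:
print(truncation_point(1_000, 1_200))  # None
```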

@riccardone
Contributor Author

riccardone commented May 28, 2019

A ReadReplica node is like a Clone that can never be elected. When you attach a ReadReplica node to an existing cluster you keep the same cluster size. In a cluster of 4 nodes where one of them is a ReadReplica, network partitions can still happen: when 2 nodes are running side by side but cut off from the other 2, you have a network partition. The difference is that with a ReadReplica node, the partition it belongs to cannot form a quorum, and therefore clients can't write data to it. When the 2 partitions join back together, the nodes in the partition with less data will be automatically stopped, and their logs truncated at the next start. Using a ReadReplica you can be sure that what is truncated is only the tail of the log containing stats and internal data.

To reproduce the network partition running the nodes locally:

  1. start the four nodes with ClusterSize=3
  2. kill two of the nodes
  3. kill the running two nodes and start the other two nodes
  4. start all the nodes... two of them will be automatically shut down

During the next start the tail of the log will be truncated. If you are using a Clone, there is a risk that you can lose client data (the split-brain problem: 2 masters accepting writes in the same cluster). If you are using a ReadReplica node there is no risk of losing client data, as the quorum can't be formed.
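The quorum argument above can be illustrated with a small sketch (assumptions inferred from this thread: quorum is a strict majority of the configured cluster size, and a ReadReplica never counts as a voting member):

```python
# Quorum math sketch for the 4-node / ClusterSize=3 scenario above.
# Assumption: quorum = cluster_size // 2 + 1, ReadReplicas never vote.
def has_quorum(voting_nodes_up: int, cluster_size: int) -> bool:
    return voting_nodes_up >= cluster_size // 2 + 1

# Partition A: 2 regular nodes            -> 2 voting members
# Partition B: 1 regular node + ReadReplica -> 1 voting member
print(has_quorum(2, 3))  # True  -- partition A can elect a master and take writes
print(has_quorum(1, 3))  # False -- partition B cannot, so no second master
```

With a plain Clone instead of a ReadReplica, partition B would also have 2 voting members and both sides could form a quorum, which is exactly the split-brain risk this PR is addressing.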

@ChrisChinchilla added the area/documentation label May 29, 2019
@pgermishuys pgermishuys closed this Sep 3, 2019
@pgermishuys
Contributor

Closing in favor of #1976
