This page compares the deployment, behavior, limitations and benefits of two high availability deployment approaches: `orchestrator/raft` vs. `orchestrator/[galera|xtradb cluster|innodb cluster]`.
We will assume and compare:

- 3 data-centers setup (an availability zone may count as a data-center)
- 3 node `orchestrator/raft` setup
- 3 node `orchestrator` on multi-writer `galera|xtradb cluster|innodb cluster` (each MySQL in cluster may accept writes)
- A proxy able to run `HTTP` or `mysql` health checks
- `MySQL`, `MariaDB`, `Percona Server` all considered under the term `MySQL`.
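As an illustration of the `orchestrator/raft` variant with a `SQLite` backend, here is a minimal configuration sketch. The hostnames, paths and port are placeholders; consult your `orchestrator` version's configuration reference for the authoritative parameter list.

```json
{
  "RaftEnabled": true,
  "RaftDataDir": "/var/lib/orchestrator",
  "RaftBind": "orc1.example.com",
  "DefaultRaftPort": 10008,
  "RaftNodes": ["orc1.example.com", "orc2.example.com", "orc3.example.com"],
  "BackendDB": "sqlite",
  "SQLite3DataFile": "/var/lib/orchestrator/orchestrator.db"
}
```

Each of the three nodes would carry the same `RaftNodes` list, with `RaftBind` naming the local node.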
| Compare | orchestrator/raft | synchronous replication backend |
| --- | --- | --- |
| General wiring | Each `orchestrator` node has a private backend DB; `orchestrator` nodes communicate by raft protocol | Each `orchestrator` node connects to a different MySQL member in a synchronous replication group. `orchestrator` nodes do not communicate with each other. |
| Backend DB | MySQL or SQLite | MySQL |
| Backend DB dependency | Service panics if it cannot access its own private backend DB | Service unhealthy if it cannot access its own private backend DB |
| DB data | Independent across DB backends. May vary, but on a stable system converges to the same overall picture | Single dataset, synchronously replicated across DB backends. |
| DB access | Never write directly. Only raft nodes access the backend DB while coordinating/cooperating; otherwise inconsistencies can be introduced. Reads are OK. | Possible to access & write directly; all `orchestrator` nodes/clients see the exact same picture. |
| Leader and actions | Single leader. Only the leader runs recoveries. All nodes run discoveries (probing) and self-analysis | Single leader. Only the leader runs discoveries (probing), analysis and recoveries. |
| HTTP access | Must only access the leader (can be enforced by proxy or `orchestrator-client`) | May access any healthy node (can be enforced by proxy). For read consistency it is best to speak to the leader only (can be enforced by proxy or `orchestrator-client`) |
| Command line | HTTP/API access (e.g. `curl`, `jq`) or the `orchestrator-client` script, which wraps common HTTP/API calls in a familiar command line interface | HTTP/API, and/or the `orchestrator-client` script, or `orchestrator ...` command line invocation. |
| Install | `orchestrator` service on service nodes only. `orchestrator-client` script anywhere (requires access to HTTP/API). | `orchestrator` service on service nodes. `orchestrator-client` script anywhere (requires access to HTTP/API). `orchestrator` client anywhere (requires access to backend DBs) |
| Proxy | HTTP. Must only direct traffic to the leader (`/api/leader-check`) | HTTP. Must only direct traffic to healthy nodes (`/api/status`); best to only direct traffic to the leader node (`/api/leader-check`) |
| No proxy | Use `orchestrator-client` with all `orchestrator` backends. `orchestrator-client` will direct traffic to the master. | Use `orchestrator-client` with all `orchestrator` backends. `orchestrator-client` will direct traffic to the master. |
| Cross DC | Each `orchestrator` node (along with its private backend) can run on a different DC. Nodes do not communicate much; low traffic. | Each `orchestrator` node (along with its associated backend) can run on a different DC. `orchestrator` nodes do not communicate directly. MySQL group replication is chatty; the amount of traffic is mostly linear in the size of the topologies and in the polling rate. Write latencies apply. |
| Probing | Each topology server is probed by all `orchestrator` nodes | Each topology server is probed by the single active node |
| Failure analysis | Performed independently by all nodes | Performed by the leader only (the DB is shared, so all nodes see the exact same picture anyhow) |
| Failover | Performed by the leader node only | Performed by the leader node only |
| Resiliency to failure | 1 node may go down (2 on a 5 node cluster) | 1 node may go down (2 on a 5 node cluster) |
| Node back from short failure | Node rejoins the cluster and gets updated with changes. | DB node rejoins the cluster and gets updated with changes. |
| Node back from long outage | DB must be cloned from a healthy node. | Depends on your MySQL backend implementation; potentially SST/restore from backup. |
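To illustrate the proxy rows above, here is a minimal HAProxy sketch for the raft setup. Hostnames and ports are placeholders (the `orchestrator` HTTP port is assumed to be 3000); the key point is that the health check hits `/api/leader-check`, which succeeds only on the current leader, so the proxy routes all traffic to the leader:

```
listen orchestrator
  bind 0.0.0.0:80
  mode http
  option httpchk GET /api/leader-check
  server orc1 orc1.example.com:3000 check port 3000
  server orc2 orc2.example.com:3000 check port 3000
  server orc3 orc3.example.com:3000 check port 3000
```

For the synchronous replication backend, the same shape applies, but the check may be `/api/status` if any healthy node is acceptable.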
Here are considerations for choosing between the two approaches:

- You only have a single data center (DC): pick a shared DB backend, or an even simpler setup.
- You are comfortable with Galera/XtraDB Cluster/InnoDB Cluster and have the automation to set them up and maintain them: pick a shared DB backend.
- You have a high-latency cross-DC network: choose `orchestrator/raft`.
- You don't want to allocate MySQL servers for the `orchestrator` backend: choose `orchestrator/raft` with a `SQLite` backend.
- You have thousands of MySQL boxes: choose either, but prefer the `MySQL` backend, which is more write-performant than `SQLite`.
- Another synchronous replication setup is that of a single writer. This would require an additional proxy between the `orchestrator` nodes and the underlying cluster, and is not considered above.
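In the no-proxy case, a client locates the raft leader by probing each node's `/api/leader-check`, which is essentially what `orchestrator-client` does when handed all endpoints. A minimal sketch of that loop, with the HTTP call stubbed out (hostnames and the stubbed responses are placeholders, not real orchestrator output):

```shell
#!/bin/sh
# Stub standing in for the real check, which would be something like:
#   curl -s -o /dev/null -w '%{http_code}' "http://$1:3000/api/leader-check"
# Only the current leader answers with HTTP 200.
check_node() {
  case "$1" in
    orc2) echo 200 ;;   # pretend orc2 is currently the leader
    *)    echo 404 ;;
  esac
}

# Probe each candidate node in turn; print the first one reporting 200.
find_leader() {
  for host in "$@"; do
    if [ "$(check_node "$host")" = "200" ]; then
      echo "$host"
      return 0
    fi
  done
  return 1
}

find_leader orc1 orc2 orc3   # prints: orc2
```

In practice you would export the full endpoint list (e.g. via `ORCHESTRATOR_API`) and let `orchestrator-client` perform this discovery for you.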