Cluster can't tolerate more than one service failure #3417
Hi,
docker-compose.yml
version: "3.9"
services:
  nats:
    image: "nats:2.8.4-alpine3.15"
    container_name: nats
    ports:
      - "4222:4222"
      - "8222:8222"
    volumes:
      - ~/Downloads/nats-storage/cluster/jetstream-c1:/tmp/nats/jetstream
      - ./config.c.conf:/config.conf
    command: "--config /config.conf --server_name S1"
    networks:
      - nats
  nats-c2:
    image: "nats:2.8.4-alpine3.15"
    container_name: "nats-c2"
    ports:
      - "5222:4222"
    volumes:
      - ~/Downloads/nats-storage/cluster/jetstream-c2:/tmp/nats/jetstream
      - ./config.c.conf:/config.conf
    command: "--config /config.conf --server_name S2"
    depends_on:
      - nats
    networks:
      - nats
  nats-c3:
    image: "nats:2.8.4-alpine3.15"
    container_name: "nats-c3"
    ports:
      - "6222:4222"
    volumes:
      - ~/Downloads/nats-storage/cluster/jetstream-c3:/tmp/nats/jetstream
      - ./config.c.conf:/config.conf
    command: "--config /config.conf --server_name S3"
    depends_on:
      - nats
    networks:
      - nats
networks:
  nats:
    external: true
config.c.conf
I'm using Golang client version
Replies: 2 comments 8 replies
@AhmedAbouelkher Could you describe the issue you are having in a bit more detail? If by "Cluster can't tolerate more than one service failure" you mean that the cluster stops serving clients when 2 of the 3 servers are down, then yes, this is expected. Like all distributed systems that rely on Raft consensus, it needs a quorum (a majority of the servers), so a cluster of size 3 can survive only 1 failure: https://docs.nats.io/running-a-nats-service/configuration/clustering/jetstream_clustering#the-quorum
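To make the quorum arithmetic concrete, here is a minimal Go sketch. It assumes a simple majority quorum, as described in the linked docs; the function names `quorum` and `toleratedFailures` are illustrative only and are not part of NATS or the Go client.

```go
package main

import "fmt"

// quorum returns the smallest majority of a cluster with n servers.
// Hypothetical helper for illustration, not a NATS API.
func quorum(n int) int {
	return n/2 + 1
}

// toleratedFailures returns how many servers can be down
// while a majority of the cluster is still available.
func toleratedFailures(n int) int {
	return n - quorum(n)
}

func main() {
	// Print the quorum and failure tolerance for a few cluster sizes.
	for _, n := range []int{1, 3, 5} {
		fmt.Printf("cluster size %d: quorum %d, tolerated failures %d\n",
			n, quorum(n), toleratedFailures(n))
	}
}
```

For a cluster of size 3 this gives a quorum of 2 and 1 tolerated failure, which matches the behavior described above. Under the same assumption, a cluster of size 5 would tolerate 2 failures.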
@kozlovic Thanks for your interest.
I tried many configs with the same results as above.