[QUESTION] Redis Cluster with Node Aware or Rack Aware Replication #11306

Open
vineelyalamarthy opened this issue Sep 22, 2022 · 4 comments

@vineelyalamarthy

Out of curiosity, if we want to build our own Redis as a Service without using any cloud provider, how can we make sure that a master and its slave are not scheduled in the same failure domain? In particular, if we are doing this in a Kubernetes environment, do others use Operators or some other solution?

Would love to hear from the community on this.

@igorwwwwwwwwwwwwwwwwwwww
Contributor

We are discussing this issue as well: downstream issue.

We don't have any production experience yet, but the following is the design we are discussing.

It should be possible to achieve this with an explicit "replicaset" topology. Redis Cluster and redis-cli do not impose any structure, but you can define your own.

A "replicaset" consists of a primary and 2 replicas. These 3 nodes are in distinct zones. You then scale by adding more replicasets.

- redis-rs0
  - redis-rs0-0   (us-east1-b)
  - redis-rs0-1   (us-east1-c)
  - redis-rs0-2   (us-east1-d)
- redis-rs1
  - redis-rs1-0
  - redis-rs1-1
  - redis-rs1-2
- redis-rs2
  - redis-rs2-0
  - redis-rs2-1
  - redis-rs2-2
...

Since failovers only occur within a replicaset, and its members are in distinct zones, we ensure zonal redundancy. In Kubernetes this can be accomplished via one StatefulSet per replicaset, and pod anti-affinity applied to each of them.
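As a rough illustration of the "one StatefulSet per replicaset, with pod anti-affinity" idea, here is a minimal Python sketch that builds such a manifest as a plain dict and prints it as JSON (which kubectl also accepts). The function name, the `replicaset` label key and the `redis:7` image tag are assumptions made for this example, not something from the design above; the zone key `topology.kubernetes.io/zone` is the standard Kubernetes node label.

```python
import json


def replicaset_statefulset(index: int, replicas: int = 3) -> dict:
    """Build a StatefulSet manifest for one replicaset (naming is hypothetical)."""
    name = f"redis-rs{index}"
    labels = {"app": "redis", "replicaset": name}  # label keys are assumptions
    return {
        "apiVersion": "apps/v1",
        "kind": "StatefulSet",
        "metadata": {"name": name, "labels": labels},
        "spec": {
            "serviceName": name,
            "replicas": replicas,
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    "affinity": {
                        "podAntiAffinity": {
                            # Hard rule: no two pods of this replicaset may
                            # be scheduled into the same zone.
                            "requiredDuringSchedulingIgnoredDuringExecution": [
                                {
                                    "labelSelector": {
                                        "matchLabels": {"replicaset": name}
                                    },
                                    "topologyKey": "topology.kubernetes.io/zone",
                                }
                            ]
                        }
                    },
                    "containers": [
                        {
                            "name": "redis",
                            "image": "redis:7",  # assumed image tag
                            "args": [
                                "redis-server",
                                "--cluster-enabled", "yes",
                                "--cluster-allow-replica-migration", "no",
                            ],
                            "ports": [{"containerPort": 6379}],
                        }
                    ],
                },
            },
        },
    }


if __name__ == "__main__":
    print(json.dumps(replicaset_statefulset(0), indent=2))
```

Because the anti-affinity rule is a hard requirement, a replicaset of 3 pods only schedules fully when at least 3 zones have capacity. A real deployment would also need persistent volumes and a headless Service, which are left out here.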

You'll want to configure your redis-server processes with cluster-allow-replica-migration no to ensure replicasets remain stable.

You'll need to take care when setting up the nodes to ensure they are connected in the right way. redis-cli --cluster add-node has --cluster-slave and --cluster-master-id options for this, and there is a CLUSTER REPLICATE command to re-assign replicas on the fly.
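To make the wiring step a bit more concrete, here is a small Python sketch that only builds the command lines for the two approaches mentioned above. The helper names and the example addresses are made up for illustration; the redis-cli options and the CLUSTER REPLICATE command are the ones named in this comment. Nothing is executed, so you can inspect the commands before running them yourself.

```python
from typing import List


def add_replica_argv(new_node: str, existing_node: str, primary_id: str) -> List[str]:
    """Command to join `new_node` to the cluster as a replica of `primary_id`.

    `existing_node` is any node that is already a member of the cluster.
    """
    return [
        "redis-cli", "--cluster", "add-node", new_node, existing_node,
        "--cluster-slave", "--cluster-master-id", primary_id,
    ]


def replicate_argv(replica: str, primary_id: str) -> List[str]:
    """Command to re-assign an existing node to replicate `primary_id`."""
    host, port = replica.rsplit(":", 1)
    return ["redis-cli", "-h", host, "-p", port, "CLUSTER", "REPLICATE", primary_id]


if __name__ == "__main__":
    # Hypothetical addresses and a placeholder node id.
    print(" ".join(add_replica_argv("redis-rs0-1:6379", "redis-rs0-0:6379", "<primary-node-id>")))
    print(" ".join(replicate_argv("redis-rs0-2:6379", "<primary-node-id>")))
```

The node id of each primary can be read from CLUSTER NODES or CLUSTER MYID on that primary before the replicas are attached.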

Hope that helps!

@vineelyalamarthy
Author

vineelyalamarthy commented Jan 6, 2023

I will share the notes soon. There are several ways in which this can be solved.

@jacob-pro

Currently you will need to write a sidecar/daemon that runs alongside your Redis cluster that has awareness of your physical infrastructure.

It would need to monitor the cluster to ensure that:

  • Redis masters are distributed in different domains (don't want to lose a majority of masters)
  • A Redis master and its copies are distributed in different domains (don't want to lose all the copies of a slot range).

This will need to run continuously because, even if you set up the cluster in a distributed way, a Redis master won't move back by itself after an automatic failover.
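A minimal sketch of the check such a sidecar could run on each pass, assuming it already has the raw text of CLUSTER NODES and a mapping from node address to failure domain (the `zone_of` dict, which you would have to build from your own infrastructure data). The function names are hypothetical. The first rule is written as "losing any one zone must still leave a majority of masters", which is slightly stricter than the wording of the first bullet.

```python
from collections import Counter, defaultdict
from typing import Dict, List


def parse_cluster_nodes(text: str) -> Dict[str, dict]:
    """Parse the leading fields of each CLUSTER NODES line."""
    nodes = {}
    for line in text.strip().splitlines():
        fields = line.split()
        if len(fields) < 8:
            continue  # skip malformed or empty lines
        node_id, addr, flags, primary_id = fields[0], fields[1], fields[2], fields[3]
        nodes[node_id] = {
            "addr": addr.split("@")[0],  # drop the cluster bus port
            "is_primary": "master" in flags.split(","),
            "primary_id": None if primary_id == "-" else primary_id,
        }
    return nodes


def find_violations(nodes: Dict[str, dict], zone_of: Dict[str, str]) -> List[str]:
    """Return human-readable descriptions of placement problems."""
    problems = []
    primaries = [nid for nid, n in nodes.items() if n["is_primary"]]

    # Rule 1: losing any single zone must leave a majority of masters.
    per_zone = Counter(zone_of[nodes[nid]["addr"]] for nid in primaries)
    majority = len(primaries) // 2 + 1
    for zone, count in sorted(per_zone.items()):
        if len(primaries) - count < majority:
            problems.append(f"zone {zone} holds {count} of {len(primaries)} masters")

    # Rule 2: a master and its replicas must all be in different zones.
    shards = defaultdict(list)
    for nid, n in nodes.items():
        key = nid if n["is_primary"] else n["primary_id"]
        if key is not None:
            shards[key].append(nid)
    for primary_id, members in sorted(shards.items()):
        zones = [zone_of[nodes[m]["addr"]] for m in members]
        if len(set(zones)) < len(zones):
            problems.append(f"shard {primary_id} shares a zone: {sorted(zones)}")
    return problems
```

A sidecar would call `find_violations` periodically and, for each reported problem, decide whether to trigger a manual failover or re-assign a replica. That remediation logic is the hard part and is not sketched here.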

Perhaps there should be a way Redis cluster could implement this itself? Maybe we should be able to configure nodes with an "anti-affinity" with other nodes, such that Redis will try to failover masters, and migrate replicas with more intelligence?

@igorwwwwwwwwwwwwwwwwwwww
Contributor

@jacob-pro Note that most of these issues would be addressed by Redis Cluster v2: #10875.
