We have a 64-node (32 master, 32 slave) Redis cluster spanning 4 hosts (each machine has 16 cores and 48 GB RAM); we spawned this many nodes to better utilize the CPUs.
We have completely disabled RDB saves because we are hitting CPU and memory limits on the cluster.
>>redis-cli -c -p 8001 config get save
1) "save"
2) ""
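For reference, the empty `save` value above corresponds to this redis.conf setting (an empty string disables all scheduled snapshots; the same effect can be achieved at runtime with `CONFIG SET save ""`):

```
# redis.conf: an empty save directive disables scheduled RDB snapshots
save ""
```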
However, when things start to melt down, i.e. when this appears on multiple nodes:
12055:M 10 Jan 19:45:10.911 # Cluster state changed: fail
12055:M 10 Jan 19:45:19.527 * FAIL message received from 611a1147e459cff54f96dc625dca25eb8047b651 about 389d2923b63143dca8a907ad7b1ecc6c294399e5
....
I see that the failover happens after the master and slave perform RDB saves on their end. The issue for us is that the nodes became flaky because they were starved of CPU/memory, and triggering an RDB save at that point only makes things worse, causing other nodes on the same machine to fail too.
What's the way out in such cases?
As far as I know, no setting can stop it directly. But you can avoid a full sync by tuning the related replication options, and then the RDB save should not happen.
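A sketch of the relevant redis.conf settings (the values below are illustrative, not tuned recommendations): a large enough replication backlog lets a reconnecting replica perform a partial resync instead of a full one, and diskless sync at least avoids writing the RDB file to disk when a full sync is unavoidable.

```
# Keep more replication history so briefly disconnected replicas can
# partial-resync instead of triggering a full sync (and thus an RDB save)
repl-backlog-size 256mb

# Don't discard the backlog too quickly after all replicas disconnect
repl-backlog-ttl 3600

# If a full sync does happen, stream the RDB over the socket instead of
# writing it to disk first (the master still forks, but skips disk I/O)
repl-diskless-sync yes
repl-diskless-sync-delay 5
```

Whether diskless sync helps here depends on whether your bottleneck is disk I/O or the fork/copy-on-write memory cost; under memory pressure the fork itself can still hurt.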