[NEW] Trigger manual failover on SIGTERM to primary (cluster) #939
Comments
This is a good idea. In my fork, I have a similar feature (it detects a disk error and triggers a failover). The basic idea is that the primary picks the best replica and sends a CLUSTER FAILOVER to it (and waits for it). Do you like this approach, or do you want me to try it?
Yes, this is straightforward. I like it. I have another idea, similar but maybe faster(?) though more complex(?). The idea is this: the primary first pauses writes, then waits for the replica to replicate everything, and then sends CLUSTER FAILOVER FORCE. This avoids step 1 below. This is from the docs of CLUSTER FAILOVER:

And for FORCE:
Yeah, this seems OK to me, and faster. The difference, I think, is that in one case the replica decides that its offset is OK and starts the failover, while in the other the primary tells the replica that it can start the failover.

The CLUSTER FAILOVER one:
1. primary serverCron
2. primary sends CLUSTER FAILOVER
3. replica sends MFSTART
4. primary sends a PING
5. replica clusterCron starts the failover

The CLUSTER FAILOVER FORCE one:
1. primary serverCron
2. primary sends REPLCONF GETACK
3. replica sends REPLCONF ACK
4. primary sends CLUSTER FAILOVER FORCE
5. replica clusterCron starts the failover
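To make the FORCE flow above more concrete, here is a minimal, hedged sketch of the same sequence driven from outside the server with the hiredis client library, using the real CLIENT PAUSE, WAIT and CLUSTER FAILOVER FORCE commands. The hosts, ports, timeouts and overall structure are illustrative assumptions only; the actual feature would live inside serverCron/clusterCron as described in the steps above.

```c
/* Illustrative only: an external orchestrator that mimics the
 * "pause writes, wait for the replica, then FORCE" idea above.
 * Hosts, ports and timeouts are made-up example values. */
#include <stdio.h>
#include <hiredis/hiredis.h>

int main(void) {
    redisContext *primary = redisConnect("127.0.0.1", 7000); /* assumed primary */
    redisContext *replica = redisConnect("127.0.0.1", 7001); /* assumed replica */
    if (!primary || primary->err || !replica || replica->err) {
        fprintf(stderr, "connection failed\n");
        return 1;
    }

    /* Step 1: stop accepting new writes on the primary for up to 10s. */
    redisReply *r = redisCommand(primary, "CLIENT PAUSE 10000 WRITE");
    if (r) freeReplyObject(r);

    /* Step 2: wait until at least 1 replica has acknowledged all writes
     * made so far (up to a 5s timeout). */
    r = redisCommand(primary, "WAIT 1 5000");
    if (r) {
        printf("replicas acked: %lld\n", r->integer);
        freeReplyObject(r);
    }

    /* Step 3: the replica is caught up, so tell it to take over right
     * away, skipping the offset handshake of a plain CLUSTER FAILOVER. */
    r = redisCommand(replica, "CLUSTER FAILOVER FORCE");
    if (r) {
        printf("failover: %s\n", r->str ? r->str : "(nil)");
        freeReplyObject(r);
    }

    redisFree(primary);
    redisFree(replica);
    return 0;
}
```

Inside the server, the primary would instead learn the replica's offset via REPLCONF GETACK/ACK, as in the step list above.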
In shutdown, we already have a feature to stop writes and wait for replicas before shutting down. These two features will need to be combined. Let's discuss the details in a PR. :) Do you want to implement this?
Yeah, I can do it this week.
I don't have enough time to write the test right now, but I guess you might want to take a look at it in advance, so here is the commit: enjoy-binbin@9777d01. I did some small manual testing locally and it seems to work. I will try to find time to finish the test code and the rest of it.
Don't worry. I hope we can have it for 8.2 so you have a few months to finish it. 😸
When a primary disappears, its slots are not served until an automatic failover happens. It takes about n seconds (node timeout plus some seconds). That is too much time for us to not accept writes.

If the host machine is about to shut down for any reason, the processes typically get a SIGTERM and have some time to shut down gracefully. In Kubernetes, this is 30 seconds by default.

When a primary receives a SIGTERM or a SHUTDOWN, let it trigger a failover to one of the replicas as part of the graceful shutdown. This can reduce some of the unavailability time. For example, the replica normally needs to sense the primary failure within the node timeout before initiating an election, whereas now it can initiate an election quickly, win it, and gossip it.

This closes valkey-io#939.

Signed-off-by: Binbin <binloveplay1314@qq.com>
The problem/use-case that the feature addresses
When a primary disappears, its slots are not served until an automatic failover happens. It takes about 3 seconds (node timeout plus some seconds). That is too much time for us to not accept writes.
If the host machine is about to shut down for any reason, the processes typically get a SIGTERM and have some time to shut down gracefully. In Kubernetes, this is 30 seconds by default.
Description of the feature
When a primary receives a SIGTERM, let it trigger a failover to one of the replicas as part of the graceful shutdown.
Alternatives you've considered
Our current solution is to have a wrapper process, a small script, that starts Valkey. This process then receives the SIGTERM and can handle it. It's a workaround, though; it would be better to have this built in.
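For reference, such a wrapper could look roughly like the sketch below (written in C here rather than as a shell script, purely for illustration). It forks valkey-server and, on SIGTERM, first asks an assumed replica on port 7001 to take over via valkey-cli before forwarding the signal; the binary names, config path and port are example assumptions, not a recommended setup.

```c
/* Hypothetical wrapper: not part of Valkey, just an illustration of the
 * workaround described above. Ports, paths and binary names are made up. */
#include <errno.h>
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static volatile sig_atomic_t got_sigterm = 0;

static void on_sigterm(int sig) {
    (void)sig;
    got_sigterm = 1; /* just set a flag; the real work happens in main() */
}

int main(void) {
    pid_t server_pid = fork();
    if (server_pid < 0) {
        perror("fork");
        return 1;
    }
    if (server_pid == 0) {
        /* Child: run the actual server (assumed binary and config path). */
        execlp("valkey-server", "valkey-server", "/etc/valkey/valkey.conf", (char *)NULL);
        perror("execlp");
        _exit(1);
    }

    struct sigaction sa = {0};
    sa.sa_handler = on_sigterm; /* no SA_RESTART, so waitpid returns EINTR */
    sigaction(SIGTERM, &sa, NULL);

    int status = 0;
    for (;;) {
        if (waitpid(server_pid, &status, 0) == server_pid) break; /* server exited */
        if (errno == EINTR && got_sigterm) {
            got_sigterm = 0;
            /* Ask an assumed replica (port 7001) to take over first... */
            system("valkey-cli -p 7001 CLUSTER FAILOVER");
            /* ...then let the server shut down gracefully. */
            kill(server_pid, SIGTERM);
        }
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : 1;
}
```

With the feature built in, none of this would be needed: the server itself would trigger the failover as part of its own graceful shutdown.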
Additional information
We use cluster mode.