Sentinel tries to elect odown master. #1821

Closed
DutchMark opened this issue Jun 17, 2014 · 4 comments

@DutchMark

I have a very simple setup: one master and one slave. Each instance also runs a Sentinel, with the same configuration file on both:

sentinel monitor mymaster 10.99.13.107 6379 1
sentinel down-after-milliseconds mymaster 10000
sentinel failover-timeout mymaster 10000
loglevel verbose

When I kill the master instance, the failover procedure kicks in correctly and the slave gets promoted to master. However, when I kill both the master and the Sentinel on the same instance (to simulate what happens when an instance crashes or goes down completely), the failover procedure does not happen. The Sentinel that lives on the slave instance keeps trying to elect the original master. The log of that Sentinel is this:

[28069] 17 Jun 22:57:20.302 # Sentinel runid is 7d08ab54ddce7931c745459996aa0cf1e33f98c1
[28069] 17 Jun 22:57:20.302 # +monitor master mymaster 10.99.13.107 6379 quorum 1
[28069] 17 Jun 22:57:20.900 * +sentinel sentinel 10.99.13.107:26379 10.99.13.107 26379 @ mymaster 10.99.13.107 6379
[28069] 17 Jun 22:57:20.914 # +new-epoch 283
[28069] 17 Jun 22:57:22.395 - Accepted 10.99.13.107:49615
[28069] 17 Jun 22:57:40.395 * +slave slave 10.194.250.140:6379 10.194.250.140 6379 @ mymaster 10.99.13.107 6379
[28069] 17 Jun 22:58:50.896 - Client closed connection
[28069] 17 Jun 22:59:00.940 # +sdown sentinel 10.99.13.107:26379 10.99.13.107 26379 @ mymaster 10.99.13.107 6379
[28069] 17 Jun 22:59:01.196 # +sdown master mymaster 10.99.13.107 6379
[28069] 17 Jun 22:59:01.196 # +odown master mymaster 10.99.13.107 6379 #quorum 1/1
[28069] 17 Jun 22:59:01.196 # +new-epoch 284
[28069] 17 Jun 22:59:01.196 # +try-failover master mymaster 10.99.13.107 6379
[28069] 17 Jun 22:59:01.203 # +vote-for-leader 7d08ab54ddce7931c745459996aa0cf1e33f98c1 284
[28069] 17 Jun 22:59:11.538 # -failover-abort-not-elected master mymaster 10.99.13.107 6379
[28069] 17 Jun 22:59:11.628 # Next failover delay: I will not start a failover before Tue Jun 17 22:59:21 2014
[28069] 17 Jun 22:59:21.475 # +new-epoch 285
[28069] 17 Jun 22:59:21.475 # +try-failover master mymaster 10.99.13.107 6379
[28069] 17 Jun 22:59:21.482 # +vote-for-leader 7d08ab54ddce7931c745459996aa0cf1e33f98c1 285
[28069] 17 Jun 22:59:32.449 # -failover-abort-not-elected master mymaster 10.99.13.107 6379
[28069] 17 Jun 22:59:32.525 # Next failover delay: I will not start a failover before Tue Jun 17 22:59:42 2014

and so on; it keeps retrying. As you can see from this log, it knows about the slave (+slave). It also knows the master has gone down ("+sdown master mymaster" and "+odown master mymaster"). So why does it keep issuing "+try-failover master mymaster 10.99.13.107 6379" and trying to elect the master that it knows is down?

redis-server --version: Redis server v=2.8.10 sha=00000000:0 malloc=tcmalloc-2.0 bits=64 build=176d015270bbec54

@icyice80

Check the Redis Sentinel docs: based on your master/slave config, you need a minimum of 3 Sentinels with a quorum of 2, so that when one Sentinel goes down the other two can still elect a leader and kick off the failover process.
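
For illustration only (not part of the original report), a minimal sentinel.conf along those lines, reusing the master address from above but with a quorum of 2, might look like this on each of the three Sentinel hosts:

# same file on all three Sentinel hosts; quorum 2 means two Sentinels
# must agree the master is down before a failover can be authorized
sentinel monitor mymaster 10.99.13.107 6379 2
sentinel down-after-milliseconds mymaster 10000
sentinel failover-timeout mymaster 10000

With that layout, losing one host (its Redis server plus its Sentinel) still leaves two Sentinels, which is a majority of three, so a failover leader can be elected.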

@antirez
Contributor

antirez commented Jun 18, 2014

Or... a single one, if you don't care about Sentinel being a single point of failure (a discouraged approach). However, note that if you go for three, you need to set up Sentinel on three different computers (or virtual machines) that are likely to fail independently; otherwise you have a setup that is only valid under the assumption of single processes failing (like the Redis server crashing), but that does not work on netsplits, since two or more Sentinels would run on the same physical host (and so would always get partitioned together).
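
As a rough sketch (the host names and file path here are assumptions for illustration, not from this thread), the three-machine layout would look something like:

# host-a: redis-server (master) + one Sentinel
# host-b: redis-server (slave)  + one Sentinel
# host-c: one Sentinel only
# each Sentinel is started the same way, with quorum 2 in its config:
redis-sentinel /etc/redis/sentinel.conf

Because each Sentinel runs on hardware that fails independently, a whole-machine crash or a netsplit still leaves a majority of Sentinels able to agree on a leader.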

@mattsta mattsta closed this as completed Oct 29, 2014
@jeuniii

jeuniii commented Oct 29, 2017

@DutchMark So did you figure out what the issue was? I'm having the exact same issue as you. My quorum is set to 1 since I'm just testing it out. Eventually I'll have 3 separate nodes with quorum set to 1.

@feigyfroilich

Why is this closed? I am facing the same issue.
@DutchMark Have you found a solution?
