
Sometimes new sentinels (Redis 2.8.8) give up failover with -failover-abort-no-good-slave #1796

Closed
renhua91 opened this issue Jun 5, 2014 · 9 comments

Comments

@renhua91

renhua91 commented Jun 5, 2014

Recently I set up two Redis servers (a master and a slave), each also running Sentinel, plus a third node running only Sentinel for the quorum. But sometimes when I shut down the master, Sentinel cannot promote the slave to master.
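For reference, a minimal sentinel.conf sketch for this topology (the master address and quorum of 2 come from the logs below; the timeout values are assumed Sentinel defaults, not necessarily my real settings):

    # sentinel.conf, same on all three nodes
    port 26379
    sentinel monitor mymaster 192.168.20.242 6379 2
    sentinel down-after-milliseconds mymaster 30000
    sentinel parallel-syncs mymaster 1
    sentinel failover-timeout mymaster 180000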

Here are the Sentinel logs:
[30728] 03 Jun 10:26:06.201 # +vote-for-leader 0ca2c9cc633be28e522e3c270fc02432e98dc08b 61
[30728] 03 Jun 10:26:06.288 # 192.168.20.230:26379 voted for 0ca2c9cc633be28e522e3c270fc02432e98dc08b 61
[30728] 03 Jun 10:26:06.332 # +elected-leader master mymaster 192.168.20.242 6379
[30728] 03 Jun 10:26:06.333 # +failover-state-select-slave master mymaster 192.168.20.242 6379
[30728] 03 Jun 10:26:06.400 # +selected-slave slave 192.168.20.230:6379 192.168.20.230 6379 @ mymaster 192.168.20.242 6379
[30728] 03 Jun 10:26:06.401 * +failover-state-send-slaveof-noone slave 192.168.20.230:6379 192.168.20.230 6379 @ mymaster 192.168.20.242 6379
[30728] 03 Jun 10:26:06.485 * +failover-state-wait-promotion slave 192.168.20.230:6379 192.168.20.230 6379 @ mymaster 192.168.20.242 6379
[30728] 03 Jun 10:26:07.224 # +promoted-slave slave 192.168.20.230:6379 192.168.20.230 6379 @ mymaster 192.168.20.242 6379
[30728] 03 Jun 10:26:07.225 # +failover-state-reconf-slaves master mymaster 192.168.20.242 6379
[30728] 03 Jun 10:26:07.291 # +failover-end master mymaster 192.168.20.242 6379
[30728] 03 Jun 10:26:07.292 # +switch-master mymaster 192.168.20.242 6379 192.168.20.230 6379
[30728] 03 Jun 10:26:07.293 * +slave slave 192.168.20.242:6379 192.168.20.242 6379 @ mymaster 192.168.20.230 6379
[30728] 03 Jun 10:26:37.341 # +sdown slave 192.168.20.242:6379 192.168.20.242 6379 @ mymaster 192.168.20.230 6379
[30728] 03 Jun 10:26:45.681 # -sdown slave 192.168.20.242:6379 192.168.20.242 6379 @ mymaster 192.168.20.230 6379
[30728] 03 Jun 10:26:55.648 * +convert-to-slave slave 192.168.20.242:6379 192.168.20.242 6379 @ mymaster 192.168.20.230 6379
[30728] 03 Jun 10:27:05.694 * +convert-to-slave slave 192.168.20.242:6379 192.168.20.242 6379 @ mymaster 192.168.20.230 6379
[30728] 03 Jun 10:27:15.738 * +convert-to-slave slave 192.168.20.242:6379 192.168.20.242 6379 @ mymaster 192.168.20.230 6379
[30728] 03 Jun 10:27:31.766 # -sdown sentinel 192.168.20.242:26379 192.168.20.242 26379 @ mymaster 192.168.20.230 6379
[30728] 03 Jun 10:27:33.705 * -dup-sentinel master mymaster 192.168.20.230 6379 #duplicate of 192.168.20.242:26379 or cfdbcd08456b508e4c2adcbc83b4c47c7022da60
[30728] 03 Jun 10:27:33.705 * +sentinel sentinel 192.168.20.242:26379 192.168.20.242 26379 @ mymaster 192.168.20.230 6379
[30728] 03 Jun 10:28:31.402 # +sdown sentinel 192.168.20.230:26379 192.168.20.230 26379 @ mymaster 192.168.20.230 6379
[30728] 03 Jun 10:28:45.485 # +sdown master mymaster 192.168.20.230 6379
[30728] 03 Jun 10:28:45.546 # +odown master mymaster 192.168.20.230 6379 #quorum 2/2
[30728] 03 Jun 10:28:45.546 # +new-epoch 62
[30728] 03 Jun 10:28:45.546 # +try-failover master mymaster 192.168.20.230 6379
[30728] 03 Jun 10:28:45.552 # +vote-for-leader 0ca2c9cc633be28e522e3c270fc02432e98dc08b 62
[30728] 03 Jun 10:28:45.568 # 192.168.20.242:26379 voted for 0ca2c9cc633be28e522e3c270fc02432e98dc08b 62
[30728] 03 Jun 10:28:45.643 # +elected-leader master mymaster 192.168.20.230 6379
[30728] 03 Jun 10:28:45.643 # +failover-state-select-slave master mymaster 192.168.20.230 6379
[30728] 03 Jun 10:28:45.728 # -failover-abort-no-good-slave master mymaster 192.168.20.230 6379
[30728] 03 Jun 10:29:01.950 # +new-epoch 63

@charsyam
Contributor

charsyam commented Jun 5, 2014

@PumpkinJack Sentinel only promotes a "good" slave to be the new master. A slave is considered good only if it satisfies all of the rules below (see the sketch after the list for how to check them):

  1. Its slave-priority is not 0.
  2. It has not been demoted (i.e. it is not the old master).
  3. Its last PING reply is recent enough (within the validity window).
  4. Its last INFO reply is recent enough (within the validity window).
  5. It is not flagged sdown or odown, and is not disconnected.
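A sketch of how these can be checked with stock redis-cli commands (the host/port values reuse the addresses from the logs above and are only examples):

    # Rule 1: slave-priority must not be 0
    redis-cli -h 192.168.20.230 -p 6379 config get slave-priority

    # Rules 3-5: ask Sentinel for its own view of the slave
    # (look at the flags, last-ok-ping-reply and info-refresh fields)
    redis-cli -h 192.168.20.230 -p 26379 sentinel slaves mymaster

If the flags field contains s_down or disconnected, or the last-ok-ping-reply / info-refresh values are very large, the slave will be skipped during selection.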

@renhua91
Author

renhua91 commented Jun 5, 2014

But when the problem occurred, I checked the slave's INFO output and found it was alive.

@renhua91
Author

renhua91 commented Jun 5, 2014

@charsyam

@charsyam
Contributor

charsyam commented Jun 5, 2014

@PumpkinJack Maybe it was skipped because of rule 3 or 4. Even if the slave is alive, it may not be a good slave.

@renhua91
Author

renhua91 commented Jun 5, 2014

@charsyam When I found that, I started the master again, waited one minute, and then shut it down again. This time Sentinel promoted the slave to master successfully. I don't know why.

@chandanbansal

I am facing the same issue: I am not able to promote a node that was demoted. Is there a way to promote it again, or is some configuration required for that?
For example, I have two machines, m1 and m2:
m1 is the master.
m2 is the slave.
m1 goes down. Sentinel works and promotes m2 to master.
m1 comes back, and the new state is:
m1 is the slave.
m2 is the master.
When m2 goes down, Sentinel does not promote m1; it says failover-abort-no-good-slave.
How can m1 be promoted to master again?

@badboy
Contributor

badboy commented Oct 1, 2014

What does the log say? If Sentinel decides there is no good slave, it does so for a reason. Check the master and slave logs. Check whether the master can persist correctly and that the slave is fully synced.
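A quick way to check both, assuming stock redis-cli and reusing the addresses from the logs earlier in this thread:

    # On the current master: persistence health and attached slaves
    redis-cli -h 192.168.20.230 -p 6379 info persistence | grep -E 'rdb_last_bgsave_status|aof_last_write_status'
    redis-cli -h 192.168.20.230 -p 6379 info replication

    # On the slave: replication link state and sync progress
    redis-cli -h 192.168.20.242 -p 6379 info replication | grep -E 'role|master_link_status|master_sync_in_progress'

A healthy slave reports role:slave, master_link_status:up and master_sync_in_progress:0.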

@chandanbansal

Thanks, the logs helped to solve the issue. The Redis password was not defined in the master config.
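For anyone else hitting this, a sketch of the password-related settings, assuming a single password shared by all nodes (the value is a placeholder). When requirepass is set, every node also needs masterauth so that a former master can re-sync as a slave after being demoted, and Sentinel needs the password to talk to the instances:

    # redis.conf on every node (each node may become master or slave)
    requirepass <your-password>
    masterauth  <your-password>

    # sentinel.conf on every Sentinel node
    sentinel auth-pass mymaster <your-password>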

@badboy
Contributor

badboy commented Oct 1, 2014

Can be closed then.

@mattsta mattsta closed this as completed Oct 6, 2014