Sidekiq processes unresponsive to signals if redis dies #2235

laboon · 2015-03-13T18:59:40Z

If redis dies, the sidekiq workers go into an infinite loop of trying to connect to redis (which I believe is expected behavior, if I understand #1586 correctly). However, they are at this point unresponsive to SIGINT/SIGTERM/SIGTTIN signals, so there's no way to shut them down (aside from SIGKILL) or see what is happening with them to determine why the processes are sticking around (unless logging is set to DEBUG).

mperham · 2015-03-13T19:02:40Z

Cannot reproduce. If I start Sidekiq on OSX and then stop Redis, I can still Ctrl-C the process.

laboon · 2015-03-13T19:17:10Z

Hmm. I am also running on OS X, sidekiq 3.3.2, redis server 2.8.19. It is running via foreman:

redis: redis-server ./config/redis.conf
sidekiq_archive: bundle exec sidekiq -c 4 -q archive -L log/sidekiq8.log

When I Ctrl-C out of that, the redis server shuts down, but the sidekiq process is still running:

(7635) $ ps -ef | grep sidekiq | grep -v grep
  501 84899     1   0  3:10PM ttys000    0:14.04 sidekiq 3.3.1 apangea [0 of 4 busy]

It is not responsive to kill.

(7638) $ kill 84899

(7639) $ pgrep sidekiq
84899

I try sending in a TTIN to see what it's doing...

(7641) $ kill -TTIN 84899

But I don't see the thread dump in the logs. There is just a repeated (once per 10 seconds) attempt to connect to redis:

2015-03-13T19:14:29.773Z 84899 TID-ov2x49iig WARN: Error connecting to Redis on localhost:6380 (ECONNREFUSED)
2015-03-13T19:14:29.773Z 84899 TID-ov2x49iig WARN: /Users/laboon/.rvm/gems/ruby-2.0.0-p481@apangea/gems/redis-3.0.7/lib/redis/client.rb:290:in `rescue in establish_connection'
/Users/laboon/.rvm/gems/ruby-2.0.0-p481@apangea/gems/redis-3.0.7/lib/redis/client.rb:285:in `establish_connection'
/Users/laboon/.rvm/gems/ruby-2.0.0-p481@apangea/gems/redis-3.0.7/lib/redis/client.rb:79:in `block in connect'

Looking further back in the logs, I see that the sidekiq process did receive the SIGINT from foreman:

2015-03-13T19:11:19.638Z 84899 TID-ov316km3k DEBUG: Got INT signal
2015-03-13T19:11:19.638Z 84899 TID-ov316km3k INFO: Shutting down

Perhaps I am missing something?

laboon · 2015-03-13T19:37:50Z

That commit fixes this, thank you very much for your extremely prompt help! Very, very, very appreciated.

mperham · 2015-03-13T19:45:28Z

Huh, I didn't think that would fix your issue. Thanks for following up and I'm glad it did!

laboon · 2015-03-13T20:39:28Z

Ahh, I think I left an important detail out of the original description (although it is implicit in the repro steps of my second comment): the signals are ignored only during the shutdown process. The sidekiq process now shuts down correctly on the first SIGINT even if it can't connect to redis, since the extra rescue lets it break out of the raise exception -> retry redis connection loop. That fixes the "meta-issue" of shutting down on the first signal, so the process no longer "needs" to deal with any further ones.
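To illustrate the loop described above, here is a minimal sketch in plain Ruby. It is not the code from the referenced commit, whose contents are not shown in this thread. `RetryingFetcher`, `FakeConnectionError`, `simulate`, and the `check_done_in_rescue:` flag are hypothetical names made up for the example, and no real Sidekiq or redis-rb API is used. The assumption is that the fetch loop retries from inside its `rescue` clause, so a shutdown flag that is only checked outside that clause is never seen while Redis is down.

```ruby
# Hypothetical stand-in for a Redis connection failure (not a redis-rb class).
class FakeConnectionError < StandardError; end

# Hypothetical worker loop, loosely modelled on a fetch-and-retry cycle.
class RetryingFetcher
  attr_reader :attempts

  def initialize(retry_delay: 0.001, max_attempts: 50)
    @done = false
    @retry_delay = retry_delay
    @max_attempts = max_attempts # safety cap so the demo always terminates
    @attempts = 0
  end

  # Called by the signal-handling side when INT/TERM arrives.
  def terminate
    @done = true
  end

  # The block simulates one fetch from Redis and is expected to raise
  # FakeConnectionError while Redis is down.
  def run(check_done_in_rescue:)
    until @done
      begin
        @attempts += 1
        yield @attempts
      rescue FakeConnectionError
        # The assumed fix: notice the shutdown request before retrying.
        return :stopped if check_done_in_rescue && @done
        # Without the cap, the unfixed variant would spin forever.
        return :gave_up if @attempts >= @max_attempts
        sleep @retry_delay
        retry # re-runs the begin block, so `until @done` is never re-checked
      end
    end
    :stopped
  end
end

# Runs one simulated outage in which the shutdown request arrives on attempt 3.
def simulate(check_done_in_rescue)
  fetcher = RetryingFetcher.new
  result = fetcher.run(check_done_in_rescue: check_done_in_rescue) do |attempt|
    fetcher.terminate if attempt == 3
    raise FakeConnectionError, "ECONNREFUSED"
  end
  [result, fetcher.attempts]
end

p simulate(true)  # prints [:stopped, 3]
p simulate(false) # prints [:gave_up, 50]
```

With the check inside the `rescue`, the loop stops on the same attempt that saw the shutdown request. Without it, the loop only ends because of the artificial cap, which matches a process that logs a connection error every few seconds and never exits.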

mperham added a commit that referenced this issue Mar 13, 2015

Fix crash on exit when Redis is down, #2235

b5f3d36

laboon closed this as completed Mar 13, 2015
