Ent Rolling Restarts

Mike Perham edited this page Jan 30, 2018 · 19 revisions

Do you have long-running jobs which might take minutes or hours to finish? Do you like to deploy frequently? Sidekiq's traditional TSTP+TERM shutdown process can cause jobs to rerun, especially if they are long-running.

Sidekiq Enterprise 1.8.0 adds support for rolling restarts. In this mode, a Sidekiq process will quiet immediately but will not exit until all of its jobs are complete. There is no limit to how long it can continue running. When a rolling restart is signalled, a new process is started to pick up new jobs.

How

einhorn is a special process manager which manages the rolling restart. Use it to run sidekiq or sidekiqswarm:

einhorn -m10 -- bundle exec sidekiqswarm -q critical -q default -q low

Send the USR2 signal to the einhorn process after deploying your new codebase to trigger a rolling restart. If the deploy fails, don't send the signal; the old processes will keep running as normal.
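As a rough sketch, a deploy script could send the signal with a small helper like the one below. The function name `roll_sidekiq` and the idea of reading the einhorn PID from a pidfile are assumptions made for illustration; use whatever mechanism your setup has for locating the einhorn process.

```shell
#!/bin/sh
# Hypothetical helper (not part of Sidekiq or einhorn): send USR2 to the
# einhorn process whose PID is stored in the given pidfile.
roll_sidekiq() {
  pidfile="$1"
  if [ ! -r "$pidfile" ]; then
    echo "cannot read pidfile: $pidfile" >&2
    return 1
  fi
  pid="$(cat "$pidfile")"
  # kill -0 sends no signal; it only checks that the process exists
  # and that we are allowed to signal it.
  if ! kill -0 "$pid" 2>/dev/null; then
    echo "einhorn (pid $pid) is not running" >&2
    return 1
  fi
  kill -USR2 "$pid"
}
```

Call it only once the deploy has succeeded, for example `roll_sidekiq /var/run/einhorn.pid` (the path is a placeholder). If the deploy fails, skip the call and the old processes keep running.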

If the new Sidekiq process fails within 10 seconds (e.g. because the new codebase is buggy), einhorn will not stop the old process.

Integration

I recommend you update your systemd/upstart integration so that reload triggers a rolling restart. Modern init systems implement restart as explicit start and stop operations. You'll want to keep those available for situations where a rolling restart is not possible, like major database migrations.

Ideally you'll have two types of deploys: the default, "minor" deploy which rolls Sidekiq and a "major" deploy which stops Sidekiq completely.
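The two deploy types could be wired into a deploy script roughly as follows. This is only a sketch: the function name `deploy_sidekiq`, the `SIDEKIQ_CTL` override, and the service name `sidekiq` are assumptions, and it presumes a systemd host where reload has been mapped to a rolling restart.

```shell
#!/bin/sh
# Hypothetical deploy step: a "minor" deploy rolls Sidekiq, a "major"
# deploy stops it completely and starts it again.
deploy_sidekiq() {
  deploy_type="${1:-minor}"           # default to the rolling, "minor" deploy
  ctl="${SIDEKIQ_CTL:-systemctl}"     # overridable, e.g. SIDEKIQ_CTL=echo for a dry run
  case "$deploy_type" in
    minor) $ctl reload sidekiq ;;     # reload triggers the rolling restart
    major) $ctl restart sidekiq ;;    # restart is an explicit stop and start
    *)
      echo "unknown deploy type: $deploy_type" >&2
      return 1
      ;;
  esac
}
```

Setting `SIDEKIQ_CTL=echo` before calling `deploy_sidekiq major` prints the arguments that would be passed instead of touching the service, which makes it easy to check the script before a real deploy.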

systemctl reload sidekiq
initctl reload sidekiq

# For Upstart users
# /etc/init/sidekiq.conf
reload signal USR2

# For systemd users
# /etc/systemd/system/sidekiq.service
# (absolute path to kill, which older systemd versions require)
ExecReload=/bin/kill -USR2 $MAINPID

Notes

  • Old and new processes will be running at the same time. You will need to consider compatibility and take care around database migrations.
  • Rolling restarts work with both the sidekiq and sidekiqswarm binaries.