Pro Reliability Server

Sidekiq does what it can to avoid losing jobs. When it shuts down, it pushes any unfinished jobs back to Redis. 99% of the time, that's sufficient. But there are limits: a job is held in process memory while it executes, so if the process crashes or network connectivity goes down, the job can be lost.

To handle those edge cases, the job must remain in Redis while Sidekiq executes it. Sidekiq Pro provides two different algorithms to do just that. To activate one, add this to your initializer:

ONE_HOUR = 3600 # this is the default
Sidekiq.configure_server do |config|
  # uncomment one!
  #config.reliable_fetch!
  #config.timed_fetch! ONE_HOUR
end

reliable_fetch

This is the algorithm that Sidekiq Pro has provided from Day 1. It uses the rpoplpush command and stores jobs within a private queue for each process while executing.
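The private-queue idea can be sketched in plain Ruby. This is only an illustration under stated assumptions, not Sidekiq Pro's code: `FakeRedis` is a hypothetical in-memory stand-in that mimics the list commands it names, and the private queue name (hostname plus process index) is made up for the example.

```ruby
# Hypothetical in-memory stand-in for a few Redis list commands.
# It talks to no server; it only mimics the semantics needed here.
class FakeRedis
  def initialize
    @lists = Hash.new { |hash, key| hash[key] = [] }
  end

  # Like LPUSH: add a value at the head of the list.
  def lpush(key, value)
    @lists[key].unshift(value)
    @lists[key].size
  end

  # Like RPOPLPUSH: move the tail of source to the head of destination.
  def rpoplpush(source, destination)
    value = @lists[source].pop
    @lists[destination].unshift(value) unless value.nil?
    value
  end

  # Like LREM with a count of 1: drop one matching entry.
  def lrem(key, value)
    index = @lists[key].index(value)
    @lists[key].delete_at(index) if index
  end

  # Like LLEN: number of entries in the list.
  def llen(key)
    @lists[key].size
  end
end

QUEUE = "queue:default"
# Assumed naming scheme: the private queue is tied to a stable
# hostname ("app1") and a per-process index (0).
PRIVATE_QUEUE = "queue:default_app1_0"

# Fetching moves the job into the private queue, so it never leaves Redis.
def fetch(redis)
  redis.rpoplpush(QUEUE, PRIVATE_QUEUE)
end

# A finished job is removed from the private queue.
def acknowledge(redis, job)
  redis.lrem(PRIVATE_QUEUE, job)
end

# On restart, anything left in the private queue goes back to the public queue.
def recover(redis)
  count = 0
  count += 1 while redis.rpoplpush(PRIVATE_QUEUE, QUEUE)
  count
end

redis = FakeRedis.new
redis.lpush(QUEUE, "job-1")
redis.lpush(QUEUE, "job-2")

first = fetch(redis)
acknowledge(redis, first) # job-1 finished normally
fetch(redis)              # job-2 is fetched, then the process "crashes"
recovered = recover(redis)
puts "recovered #{recovered} job(s); queue length is #{redis.llen(QUEUE)}"
```

The sketch also shows why this approach needs stable hostnames and a unique index: a restarted process can only find its orphaned jobs if it computes the same private queue name as before.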

Pros

Scales to 10,000+ jobs/sec because it uses O(1) operations
Old and battle tested

Cons

Requires stable hostnames and a unique index per-process
Does not work well with Heroku, Docker, Amazon's ECS or Elastic Beanstalk
Susceptible to "poison pill" jobs

A good choice if you run in the traditional manner on your own servers, virtual or physical. Avoid it if you use containers or a PaaS like Heroku. Because jobs are retried when the process restarts, a job that crashes the Ruby VM (a "poison pill") can crash your processes non-stop until the job is removed manually.

timed_fetch

This is a new algorithm introduced in Sidekiq Pro v3.1. It uses Lua and stores jobs within a "pending" area with a timeout. If the job execution is not finished and acknowledged by the client within that timeout period, the job can be pushed back onto the queue for another process to pick up.
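The pending-area-with-timeout idea can be sketched in plain Ruby. This is a minimal in-memory illustration, not the Lua that Sidekiq Pro runs inside Redis: the `TimedQueue` class, its method names, and the explicit `now` argument (used in place of a real clock) are all assumptions made for the example.

```ruby
# Hypothetical in-memory model of a queue with a timed "pending" area.
# Times are plain numbers of seconds passed in by the caller.
class TimedQueue
  def initialize(timeout)
    @timeout = timeout
    @ready = []    # jobs waiting to be fetched
    @pending = {}  # job => deadline by which it must be acknowledged
  end

  def push(job)
    @ready.push(job)
  end

  # Fetching moves the job to the pending area and records its deadline.
  def fetch(now)
    job = @ready.shift
    @pending[job] = now + @timeout unless job.nil?
    job
  end

  # Acknowledging removes the job from the pending area.
  # Returns true if the job was pending, false otherwise.
  def acknowledge(job)
    !@pending.delete(job).nil?
  end

  # Jobs whose deadline has passed are pushed back for another process.
  def requeue_expired(now)
    expired = @pending.select { |_job, deadline| deadline <= now }.keys
    expired.each do |job|
      @pending.delete(job)
      @ready.push(job)
    end
    expired
  end

  def ready_size
    @ready.size
  end

  def pending_size
    @pending.size
  end
end

TIMEOUT_SECONDS = 3600 # one hour, matching the default described above

queue = TimedQueue.new(TIMEOUT_SECONDS)
queue.push("poison-pill")
queue.fetch(0)                       # fetched at t=0, never acknowledged
early = queue.requeue_expired(1800)  # 30 minutes in: deadline not reached
late = queue.requeue_expired(3600)   # 60 minutes in: deadline reached
puts "after 30 min: #{early.inspect}, after 60 min: #{late.inspect}"
```

Because an unacknowledged job only returns to the queue once its deadline passes, a job that kills its process can be retried at most once per timeout period.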

Pros

No special configuration or specialization required
Works in every deployment environment, containers or not
Handles "poison pills" gracefully

Cons

Less scalable because it uses O(log N) operations
New, unproven

A good choice for anyone processing fewer than 10M jobs/day or wanting to use containers. Jobs that crash the Ruby VM ("poison pills") are not retried until the timeout is up (one hour by default), so they can't crash Sidekiq non-stop, only once per hour.
