Ent Multi Process

Mike Perham edited this page Jul 18, 2018 · 32 revisions

Sidekiq Enterprise has the ability to start and manage multiple Sidekiq child processes, a la Unicorn or Puma. With multi-process mode, you get several advantages:

  1. modest memory savings from sharing memory between processes
  2. multiple processes on a single Heroku dyno, which helps minimize your dyno costs
  3. a single service in Upstart or Systemd that scales to all machine cores
  4. automated memory monitoring and restart of bloated child processes

A group of processes running in multi-process mode is called a swarm. A swarm has one parent process and N child processes.
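
As a rough illustration of that parent/child shape, the sketch below forks N children from one parent and then waits for them. The name `start_swarm` is hypothetical, and the real sidekiqswarm parent also supervises and restarts its children, which this sketch leaves out.

```ruby
# Illustrative only: start_swarm is a made-up helper, not part of Sidekiq Enterprise.
def start_swarm(count)
  # The parent forks `count` children; each child runs the given block.
  Array.new(count) { |index| fork { yield index } }
end

# Demo: three children that exit immediately with status 0.
pids = start_swarm(3) { |_index| exit!(0) }
statuses = pids.map { |pid| Process.wait2(pid).last }
puts "started #{pids.size} children, clean exits: #{statuses.count(&:success?)}"
```

In a real swarm each child would boot a full Sidekiq process instead of exiting right away.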

Starting a Swarm

Sidekiq Enterprise provides a sidekiqswarm binary. This binary is designed to run under Upstart, Systemd or Foreman as a service. It does not allow old-style options like --daemonize, --logfile or --pidfile.

sidekiqswarm [options]

Start and supervise a swarm of Sidekiq processes.
All arguments are passed to each Sidekiq instance.

You may not use the `-d`, `-L` or `-P` options.

Use the SIDEKIQ_* environment variables to control sidekiqswarm.

Variable            Description
SIDEKIQ_COUNT       Number of Sidekiq child processes to start, defaults to number of cores
SIDEKIQ_MAXMEM_MB   Max RSS size in MB of child process before the parent will restart it
SIDEKIQ_PRELOAD     Comma-separated list of Bundler groups to preload before forking

SIDEKIQ_COUNT=5 SIDEKIQ_MAXMEM_MB=300 bundle exec bin/sidekiqswarm -r ./myworker.rb
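
To make the defaults concrete, here is a minimal sketch of how a launcher might read these variables. The helper name `swarm_settings` and its parsing rules are assumptions for illustration, not Sidekiq Enterprise's actual code; only the variable names and the "defaults to number of cores" behavior come from the table above.

```ruby
require "etc"

# Hypothetical helper: read the SIDEKIQ_* variables from an environment hash.
# The parsing rules here are assumptions, not Sidekiq Enterprise internals.
def swarm_settings(env = ENV)
  count = env["SIDEKIQ_COUNT"].to_s.strip
  maxmem = env["SIDEKIQ_MAXMEM_MB"].to_s.strip
  {
    # Fall back to the number of cores when SIDEKIQ_COUNT is unset or blank.
    count: count.empty? ? Etc.nprocessors : Integer(count),
    # nil means no memory limit was configured.
    maxmem_mb: maxmem.empty? ? nil : Integer(maxmem)
  }
end

p swarm_settings({ "SIDEKIQ_COUNT" => "5", "SIDEKIQ_MAXMEM_MB" => "300" })
p swarm_settings({})
```

Passing a plain hash instead of `ENV` keeps the helper easy to try out without changing the real environment.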

Running via init


Sidekiq has a sample systemd unit file. Starting sidekiqswarm instead is almost identical: update the ExecStart line and configure the environment as necessary:

# to override the default number of processes, set SIDEKIQ_COUNT, e.g.
# Environment=SIDEKIQ_COUNT=5
ExecStart=/usr/local/bin/bundle exec sidekiqswarm -e production


Sidekiq has a sample upstart conf file. Starting sidekiqswarm instead is almost identical: update the exec line within the script block and configure the environment as necessary:

# to override the default number of processes, set SIDEKIQ_COUNT, e.g.
# export SIDEKIQ_COUNT=5
exec bundle exec sidekiqswarm -e production

Signals and Controlling a Swarm

Use the standard upstart and systemd tools to manage the service for your swarm, e.g. systemctl restart sidekiq.

You can send the TERM and TSTP signals to the parent process and it will pass those signals to the underlying children. Once the parent process has received TSTP or TERM, it will not spawn any more children; it must be restarted. The parent process does not handle the TTIN signal.
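
The forwarding behavior described above can be sketched roughly as follows. The helper names and the `state` hash are hypothetical; this is one way a parent could pass TERM and TSTP on to its children and stop respawning, not the actual sidekiqswarm implementation.

```ruby
# Illustrative only: these helpers are assumptions, not Sidekiq Enterprise code.
FORWARDED_SIGNALS = %w[TERM TSTP].freeze

# Send one signal to every child, skipping children that have already exited.
def forward_signal(signal, pids)
  pids.each do |pid|
    begin
      Process.kill(signal, pid)
    rescue Errno::ESRCH
      next # no such process: the child is already gone
    end
  end
end

# Install parent-side handlers. After TERM or TSTP the parent stops respawning.
def install_forwarding(pids, state)
  FORWARDED_SIGNALS.each do |signal|
    Signal.trap(signal) do
      state[:respawn] = false
      forward_signal(signal, pids)
    end
  end
end

# Demo: forward TERM to two sleeping children and reap them.
children = Array.new(2) { fork { sleep 30 } }
forward_signal("TERM", children)
children.each { |pid| Process.wait(pid) }
puts "all children stopped"
```

Note that the demo children are plain Ruby processes; a Sidekiq child reacts to TSTP by quieting rather than suspending, so the demo only forwards TERM.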

Bundler Preload

sidekiqswarm forks the child processes after running Bundler.require(:default) but before booting the application, so the children can share the memory consumed by loading the gems. Your Gemfile should eager-load gems where possible; using gem 'something', require: false in your Gemfile will limit any memory savings.

If you find that sidekiqswarm's default Bundler require is breaking your app on boot, you can control which groups get preloaded or disable preload completely:

# preload both the default and production groups
SIDEKIQ_PRELOAD=default,production bin/sidekiqswarm ...
# disable gem preload completely
SIDEKIQ_PRELOAD= bin/sidekiqswarm ...
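
One way to picture how those settings map to Bundler groups is the sketch below. `preload_groups` is a hypothetical helper and its parsing rules are assumptions; `Bundler.require(*groups)` is the real Bundler call, shown only in a comment so the example runs without a Gemfile.

```ruby
# Hypothetical helper; the actual parsing inside sidekiqswarm may differ.
def preload_groups(env = ENV)
  raw = env["SIDEKIQ_PRELOAD"]
  # Unset: fall back to preloading the default group.
  return [:default] if raw.nil?

  # Set but empty yields [], which means preload is disabled.
  raw.split(",").map(&:strip).reject(&:empty?).map(&:to_sym)
end

# In an app with a Gemfile, the parent could then do something like:
#   require "bundler"
#   groups = preload_groups
#   Bundler.require(*groups) unless groups.empty?

p preload_groups({ "SIDEKIQ_PRELOAD" => "default,production" }) # [:default, :production]
p preload_groups({ "SIDEKIQ_PRELOAD" => "" })                   # []
p preload_groups({})                                            # [:default]
```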

Memory Monitoring

The parent process can watch all children and restart any that get above a certain memory usage. Set the SIDEKIQ_MAXMEM_MB environment variable with the maximum memory in megabytes. If a child goes over that limit, the parent will detect it and do the following:

  1. Send TSTP
  2. Wait 60 seconds
  3. Send TERM
  4. Fork a new child

$ SIDEKIQ_MAXMEM_MB=30 SIDEKIQ_COUNT=1 bundle exec sidekiqswarm -r ./test.rb
2016-03-02T21:12:08.802Z 18308 TID-8nnh4 INFO: Running in ruby 2.0.0p598 (2014-11-13) [x86_64-linux]
2016-03-02T21:12:08.846Z 18308 TID-8nnh4 INFO: Starting processing, hit Ctrl-C to stop
Process 18308 too large at 31184KB, stopping it...
2016-03-02T21:12:43.893Z 18308 TID-8nnh4 INFO: Received TSTP, no longer accepting new work
2016-03-02T21:12:43.893Z 18308 TID-8nnh4 INFO: Terminating quiet workers
2016-03-02T21:12:43.898Z 18308 TID-9xw6k INFO: Scheduler exiting...
2016-03-02T21:13:43.909Z 18308 TID-8nnh4 INFO: Shutting down
2016-03-02T21:13:44.014Z 18308 TID-8nnh4 INFO: Bye!
Child exited, PID 18308, code 0, restarting...
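
The restart sequence listed above can be sketched as follows. All helper names are hypothetical, reading RSS through `ps` is an assumption about how the measurement could be taken, and the `signaler` and `sleeper` parameters exist only so the sequence can be exercised without touching real processes.

```ruby
# Sketch only: names and structure are illustrative, not Sidekiq Enterprise code.

# Resident set size of a process in KB, read from ps(1); nil if it has exited.
def rss_kb(pid)
  out = `ps -o rss= -p #{Integer(pid)}`.strip
  out.empty? ? nil : Integer(out)
end

# True when the measured RSS exceeds a limit given in megabytes.
def over_limit?(kb, maxmem_mb)
  kb > maxmem_mb * 1024
end

# Steps 1-3 from the list: quiet the child, give jobs time to finish, then stop it.
def retire_child(pid, grace: 60, signaler: Process.method(:kill), sleeper: Kernel.method(:sleep))
  signaler.call("TSTP", pid)
  sleeper.call(grace)
  signaler.call("TERM", pid)
end

# Dry run with recording stand-ins instead of real signals and sleeps.
log = []
retire_child(18_308,
             signaler: ->(sig, pid) { log << [sig, pid] },
             sleeper: ->(seconds) { log << [:wait, seconds] })
p log
p over_limit?(31_184, 30) # 31184 KB is above 30 MB (30720 KB)
```

After the old child exits, the parent would fork a replacement (step 4), which is the "restarting..." line at the end of the log.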

Please see the Problems and Troubleshooting page for more tips on taming process RSS.
