phased-restart in production is really slow #2669

TheMlok · 2021-07-29T07:50:56Z

Describe the bug
Puma's phased-restart is very slow in production. When I send phased-restart, all requests freeze for almost 2 minutes, and then everything works fine again. Application speed is otherwise OK. During the reload there is no unusual load on the server and no abnormal CPU usage. When I run phased-restart on the test installation (same server, separate installation with the same config, and RAILS_ENV also set to production), the restart finishes in 2 seconds. I suspect something is stuck waiting, but I still cannot figure out what the problem is. I use NGINX with upstreams pointing to the Puma socket.

Puma config:

# Puma can serve each request in a thread from an internal thread pool.
# The `threads` method setting takes two numbers a minimum and maximum.
# Any libraries that use thread pools should be configured to match
# the maximum value specified for Puma. Default is set to 5 threads for minimum
# and maximum, this matches the default thread size of Active Record.
#
threads_count = ENV.fetch("RAILS_MAX_THREADS") { 5 }.to_i
threads threads_count, threads_count

# Specifies the number of `workers` to boot in clustered mode.
# Workers are forked webserver processes. If using threads and workers together
# the concurrency of the application would be max `threads` * `workers`.
# Workers do not work on JRuby or Windows (both of which do not support
# processes).
#
workers ENV.fetch("WEB_CONCURRENCY") { 2 }

# Specifies the `environment` that Puma will run in.
#
environment ENV.fetch("RAILS_ENV") { "development" }

# Use the `preload_app!` method when specifying a `workers` number.
# This directive tells Puma to first boot the application and load code
# before forking the application. This takes advantage of Copy On Write
# process behavior so workers use less memory. If you use this option
# you need to make sure to reconnect any threads in the `on_worker_boot`
# block.
#
# preload_app!

# Specifies the unix sock.
bind "unix://#{ENV.fetch('APP_DIR')}/shared/tmp/sockets/puma.sock"
pidfile "#{ENV.fetch('APP_DIR')}/shared/tmp/pids/puma.pid"
directory "#{ENV.fetch('APP_DIR')}/current"

# Logging
stdout_redirect "log/puma.stdout.log", "log/puma.stderr.log", true


# The code in the `on_worker_boot` will be called if you are using
# clustered mode by specifying a number of `workers`. After each worker
# process is booted this block will be run, if you are using `preload_app!`
# option you will want to use this block to reconnect to any threads
# or connections that may have been created at application boot, Ruby
# cannot share connections between processes.
#
# on_worker_boot do
#   ActiveSupport.on_load(:active_record) do
#     ActiveRecord::Base.establish_connection
#   end
# end
#
# before_fork do
#   ActiveRecord::Base.connection_pool.disconnect!
# end

prune_bundler true

Command to start:
ExecStart=/var/www/.rbenv/bin/rbenv exec bundle exec puma -q -e production -C /var/www/app/shared/config/puma.rb

Command to phased-restart:
ExecReload=/var/www/.rbenv/bin/rbenv exec bundle exec pumactl -F /var/www/app/shared/config/puma.rb phased-restart

To Reproduce
I was unable to reproduce this outside production, without real requests.

Expected behavior
A fast reload, as my apps had before I moved to a new server with a newer OS and the latest Puma.

Server environment:

OS: Linux 4.19.0-16-amd64 #1 SMP Debian 4.19.181-1 (2021-03-19) x86_64 GNU/Linux
Hardware: 16 CPU cores, 16 GB RAM, 2 TB RAID
Puma Version: 5.3.2
nginx version: nginx/1.18.0
Rails version: 6.0.3.1

TheMlok · 2021-07-30T06:06:45Z

The problem was the bootsnap gem: its cache kept growing, so restarts got slower and slower. After removing the cache, restart and phased-restart behave as they should. Sorry.
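
For anyone hitting the same symptom, here is a minimal sketch of measuring and clearing the cache before a restart. It assumes the cache directories live under `tmp/cache` in the app root and have names starting with `bootsnap`; that location depends on the bootsnap version and how it is set up, so check your own installation first. The helper names (`bootsnap_cache_dirs`, `cache_size_bytes`, `clear_bootsnap_cache`) are hypothetical and not part of bootsnap or Puma.

```ruby
require "fileutils"
require "find"

# Hypothetical helper: lists bootsnap cache directories under app_root.
# ASSUMPTION: caches live in tmp/cache and their names start with "bootsnap".
def bootsnap_cache_dirs(app_root)
  pattern = File.join(app_root, "tmp", "cache", "bootsnap*")
  Dir.glob(pattern).select { |path| File.directory?(path) }
end

# Hypothetical helper: total size in bytes of all regular files under dirs.
def cache_size_bytes(dirs)
  dirs.sum do |dir|
    Find.find(dir).sum { |path| File.file?(path) ? File.size(path) : 0 }
  end
end

# Removes the cache directories and returns the number of bytes freed.
def clear_bootsnap_cache(app_root)
  dirs = bootsnap_cache_dirs(app_root)
  freed = cache_size_bytes(dirs)
  dirs.each { |dir| FileUtils.rm_rf(dir) }
  freed
end

# Example call (the path is a placeholder for your own app directory):
#   freed = clear_bootsnap_cache("/path/to/app")
#   puts "Removed #{freed} bytes of bootsnap cache"
```

The cache is rebuilt on the next boot, so the first start after clearing it may take longer than later ones.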

TheMlok closed this as completed Jul 30, 2021
