Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RAILS_MAX_THREADS parity with Redis pool for client (i.e. Rails server) #3806

Closed
cupakromer opened this issue Mar 27, 2018 · 3 comments
Closed

Comments

@cupakromer
Copy link
Contributor

cupakromer commented Mar 27, 2018

Ruby version: 2.5.0
Rails: 5.1.5
Sidekiq / Pro / Enterprise version(s): Sidekiq 5.1.1

Issue

In Sidekiq 4.2.0 (issue #2985 implemented 52828e4) the ability to use RAILS_MAX_THREADS was introduced to allow tuning of concurrency with the RAILS_MAX_THREADS. The problem is this only applies for the Sidekiq server run via CLI (e.g. worker: RAILS_MAX_THREADS=10 bundle exec sidekiq), not the Rails server process (e.g. web: bundle exec puma -C config/puma.rb).

When the Rails server is running the connection is often made to Sidekiq::Client (such as with ActiveJob::QueueAdapters::SidekiqAdapter which does not have a redis_pool yet. It then attempts to create one defaulting to a pool size of 5.

Sample Rails `ActiveJob` Stack
.gem/ruby/2.5.0/gems/sidekiq-5.1.1/lib/sidekiq.rb:125:in `redis_pool'
.gem/ruby/2.5.0/gems/sidekiq-5.1.1/lib/sidekiq/client.rb:42:in `initialize'
.gem/ruby/2.5.0/gems/sidekiq-5.1.1/lib/sidekiq/client.rb:131:in `new'
.gem/ruby/2.5.0/gems/sidekiq-5.1.1/lib/sidekiq/client.rb:131:in `push'
.gem/ruby/2.5.0/gems/activejob-5.1.5/lib/active_job/queue_adapters/sidekiq_adapter.rb:20:in `enqueue'
.gem/ruby/2.5.0/gems/activejob-5.1.5/lib/active_job/enqueuing.rb:51:in `block in enqueue'
.gem/ruby/2.5.0/gems/activesupport-5.1.5/lib/active_support/callbacks.rb:108:in `block in run_callbacks'
.gem/ruby/2.5.0/gems/activejob-5.1.5/lib/active_job/logging.rb:15:in `block (3 levels) in <module:Logging>'
.gem/ruby/2.5.0/gems/activejob-5.1.5/lib/active_job/logging.rb:44:in `block in tag_logger'
.gem/ruby/2.5.0/gems/activesupport-5.1.5/lib/active_support/tagged_logging.rb:69:in `block in tagged'
.gem/ruby/2.5.0/gems/activesupport-5.1.5/lib/active_support/tagged_logging.rb:26:in `tagged'
.gem/ruby/2.5.0/gems/activesupport-5.1.5/lib/active_support/tagged_logging.rb:69:in `tagged'
.gem/ruby/2.5.0/gems/activejob-5.1.5/lib/active_job/logging.rb:44:in `tag_logger'
.gem/ruby/2.5.0/gems/activejob-5.1.5/lib/active_job/logging.rb:14:in `block (2 levels) in <module:Logging>'
.gem/ruby/2.5.0/gems/activesupport-5.1.5/lib/active_support/callbacks.rb:117:in `instance_exec'
.gem/ruby/2.5.0/gems/activesupport-5.1.5/lib/active_support/callbacks.rb:117:in `block in run_callbacks'
.gem/ruby/2.5.0/gems/activesupport-5.1.5/lib/active_support/callbacks.rb:135:in `run_callbacks'
.gem/ruby/2.5.0/gems/activejob-5.1.5/lib/active_job/enqueuing.rb:47:in `enqueue'
.gem/ruby/2.5.0/gems/activejob-5.1.5/lib/active_job/enqueuing.rb:18:in `perform_later'

Depending on network speed, server load, job data size, and the RAILS_MAX_THREADS setting it's possible all of the Redis connections are in use resulting in a Timeout::Error.

Workaround

Use an initializer to explicitly set the pool size:

# frozen_string_literal: true

# See https://github.com/mperham/sidekiq/wiki/Using-Redis
Sidekiq.configure_client do |config|
  # Without an explicit `size` here the pool will be set to 5
  #
  # See:
  #   - https://github.com/mperham/sidekiq/blob/v5.1.1/lib/sidekiq/redis_connection.rb#L18
  #   - https://github.com/mperham/sidekiq/blob/v5.1.1/lib/sidekiq/cli.rb#L244
  config.redis = {
    size: Integer(ENV.fetch("RAILS_MAX_THREADS", 5)),
  }
end

Proposed Solution

I believe to resolve this the Redis pool creation needs similar logic to the CLI. I recognize this is complicated by the fact there is both a server and a client. I think the solution is something like the following, but I don't have enough knowledge of the Sidekiq internals to say for certain:

# Sidekiq::RedisConnection.create (lib/sidekiq/redis_connection.rb#L18)
# ...

size = if options[:size]
         options[:size]
       elsif Sidekiq.server?
         Sidekiq.options[:concurrency] + 5
       elsif ENV["RAILS_MAX_THREADS"]
         Integer(ENV["RAILS_MAX_THREADS"])
       else
         5
       end

verify_sizing(size, Sidekiq.options[:concurrency]) if Sidekiq.server?

# ...
@mperham
Copy link
Collaborator

mperham commented Mar 27, 2018

Correct, my thinking was that a client pool size of 5 is sufficient in 99% of the cases. A Timeout::Error typically indicates a really slow network or Redis and should be investigated. Your solution seems legit to me, want to send a PR?

@cupakromer
Copy link
Contributor Author

Sure I'll send one up shortly. Is test/test_redis_connection.rb the proper spot to add coverage for this?

@mperham
Copy link
Collaborator

mperham commented Mar 27, 2018 via email

cupakromer added a commit to cupakromer/sidekiq that referenced this issue Mar 27, 2018
This is a follow up to sidekiq#2985 (52828e4) adding similar support for the client
connection pool. For Rails servers, Sidekiq is not loaded from the CLI so the
prior change to support setting the concurrency via `RAILS_MAX_THREADS` does
not apply. This means for Rails servers which do not configure a custom size
through an initializer they will run with only the default 5 connections
available to the pool.

When the Rails server runs the initial Redis connection may be made through
`Sidekiq::Client` (e.g. from [`ActiveJob::QueueAdapters::SidekiqAdapter`](https://github.com/rails/rails/blob/v5.1.5/activejob/lib/active_job/queue_adapters/sidekiq_adapter.rb#L20)).
This causes the `redis_pool` to be initialized without any options, setting the
pool size to the default of 5.

    .gem/ruby/2.5.0/gems/sidekiq-5.1.1/lib/sidekiq.rb:125:in `redis_pool'
    .gem/ruby/2.5.0/gems/sidekiq-5.1.1/lib/sidekiq/client.rb:42:in `initialize'
    .gem/ruby/2.5.0/gems/sidekiq-5.1.1/lib/sidekiq/client.rb:131:in `new'
    .gem/ruby/2.5.0/gems/sidekiq-5.1.1/lib/sidekiq/client.rb:131:in `push'
    .gem/ruby/2.5.0/gems/activejob-5.1.5/lib/active_job/queue_adapters/sidekiq_adapter.rb:20:in `enqueue'
    .gem/ruby/2.5.0/gems/activejob-5.1.5/lib/active_job/enqueuing.rb:51:in `block in enqueue'
    .gem/ruby/2.5.0/gems/activesupport-5.1.5/lib/active_support/callbacks.rb:108:in `block in run_callbacks'
    .gem/ruby/2.5.0/gems/activejob-5.1.5/lib/active_job/logging.rb:15:in `block (3 levels) in <module:Logging>'
    .gem/ruby/2.5.0/gems/activejob-5.1.5/lib/active_job/logging.rb:44:in `block in tag_logger'
    .gem/ruby/2.5.0/gems/activesupport-5.1.5/lib/active_support/tagged_logging.rb:69:in `block in tagged'
    .gem/ruby/2.5.0/gems/activesupport-5.1.5/lib/active_support/tagged_logging.rb:26:in `tagged'
    .gem/ruby/2.5.0/gems/activesupport-5.1.5/lib/active_support/tagged_logging.rb:69:in `tagged'
    .gem/ruby/2.5.0/gems/activejob-5.1.5/lib/active_job/logging.rb:44:in `tag_logger'
    .gem/ruby/2.5.0/gems/activejob-5.1.5/lib/active_job/logging.rb:14:in `block (2 levels) in <module:Logging>'
    .gem/ruby/2.5.0/gems/activesupport-5.1.5/lib/active_support/callbacks.rb:117:in `instance_exec'
    .gem/ruby/2.5.0/gems/activesupport-5.1.5/lib/active_support/callbacks.rb:117:in `block in run_callbacks'
    .gem/ruby/2.5.0/gems/activesupport-5.1.5/lib/active_support/callbacks.rb:135:in `run_callbacks'
    .gem/ruby/2.5.0/gems/activejob-5.1.5/lib/active_job/enqueuing.rb:47:in `enqueue'
    .gem/ruby/2.5.0/gems/activejob-5.1.5/lib/active_job/enqueuing.rb:18:in `perform_later'

For the majority of cases, a client pool size of 5 is sufficient. However,
servers which utilize a high number of threads, with large job payloads, and
which may experience some network latency issues can see `Timeout::Error`
crashes. This may be further exacerbated by the ~2-20x performance decrease
through `ActiveJob` (sidekiq#3782). Rails addresses this general type of connection
issue for the main database by suggesting that the DB pool size match the
number of threads running. This change applies that logic to the default client
pool size by leveraging the same environment setting; this way there's a
connection available per thread.

This may also have the side effect of a slight performance boost, as there is
less of a chance that threads will be blocked waiting on connections. The
trade-off is that there may be a memory profile increase to handle the
additional Redis connections in the pool; note the pool only creates new
connections as necessary to handle the requests.

Resolves sidekiq#3806
cupakromer added a commit to cupakromer/sidekiq that referenced this issue Mar 27, 2018
This is a follow up to sidekiq#2985 (52828e4) adding similar support for the
client connection pool. For Rails servers, Sidekiq is not loaded from
the CLI so the prior change to support setting the concurrency via
`RAILS_MAX_THREADS` is not applied to the web server process. This means
for Rails servers which do not configure a custom size through an
initializer they will run with the default connection pool size of 5.

When the Rails server runs the initial Redis connection may be made
through `Sidekiq::Client` (e.g. from [`ActiveJob::QueueAdapters::SidekiqAdapter`](https://github.com/rails/rails/blob/v5.1.5/activejob/lib/active_job/queue_adapters/sidekiq_adapter.rb#L20)).
This causes the `redis_pool` to be initialized without any options,
setting the pool size to the default of 5.

    .gem/ruby/2.5.0/gems/sidekiq-5.1.1/lib/sidekiq.rb:125:in `redis_pool'
    .gem/ruby/2.5.0/gems/sidekiq-5.1.1/lib/sidekiq/client.rb:42:in `initialize'
    .gem/ruby/2.5.0/gems/sidekiq-5.1.1/lib/sidekiq/client.rb:131:in `new'
    .gem/ruby/2.5.0/gems/sidekiq-5.1.1/lib/sidekiq/client.rb:131:in `push'
    .gem/ruby/2.5.0/gems/activejob-5.1.5/lib/active_job/queue_adapters/sidekiq_adapter.rb:20:in `enqueue'
    .gem/ruby/2.5.0/gems/activejob-5.1.5/lib/active_job/enqueuing.rb:51:in `block in enqueue'
    .gem/ruby/2.5.0/gems/activesupport-5.1.5/lib/active_support/callbacks.rb:108:in `block in run_callbacks'
    .gem/ruby/2.5.0/gems/activejob-5.1.5/lib/active_job/logging.rb:15:in `block (3 levels) in <module:Logging>'
    .gem/ruby/2.5.0/gems/activejob-5.1.5/lib/active_job/logging.rb:44:in `block in tag_logger'
    .gem/ruby/2.5.0/gems/activesupport-5.1.5/lib/active_support/tagged_logging.rb:69:in `block in tagged'
    .gem/ruby/2.5.0/gems/activesupport-5.1.5/lib/active_support/tagged_logging.rb:26:in `tagged'
    .gem/ruby/2.5.0/gems/activesupport-5.1.5/lib/active_support/tagged_logging.rb:69:in `tagged'
    .gem/ruby/2.5.0/gems/activejob-5.1.5/lib/active_job/logging.rb:44:in `tag_logger'
    .gem/ruby/2.5.0/gems/activejob-5.1.5/lib/active_job/logging.rb:14:in `block (2 levels) in <module:Logging>'
    .gem/ruby/2.5.0/gems/activesupport-5.1.5/lib/active_support/callbacks.rb:117:in `instance_exec'
    .gem/ruby/2.5.0/gems/activesupport-5.1.5/lib/active_support/callbacks.rb:117:in `block in run_callbacks'
    .gem/ruby/2.5.0/gems/activesupport-5.1.5/lib/active_support/callbacks.rb:135:in `run_callbacks'
    .gem/ruby/2.5.0/gems/activejob-5.1.5/lib/active_job/enqueuing.rb:47:in `enqueue'
    .gem/ruby/2.5.0/gems/activejob-5.1.5/lib/active_job/enqueuing.rb:18:in `perform_later'

For the majority of cases, a client pool size of 5 is sufficient.
However, servers which utilize a high number of threads, with large job
payloads, and which may experience some network latency issues can see
`Timeout::Error` crashes. This may be further exacerbated by the ~2-20x
performance decrease through `ActiveJob` (sidekiq#3782). Rails addresses this
general type of connection issue for the main database by suggesting
that the DB pool size match the number of threads running. This change
applies that logic to the default client pool size by leveraging the
same environment setting; this way there's a connection available per
thread.

This may also have the side effect of a slight performance boost, as
there is less of a chance that threads will be blocked waiting on
connections. The trade-off is that there may be a memory profile
increase to handle the additional Redis connections in the pool; note
the pool only creates new connections as necessary to handle the
requests.

Resolves sidekiq#3806
mperham pushed a commit that referenced this issue Mar 27, 2018
This is a follow up to #2985 (52828e4) adding similar support for the
client connection pool. For Rails servers, Sidekiq is not loaded from
the CLI so the prior change to support setting the concurrency via
`RAILS_MAX_THREADS` is not applied to the web server process. This means
for Rails servers which do not configure a custom size through an
initializer they will run with the default connection pool size of 5.

When the Rails server runs the initial Redis connection may be made
through `Sidekiq::Client` (e.g. from [`ActiveJob::QueueAdapters::SidekiqAdapter`](https://github.com/rails/rails/blob/v5.1.5/activejob/lib/active_job/queue_adapters/sidekiq_adapter.rb#L20)).
This causes the `redis_pool` to be initialized without any options,
setting the pool size to the default of 5.

    .gem/ruby/2.5.0/gems/sidekiq-5.1.1/lib/sidekiq.rb:125:in `redis_pool'
    .gem/ruby/2.5.0/gems/sidekiq-5.1.1/lib/sidekiq/client.rb:42:in `initialize'
    .gem/ruby/2.5.0/gems/sidekiq-5.1.1/lib/sidekiq/client.rb:131:in `new'
    .gem/ruby/2.5.0/gems/sidekiq-5.1.1/lib/sidekiq/client.rb:131:in `push'
    .gem/ruby/2.5.0/gems/activejob-5.1.5/lib/active_job/queue_adapters/sidekiq_adapter.rb:20:in `enqueue'
    .gem/ruby/2.5.0/gems/activejob-5.1.5/lib/active_job/enqueuing.rb:51:in `block in enqueue'
    .gem/ruby/2.5.0/gems/activesupport-5.1.5/lib/active_support/callbacks.rb:108:in `block in run_callbacks'
    .gem/ruby/2.5.0/gems/activejob-5.1.5/lib/active_job/logging.rb:15:in `block (3 levels) in <module:Logging>'
    .gem/ruby/2.5.0/gems/activejob-5.1.5/lib/active_job/logging.rb:44:in `block in tag_logger'
    .gem/ruby/2.5.0/gems/activesupport-5.1.5/lib/active_support/tagged_logging.rb:69:in `block in tagged'
    .gem/ruby/2.5.0/gems/activesupport-5.1.5/lib/active_support/tagged_logging.rb:26:in `tagged'
    .gem/ruby/2.5.0/gems/activesupport-5.1.5/lib/active_support/tagged_logging.rb:69:in `tagged'
    .gem/ruby/2.5.0/gems/activejob-5.1.5/lib/active_job/logging.rb:44:in `tag_logger'
    .gem/ruby/2.5.0/gems/activejob-5.1.5/lib/active_job/logging.rb:14:in `block (2 levels) in <module:Logging>'
    .gem/ruby/2.5.0/gems/activesupport-5.1.5/lib/active_support/callbacks.rb:117:in `instance_exec'
    .gem/ruby/2.5.0/gems/activesupport-5.1.5/lib/active_support/callbacks.rb:117:in `block in run_callbacks'
    .gem/ruby/2.5.0/gems/activesupport-5.1.5/lib/active_support/callbacks.rb:135:in `run_callbacks'
    .gem/ruby/2.5.0/gems/activejob-5.1.5/lib/active_job/enqueuing.rb:47:in `enqueue'
    .gem/ruby/2.5.0/gems/activejob-5.1.5/lib/active_job/enqueuing.rb:18:in `perform_later'

For the majority of cases, a client pool size of 5 is sufficient.
However, servers which utilize a high number of threads, with large job
payloads, and which may experience some network latency issues can see
`Timeout::Error` crashes. This may be further exacerbated by the ~2-20x
performance decrease through `ActiveJob` (#3782). Rails addresses this
general type of connection issue for the main database by suggesting
that the DB pool size match the number of threads running. This change
applies that logic to the default client pool size by leveraging the
same environment setting; this way there's a connection available per
thread.

This may also have the side effect of a slight performance boost, as
there is less of a chance that threads will be blocked waiting on
connections. The trade-off is that there may be a memory profile
increase to handle the additional Redis connections in the pool; note
the pool only creates new connections as necessary to handle the
requests.

Resolves #3806
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants