
Better integration between Sidekiq thread count and Active Record pool size. #2985

Closed
schneems opened this issue May 24, 2016 · 19 comments
Comments

@schneems

Right now, if someone wants to use Sidekiq they must either configure the number of threads to run or use the default (25). If they are also using a database (likely), then to avoid connection timeout errors they need at least as many connections in their database pool as Sidekiq threads. Currently these two values are configured in two different places: the database pool in database.yml for Active Record, and Sidekiq's concurrency either via the command line or the sidekiq.yml file. This is a problem because it is easy to forget that modifying one value means you must modify both.

In Rails 5 there is a default environment variable to configure the number of threads that will be in the database connection pool

# config/database.yml
  pool: <%= ENV.fetch("RAILS_MAX_THREADS") { 5 } %>

This convention is used alongside web server configuration values. For example, Rails 5 also ships with Puma, which is configured to read this value:

# config/puma.rb
threads_count = ENV.fetch("RAILS_MAX_THREADS") { 5 }.to_i
threads threads_count, threads_count

It is common for the number of Sidekiq threads in a "worker" process to be greater than the number of web threads used by a server like Puma. For this reason it does not make sense to default Sidekiq's concurrency to RAILS_MAX_THREADS. However, it would make sense for Sidekiq to use this environment variable interface and set the variable itself:

# sidekiq/lib/<somewhere>.rb
ENV["RAILS_MAX_THREADS"] = Sidekiq.number_of_threads.to_s # pseudocode; ENV values must be strings

So now when the application boots in a "worker" process via the sidekiq command and we know that we are using Rails, the appropriate number of database connections will be configured by default.
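To see how this would flow through, here's a small runnable sketch (plain ERB, outside of Rails) of how the database.yml snippet above resolves depending on whether the environment variable is set before the template is evaluated:

```ruby
require "erb"

# Sketch only: whichever process boots Rails (Puma, or Sidekiq under this
# proposal) just has to set RAILS_MAX_THREADS before the template is read.
template = ERB.new(%q{pool: <%= ENV.fetch("RAILS_MAX_THREADS") { 5 } %>})

ENV["RAILS_MAX_THREADS"] = "25"   # e.g. set by the sidekiq process at boot
puts template.result              # => pool: 25

ENV.delete("RAILS_MAX_THREADS")   # unset: the fetch fallback kicks in
puts template.result              # => pool: 5
```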

Potential Problems

This behavior may be slightly unexpected (Sidekiq changing an environment variable at boot time), but I believe it is perfectly warranted. As a mitigation step we could detect when the environment variable was already set:

# sidekiq/lib/<somewhere>.rb
puts "Sidekiq is setting RAILS_MAX_THREADS=#{Sidekiq.number_of_threads}, was previously #{ENV['RAILS_MAX_THREADS']}"
ENV["RAILS_MAX_THREADS"] = Sidekiq.number_of_threads.to_s # pseudocode; ENV values must be strings

A warning or similar would eliminate any confusion. It would also allow the use of RAILS_MAX_THREADS for when the web process is booted, and a different configuration option for use when sidekiq is booted.

People may want to opt out of this behavior. They can do this by modifying their database.yml file to use a different environment variable

# config/database.yml
  pool: <%= ENV['SIDEKIQ_CUSTOM_DB_SIZE'] || ENV.fetch("RAILS_MAX_THREADS") { 5 } %>

So now they have the ability to not rely on the behavior of setting RAILS_MAX_THREADS.

Recap

Right now the default database pool size of 5 will artificially limit the default Sidekiq concurrency of 25. Applying this change means that Rails applications will work with Sidekiq out of the box. It also means that as Sidekiq is scaled up or down, the database pool size will be set appropriately without additional action (if conventions are being followed). I believe the costs are minimal and the benefits outweigh them.

@nateberkopec
Collaborator

nateberkopec commented May 24, 2016

This strikes me as easier to solve with documentation than with code.

# config/sidekiq.yml
---
:concurrency: <%= ENV['SIDEKIQ_THREADS'] %>

or

sidekiq -c $SIDEKIQ_THREADS

then

# config/database.yml
  pool: <%= ENV['SIDEKIQ_THREADS'] || ENV.fetch("RAILS_MAX_THREADS") { 5 } %>

All that would be required is a change to the wiki docs, and maybe a nice announcement in that fancy Sidekiq Newsletter ;)

Edit: or just set :concurrency equal to RAILS_MAX_THREADS. Either way.

@mperham
Collaborator

mperham commented May 24, 2016

Thanks for the video conference this morning, @schneems.

I think having Sidekiq use this variable over the config settings is reasonable if it means the db pool will auto-size itself. The logic would be something like:

ENV['RAILS_MAX_THREADS'] = concurrency = ENV['RAILS_MAX_THREADS'] || -c || yml['concurrency'] # pseudocode
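Spelled out as runnable Ruby (hypothetical helper names, not Sidekiq's actual code), the precedence above might look like this: an already-set RAILS_MAX_THREADS wins, then the -c flag, then sidekiq.yml's :concurrency, then the default of 25, with the winner written back to ENV as a string so Rails' database.yml ERB can read it.

```ruby
DEFAULT_CONCURRENCY = 25  # Sidekiq's historical default

# env is a hash (e.g. ENV); cli_concurrency and yml_concurrency may be nil.
def resolve_concurrency(env, cli_concurrency, yml_concurrency)
  concurrency = (env["RAILS_MAX_THREADS"] || cli_concurrency ||
                 yml_concurrency || DEFAULT_CONCURRENCY).to_i
  env["RAILS_MAX_THREADS"] = concurrency.to_s  # ENV values must be strings
  concurrency
end
```

For example, `resolve_concurrency({}, 15, nil)` yields 15 and records it in the env hash, while a pre-set RAILS_MAX_THREADS of 10 would win over `-c 15`.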

Procfile would just need to be (some pretty nice symmetry here):

web: RAILS_MAX_THREADS=5 bundle exec puma -e production
worker: RAILS_MAX_THREADS=10 bundle exec sidekiq -e production

WDYT?

@nateberkopec
Collaborator

That Procfile does look very nice!

In production, I'll bet a lot of people are running with db pools that are too small for their Sidekiq concurrency anyway. If modifying RAILS_MAX_THREADS inside Sidekiq will fix that, then this change makes sense.

What about non-AR ORMs? I've never really used them, though, so I don't know how they configure their connection pools.

@schneems
Author

I'm not sure about the Procfile solution; it would require you to re-deploy after every change, which is a bad experience.

Maybe something like this in Procfile:

worker: RAILS_MAX_THREADS=$SIDEKIQ_CONCURRENCY bundle exec sidekiq -e production

Then you could change SIDEKIQ_CONCURRENCY at will.

Ideally I would like this to work out of the box without a custom procfile.

@jeremyevans any thoughts on @nateberkopec's comment? Would be cool if Sequel could Just work (TM) with Rails 5 + Puma like Active Record. Happy to take that conversation to another thread if you want.

@jeremyevans

Sequel can usually work fine with a significantly smaller connection pool than ActiveRecord because it doesn't leak connections. That said, if someone wants to use the number of Sidekiq threads as the maximum number of Sequel database connections, they just need to use the :max_connections=>Sidekiq.number_of_threads Sequel::Database option.

I'm against adding Rails-specific code to non-Rails-specific libraries as a matter of principle, just as I'm against adding Sequel-specific code to non-Sequel-specific libraries. Giving special treatment to Rails gives Rails an unjust advantage and makes it more difficult for alternatives to Rails to compete on merit. On the flipside, adding support for Rails, Sequel, ROM, Mongoid, etc. leads to a slippery slope.

In generic terms, unless library A depends on library B (either directly or via an adapter for library B that ships with library A), I don't think library A should have library B-specific code.

As a matter of good engineering practices, I think libraries should never set environment variables. I think reading library-specific environment variables is fine, but library A should not care about library B-specific environment variables.

@sgrif
Contributor

sgrif commented May 25, 2016

due to the fact that it doesn't leak connections

That tends to help, yes. XD

@schneems
Author

Ideally I agree; pragmatically I disagree. There are lots of libraries that only serve to integrate with another library, for example devise. There are also in-between libraries that bridge an ideologically pure library and something else; for example, sprockets-rails adds the bridges needed to integrate the two libraries. In this case I actually wish that sprockets shipped with a railtie instead. Yes, it would add coupling, but it would also decrease overall complexity in an end user's app. Right now sprockets-rails has to support Sprockets 2-4, which causes problems and lots of potentially unneeded flow control.

My gold standard is the end user's experience. If giving them the best possible experience doesn't introduce a huge maintenance burden on a library, I would consider doing it. I have some libraries that don't do authorization but need it; to that end I integrate with devise if it's available, but I also provide custom configuration endpoints. One example is oPRO: https://github.com/opro/opro#custom-auth. This gives anyone using the current number-one auth library for Rails a drop-in solution, but doesn't prevent anyone else from using it.

In the future, if other libraries wanted to add support directly into oPRO, there is already a pattern to follow. If the maintenance burden got too large, or too many people were adding support for their random library, then I would probably start pushing back and asking people to make bridge libraries like opro-devise, sort of like omniauth's strategies.

Keep in mind that I make a living working for Heroku, maintaining software whose core value proposition is that it is highly knowledgeable about, and coupled to, external software. Doing this isn't always the easiest or cleanest proposition, but sometimes it's needed to provide the absolute best experience. Over time, as standards develop, hopefully others see that value and start using standard interfaces, for example https://blog.heroku.com/archives/2016/5/2/container_ready_rails_5.

On the note of the unfair advantage: it wouldn't be an advantage if it weren't beneficial for users. I feel it's fine for each library maintainer to make the call on a case-by-case basis. I agree that docs can do a lot for integrations, but there's a huge difference between "one step and it works" and working out of the box. Ideally we would have standard interfaces across libraries; a good example of this is the DATABASE_URL environment variable, which is supported by many ORMs and across multiple languages.

@jeremyevans

If the point of the library is to integrate with one or more other libraries, then as I mentioned earlier, I think it's fine if the library has code specific to the other libraries. That's not the case here, though. If you want to integrate sidekiq with rails, that sounds like a good idea for a sidekiq-rails gem. If you want to ship such support with sidekiq, have it in a separate file that is not required by default (an adapter file).

While the end-user experience is important, I don't believe it trumps good library design principles. I think it is fairly easy to understand that the philosophy of "whatever is good for the majority is what we should do" is harmful to the diversity of any ecosystem.

@nateberkopec
Collaborator

If you want to ship such support with sidekiq, have it in a separate file that is not required by default (an adapter file).

That's what a Railtie is?

As a matter of good engineering practices, I think libraries should never set environment variables. I think reading library-specific environment variables is fine, but library A should not care about library B-specific environment variables.

This point is far stronger, IMO. Setting an environment variable at runtime is kinda weird to begin with, then overriding Rails' env variables from Sidekiq doubles that weirdness. My original proposed approach fixes that issue, with the downside of making it not zero-config/plug-n-play for Rails users.

@jeremyevans

I don't think there is any problem if sidekiq ships a railtie, or has Rails-specific code in a Rails-specific file that sidekiq doesn't load by default (if Rails loads sidekiq's railtie by default, that's fine). I think you only run into issues if you start doing things by default for the most popular library and not other competing alternatives. For example, I would consider code like this to be a problem: https://github.com/phusion/passenger/blob/085504c2e00b6e6322bb0ec97aa5ff43d037c729/src/ruby_supportlib/phusion_passenger/loader_shared_helpers.rb#L381

@schneems
Author

First of all, thanks for all the discussion. The original issue was me throwing things at the wall to see what stuck. I appreciate your input.

I never loved setting another library's env var.

Right now we have two proposed solutions: auto-setting the env var, and explicit Procfile writing. Thinking them over, I believe I have a middle-of-the-road, hybrid solution.

First a recap of the other solutions:

Auto Env Var

Have sidekiq set the RAILS_MAX_THREADS env var on worker boot.

  • Pros:
    • Works out of the box
  • Cons:
    • Pretty close to magic of the darkest sorts, hard to debug or know what's going on.

Procfile

Document how to set the RAILS_MAX_THREADS in the procfile like

worker: RAILS_MAX_THREADS=$SIDEKIQ_CONCURRENCY bundle exec sidekiq -e production

  • Pros:
    • Explicit behavior: not black magic
  • Cons:
    • Won't work out of the box by default; requires users to proactively search the docs or wait for a failure and investigate. That would be a bad experience.

Hybrid solution - Env var default and Procfile docs

I think we should have docs on how to set the procfile:

worker: RAILS_MAX_THREADS=$SIDEKIQ_CONCURRENCY bundle exec sidekiq -e production

However, we could also default Sidekiq's number of threads to match Active Record's (perhaps in a Railtie). For example, if RAILS_MAX_THREADS is set and an explicit Sidekiq concurrency level isn't specified, use the RAILS_MAX_THREADS number instead of the Sidekiq default.

This still buys us explicitness through documentation. It also buys us the "just works" out-of-the-box experience, in that code run by Sidekiq won't try to check out more connections than AR has.
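A minimal sketch of that fallback (assumed names, not actual Sidekiq code): an explicit setting always wins, then RAILS_MAX_THREADS, then Sidekiq's historical default.

```ruby
# env is a hash (e.g. ENV); explicit_concurrency is nil unless the user
# passed -c or set :concurrency in sidekiq.yml.
def default_concurrency(env, explicit_concurrency)
  return explicit_concurrency if explicit_concurrency        # explicit wins
  return Integer(env["RAILS_MAX_THREADS"]) if env["RAILS_MAX_THREADS"]
  25                                                         # historical default
end
```

Note that unlike the auto-env-var approach, nothing is written back to ENV here; the fallback only reads it.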

The downside is that this drastically reduces the default number of threads from 25 to 5. However, if they're using the database while running Sidekiq they're already artificially limited: only 5 jobs can run at a time, since the others will be blocked waiting on a connection. Basically you're limited by Amdahl's law anyway, so it's not really a downside.

If you REALLY need all possible speed, you can't expect it from defaults; it's reasonable to expect people to look at docs (my opinion).

What do you think?

  • Pros:
    • Works out of the box
    • No magic
  • Cons:
    • Change in default behavior
    • Fewer Sidekiq threads by default when used with Rails (could argue this is a pro).

@mperham
Collaborator

mperham commented Jun 3, 2016

Thanks for all the input, folks. I really have been reading and considering all of the things. I like Jeremy's point that ENV vars should be considered read-only for libraries but Sidekiq is not a library, it's the top-level framework and the binary being run. The user is configuring the Sidekiq process via command line parameters and so tuning various libraries within the Sidekiq process by auto-setting ENV vars if not already set is fair game.

For example, Sidekiq sets the RAILS_ENV and RACK_ENV variables based on the -e parameter. I'm leaning toward -c mapping to RAILS_MAX_THREADS if not set.

With that logic, the 90% use case is the user sets sidekiq -c 15 and Rails will configure a DB connection pool with 15 connections. Exactly what we want. They can run RAILS_MAX_THREADS=10 sidekiq -c 15 if they want a smaller pool.

That naming seems silly though; IMO the config/database.yml entry should change to something like:

pool: <%= ENV["DB_POOL_SIZE"] || ENV["RAILS_MAX_THREADS"] || 5 %>

to make the env vars more descriptive: DB_POOL_SIZE=10 sidekiq -c 15.

@HoneyryderChuck

HoneyryderChuck commented Jun 3, 2016

Why not just make the db connection pool size an active record configuration setter?

# config/application.rb, somewhere:

  config.active_record.database_pool_size = ENV["CONCURRENCY"]
# another alternative for the foreman case
  config.active_record.database_pool_size = ENV["RAILS_MAX_THREADS"] || ENV["SIDEKIQ_POOL_SIZE"] || 42

From what I understand, Rails doesn't care about threads and is thread-safe; these are application server / background-job framework details. Why call it "max threads" when it's just the "db pool size" I want?

@schneems
Author

schneems commented Jun 3, 2016

That's a tangential discussion. The problem with that approach is that there would now be two canonical ways to set the pool size. What happens when you set it via the config AND database.yml? Which provides the canonical value? What if the ordering of your code loading changes? We ran into this problem by suggesting people set that value in Rails pre-4.1; see https://devcenter.heroku.com/articles/rails-database-connection-behavior#prior-to-rails-4-1. It was bad, caused confusion, and this would likely make things more confusing instead of less.

I would be all for getting rid of database.yml in favor of non-yml configuration entry points like you've suggested, however that's a discussion that's not really on the table.

@HoneyryderChuck

I'd say that if you take RAILS_MAX_THREADS as the highest priority (as Rails does with DATABASE_URL(?)), you at least won't break backwards compatibility, and the config would be the evangelized way (and on Heroku you could just ignore database.yml if the pool is already set, which is the solution both of us want). But that's just me. I have a hard time seeing these suggested for stuff like Sidekiq, which is a layer above Rails, as the author suggested. Same as Puma/Passenger/Unicorn/etc. If you make it configurable, as most of the libs do (I don't think devise relies on env vars, for example), you give freedom to the user, and it's not Sidekiq's business anymore. It is indeed a breaking change from the Heroku link you sent, but suggesting that the framework should mitigate the lack of configuration in Active Record is unfair.

As for the author of Sidekiq, I have a question: why is Sidekiq handling RAILS_ENV and RACK_ENV? Why not SINATRA_ENV or UNICORN_ENV? There was a discussion before on the different meanings of these variables, and not all of them correlate to one another. Why is Sidekiq handling them, then? From what I can get from a quick GitHub search, they're used to require some expected Rails project files, all inside the cli.rb file, which is also loaded for non-Rails apps. Isn't this an example of "giving special treatment to Rails"? If you want to target Rails, just use Rails.env if defined?(Rails), and you get two env vars less. In the same file, a quick search for Rails suggests more tailor-made Rails code.

One suggestion: did you consider going the Rack direction and loading a config.sk by default? This way you could load environment.rb in Rails, configure the Sidekiq server separately, and keep other server-only configuration there. Rails also ships its own custom config.ru for Rack.

@mperham mperham closed this as completed in 52828e4 Aug 1, 2016
@mperham
Collaborator

mperham commented Aug 1, 2016

How does that look?

sergiopantoja added a commit to builtbykaiser/vorsprung that referenced this issue Dec 24, 2017
cupakromer added a commit to cupakromer/sidekiq that referenced this issue Mar 27, 2018
This is a follow up to sidekiq#2985 (52828e4) adding similar support for the client
connection pool. For Rails servers, Sidekiq is not loaded from the CLI so the
prior change to support setting the concurrency via `RAILS_MAX_THREADS` does
not apply. This means for Rails servers which do not configure a custom size
through an initializer they will run with only the default 5 connections
available to the pool.

When the Rails server runs the initial Redis connection may be made through
`Sidekiq::Client` (e.g. from [`ActiveJob::QueueAdapters::SidekiqAdapter`](https://github.com/rails/rails/blob/v5.1.5/activejob/lib/active_job/queue_adapters/sidekiq_adapter.rb#L20)).
This causes the `redis_pool` to be initialized without any options, setting the
pool size to the default of 5.

    .gem/ruby/2.5.0/gems/sidekiq-5.1.1/lib/sidekiq.rb:125:in `redis_pool'
    .gem/ruby/2.5.0/gems/sidekiq-5.1.1/lib/sidekiq/client.rb:42:in `initialize'
    .gem/ruby/2.5.0/gems/sidekiq-5.1.1/lib/sidekiq/client.rb:131:in `new'
    .gem/ruby/2.5.0/gems/sidekiq-5.1.1/lib/sidekiq/client.rb:131:in `push'
    .gem/ruby/2.5.0/gems/activejob-5.1.5/lib/active_job/queue_adapters/sidekiq_adapter.rb:20:in `enqueue'
    .gem/ruby/2.5.0/gems/activejob-5.1.5/lib/active_job/enqueuing.rb:51:in `block in enqueue'
    .gem/ruby/2.5.0/gems/activesupport-5.1.5/lib/active_support/callbacks.rb:108:in `block in run_callbacks'
    .gem/ruby/2.5.0/gems/activejob-5.1.5/lib/active_job/logging.rb:15:in `block (3 levels) in <module:Logging>'
    .gem/ruby/2.5.0/gems/activejob-5.1.5/lib/active_job/logging.rb:44:in `block in tag_logger'
    .gem/ruby/2.5.0/gems/activesupport-5.1.5/lib/active_support/tagged_logging.rb:69:in `block in tagged'
    .gem/ruby/2.5.0/gems/activesupport-5.1.5/lib/active_support/tagged_logging.rb:26:in `tagged'
    .gem/ruby/2.5.0/gems/activesupport-5.1.5/lib/active_support/tagged_logging.rb:69:in `tagged'
    .gem/ruby/2.5.0/gems/activejob-5.1.5/lib/active_job/logging.rb:44:in `tag_logger'
    .gem/ruby/2.5.0/gems/activejob-5.1.5/lib/active_job/logging.rb:14:in `block (2 levels) in <module:Logging>'
    .gem/ruby/2.5.0/gems/activesupport-5.1.5/lib/active_support/callbacks.rb:117:in `instance_exec'
    .gem/ruby/2.5.0/gems/activesupport-5.1.5/lib/active_support/callbacks.rb:117:in `block in run_callbacks'
    .gem/ruby/2.5.0/gems/activesupport-5.1.5/lib/active_support/callbacks.rb:135:in `run_callbacks'
    .gem/ruby/2.5.0/gems/activejob-5.1.5/lib/active_job/enqueuing.rb:47:in `enqueue'
    .gem/ruby/2.5.0/gems/activejob-5.1.5/lib/active_job/enqueuing.rb:18:in `perform_later'

For the majority of cases, a client pool size of 5 is sufficient. However,
servers which utilize a high number of threads, with large job payloads, and
which may experience some network latency issues can see `Timeout::Error`
crashes. This may be further exacerbated by the ~2-20x performance decrease
through `ActiveJob` (sidekiq#3782). Rails addresses this general type of connection
issue for the main database by suggesting that the DB pool size match the
number of threads running. This change applies that logic to the default client
pool size by leveraging the same environment setting; this way there's a
connection available per thread.

This may also have the side effect of a slight performance boost, as there is
less of a chance that threads will be blocked waiting on connections. The
trade-off is that there may be a memory profile increase to handle the
additional Redis connections in the pool; note the pool only creates new
connections as necessary to handle the requests.

Resolves sidekiq#3806
cupakromer added a commit to cupakromer/sidekiq that referenced this issue Mar 27, 2018
mperham pushed a commit that referenced this issue Mar 27, 2018
@kurtlaunchlabs

Hi, may I know if the Sidekiq concurrency is per client or shared? If shared, then I think it cannot serve 50 clients sending background task requests at the same time if the default is 25. If per client, then much better.

@ionutzp

ionutzp commented Dec 14, 2018

@kurtlaunchlabs i don't understand what you mean

@sharshenov

Another approach:

# config/database.yml
  pool: <%= Sidekiq.server? ? Sidekiq.options[:concurrency] : ENV.fetch("RAILS_MAX_THREADS") { 5 } %>
