Large number of failed jobs after upgrading to ActiveRecord4 #1047
Comments
You'll probably need +1 connection for the main thread which boots the system. You don't state your Sidekiq version but I assume 2.12.x. Try doubling your pool to 41 and see if that helps. I've noticed some oddness where sometimes 2x connections are needed. I still don't understand the cause. |
Ah sorry, should have provided more information. My DB can accept 500 connections. |
Are you spinning off threads in your worker, to do work in parallel? |
In a few of my jobs I do create a thread or two. |
That'll do it - if they touch ActiveRecord, they'll check out a connection and never return it until stale connections are reaped. |
Ah that sounds like the culprit! New AR4 feature I presume? Best course of action is going to be cleaning up in the thread I take it - is it worth adding any logic to the middleware to help cleanup in such circumstances? |
It shouldn't be new. Best course is to manually check out a connection for your thread's use:
Thread.new do
  ActiveRecord::Base.connection_pool.with_connection do |conn|
    MyModel.find(123)
  end
end
The |
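For anyone who wants to see the failure mode in isolation, here is a minimal pure-Ruby sketch. `TinyPool` is a hypothetical stand-in written for this illustration, not ActiveRecord's actual pool class; it only models the idea that a checked-out connection is unavailable until something checks it back in, and that a block-scoped checkout returns it automatically.

```ruby
require "thread"

# Hypothetical toy pool for illustration only; not ActiveRecord's implementation.
class TinyPool
  class TimeoutError < StandardError; end

  def initialize(size)
    @available = Queue.new
    size.times { |i| @available << "conn-#{i}" }
  end

  # Hand out a connection, or raise if none is free (non-blocking pop).
  def checkout
    @available.pop(true)
  rescue ThreadError
    raise TimeoutError, "could not obtain a connection"
  end

  def checkin(conn)
    @available << conn
  end

  # Block form: the connection goes back even if the block raises.
  def with_connection
    conn = checkout
    yield conn
  ensure
    checkin(conn) if conn
  end

  def free
    @available.size
  end
end

pool = TinyPool.new(2)

# Leaky pattern: each thread checks out a connection and never returns it.
2.times.map { Thread.new { pool.checkout } }.each(&:join)
leaked_free = pool.free

# The next caller finds the pool exhausted.
exhausted = begin
  pool.checkout
  false
rescue TinyPool::TimeoutError
  true
end

# Scoped pattern: four threads run one after another on a pool of two,
# and every connection is back in the pool afterwards.
pool = TinyPool.new(2)
4.times { Thread.new { pool.with_connection { |conn| conn } }.join }
scoped_free = pool.free
```

In the leaky run the pool is empty after two threads, which mirrors the "first 20 jobs go through fine" symptom from the original report; in the scoped run the pool ends up full again.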
Maybe |
That sounds like it could be it, I'll have a look this AM, but it did start immediately after I updated. |
Just to confirm, wrapping in a block as mentioned above works a treat. It appears that behaviour has changed for the |
Hello, we had the same issue on Rails 4.0.0. Migrating to 4.0.1 fixed the issue for us. https://github.com/rails/rails/blob/v4.0.1/activerecord/CHANGELOG.md At the end of the CHANGELOG:
|
Also having this issue, also on 4.0.1. If I interpret correctly, the suggestion is to use
whenever one needs a database connection from inside the worker, right? How does this relate to loading a module into the worker like:
These modules hold methods that connect to the database (instead of connecting directly from the worker). Or is there a way to set this behaviour globally in an initialiser? Thanks in advance for any suggestions regarding this matter. |
@rubytastic The advice is if you create a Thread manually within a Worker, use |
@mike so I misinterpreted the above. It still seems the issue is caused by ActiveRecord 4.0.1 not closing the DB connections. I'm on 4.0.1 and the problem persists. Do you have a recommendation regarding a solution? |
Facing this issue with rails 4.0.1 and sidekiq 2.17.0. |
@x3qt, @mperham I have tried both with and without Sidekiq running. |
After switching to Rails 4.0.2 and MySQL with the mysql2 gem, I hoped the issues would be resolved, but they persist. |
Any updates on this? |
There's nothing Sidekiq can do. This is a Rails issue. |
Sorry to resurrect a long since closed issue but figured I would mention this since it caused me a headache for a few days...
@rubytastic I am not sure how you are using ActiveRecord (i.e. within Rails or by itself) but one thing to look out for is load order when requiring Sidekiq and ActiveRecord. I came across this issue recently while using ActiveRecord outside of Rails and I was (quite stupidly...) requiring Sidekiq before ActiveRecord itself. Because of this, the middleware setup in Sidekiq did not detect that ActiveRecord was defined and in turn did not include the middleware that closes the connections to the DB properly. A pretty silly mistake, but it was not very apparent until I manually copied the middleware itself and loaded it via an initializer; only then did I realize it was not being picked up. I had just assumed that it was some issue with AR4 since it was working with AR3... doh.
It is odd that this issue seemed to coincide with an upgrade to Active Record 4 for myself as well, since the load order in my own app would not change due to that upgrade. But I have not been able to track down why yet.
Also, you can pretty easily check which middleware is being included with:
# initializers/sidekiq.rb
Sidekiq.configure_server do |config|
  config.server_middleware do |chain|
    p chain
  end
end
And you should see the active record middleware within the chain. Hope this helps!
tldr: Make sure the middleware is being included haha |
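The load-order trap described above can be reproduced without either gem. The sketch below assumes the cleanup middleware is added behind a `defined?`-style guard that is evaluated when the chain is built, which is what the comment implies; `MiniQueue` and `FakeRecord` are hypothetical stand-ins for Sidekiq and ActiveRecord, not their real code.

```ruby
# Hypothetical mini-library mimicking "add the cleanup middleware only if the ORM is loaded".
module MiniQueue
  def self.default_middleware
    chain = [:logging, :retry]
    # The guard is evaluated at the moment the chain is built.
    chain << :active_record_cleanup if defined?(::FakeRecord)
    chain
  end
end

# Boot order A: the chain is built before the ORM constant exists.
chain_before = MiniQueue.default_middleware

# The "ORM" is loaded afterwards (FakeRecord stands in for ActiveRecord).
module FakeRecord; end

# Boot order B: the ORM is already loaded when the chain is built.
chain_after = MiniQueue.default_middleware

p chain_before
p chain_after
```

Only the second chain contains the cleanup entry, so in a script that uses ActiveRecord outside Rails it is worth requiring ActiveRecord before Sidekiq and then printing the chain as shown in the comment above.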
This is still an issue and still broken as far as I can see. The Rails bug report related to this is wholly depressing as clearly deeper problems exist that nobody's addressing and it's since been locked, so impossible to contribute towards. The very clearly stated problem and suggested solutions have apparently not been implemented in favour of a couple of unrelated patches that AFAICS don't address the issue at all.
As far as Sidekiq goes, we have a very experimental hack in production right now just to "tide us over" - in practice, for months! - which works around the issue. The failure to process the queue in our case springs from Rails 4 ActiveRecord's "thread safety" actually being just a grotty hack of a giant semaphore around the entire system. Rails 4's PG adapter's "is connection active?" check is using blocking I/O, so when the combination of problems that causes Sidekiq threads to stall occurs, it's usually because there's no response in the "active?" check and that blocks indefinitely, locking the whole of ActiveRecord for everyone (in that execution instance). There are wider questions about the underlying reason for that call to block which are beyond the scope of this comment.
https://gist.github.com/pond/5eaec0c30c0b4477f234
Once again, this is experimental and papers over the cracks, but it's kept us going in prod. Use with great caution and read the comments at the top carefully before trying it out. You almost certainly need to have the ActiveRecord pool reaper running in conjunction with this.
It was developed while working for Loyalty New Zealand, my current employer, but is posted with permission and can be considered as available for any use (MIT/BSD/whatever kind of imaginary licence floats your boat!). |
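I'm a bit confused about whether we should call ActiveRecord::Base.connection_pool.with_connection in our workers or not, if we do not create threads in them (just use the ActiveRecord models). Can someone clarify? |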
You do not need to do anything special.
… On 11 Jul 2017, at 02:27, Philippe Vaucher wrote:
I'm a bit confused about whether we should call ActiveRecord::Base.connection_pool.with_connection in our workers or not, if we do not create threads in them (just use the ActiveRecord models).
|
(I just realised that editing answers on GitHub does not work well with email workflows because you only receive the first text that was submitted. I thus resubmit the whole text and delete my previous reply) @mperham: thanks! What happens if we have more concurrent workers than the available database pool size, though? My current pool size is 100, and I run Rails 5. At the moment I never have more than 80 concurrent jobs and my workers all use ActiveRecord::Base.connection_pool.with_connection because I thought it was better practice and avoided "max DB connection reached" issues. I'm in the middle of a refactor where my goal is to remove the need for a DB connection for most of my jobs, so they have all the information they need to perform. Then another job's goal will be to collect the results and do the necessary DB operations... That way it should remove the need for a large database pool. I'll also remove the wrapping with ActiveRecord::Base.connection_pool.with_connection based on your advice. |
ActiveRecord keeps a pool of connections to the database, per-process. You should not run more than a concurrency of 50 in Sidekiq and I recommend setting the pool size == concurrency. The pool is lazy so AR will open up to 50 connections if you have a lot of jobs processing at the same time in the process. In Sidekiq <5, Sidekiq's ActiveRecord middleware cleans up the job's db connection after execution. |
@mperham: ah, great, thanks a lot for the clarifications! In my case I use Sidekiq 5+ for jobs where I download images from ~130 IP cameras at regular intervals (every 10 minutes), so I kinda need the high concurrency. Maybe this needs to be designed differently... |
Why can't you run more processes?
|
@mperham: ah multiple sidekiq processes? yes I can do that, that way I can limit each Sidekiq process to a concurrency of 50 or less like you recommend and also the database connection pool becomes less of an issue. I just didn't think of it. Thanks again! p.s: for anyone wondering "why 50", the explanation is in https://github.com/mperham/sidekiq/wiki/Advanced-Options#concurrency |
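As a rough worked example of the advice above (at most 50 threads per process, pool size equal to concurrency), the arithmetic for the ~130-camera case can be sketched as follows. The `plan` helper and its return keys are hypothetical names made up for this illustration; only the cap of 50 and the "pool == concurrency" rule come from the thread.

```ruby
# Per-process concurrency cap recommended in the thread.
MAX_PER_PROCESS = 50

# Hypothetical helper: split a desired total concurrency across processes.
def plan(total_concurrency, max_per_process = MAX_PER_PROCESS)
  raise ArgumentError, "total_concurrency must be positive" unless total_concurrency.positive?

  # Fewest processes that keep each one at or under the cap.
  processes = (total_concurrency.to_f / max_per_process).ceil
  # Spread the load evenly, rounding up so the total is still covered.
  per_process = (total_concurrency.to_f / processes).ceil

  # Pool size is set equal to per-process concurrency, as advised above.
  { processes: processes, concurrency: per_process, pool: per_process }
end

layout = plan(130)
p layout
# Upper bound on database connections if every thread is busy at once.
p layout[:processes] * layout[:pool]
```

For 130 concurrent jobs this gives 3 processes of 44 threads each, so at most 132 database connections in total, which should be checked against the database's own connection limit.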
I'm looking into a potential issue with ActiveRecord4 and Sidekiq.
Since upgrading to AR4, my failure rate shot through the roof - from under 100 failures in 2 weeks to 1000 failures in the first hour after upgrading.
The error presented is this:
I have an ActiveRecord connection pool of 20. The first 20 jobs go through fine and I can see my connection pool growing, however when it hits 20, the errors start occurring.
It appears that either the connections aren't getting released or checked out appropriately.
If I replace the following line from the ActiveRecord Sidekiq middleware:
...with a full disconnect:
All works fine again - though obviously very inefficient.
I'm using a bare-bones Ruby script for my worker that connects to ActiveRecord without Rails:
It's all very simple and the queue was running fine until now.
I'll plug away at this a bit more and if no luck, will get a skeleton repo up to reproduce it. Any input is appreciated.