-
Notifications
You must be signed in to change notification settings - Fork 21.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Deadlocks in rails 5 #31019
Comments
Oops... when we added the connection lock I did consider the load lock, but in deciding it wasn't relevant I forgot about transactions. (I can't think of any other way to put user-controlled code inside that lock -- and user-controlled code is the only place that can trigger a load.) The solution is indeed to wrap the lock acquisition with However, I'm hesitant to add so much lock machinery to the log call when 1) most of the time, the only thing happening inside that block will be a database query [which itself does I wonder if we can instead push it out closer to the |
Thanks for the super fast response! I'll have a go at what you described in the transaction-y code and see how I go. I'll report back. |
So I have been doing some poking around and it seems that modifying I think basically the issue boils down to us wanting to I had a go at this with a custom monitor class that we would use in class LoadInterlockAwareMonitor < Monitor
def mon_enter
ActiveSupport::Dependencies.interlock.permit_concurrent_loads do
super
end
end
end That fixes the issue for me, but I'm not in love with the new class's name, and I'm also no sure where that would live necessarily. Do we think that a custom monitor class is even the way we should be solving this? |
Ah, right. Yeah, that does look like an interesting approach. Part of me just wants to monkey-patch the base Monitor (because any time a thread is waiting on something, like a lock acquisition, it's safe for it to permit loads)... but that seems likely to be unpopular. Suitable naming/location will require a bit more thought. One tweak to its contents: I think we can throw in a |
Good suggestion on the |
Apologies when this is not related, but I ran into an issue running rspecs against a postgres 10 db that failed. But they always passed with the postgres 9.5.10 database. Could the issue #31019 possibly also relate to the postgres version? |
This fix also appears to fix an issue where Sidekiq jobs can't run in development due to a deadlock if multiple jobs start at the same time. I was on Rails 5.1.4 using Ruby 2.4.1. Thanks for finding and fixing this. |
As part of resolving a race condition in a gem that uses Active Record, I'm currently deep in understanding Active Record's connection management, and it would really help me to get something cleared up. If Active Record's connection pool doesn't allow two threads to use the same connection at the same time, how can then multiple Puma threads be deadlocked by AR connection lock, if each thread will be using their own AR connection which have their own connection lock? Why is a lock needed in the first place, if in theory two threads cannot access the same AR connection? I would've asked this on the Discourse channel, but generally I didn't have luck getting answers there. |
@janko i believe the issue is only happening in tests where things are auto loaded. For example, in rails system tests, your server (puma) and your test runner are sharing the same connection even though they’re in separate threads. This is my understanding. I could be wrong. |
Yes, you are correct. I don't think it's necessarily related to autoloading, but to the fact that in system tests the background Puma workers run in a separate threads, all of which use a shared connection in order for transactional tests to work, so this access needs to be synchronized. |
Steps to reproduce
Since upgrading to rails 5 we've started seeing deadlocks in development and test. Using @matthewd's super helpful tool we've tracked it down to a thread being blocked by the load interlocking. This thread is holding the AR connection's lock while it's blocked. The other threads that are sharing the load interlock are blocked waiting for the connection's lock.
We are using rails 5.1.4 with postgres and are seeing this in both webrick and puma. I have a feature spec that reliably reproduces this issue (running puma with 4 threads) and the state we end up in is:
Puma Thread 1:
This thread is updating records inside an explicit
.transaction
block and has a record level lock on an object using#lock!
. It's waiting to autoload an association's class.Puma Thread 2:
This thread is just fetching some data from the database.
Puma Thread 3:
This thread is just fetching some data from the database.
Puma Thread 4:
This thread is just fetching some data from the database.
Removing our
.transaction
and#lock!
calls do not solve the problem.All of the threads that are waiting on the AR connection lock are waiting in the
ActiveRecord::ConnectionAdapters::AbstractAdapter#log
method. If I add aActiveSupport::Dependencies.interlock.permit_concurrent_loads
block outside of the locking code here the problem goes away, but I'm not sure exactly what the negative side effects will be.Is there anything that I'm missing here? I'm really not sure where to start. Any help would be really appreciated!
System configuration
Rails version: 5.1.4
Ruby version: 2.3.3
The text was updated successfully, but these errors were encountered: