The following exception appears under load in a Rails 3.2.8 app running on the newest JRuby version on a multi-core machine:
Detected invalid hash contents due to unsynchronized modifications with concurrent users
The likely cause is that the Hash @reserved_connections is accessed without synchronizing on the correct mutex in two places: release_connection() (https://github.com/rails/rails/blob/39087068c2e3c85f6839ea51eab4480673138a2b/activerecord/lib/active_record/connection_adapters/abstract/connection_pool.rb#L118) and clear_stale_cached_connections!() (https://github.com/rails/rails/blob/39087068c2e3c85f6839ea51eab4480673138a2b/activerecord/lib/active_record/connection_adapters/abstract/connection_pool.rb#L206).
This bug is probably related to #6464, but the fix there is apparently insufficient.
I allow myself to ping @tenderlove.
clear_stale_cached_connections! is always called inside a synchronize here and here, so I don't think we need another synchronize block inside that method. release_connection can definitely be called without synchronization, so I think adding a synchronize inside that method will fix it up.
Synchronize around deleting from the reserved connections hash.
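A minimal sketch of the idea described above, not the actual Rails patch: every read or write of the reserved-connections hash, including the delete in release_connection, happens while holding the same lock. TinyPool, checkout, and reserved_count are hypothetical names used only for illustration, and keying the hash by Thread.current.object_id is an assumption.

```ruby
require 'monitor'

# Hypothetical stand-in for a connection pool; not the Rails implementation.
class TinyPool
  def initialize
    @reserved_connections = {}  # thread id => connection (assumed keying)
    @lock = Monitor.new         # single lock guarding the hash
  end

  # Reserve a connection for the calling thread.
  def checkout(conn)
    @lock.synchronize do
      @reserved_connections[Thread.current.object_id] ||= conn
    end
  end

  # The delete is wrapped in synchronize, so it cannot interleave
  # with another thread's insert or delete on the same hash.
  def release_connection
    @lock.synchronize do
      @reserved_connections.delete(Thread.current.object_id)
    end
  end

  def reserved_count
    @lock.synchronize { @reserved_connections.size }
  end
end

pool = TinyPool.new
threads = 8.times.map do |i|
  Thread.new do
    100.times do
      pool.checkout("conn-#{i}")
      pool.release_connection
    end
  end
end
threads.each(&:join)
```

After all threads finish, no reservation should be left in the hash, and on JRuby the hash is never modified by two threads at once.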
As you can see from the stack trace, clear_stale_cached_connections!() is called not only from inside connection_pool.rb, but also from inside connection_specification.rb, via ActiveRecord::Base.clear_active_connections!(). That method is public and may be called by any thread. If you think a synchronized block inside clear_stale_cached_connections!() is unnecessary, you will need to make that method private. However, ActiveRecord::Base.clear_active_connections!() seems to be public for a reason: otherwise, long-running threads accessing ActiveRecord (threads separate from the main Ruby threads, e.g. dedicated threads for writing to the database asynchronously) would have no way to release resources allocated while writing to the database.
@xb I don't see that code on 3-2-stable. Can you point me at it?
I'm sorry, this is my error. clear_stale_cached_connections!() is indeed only called within connection_pool.rb, so I believe your commit fixes the bug.
However, I'd suggest making clear_stale_cached_connections!() private anyway. That makes it harder to call the method without proper synchronization (which is likely to be a bug), and makes the kind of confusion I had a little less likely. ;-)
@xb totally agree. I'll make the method private!
clear_stale_cached_connections! has long been listed in the API docs for ConnectionPool.
I would argue that it is thus part of ConnectionPool's API, and that making it private is a backwards-incompatible break (in a patch release!). As part of the public API, it also ought to synchronize itself, so people can call it safely. Does an extra nested synchronize really affect performance much in the expected use case, where it is called by methods that are already synchronized? The benefit is that it actually works reliably when called as part of the public API, which is how it is listed.
If you are going to break the public API in a patch release rather than fix it, I suggest at least advertising this very clearly in the docs.
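To illustrate the nested synchronize question, here is a small sketch assuming the pool's lock is reentrant, as Ruby's Monitor is. ReentrantDemo, clear_stale!, and the integer ids are hypothetical and only mirror the shape of the discussion; this is not the Rails code. With a plain Mutex, the nested acquire by the same thread would raise ThreadError instead.

```ruby
require 'monitor'

# Hypothetical class showing a public method that locks for itself
# and is also called from a method that already holds the lock.
class ReentrantDemo
  def initialize
    @lock = Monitor.new                 # Monitor allows re-entry by the owning thread
    @reserved = { 1 => :conn_a, 2 => :conn_b }
  end

  # Public entry point: safe to call from any thread because it locks itself.
  def clear_stale!(live_ids)
    @lock.synchronize do
      @reserved.delete_if { |id, _conn| !live_ids.include?(id) }
      @reserved.keys
    end
  end

  # Internal caller that already holds the lock when it calls clear_stale!.
  def checkout(live_ids)
    @lock.synchronize do
      clear_stale!(live_ids)            # nested acquire on the same thread, no deadlock
    end
  end
end

demo = ReentrantDemo.new
remaining = demo.checkout([1])          # id 2 is treated as stale and removed
```

The nested acquire costs an ownership check and a counter increment rather than a contended lock, which is why keeping the synchronize inside the public method is usually cheap.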