-
Notifications
You must be signed in to change notification settings - Fork 21.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Prevent schema_cache leaking connections between threads #38577
Conversation
62e8342
to
bac3a9b
Compare
49b6b21 changed schema_cache to be per-connection-pool rather than per-connection. However, the schema_cache still maintains an internal reference to a single connection. This reference was replaced each time the schema_cache was requested, but the change would affect all threads. Under some circumstances, this can result in multiple threads trying to use the same connection. This commit switches back to a single SchemaCache instance per connection, but also introduces a new SchemaCacheData class. A single instance of SchemaCacheData is shared between all SchemaCache instances for a pool.
bac3a9b
to
4c3e0c3
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey @davidtaylorhq thanks so much for the PR.
Is there a way to do this without making big changes to the SchemaCache class? My concern is that we can't backport it because with the changes it's more of a "feature + bugfix" than just a bug fix. The other possibility is we can merge this as-is (after I do some more testing with our app) and then backport a portion of it to fix the leak you're experiencing.
This change also needs a CHANGELOG.
end | ||
|
||
def set_schema_cache(cache) | ||
self.schema_cache = cache | ||
# The pool's cache should not have a connection attached |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you remove this comment?
Thanks for reviewing @eileencodes - I removed the unnecessary comment and added to the CHANGELOG (I pushed the commits to the branch, but they haven't appeared here yet... probably due to the earlier GitHub issues)
I've tried to make the changes as minimal as possible. There shouldn't be any new feature introduced here - sharing the cache data across the whole pool was implemented back in 49b6b21. This change just fixes that feature to make it thread-safe. The diff is reasonably large, but the changes are mostly converting instance variables into |
Yes but this class is public and you've change the interface and how to interact with the schema cache, so anyone who is expecting to access the methods. |
I've tried to avoid doing anything which would be backwards-incompatible. All existing interfaces remain the same, so the only change is the addition of I would love to find a solution which has absolutely zero interface change, but haven't been able to come up with one. |
@eileencodes / @davidtaylorhq / @rafaelfranca what we could do is make this PR 2 commits, first commit improves the existing state so it is not as bad, that change is ultra safe to merge. Basically this monkey patch: https://github.com/discourse/discourse/blob/master/lib/freedom_patches/schema_cache_concurrency.rb Something like
This is clearly not ideal, but will work for 99.99% of the cases, you need to do some serious thread shuffling to break it. Then in 6.1 we can introduce the more comprehensive patch, how does that sound? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps we could pass the pool
to the SchemaCache instead of a connection
? We'd continue to accept a (deprecated) connection for backward compatibility. When SchemaCache needs to use the connection, it'd request one from the pool: … = @pool.with_connection { |c| c.primary_key(table_name) }
Hi @jeremy
This is how I approached the problem to start with, and came up with this solution. Conceptually I prefer it, but after discussing with @SamSaffron we had some concerns:
That said, happy to develop this approach some more if those two problems can be mitigated |
This pull request has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. |
I've been meaning to do a similar refactor for a while. IMO it should even go further as if you have say read replicas, the read replica pool and the write pool should ideally share their schema cache (or at least most of the data).
@eileencodes could you point out the changes you think are not acceptable? Maybe I'm missing something but I don't really see the interface change here. Or maybe the PR was updated after your review? |
The |
I see, then maybe the solution is to have all these changes as a new class, and leave the original one untouched. |
This adapter ensures that MiniSql locks the ActiveRecord mutex before using the raw PG connection. This ensures that multiple threads will not attempt to use the same connection simultaneously. This commit also removes the schema_cache_concurrency freedom-patch, which is no longer required now that cross-thread connection use is controlled by the mutex. For the original root cause of this issue, see rails/rails#38577
This adapter ensures that MiniSql locks the ActiveRecord mutex before using the raw PG connection. This ensures that multiple threads will not attempt to use the same connection simultaneously. This commit also removes the schema_cache_concurrency freedom-patch, which is no longer required now that cross-thread connection use is controlled by the mutex. For the original root cause of this issue, see rails/rails#38577
This adapter ensures that MiniSql locks the ActiveRecord mutex before using the raw PG connection. This ensures that multiple threads will not attempt to use the same connection simultaneously. This commit also removes the schema_cache_concurrency freedom-patch, which is no longer required now that cross-thread connection use is controlled by the mutex. For the original root cause of this issue, see rails/rails#38577
At Discourse, we have been seeing some unexpected behaviour in our Sidekiq worker processes, which run jobs in a multi-threaded environment. We were seeing multiple threads accessing the same database connections concurrently.
The root cause appears to be 49b6b21, which changed schema_cache to be per-pool rather than per-connection. However, the schema_cache still maintains an internal reference to a single connection. This reference was replaced each time the schema_cache was requested, but the change would affect all threads. Under some circumstances, this can result in multiple threads trying to use the same connection.
This commit switches back to a single SchemaCache instance per connection, but also introduces a new SchemaCacheData class. A single instance of SchemaCacheData is shared between all SchemaCache instances for a pool.
cc @SamSaffron