New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Make ARHP's connection switching thread safe #63
base: main
Are you sure you want to change the base?
Conversation
1aaaf27
to
e68fc01
Compare
@@ -3,13 +3,54 @@ | |||
require "active_record/connection_adapters/mysql2_adapter" | |||
|
|||
module ActiveRecordHostPool | |||
module PerThreadHashAccess |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This module only exists to preserve this line farther below:
@config[:database] = _host_pool_current_database if ActiveRecord::VERSION::MAJOR >= 5
610fcb1
to
e312b2a
Compare
attr_reader(:_host_pool_current_database) | ||
def _host_pool_current_database | ||
_ahrp_storage[:_host_pool_current_database] | ||
end |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any reason to not move the _host_pool_current_database
accessor methods out of the class_eval and into the main module body?
20b41cf
to
8ccf524
Compare
I'm converting this back to a draft. Using this branch locally for testing has been working well but when running this against an app on TravisCI there are some failures that I need to identify if they are related to this patch. |
20283b8
to
3cfd562
Compare
@_host_pool_current_database = database | ||
@config[:database] = _host_pool_current_database if ActiveRecord::VERSION::MAJOR >= 5 | ||
end | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I moved _host_pool_current_database
and _host_pool_current_database=
methods down as methods on the module rather than added on through class_eval
.
so that AHRP can be used in a multi-threaded environment. 1. Make _host_pool_current_database accessors thread local 2. Make the connection adapter's internal @config thread-safe with PerThreadHashAccess module inclusion 3. Make a ThreadSafe module which layers methods and overrides on top of DatabaseSwitch base functionality 4. No longer cache current database (via _cached_current_database) IMPLICATIONS OF NO LONGER CACHING THE CURRENT DATABASE In a multithreaded world it doesn't make sense to cache the current database since it's impossible for the code to know if we have switched away and back again in another thread. This means that AHRP will now always change the database for every statement. This also means that clear_cache! will always be called. Performance hit? The cost of issuing a #select_db for each query seems negligible (sub millisecond). If MySQL already has the current database selected then the switch is essentially a no-op and cost is really the round trip across the network. Locally, the average across 10,000 reads was a fraction of a millisecond. The biggest cost may come from calling #clear_cache! on every query. However, this appears to be the prepared statements cache and not the query (results) cache (which is cleared via a separate method: #clear_query_cache). Separate ThreadSafe module? Not knowing how big of a shift this change will be considered I thought it might be easier to make it possible to update the code so projects could opt-in to thread safety rather than making it the global default. Currently, this change-set assumes it will be the default and includes the ThreadSafe module automatically. REMOVING THE VERSION CHECK AROUND UPDATING THE @config[:database] The internal @config[:database] will now always be set rather than just on ActiveRecord >= 5. This ensures that each version of ActiveRecord is updated to do the right thing with respect to threads. Removing the conditional doesn't break anything in Rails 4.2 and it appears that it was only added to fix something broken in ActiveRecord 5+. Since the code continues to work correctly across Rails 4.2, 5.x, and 6.x it doesn't make sense to leave it in. Removing it also makes testing thread-safe behavior simpler since all versions can be tested in the same way.
3cfd562
to
a04a11c
Compare
@@ -118,3 +181,4 @@ def establish_connection(owner, spec) | |||
# rubocop:enable Lint/DuplicateMethods | |||
|
|||
ActiveRecord::ConnectionAdapters::Mysql2Adapter.include(ActiveRecordHostPool::DatabaseSwitch) | |||
ActiveRecord::ConnectionAdapters::Mysql2Adapter.include(ActiveRecordHostPool::ThreadSafe) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I made ThreadSafe
a separate module in case this functionality should be considered opt-in.
If this is made opt-in it may make sense to separate test thread-safety separate from non-thread-safety (to run separately when we modify the Mysql2Adapter with thread-safety from running the original non-thread-safe code), but I did not tackle the in this PR since it may not be necessary.
It just hit me that this works great for the database switching to be thread-safe, but it doesn't do anything for the connection pool. I'll see about writing a test for that and see what shakes. |
I've pushed up 3b3572a which exercises the connection pool by creating a number of records across threads. |
I spent the morning looking at this with @mervync and.... I think it should work?? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I want to try this in Classic.
disconnect_without_host_pooling! | ||
end | ||
|
||
def _host_pool_current_database | ||
_ahrp_storage[:_host_pool_current_database] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
_ahrp_storage[:_host_pool_current_database] | |
_arhp_storage[:_host_pool_current_database] |
I also make this mistake constantly when working in this repo but ahrp
should be renamed arhp
end | ||
|
||
def _host_pool_current_database=(database) | ||
_ahrp_storage[:_host_pool_current_database] = database |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
_ahrp_storage[:_host_pool_current_database] = database | |
_arhp_storage[:_host_pool_current_database] = database |
etc
addendum: turns out this isn't production tested. |
def _ahrp_storage | ||
@_ahrp_storage ||= {} | ||
end | ||
|
||
def _ahrp_fetch_database_to_switch_to | ||
database_name = _host_pool_current_database | ||
if database_name != _cached_current_database || connection.object_id != _cached_connection_object_id | ||
database_name | ||
end | ||
end |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need these methods? They look like they're overridden by ThreadSafe
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess because ThreadSafe
is opt-in
What does this PR do?
This PR makes ARHP database switching thread-safe by:
@config
to also work in a thread local mannerWhy is this helpful?
This allows ARHP to be used in Rails with concurrency turned on. Specifically for my use-case, this lets me set
config.allow_concurrency=true
inconfig/environments/test.rb
for a feature/integration suite. Without this change concurrency has to be disabled and this makes request processing extremely slow.Trade-offs
No longer caching the current database
Turning thread-safety on means that caching the current database no longer makes sense since the cached value may be stale based on other threads that have interacted with the database. Rather than querying the database to see what the current database is (which would work) the code now always issues the
select_db
call and lets MySQL handle responding to things. MySQL seems to treat operations that don't require switching the current database as no-ops and these are very fast.Always calling
clear_cache!
clear_cache!
is always called now since the code no longer keeps track of the current database (and lets mysql manage that). This essentially kills any prepared statement caching that occurred, but this was already happening to a large degree when an application would switch databases.