Part 4: Multi db improvements, Basic API for connection switching #34052
def connected_to(database: nil, handler: nil, &blk)
  if database && handler
    raise ArgumentError, "connected_to can only accept handler or database, but not both arguments."
  end
Should we raise an exception if both are nil?
I like that we can use
|
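A minimal sketch of the argument check being discussed, assuming the method should raise both when the two arguments are passed together and when neither is passed. `validate_connected_to_args` is a hypothetical standalone helper used only for illustration; it is not the Rails method, and the second error message is an assumption.

```ruby
# Hypothetical standalone helper mirroring the guard in the diff above.
# It is not part of Rails; it only illustrates the validation under discussion.
def validate_connected_to_args(database: nil, handler: nil)
  if database && handler
    raise ArgumentError, "connected_to can only accept handler or database, but not both arguments."
  elsif database.nil? && handler.nil?
    # Assumed behavior (and assumed wording) for the "both are nil" case.
    raise ArgumentError, "must provide a handler or a database argument."
  end

  # Report which kind of switch was requested.
  database ? :database : :handler
end
```

Raising in the "neither" case makes a bare `connected_to { ... }` call fail loudly instead of silently running the block on whatever connection happens to be current.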
What would happen if two models have the same handler name for two different databases? Should we support that? Say:
Also say we have that scenario, how would the following code work?
Dog.connected_to(handler: :reading) do
  Dog.create!
  Book.create!
end
If I got the implementation correctly |
That's exactly what I want to support because that's how we do it at GitHub. We have 10 connections that belong to a
At GitHub we don't write this:
Dog.connected_to(handler: :reading) do
  Dog.create! # explodes from Dog because it's doing a write on a read
  Book.create! # isn't called, but not because it's Dog's handler; you told Rails what handler to use - `reading`.
end
Instead we write this (but with GitHub instead of AR Base because Rails doesn't support this yet):
ActiveRecord::Base.connected_to(handler: :reading) do
  Dog.create!
  Book.create!
end
If we want to write to multiple dbs we can do that by using the writing handler:
ActiveRecord::Base.connected_to(handler: :writing) do
  Dog.create! # success
  Book.create! # success
end
|
@rafaelfranca I think we should support your first scenario, but if you want to switch both models you need to do it at |
I like this. A few notes: If you're connecting directly to a specific database, you shouldn't have to declare the role:
ModelInPrimary.connected_to(database: :primary_replica_slow) do
  ModelInPrimary.do_something_thats_slow
end
Re: handler, I don't really like that word much. I'd prefer to use "role". That would connect with the future 3-tier database.yml configuration setup as well. So it would be:
ActiveRecord::Base.connected_to(role: :reading) do
  Dog.create!
  Book.create! # Will raise if a :reading role isn't found on Book
end
@tenderlove I'd be curious to see how many instances of connection switching you have in the code? I was initially partial to having some syntactic sugar, but I don't think switching roles mid-flight is going to be a super common action. And if it isn't, then I'd rather be as clear as possible about what's going on.
On the larger topic of r/w splitting, @eileencodes, you're working towards a place where AR automatically will pick the :writing role when AR is doing INSERTs and the :reading role when AR is doing SELECTs, right? I thought there was some confusion about whether that's within this initial scope of work when discussing with @matthewd in the earlier thread. |
👍 I will work on changing this requirement. Currently I have it so it creates a new
I can change this. For background handler makes sense to me since it is switching on the connection_handler - but perhaps that's too much for the user to need to know.
Yes but this is further down the line (ie not for this PR). Rails needs to be able to switch connections before it can know what to switch to.
We actually do this quite a bit since we default to the replicas, except in certain circumstances where we need to explicitly call readonly.
|
More than I thought. I was counting and then @eileencodes finished before me. 😊 |
So your default replicas are not readonlys? If we get AR to do the automatic r/w splitting, would you still need as many explicit calls? Or would you only need it when using slow-read dbs? |
A question: Can this switching be used as a failover? Let's say, a primary connection failed, switch to secondary (backup) one. |
@dhh we default to read but switch on the request type (GET == read, POST == write) rather than the sql query. So in some cases we need to switch back to the read or to the write in order to handle that. I assume we will need less of those if we have Rails auto switch based on SQL rather than request type. I think that if we really do need the helper methods we can add those later. @deepj No. We're quite a bit aways from something like that. |
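The request-type switching described above can be sketched roughly as follows. `role_for_request` is a hypothetical helper, and sending every verb other than GET to `:writing` is an assumption; the thread only mentions GET and POST.

```ruby
# Hypothetical helper: pick a role from the HTTP verb instead of from the SQL.
# Only GET => reading and POST => writing come from the discussion; routing
# every other verb to :writing is an assumption made for this sketch.
def role_for_request(http_method)
  http_method.to_s.upcase == "GET" ? :reading : :writing
end
```

The returned symbol could then be passed as the `role:` argument to `connected_to`. Because the choice is made once per request, code inside a GET request that needs to write still has to switch explicitly, which matches the explicit calls mentioned above.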
Yes, we use the multiple handlers, but it seems the implementation stores the name of the handlers in Would not:
fail because |
Ok I think I get it. The handler is the same for all models but it holds a connection pool for each model with a different I agree with DHH's suggestions for the API. 👍 from me. |
Yup! That's exactly how it works. I'm writing up some tests and will be pushing up later this weekend or early next week. I think we're almost ready to merge this (with DHH's changes). That will unblock a lot of the future work. 😄 Also @matthewd originally had some concerns about threads but we paired today and found it's not a problem. The connection handler is thread local so we're good there 👍 |
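The model confirmed above (one handler per role, each holding a separate pool per model connection, with the active handler tracked per thread) can be illustrated with a toy, plain-Ruby sketch. `ToyHandler` and `ToyRegistry` are hypothetical names and are not Rails classes; symbols stand in for real connection pools.

```ruby
# Toy model only: one handler per role, each handler holds one "pool"
# (here just a database name) per connection specification name.
class ToyHandler
  def initialize
    @pools = {}
  end

  def establish(spec_name, database)
    @pools[spec_name] = database
  end

  # Raises KeyError when this role has no pool for the given spec name.
  def pool_for(spec_name)
    @pools.fetch(spec_name)
  end
end

module ToyRegistry
  HANDLERS = { writing: ToyHandler.new, reading: ToyHandler.new }.freeze

  # The current role is stored per thread; :writing is the assumed default.
  def self.current_role
    Thread.current.thread_variable_get(:toy_role) || :writing
  end

  # Swap the role for the duration of the block, then restore the old one.
  def self.connected_to(role:)
    previous = Thread.current.thread_variable_get(:toy_role)
    Thread.current.thread_variable_set(:toy_role, role)
    yield
  ensure
    Thread.current.thread_variable_set(:toy_role, previous)
  end

  def self.database_for(spec_name)
    HANDLERS.fetch(current_role).pool_for(spec_name)
  end
end

ToyRegistry::HANDLERS[:writing].establish("animals", :animals)
ToyRegistry::HANDLERS[:reading].establish("animals", :animals_replica)
ToyRegistry::HANDLERS[:writing].establish("primary", :primary)
ToyRegistry::HANDLERS[:reading].establish("primary", :primary_replica)
```

In this sketch a single `connected_to(role: :reading)` call changes which database every spec name resolves to, which is why switching on the base class affects `Dog` and `Book` at once.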
I think That's why I like |
Wow. I swear your post said |
|
This PR adds the ability to 1) connect to multiple databases in a model, and 2) switch between those connections using a block.

To connect a model to a set of databases for writing and reading use the following API. This API supersedes `establish_connection`. The `writing` and `reading` keys represent handler / role names and `animals` and `animals_replica` represent the database keys used to look up the configuration hash.

```
class AnimalsBase < ApplicationRecord
  connects_to database: { writing: :animals, reading: :animals_replica }
end
```

Inside the application - outside the model declaration - we can switch connections with a block call to `connected_to`.

If we want to connect to a db that isn't default (ie readonly_slow) we can connect like this:

Outside the model we may want to connect to a new database (one that is not in the default writing/reading set) - for example a slow replica for making slow queries. To do this we have the `connected_to` method that takes a `database` hash that matches the signature of `connects_to`. The `connected_to` method also takes a block.

```
ActiveRecord::Base.connected_to(database: { slow_readonly: :primary_replica_slow }) do
  ModelInPrimary.do_something_thats_slow
end
```

For models that are already loaded and connections that are already connected, `connected_to` doesn't need to pass in a `database` because you may want to run queries against multiple databases using a specific role/handler.

In this case `connected_to` can take a `role` and use that to swap on the connection passed. This simplifies queries - and matches how we do it in GitHub. Once you're connected to the database you don't need to re-connect, we assume the connection is in the pool and simply pass the handler we'd like to swap on.

```
ActiveRecord::Base.connected_to(role: :reading) do
  Dog.read_something_from_dog
  ModelInPrimary.do_something_from_model_in_primary
end
```
…ishes connection Related to rails#34052
Since both methods are public API I think it makes sense to add these tests in order to prevent any regression in the behavior of those methods after the 6.0 release.
Exercise `connected_to`
- Ensure that the method raises with both `database` and `role` arguments
- Ensure that the method raises without `database` and `role`
Exercise `connects_to`
- Ensure that the method returns an array of established connections (as mentioned in the docs of the method)
Related to rails#34052
It seems that
or am I missing something @eileencodes ? |
No you're not missing something, Rails does not yet handle joining across separate databases. We're working on supporting the ability for Rails to recognize the connections are different and to split up the queries into 2 selects but the join syntax isn't going to be possible across 2 machines. |
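To illustrate conceptually what replacing a cross-database join with two selects means, here is a plain-Ruby sketch in which in-memory arrays stand in for tables on two separate database servers. The data and helper names are hypothetical, and this is not how Rails implements or plans to implement it.

```ruby
# In-memory stand-ins for tables that live on two different database servers.
DOGS_DB = [
  { id: 1, name: "Rex",  owner_id: 10 },
  { id: 2, name: "Fido", owner_id: 11 }
].freeze
OWNERS_DB = [
  { id: 10, name: "Ann" },
  { id: 11, name: "Bob" },
  { id: 12, name: "Cy" }
].freeze

# First "select": runs against the dogs database only.
def select_dogs
  DOGS_DB
end

# Second "select": runs against the owners database, filtered by the ids
# gathered from the first result (the equivalent of WHERE id IN (...)).
def select_owners(ids)
  OWNERS_DB.select { |owner| ids.include?(owner[:id]) }
end

# Combine the two result sets in application code instead of a SQL JOIN.
def dogs_with_owners
  dogs = select_dogs
  owner_ids = dogs.map { |dog| dog[:owner_id] }.uniq
  owners_by_id = select_owners(owner_ids).to_h { |owner| [owner[:id], owner] }
  dogs.map { |dog| { dog: dog[:name], owner: owners_by_id.fetch(dog[:owner_id])[:name] } }
end
```

Each query touches only one database, so neither server needs to know about the other; the cost is an extra round trip and the matching work done in Ruby.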
Can I suggest something? During
what do you think @eileencodes ? |
Perhaps a stupid question but does this enable:
Is there a way to add and/or create databases at runtime? From what I understand, for now you have to append the db config to database.yml. I could not find a conclusive post on this. |
Nobody? |
I mean, does Rails 6 support multi-tenancy? For example, universities as entities with 1 database per entity. Or are we better off using other, not-to-be-named solutions? |
As the docs note, no, not yet, Rails doesn't support sharding. |
Hi! May I ask what is the current status? I would like to be able to safely split reads/writes between master and slaves in a MySQL replication, automatically. Is this possible yet? Thanks in advance. |
Awesome work! |
I had one question. Is it possible to specify a series of replicas in database.yml for reading so that when you do something like this
it can go to one of the different dbs you have specified in that group ... vs just one specific db? |
ActiveRecord::Base.connected_to(database: :key_logs) do . I want manual db switching. The connected_to method looks for adapter details in database.yml. I have multiple databases with the same details, and I don't want multiple schemas. Is Rails 6 working on schema switching? |
Hi @eileencodes, I'm working on this feature. Let me quickly share my use case...
Here is some code to help you see my current scenario...
database.yml
test:
  shard:
    <<: *default
    url: <%= ENV.fetch('DATABASE_URL_SHARD_TEST') %>
    migrations_paths: db/shard_migrate
  shard_replica:
    <<: *default
    url: <%= ENV.fetch('DATABASE_URL_SHARD_TEST') %>
    replica: true
  global:
    <<: *default
    url: <%= ENV.fetch('DATABASE_URL_GLOBAL_TEST') %>
    migrations_paths: db/global_migrate
  global_replica:
    <<: *default
    url: <%= ENV.fetch('DATABASE_URL_GLOBAL_TEST') %>
    replica: true
Then,
class ApplicationRecord < ActiveRecord::Base
  self.abstract_class = true
  # We have some code to read the current config and return the Hash with the shards available in database.yml
  connects_to shards: { global: { writing: :global, reading: :global_replica }, shard: { writing: :shard, reading: :shard_replica } }
  scope :active, -> { where(active: true) }
end
Up to here there are no major issues, I can switch connections and everything looks ok... But now, imagine I created a new DB connection which is not part of
Again my question, I would like to reload
ApplicationRecord.connects_to shards: { global: { writing: :global, reading: :global_replica }, shard: { writing: :shard, reading: :shard_replica }, customer1: { writing: :customer1, reading: :customer1_replica } }
The
ActiveRecord::Base.connected_to(shard: :company1) do
  Model.find(id)
end
Any thoughts? I appreciate your feedback 👍 |
No, this isn't allowed - doing this would clobber all existing connections to add the new one, potentially in the middle of a request. That's super dangerous, so even if there was a workaround I wouldn't recommend it. There are also other potential issues with a setup like this, and it's safest to reload the application when messing with database connections.
The recommended way of doing this (or at least the way we discussed at my prior job of working around this) is to pre-setup your shards. For example, say you currently have 4 customers. Instead of only setting up 4 shards, one for each existing customer, you'd set up 100 shards and add the connections for those. Then in your global router table all you would need to do is insert a new record for the new tenant when they sign up. The connections will be active but inaccessible until the customer is added to the DB. The other way is to add a new config and new connection and then deploy the app each time you need to add a new customer.
In the future it's better to open a new issue or ask questions on the forum. I occasionally unsubscribe from old PRs, and asking here also hides these questions from future readers looking for the same answer. |
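The pre-provisioning approach described above could be sketched like this in plain Ruby. The shard naming scheme (`shard_1`, `shard_1_replica`), the `preprovisioned_shards` helper, and the `TenantRouter` class are all hypothetical; the router is an in-memory stand-in for the global router table, not a Rails API.

```ruby
# Build a hash in the shape passed to `connects_to shards:` for a fixed
# number of pre-provisioned shards. The naming scheme is an assumption.
def preprovisioned_shards(count)
  (1..count).to_h do |n|
    name = :"shard_#{n}"
    [name, { writing: name, reading: :"#{name}_replica" }]
  end
end

# Toy stand-in for the global router table: tenant name => shard name.
class TenantRouter
  def initialize(shards)
    @free = shards.keys
    @assignments = {}
  end

  # "Insert a record" for a new tenant by handing out the next unused shard.
  # Adding the same tenant twice returns the shard it already has.
  def add_tenant(tenant)
    return @assignments[tenant] if @assignments.key?(tenant)
    raise "no free shards left" if @free.empty?

    @assignments[tenant] = @free.shift
  end

  # Raises KeyError for tenants that have not been added yet.
  def shard_for(tenant)
    @assignments.fetch(tenant)
  end
end
```

With this layout, onboarding a customer is only a data change: the connections for all shards already exist at boot, and the value returned by `shard_for` is what would be passed as the `shard:` argument when switching connections.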
This PR implements the basic API requirements laid out in #33877 by DHH. The PR aims to focus only on implementing the `connects_to` and `connected_to` API. For now it does not tackle any configuration changes (we can hash that out in future PRs). If this API is acceptable I will add tests.
cc/ @dhh @matthewd @rafaelfranca @tenderlove
This PR adds the ability to 1) connect to multiple databases in a model, and 2) switch between those connections using a block.
To connect a model to a set of databases for writing and reading use the following API. This API supersedes `establish_connection`. The `writing` and `reading` keys represent handler / mode names and `animals` and `animals_replica` represent the database keys used to look up the configuration hash.
Inside the application - outside the model declaration - we can switch connections with a block call to `connected_to`.
If we want to connect to a db that isn't default (ie readonly_slow) we can connect like this:
Outside the model we may want to connect to a new database (one that is not in the default writing/reading set) - for example a slow replica for making slow queries. To do this we have the `connected_to` method that takes a `database` hash that matches the signature of `connects_to`. The `connected_to` method also takes a block.
For models that are already loaded and connections that are already connected, `connected_to` doesn't need to pass in a `database` because you may want to run queries against multiple databases using a specific mode/handler.
In this case `connected_to` can take a `handler` and use that to swap on the connection passed. This simplifies queries - and matches how we do it in GitHub. Once you're connected to the database you don't need to re-connect, we assume the connection is in the pool and simply pass the handler we'd like to swap on.