Add an intermediary called RoleManager to manage connections #37622
This commit message is really long
Yup, you can get the role that way, but once this is merged we can expose a public API on
This makes it really clear how much we need to rename Role now that it won't be an actual role. And since it no longer returns a connection pool I realized that
If you're ok with this direction @casperisfine I'll come up with a new name. I didn't want to rename it before opening the PR because I thought it would be more difficult to see the direction we want to go in.
Like I said previously this will fix the underlying issue with sharding while avoiding drastic changes to connection management.
TBH I don't find it elegant at all because it makes the hierarchy even more backwards than it already is (
But I don't think it's possible to do without breaking
So long story short:
On another note, something from #37388 that we'll have to reproduce anyway. Right now
This PR is an alternate solution to #37388. While there are benefits to merging #37388, it changes the public API and swaps around existing concepts for how connection management works. The changes are backwards-incompatible and pretty major. This will have a negative impact on gems and applications relying on how connection management currently works.

**Background:**

Shopify and other applications need sharding but Rails has made it impossible to do this because a handler can only hold one connection pool per class. Sharded apps need to hold multiple connections per handler per class.

This PR aims to solve only that problem.

**What this PR does:**

In this PR we've added a `RoleManager` class that can hold multiple `Roles`. Each `Role` holds the `db_config`, `connection_specification_name`, `schema_cache` and `pool`. By default the `RoleManager` holds a single reference from a `default` key to the `Role` instance. A sharded/multi-tenant app can pass an optional second argument to `remove_connection`, `retrieve_connection_pool`, `establish_connection` and `connected?` on the handler, thus allowing for multiple connections belonging to the same class/handler without breaking backwards compatibility.

By using the `RoleManager` we can avoid altering the public API and moving around handler/role concepts, and still meet the internal need to establish multiple connections per handler per class.

**A note about why we opened this PR:**

We very much appreciate the work that went into #37388 and in no way mean to diminish that work. However, it breaks the following public APIs:

* `#retrieve_connection`, `#connected?`, and `#remove_connection` are public methods on handler and can't be changed from taking a spec to a role.
* The knowledge that the handler keys are symbols relating to a role (`:writing`/`:reading`) is public - changing how handlers are accessed will break apps/libraries.

In addition it doesn't solve the problem of mapping a single connection to a single class since it has a 1:1 mapping of `class (handler) -> role (writing) -> db_config`. Multiple pools in a writing role can't exist in that implementation.

The new PR solves this by using the `RoleManager` to hold multiple connection objects for the same class. This lets a handler hold a role manager which can hold as many roles for that writer as the app needs.

**Regarding the `Role` name:**

When I originally designed the API for multiple databases, it wasn't accidental that handler and role are the same concept. Handler is the internal concept (since that's what was there already) and Role was the public external concept. Meaning, role and handler were meant to be the same thing. The concept here means that when you switch a handler/role, Rails automatically can pick up the connection on the other role by knowing the specification name. Changing this would mean not only that we need to rework how GitHub and many, many gems work, but also that we need to retrain users of Rails 6.0 because all these concepts changed.

Since this PR doesn't move around the concepts in connection management and instead creates an intermediary between `handler` and `role` to manage the connection data (`db_config`, `schema_cache`, `pool`, and `connection_specification`) we think that `Role` and `RoleManager` are the wrong names.

We didn't change it yet in this PR because we wanted to keep change churn low for initial review. We also haven't come up with a better name yet. 😄

**What this PR does not solve:**

Our PR here solves a small portion of the problem - it allows models to have multiple connections on a class. It doesn't aim to solve any other problems than that. Going forward we'll need to still solve the following problems:

* `DatabaseConfig` doesn't support a sharding configuration
* `connects_to`/`connected_to` still needs a way to switch connections for shards
* Automatic switching of shards
* `connection_specification_name` still exists

**The End**

Thanks for reading this far. These problems aren't easy to solve. John and I spent a lot of time trying different things and so I hope that this doesn't come across as if we think we know better. I would have commented on the other PR what changes to make but we needed to try out different solutions in order to get here.

Ultimately we're aiming to change as little of the API as possible. Even if the handler/role -> manager -> db_config/pool/etc isn't how we'd design connection management if we could start over, we also don't want to break public APIs. It's important that we make things better while maintaining compatibility.

The `RoleManager` class makes it possible for us to fix the underlying problem while maintaining all the backwards compatibility in the public API.

We all have the same goal: to add sharding support to Rails. Let me know your thoughts on this change in lieu of #37388 and if you have questions.

Co-authored-by: John Crepezzi <email@example.com>
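As a rough illustration of the idea in the commit message above, the following plain-Ruby sketch shows how a handler could keep one manager per specification name and accept an optional second argument that defaults to a `default` key. It is not the Rails implementation: `SketchHandler`, the method names on `RoleManager` (`set_role`, `get_role`, `remove_role`), the return values, and the string standing in for a connection pool are all assumptions made for the example.

```ruby
# Illustrative sketch only; not the actual Rails source.
# Role bundles the per-connection data listed in the commit message.
Role = Struct.new(:connection_specification_name, :db_config, :schema_cache, :pool)

# RoleManager maps a key to a Role. Unsharded apps would only ever use :default.
# The method names here are hypothetical.
class RoleManager
  def initialize
    @roles = {}
  end

  def set_role(key, role)
    @roles[key] = role
  end

  def get_role(key)
    @roles[key]
  end

  def remove_role(key)
    @roles.delete(key)
  end
end

# SketchHandler is a hypothetical stand-in for a connection handler.
class SketchHandler
  def initialize
    @managers = {} # spec name => RoleManager
  end

  # The optional `key` argument is what lets one class hold several pools.
  def establish_connection(spec_name, db_config, key = :default)
    manager = (@managers[spec_name] ||= RoleManager.new)
    pool = "pool for #{db_config[:database]}" # placeholder for a real pool object
    manager.set_role(key, Role.new(spec_name, db_config, nil, pool))
    pool
  end

  def retrieve_connection_pool(spec_name, key = :default)
    @managers[spec_name]&.get_role(key)&.pool
  end

  def connected?(spec_name, key = :default)
    !retrieve_connection_pool(spec_name, key).nil?
  end

  # Returns the removed Role, or nil if nothing was registered under the key.
  def remove_connection(spec_name, key = :default)
    @managers[spec_name]&.remove_role(key)
  end
end

handler = SketchHandler.new
# Existing callers keep the one-pool-per-class behavior by omitting the key.
handler.establish_connection("primary", { database: "app" })
# A sharded app registers a second pool under the same specification name.
handler.establish_connection("primary", { database: "app_shard_one" }, :shard_one)
```

Callers that never pass a key behave as before, which is the backwards-compatibility property the commit message is after.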
I added a new commit that changes
I've been working with connection management in Rails for over a year. It's basically all I do these days (in Rails and in GitHub).
IMO the current system does make sense and is logical because it's easy to follow. This isn't more backwards, it's just a different way of doing it than you would prefer (and that's ok!).
I don't see a future in which we can rewrite connection management without breaking apps - and I also don't agree the concepts are backwards. We can incrementally solve the real problems with conn management without rewriting it.
It's important to me that we don't break the existing behavior of apps and libraries. For a long time apps have been rolling their own multiple-databases and we should work within the existing system rather than rewrite it and risk breaking those apps.
No, it's not just a matter of personal preference or habit; this has actual repercussions on how that data structure is accessed.
When I say backwards I don't mean it in its pejorative sense, but in the sense of it being "upside down".
Like if you'd store a
Same here, if I want to grab the
But enough beating of a dead horse, I already agreed this was the only way forward for 6.1.
This commit renames `RoleManager` -> `PoolManager` and `Role` -> `PoolConfig`.

Once we introduced the previous commit, and looking at the existing code, it's clearer that `Role` and `RoleManager` are not the right names for these. Since this PR moves away from swapping the connection handler concepts around and the role concept will continue existing on the handler level, we need to rename this.

A `PoolConfig` holds a `connection_specification_name` (we may rename this down the road), a `db_config`, a `schema_cache`, and a `pool`. It does feel like `pool` could eventually hold all of these things instead of having a `PoolConfig` object. This would remove one level of the object graph and reduce complexity. For now I'm leaving this object to keep the change churn low and will revisit later.

Co-authored-by: John Crepezzi <firstname.lastname@example.org>
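To make the resulting object graph easier to picture, here is a small plain-Ruby sketch of the lookup path after the rename, with the role staying at the handler level and the `PoolManager` sitting between the handler and each `PoolConfig`. The nested hash standing in for the handlers, the `set_pool_config`/`get_pool_config` method names, the `lookup_pool` helper, and the symbols used in place of real pools are all assumptions for illustration, not the Rails source.

```ruby
# Illustrative sketch only; not the actual Rails source.
PoolConfig = Struct.new(:connection_specification_name, :db_config, :schema_cache, :pool)

# PoolManager maps a key (for example :default or a shard name) to a PoolConfig.
# Method names are hypothetical.
class PoolManager
  def initialize
    @pool_configs = {}
  end

  def set_pool_config(key, pool_config)
    @pool_configs[key] = pool_config
  end

  def get_pool_config(key)
    @pool_configs[key]
  end
end

# The role (:writing / :reading) stays at the handler level.
# Here each handler is reduced to a hash of spec name => PoolManager.
handlers = {
  writing: { "primary" => PoolManager.new },
  reading: { "primary" => PoolManager.new }
}

handlers[:writing]["primary"].set_pool_config(
  :default, PoolConfig.new("primary", { database: "app" }, nil, :writer_pool)
)
handlers[:reading]["primary"].set_pool_config(
  :default, PoolConfig.new("primary", { database: "app_replica" }, nil, :replica_pool)
)

# Walks role -> spec name -> pool manager -> pool config -> pool.
def lookup_pool(handlers, role, spec_name, key = :default)
  handlers.dig(role, spec_name)&.get_pool_config(key)&.pool
end
```

If `pool` eventually absorbed the other three fields, as the commit message suggests it might, the `PoolConfig` step in this path would disappear.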