Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement granular role and shard swapping #40370

Merged

Conversation

eileencodes
Copy link
Member

@eileencodes eileencodes commented Oct 12, 2020

This PR implements the ability for an application to switch a role or shard for a specific owner class rather than for all classes globally. We maintained backwards compatibility for all applications currently using multiple databases by ensuring that ActiveRecord::Base remains global. While many applications don't use the public methods on connection_handler (we do at GitHub but it's not a common pattern) we implemented deprecation warnings since the behavior of what the methods are returning will changed. Most applications can simply set legacy_connection_handling to false and no other changes would be required. We duplicated the tests for connection handling in order to test the legacy handling and new handling for backwards compatibility. This PR does not yet contain documentation. We'll add that once we've had some discussion on the new implementation.

The changes are relatively minimal. The majority of the diff in this PR are updates to the tests as well as deprecation warnings / silencing warnings on framework level code. The new handling uses a single connection handler and stores the writing/reading/other roles in the PoolManager which allows us to switch a single parent class but not all classes. We chose not to implement the ability to switch individual models at this time. Only abstract models can switch connections.

Previously connection handling worked like this diagram below:

89339764-b2a21480-d66c-11ea-9a72-a2a7ad33ba7e

With the changes here the new implementation works like this:

Untitled Diagram-Proposed Sharding


Commit Message

This change allows for a connection to be swapped on role or shard for a
class. Previously calling connected_to would swap all the connections
to a particular role or shard. Granular connection swapping is useful
for swapping one connection to reading while leaving all other
connection classes on writing.

Any method with a public interface change or a method that changed the
return values throws a deprecation warning in order to preserve previous
behavior. These methods are silenced internally so that the framework
does not throw deprecation warnings.

The major changes in this PR are:

  • We introduced a legacy_connection_handling configuration option that
    is set to true by default. It will be set to false for all new
    applications.
  • In the new connection handling there will be one only connection
    handler. Previously there was a connection handler for each role. Now
    the role is stored in the PoolManager. In order to maintain backwards
    compatibility we introduced a LegacyPoolManager to avoid duplicate
    conditionals. See diagram in PR body for changes to connection
    management.
  • connected_to will now use a stacked concurrent map to keep track of
    the connection for each class. For each opened block the class,
    role, and shard will be added to the stack, when the block is exited
    the class, role, shard array will be removed from the stack.
  • With these changes ActiveRecord::Base.connected_to will remain
    global. If called all connections in the block will use the role and
    shard that was switched to. If called with a parent class like
    AnimalsRecord.connected_to only models under AnimalsRecord will be
    switched and everything else will remain the same.

Examples:

Given an application we have a User model that inherits from
ApplicationRecord and a Dog model that inherits from
AnimalsRecord. AnimalsRecord and ApplicationRecord have writing
and reading connections as well as shard default, one, and two.

ActiveRecord::Base.connected_to(role: :reading) do
  User.first # reads from default replica
  Dog.first # reads from default replica

  AnimalsRecord.connected_to(role: :writing, shard: :one) do
    User.first # reads from default replica
    Dog.first # reads from shard one primary
  end

  User.first # reads from default replica
  Dog.first # reads from default replica

  ApplicationRecord.connected_to(role: :writing, shard: :two) do
    User.first # reads from shard two primary
    Dog.first # reads from default replica
  end
end

Things this PR does not solve:

  • Currently there is no API for swapping more than one but not all
    connections. Apps with many primaries may want to swap 3 but not all 10
    connections. We plan to build an API for that in a followup PR.
  • The middleware remains the same and is using the global switching
    methods. Therefore at this time to use this new feature applications
    must manually switch connections. We will also address this in a
    followup PR.
  • The schema_cache is currently on the PoolConfig. We plan on trying
    to move this up to the PoolManager or elsewhere later on so each
    PoolConfig doesn't need to hold a reference to the schema_cache.

Co-authored-by: John Crepezzi john.crepezzi@gmail.com

cc/ @seejohnrun @rafaelfranca @matthewd @tenderlove @jhawthorn @casperisfine

@eileencodes eileencodes force-pushed the granular-role-and-shard-swapping branch from f44e574 to f7c9127 Compare October 12, 2020 13:15
@eileencodes eileencodes added this to the 6.1.0 milestone Oct 12, 2020
@eileencodes eileencodes force-pushed the granular-role-and-shard-swapping branch 2 times, most recently from 1b937b6 to 95e0b0e Compare October 12, 2020 20:01
yield
end
else
connection_handler.while_preventing_writes(prevent_writes) do
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's a bug here. We need to move the write prevention into the shard stack, otherwise we can switch connections but the writes prevention remains global. Alternatively it could be moved back to the connection level.

@ritikesh
Copy link
Contributor

ritikesh commented Oct 13, 2020

@eileencodes Hi, I have a query on how things would look for an application I'm working on. We have 2 types of abstract classes for AR. One is GlobalRecord and the other is ApplicationRecord. GlobalRecord as the name suggests is a global DB base class and it's children will always have at-most 1 primary and 1 replica database connections. ApplicationRecord on the other hand is the features DB and will eventually be sharded horizontally.

We have a multitenant architecture and all data of a particular tenant will always live in a single shard. So I plan on doing a connection switch at middlewares - rack / sidekiq middlewares.

# frozen_string_literal: true
module Middleware
  class Multitenancy
    def initialize(app)
      @app = app
    end

    def call(env)
      shard = Shard.find_by(tenant_id_from_request_or_job_params)

      ApplicationRecord.connected_to(shard: shard.handler, role: :primary) do
        @app.call(env)
      end
    end
  end
end

However, when interacting with any table from the GlobalRecord children within the request-response cycle within the connected_to block, do I still have to wrap it around connection switching blocks unless required? Is there anyway I can tell AR that a particular model is not-sharded? This way, I don't have to sprinkle explicit connection switching blocks everywhere while I'm trying to access Shard or any other similar models.

ApplicationRecord.connected_to(shard: :shard1, role: :primary) do
  User.first # reads from ApplicationRecord shard1 primary
  Shard.first # reads from GlobalRecord default primary
end

Or should I tweak my middleware to always establish a connection to the global shard?

GlobalRecord.connected_to(shard: :default, role: :primary) do
  shard = Shard.find_by(tenant_id_from_request_or_job_params)
  ApplicationRecord.connected_to(shard: shard.handler, role: :primary) do
    @app.call(env)
  end
end

@eileencodes
Copy link
Member Author

Is there anyway I can tell AR that a particular model is not-sharded?

@ritikesh not at this time and this PR isn't going to aim to accomplish that. I'd like to keep commentary on this PR to the problem I'm trying to fix here and review of this code. Thanks!

@ritikesh
Copy link
Contributor

Fair enough @eileencodes. Apologies for spamming.

Copy link
Member

@rafaelfranca rafaelfranca left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code looks good to me. I'm testing this patch in our app as I write this comment.

I only have a little concern about performance since we are running an extra conditional every time we get a connection handler. I'm not sure if that makes a lot of difference, but I'm wondering if we can't push all this conditional code to an object that we cache in the first time we call the first method. But of course, in order to justify this optimization we need to measure to know if there is any impact or not, so I'm not sure if it worth doing it right now.

connection_handler.while_preventing_writes(prevent_writes) do
self.role_and_shard_stack << { role: role, shard: shard, klass: self }
return_value = yield
return_value.load if return_value.is_a? ActiveRecord::Relation
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why the connection handler need to know about relation? Not that I have a better place to put this logic but I'm curious to know why we need to care about this after this new logic.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is needed to get this test to pass https://github.com/rails/rails/blob/8f5f17cb62af2c616e763125547ea82e1c161ac6/activerecord/test/cases/connection_adapters/connection_handlers_multi_db_test.rb#L75-L84. We also use it in swap_connection_handler for the legacy version. It was implemented in #38339. Otherwise the return value of the connected to block won't be loaded. We had a few situations at GitHub where teams though it was loading the relation but it wasn't. This resulted in loading a relation from the primary/writer when the author meant to load from the replica/reader database.

@rafaelfranca
Copy link
Member

I tested in our application that also does heavy usage of the API in the connection handler and after flipping the flag to false all tests passed.

@eileencodes eileencodes force-pushed the granular-role-and-shard-swapping branch 2 times, most recently from 37bf742 to 8f5f17c Compare October 23, 2020 12:33
@eileencodes
Copy link
Member Author

There was a bug with the preventing writes code that's now been fixed. It took awhile to get the tests working because the legacy vs not legacy code was leaking in the tests. This wouldn't happen in a real app because the legacy handling is either on or off. The diff is a bit bigger now to accommodate for the duplicate testing around the 2 types of handling in preventing writes. For these I split the tests out into their own files. I also found other tests I had missed while doing that.

I talked with @rafaelfranca in Basecamp a bit about whether the deprecations were really needed. The deprecations are more like deprecation as documentation since the behavior of the method changes if you're using legacy vs not using legacy. We could drop these but my concern is that they would then silently change behavior without notice. Most apps don't use these methods but a lot of ActiveRecord gems do. For now I'd like to leave them unless they prove problematic.

Later today @seejohnrun and I will look into the potential performance issue now that the prevent writes code is fixed and tested.

@eileencodes eileencodes force-pushed the granular-role-and-shard-swapping branch 3 times, most recently from ca3920e to 51092af Compare October 27, 2020 21:06
@eileencodes
Copy link
Member Author

There was a semi-serious performance issue that we tracked down to be in the deprecations themselves for connection_handlers. We decided to undo this deprecation - once everyone switches over from legacy_connection_handling to the new way it will be safe to deprecate and remove the connection_handlers setter and getter. We tested caching the object and found the difference was negligible, the legacy_connection_handling method isn't expensive (especially compared to the deprecations).

I'll merge tomorrow or early Thursday after doing some additional testing on our app.

@@ -164,6 +164,7 @@ def load_defaults(target_version)

if respond_to?(:active_record)
active_record.has_many_inversing = true
active_record.legacy_connection_handling = false
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there should be an entry in new_framework_defaults_6_1.rb for this setting. It's not obvious from the deprecation warnings where legacy_connection_handling = false should be added, and it's probably a better experience if the user is prompted by the upgrade guide to enable it.

It's also not clear from the name alone what legacy_connection_handling = false does. What do you think of naming it granular_connection_handling = true instead?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's also not clear from the name alone what legacy_connection_handling = false does

I went with legacy_connection_handling because it reads like something you don't want to use anymore, whereas granular_connection_handling sounds like a nice feature to have. Additionally we maintained global ActiveRecord::Base behavior so using granular_connection_handling as the config option reads like we're changing behavior when 99% of apps don't actually use multi-db handling so they don't need to worry about their connections.

think there should be an entry in new_framework_defaults_6_1.rb for this setting.

I'll add it, thanks.

and it's probably a better experience if the user is prompted by the upgrade guide to enable it.

We'll add that when we write the upgrade guide or if there's already a PR I'll add it there. We don't usually update the upgrade guide in master for each feature added.

@eileencodes eileencodes force-pushed the granular-role-and-shard-swapping branch from 51092af to df6a8e0 Compare October 28, 2020 18:58
This change allows for a connection to be swapped on role or shard for a
class. Previously calling `connected_to` would swap all the connections
to a particular role or shard. Granular connection swapping is useful
for swapping one connection to reading while leaving all other
connection classes on writing.

The public methods on connection handler have been updated to behave the
same as they did previously on the different handlers. The difference
however is instead of calling
`ActiveRecord::Base.connection_handlers[:reading].clear_all_connections!`
you now call
`ActiveRecord::Base.connection_handler.clear_all_connections!` which
will clear based on current role set by a `connected_to` block. Outside
the context of a `connected_to` block, `clear_all_connections!` can take
an optional parameter to clear specific connections by role.

The major changes in this PR are:

* We introduced a `legacy_connection_handling` configuration option that
is set to true by default. It will be set to `false` for all new
applications.
* In the new connection handling there will be one only connection
handler. Previously there was a connection handler for each role. Now
the role is stored in the `PoolManager`. In order to maintain backwards
compatibility we introduced a `LegacyPoolManager` to avoid duplicate
conditionals. See diagram in PR body for changes to connection
management.
* `connected_to` will now use a stacked concurrent map to keep track of
the connection for each class. For each opened block the `class`,
`role`, and `shard` will be added to the stack, when the block is exited
the `class`, `role`, `shard` array will be removed from the stack.
* With these changes `ActiveRecord::Base.connected_to` will remain
global. If called all connections in the block will use the `role` and
`shard` that was switched to. If called with a parent class like
`AnimalsRecord.connected_to` only models under `AnimalsRecord` will be
switched and everything else will remain the same.

Examples:

Given an application we have a `User` model that inherits from
`ApplicationRecord` and a `Dog` model that inherits from
`AnimalsRecord`. `AnimalsRecord` and `ApplicationRecord` have writing
and reading connections as well as shard `default`, `one`, and `two`.

```ruby
ActiveRecord::Base.connected_to(role: :reading) do
  User.first # reads from default replica
  Dog.first # reads from default replica

  AnimalsRecord.connected_to(role: :writing, shard: :one) do
    User.first # reads from default replica
    Dog.first # reads from shard one primary
  end

  User.first # reads from default replica
  Dog.first # reads from default replica

  ApplicationRecord.connected_to(role: :writing, shard: :two) do
    User.first # reads from shard two primary
    Dog.first # reads from default replica
  end
end
```

Things this PR does not solve:

* Currently there is no API for swapping more than one but not all
connections. Apps with many primaries may want to swap 3 but not all 10
connections. We plan to build an API for that in a followup PR.
* The middleware remains the same and is using the global switching
methods. Therefore at this time to use this new feature applications
must manually switch connections. We will also address this in a
followup PR.
* The `schema_cache` is currently on the `PoolConfig`. We plan on trying
to move this up to the `PoolManager` or elsewhere later on so each
`PoolConfig` doesn't need to hold a reference to the `schema_cache`.

Co-authored-by: John Crepezzi <john.crepezzi@gmail.com>
@eileencodes eileencodes force-pushed the granular-role-and-shard-swapping branch from df6a8e0 to 31461d8 Compare October 28, 2020 19:18
@eileencodes
Copy link
Member Author

Spent some time testing this today on our app and found a couple of small issues. Mainly the deprecations around the methods on connection handler were incorrect - we have some code in fixtures that expects to not call clear_all_connections on all types of connections. After looking at it more closely I realized that changing the behavior of these methods wasn't required to implement this so in the interest of a smaller PR and a faster PR (deprecations can slow down code quite significantly) I've removed all the deprecations and updated the methods to use current_role instead of requiring passing a role in. This will also mean that these methods behave as expected in a connected_to block - if you're in reading all the methods should respond as such. I will merge this once the build is passing.

@eileencodes eileencodes force-pushed the granular-role-and-shard-swapping branch 2 times, most recently from 2a30666 to 69a5fa9 Compare October 28, 2020 22:29
This is a separate commit because I want it to be easy to revert if we
change our minds. After some discussion I think it is confusing that you
could swap shard but not role granularly in legacy mode. This change
forces users to always either have global swapping until moved off
legacy mode. This will prevent a situation where `AnimalsBase` can
change the shard granularly but the role globally.
@eileencodes eileencodes force-pushed the granular-role-and-shard-swapping branch from 69a5fa9 to 6b110d7 Compare October 29, 2020 12:58
@eileencodes eileencodes merged commit 60d5928 into rails:master Oct 29, 2020
@eileencodes eileencodes deleted the granular-role-and-shard-swapping branch October 29, 2020 13:09
eileencodes added a commit to eileencodes/rails that referenced this pull request Oct 29, 2020
eileencodes added a commit to eileencodes/rails that referenced this pull request Nov 2, 2020
Now that we have implemented granular connection swapping in rails#40370
we need a new API that will allow connections to multiple databases. The
reason we need this API is it will prevent deep nesting in cases where
we know that we want 3 of our 5 databases to connect to reading and
leave the rest on writing.

With this API, instead of writing:

```ruby
AnimalsRecord.connected_to(role: :reading) do
  MealsRecord.connected_to(role: :reading) do
    Dog.first # read from animals replica
    Dinner.first # read from meals replica
    Person.first # read from primary writer
  end
end
```

This API would allow you to write:

```ruby
ActiveRecord::Base.connected_to_many([AnimalsRecord, MealsRecord], role: :reading) do
  Dog.first # read from animals replica
  Dinner.first # read from meals replica
  Person.first # read from primary writer
end
```

This would come in especially handy for deeper nesting past 2 databases.

Co-authored-by: John Crepezzi <john.crepezzi@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants