For fixtures share the connection pool when there are multiple handlers #34773

@eileencodes
Member

commented Dec 21, 2018

This PR "fixes" the fixtures so that when there are 2 connections (primary & replica) to a database, the tests can see data inserted on the primary connection during the test run. We can't actually use replica connections in tests because Rails keeps a transaction open in order to roll back at the end of each test. I've explained this in detail below.

This is kind of hacky but I'm not sure there's a better way to make sure the replica connections can see data inserted on the primary connections.

cc/ @matthewd @tenderlove @rafaelfranca


In an application that has a primary and replica database the data
inserted on the primary connection will not be able to be read by the
replica connection.

In a test like this:

test "creating a home and then reading it" do
  home = Home.create!(owner: "eileencodes")

  ActiveRecord::Base.connected_to(role: :default) do
    assert_equal 3, Home.count
  end

  ActiveRecord::Base.connected_to(role: :readonly) do
    assert_equal 3, Home.count
  end
end

The home inserted at the beginning of the test can't be read by the
replica database because when the test is started a transaction is
opened by setup_fixtures. That transaction remains open for the
remainder of the test until we are done and run teardown_fixtures.

Because the data isn't actually committed to the database the replica
database cannot see the data insertion.
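
As a rough illustration of why the replica sees nothing, here is a toy model of transactional visibility. `ToyDatabase` and `ToyConnection` are hypothetical names made up for this sketch, not Active Record classes, and the model ignores real isolation levels: a connection simply sees committed rows plus its own uncommitted rows.

```ruby
# Toy model only: these classes are hypothetical, not part of Rails.
class ToyDatabase
  attr_reader :committed_rows

  def initialize
    @committed_rows = []
  end
end

class ToyConnection
  def initialize(database)
    @database = database
    @pending_rows = nil # nil means no transaction is open
  end

  def begin_transaction
    @pending_rows = []
  end

  def insert(row)
    if @pending_rows
      @pending_rows << row # uncommitted: visible to this connection only
    else
      @database.committed_rows << row
    end
  end

  # Discards everything written since begin_transaction.
  def rollback
    @pending_rows = nil
  end

  # A connection sees committed rows plus its own uncommitted rows.
  def count
    @database.committed_rows.size + (@pending_rows ? @pending_rows.size : 0)
  end
end

database = ToyDatabase.new
primary  = ToyConnection.new(database)
replica  = ToyConnection.new(database)

primary.begin_transaction # roughly what setup_fixtures does
primary.insert(owner: "eileencodes")

puts primary.count # the primary sees its own uncommitted row
puts replica.count # the replica sees only committed rows

primary.rollback # roughly what teardown_fixtures does
```

In this model the only way for the second connection to see the row is for both roles to go through the same connection, which is the idea behind sharing the pool.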

I considered a couple of ways to fix this. I could have written a
database-cleaner-like class that would allow the data to be committed
and then clean up that data afterwards. But database cleaners can make
the database slow and the point of the fixtures is to be fast.

In GitHub we solve this by sharing the connection pool for the replicas
with the primary (writing) connection. This is a bit hacky but it works.
Additionally since we define replica? || preventing_writes? as the
code that blocks writes to the database this will still prevent writing
on the replica / readonly connection. So we get all the behavior of
multiple connections for the same database without slowing down the
database.
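
A minimal sketch of how a `replica? || preventing_writes?` guard can keep blocking writes even when the reading role reuses the primary's connection. Everything here is hypothetical (`GuardedConnection`, `ToyReadOnlyError`, `with_reading_role`), and the assumption that switching to the reading role turns on write prevention is mine, not something this PR states.

```ruby
# Hypothetical stand-ins; not the Active Record implementation.
class ToyReadOnlyError < StandardError; end

class GuardedConnection
  attr_writer :preventing_writes

  def initialize(replica: false)
    @replica = replica
    @preventing_writes = false
  end

  def replica?
    @replica
  end

  def preventing_writes?
    @preventing_writes
  end

  # Writes are refused if either condition holds.
  def execute_write(sql)
    if replica? || preventing_writes?
      raise ToyReadOnlyError, "write blocked: #{sql}"
    end
    :written
  end
end

# Assumption: entering the reading role enables write prevention
# for the duration of the block.
def with_reading_role(connection)
  connection.preventing_writes = true
  yield connection
ensure
  connection.preventing_writes = false
end

shared = GuardedConnection.new # the primary's connection, reused for reads
shared.execute_write("INSERT ...") # allowed outside the reading role

blocked = begin
  with_reading_role(shared) { |conn| conn.execute_write("INSERT ...") }
  false
rescue ToyReadOnlyError
  true
end
puts blocked
```

The point of the sketch is that the guard does not depend on which physical connection is used, so sharing the pool does not by itself open the readonly role to writes.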

In this PR the code loops through the handlers. If the handler doesn't
match the default handler then it retrieves the connection pool from the
default / writing handler and assigns the reading handler's connections
to that pool.

Then in enlist_fixture_connections it maps all the connections for the
default handler because all the connections are now available on that
handler so we don't need to loop through them again.
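
The handler loop described above can be sketched with plain Ruby objects. `ToyPool`, `ToyHandler` and `share_writing_pools` are hypothetical names for this illustration; only the shape of the loop (look up the writing pool by specification name and assign it on the other handler) follows the PR. The `if writing_pool` guard is an addition for the sketch so that pools without a matching name are left alone.

```ruby
# Hypothetical stand-ins for connection pools and handlers.
ToyPool = Struct.new(:spec_name, :role)

class ToyHandler
  attr_reader :owner_to_pool

  def initialize(pools)
    # Map each specification name to its pool.
    @owner_to_pool = pools.map { |pool| [pool.spec_name, pool] }.to_h
  end

  def connection_pool_list
    owner_to_pool.values
  end

  def retrieve_connection_pool(name)
    owner_to_pool[name]
  end
end

# Point every non-writing handler at the writing handler's pool
# for each specification name the two have in common.
def share_writing_pools(handlers, writing_role)
  writing_handler = handlers.fetch(writing_role)

  handlers.each do |role, handler|
    next if role == writing_role

    handler.connection_pool_list.each do |pool|
      name = pool.spec_name
      writing_pool = writing_handler.retrieve_connection_pool(name)
      handler.owner_to_pool[name] = writing_pool if writing_pool
    end
  end
end

handlers = {
  writing: ToyHandler.new([ToyPool.new("primary", :writing)]),
  reading: ToyHandler.new([ToyPool.new("primary", :reading)])
}

share_writing_pools(handlers, :writing)

reading_pool = handlers[:reading].retrieve_connection_pool("primary")
puts reading_pool.role # the reading handler now hands out the writing pool
```

After the swap, enlisting the writing handler's connections in the fixture transaction is enough, since the other handlers no longer hold pools of their own for those names.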

The test uses a temporary connection pool so we can test this with
sqlite3_mem. This adapter doesn't behave the same as the others and
after looking over how the query cache test works I think this is the
most correct. The issue comes when calling connects_to because that
establishes new connections and confuses the sqlite3_mem adapter. I'm
not entirely sure why but I wanted to make sure we tested all adapters
for this change and I checked that it wasn't the shared connection code
that was causing issues - it was the connects_to code.

@eileencodes eileencodes added this to the 6.0.0 milestone Dec 21, 2018

@eileencodes eileencodes changed the title Share fixture connections with multiple handlers Share the connection pool when there are multiple handlers Dec 21, 2018

@eileencodes eileencodes changed the title Share the connection pool when there are multiple handlers For fixtures share the connection pool when there are multiple handlers Dec 21, 2018

@eileencodes eileencodes force-pushed the eileencodes:share-fixture-connections-with-multiple-handlers branch 2 times, most recently from 5a85e9e to e104906 Jan 3, 2019

@eileencodes eileencodes force-pushed the eileencodes:share-fixture-connections-with-multiple-handlers branch from e104906 to b24bfcc Jan 3, 2019

@eileencodes

Member Author

commented Jan 3, 2019

I fixed the sqlite memory tests by creating a temporary connection pool. This isn't necessary for the other adapters but the sqlite memory ones break if I create new connection handlers and connections while running. I didn't want to skip sqlite memory for this because it's kind of hacky and I'd like all adapters to be testing it. The temp connection pool is the same method the query cache tests use.

@eileencodes eileencodes merged commit 725c642 into rails:master Jan 4, 2019

2 checks passed

codeclimate All good!
continuous-integration/travis-ci/pr The Travis CI build passed

@eileencodes eileencodes deleted the eileencodes:share-fixture-connections-with-multiple-handlers branch Jan 4, 2019

@rafaelfranca

Member

commented Jan 4, 2019

How does this work in a multi-tenant environment? The two connections need to have different data. In tenant 1, user A should exist, but in tenant 2, only user B should exist.

Does the solution here make it harder to implement a multi-tenant approach to fixtures like #34246?

@eileencodes

Member Author

commented Jan 4, 2019

A multi-tenant environment would have different specification names for tenant 1 and tenant 2 right? This only ties a primary and replica together if they have the same connection specification name.

So a case where we need multiple directories for fixtures is still broken but this at least fixes the fact that a replica can't see a primary in tests.

@rafaelfranca

Member

commented Jan 4, 2019

Thanks, that makes sense. In the multi-tenant case we would have multiple primaries, so we still need to solve that problem. I'll see with @gmcgibbon what is missing now that the API is defined.

handler.connection_pool_list.each do |pool|
  name = pool.spec.name
  writing_connection = writing_handler.retrieve_connection_pool(name)
  handler.send(:owner_to_pool)[name] = writing_connection

@kamipo

kamipo Jan 30, 2019

Member

Why do we need to share the writing connection pool for all handlers?

I found this change while creating the example at #35073 (comment). Since TestFixtures is included in ActiveSupport::TestCase, the pools are overwritten by the writing connection pool in before_setup, so it is hard to test replica connections in an app.

ActiveSupport.on_load(:active_support_test_case) do
  include ActiveRecord::TestDatabases
  include ActiveRecord::TestFixtures
  self.fixture_path = "#{Rails.root}/test/fixtures/"
  self.file_fixture_path = fixture_path + "files"
end

Unless the effect of setup_shared_connection_pool is suppressed, the following test won't pass.

diff --git a/test/test_helper.rb b/test/test_helper.rb
index 0ff12e7..5d82a8b 100644
--- a/test/test_helper.rb
+++ b/test/test_helper.rb
@@ -10,4 +10,11 @@ class ActiveSupport::TestCase
   fixtures :all
 
   # Add more helper methods to be used by all tests here...
+
+  # `enlist_fixture_connections` replaces connection pools in non-default handlers
+  # by default writer connection pool.
+  # We can't test `:reading` connection unless suppressing the effect of the method for now.
+  def enlist_fixture_connections
+    []
+  end
 end
diff --git a/test/models/post_test.rb b/test/models/post_test.rb
index 6d9d463..f2523ce 100644
--- a/test/models/post_test.rb
+++ b/test/models/post_test.rb
@@ -1,6 +1,11 @@
 require 'test_helper'
 
 class PostTest < ActiveSupport::TestCase
+  test "replica?" do
+    pool = Post.connection_handlers[:reading].connection_pool_list.first
+    assert pool.connection.replica?
+  end
+
   # test "the truth" do
   #   assert true
   # end

@matthewd

matthewd Jan 30, 2019

Member

If we don't combine them, then one connection will be inside the fixture transaction and see the fixtures (& any changes to them), and the other will see empty tables.

@kamipo

kamipo Jan 30, 2019

Member

I see... so for now probably people need to restore (re-overwrite) pools in handlers if their app uses read-write-splitting or sharding and want to test that explicitly.

@matthewd

matthewd Jan 30, 2019

Member

I haven't checked, but I think we should only do this when transactional fixtures are enabled -- so the right way to properly test splitting would be to disable that. Then your writes will really hit your database, and the reads in the other connection will be able to see them.

@kamipo

kamipo Jan 30, 2019

Member

Yeah, that approach sounds good to me.
