Rails 6 RC2 multi-db issue under multithreaded environment #36830

kiyot · 2019-08-01T06:13:53Z

When you use ActiveRecord::Middleware::DatabaseSelector with multithreaded puma, write access to db randomly fails with ActiveRecord::ReadOnlyError: Write query attempted while in readonly mode.

ActiveRecord::Middleware::DatabaseSelector has a mechanism to redirect read access to primary db for a moment. During that period, a flag named @prevent_writes on ActiveRecord::ConnectionAdapters::ConnectionHandler objects is set, and write access to db will raise ActiveRecord::ReadOnlyError.

However, ActiveRecord::ConnectionAdapters::ConnectionHandler objects are shared across threads. Changing @prevent_writes in one thread will cause unexpected effects on requests handled by the other threads.

Steps to reproduce

Clone this repo, and follow the instructions.
https://github.com/kiyot/rails6-multi-db-prevent-write-race

Expected behavior

Write access to db in writing role should succeed.

Actual behavior

Write access to db in writing role randomly fails with ActiveRecord::ReadOnlyError: Write query attempted while in readonly mode under multithreaded environment.

System configuration

Rails version:
6.0.0 RC2

Ruby version:
2.6.3

The text was updated successfully, but these errors were encountered:

Previously if an app attempts to do a write inside a read request it will be impossilbe to switch back to writing to the primary. This PR adds an argument to the `while_preventing_writes` so that we can make sure to turn it off if we're doing a write on a primary. Fixes rails#36830 Co-authored-by: John Crepezzi <john.crepezzi@gmail.com>

kiyot · 2019-08-03T11:51:00Z

@eileencodes Thank you for the fix. But #36843 does not solve the problem completely, and there are still other error cases.

I have created a new repo for reproduction
https://github.com/kiyot/rails6-multi-db-prevent-write-race-2

Suggestion

I think the root problem is that @prevent_writes flag is shared across multiple threads and requests.
This flag should be local in each request, and should be moved to somewhere only one request can access. I guess ActiveRecord::Middleware::DatabaseSelector::Resolver would be a good candidate to have @prevent_writes flag.

eileencodes · 2019-08-05T13:31:04Z

I guess ActiveRecord::Middleware::DatabaseSelector::Resolver would be a good candidate to have @prevent_writes flag.

That won't work since it's not meant to be used only in the middleware.

I'll work on a solution today.

As demonstrated in the test added and in rails#36830 the code that prevents writes wasn't thread safe. If one thread does a read, then another does a write, and then another does a read the second read will cause the first write to be unwriteable. This change removes the instance variable and instead uses a getter/setter on Thread.current[:prevent_writes] for the connection handler to set whether writes are allowed. Fixes rails#36830

jsierles · 2019-08-12T18:19:10Z

We have been testing #36868 with Puma 4, and still are seeing writes going to the replica. In our case, we're using an around_filter in a specific controller - the only one that needs to use the replica. The contents of the filter:

ActiveRecord::Base.connected_to(database: :production_replica) do
      yield
    end

We'll dig in more to see what's happening. But I thought we should mention it here in case this leads to any clues.

kiyot · 2019-08-20T07:56:44Z

@jsierles I encountered the same issue when I use connected_to directly.

According to f2de448, it looks like you should use ActiveRecord::Base.connection_handler.while_preventing_writes(false), too.

eileencodes · 2019-08-20T12:08:08Z

We need more information from you @jsierles before I can determine there's an issue here. For one are you using the middleware? If you're using the middleware you shouldn't use connected_to an around filter to also send traffic to specific databases.

I'd need to see your code to be sure but it sounds like more is going on here. You shouldn't need to set while_preventing_writes directly when using the replica, but since you're using an around_filter more could be going on there.

To dig in more I need an app and reproduction steps.

krames · 2019-08-21T14:06:28Z

@eileencodes I investigated this issue with @jsierles. From my initial debugging it appears that the swap_connection_handler in connection_handling.rb is not swapping back to the original connection. This issue does not appear related to your thread safety write patch.

We ended up fixing the issue by using the connects_to in our ApplicationRecord class and accessing connections using role based access.

class ApplicationRecord < ActiveRecord::Base
  self.abstract_class = true
 
  connects_to database: { writing: :primary, reading: :primary_replica }
end

Unfortunately, I haven't been able to find time to hunt down the root cause, but hopefully this comment helps someone.

eileencodes · 2019-08-21T14:32:27Z

The provided example is the right way - I'm curious how you were connecting/switching connections before that change?

krames · 2019-08-21T15:30:11Z

@eileencodes we had a controller with the following code

class ReporterController
  around_action :read_from_replica, unless: -> { ENV["SKIP_DB_REPLICA"] }
  
  def read_from_replica
    ActiveRecord::Base.connected_to(database: :production_replica) do
      yield
    end
  end

  def trip
    # controller logic here
  end
end

In our database.yml we had the following:

  primary:
	    <<: *default
	    url: <%= (ENV['DATABASE_CONNECTION_POOL_URL'] || ENV['DATABASE_URL']).sub(/^postgres/, "postgis") %>
	  primary_replica:
	    <<: *default
	    url: <%= (ENV['REPLICA_CONNECTION_POOL_URL'] || ENV['REPLICA_DATABASE_URL'] || ENV['DATABASE_URL'] || "").sub(/^postgres/, "postgis") %>
	    replica: true

The only place we used the replica was in the above controller. Everything else was unchanged.

I was able to duplicate this issue on the console on my local machine by point the replica to my test database and executing the following in a console:

ActiveRecord::Base.connected_to(database: :production_replica) do
  User.count
end

That returned a count from the replica.

If I then did subsequently executed the following in the same console, I again got the same count as the replica:

User.count

I had assumed it would use the primary because this line was executed outside of the Base.connected_to block. Executing User.count on a fresh console would return count for the primary.

Is this a bug or a misunderstanding of how multiple databases are suppose to work in Rails 6?

eileencodes · 2019-08-21T15:44:58Z

You don't want to use the following except for one off connections (like in a script to connect to a slow replica or something like that). This will re-establish the connection every time it's called:

    ActiveRecord::Base.connected_to(database: :production_replica) do
      yield
    end

The other problem is since this is a one off and doesn't swap the role you're basically telling AR Base to connect to the production replica. If you need to swap back to the primary you'll need to do a new block.

So using the Role is the right way to use multiple databases in your controller. You also shouldn't need to do an around_filter since the middleware is supposed to choose the right database in your controller requests. Any custom code you need to push a request to a replica or a primary should be added in your app and then initialized with the middleware.

I think there may be some missing docs here that I can add to help make this easier.

krames · 2019-08-21T15:50:38Z

@eileencodes Thanks for the explanation!

kiyot · 2019-08-22T02:16:00Z

@eileencodes

I am using the middleware and it works fine in most cases. But there are a few controllers that writes to a primary in GET requests.

I added connected_to(role: :writing) there, and it turned out to require this workaround. It seems to work, but maybe I'm on the wrong path.

Any custom code you need to push a request to a replica or a primary should be added in your app and then initialized with the middleware.

What is the supposed way to do this? Creating custom resolver class?

noma4i · 2019-08-22T08:37:10Z

In this case connected_to doesn't need to be a block I guess? more likely good idea is to name it connect_to! as a method w/o yield.

jorisvh · 2019-08-26T13:34:42Z

I have the same issue. We use the middleware but some of our controllers also use the impressionist gem, so some GET requests will try to write. This is what we do (at the end of our index or show methods) :

ActiveRecord::Base.connected_to(database: :master) do
      impressionist(...)
end

It will raise the following error maybe once in a thousand times :

Write query attempted while in readonly mode: INSERT INTO "impressions" ("impressionable_type", ...

pinzonjulian · 2019-08-26T16:38:47Z

My team has experienced the same issue and we've found something

We have a GET action that writes a log to the primary DB. When we do a plain old get this works just fine:

ActiveRecord::Base.connected_to(role: :writing) do
  # code that writes to the primary DB
end

However when the journey that leads to that action comes from a redirect from a POST or PATCH action, it breaks.

# ActiveRecord::ReadOnlyError: Write query attempted while in readonly mode

If we add @kiyot 's solution it works:

ActiveRecord::Base.connected_to(role: :writing) do
  ActiveRecord::Base.connection_handler.while_preventing_writes(false) do
    # code that writes to the primary DB
  end
end

@kiyot, @jorisvh , is this your case too?

We created this provisional method in ApplicationRecord that can be easily be found and replaced when there's a solution for this:

def self.provisional_connect_to(role: :writing, while_preventing_writes: false )
    ActiveRecord::Base.connected_to(role: role) do
      ActiveRecord::Base.connection_handler.while_preventing_writes(while_preventing_writes) do
        yield
      end
    end
  end

We'll try to recreate this today in a new app and let you know.

PS: Is this the correct issue to have this discussion in? I'm not sure if this relates to the main theme of this issue.

eileencodes · 2019-08-26T16:49:37Z

AFAICT this is not a bug. If you're using the middleware AND setting your own connected_to manually you'll need to make sure you're turning writes back on when you do the manual switch with connected_to.

If a user is using the middleware for swapping database connections and manually calling `connected_to` in a controller/model/etc without calling `while_preventing_writes(false)` there is potential for a race condition where writes will be blocked. While the user could _just_ call `while_preventing_writes` in the same place they call `connected_to` this would mean that all cases need to call two methods. This PR changes `connected_to` to call `while_preventing_writes` directly. By default we'll assume you don't want to prevent writes, but if called with `connected_to(role: :writing, prevent_writes: true)` or from the middleware (which calls `connected_to` this way) the writes will be blocked. For replicas, apps should use readonly users to enforce not writing rather than `while_preventing_writes` directly. Should fix the remaining issues in rails#36830

vitobotta · 2019-09-04T19:59:48Z

I am trying the read/write splitting and am having this problem with Devise (see this), because it makes changes to the database even with some GET requests. I tried @kiyot 's change in an around filter and the tests pass/things seem to work, but @eileencodes do we still need to do something to restore the default behaviour even when using a block like this?

ActiveRecord::Base.connected_to(role: :writing) do
  ActiveRecord::Base.connection_handler.while_preventing_writes(false) do
    # code that writes to the primary DB
  end
end

eileencodes · 2019-09-04T20:06:03Z

@vitobotta I don't understand the question. Are you asking if you need to use while_preventing_writes? Or are you asking if there's anything in addition to while_preventing_writes?

I fixed Rails so you don't need to do that in #37065 but it's not in a released version yet.

vitobotta · 2019-09-04T20:42:46Z

Hi @eileencodes, basically I would like to know if it's safe to use that code snippet for now, until 6.0.1 is released with the fix you mentioned. Are there any negative consequences in using that code? Thanks!

rept · 2020-11-10T15:39:18Z

I'm still confused.

In this case connected_to doesn't need to be a block I guess? more likely good idea is to name it connect_to! as a method w/o yield.

This one got me as well. I have a controller method that lets users download a CSV from our datawarehouse.

This is wrapped in a block like this:

ActiveRecord::Base.connected_to(database: :datawarehouse) do
  result = ActiveRecord::Base.connection.execute(
      %q(
      select dc.device_id, pp.x, pp.y, pp.confidence, semantic_map, internal_name as corrected_person_position_type_id, case when vs.id is null then false else true end as validation_set
      ...
    ))
end

The file downloads fine, but subsequent actions are going to the datawarehouse DB instead of primary. I would expect that once out the block it would return to the primary, otherwise what's the use of the do block ?

@eileencodes it this fixed in Rails 6.1 (or am I using it wrong?)

BTW I'm using Puma.

eileencodes · 2020-11-10T16:37:50Z

I completely deprecated the database kwarg on connected_to in 6.1 because it's dangerous to use in requests and that's where everyone was using it. It's a feature that folks were trying to use for sharding and it does not work for that.

If you want to use sharding, you should use the shard key in connected_to on 6.1. This is all documented in the guides. The database kwarg will be removed in 6.2 with no replacement, I don't recommend using it at all.

eileencodes · 2020-11-10T16:47:15Z

To explain more clearly: the problem with using the database kwarg is it's establishing connections on the fly which will replace the existing connections with the same connection_specification_name. In 6.0 there was no way to establish more than one connection per handler per class. That's fixed in 6.1 (if you're using sharding), but not for the database kwarg since it really doesn't do what anyone wants it to do. It was supposed to be for one off connections in a script outside your app (ie for connection to a replica that allows slow requests). If you use it in a request it will replace your connection for "ActiveRecord::Base" / "primary" connection_specification_name which is why you're not returning to the primary at the end of the block - it no longer exists in the pool.

rept · 2020-11-10T18:19:13Z

To explain more clearly: the problem with using the database kwarg is it's establishing connections on the fly which will replace the existing connections with the same connection_specification_name. In 6.0 there was no way to establish more than one connection per handler per class. That's fixed in 6.1 (if you're using sharding)

Ok thanks for the explanation. Good thing it was removed then. So if I get it correctly I would be able to do this like this in the controller (after I defined the shard in database.yml of course):

ActiveRecord::Base.connected_to(role: :reading, shard: :dwh) do
  result = ActiveRecord::Base.connection.execute('some complex query')
end

eileencodes self-assigned this Aug 1, 2019

eileencodes added this to the 6.0.0 milestone Aug 1, 2019

eileencodes added the activerecord label Aug 1, 2019

eileencodes mentioned this issue Aug 2, 2019

Add ability to unset preventing writes #36843

Merged

eileencodes closed this as completed in #36843 Aug 2, 2019

eileencodes reopened this Aug 5, 2019

eileencodes mentioned this issue Aug 6, 2019

Fix thread safety of prevent_writes #36868

Merged

eileencodes closed this as completed in #36868 Aug 6, 2019

eileencodes mentioned this issue Aug 28, 2019

Call while_preventing_writes from connected_to #37065

Merged

cestbalez mentioned this issue Sep 9, 2019

Write queries in Sidekiq jobs/workers fail randomly in Rails 6 when using multiple databases sidekiq/sidekiq#4251

Closed

trevorrjohn mentioned this issue Sep 9, 2019

ActiveRecord::Base.connected_to(database: :replica) { } causes connections to replica outside of block #37158

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Rails 6 RC2 multi-db issue under multithreaded environment #36830

Rails 6 RC2 multi-db issue under multithreaded environment #36830

kiyot commented Aug 1, 2019

kiyot commented Aug 3, 2019

eileencodes commented Aug 5, 2019

jsierles commented Aug 12, 2019

kiyot commented Aug 20, 2019

eileencodes commented Aug 20, 2019

krames commented Aug 21, 2019

eileencodes commented Aug 21, 2019

krames commented Aug 21, 2019

eileencodes commented Aug 21, 2019 •

edited

krames commented Aug 21, 2019

kiyot commented Aug 22, 2019

noma4i commented Aug 22, 2019

jorisvh commented Aug 26, 2019

pinzonjulian commented Aug 26, 2019 •

edited

eileencodes commented Aug 26, 2019 •

edited

vitobotta commented Sep 4, 2019

eileencodes commented Sep 4, 2019

vitobotta commented Sep 4, 2019

rept commented Nov 10, 2020

eileencodes commented Nov 10, 2020 •

edited

eileencodes commented Nov 10, 2020

rept commented Nov 10, 2020

Rails 6 RC2 multi-db issue under multithreaded environment #36830

Rails 6 RC2 multi-db issue under multithreaded environment #36830

Comments

kiyot commented Aug 1, 2019

Steps to reproduce

Expected behavior

Actual behavior

System configuration

kiyot commented Aug 3, 2019

Suggestion

eileencodes commented Aug 5, 2019

jsierles commented Aug 12, 2019

kiyot commented Aug 20, 2019

eileencodes commented Aug 20, 2019

krames commented Aug 21, 2019

eileencodes commented Aug 21, 2019

krames commented Aug 21, 2019

eileencodes commented Aug 21, 2019 • edited

krames commented Aug 21, 2019

kiyot commented Aug 22, 2019

noma4i commented Aug 22, 2019

jorisvh commented Aug 26, 2019

pinzonjulian commented Aug 26, 2019 • edited

eileencodes commented Aug 26, 2019 • edited

vitobotta commented Sep 4, 2019

eileencodes commented Sep 4, 2019

vitobotta commented Sep 4, 2019

rept commented Nov 10, 2020

eileencodes commented Nov 10, 2020 • edited

eileencodes commented Nov 10, 2020

rept commented Nov 10, 2020

eileencodes commented Aug 21, 2019 •

edited

pinzonjulian commented Aug 26, 2019 •

edited

eileencodes commented Aug 26, 2019 •

edited

eileencodes commented Nov 10, 2020 •

edited