-
Notifications
You must be signed in to change notification settings - Fork 21.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Defer verification of database connections #44576
Conversation
43e448f
to
80c7ca8
Compare
4a45007
to
e02e06e
Compare
Still draft because tests are very much needed, but I think this is at least in a state for us to decide whether we're comfortable with the API changes I'm making, etc. cc @yahonda as I can only request reviews from committers |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some minor "tactical" comments.
The overall approach seem good to me so far.
activerecord/lib/active_record/connection_adapters/postgresql/quoting.rb
Show resolved
Hide resolved
activerecord/lib/active_record/connection_adapters/postgresql_adapter.rb
Show resolved
Hide resolved
910340d
to
8ff11be
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Left some comments, but this mostly looks good and correct to me. I am wondering if some of the refactoring changes can be pulled out into separate PRs so it's easier to read some of the bigger changes.
@@ -145,6 +146,10 @@ def use_metadata_table? | |||
@config.fetch(:use_metadata_table, true) | |||
end | |||
|
|||
def connection_retries | |||
(@config[:connection_retries] || 3).to_i |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is 3 a number you chose or was it previously a default somewhere? If previously retries were 0, is it surprising behavior or potentially bad to retry if that's unexpected?
My main concern is that this is defaulting to 3 and if someone wants to turn it off they need to know it exists AND that it's set to 3 by default.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Arbitrary choice.
On reflection I think I'll make it lower (1, maybe 2), but I do think it should be greater than zero: the whole idea is that this is a new better default behaviour, which is not believed to have any downsides for existing users. (There is no "review your code for X pattern before turning this on" sort of recommendation, so I don't think there's any value in applying it as a versioned default.)
Notably, we only retry on queries we're absolutely sure are safe (like "BEGIN"), or [in future] if we're absolutely sure the query didn't make it to the database server -- we'll never automatically retry something that has any chance of having executed on the server.
The ability to know the option exists and set it to zero is mostly just as a last-ditch safety mechanism, ensuring there is a workaround if some subset of users discover an unanticipated flaw in the logic that claims this is universally safe.
# In the latter case, we really ought to guarantee the inner | ||
# call will not reconnect (which would interfere with the | ||
# still-yielded connection in the outer block), but we currently | ||
# provide no special enforcement there. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This commentary reads like it's unsure what will really happen or that Rails should be doing more and isn't. Can you clarify this paragraph?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, it's an (internal, implementation-scoped) comment noting a limitation of the current implementation. My intention was basically to acknowledge there's an obscure corner case that we're not handling; if a future reader is investigating an issue of that shape [because it's more likely than I expected, or because of other implementation changes], it's a helpful breadcrumb that they're on the right trail. I'll try to rephrase.
else | ||
# We don't know whether the connection is okay, but it also doesn't matter: | ||
# we wouldn't be able to reconnect anyway. We're just going to run our query | ||
# and hope for the best. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you did this so it's easy to comment but having 2 empty conditions feels odd and I think some of the comments aren't necessary. I think this can be simplified to the following:
if reconnectable
unless allow_retry
verify!
end
end
Am I missing something?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, it's verify! unless @verified || !reconnectable || allow_retry
... it just seemed important to me to spell out the three very different reasons we might choose not to verify.
# Returns a raw connection for internal use with methods that are known | ||
# to both be thread-safe and not rely upon actual server communication. | ||
# This is useful for e.g. string escaping methods. | ||
def any_raw_connection |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't love this name because it feels like "get anything, literally any raw connection" but it's a specific raw connection right? I don't have a better idea other than continuing to use @raw_connection
instead.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is partly a bleed-forward from #44591, where it becomes @raw_connection || connect
. Adding it here, with its more defined cousins valid_raw_connection
and with_raw_connection
, helps in auditing existing users of @raw_connection
: most of them should be [and in this PR, are] using with_raw_connection
, but completing the set forces us to review each one, and where appropriate, explicitly opt in to this "I'll take absolutely anything you can offer" behaviour.
Contrasted with "valid", I think "anything, literally any" is a pretty fair representation of how this qualifies what it's returning. "A connection you already know is invalid" is rather extreme... I did briefly consider calling this even_an_invalid_raw_connection
, but that felt a bit too pessimistic. (It's still an instance method, and IMO that scoping is sufficient to convey that it's still only our raw connection, just in any possible state.)
@@ -109,7 +112,7 @@ def default_insert_value(column) | |||
end | |||
|
|||
def last_inserted_id(result) | |||
@raw_connection.last_id | |||
@raw_connection&.last_id |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So before we had a guaranteed raw_connection but now we might not have one? Does this change behavior of last_inserted_id
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is technically a tiny bleed-over of #44591 too -- this change alone doesn't introduce any occasions where @raw_connection
can be nil. This is a notable occasion where we do want to still use the raw ivar, though, and not [the upcoming definition of] any_raw_connection
: if we don't currently have a connection, there's no point establishing one just to ask what it last inserted, because the answer will definitely be nil.
This is not a behaviour change: a connection that hasn't yet performed an insert will have a nil last_id
.
mark_transaction_written_if_write(sql) | ||
|
||
type_casted_binds = type_casted_binds(binds) | ||
|
||
log(sql, name, binds, type_casted_binds, async: async) do | ||
ActiveSupport::Dependencies.interlock.permit_concurrent_loads do | ||
with_raw_connection do |conn| |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like this new method, much easier to understand what's going on in the block.
I think I already pulled out the more cleanly separable changes into preceding PRs (#44526 #44527 #44530 |
8ff11be
to
1c7f5dc
Compare
bc93b3a
to
dfd9f4e
Compare
So far this is just infrastructure: no errors are considered retryable, and no real queries are marked retryable. Even in this state, we can drop connection verification from checkout, and defer it until the first query is ready. But the more interesting within-reach win will be when 'BEGIN' is retryable, and we can recognise connectivity-related exceptions.
We're still ultimately accessing the raw connection directly, but this is necessary to keep the statement pool working across full reconnects.
We may be restoring a connection after #with_raw_connection has already passed #materialize_transactions, so it's important we eagerly return things to exactly the state they had before.
Connectivity issues need a reconnect to be retried, but there are also in-server errors that can also be retried (lock timeouts and deadlocks). To handle this better, we move translation into #with_raw_connection: it's a much better representation of the boundary between the underlying adapter library and Active Record's abstractions anyway. (It doesn't know about the query being run, though, so we let #log enrich that into the exception later.)
Building on the work done in rails#44576 and rails#44591, we extend the logic that automatically reconnects broken db connections to take into account a timeout limit. This ensures that retries + reconnects are slow-query aware, and that we don't retry queries if a given amount of time has already passed since the query was first tried. This value will default to 5 seconds, but can be adjusted via the `connection_retry_timeout` config.
This works with either `adapter.materialize_transactions` before the savepoint creation or by skipping the savepoint entirely and reverting the migration by raising `ActiveRecord::Rollback` Close #179 Ref: rails/rails@57bc28f rails/rails#44576
This works with either `adapter.materialize_transactions` before the savepoint creation or by skipping the savepoint entirely and reverting the migration by raising `ActiveRecord::Rollback` Close #179 Ref: rails/rails@57bc28f rails/rails#44576
This works with either `adapter.materialize_transactions` before the savepoint creation or by skipping the savepoint entirely and reverting the migration by raising `ActiveRecord::Rollback` Close #179 Ref: rails/rails@57bc28f rails/rails#44576
This works with either `adapter.materialize_transactions` before the savepoint creation or by skipping the savepoint entirely and reverting the migration by raising `ActiveRecord::Rollback` Close #179 Ref: rails/rails@57bc28f rails/rails#44576
This works with either `adapter.materialize_transactions` before the savepoint creation or by skipping the savepoint entirely and reverting the migration by raising `ActiveRecord::Rollback` Close #179 Ref: rails/rails@57bc28f rails/rails#44576
This works with either `adapter.materialize_transactions` before the savepoint creation or by skipping the savepoint entirely and reverting the migration by raising `ActiveRecord::Rollback` Close #179 Ref: rails/rails@57bc28f rails/rails#44576
This works with either `adapter.materialize_transactions` before the savepoint creation or by skipping the savepoint entirely and reverting the migration by raising `ActiveRecord::Rollback` Close #179 Ref: rails/rails@57bc28f rails/rails#44576
This works with either `adapter.materialize_transactions` before the savepoint creation or by skipping the savepoint entirely and reverting the migration by raising `ActiveRecord::Rollback` Close #179 Ref: rails/rails@57bc28f rails/rails#44576
thank you! |
We can't call @raw_connection right away anymore, because it is nil until it is used after the changes in rails/rails#44576. Also we need to have @lock established in order to be able to use `with_raw_connection`. In the abstract_adapter implementation, that happens after the arel_visitor is set, but in oracle enhanced, we need to use `with_raw_connection` to check the database version in order to pick with visitor to use. Calling lock_thread = nil before super means @lock isi available to set the arel_visitor. PR#44576 also requires an internal_exec_query method that is called by the abstract adpater's exec_query. Naively renaming the oracle-enhanced's exec_query to internal_exec_query makes the test pass, but there is probably more to be done here.
rails/rails#44576 adds `with_raw_connection`. Using that in database_statements instead of @raw_connection mostly works, except it broke 2 tests: * ["should reconnect and execute query if connection is lost and auto retry is enabled"](https://github.com/rsim/oracle-enhanced/blob/deb214decc3799608c8be386e91c6c7531c59793/spec/active_record/connection_adapters/oracle_enhanced/connection_spec.rb#L493) * ["among other things will reconnect and retry if there is a connection error.](https://github.com/rsim/oracle-enhanced/blob/deb214decc3799608c8be386e91c6c7531c59793/spec/active_record/connection_adapters/oracle_enhanced/connection_spec.rb#L500) `with_raw_connection` has a retry and reconnect built into it, but it was catching the oracle connection error. Adjust `retryable_connection_error?` so that both is aware of the OCI errors and reflects the `should_retry?` logic of the `with_retry` that database_statements used before.
We can't call @raw_connection right away anymore, because it is nil until it is used after the changes in rails/rails#44576. Also we need to have @lock established in order to be able to use `with_raw_connection`. In the abstract_adapter implementation, that happens after the arel_visitor is set, but in oracle enhanced, we need to use `with_raw_connection` to check the database version in order to pick with visitor to use. Calling lock_thread = nil before super means @lock isi available to set the arel_visitor. Rails rails/rails#48188 also requires an internal_exec_query method that is called by the abstract adpater's exec_query. Naively renaming the oracle-enhanced's exec_query to internal_exec_query makes the test run, but there is probably more to be done here.
We can't call @raw_connection right away anymore, because it is nil until it is used after the changes in rails/rails#44576 and rails/rails#44591. Also we need to have @lock established in order to be able to use `with_raw_connection`. In the abstract_adapter implementation, that happens after the arel_visitor is set, but in oracle enhanced, we need to use `with_raw_connection` to check the database version in order to pick with visitor to use. Calling lock_thread = nil before super means @lock isi available to set the arel_visitor. Rails rails/rails#48188 also requires an internal_exec_query method that is called by the abstract adpater's exec_query. Naively renaming the oracle-enhanced's exec_query to internal_exec_query makes the oracle enhanced tests run. I also saw errors due to `active?` using @raw_connection. `valid_raw_connection` ends up calling `active?` via `with_raw_connection` and `verify` so that can't be used as a replacement for @raw_connection here. Borrowing from the postgres adapter implementation to get around this.
We can't call @raw_connection right away anymore, because it is nil until it is used after the changes in rails/rails#44576 and rails/rails#44591. Also we need to have @lock established in order to be able to use `with_raw_connection`. In the abstract_adapter implementation, that happens after the arel_visitor is set, but in oracle enhanced, we need to use `with_raw_connection` to check the database version in order to pick with visitor to use. Calling lock_thread = nil before super means @lock isi available to set the arel_visitor. Rails rails/rails#48229 also requires an internal_exec_query method that is called by the abstract adpater's exec_query. Naively renaming the oracle-enhanced's exec_query to internal_exec_query makes the oracle enhanced tests run. I also saw errors due to `active?` using @raw_connection. `valid_raw_connection` ends up calling `active?` via `with_raw_connection` and `verify` so that can't be used as a replacement for @raw_connection here. Borrowing from the postgres adapter implementation to get around this.
We can't call @raw_connection right away anymore, because it is nil until it is used after the changes in rails/rails#44576. Also we need to have @lock established in order to be able to use `with_raw_connection`. In the abstract_adapter implementation, that happens after the arel_visitor is set, but in oracle enhanced, we need to use `with_raw_connection` to check the database version in order to pick with visitor to use. Calling lock_thread = nil before super means @lock isi available to set the arel_visitor. PR#44576 also requires an internal_exec_query method that is called by the abstract adpater's exec_query. Naively renaming the oracle-enhanced's exec_query to internal_exec_query makes the test pass, but there is probably more to be done here.
rails/rails#44576 adds `with_raw_connection`. Using that in database_statements instead of @raw_connection mostly works, except it broke 2 tests: * ["should reconnect and execute query if connection is lost and auto retry is enabled"](https://github.com/rsim/oracle-enhanced/blob/deb214decc3799608c8be386e91c6c7531c59793/spec/active_record/connection_adapters/oracle_enhanced/connection_spec.rb#L493) * ["among other things will reconnect and retry if there is a connection error.](https://github.com/rsim/oracle-enhanced/blob/deb214decc3799608c8be386e91c6c7531c59793/spec/active_record/connection_adapters/oracle_enhanced/connection_spec.rb#L500) `with_raw_connection` has a retry and reconnect built into it, but it was catching the oracle connection error. Adjust `retryable_connection_error?` so that both is aware of the OCI errors and reflects the `should_retry?` logic of the `with_retry` that database_statements used before.
Instead of verifying upon checkout, wait until we run a query.
If the query we need to run is safely retryable (including schema inspection, and most notably
BEGIN
), we don't need to explicitly verify, and can just recover if the query fails.By extension, we can apply the same connection-recovery logic to every retryable query we run -- so e.g. the beginning of any top-level transaction is a safe transparent DB-reconnection point.