New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Idempotent option for after_commit :destroy callback #27248

Merged
merged 1 commit into from Dec 9, 2016

Conversation

Projects
None yet
8 participants
@stefanmb
Contributor

stefanmb commented Dec 2, 2016

Summary

In SQL a DELETE statement is always idempotent (will always succeed even if the underlying table is only modified once). Consequently the after_commit callbacks on :destroywill be invoked multiple times for a given model, even if a single deletion occurred. This means that user handlers must also be idempotent, which is not always the case (for example, if counting resources).

The issue addressed in this PR is related to #14735 and is fixed in a similar fashion:

  • Introduce an optional :idempotent flag for the after_commit callback.
  • Default to true to preserve existing behaviour.
  • Cache whether the operation affected any rows and provide an actually_destroyed? getter.
  • Demonstrate problem and solution with new unit tests.

We (Shopify) have been running a monkey patched version of this PR since October in production with no ill-effects.

r: @byroot @sgrif
cc: @tenderlove

P.S. This is my first Rails PR, so if there are any silly issues or violated conventions please let me know and I will resolve them as soon as possible. Thank you!

@rails-bot

This comment has been minimized.

Show comment
Hide comment
@rails-bot

rails-bot Dec 2, 2016

Thanks for the pull request, and welcome! The Rails team is excited to review your changes, and you should hear from @sgrif (or someone else) soon.

If any changes to this PR are deemed necessary, please add them as extra commits. This ensures that the reviewer can see what has changed since they last reviewed the code. Due to the way GitHub handles out-of-date commits, this should also make it reasonably obvious what issues have or haven't been addressed. Large or tricky changes may require several passes of review and changes.

This repository is being automatically checked for code quality issues using Code Climate. You can see results for this analysis in the PR status below. Newly introduced issues should be fixed before a Pull Request is considered ready to review.

Please see the contribution instructions for more information.

rails-bot commented Dec 2, 2016

Thanks for the pull request, and welcome! The Rails team is excited to review your changes, and you should hear from @sgrif (or someone else) soon.

If any changes to this PR are deemed necessary, please add them as extra commits. This ensures that the reviewer can see what has changed since they last reviewed the code. Due to the way GitHub handles out-of-date commits, this should also make it reasonably obvious what issues have or haven't been addressed. Large or tricky changes may require several passes of review and changes.

This repository is being automatically checked for code quality issues using Code Climate. You can see results for this analysis in the PR status below. Newly introduced issues should be fixed before a Pull Request is considered ready to review.

Please see the contribution instructions for more information.

@byroot

Looks good to me overall (for what it's worth)

Show outdated Hide outdated activerecord/test/cases/transaction_callbacks_test.rb
assert_equal [:destroy, :destroy], TopicWithIdempotentCallbacksOnDestroy.history
end
def test_helper(klass)

This comment has been minimized.

@byroot

byroot Dec 2, 2016

Member

I would name that differently. Something like: simulate_race_condition or something like that.

@byroot

byroot Dec 2, 2016

Member

I would name that differently. Something like: simulate_race_condition or something like that.

This comment has been minimized.

@stefanmb

stefanmb Dec 2, 2016

Contributor

Oops, you're right. The test_ naming scheme is not good for helper (gets run as a test standalone). I've changed it as you suggested.

@stefanmb

stefanmb Dec 2, 2016

Contributor

Oops, you're right. The test_ naming scheme is not good for helper (gets run as a test standalone). I've changed it as you suggested.

@byroot

This comment has been minimized.

Show comment
Hide comment
@byroot

byroot Dec 2, 2016

Member

Looks like your tests don't pass either.

Member

byroot commented Dec 2, 2016

Looks like your tests don't pass either.

@stefanmb

This comment has been minimized.

Show comment
Hide comment
@stefanmb

stefanmb Dec 2, 2016

Contributor

CI is green, sorry about the test mess. Thanks!

Contributor

stefanmb commented Dec 2, 2016

CI is green, sorry about the test mess. Thanks!

@byroot

This comment has been minimized.

Show comment
Hide comment
@byroot

byroot Dec 2, 2016

Member

👏

Member

byroot commented Dec 2, 2016

👏

@stefanmb

This comment has been minimized.

Show comment
Hide comment
@stefanmb

stefanmb Dec 6, 2016

Contributor

@rafaelfranca Pinging you since you have some context on this issue.

Contributor

stefanmb commented Dec 6, 2016

@rafaelfranca Pinging you since you have some context on this issue.

@rafaelfranca rafaelfranca assigned rafaelfranca and sgrif and unassigned sgrif Dec 6, 2016

@rafaelfranca

This comment has been minimized.

Show comment
Hide comment
@rafaelfranca

rafaelfranca Dec 6, 2016

Member

ACK! I have some things to review now but I put this in the top of my list.

Member

rafaelfranca commented Dec 6, 2016

ACK! I have some things to review now but I put this in the top of my list.

@sgrif

This comment has been minimized.

Show comment
Hide comment
@sgrif

sgrif Dec 6, 2016

Member

I don't think we need to make this an option. after_commit :stuff, on: :destroy running multiple times sounds like a bug, not behavior we need to preserve.

Member

sgrif commented Dec 6, 2016

I don't think we need to make this an option. after_commit :stuff, on: :destroy running multiple times sounds like a bug, not behavior we need to preserve.

@rafaelfranca

This comment has been minimized.

Show comment
Hide comment
@rafaelfranca

rafaelfranca Dec 6, 2016

Member

Ah yeah, we should make this behavior the default with no opt-out as we discussed in the original implementation.

Member

rafaelfranca commented Dec 6, 2016

Ah yeah, we should make this behavior the default with no opt-out as we discussed in the original implementation.

Show outdated Hide outdated activerecord/lib/active_record/persistence.rb
@@ -525,6 +529,12 @@ def touch(*names, time: nil)
end
end
protected
def actually_destroyed?

This comment has been minimized.

@matthewd

matthewd Dec 6, 2016

Member

This looks like it can be private

@matthewd

matthewd Dec 6, 2016

Member

This looks like it can be private

Show outdated Hide outdated activerecord/lib/active_record/transactions.rb
destroyed?
else
actually_destroyed?
end
when :update
!(transaction_record_state(:new_record) || destroyed?)

This comment has been minimized.

@kaspth

kaspth Dec 6, 2016

Member

Why won't this destroyed? need to check actually_destroyed??

@kaspth

kaspth Dec 6, 2016

Member

Why won't this destroyed? need to check actually_destroyed??

Show outdated Hide outdated activerecord/lib/active_record/transactions.rb
@@ -292,9 +292,10 @@ def set_options_for_callbacks!(args, enforced_options = {})
if options[:on]
fire_on = Array(options[:on])
idempotent = options[:idempotent]

This comment has been minimized.

@kaspth

kaspth Dec 6, 2016

Member

Perhaps it's just because I'm not a native english speaker, but idempotent sounds needlessly technical to me. Like we're barfing out SQL intricacies to end users.

It could just be an unfamiliarity with the word, though I've been trying to read it several times and it still doesn't click.

consider_succeeded: true, might be closer to the SQL-forces-us-to-consider-this-final aspect, though still needs finagling.

@kaspth

kaspth Dec 6, 2016

Member

Perhaps it's just because I'm not a native english speaker, but idempotent sounds needlessly technical to me. Like we're barfing out SQL intricacies to end users.

It could just be an unfamiliarity with the word, though I've been trying to read it several times and it still doesn't click.

consider_succeeded: true, might be closer to the SQL-forces-us-to-consider-this-final aspect, though still needs finagling.

Show outdated Hide outdated activerecord/lib/active_record/persistence.rb
protected
def actually_destroyed?
@_actually_destroyed ||= false

This comment has been minimized.

@kaspth

kaspth Dec 6, 2016

Member

It might make sense to clarify @destroyed to e.g. @surreptitiously_destroyed so it doesn't fight for the same meaning as @_actually_destroyed.

@kaspth

kaspth Dec 6, 2016

Member

It might make sense to clarify @destroyed to e.g. @surreptitiously_destroyed so it doesn't fight for the same meaning as @_actually_destroyed.

This comment has been minimized.

@sgrif

sgrif Dec 6, 2016

Member

We're going to get rid of the method and option so it's all good. :)

@sgrif

sgrif Dec 6, 2016

Member

We're going to get rid of the method and option so it's all good. :)

@matthewd

This comment has been minimized.

Show comment
Hide comment
@matthewd

matthewd Dec 6, 2016

Member

@sgrif after_commit :update will still get run even if the update didn't change anything...

Member

matthewd commented Dec 6, 2016

@sgrif after_commit :update will still get run even if the update didn't change anything...

@stefanmb

This comment has been minimized.

Show comment
Hide comment
@stefanmb

stefanmb Dec 7, 2016

Contributor

@rafaelfranca @sgrif I've eliminated the idempotent option and made the new callback behaviour the default, as requested.

Contributor

stefanmb commented Dec 7, 2016

@rafaelfranca @sgrif I've eliminated the idempotent option and made the new callback behaviour the default, as requested.

Show outdated Hide outdated activerecord/lib/active_record/persistence.rb
@@ -181,7 +181,11 @@ def destroy
_raise_readonly_record_error if readonly?
destroy_associations
self.class.connection.add_transaction_record(self)
destroy_row if persisted?
@_actually_destroyed = if persisted?

This comment has been minimized.

@sgrif

sgrif Dec 7, 2016

Member

Can this just be changed to

if persisted && !destroyed
  @destroyed = destroy_row > 0
end

Do we even need to check the return value? Is this really different than destroy_row if persisted && !destroyed?

@sgrif

sgrif Dec 7, 2016

Member

Can this just be changed to

if persisted && !destroyed
  @destroyed = destroy_row > 0
end

Do we even need to check the return value? Is this really different than destroy_row if persisted && !destroyed?

This comment has been minimized.

@stefanmb

stefanmb Dec 7, 2016

Contributor

The issue is somehow flagging whether the SQL DELETE actually affected any rows (so we can trigger the callback then and only then). To do so we need to store the result of destroy_row somewhere. Right now we have:

    def persisted?
      sync_with_transaction_state
      !(@new_record || @destroyed)
    end

So by De Morgan's law your suggestion ends up being:

if persisted?
   @destroyed = destroy_row > 0
end

The problem is that does not work - it will fail the following test: https://github.com/rails/rails/blob/master/activerecord/test/cases/associations/bidirectional_destroy_dependencies_test.rb#L34-L39

@stefanmb

stefanmb Dec 7, 2016

Contributor

The issue is somehow flagging whether the SQL DELETE actually affected any rows (so we can trigger the callback then and only then). To do so we need to store the result of destroy_row somewhere. Right now we have:

    def persisted?
      sync_with_transaction_state
      !(@new_record || @destroyed)
    end

So by De Morgan's law your suggestion ends up being:

if persisted?
   @destroyed = destroy_row > 0
end

The problem is that does not work - it will fail the following test: https://github.com/rails/rails/blob/master/activerecord/test/cases/associations/bidirectional_destroy_dependencies_test.rb#L34-L39

This comment has been minimized.

@sgrif

sgrif Dec 7, 2016

Member

De Morgan's law doesn't apply when one of the expressions have side effects.

@sgrif

sgrif Dec 7, 2016

Member

De Morgan's law doesn't apply when one of the expressions have side effects.

@sgrif

This comment has been minimized.

Show comment
Hide comment
@sgrif

sgrif Dec 7, 2016

Member

I really would prefer a solution that doesn't need this @actually_destroyed flag. It reeks of mysql_real_escape_string to me.

Member

sgrif commented Dec 7, 2016

I really would prefer a solution that doesn't need this @actually_destroyed flag. It reeks of mysql_real_escape_string to me.

@stefanmb

This comment has been minimized.

Show comment
Hide comment
@stefanmb

stefanmb Dec 7, 2016

Contributor

I think the issue here is a breakdown between the ORM model represented by an ActiveRecord object and the underlying SQL data persisting that record.

As shown in the new test case, it's possible to have a race condition between callbacks on two records being destroyed and pointing to the same underlying database row. This situation is something that ActiveRecord seems to allow by design (for example, by permitting one to delete a record, which keeps the Ruby object but not the SQL data).

We use the on: :destroy callbacks for a global count of resources, so we are interested in the underlying SQL data and not (the potentially many) ActiveRecord objects representing it.

Since a record can have destroy called an unlimited number of times we cannot use the @destroyed variable to represent the state of the SQL database. We need some new variable that stores whether or not the current call actually affected any rows.

Perhaps @__actually_destroyed should be called @_rows_deleted?

Contributor

stefanmb commented Dec 7, 2016

I think the issue here is a breakdown between the ORM model represented by an ActiveRecord object and the underlying SQL data persisting that record.

As shown in the new test case, it's possible to have a race condition between callbacks on two records being destroyed and pointing to the same underlying database row. This situation is something that ActiveRecord seems to allow by design (for example, by permitting one to delete a record, which keeps the Ruby object but not the SQL data).

We use the on: :destroy callbacks for a global count of resources, so we are interested in the underlying SQL data and not (the potentially many) ActiveRecord objects representing it.

Since a record can have destroy called an unlimited number of times we cannot use the @destroyed variable to represent the state of the SQL database. We need some new variable that stores whether or not the current call actually affected any rows.

Perhaps @__actually_destroyed should be called @_rows_deleted?

@sgrif

This comment has been minimized.

Show comment
Hide comment
@sgrif

sgrif Dec 7, 2016

Member

I think you're missing the point that I'm getting at. When we know that we have destroyed the record, we don't actually need to try again. I don't see any reason that we need to introduce a new flag for this.

Member

sgrif commented Dec 7, 2016

I think you're missing the point that I'm getting at. When we know that we have destroyed the record, we don't actually need to try again. I don't see any reason that we need to introduce a new flag for this.

@stefanmb

This comment has been minimized.

Show comment
Hide comment
@stefanmb

stefanmb Dec 7, 2016

Contributor

Aren't we then back to the original problem? Each user code call to destroy produces a on: destroy callback to user code (even though the underlying row was deleted only once)?

Is this the desired behaviour in ActiveRecord? I apologize if I'm not following your suggestion.

Contributor

stefanmb commented Dec 7, 2016

Aren't we then back to the original problem? Each user code call to destroy produces a on: destroy callback to user code (even though the underlying row was deleted only once)?

Is this the desired behaviour in ActiveRecord? I apologize if I'm not following your suggestion.

@sgrif

This comment has been minimized.

Show comment
Hide comment
@sgrif

sgrif Dec 7, 2016

Member

No, the same way that we don't attempt when you call .save if .changed? is false, we shouldn't attempt to do anything when you call .destroy if .destroyed? is true.

Member

sgrif commented Dec 7, 2016

No, the same way that we don't attempt when you call .save if .changed? is false, we shouldn't attempt to do anything when you call .destroy if .destroyed? is true.

@stefanmb

This comment has been minimized.

Show comment
Hide comment
@stefanmb

stefanmb Dec 7, 2016

Contributor

I get what you're saying, but this is the sequence I see:

  1. Call destroy: Deletes Rows (first time around)
  2. In transaction_include_any_action: returns destroyed? (== true)
  3. User callback occurs.
  4. Call destroy: No-op (second time around)
  5. In transaction_include_any_action: returns destroyed? (== true) <-- Set during (1) above.
  6. User callback occurs. <-- Bug!
Contributor

stefanmb commented Dec 7, 2016

I get what you're saying, but this is the sequence I see:

  1. Call destroy: Deletes Rows (first time around)
  2. In transaction_include_any_action: returns destroyed? (== true)
  3. User callback occurs.
  4. Call destroy: No-op (second time around)
  5. In transaction_include_any_action: returns destroyed? (== true) <-- Set during (1) above.
  6. User callback occurs. <-- Bug!
@matthewd

This comment has been minimized.

Show comment
Hide comment
@matthewd

matthewd Dec 8, 2016

Member

@sgrif this problem isn't record.destroy; record.destroy -- it's destroy getting called on two different instances that both point to the same DB row, meaning the second one is newly destroyed in AR-land, but it didn't actually kill a row because the row wasn't there.

Member

matthewd commented Dec 8, 2016

@sgrif this problem isn't record.destroy; record.destroy -- it's destroy getting called on two different instances that both point to the same DB row, meaning the second one is newly destroyed in AR-land, but it didn't actually kill a row because the row wasn't there.

@sgrif

This comment has been minimized.

Show comment
Hide comment
@sgrif

sgrif Dec 8, 2016

Member

@matthewd It looks like both are problems

Member

sgrif commented Dec 8, 2016

@matthewd It looks like both are problems

@sgrif

This comment has been minimized.

Show comment
Hide comment
@sgrif

sgrif Dec 8, 2016

Member

Let's rename @_actually_destroyed to something like @_deleted_rows_in_db, and check the ivar directly rather than adding another method onto Base

Member

sgrif commented Dec 8, 2016

Let's rename @_actually_destroyed to something like @_deleted_rows_in_db, and check the ivar directly rather than adding another method onto Base

@sgrif

This comment has been minimized.

Show comment
Hide comment
@sgrif

sgrif Dec 8, 2016

Member

Should we do a similar check on updates? (Much harder race condition to create, would only happen if one process deleted a row while another tried to update it)

Member

sgrif commented Dec 8, 2016

Should we do a similar check on updates? (Much harder race condition to create, would only happen if one process deleted a row while another tried to update it)

Show outdated Hide outdated activerecord/lib/active_record/transactions.rb
@@ -461,7 +461,7 @@ def transaction_include_any_action?(actions) #:nodoc:
when :create
transaction_record_state(:new_record)
when :destroy
destroyed?
@_deleted_rows_in_db ||= false

This comment has been minimized.

@sgrif

sgrif Dec 8, 2016

Member

Shouldn't this be || false since the object is frozen at this point?

@sgrif

sgrif Dec 8, 2016

Member

Shouldn't this be || false since the object is frozen at this point?

This comment has been minimized.

@sgrif

sgrif Dec 8, 2016

Member

Oh wait never mind we override freeze to only affect the attributes

@sgrif

sgrif Dec 8, 2016

Member

Oh wait never mind we override freeze to only affect the attributes

This comment has been minimized.

@stefanmb

stefanmb Dec 8, 2016

Contributor

The assignment is needed to avoid: warning: instance variable @_deleted_rows_in_db not initialized

Is there a better way to do this?

@stefanmb

stefanmb Dec 8, 2016

Contributor

The assignment is needed to avoid: warning: instance variable @_deleted_rows_in_db not initialized

Is there a better way to do this?

This comment has been minimized.

@sgrif

sgrif Dec 8, 2016

Member

defined?

@sgrif

sgrif Dec 8, 2016

Member

defined?

@stefanmb

This comment has been minimized.

Show comment
Hide comment
@stefanmb

stefanmb Dec 8, 2016

Contributor

Should we do a similar check on updates? (Much harder race condition to create, would only happen if one process deleted a row while another tried to update it)

I don't think the same issue would happen because the callback check is:

          when :update
            !(transaction_record_state(:new_record) || destroyed?)
          end
Contributor

stefanmb commented Dec 8, 2016

Should we do a similar check on updates? (Much harder race condition to create, would only happen if one process deleted a row while another tried to update it)

I don't think the same issue would happen because the callback check is:

          when :update
            !(transaction_record_state(:new_record) || destroyed?)
          end
@sgrif

This comment has been minimized.

Show comment
Hide comment
@sgrif

sgrif Dec 8, 2016

Member

Right, which would still run callbacks if the same race condition occurred.

Member

sgrif commented Dec 8, 2016

Right, which would still run callbacks if the same race condition occurred.

@stefanmb

This comment has been minimized.

Show comment
Hide comment
@stefanmb

stefanmb Dec 8, 2016

Contributor

I'm not sure I understand. The destroy race occurred because @destroyed is always true after every call to destroy. The update race cannot occur because the update callback can only trigger if @destroyed == false.

If we are concerned with a record being destroyed during a callback then we need mutual exclusion between (user code) callbacks and the destroy call.

How would the check you are suggesting look? Sorry if I'm misunderstanding.

Contributor

stefanmb commented Dec 8, 2016

I'm not sure I understand. The destroy race occurred because @destroyed is always true after every call to destroy. The update race cannot occur because the update callback can only trigger if @destroyed == false.

If we are concerned with a record being destroyed during a callback then we need mutual exclusion between (user code) callbacks and the destroy call.

How would the check you are suggesting look? Sorry if I'm misunderstanding.

@sgrif

This comment has been minimized.

Show comment
Hide comment
@sgrif

sgrif Dec 8, 2016

Member

And the current check for updates is also true, even if the underlying database record was deleted before we tried calling .save. The check would look identical, we would see if the affected row count is greater than zero, and set a flag if so.

Member

sgrif commented Dec 8, 2016

And the current check for updates is also true, even if the underlying database record was deleted before we tried calling .save. The check would look identical, we would see if the affected row count is greater than zero, and set a flag if so.

@stefanmb

This comment has been minimized.

Show comment
Hide comment
@stefanmb

stefanmb Dec 8, 2016

Contributor

@sgrif I made a rough pass at adding the same check to the update calls. It's a little trickier because a bunch of tests fail since they call save without actually updating any attributes (so the affected row count is 0). I made said tests change some attribute so that they pass. Likewise, the touch method has to set the affected row count.

Do you think this approach is acceptable? If so I'll clean up the PR.

P.S. Thanks again for taking the time to walk me through this.

Contributor

stefanmb commented Dec 8, 2016

@sgrif I made a rough pass at adding the same check to the update calls. It's a little trickier because a bunch of tests fail since they call save without actually updating any attributes (so the affected row count is 0). I made said tests change some attribute so that they pass. Likewise, the touch method has to set the affected row count.

Do you think this approach is acceptable? If so I'll clean up the PR.

P.S. Thanks again for taking the time to walk me through this.

@sgrif

This comment has been minimized.

Show comment
Hide comment
@sgrif

sgrif Dec 8, 2016

Member

Hm. If the affected row count is 0 when no attributes were touched my base assumption might have been wrong. I will dig deeper in the morning.

Member

sgrif commented Dec 8, 2016

Hm. If the affected row count is 0 when no attributes were touched my base assumption might have been wrong. I will dig deeper in the morning.

@sgrif

Can you add a changelog entry as well?

Show outdated Hide outdated activerecord/lib/active_record/transactions.rb
end
end
end
private

This comment has been minimized.

@sgrif

sgrif Dec 9, 2016

Member

I don't think you meant to delete this

@sgrif

sgrif Dec 9, 2016

Member

I don't think you meant to delete this

Emulate db trigger behaviour for after_commit :destroy, :update
Race conditions can occur when an ActiveRecord is destroyed
twice or destroyed and updated. The callbacks should only be
triggered once, similar to a SQL database trigger.

@sgrif sgrif merged commit a9d72f6 into rails:master Dec 9, 2016

1 of 2 checks passed

continuous-integration/travis-ci/pr The Travis CI build failed
Details
codeclimate no new or fixed issues
Details

morganatwishpond added a commit to morganatwishpond/rails that referenced this pull request Jun 2, 2017

Restore after_commit on: :update with optimistic locking
rails#29318

rails#27248 introduced
`@_trigger_update_callback` but because optimistic locking doesn't call `super`
when updating rows, it wasn't being set.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment