Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Idempotent option for after_commit :destroy callback #27248

Merged
merged 1 commit into from Dec 9, 2016
Merged

Idempotent option for after_commit :destroy callback #27248

merged 1 commit into from Dec 9, 2016

Conversation

stefanmb
Copy link
Contributor

@stefanmb stefanmb commented Dec 2, 2016

Summary

In SQL a DELETE statement is always idempotent (will always succeed even if the underlying table is only modified once). Consequently the after_commit callbacks on :destroywill be invoked multiple times for a given model, even if a single deletion occurred. This means that user handlers must also be idempotent, which is not always the case (for example, if counting resources).

The issue addressed in this PR is related to #14735 and is fixed in a similar fashion:

  • Introduce an optional :idempotent flag for the after_commit callback.
  • Default to true to preserve existing behaviour.
  • Cache whether the operation affected any rows and provide an actually_destroyed? getter.
  • Demonstrate problem and solution with new unit tests.

We (Shopify) have been running a monkey patched version of this PR since October in production with no ill-effects.

r: @byroot @sgrif
cc: @tenderlove

P.S. This is my first Rails PR, so if there are any silly issues or violated conventions please let me know and I will resolve them as soon as possible. Thank you!

@rails-bot
Copy link

Thanks for the pull request, and welcome! The Rails team is excited to review your changes, and you should hear from @sgrif (or someone else) soon.

If any changes to this PR are deemed necessary, please add them as extra commits. This ensures that the reviewer can see what has changed since they last reviewed the code. Due to the way GitHub handles out-of-date commits, this should also make it reasonably obvious what issues have or haven't been addressed. Large or tricky changes may require several passes of review and changes.

This repository is being automatically checked for code quality issues using Code Climate. You can see results for this analysis in the PR status below. Newly introduced issues should be fixed before a Pull Request is considered ready to review.

Please see the contribution instructions for more information.

Copy link
Member

@byroot byroot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me overall (for what it's worth)

assert_equal [:destroy, :destroy], TopicWithIdempotentCallbacksOnDestroy.history
end

def test_helper(klass)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would name that differently. Something like: simulate_race_condition or something like that.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops, you're right. The test_ naming scheme is not good for helper (gets run as a test standalone). I've changed it as you suggested.

@byroot
Copy link
Member

byroot commented Dec 2, 2016

Looks like your tests don't pass either.

@stefanmb
Copy link
Contributor Author

stefanmb commented Dec 2, 2016

CI is green, sorry about the test mess. Thanks!

@byroot
Copy link
Member

byroot commented Dec 2, 2016

👏

@stefanmb
Copy link
Contributor Author

stefanmb commented Dec 6, 2016

@rafaelfranca Pinging you since you have some context on this issue.

@rafaelfranca rafaelfranca assigned rafaelfranca and sgrif and unassigned sgrif Dec 6, 2016
@rafaelfranca
Copy link
Member

ACK! I have some things to review now but I put this in the top of my list.

@sgrif
Copy link
Contributor

sgrif commented Dec 6, 2016

I don't think we need to make this an option. after_commit :stuff, on: :destroy running multiple times sounds like a bug, not behavior we need to preserve.

@rafaelfranca
Copy link
Member

Ah yeah, we should make this behavior the default with no opt-out as we discussed in the original implementation.

@@ -525,6 +529,12 @@ def touch(*names, time: nil)
end
end

protected

def actually_destroyed?
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks like it can be private

destroyed?
else
actually_destroyed?
end
when :update
!(transaction_record_state(:new_record) || destroyed?)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why won't this destroyed? need to check actually_destroyed??

@@ -292,9 +292,10 @@ def set_options_for_callbacks!(args, enforced_options = {})

if options[:on]
fire_on = Array(options[:on])
idempotent = options[:idempotent]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps it's just because I'm not a native english speaker, but idempotent sounds needlessly technical to me. Like we're barfing out SQL intricacies to end users.

It could just be an unfamiliarity with the word, though I've been trying to read it several times and it still doesn't click.

consider_succeeded: true, might be closer to the SQL-forces-us-to-consider-this-final aspect, though still needs finagling.

protected

def actually_destroyed?
@_actually_destroyed ||= false
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It might make sense to clarify @destroyed to e.g. @surreptitiously_destroyed so it doesn't fight for the same meaning as @_actually_destroyed.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We're going to get rid of the method and option so it's all good. :)

@matthewd
Copy link
Member

matthewd commented Dec 6, 2016

@sgrif after_commit :update will still get run even if the update didn't change anything...

@stefanmb
Copy link
Contributor Author

stefanmb commented Dec 7, 2016

@rafaelfranca @sgrif I've eliminated the idempotent option and made the new callback behaviour the default, as requested.

@@ -181,7 +181,11 @@ def destroy
_raise_readonly_record_error if readonly?
destroy_associations
self.class.connection.add_transaction_record(self)
destroy_row if persisted?
@_actually_destroyed = if persisted?
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can this just be changed to

if persisted && !destroyed
  @destroyed = destroy_row > 0
end

Do we even need to check the return value? Is this really different than destroy_row if persisted && !destroyed?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The issue is somehow flagging whether the SQL DELETE actually affected any rows (so we can trigger the callback then and only then). To do so we need to store the result of destroy_row somewhere. Right now we have:

    def persisted?
      sync_with_transaction_state
      !(@new_record || @destroyed)
    end

So by De Morgan's law your suggestion ends up being:

if persisted?
   @destroyed = destroy_row > 0
end

The problem is that does not work - it will fail the following test: https://github.com/rails/rails/blob/master/activerecord/test/cases/associations/bidirectional_destroy_dependencies_test.rb#L34-L39

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

De Morgan's law doesn't apply when one of the expressions have side effects.

@sgrif
Copy link
Contributor

sgrif commented Dec 7, 2016

I really would prefer a solution that doesn't need this @actually_destroyed flag. It reeks of mysql_real_escape_string to me.

@stefanmb
Copy link
Contributor Author

stefanmb commented Dec 7, 2016

I think the issue here is a breakdown between the ORM model represented by an ActiveRecord object and the underlying SQL data persisting that record.

As shown in the new test case, it's possible to have a race condition between callbacks on two records being destroyed and pointing to the same underlying database row. This situation is something that ActiveRecord seems to allow by design (for example, by permitting one to delete a record, which keeps the Ruby object but not the SQL data).

We use the on: :destroy callbacks for a global count of resources, so we are interested in the underlying SQL data and not (the potentially many) ActiveRecord objects representing it.

Since a record can have destroy called an unlimited number of times we cannot use the @destroyed variable to represent the state of the SQL database. We need some new variable that stores whether or not the current call actually affected any rows.

Perhaps @__actually_destroyed should be called @_rows_deleted?

@sgrif
Copy link
Contributor

sgrif commented Dec 7, 2016

I think you're missing the point that I'm getting at. When we know that we have destroyed the record, we don't actually need to try again. I don't see any reason that we need to introduce a new flag for this.

@stefanmb
Copy link
Contributor Author

stefanmb commented Dec 7, 2016

Aren't we then back to the original problem? Each user code call to destroy produces a on: destroy callback to user code (even though the underlying row was deleted only once)?

Is this the desired behaviour in ActiveRecord? I apologize if I'm not following your suggestion.

@sgrif
Copy link
Contributor

sgrif commented Dec 7, 2016

No, the same way that we don't attempt when you call .save if .changed? is false, we shouldn't attempt to do anything when you call .destroy if .destroyed? is true.

@stefanmb
Copy link
Contributor Author

stefanmb commented Dec 7, 2016

I get what you're saying, but this is the sequence I see:

  1. Call destroy: Deletes Rows (first time around)
  2. In transaction_include_any_action: returns destroyed? (== true)
  3. User callback occurs.
  4. Call destroy: No-op (second time around)
  5. In transaction_include_any_action: returns destroyed? (== true) <-- Set during (1) above.
  6. User callback occurs. <-- Bug!

@matthewd
Copy link
Member

matthewd commented Dec 8, 2016

@sgrif this problem isn't record.destroy; record.destroy -- it's destroy getting called on two different instances that both point to the same DB row, meaning the second one is newly destroyed in AR-land, but it didn't actually kill a row because the row wasn't there.

@sgrif
Copy link
Contributor

sgrif commented Dec 8, 2016

@matthewd It looks like both are problems

@sgrif
Copy link
Contributor

sgrif commented Dec 8, 2016

Let's rename @_actually_destroyed to something like @_deleted_rows_in_db, and check the ivar directly rather than adding another method onto Base

@sgrif
Copy link
Contributor

sgrif commented Dec 8, 2016

Should we do a similar check on updates? (Much harder race condition to create, would only happen if one process deleted a row while another tried to update it)

@@ -461,7 +461,7 @@ def transaction_include_any_action?(actions) #:nodoc:
when :create
transaction_record_state(:new_record)
when :destroy
destroyed?
@_deleted_rows_in_db ||= false
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this be || false since the object is frozen at this point?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh wait never mind we override freeze to only affect the attributes

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The assignment is needed to avoid: warning: instance variable @_deleted_rows_in_db not initialized

Is there a better way to do this?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

defined?

@stefanmb
Copy link
Contributor Author

stefanmb commented Dec 8, 2016

Should we do a similar check on updates? (Much harder race condition to create, would only happen if one process deleted a row while another tried to update it)

I don't think the same issue would happen because the callback check is:

          when :update
            !(transaction_record_state(:new_record) || destroyed?)
          end

@sgrif
Copy link
Contributor

sgrif commented Dec 8, 2016

Right, which would still run callbacks if the same race condition occurred.

@stefanmb
Copy link
Contributor Author

stefanmb commented Dec 8, 2016

I'm not sure I understand. The destroy race occurred because @destroyed is always true after every call to destroy. The update race cannot occur because the update callback can only trigger if @destroyed == false.

If we are concerned with a record being destroyed during a callback then we need mutual exclusion between (user code) callbacks and the destroy call.

How would the check you are suggesting look? Sorry if I'm misunderstanding.

@sgrif
Copy link
Contributor

sgrif commented Dec 8, 2016

And the current check for updates is also true, even if the underlying database record was deleted before we tried calling .save. The check would look identical, we would see if the affected row count is greater than zero, and set a flag if so.

@stefanmb
Copy link
Contributor Author

stefanmb commented Dec 8, 2016

@sgrif I made a rough pass at adding the same check to the update calls. It's a little trickier because a bunch of tests fail since they call save without actually updating any attributes (so the affected row count is 0). I made said tests change some attribute so that they pass. Likewise, the touch method has to set the affected row count.

Do you think this approach is acceptable? If so I'll clean up the PR.

P.S. Thanks again for taking the time to walk me through this.

@sgrif
Copy link
Contributor

sgrif commented Dec 8, 2016

Hm. If the affected row count is 0 when no attributes were touched my base assumption might have been wrong. I will dig deeper in the morning.

Copy link
Contributor

@sgrif sgrif left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add a changelog entry as well?

end
end
end

private
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think you meant to delete this

Race conditions can occur when an ActiveRecord is destroyed
twice or destroyed and updated. The callbacks should only be
triggered once, similar to a SQL database trigger.
@sgrif sgrif merged commit a9d72f6 into rails:master Dec 9, 2016
morganatwishpond added a commit to morganatwishpond/rails that referenced this pull request Jun 2, 2017
rails#29318

rails#27248 introduced
`@_trigger_update_callback` but because optimistic locking doesn't call `super`
when updating rows, it wasn't being set.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

8 participants