Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add #create_or_find_by to lean on unique constraints #31989

Merged
merged 7 commits into from
Feb 15, 2018
Merged

Conversation

dhh
Copy link
Member

@dhh dhh commented Feb 13, 2018

Attempts to create a record with the given attributes in a table that has a unique constraint
on one or several of its columns. If a row already exists with one or several of these
unique constraints, the exception such an insertion would normally raise is caught,
and the existing record with those attributes is sought found using #find_by.

This is similar to #find_or_create_by, but avoids the problem of stale reads, as that methods needs
to first query the table, then attempt to insert a row if none is found. That leaves a timing gap
between the SELECT and the INSERT statements that can cause problems in high throughput applications.

There are several drawbacks to #create_or_find_by, though:

  • The underlying table must have the relevant columns defined with unique constraints.
  • A unique constraint violation may be triggered by only one, or at least less than all,
    of the given attributes. This means that the subsequent #find_by may fail to find a
    matching record, which will then return nil, rather than a record will the given attributes.
  • It relies on exception handling to handle control flow, which may be marginally slower. And

This method will always returns a record if all given attributes are covered by unique constraints,
but if creation was attempted and failed due to validation errors it won't be persisted, you get what
#create returns in such situation.

@dhh dhh requested a review from jeremy February 13, 2018 23:47
@nynhex
Copy link

nynhex commented Feb 13, 2018

This is really good. I have multiple use cases for this. 👍

def create_or_find_by!(attributes, &block)
create!(attributes, &block)
rescue ActiveRecord::RecordNotUnique
find_by(attributes)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should raise in the fallback find case? It'd be surprising to ever get a nil result (for whatever reason, including a missing unique index) from these methods.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, good point. Let's change that to find_by!.

Subscriber.create_or_find_by(nick: "bob", name: "the cat")
end
end

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is missing a test with create_with (to change the behavior of the create) and with where (to change the behavior of find!

I'm not sure if those method should be used with create_or_find_by but right now they can and I have no idea of the effect they would have, so better to test them to make sure it is what we expect.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see how that differs? create_or_find_by uses the same flow as find_or_create_by, except instead of an || we're using a rescue. But the flow is the same.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mean, it's not that we can't test it, just that I don't think it teaches us anything interesting. The test would be identical to the one for test_find_or_create_by_with_create_with.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If it is the same behavior we want so I guess it is fine being tested in that test

# matching record, which will then return nil, rather than a record will the given attributes.
# * It relies on exception handling to handle control flow, which may be marginally slower. And
#
# This method will always returns a record if all given attributes are covered by unique constraints,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This isn't totally true: another client could update or delete the row between the rejected INSERT and the subsequent SELECT. This race condition is complementary to the one in #find_or_create_by.

That caveat probably belongs in the "drawbacks" section, but even still I don't think we can make this claim here.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. Will add that point. It's a much rarer race condition in many apps, I'd say.

# #create returns in such situation.
def create_or_find_by(attributes, &block)
create(attributes, &block)
rescue ActiveRecord::RecordNotUnique
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This needs transaction(requires_new: true) do around the create to work in an ongoing surrounding transaction (on at least PostgreSQL)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this not necessary for find_or_create_by?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because it doesn't cause an SQL error and then attempt to recover. PostgreSQL remembers when an error has occurred inside a transaction, and disallows all further operations until that transaction has been rolled back.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we are willing to require PG 9.5 or later for this (when using the PG adapter), we could just stick ON CONFLICT DO NOTHING on the end instead. This would let us wrap the whole thing in a transaction instead of just the create, which eliminates any possibility of race conditions.

def create_or_find_by!(attributes, &block)
create!(attributes, &block)
rescue ActiveRecord::RecordNotUnique
find_by!(attributes)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we want to retry here instead of raising if the find fails? Seems confusing for create_or_find_by! to raise RecordNotFound, just because of a racing delete.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can't retry this failure as it should only occur when you use attributes where not all of them are covered by unique constraints. It's more like a INVALID QUERY kind of failure.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh right, I was assuming the delete race, rather than operator error. I guess we could retry once, so we're safe from a poorly-timed simple delete. That way we'd only be tripped up by a racing delete; insert; delete.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

True. I'm not sure whether having just a single retry is a good warranty, though. Since it's hard for us to tell the difference between legit race condition and invalid query.

@dhh
Copy link
Member Author

dhh commented Feb 14, 2018 via email

@matthewd
Copy link
Member

I think this should do it:

def test_create_or_find_by_within_transaction
  assert_nil Subscriber.find_by(nick: "bob")

  subscriber = Subscriber.create!(nick: "bob")

  Subscriber.transaction do
    assert_equal subscriber, Subscriber.create_or_find_by(nick: "bob")
    assert_not_equal subscriber, Subscriber.create_or_find_by(nick: "cat")
  end
end

Copy link
Contributor

@tjschuck tjschuck left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Documentation edits.

# * The underlying table must have the relevant columns defined with unique constraints.
# * A unique constraint violation may be triggered by only one, or at least less than all,
# of the given attributes. This means that the subsequent #find_by may fail to find a
# matching record, which will then raise an `ActiveRecord::NotFound` exception,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

`ActiveRecord::NotFound` needs <tt> for Rdoc code formatting, not backticks.

# * A unique constraint violation may be triggered by only one, or at least less than all,
# of the given attributes. This means that the subsequent #find_by may fail to find a
# matching record, which will then raise an `ActiveRecord::NotFound` exception,
# rather than a record will the given attributes.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

will => with

# Attempts to create a record with the given attributes in a table that has a unique constraint
# on one or several of its columns. If a row already exists with one or several of these
# unique constraints, the exception such an insertion would normally raise is caught,
# and the existing record with those attributes is sought found using #find_by.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is sought found using #find_by — "sought" should be deleted.

# and the existing record with those attributes is sought found using #find_by.
#
# This is similar to #find_or_create_by, but avoids the problem of stale reads between the SELECT
# and the INSERT, as that methods needs to first query the table, then attempt to insert a row
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

methods => method

Otherwise PG will complain with "PG::InFailedSqlTransaction: ERROR:  current transaction is aborted, commands ignored until end of transaction block".

Thanks @matthewd
@dhh dhh merged commit fe6adf4 into master Feb 15, 2018
@dhh dhh deleted the create_or_find_by branch February 15, 2018 00:51
@AnalyzePlatypus
Copy link

Awesome!

@subhashsaran
Copy link

Good stuff.

@dhh
Copy link
Member Author

dhh commented Feb 21, 2018 via email

@sikachu
Copy link
Member

sikachu commented Feb 21, 2018

(Hmm, @sgrif's comment was gone?)

@sgrif I'm 👍 on expanding this feature for PG 9.5+ in PG adapter. Other RDBMS can fallback to unique-key-constaint-exception-then-find.

@yhirano55
Copy link
Contributor

I want to use these awesome class methods on Rails 5.2, so much.
Do you have any plans to backport to 5-2-stable?

@bogdanvlviv
Copy link
Contributor

I think we don't, "New features are only added to the master branch and will not be made available in point releases." by Maintenance Policy for Ruby on Rails.

@yhirano55
Copy link
Contributor

I see, thank you for your explanation.

@djezzzl
Copy link

djezzzl commented Dec 7, 2018

Hi everybody,

As we start relying on the database constraints more, I'd suggest (and few people like the idea) to have the improved versions of validates_uniqueness_of and belongs_to from database_validations gem in the ActiveRecord. I can prepare a PR if needed.

I just opened the issue #34650 where we can discuss the topic.

Any feedback is very appreciated!

MaximeLaurenty added a commit to MaximeLaurenty/rubocop-rails that referenced this pull request Aug 8, 2019
create_or_find_by will be implemented in Rails 6.0 (rails/rails#31989)
However we've decided to use it before it comes out, and would love it to be in rubocop scope as well.
I guess we're not the first ones to do it, so it could benefit to most.
va-bot pushed a commit to department-of-veterans-affairs/caseflow that referenced this pull request Oct 6, 2020
### Description
If `LegacyAppeal.find_or_create_by_vacols_id` is hit multiple times for an appeal that doesn't yet exist, it's possible for a race condition to arise where the appeal is saved by one process in between the time another process initializes and attempts to save it.

I observed this happening when loading the case details page for appeals that exist in VACOLS but not Caseflow, when `find_or_create_by_vacols_id` is invoked from the appeals controller and the tasks controller in quick succession. When the second process attempts to save its version of the appeal, it fails with `PG::UniqueViolation` / `ActiveRecord::RecordNotUnique`.

I changed the method to capture the `ActiveRecord::RecordNotUnique` and return the existing record via `find_by!` (which'll raise a `ActiveRecord::RecordNotFound` if it's unsuccessful), following [this pattern](rails/rails#31989), from the Rails 6 `create_or_find_by` method.

It's not ideal to [use an exception as flow control](https://wiki.c2.com/?DontUseExceptionsForFlowControl), but I think it's okay in this case to let the database constraint determine the action.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet