Using create_or_find_by led to primary keys running out #35543
Comments
Hey @alexcameron89 👋, glad to have you back again 😄 I'm unsure what we can really do about this. Just for completeness, is this also a problem with the old `find_or_create_by`?
Thank you @kaspth, I'm glad to be back! 😄
Yes, this will happen in the case of the data race of `find_or_create_by` as well, when a failed create is retried.
But only on Postgres, correct? Or did SQLite and MySQL auto-increment the primary key without rolling back too? At the very least we should mention this in the documentation. I'm not entirely sure an outright removal is the right course of action. This seems somewhat par for the course when we're talking uniqueness and Postgres. E.g. this could happen with a failed uniqueness validation that's retried, as you said. @dhh what's your take on this?
It's certainly good to point out this gotcha in the documentation, at the very least. We use MySQL at Basecamp and are happy users of this method in production. @alexcameron89 Can you share your concerns with ID auto-incremental growth in general? We're defaulting id columns to bigint, so we have room for 9,223,372,036,854,775,807 values in pgsql. It doesn't seem likely that this method is going to make anyone run out of room. Even if you called this method 10 million times per day, it would take 2.5 billion years to run out of IDs 😄
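For a rough sense of that math, a back-of-the-envelope check (the daily call volume is just the figure assumed in the comment above):

# Headroom of a bigint primary key at an assumed 10 million calls per day
max_bigint    = 2**63 - 1          # => 9_223_372_036_854_775_807
calls_per_day = 10_000_000
max_bigint / (calls_per_day * 365.0)
# => roughly 2.5 billion years before the sequence is exhausted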
The only logical explanation is that @alexcameron89 is speaking to us from 2.5 billion years in the future! 😄 @alexcameron89 I take it your column isn't a big int?
Column Type: Int

> I take it your column isn't a big int?

That's a good callout, the table that was affected had `int` as primary key. In reality, this is likely to only affect `int` tables. But given that `bigint` didn't become a default until 5.0+, there's a high likelihood that this method will be used on many tables with `int` primary keys, and that's a little concerning.

Other affected DB's

> Or did SQLite and MySQL auto-increment the primary key without rolling back too? At the very least we should mention this in the documentation.

It looks like MySQL is affected by it (script: https://gist.github.com/401fd48d0f21fcf7262c4d20422efee5), but SQLite is not (script: https://gist.github.com/alexcameron89/4da7d4be2a8380a74f9d0708cc66d275).

Documentation

> It's certainly good to point out this gotcha in the documentation, at the very least.

I'll push up a PR with a callout in the documentation.

The Future

> The only logical explanation is that @alexcameron89 is speaking to us from 2.5 billion years in the future! 😄

It's true! `bigint` is the new hip thing these days. 😄
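Regarding the "Other affected DB's" point above, a minimal sketch of the kind of check the MySQL gist performs (connection details and the table name are illustrative, not taken from the gist; with InnoDB's default settings a failed insert still consumes an auto_increment value):

require 'mysql2'

client = Mysql2::Client.new(host: 'localhost', username: 'root', database: 'test')
client.query('drop table if exists omghi')
client.query('create table omghi (id int auto_increment primary key, val varchar(20) unique)')

client.query("insert into omghi (val) values ('a')")
begin
  client.query("insert into omghi (val) values ('a')") # duplicate key
rescue Mysql2::Error
  # InnoDB has already reserved the next auto_increment value
end
client.query("insert into omghi (val) values ('b')")

client.query('select id, val from omghi').to_a
# => [{"id"=>1, "val"=>"a"}, {"id"=>3, "val"=>"b"}]  (id 2 was burned by the failed insert)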
Whatever PR for doc changes should recognize that you're only to be alarmed if you're using int for your pks. And if you are, you should be concerned whether you run this method or not! We ran out of ints a while back at Basecamp. Not a fun time. So you're carrying a time bomb in your app if you keep pks as ints.
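As a side note, one quick way to see how long that fuse is would be to compare the current sequence value against the int ceiling. A minimal sketch for PostgreSQL (the `Widget` model and `widgets_id_seq` sequence name are illustrative assumptions):

int_max  = 2**31 - 1               # => 2_147_483_647
last_val = Widget.connection.select_value('SELECT last_value FROM widgets_id_seq').to_i
puts format('%.2f%% of the int keyspace consumed', 100.0 * last_val / int_max)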
Perhaps we should deprecate having int for primary keys in Rails 6? Not sure about all the ramifications, but it could be good to do something here. Or we could add a task to help generate migrations for it.
I think that's a really interesting idea. 👍
We experienced an outage from `create_or_find_by` with a table on the order of a few million rows - big, but not growing at a rate that we'd have expected to exhaust the sequence. We had assumed the sequence roughly corresponded to the table size. Even then, migrating from an int column to a bigint took a fair amount of downtime. Is anybody monitoring their database sequences? (I doubt it)
It's true that int PKs are a timebomb, but this method accelerates the clock significantly. Running out of ints with a table of a few million wasn't expected. The sequence basically becomes a query count.
I've come around to the idea that `find_or_create_or_find_by` is a better behavior in our case (though I'd never suggest rails having that method).
I'm dubious of the general-purpose utility of `create_or_find_by` after having experienced that outage. There are already 4 bullet-point drawbacks listed in the documentation for this method; this would be a 5th. Better to warn people than have them discover it "naturally" as we did.
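A rough sketch of the "find, then create, then find again" behavior described above, written as a standalone helper rather than anything Rails ships (the name and error handling are assumptions, and it presumes a unique index backs the lookup):

# Try a cheap read first; only attempt the insert when the record is missing,
# and fall back to a second read if a concurrent insert wins the race.
def find_or_create_or_find_by(relation, attributes)
  relation.find_by(attributes) || relation.create!(attributes)
rescue ActiveRecord::RecordNotUnique
  relation.find_by!(attributes)
end

# find_or_create_or_find_by(User, email: 'someone@example.com')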
Bradley, as discussed here, whether it’s a query count or not doesn’t really matter when you have bigint columns. We use the method at Basecamp and it’s a great fit under the constraints detailed. If the race condition it was designed for isn’t a problem for you, just use find_or_create_by 👍
The race condition it was designed for IS a problem for me, but `find_or_create_or_find` is a less-surprising solution. The race condition was a low-grade annoyance in our application's BugSnag, and so we applied `create_or_find_by` thinking it was no big deal. Months later we had an outage over an hour long because of this decision.
It's true it doesn't matter as much with bigint columns. Migrating to bigint columns can take a fair amount of work depending on your system - we took our service offline to isolate the database and ran the migration, and it took about 45 minutes against a few million rows - it's rewriting the table and indexes, which is expensive the naive way. Zero-downtime migrations take a bit of prep that you may not be prepared for if you're thinking you're not going to run out of ids because your table is a few orders of magnitude smaller than the limit.
I'm happy it works for you at Basecamp. The people that will hit this are likely to be those upgrading pre-existing apps that have not converted everything to bigints.
Yeah. Hitting the int ceiling is nasty. Sorry that happened to you. We should definitely document that using this method can hasten that, and suggest that people only use it once they’re on bigint. On board with that part.
I'd like to consider making that the behaviour of `find_or_create_by`. To me the ideal would be that they both have similar[ly rare] failure scenarios.
Was looking at this; in terms of PostgreSQL, there isn't an obvious way to change anything. Figured I should post the experiment in case any others find value in it.

require 'pg'

db = PG.connect dbname: 'josh_testing'

%I[exec exec_params].each do |name|
  db.define_singleton_method name do |*args, &block|
    super(*args, &block).to_a
  rescue
    $!.set_backtrace caller.drop(1)
    raise
  end
end

db.exec <<~SQL
  drop table if exists omghi;
  create table if not exists omghi (
    id serial primary key,
    val text unique
  )
SQL

insert = -> val {
  db.exec_params <<~SQL, [val]
    insert into omghi (val) values ($1)
    returning *
  SQL
}

# insert value
insert['a'] # => [{"id"=>"1", "val"=>"a"}]

# failing to insert increments the sequence
insert['a'] rescue $!.class # => PG::UniqueViolation
insert['b'] # => [{"id"=>"3", "val"=>"b"}]

# an explicit transaction changes nothing
db.exec 'begin transaction'
insert['a'] rescue $!.class # => PG::UniqueViolation
db.exec 'rollback transaction'
insert['c'] # => [{"id"=>"5", "val"=>"c"}]

# "on conflict do nothing" changes nothing
db.exec <<~SQL
  insert into omghi (val) values ('a')
  on conflict do nothing
  returning *
SQL
insert['d'] # => [{"id"=>"7", "val"=>"d"}]

# the db's values
db.exec('select * from omghi').to_a
# => [{"id"=>"1", "val"=>"a"},
#     {"id"=>"3", "val"=>"b"},
#     {"id"=>"5", "val"=>"c"},
#     {"id"=>"7", "val"=>"d"}]
actually, there is a quick workaround for int to bigint migration in postgres:
Takes seconds on a table with 1M records.
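One possible shape for such a workaround, as a Rails migration (the table name and the specific approach are assumptions, not the commenter's exact snippet); on PostgreSQL this rewrites the table, which is why it stays fast only while the table is relatively small:

class ChangeEventsIdToBigint < ActiveRecord::Migration[6.0]
  # Issues roughly: ALTER TABLE events ALTER COLUMN id TYPE bigint
  # A full table rewrite, which can be in the "seconds" range for ~1M rows.
  def up
    change_column :events, :id, :bigint
  end

  def down
    change_column :events, :id, :integer
  end
end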
I'm surprised by the unwillingness to recognize the problem that this issue is attempting to address. While bigint columns certainly push the ultimate breaking problem down the road (billions of years, even), they don't make the actual problem go away (unnecessary skipping of unused primary key values).
In my opinion, it'd be similar to saying:
"You think it's a problem that too many straws are being opened and discarded unnecessarily before being used, and you're upset that the lake behind your house is getting filled with straws? I'm sorry you're experiencing this, but the lake behind my house got filled a long time ago, so I understand how you feel! Just start discarding your unused straws into the ocean, it's much bigger than your lake and you won't notice they're there until you're LONG gone!"
While I'm probably more OCD than most about missing chunks of primary keys in my tables, I think a quick education in the docs around the scenarios in which this matters would go a long way.
The "problem" is recognized. There'll be a note stating that this is the
behavior, which matters if you're on an int pk. But if you're on bigint,
then it won't matter, unless you for some bizarre reason considers
incrementing a counter the same as littering. And if that's the case, you
are, as in all other cases, utterly free not to use this feature.
I for one do not have any sentimentality around integer sequence gaps, and
for our use, I'm OK with having the limited runway of a few thousand
billions of years of runway for our applications. I mean, the sun is going
to explore in a mere 5 billion years...
I do consider it unnecessary litter in my db. Maybe I'm the anomaly, and if so, I apologize for the inaccurate comparison.
If I'm not the anomaly, my concern isn't that I would use the feature inappropriately, but that countless others might, and that they'd eventually be upset at their unintentional littering of their own db.
@matthewd I'm working on this and trying to find a good way to test it.
This commit addresses the issue in #35543 by making note of the growing primary key issue with `create_or_find_by`.
Since the agreed mitigation for this issue was to document it, and since that documentation was merged in #35573, I'll close this issue.
Updates `#create_or_find_by` to accept an optional `options` key to house a new `find_first` flag. This flag specifies that the method runs a `#find_by` before attempting the create. This can help mitigate a few potential issues:
* (Impact) Description
* (Medium) Reduces write load. In most cases, this method is useful to create a record once, with all subsequent calls using `#find_by` as a fallback. The proactive `#find_by` can skip the create/rescue loop and mostly rely on the initial fetch.
* (Low) Reduces the risk of unnecessary PK increments when an existing record exists (i.e. some versions of MySQL increment the PK on INSERT, regardless of whether or not it succeeds).
References rails#35543
[Previous implementation](rails#35633)
Steps to reproduce
While this doesn't fully assert the issue, it shows that the primary key can grow tremendously without actually writing new rows.
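A minimal sketch of the kind of script this refers to (the model, column, and sequence names are illustrative; it assumes a `users` table with an int primary key and a unique index on `email`):

User.create!(email: 'taken@example.com')

1_000.times do
  # Every call after the first hits ActiveRecord::RecordNotUnique internally,
  # rescues it, and falls back to a find -- but the sequence has already advanced.
  User.create_or_find_by(email: 'taken@example.com')
end

User.count
# => 1
User.connection.select_value('select last_value from users_id_seq')
# => roughly 1_000 higher than before the loop, with no new rows written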
Problem
Each time `create_or_find_by` hits `ActiveRecord::RecordNotUnique`, it rolls back, but the auto-increment does not. The fact that the auto-increment does not roll back seems to be expected behavior from PostgreSQL. Still, the fact that this happens with `create_or_find_by` led to one of our database tables running out of primary keys in production, preventing the creation of any new rows.
System configuration
Rails version: 6.0
Ruby version: 2.5.3