-
Notifications
You must be signed in to change notification settings - Fork 21.9k
Avoid validating belongs_to
association if it has not changed
#46522
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
305e63a
to
1ff70f8
Compare
1ff70f8
to
6627110
Compare
Hey excellent work in both PRs. Would not be better just check for not null ? I'm from the team who doesn't rely on foreign keys, I would like to have this performance gain too. |
end | ||
|
||
def marshal_load(array) | ||
@version, @columns, _columns_hash, @primary_keys, @data_sources, @indexes, @database_version = array | ||
@version, @columns, _columns_hash, @primary_keys, @data_sources, @indexes, @foreign_keys, @database_version = array |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note to self: I need to dig into what the backward and forward compatibility expectations are for marshaled schema cache.
I remember we want to be able to load a schema cache generated by the N-1 version to make it easy to upgrade, but I don't remember if that only applies to YAML serialization or if it applies to Marshal too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With the eager computing changes, I am thinking if we really need this schema cache changes anymore?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It should apply for both. We want the cache to be read during the deploy of a new version, so that means that both the current version and the new one will try to read from the same file.
record.public_send("#{name}_changed?") || | ||
record.read_attribute(foreign_key).nil? || | ||
reflection.polymorphic? || | ||
model.connection.schema_cache.foreign_keys(model.table_name).none? { |fk| fk.column.to_s == foreign_key } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we could cache (or even eagerly compute) this last expression.
It needs access to the schema cache, so to eagerly compute it we might need to wait for the "schema_load" event.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, is this to_s
necessary? I might be wrong by I think ForeignKeyDefinition#column
is already a String
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@casperisfine Thank you for the comments! Updated the PR with eager computation.
6627110
to
3043052
Compare
I'm curious why only when we have foreign key we can safely skip this validation. Is it because the associated record could be gone from the database? I don't think the reason why we validate this association is to get rid of orphan records, so I'd not even validate it if we already have a value in the id and it didn't change. |
Basically this. From my understanding, when we validate an association id field (e.g. If we start skipping validation when the association id hasn't changed, we could possibly get orphaned records. |
I think this is wasteful. I mean sure you can create this situation easily, but I'd argue it's a bug in your code, and even if it happens, I'm not sure having a validation error on that orphaned record really helps you that much. This validation is causing a lot of extra queries. On paper skipping this if a foreign key exist makes sense and is correct, but in practice on large deployments foreign key checks tend to be disabled, this is especially true for MySQL since foreign keys tend not to play well with online-schema change solutions such as LHM, gh-ost, etc. Now I could get behind to be safe by default, but I'd also see value in disabling this globally. |
So should I just change this PR to basically ignore the validation if the column hasn't changed? Or need to also introduce some app-wide config option that forces to fully check the association presence? |
That's what I'd do yes. But maybe this would deemed too backward incompatible? Hence the app wide config suggestion. But I'd like to hear opinions from others here. @rafaelfranca what do you think? |
That is what I'd do, yes. We have a patch in one application that does exactly that and we were planning to push to Rails. I don't think this validation is to avoid orphan records., every time you update an associated record but not change the primary key. If people want to avoid this they should be using a foreign key, or To easy upgrade we can have a temporary application wide config to keep the old behavior of checking when the association key didn't change. |
3043052
to
f9cd554
Compare
Updated with the suggestions. |
belongs_to
association if it has not changed and is backed by a foreign keybelongs_to
association if it has not changed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the meaning of that config need to be flipped around.
record.read_attribute(foreign_key).nil? || | ||
record.attribute_changed?(foreign_key) || | ||
(reflection.polymorphic? && (record.read_attribute(foreign_type).nil? || record.attribute_changed?(foreign_type))) || | ||
!ActiveRecord.belongs_to_required_validates_foreign_key |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can assume this one is set during boot and won't change. So if it's set, we can just not generate a condition at all.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also I think the naming suggest that it should be used the other way around:
- if
belongs_to_required_validates_foreign_key == true
then we just put no condition, we always validates the presence. - if
belongs_to_required_validates_foreign_key == false
then we only validate presence when the foreign key changed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! Under belongs_to_required_validates_foreign_key
I meant something like "validates foreign key attribute for presence, not the association". But yeah, your interpretation is correct. So flipped it.
activerecord/CHANGELOG.md
Outdated
config.active_record.belongs_to_required_validates_foreign_key = true | ||
``` | ||
|
||
and will be enabled by default with `load_defaults 7.1`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we were saying it would be disabled by default in 7.1, no?
f9cd554
to
e5d1514
Compare
foreign_key = reflection.foreign_key | ||
foreign_type = reflection.foreign_type | ||
|
||
record.read_attribute(foreign_key).nil? || |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if we could use _attribute
here? I think we can assume that foreign_key
and foreign_type
are not aliases and are already strings. Right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, LGTM now. I don't think the _attribute
is a big deal anyway.
belongs_to
associations are required by default, so they are checked for presence when the child record is saved. This is done every time the record is updated, even when the parent record hasn't changed.For example,
If we have a relevant foreign key, there is no need to check for the user (and thus performing an extra SQL query).
After the changes from this PR:
Since, from my observation, most records do not change its
belongs_to
association values during their lifetime, this will greatly reduce the number of database queries for updates. And this will be even more notably for models with multiplebelongs_to
's.cc @byroot (since we implemented a similar optimization previously - #45149)