New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Refactor inversing logic of CollectionAssociation
#45399
base: main
Are you sure you want to change the base?
Refactor inversing logic of CollectionAssociation
#45399
Conversation
27a2224
to
da7e496
Compare
da7e496
to
83f8208
Compare
83f8208
to
77889e8
Compare
Follow-up to rails#40379, rails#42524, and rails#43445. The logic to associate records with a `CollectionAssociation` when using `has_many` inversing has become increasingly complex. Prior to this commit, we defined an "add record" method (`add_to_target`) that can accept a `replace: true` option, and a "replace record" method (`replace_on_target`) that can accept a `replace: false` option. And regardless of the value of the `:replace` option, the record may be added if it cannot be replaced (i.e. does not exist), or replaced if a complex set of conditions dictate it should not be simply added. This commit aims to simplify this logic by: * Removing the `:replace` option from `add_to_target`. * Renaming `replace_on_target` to a low-level `add_record` method, supporting a `:replace` option. The name `add_record` was chosen for parity with the similarly low-level `remove_records` method. * Replacing the complex conditions of `replace_on_target` with an "implicit target records" concept. Implicit target records are those that have been associated via inversing but not explicitly added. Implicit target records exist in the `@target` array, and are also tracked via an `@implicit_target_records` array. Whenever an implicit target record is explicitly added, it is always replaced in the `@target` array to prevent a double occurrence, and then removed from the `@implicit_target_records` array. (Thus if the record is explicitly added a 2nd time, a 2nd occurrence *will* be added to the `@target` array. We have tests that expect this.) Additionally, this commit provides a very small speed increase, according to the following benchmark: ```ruby require "benchmark" ActiveRecord::Base.logger = nil ActiveRecord::Schema.define do create_table :authors, force: true do |t| end create_table :posts, force: true do |t| t.references :author end end class Author < ActiveRecord::Base has_many :posts end class Post < ActiveRecord::Base belongs_to :author end POST_COUNTS = [1, 10, 100, 1000] ITERATIONS = 100 CALLS_PER_ITERATION = 100 eval <<~RUBY def ms_per_build(posts) Benchmark.ms { #{"posts.build;" * CALLS_PER_ITERATION} }.to_f / CALLS_PER_ITERATION end def ms_per_shovel(posts, post) Benchmark.ms { #{"posts << post;" * CALLS_PER_ITERATION} }.to_f / CALLS_PER_ITERATION end RUBY def report(label, &block) [false, true].each do |has_many_inversing| ActiveRecord::Base.has_many_inversing = has_many_inversing POST_COUNTS.each do |post_count| Post.delete_all Post.insert_all(post_count.times.map { { author_id: 1 } }) block.call # warmup ms = ITERATIONS.times.sum(&block).to_f / ITERATIONS puts "has_many_inversing == #{has_many_inversing}, posts.count == #{post_count}".ljust(50) + "=> `#{label}` @ #{"%.3f" % ms}ms" end end end author = Author.create!(id: 1) report("posts.build") do posts = author.reload.posts.tap(&:load_target) ms_per_build(posts) end report("posts << post") do posts = author.reload.posts.tap(&:load_target) post = posts[posts.length / 2] ms_per_shovel(posts, post) end ``` Results with `main` branch: ``` has_many_inversing == false, posts.count == 1 => `posts.build` @ 0.154ms has_many_inversing == false, posts.count == 10 => `posts.build` @ 0.142ms has_many_inversing == false, posts.count == 100 => `posts.build` @ 0.141ms has_many_inversing == false, posts.count == 1000 => `posts.build` @ 0.144ms has_many_inversing == true, posts.count == 1 => `posts.build` @ 0.143ms has_many_inversing == true, posts.count == 10 => `posts.build` @ 0.142ms has_many_inversing == true, posts.count == 100 => `posts.build` @ 0.142ms has_many_inversing == true, posts.count == 1000 => `posts.build` @ 0.147ms has_many_inversing == false, posts.count == 1 => `posts << post` @ 0.192ms has_many_inversing == false, posts.count == 10 => `posts << post` @ 0.192ms has_many_inversing == false, posts.count == 100 => `posts << post` @ 0.202ms has_many_inversing == false, posts.count == 1000 => `posts << post` @ 0.196ms has_many_inversing == true, posts.count == 1 => `posts << post` @ 0.209ms has_many_inversing == true, posts.count == 10 => `posts << post` @ 0.204ms has_many_inversing == true, posts.count == 100 => `posts << post` @ 0.221ms has_many_inversing == true, posts.count == 1000 => `posts << post` @ 0.195ms ``` Results with this branch: ``` has_many_inversing == false, posts.count == 1 => `posts.build` @ 0.144ms has_many_inversing == false, posts.count == 10 => `posts.build` @ 0.132ms has_many_inversing == false, posts.count == 100 => `posts.build` @ 0.133ms has_many_inversing == false, posts.count == 1000 => `posts.build` @ 0.135ms has_many_inversing == true, posts.count == 1 => `posts.build` @ 0.132ms has_many_inversing == true, posts.count == 10 => `posts.build` @ 0.133ms has_many_inversing == true, posts.count == 100 => `posts.build` @ 0.132ms has_many_inversing == true, posts.count == 1000 => `posts.build` @ 0.140ms has_many_inversing == false, posts.count == 1 => `posts << post` @ 0.200ms has_many_inversing == false, posts.count == 10 => `posts << post` @ 0.187ms has_many_inversing == false, posts.count == 100 => `posts << post` @ 0.188ms has_many_inversing == false, posts.count == 1000 => `posts << post` @ 0.191ms has_many_inversing == true, posts.count == 1 => `posts << post` @ 0.187ms has_many_inversing == true, posts.count == 10 => `posts << post` @ 0.188ms has_many_inversing == true, posts.count == 100 => `posts << post` @ 0.197ms has_many_inversing == true, posts.count == 1000 => `posts << post` @ 0.189ms ```
77889e8
to
ed79453
Compare
def add_to_target(record, skip_callbacks: false, replace: false, &block) | ||
replace_on_target(record, skip_callbacks, replace: replace || association_scope.distinct_value, &block) | ||
def add_to_target(record, skip_callbacks: false, &block) | ||
add_record(record, replace: association_scope.distinct_value, skip_callbacks: skip_callbacks, &block) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Doesn't this changes the behavior on build
when replace:
was being passed as true
? Now it will use association_scope.distinct_value
to determine if it should replace or not.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I understand correctly, replace: true
was added in 50e4aaa because the following line ends up calling replace_on_target
twice:
add_to_target(build_record(attributes, &block), replace: true) |
The 1st time via target=
inside the build_record
call, and the 2nd time via add_to_target
.
For the 2nd call, replace: true
causes the following conditional to compute index
:
rails/activerecord/lib/active_record/associations/collection_association.rb
Lines 449 to 452 in ec4558a
def replace_on_target(record, skip_callbacks, replace:, inversing: false) | |
if replace && (!record.new_record? || @replaced_or_added_targets.include?(record)) | |
index = @target.index(record) | |
end |
However, a "fallback" was also added in 6f6c441 to compute index
anyway:
rails/activerecord/lib/active_record/associations/collection_association.rb
Lines 464 to 466 in ec4558a
if !index && @replaced_or_added_targets.include?(record) | |
index = @target.index(record) | |
end |
In other words, I believe replace: true
inside build
is no longer necessary even on main
.
That said, this PR changes how targets are tracked. With this PR, CollectionAssociation#inversed_from
will override Association#inversed_from
, so build_record
inside build
will call add_record
with implicit: true
instead of target=
.
Thus the 2nd call to add_to_target
/ add_record
will find the record in @implicit_target_records
and replace it at index
:
rails/activerecord/lib/active_record/associations/collection_association.rb
Lines 458 to 462 in ed79453
was_implicit = @implicit_target_records&.delete(record) | |
(@implicit_target_records ||= []) << record if implicit | |
if (replace || was_implicit) && index = target.index(record) | |
target[index] = record |
It has been a long time since I looked at this code though, so I could be missing something. A while ago I asked Eileen to test this branch against the Shopify codebase, and she said there were errors but the cause wasn't clear. Unfortunately, I was not able to dig further.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah. I just tried this PR again and there were some errors. Some of them is because we were calling add_to_target
directly with replace: true
. The others were is not clear. But points to me that some behavior is changing in this refactoring, so I was trying to see visually what would be the low hanging fruits. I can dig deeper.
Follow-up to #40379, #42524, and #43445.
The logic to associate records with a
CollectionAssociation
when usinghas_many
inversing has become increasingly complex. Prior to this commit, we defined an "add record" method (add_to_target
) that can accept areplace: true
option, and a "replace record" method (replace_on_target
) that can accept areplace: false
option. And regardless of the value of the:replace
option, the record may be added if it cannot be replaced (i.e. does not exist), or replaced if a complex set of conditions dictate it should not be simply added.This commit aims to simplify this logic by:
:replace
option fromadd_to_target
.replace_on_target
to a low-leveladd_record
method, supporting a:replace
option. The nameadd_record
was chosen for parity with the similarly low-levelremove_records
method.replace_on_target
with an "implicit target records" concept. Implicit target records are those that have been associated via inversing but not explicitly added. Implicit target records exist in the@target
array, and are also tracked via an@implicit_target_records
array. Whenever an implicit target record is explicitly added, it is instead removed from the@implicit_target_records
array, to prevent a double occurrence in the@target
array. (Though if the record is explicitly added a 2nd time, a 2nd occurrence will be added to the@target
array. We have tests that expect this.)Additionally, this commit provides a very small speed increase, according to the following benchmark:
Benchmark
Results with
main
branch:Results with this branch:
/cc @ghiculescu since you authored #42524, and @jstncarvalho since you authored #43445. If you have time, would mind testing out this branch and verifying that it does not cause a regression (performance or otherwise) for you?