You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
O(N^2) operation in CollectionAssociation#replace_common_records_in_memory and **another** O(N^2) operation in ActiveStorage::Attached::Many#attach#50848
Open
malavbhavsar opened this issue
Jan 23, 2024
· 1 comment
Apologies for not following the template. I wasn't sure how to include my half-baked fix with it... I have full repro and additional information here: malavbhavsar/rails#1
Execution time of #attach should not depend on how big a collection is.
Actual behavior
On a big has_many_attached collection, #attach takes a long time. If there are 1000 existing attachments, it will take 5 seconds to attach a new one.
System configuration
Rails version: main
Ruby version: 3.1.4
Explanation
When we call #attach on an already big activestorage collection, it first calls record.public_send("#{name}=", blobs + attachables.flatten) # e.g. record.highlights=.... This eventually ends up calling CollectionAssociation#replace_common_records_in_memory which has been discovered as a performance problem in #46652. It ends up calling Array#indexn times and #==(n*n1)/2 times. In this case, 499500 times.
For ActiveRecord has_many collections, this is not a huge problem because, in my experience, post.comments = new_comments is not a common pattern. The general use case is post.comments << new_comment, which does pretty well performance-wise.
Unfortunately for has_many_attached collection, calling #attach is a common use case and it calls record.things_attachments= under the hood. Aside - seems like people are running into this problem.
As flamegraph shows, there is another O(N^2) in #attach. That one is coming from Attached::Changes::CreateOneOfMany#find_attachment. I haven't figured out a possible solution for it... I don't understand the change tracking(?) active storage is doing but if someone can help me understand I can try fixing it. I assume this will probably need a new Attached::Changes::AttachMany and Attached::Changes::AttachOne?
Workaround
I have found that creating blobs and attachments manually gets rid of BOTH problems and doesn't leave highlights_attachments and highlights_blobs stale.
user system total real
attach performance without fix 4.711782 0.013564 4.725346 ( 4.760098)
user system total real
attach performance with half-ish fix 1.855833 0.009120 1.864953 ( 1.901090)
user system total real
attach performance manual 0.024790 0.002624 0.027414 ( 0.030269)
This issue has been automatically marked as stale because it has not been commented on for at least three months.
The resources of the Rails team are limited, and so we are asking for your help.
If you can still reproduce this error on the 7-1-stable branch or on main, please reply with all of the information you have about it in order to keep the issue open.
Thank you for all your contributions.
Steps to reproduce
Apologies for not following the template. I wasn't sure how to include my half-baked fix with it... I have full repro and additional information here: malavbhavsar/rails#1
Expected behavior
Execution time of
#attach
should not depend on how big a collection is.Actual behavior
On a big
has_many_attached
collection,#attach
takes a long time. If there are 1000 existing attachments, it will take 5 seconds to attach a new one.System configuration
Rails version:
main
Ruby version:
3.1.4
Explanation
When we call
#attach
on an already big activestorage collection, it first callsrecord.public_send("#{name}=", blobs + attachables.flatten) # e.g. record.highlights=...
. This eventually ends up callingCollectionAssociation#replace_common_records_in_memory
which has been discovered as a performance problem in #46652. It ends up callingArray#index
n times and#==
(n*n1)/2 times. In this case,499500
times.For ActiveRecord has_many collections, this is not a huge problem because, in my experience,
post.comments = new_comments
is not a common pattern. The general use case ispost.comments << new_comment
, which does pretty well performance-wise.Unfortunately for
has_many_attached
collection, calling#attach
is a common use case and it callsrecord.things_attachments=
under the hood. Aside - seems like people are running into this problem.Flamegraph
Possible solutions
@target_index_map
in Fix performance of CollectionAssociation replace_on_target and include #46652, I tried the same approach with a local variable and it helped.ANOTHER problem
As flamegraph shows, there is another O(N^2) in
#attach
. That one is coming fromAttached::Changes::CreateOneOfMany#find_attachment
. I haven't figured out a possible solution for it... I don't understand the change tracking(?) active storage is doing but if someone can help me understand I can try fixing it. I assume this will probably need a newAttached::Changes::AttachMany
andAttached::Changes::AttachOne
?Workaround
I have found that creating blobs and attachments manually gets rid of BOTH problems and doesn't leave
highlights_attachments
andhighlights_blobs
stale.Final performance stats
cc: @jonathanhefner, @jeffcarbs, @danny-pflughoeft
The text was updated successfully, but these errors were encountered: