O(N^2) operation in CollectionAssociation#replace_common_records_in_memory and **another** O(N^2) operation in ActiveStorage::Attached::Many#attach #50848

Open
malavbhavsar opened this issue Jan 23, 2024 · 1 comment

Comments

@malavbhavsar
Contributor

malavbhavsar commented Jan 23, 2024

Steps to reproduce

Apologies for not following the template. I wasn't sure how to include my half-baked fix with it... I have full repro and additional information here: malavbhavsar/rails#1

user_1 = User.create!(name: "Jason")
user_1.highlights.attach(
  1000.times.map do |i|
    {
      io: StringIO.new("Example string inside text_file_#{i}"),
      filename: "text_file_#{i}.txt",
      content_type: "text/plain",
    }
  end
)
user_1.save!
user_1.reload

# The operation below takes about 5 seconds
require "benchmark"
Benchmark.bm(30) do |x|
  x.report("attach performance without fix") do
    user_1.highlights.attach(
      {
        io: StringIO.new("another text file. wow."),
        filename: "text_file_another.txt",
        content_type: "text/plain",
      }
    )
  end
end

Expected behavior

Execution time of #attach should not depend on how big a collection is.

Actual behavior

On a large has_many_attached collection, #attach takes a long time. With 1000 existing attachments, attaching one more takes about 5 seconds.

System configuration

Rails version: main

Ruby version: 3.1.4

Explanation

When we call #attach on an already large Active Storage collection, it first calls record.public_send("#{name}=", blobs + attachables.flatten) # e.g. record.highlights=.... This eventually ends up calling CollectionAssociation#replace_common_records_in_memory, which was identified as a performance problem in #46652. It ends up calling Array#index n times and #== n*(n-1)/2 times. In this case, with n = 1000, that is 499500 times.
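To make the cost concrete, here is a plain-Ruby sketch of the access pattern described above. It is not the Rails code: CountingRecord is a hypothetical stand-in for a record that counts its own #== calls, and the Hash-based lookup is only an illustrative alternative, not the proposed fix.

```ruby
# Hypothetical stand-in for a record; it only counts how often #== is called.
class CountingRecord
  class << self
    attr_accessor :comparisons
  end
  self.comparisons = 0

  attr_reader :id

  def initialize(id)
    @id = id
  end

  def ==(other)
    self.class.comparisons += 1
    other.is_a?(CountingRecord) && id == other.id
  end
end

n = 1000
existing = Array.new(n) { |i| CountingRecord.new(i) }
# Assignment-style replace: the new list is every old record plus one new one.
incoming = existing + [CountingRecord.new(n)]

# Pattern from the issue: one Array#index scan per incoming record.
CountingRecord.comparisons = 0
positions_scan = incoming.map { |record| existing.index(record) }
quadratic_comparisons = CountingRecord.comparisons

# Illustrative alternative (an assumption, not the Rails fix):
# build an id => position table once, then do constant-time lookups.
CountingRecord.comparisons = 0
position_by_id = existing.each_with_index.map { |record, i| [record.id, i] }.to_h
positions_hash = incoming.map { |record| position_by_id[record.id] }
hashed_comparisons = CountingRecord.comparisons

puts "Array#index scan: #{quadratic_comparisons} calls to #=="
puts "Hash lookup:      #{hashed_comparisons} calls to #=="
```

Both approaches find the same positions, but the scan's comparison count grows roughly with n*n/2 while the table lookup never calls #== on a record.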

For ActiveRecord has_many collections, this is not a huge problem because, in my experience, post.comments = new_comments is not a common pattern. The general use case is post.comments << new_comment, which does pretty well performance-wise.

Unfortunately, for has_many_attached collections, calling #attach is a common use case, and it calls record.things_attachments= under the hood. As an aside, it seems other people are running into this problem.

Flamegraph

[Image: flamegraph_1]

Possible solutions

ANOTHER problem

As the flamegraph shows, there is another O(N^2) operation in #attach. That one comes from Attached::Changes::CreateOneOfMany#find_attachment. I haven't figured out a possible solution for it... I don't understand the change tracking(?) Active Storage is doing, but if someone can help me understand it, I can try fixing it. I assume this will probably need a new Attached::Changes::AttachMany and Attached::Changes::AttachOne?

Workaround

I have found that creating blobs and attachments manually gets rid of BOTH problems and doesn't leave highlights_attachments and highlights_blobs stale.

user_3 = User.create!(name: "Lauren")
user_3.highlights.attach(
  1000.times.map do |i|
    {
      io: StringIO.new("Example string inside text_file_#{i}"),
      filename: "text_file_#{i}.txt",
      content_type: "text/plain",
    }
  end
)
user_3.save!
user_3.reload

# The operation below takes about 0.03 seconds
ApplicationRecord.transaction do
  blob = ActiveStorage::Blob.create_and_upload!(
    io: StringIO.new("another text file. wow."),
    filename: "text_file_another.txt",
    content_type: "text/plain",
  )
  user_3.highlights_attachments.create!(
    blob_id: blob.id,
    name: 'highlights',
  )
  user_3.save!
end

Final performance stats

                                              user     system      total        real
attach performance without fix            4.711782   0.013564   4.725346 (  4.760098)
attach performance with half-ish fix      1.855833   0.009120   1.864953 (  1.901090)
attach performance manual                 0.024790   0.002624   0.027414 (  0.030269)

cc: @jonathanhefner, @jeffcarbs, @danny-pflughoeft

@rails-bot
rails-bot bot commented Apr 22, 2024

This issue has been automatically marked as stale because it has not been commented on for at least three months.
The resources of the Rails team are limited, and so we are asking for your help.
If you can still reproduce this error on the 7-1-stable branch or on main, please reply with all of the information you have about it in order to keep the issue open.
Thank you for all your contributions.

@rails-bot rails-bot bot added the stale label Apr 22, 2024