Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix #12537 performance regression when preloading has_many_through association #19423

Conversation

yuroyoro
Copy link
Contributor

Fix #12537

I profiled the benchmark code that preloads has_many_through association,
and found that most of time spent by calling CollectionAssociation#reader method in associated_records_by_owner.

Here is a result of profiling associated_records_by_owner method.

CollectionAssociation#reader creates new CollectionProxy object if it doesn't exists.
But, instantiate CollectionProxy is expensive operation.
Because creating the Relation object by calling Association#scope is expensive,
and merging it into CollectionProxy is also.

c86a32d#commitcomment-5384529

In this case, the association's records are already loaded by the preloader.
It just read records by Association#target that returns loaded records,
instead of reader method.

Here are benchmark results.

================================================================================
3.2.21 - postgresql
================================================================================

Calculating -------------------------------------
  limit 100 has_many    12.000  i/100ms
limit 100 has_many_through
                         5.000  i/100ms
limit 100 double has_many_through
                         3.000  i/100ms
 limit 1000 has_many     1.000  i/100ms
limit 1000 has_many_through
                         1.000  i/100ms
limit 1000 double has_many_through
                         1.000  i/100ms
-------------------------------------------------
  limit 100 has_many    129.401  (± 3.9%) i/s -    648.000
limit 100 has_many_through
                         53.440  (± 7.5%) i/s -    270.000
limit 100 double has_many_through
                         31.399  (± 6.4%) i/s -    156.000
 limit 1000 has_many     15.410  (± 6.5%) i/s -     77.000
limit 1000 has_many_through
                          6.097  (± 0.0%) i/s -     31.000
limit 1000 double has_many_through
                          3.520  (± 0.0%) i/s -     18.000

================================================================================
5.0.0.alpha - postgresql
================================================================================

Calculating -------------------------------------
  limit 100 has_many     8.000  i/100ms
limit 100 has_many_through
                         1.000  i/100ms
limit 100 double has_many_through
                         1.000  i/100ms
 limit 1000 has_many     1.000  i/100ms
limit 1000 has_many_through
                         1.000  i/100ms
limit 1000 double has_many_through
                         1.000  i/100ms
-------------------------------------------------
  limit 100 has_many     92.615  (± 4.3%) i/s -    464.000
limit 100 has_many_through
                          8.574  (± 0.0%) i/s -     43.000
limit 100 double has_many_through
                          5.300  (± 0.0%) i/s -     27.000
 limit 1000 has_many     12.712  (± 0.0%) i/s -     64.000
limit 1000 has_many_through
                          0.892  (± 0.0%) i/s -      5.000  in   5.610751s
limit 1000 double has_many_through
                          0.516  (± 0.0%) i/s -      3.000  in   5.814872s

================================================================================
5.0.0.alpha (after) - postgresql
================================================================================

Calculating -------------------------------------
  limit 100 has_many     8.000  i/100ms
limit 100 has_many_through
                         3.000  i/100ms
limit 100 double has_many_through
                         1.000  i/100ms
 limit 1000 has_many     1.000  i/100ms
limit 1000 has_many_through
                         1.000  i/100ms
limit 1000 double has_many_through
                         1.000  i/100ms
-------------------------------------------------
  limit 100 has_many     91.012  (± 4.4%) i/s -    456.000
limit 100 has_many_through
                         31.822  (± 3.1%) i/s -    159.000
limit 100 double has_many_through
                         19.298  (± 5.2%) i/s -     97.000
 limit 1000 has_many     10.621  (± 0.0%) i/s -     54.000
limit 1000 has_many_through
                          3.767  (± 0.0%) i/s -     19.000
limit 1000 double has_many_through
                          2.144  (± 0.0%) i/s -     11.000

In the benchmark case "limit 1000 double has_many_through", it still slower than 3.2, but faster then master.

This is a benchmarks code that I used.

https://gist.github.com/yuroyoro/89fae1150e285658db45

@@ -18,7 +18,8 @@ def associated_records_by_owner(preloader)
through_records = owners.map do |owner|
association = owner.association through_reflection.name

[owner, Array(association.reader)]
center = association.loaded? ? association.target : association.reader
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can extract this logic into a method and use it at the both the places.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I created a method target_records_from_association and put that line into it.

@kuldeepaggarwal
Copy link
Contributor

@yuroyoro Can we add a test case for this?

@yuroyoro
Copy link
Contributor Author

I wrote test that asserts the reader method should never called when preloading has_many_through records.

I think this assertion will be failed when changing behavior of preloader on future.
So I commented why the preloader avoids calling reader in this case.
For helping decision that this assertion should be modified or removed .

@kuldeepaggarwal
Copy link
Contributor

Please squash your commits.

@yuroyoro yuroyoro force-pushed the fix_performance_regression_of_preloading_has_many_through_relation branch from aa292fd to 97a6aba Compare March 20, 2015 09:02
@yuroyoro
Copy link
Contributor Author

Sorry, I squashed

@amatsuda
Copy link
Member

@sgrif This looks great. Can you take a look?

For performance, Avoid instantiate CollectionProxy.
Fixes rails#12537
@yuroyoro yuroyoro force-pushed the fix_performance_regression_of_preloading_has_many_through_relation branch from 97a6aba to 842f5c2 Compare April 17, 2015 02:20
@mrbrdo
Copy link
Contributor

mrbrdo commented Dec 7, 2015

@rafaelfranca this is quite old, what is the status? has this already been fixed?

tenderlove added a commit that referenced this pull request Dec 19, 2015
…_preloading_has_many_through_relation

Fix #12537 performance regression when preloading has_many_through association
@tenderlove tenderlove merged commit 914a45b into rails:master Dec 19, 2015
@@ -89,6 +90,10 @@ def through_scope

scope
end

def target_records_from_association(association)
association.loaded? ? association.target : association.reader
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@yuroyoro why do you check against loaded? here? It seems that association is always loaded in this case and all tests remain green when I delete it. Can you add a test that fails without it?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Major performance regression when preloading has_many_through association
7 participants