Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

Already on GitHub? Sign in to your account

Perfomance regression using 'includes' on 3.1-stable and master #1873

Closed
slbug opened this Issue Jun 28, 2011 · 13 comments

Comments

Projects
None yet
4 participants
Contributor

slbug commented Jun 28, 2011

I have complex hmt associations and using 'includes' to load it. Every request takes about 15 seconds to complete. In logs i see not slow queries (max 100ms), but high cpu and memory load. When i removed 'includes' it responses in about 200ms (vs 15s).

P.S. Identity map enabled. Postgres adapter

If you need sample app with data, i can povide it.

Contributor

dyba commented Jun 28, 2011

Yeah, you should post that information so that we know what you are talking about.

Contributor

slbug commented Jun 28, 2011

https://github.com/slbug/rails-issue-1873

test application. database dump in dump.sql. You can see issue using rails console.

Without includes

Group.select("DISTINCT ON(\"#{Group.table_name}\".\"id\") \"#{Group.table_name}\".*").joins({ :attributes => :products }).where({ :attributes => { :products => { :id => 1 } } })

With includes

Group.select("DISTINCT ON(\"#{Group.table_name}\".\"id\") \"#{Group.table_name}\".*").joins({ :attributes => :products }).where({ :attributes => { :products => { :id => 1 } } }).includes({ :attributes => :products })

If you run those commands in console you will see difference in perfomance.

Contributor

slbug commented Jun 28, 2011

Also you can just open site (not console) to see this issue, just uncomment line in product model. But if there is only 1-2 products, then it will be ok, but 15+ products slows it. So i suggest to restore my dump. May be it postgres adapter issue, i'm not sure.

@jonleighton jonleighton was assigned Jun 28, 2011

Member

jonleighton commented Jun 28, 2011

Thanks for the report. I rewrote the association preloader for 3.1 and it looks like that introduced this performance regression. I can repro with the more simple query of Group.includes(:attributes => :products).to_a, but Group.includes(:attributes).to_a is much quicker so it looks like nested preloading is introducing some exponential complexity. I'll fix this ASAP.

@tenderlove tenderlove was assigned Jun 30, 2011

@jonleighton jonleighton added a commit that referenced this issue Jul 7, 2011

@jonleighton jonleighton Don't construct association scope in initializer. This yields a big p…
…erformance gain for cases where the association is never used to load the target, for example with preloading. Related: #1873.
fe2167f

@jonleighton jonleighton added a commit that referenced this issue Jul 7, 2011

@jonleighton jonleighton Don't construct association scope in initializer. This yields a big p…
…erformance gain for cases where the association is never used to load the target, for example with preloading. Related: #1873.
86390c3
Member

jonleighton commented Jul 7, 2011

This is fixed now. There were multiple issues and it ended up being a joint effort between @tenderlove and myself, but you should find that the 3-1-stable and master branches now actually perform better than 3-0-stable. Thanks for reporting the issue.

@jonleighton jonleighton closed this Jul 7, 2011

Contributor

slbug commented Jul 7, 2011

@jonleighton, not sure about this. just updated my bundle (rails master 86390c3)

With includes - "Completed 200 OK in 67758ms (Views: 61647.7ms | ActiveRecord: 6092.3ms" (including only associations that will be used for sure, so it won't load unnecessary data) And still huge memory/cpu consumption.
Without - "Completed 200 OK in 932ms (Views: 783.9ms | ActiveRecord: 141.7ms)"

P.S. Tested on my database (about 40 times bigger than in my example) application. Dev env.

Contributor

slbug commented Jul 7, 2011

Since i can't reopen issue i hope you have got my message...
/cc @jonleighton

@tenderlove tenderlove reopened this Jul 7, 2011

Member

jonleighton commented Jul 7, 2011

@slbug I got your message, I just don't sit around all day responding to Github issues ;)

It looks like most of the time in your test is spent in the view, which seems to indicate it is not an Active Record issue (if the speed is slower that 3.0). Please can you test the same code against 3.0.9 and tell me how long it takes? Is 3.1 slower? Also, please compare the actual Active Record query speed in 3.0 vs 3.1 via the console (so avoiding any other components that may or may not have separate issues).

It might be helpful to chat via IM - you can add me to GTalk/jabber/whatever as jonleighton@jabber.org

Contributor

slbug commented Jul 7, 2011

61647.7ms vs 783.9ms in views for same data. not sure if it's views problem. going to create 3.0.9 test app with same data to test it.

Contributor

slbug commented Jul 8, 2011

Using this in console

Group.select("DISTINCT ON(\"#{Group.table_name}\".\"id\") \"#{Group.table_name}\".*").joins({ :attributes => :products }).where({ :attributes => { :products => { :id => 16958 } } }).includes({ :attributes => :products })

Rails 3.1(master) console:
with includes -

Group Load (2.6ms)
Attribute Load (5.7ms)
Specification Load (6510.0ms)
Product Load (44.8ms)

Full log - https://gist.github.com/1071096
Real execution time was about 45 seconds with 100% cpu load(4 cores) and about 6GB memory used... Even waiting for first log output (Group Load(2.6ms)) tooks about 6-7 seconds.


Rails 3.0.9

without includes - Completed 200 OK in 699ms (Views: 560.2ms | ActiveRecord: 121.9ms)
with - segfault on lib/active_record/dynamic_finder_match.rb:14

rails console:

with includes -
SystemStackError: stack level too deep
from /home/bug/.rvm/gems/ruby-1.9.2-p180/gems/activerecord-3.0.9/lib/active_record/reflection.rb:202

Contributor

slbug commented Jul 8, 2011

Another issue found - preloading should be done via inner joins, not via separate queries

/cc @jonleighton

Contributor

slbug commented Jul 9, 2011

Seems like it not gets rights ids from first query and just tries to load all associated records with Attribute model, then Specification model. And simply ignoring product_id. So it gives huge queries (e.g. loading all products which have some attribute) and this is wrong.

Member

jonleighton commented Jul 9, 2011

Okay, I have looked at this some more. I found two remaining issues.

Firstly, your code essentially looks like this:

Foo.joins(:bar).where(:bar => { :id => X }).includes(:bar)

Given that you reference :bar in the where condition, it is actually a bit meaningless to have joins(:bar) as well as includes(:bar), because the latter should force the loading to be done via joins (rather than separate queries) anyway.

However, you hit upon a bug whereby the presence of both includes and joins would be that the query is done with joins, but then subsequent queries are still executed for the includes.

I have fixed that in the above commit, so your code should now only result in a single query.

The second issue that has been exposed is that, supposing Foo has a :through association named :bar, Foo.includes(:bar).all.map { |f| f.bar } is actually slower (in terms of CPU time within Rails) than Foo.all.map { |f| f.bar }.

I have realised that the reason for this is that the implementation of preloading for :through associations is currently sub-optimal. It will basically preload all the through records, and then iterate over the through records and preload the source records on the through records. This results in two queries rather than one query with a join, and means that all the through records are unnecessarily loaded into memory with the CPU overhead that that incurs.

The solution to this problem is a bit more complicated, so is unlikely to get into 3.1 (as it's not a regression).

I have created #2020 to track this second issue. (It doesn't affect your code though.)

@jonleighton jonleighton closed this Jul 9, 2011

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment