Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve bundler/setup performance again by not deduplicating intermediate results #5533

Merged
merged 2 commits into from
May 16, 2022

Conversation

deivid-rodriguez
Copy link
Member

@deivid-rodriguez deivid-rodriguez commented May 13, 2022

What was the end-user or developer problem that led to this PR?

The performance of bundler/setup is still slow.

What is your fix for the problem, implemented in this PR?

On a different patch, it was noticed by @ngan that we are calling LazySpecification#hash many times, and simply memoizing that led to a very considerable performance improvement in his app.

I noticed though that we shouldn't be calling LazySpecification#hash that many times, and I located the culprit at SpecSet#for where we were deduplicating the partial aggregated result on every iteration. It is enough to do it just once at the end.

This leads on a 12% speedup on Rails repository Gemfile vs the previous 8% I was getting from memoizing LazySpecification#hash. Also, after this patch memoizing LazySpecification#hash has no effect in performance anymore.

Make sure the following tasks are checked

On a different patch, it was noticed Ngam Pham that we are calling
`LazySpecification#hash` many times, and simply memoizing that led to a
very considerable performance improvement in his app.

I noticed though that we shouldn't be calling `LazySpecification#hash`
that many times, and I located the culprit at `SpecSet#for` where we
were deduplicating the partial aggregated result on every iteration. It
is enough to do it just once at the end.

This leads on a 12% speedup on Rails repository Gemfile vs the previous
8% I was getting from memoizing `LazySpecification#hash`. Also, after
this patch memoizing `LazySpecification#hash` has no effect in
performance anymore.

Co-authored-by: Ngan Pham <ngan@users.noreply.github.com>
On `rails/rails` repository Gemfile, running the following script

```
# script.rb
require "bundler/setup"
```

#### Before

```
➜  rails git:(main) ✗ BUNDLER_VERSION=2.4.0.dev ruby-memory-profiler --pretty --no-detailed --allocated-strings=0 --retained-strings=0 script.rb
Total allocated: 24.37 MB (207937 objects)
Total retained:  2.98 MB (34152 objects)
```

#### After

```
➜  rails git:(main) ✗ BUNDLER_VERSION=2.4.0.dev ruby-memory-profiler --pretty --no-detailed --allocated-strings=0 --retained-strings=0 script.rb
Total allocated: 22.27 MB (206856 objects)
Total retained:  2.98 MB (34152 objects)
```

Co-authored-by: Josh Nichols <josh.nichols@gusto.com>
@technicalpickles
Copy link
Contributor

Compared this to 2.3.9 and 2.3.13 with our Gemfile:

# output is bundler version, and the output of `Benchmark.measure`
2.3.9
  2.789033   0.243557   3.032590 (  3.830557)

2.3.13
  1.328539   0.277676   1.606215 (  2.631212)

2.4.0.dev
  0.923131   0.097992   1.021123 (  1.024666)

So, looks good!

technicalpickles added a commit to technicalpickles/rubygems that referenced this pull request May 13, 2022
Looking over flamegraphs for `require 'bundler/setup'`, I'm seeing
`Bundler::DepProxy#name` show up quite often. The method itself is really
simple, delegating to the dependency's proxy.

I suspect it is getting called enough that memoizing the value improves
will improve performance by saving a method call, in exchange for saving
the value in memory.

When testing with this patch plus rubygems#5533
I saw time go from 0.92s to 0.75s.
technicalpickles added a commit to technicalpickles/rubygems that referenced this pull request May 13, 2022
…andled deps

I was looking at (yet another) flamegraph in speedscope, and used the
'left hand heavy' and was shocked to realize that 0.5s of the 1.7s
is spent in DepProxy#name. This method _only_ delegates the name to an
underlying spec, so it's not complex at all.

It seems to be of how often this line ends up calling it:

     next if handled.any?{|d| d.name == dep.name && (match_current_platform || d.__platform == dep.__platform) } || dep.name == "bundler"

The `handled` array is built up as dependencies are handled, so this get
slower as more dependencies are installed.

This change changes how `handled` is track. Instead of just an array, I've
tried using a Hash, with the key being a dep's name, and the value being
a list of deps with that name. This means it's constant time to find
the dependencies with the same name.

I saw a drop from 1.7s to 1.0s against master, and from 0.95s to 0.24s
when used with rubygems#5533
@deivid-rodriguez deivid-rodriguez merged commit b0958db into master May 16, 2022
@deivid-rodriguez deivid-rodriguez deleted the alt-speedup branch May 16, 2022 08:24
deivid-rodriguez pushed a commit to technicalpickles/rubygems that referenced this pull request May 16, 2022
…andled deps

I was looking at (yet another) flamegraph in speedscope, and used the
'left hand heavy' and was shocked to realize that 0.5s of the 1.7s
is spent in DepProxy#name. This method _only_ delegates the name to an
underlying spec, so it's not complex at all.

It seems to be of how often this line ends up calling it:

     next if handled.any?{|d| d.name == dep.name && (match_current_platform || d.__platform == dep.__platform) } || dep.name == "bundler"

The `handled` array is built up as dependencies are handled, so this get
slower as more dependencies are installed.

This change changes how `handled` is track. Instead of just an array, I've
tried using a Hash, with the key being a dep's name, and the value being
a list of deps with that name. This means it's constant time to find
the dependencies with the same name.

I saw a drop from 1.7s to 1.0s against master, and from 0.95s to 0.24s
when used with rubygems#5533
@deivid-rodriguez deivid-rodriguez changed the title Improve bundler/setup performance again Improve bundler/setup performance again by not deduplicating intermediate results May 18, 2022
deivid-rodriguez added a commit that referenced this pull request May 18, 2022
Improve `bundler/setup` performance again

(cherry picked from commit b0958db)
technicalpickles added a commit to technicalpickles/rubygems that referenced this pull request May 29, 2022
…andled deps

I was looking at (yet another) flamegraph in speedscope, and used the
'left hand heavy' and was shocked to realize that 0.5s of the 1.7s
is spent in DepProxy#name. This method _only_ delegates the name to an
underlying spec, so it's not complex at all.

It seems to be of how often this line ends up calling it:

     next if handled.any?{|d| d.name == dep.name && (match_current_platform || d.__platform == dep.__platform) } || dep.name == "bundler"

The `handled` array is built up as dependencies are handled, so this get
slower as more dependencies are installed.

This change changes how `handled` is track. Instead of just an array, I've
tried using a Hash, with the key being a dep's name, and the value being
a list of deps with that name. This means it's constant time to find
the dependencies with the same name.

I saw a drop from 1.7s to 1.0s against master, and from 0.95s to 0.24s
when used with rubygems#5533
deivid-rodriguez pushed a commit to technicalpickles/rubygems that referenced this pull request Jun 18, 2022
…andled deps

I was looking at (yet another) flamegraph in speedscope, and used the
'left hand heavy' and was shocked to realize that 0.5s of the 1.7s
is spent in DepProxy#name. This method _only_ delegates the name to an
underlying spec, so it's not complex at all.

It seems to be of how often this line ends up calling it:

     next if handled.any?{|d| d.name == dep.name && (match_current_platform || d.__platform == dep.__platform) } || dep.name == "bundler"

The `handled` array is built up as dependencies are handled, so this get
slower as more dependencies are installed.

This change changes how `handled` is track. Instead of just an array, I've
tried using a Hash, with the key being a dep's name, and the value being
a list of deps with that name. This means it's constant time to find
the dependencies with the same name.

I saw a drop from 1.7s to 1.0s against master, and from 0.95s to 0.24s
when used with rubygems#5533
matzbot pushed a commit to ruby/ruby that referenced this pull request Jun 19, 2022
…ing hash lookup of handled deps

I was looking at (yet another) flamegraph in speedscope, and used the
'left hand heavy' and was shocked to realize that 0.5s of the 1.7s
is spent in DepProxy#name. This method _only_ delegates the name to an
underlying spec, so it's not complex at all.

It seems to be of how often this line ends up calling it:

     next if handled.any?{|d| d.name == dep.name && (match_current_platform || d.__platform == dep.__platform) } || dep.name == "bundler"

The `handled` array is built up as dependencies are handled, so this get
slower as more dependencies are installed.

This change changes how `handled` is track. Instead of just an array, I've
tried using a Hash, with the key being a dep's name, and the value being
a list of deps with that name. This means it's constant time to find
the dependencies with the same name.

I saw a drop from 1.7s to 1.0s against master, and from 0.95s to 0.24s
when used with rubygems/rubygems#5533

rubygems/rubygems@844dac30d4
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants