Skip to content
This repository has been archived by the owner on Apr 14, 2021. It is now read-only.

No longer automatically subtract 1 from user-specified --jobs #5808

Closed

Conversation

tjschuck
Copy link

@tjschuck tjschuck commented Jun 22, 2017

This argument comes in two parts, but luckily, the first one is both easier to understand and hopefully to agree with.

Background: Bundler 1.4.0 added support for parallel installation via the --jobs param. Soon after, this blog post (probably greatly amplified by this Thoughtbot blog post) popularized the recommendation "set --jobs to nproc - 1".

Not long after, probably also inspired by the popularity of this tip, this "n - 1 jobs" advice got codified into Bundler itself: 66acd02

However, my assertion here is Bundler should not do that.

The first argument (the easy one) is just that it's not doing what the user asks for. For all the people following the (seemingly popular) tip to set their jobs to nproc - 1, they're actually ending up with the probably-worse - 2. Even worse than that, if a user does a conservative --jobs 2, they're getting no benefit — Bundler is quietly taking their parallelization back down to "no parallelization".

Hopefully that's a sufficient argument on its own, but the part II is that this blanket advice is probably out-of-date anyway.

Using this script, I repeatedly installed 29 gems (installing them to a vendor/ dir and deleting it in between runs). I averaged the time over 10 runs per --jobs value, but the trend holds regardless of how many runs you do.

Note that these numbers are for a machine with 2 physical cores and 4 virtual ones (a Mac, reporting 2 and 4 respectively from sysctl -n hw.physicalcpu and sysctl -n hw.ncpu, the latter corresponding to Linux's nproc).

~/Code/tmp/bundler_jobs_bench ☠  ./bundler_jobs_bench.sh

Installing 29 gems repeatedly...

===============================================
Using Bundler version 1.15.1 (current release version)
===============================================

--jobs 1  5.58435780 seconds on average (across 10 runs)
--jobs 2  5.35010690 seconds on average (across 10 runs)
--jobs 3  3.93493610 seconds on average (across 10 runs)
--jobs 4  3.86082760 seconds on average (across 10 runs)
--jobs 5  3.24673650 seconds on average (across 10 runs)
--jobs 6  3.49340190 seconds on average (across 10 runs)
--jobs 7  3.26473430 seconds on average (across 10 runs)
--jobs 8  3.34560500 seconds on average (across 10 runs)

===============================================
Using development version (no more n - 1 jobs)
===============================================

--jobs 1  4.32629540 seconds on average (across 10 runs)
--jobs 2  3.48100690 seconds on average (across 10 runs)
--jobs 3  3.30937880 seconds on average (across 10 runs)
--jobs 4  3.30868200 seconds on average (across 10 runs)
--jobs 5  3.54932920 seconds on average (across 10 runs)
--jobs 6  3.36123440 seconds on average (across 10 runs)
--jobs 7  3.96490630 seconds on average (across 10 runs)
--jobs 8  3.39955640 seconds on average (across 10 runs)

From the above, you can see:

  1. In the first block, no notable change between --jobs 1 and --jobs 2 — that's because they're currently the same.

  2. In both, a best time corresponding to the value that (effectively) matches nproc, not nproc - 1.

  3. Regardless of nproc coming out best in this run, there is close enough performance among the range of nproc - 1 through to nproc * 2 that it doesn't seem like anything in particular (like the - 1 removed in this commit) should be codified — people seeking to particularly optimize their bundler runtimes should do their own tweaking of the value, and it should be respected as given.

This argument comes in two parts, but luckily, the first one is both easier to understand and hopefully to agree with.

Background: Bundler 1.4.0 added support for parallel installation via the `--jobs` param.  Soon after, [this blog post](http://archlever.blogspot.com/2013/09/lies-damned-lies-and-truths-backed-by.html) (probably greatly amplified by [this Thoughtbot blog post](https://robots.thoughtbot.com/parallel-gem-installing-using-bundler)) popularized the recommendation "set `--jobs` to `nproc - 1`".

Not long after, probably also inspired by the popularity of this tip, this "n - 1 jobs" advice got codified into Bundler itself: rubygems@66acd02

However, my assertion here is _Bundler should not do that_.

The first argument (the easy one) is just that it's not doing what the user asks for.  For all the people following the (seemingly popular) tip to set their jobs to `nproc - 1`, they're actually ending up with the probably-worse `- 2`.  Even worse than that, if a user does a conservative `--jobs 2`, they're getting _no benefit_ — Bundler is quietly taking their parallelization back down to "no parallelization".

Hopefully that's a sufficient argument on its own, but the part II is that this blanket advice is probably out-of-date anyway.

Using [this script](https://gist.github.com/tjschuck/ca1d37a8869d1cc01313171b4b318094), I repeatedly installed 29 gems (installing them to a `vendor/` dir and deleting it in between runs).  I averaged the time over 10 runs per --jobs value, but the trend holds regardless of how many runs you do.

Note that these numbers are for a machine with 2 physical cores and 4 virtual ones (a Mac, reporting 2 and 4 respectively from `sysctl -n hw.physicalcpu` and `sysctl -n hw.ncpu`, the latter corresponding to Linux's `nproc`).

```
~/Code/tmp/bundler_jobs_bench ☠  ./bundler_jobs_bench.sh

Installing 29 gems repeatedly...

===============================================
Using Bundler version 1.15.1 (current release version)
===============================================

--jobs 1  5.58435780 seconds on average (across 10 runs)
--jobs 2  5.35010690 seconds on average (across 10 runs)
--jobs 3  3.93493610 seconds on average (across 10 runs)
--jobs 4  3.86082760 seconds on average (across 10 runs)
--jobs 5  3.24673650 seconds on average (across 10 runs)
--jobs 6  3.49340190 seconds on average (across 10 runs)
--jobs 7  3.26473430 seconds on average (across 10 runs)
--jobs 8  3.34560500 seconds on average (across 10 runs)

===============================================
Using development version (no more n - 1 jobs)
===============================================

--jobs 1  4.32629540 seconds on average (across 10 runs)
--jobs 2  3.48100690 seconds on average (across 10 runs)
--jobs 3  3.30937880 seconds on average (across 10 runs)
--jobs 4  3.30868200 seconds on average (across 10 runs)
--jobs 5  3.54932920 seconds on average (across 10 runs)
--jobs 6  3.36123440 seconds on average (across 10 runs)
--jobs 7  3.96490630 seconds on average (across 10 runs)
--jobs 8  3.39955640 seconds on average (across 10 runs)
```

From the above, you can see:

1. In the first block, no notable change between `--jobs 1` and `--jobs 2` — that's because they're currently the same.

2. In both, a best time corresponding to the value that (effectively) matches nproc, _not_ nproc - 1.

3. Regardless of nproc coming out best in this run, there is close enough performance among the range of `nproc - 1` through to `nproc * 2` that it doesn't seem like anything in particular (like the `- 1` removed in this commit) should be codified — people seeking to particularly optimize their bundler runtimes should do their own tweaking of the value, and it should be respected as given.
@ghost
Copy link

ghost commented Jun 22, 2017

Thanks for opening a pull request and helping make Bundler better! Someone from the Bundler team will take a look at your pull request shortly and leave any feedback. Please make sure that your pull request has tests for any changes or added functionality.

We use Travis CI to test and make sure your change works functionally and uses acceptable conventions, you can review the current progress of Travis CI in the PR status window below.

If you have any questions or concerns that you wish to ask, feel free to leave a comment in this PR or join our #bundler channel on Slack.

For more information about contributing to the Bundler project feel free to review our CONTRIBUTING guide

@segiddins
Copy link
Member

@indirect I think this PR is yours to review

The previous version of this test was slightly broken in a few ways:

1. The worker output (the preceeding `"x: "` that was being checked by `expect(out).to match(/[1-3]: /)`) is 0-indexed, so it should have at the very least been `[0-3]`.

2. However, the tests only checked that _any_ worker had _any_ output value of 1, 2, or 3. They could have all been "1" and the test would still pass.  Specifically, in the old version of this test that checked `expect(out).to match(/[1-3]: /)`, "3" was never output.

3. The final `expect(out).to match(/: "4"/)` test is only testing the echoing back of the jobs param, and was not actually testing that any parallelism occurred.

This commit lowers the `jobs` param down to 2 instead of 4 to try to guarantee that the test will always hit "full" parallelism, and then checks that there are actual workers for both `0` and `1`, not just "any" workers.
@tjschuck
Copy link
Author

I just pushed up 9831fe9 to actually red/green test this as well. I couldn't figure out the tests at first, because it turned out they were subtly broken in a way that made it non-obvious that this n-1ing was happening at all.

The previous version of this test was slightly broken in a few ways:

  1. The worker output (the preceeding "x: " that was being checked by expect(out).to match(/[1-3]: /)) is 0-indexed, so it should have at the very least been [0-3].

  2. However, the tests only checked that any worker had any output value of 1, 2, or 3. They could have all been "1" and the test would still pass. Specifically, in the old version of this test that checked expect(out).to match(/[1-3]: /), "3" was never output.

  3. The final expect(out).to match(/: "4"/) test is only testing the echoing back of the jobs param, and was not actually testing that any parallelism occurred.

9831fe9 lowers the jobs param down to 2 instead of 4 to try to guarantee that the test will always hit "full" parallelism, and then checks that there are actual workers for both 0 and 1, not just "any" workers.

@indirect
Copy link
Member

I agree with the argument made here—let's use the value of jobs as given in 2.0, rather than subtracting 1.

Relatedly, I think 2.0 should default jobs to either the number of processors or nproc-1, changing parallel installation from opt in to opt out instead.

@segiddins
Copy link
Member

@indirect something like this?

diff --git a/lib/bundler/feature_flag.rb b/lib/bundler/feature_flag.rb
index b07ea4967..733dc265d 100644
--- a/lib/bundler/feature_flag.rb
+++ b/lib/bundler/feature_flag.rb
@@ -15,6 +15,7 @@ module Bundler
     (1..10).each {|v| define_method("bundler_#{v}_mode?") { major_version >= v } }
 
     settings_flag(:allow_offline_install) { bundler_2_mode? }
+    settings_flag(:auto_config_jobs) { bundler_2_mode? }
     settings_flag(:error_on_stderr) { bundler_2_mode? }
     settings_flag(:init_gems_rb) { bundler_2_mode? }
     settings_flag(:only_update_to_newer_versions) { bundler_2_mode? }
diff --git a/lib/bundler/installer.rb b/lib/bundler/installer.rb
index 3212a6821..eafaf64fd 100644
--- a/lib/bundler/installer.rb
+++ b/lib/bundler/installer.rb
@@ -164,11 +164,18 @@ module Bundler
     def install(options)
       load_plugins
       force = options["force"]
-      jobs = 1
-      jobs = [Bundler.settings[:jobs].to_i, 1].max if can_install_in_parallel?
       install_in_parallel jobs, options[:standalone], force
     end
 
+    def jobs
+      return 1 unless can_install_in_parallel?
+      if set_jobs = Bundler.settings[:jobs]
+        [set_jobs, 1].max
+      elsif Bundler.feature_flag.auto_config_jobs?
+        SharedHelpers.processor_count
+      end
+    end
+
     def load_plugins
       Bundler.rubygems.load_plugins
 
diff --git a/lib/bundler/settings.rb b/lib/bundler/settings.rb
index e0eb660a8..9efc2de8e 100644
--- a/lib/bundler/settings.rb
+++ b/lib/bundler/settings.rb
@@ -8,6 +8,7 @@ module Bundler
 
     BOOL_KEYS = %w[
       allow_offline_install
+      auto_config_jobs
       auto_install
       cache_all
       cache_all_platforms
@@ -35,6 +36,7 @@ module Bundler
     ].freeze
 
     NUMBER_KEYS = %w[
+      jobs
       redirect
       retry
       ssl_verify_mode
diff --git a/lib/bundler/shared_helpers.rb b/lib/bundler/shared_helpers.rb
index 39f0423f7..224a310c4 100644
--- a/lib/bundler/shared_helpers.rb
+++ b/lib/bundler/shared_helpers.rb
@@ -76,7 +76,7 @@ module Bundler
 
       keys.each {|key| ENV.delete(key) }
 
-      block.call
+      yield
     ensure
       keys.each {|key| ENV[key] = old_env[key] }
     end
@@ -111,13 +111,13 @@ module Bundler
     def filesystem_access(path, action = :write, &block)
       # Use block.call instead of yield because of a bug in Ruby 2.2.2
       # See https://github.com/bundler/bundler/issues/5341 for details
-      block.call(path.dup.untaint)
+      yield(path.dup.untaint)
     rescue Errno::EACCES
       raise PermissionError.new(path, action)
     rescue Errno::EAGAIN
       raise TemporaryResourceError.new(path, action)
     rescue Errno::EPROTO
-      raise VirtualProtocolError.new
+      raise VirtualProtocolError
     rescue Errno::ENOSPC
       raise NoSpaceOnDeviceError.new(path, action)
     rescue *[const_get_safely(:ENOTSUP, Errno)].compact
@@ -153,7 +153,7 @@ module Bundler
 
     def trap(signal, override = false, &block)
       prior = Signal.trap(signal) do
-        block.call
+        yield
         prior.call unless override
       end
     end
@@ -176,6 +176,13 @@ module Bundler
         "\nEither installing with `--full-index` or running `bundle update #{spec.name}` should fix the problem."
     end
 
+    def processor_count
+      require "etc"
+      Etc.nprocessors
+    rescue
+      1
+    end
+
   private
 
     def find_gemfile(order_matters = false)

@indirect
Copy link
Member

@segiddins yeah, that looks about right to me. I was thinking we might need to borrow this CPU counter, but I'm not sure how it compares to the one inside etc: https://github.com/celluloid/celluloid/blob/f8e48978f4a299148df65ec2402aaabcdceb7607/lib/celluloid/internals/cpu_counter.rb

@tjschuck
Copy link
Author

It looks like Etc.nprocessors might only be for *nix-y systems and might not work well with Windows, but I don't have a Windows machine to check.

The linked Celluloid implementation also has a special implementation for JRuby, but it looks like JRuby has natively implemented Etc::nprocessors.

@segiddins
Copy link
Member

I'm fine with the auto-discovery only working where Etc.nprocessors is implemented, since if it doesn't work the user can always just configure jobs and I'd rather we not introduce platform-specific code that we won't be testing on CI

@indirect
Copy link
Member

indirect commented Jun 22, 2017

I feel really bad about the state of our windows support, but I'm willing to admit that is unrelated to this ticket. :P Using Etc seems fine. 👍🏻

@tjschuck
Copy link
Author

tjschuck commented Jun 22, 2017

Sorry, just to clarify, this is a PR, not just an issue :)

Re:

let's use the value of jobs as given in 2.0, rather than subtracting 1.

Is master currently targeting 2.0, or are you saying this shouldn't be pulled into the current working version and should be reserved for just 2.0?

I agree that "default to nproc in 2.0" is a separate issue, and I can fix that separately if you'd like (using roughly what @segiddins shared above, unless he's already on it), but I think the "don't automatically n-1 the user-provided value" should go into whatever the upcoming version is, 2.0 or otherwise.

Or are you suggesting I push the "default to nproc in 2.0" directly to this PR? I can do that as well.

@indirect
Copy link
Member

@tjschuck sorry, I did realize this is a PR! Thank you for sending it. 😁 I'm not comfortable with changing the way we handle JOBS in a minor release, especially given that it has been the way it is now for many years.

However, your argument is convincing, and I think we should change how we treat the JOBS setting in 2.0. The master branch is currently accepting changes for 2.0 via feature-flagging, so this is a reasonable place to send your PR.

I am also suggesting that, along with changing how we handle JOBS, it should default directly to nproc in 2.0. If you want to implement that as well, that would be 💯.

@bundlerbot
Copy link
Collaborator

☔ The latest upstream changes (presumably #5800) made this pull request unmergeable. Please resolve the merge conflicts.

@segiddins
Copy link
Member

Please see #5986 for an alternative implementation

bundlerbot added a commit that referenced this pull request Sep 6, 2017
[2.0] Auto-configure job count

Closes #5808.

The description of that issue, copied verbatim:

This argument comes in two parts, but luckily, the first one is both easier to understand and hopefully to agree with.

Background: Bundler 1.4.0 added support for parallel installation via the `--jobs` param.  Soon after, [this blog post](http://archlever.blogspot.com/2013/09/lies-damned-lies-and-truths-backed-by.html) (probably greatly amplified by [this Thoughtbot blog post](https://robots.thoughtbot.com/parallel-gem-installing-using-bundler)) popularized the recommendation "set `--jobs` to `nproc - 1`".

Not long after, probably also inspired by the popularity of this tip, this "n - 1 jobs" advice got codified into Bundler itself: 66acd02

However, my assertion here is _Bundler should not do that_.

The first argument (the easy one) is just that it's not doing what the user asks for.  For all the people following the (seemingly popular) tip to set their jobs to `nproc - 1`, they're actually ending up with the probably-worse `- 2`.  Even worse than that, if a user does a conservative `--jobs 2`, they're getting _no benefit_ — Bundler is quietly taking their parallelization back down to "no parallelization".

Hopefully that's a sufficient argument on its own, but the part II is that this blanket advice is probably out-of-date anyway.

Using [this script](https://gist.github.com/tjschuck/ca1d37a8869d1cc01313171b4b318094), I repeatedly installed 29 gems (installing them to a `vendor/` dir and deleting it in between runs).  I averaged the time over 10 runs per --jobs value, but the trend holds regardless of how many runs you do.

Note that these numbers are for a machine with 2 physical cores and 4 virtual ones (a Mac, reporting 2 and 4 respectively from `sysctl -n hw.physicalcpu` and `sysctl -n hw.ncpu`, the latter corresponding to Linux's `nproc`).

```
~/Code/tmp/bundler_jobs_bench ☠  ./bundler_jobs_bench.sh

Installing 29 gems repeatedly...

===============================================
Using Bundler version 1.15.1 (current release version)
===============================================

--jobs 1  5.58435780 seconds on average (across 10 runs)
--jobs 2  5.35010690 seconds on average (across 10 runs)
--jobs 3  3.93493610 seconds on average (across 10 runs)
--jobs 4  3.86082760 seconds on average (across 10 runs)
--jobs 5  3.24673650 seconds on average (across 10 runs)
--jobs 6  3.49340190 seconds on average (across 10 runs)
--jobs 7  3.26473430 seconds on average (across 10 runs)
--jobs 8  3.34560500 seconds on average (across 10 runs)

===============================================
Using development version (no more n - 1 jobs)
===============================================

--jobs 1  4.32629540 seconds on average (across 10 runs)
--jobs 2  3.48100690 seconds on average (across 10 runs)
--jobs 3  3.30937880 seconds on average (across 10 runs)
--jobs 4  3.30868200 seconds on average (across 10 runs)
--jobs 5  3.54932920 seconds on average (across 10 runs)
--jobs 6  3.36123440 seconds on average (across 10 runs)
--jobs 7  3.96490630 seconds on average (across 10 runs)
--jobs 8  3.39955640 seconds on average (across 10 runs)
```

From the above, you can see:

1. In the first block, no notable change between `--jobs 1` and `--jobs 2` — that's because they're currently the same.

2. In both, a best time corresponding to the value that (effectively) matches nproc, _not_ nproc - 1.

3. Regardless of nproc coming out best in this run, there is close enough performance among the range of `nproc - 1` through to `nproc * 2` that it doesn't seem like anything in particular (like the `- 1` removed in this commit) should be codified — people seeking to particularly optimize their bundler runtimes should do their own tweaking of the value, and it should be respected as given.
@swrobel
Copy link

swrobel commented Apr 13, 2018

I would just like to point out how supremely infuriating it is to learn that bundler takes an argument where I very clearly specify a number, and then says "no, you don't actually want that number, what you really want is that number minus one, because REASONS! oh, but not really because I'm not even telling you that I'm doing this!!!"

@indirect
Copy link
Member

@swrobel really? you came here to be mad about something that we already agree is bad, and already fixed six months ago?

Additionally, your tone here is outside what's allowed by our code of conduct. Please stop, so we won't have to take any enforcement action. Thanks.

@swrobel
Copy link

swrobel commented Apr 13, 2018

I just read your entire code of conduct, republished here below:

The use of sexualized language or imagery and unwelcome sexual attention or advances, including when simulated online. The only exception to sexual topics is channels/spaces specifically for topics of sexual identity.
Trolling, insulting/derogatory comments, and personal or political attacks.
Casual mention of slavery or indentured servitude and/or false comparisons of one’s occupation or situation to slavery. Please consider using or asking about alternate terminology when referring to such metaphors in technology.
Making light of/making mocking comments about trigger warnings and content warnings.
Public or private harassment, deliberate intimidation, or threats.
Publishing others’ private information, such as a physical or electronic address, without explicit permission. This includes any sort of “outing” of any aspect of someone’s identity without their consent.
Publishing screenshots or quotes, especially from identity channels, without all quoted users’ explicit consent.
Publishing or telling others that a member belongs to a particular identity channel without asking their consent first.
Publishing of non-harassing private communication.
Any of the above even when presented as “ironic” or “joking”.
Any attempt to present “reverse-ism” versions of the above as violations. Examples of reverse-isms are “reverse racism”, “reverse sexism”, “heterophobia”, and “cisphobia”.
Unsolicited explanations under the assumption that someone doesn’t already know it. Ask before you teach! Don’t assume what people’s knowledge gaps are.
Feigning or exaggerating surprise when someone admits to not knowing something.
“Well-actuallies”
Other conduct which could reasonably be considered inappropriate in a professional or community setting.

Not sure how any of what I said could possibly be considered in violation of that. Did I somehow make a casual mention of slavery or indentured servitude? 😂

Sorry man, I was really annoyed because I was late to this party. I apologize for venting here.

@indirect
Copy link
Member

@swrobel I appreciate the apology, and I understand that software can definitely be frustrating when it behaves in unexpected and undesired ways.

Since you seem to have missed it completely, let me reproduce the exact part of the code of conduct that you were violating:

Examples of behavior that contributes to creating a positive environment include:

Using welcoming and inclusive language.
Being respectful of differing viewpoints and experiences.
Gracefully accepting constructive feedback.
Focusing on what is best for the community.
Showing empathy and kindness towards other community members.

Responding to a warning with "did I somehow mention slavery 😂" counts as doubling down, so you're getting a 24 hour block. Any future violations will result in a permanent block.

@bhartvigsen
Copy link

This is a perfect example of someone letting the tiniest bit of power go to their head.

colby-swandale pushed a commit that referenced this pull request Sep 22, 2018
[2.0] Auto-configure job count

Closes #5808.

The description of that issue, copied verbatim:

This argument comes in two parts, but luckily, the first one is both easier to understand and hopefully to agree with.

Background: Bundler 1.4.0 added support for parallel installation via the `--jobs` param.  Soon after, [this blog post](http://archlever.blogspot.com/2013/09/lies-damned-lies-and-truths-backed-by.html) (probably greatly amplified by [this Thoughtbot blog post](https://robots.thoughtbot.com/parallel-gem-installing-using-bundler)) popularized the recommendation "set `--jobs` to `nproc - 1`".

Not long after, probably also inspired by the popularity of this tip, this "n - 1 jobs" advice got codified into Bundler itself: 66acd02

However, my assertion here is _Bundler should not do that_.

The first argument (the easy one) is just that it's not doing what the user asks for.  For all the people following the (seemingly popular) tip to set their jobs to `nproc - 1`, they're actually ending up with the probably-worse `- 2`.  Even worse than that, if a user does a conservative `--jobs 2`, they're getting _no benefit_ — Bundler is quietly taking their parallelization back down to "no parallelization".

Hopefully that's a sufficient argument on its own, but the part II is that this blanket advice is probably out-of-date anyway.

Using [this script](https://gist.github.com/tjschuck/ca1d37a8869d1cc01313171b4b318094), I repeatedly installed 29 gems (installing them to a `vendor/` dir and deleting it in between runs).  I averaged the time over 10 runs per --jobs value, but the trend holds regardless of how many runs you do.

Note that these numbers are for a machine with 2 physical cores and 4 virtual ones (a Mac, reporting 2 and 4 respectively from `sysctl -n hw.physicalcpu` and `sysctl -n hw.ncpu`, the latter corresponding to Linux's `nproc`).

```
~/Code/tmp/bundler_jobs_bench ☠  ./bundler_jobs_bench.sh

Installing 29 gems repeatedly...

===============================================
Using Bundler version 1.15.1 (current release version)
===============================================

--jobs 1  5.58435780 seconds on average (across 10 runs)
--jobs 2  5.35010690 seconds on average (across 10 runs)
--jobs 3  3.93493610 seconds on average (across 10 runs)
--jobs 4  3.86082760 seconds on average (across 10 runs)
--jobs 5  3.24673650 seconds on average (across 10 runs)
--jobs 6  3.49340190 seconds on average (across 10 runs)
--jobs 7  3.26473430 seconds on average (across 10 runs)
--jobs 8  3.34560500 seconds on average (across 10 runs)

===============================================
Using development version (no more n - 1 jobs)
===============================================

--jobs 1  4.32629540 seconds on average (across 10 runs)
--jobs 2  3.48100690 seconds on average (across 10 runs)
--jobs 3  3.30937880 seconds on average (across 10 runs)
--jobs 4  3.30868200 seconds on average (across 10 runs)
--jobs 5  3.54932920 seconds on average (across 10 runs)
--jobs 6  3.36123440 seconds on average (across 10 runs)
--jobs 7  3.96490630 seconds on average (across 10 runs)
--jobs 8  3.39955640 seconds on average (across 10 runs)
```

From the above, you can see:

1. In the first block, no notable change between `--jobs 1` and `--jobs 2` — that's because they're currently the same.

2. In both, a best time corresponding to the value that (effectively) matches nproc, _not_ nproc - 1.

3. Regardless of nproc coming out best in this run, there is close enough performance among the range of `nproc - 1` through to `nproc * 2` that it doesn't seem like anything in particular (like the `- 1` removed in this commit) should be codified — people seeking to particularly optimize their bundler runtimes should do their own tweaking of the value, and it should be respected as given.

(cherry picked from commit 4d5246d)
colby-swandale pushed a commit that referenced this pull request Sep 22, 2018
[2.0] Auto-configure job count

Closes #5808.

The description of that issue, copied verbatim:

This argument comes in two parts, but luckily, the first one is both easier to understand and hopefully to agree with.

Background: Bundler 1.4.0 added support for parallel installation via the `--jobs` param.  Soon after, [this blog post](http://archlever.blogspot.com/2013/09/lies-damned-lies-and-truths-backed-by.html) (probably greatly amplified by [this Thoughtbot blog post](https://robots.thoughtbot.com/parallel-gem-installing-using-bundler)) popularized the recommendation "set `--jobs` to `nproc - 1`".

Not long after, probably also inspired by the popularity of this tip, this "n - 1 jobs" advice got codified into Bundler itself: 66acd02

However, my assertion here is _Bundler should not do that_.

The first argument (the easy one) is just that it's not doing what the user asks for.  For all the people following the (seemingly popular) tip to set their jobs to `nproc - 1`, they're actually ending up with the probably-worse `- 2`.  Even worse than that, if a user does a conservative `--jobs 2`, they're getting _no benefit_ — Bundler is quietly taking their parallelization back down to "no parallelization".

Hopefully that's a sufficient argument on its own, but the part II is that this blanket advice is probably out-of-date anyway.

Using [this script](https://gist.github.com/tjschuck/ca1d37a8869d1cc01313171b4b318094), I repeatedly installed 29 gems (installing them to a `vendor/` dir and deleting it in between runs).  I averaged the time over 10 runs per --jobs value, but the trend holds regardless of how many runs you do.

Note that these numbers are for a machine with 2 physical cores and 4 virtual ones (a Mac, reporting 2 and 4 respectively from `sysctl -n hw.physicalcpu` and `sysctl -n hw.ncpu`, the latter corresponding to Linux's `nproc`).

```
~/Code/tmp/bundler_jobs_bench ☠  ./bundler_jobs_bench.sh

Installing 29 gems repeatedly...

===============================================
Using Bundler version 1.15.1 (current release version)
===============================================

--jobs 1  5.58435780 seconds on average (across 10 runs)
--jobs 2  5.35010690 seconds on average (across 10 runs)
--jobs 3  3.93493610 seconds on average (across 10 runs)
--jobs 4  3.86082760 seconds on average (across 10 runs)
--jobs 5  3.24673650 seconds on average (across 10 runs)
--jobs 6  3.49340190 seconds on average (across 10 runs)
--jobs 7  3.26473430 seconds on average (across 10 runs)
--jobs 8  3.34560500 seconds on average (across 10 runs)

===============================================
Using development version (no more n - 1 jobs)
===============================================

--jobs 1  4.32629540 seconds on average (across 10 runs)
--jobs 2  3.48100690 seconds on average (across 10 runs)
--jobs 3  3.30937880 seconds on average (across 10 runs)
--jobs 4  3.30868200 seconds on average (across 10 runs)
--jobs 5  3.54932920 seconds on average (across 10 runs)
--jobs 6  3.36123440 seconds on average (across 10 runs)
--jobs 7  3.96490630 seconds on average (across 10 runs)
--jobs 8  3.39955640 seconds on average (across 10 runs)
```

From the above, you can see:

1. In the first block, no notable change between `--jobs 1` and `--jobs 2` — that's because they're currently the same.

2. In both, a best time corresponding to the value that (effectively) matches nproc, _not_ nproc - 1.

3. Regardless of nproc coming out best in this run, there is close enough performance among the range of `nproc - 1` through to `nproc * 2` that it doesn't seem like anything in particular (like the `- 1` removed in this commit) should be codified — people seeking to particularly optimize their bundler runtimes should do their own tweaking of the value, and it should be respected as given.

(cherry picked from commit 4d5246d)
colby-swandale pushed a commit that referenced this pull request Sep 22, 2018
[2.0] Auto-configure job count

Closes #5808.

The description of that issue, copied verbatim:

This argument comes in two parts, but luckily, the first one is both easier to understand and hopefully to agree with.

Background: Bundler 1.4.0 added support for parallel installation via the `--jobs` param.  Soon after, [this blog post](http://archlever.blogspot.com/2013/09/lies-damned-lies-and-truths-backed-by.html) (probably greatly amplified by [this Thoughtbot blog post](https://robots.thoughtbot.com/parallel-gem-installing-using-bundler)) popularized the recommendation "set `--jobs` to `nproc - 1`".

Not long after, probably also inspired by the popularity of this tip, this "n - 1 jobs" advice got codified into Bundler itself: 66acd02

However, my assertion here is _Bundler should not do that_.

The first argument (the easy one) is just that it's not doing what the user asks for.  For all the people following the (seemingly popular) tip to set their jobs to `nproc - 1`, they're actually ending up with the probably-worse `- 2`.  Even worse than that, if a user does a conservative `--jobs 2`, they're getting _no benefit_ — Bundler is quietly taking their parallelization back down to "no parallelization".

Hopefully that's a sufficient argument on its own, but the part II is that this blanket advice is probably out-of-date anyway.

Using [this script](https://gist.github.com/tjschuck/ca1d37a8869d1cc01313171b4b318094), I repeatedly installed 29 gems (installing them to a `vendor/` dir and deleting it in between runs).  I averaged the time over 10 runs per --jobs value, but the trend holds regardless of how many runs you do.

Note that these numbers are for a machine with 2 physical cores and 4 virtual ones (a Mac, reporting 2 and 4 respectively from `sysctl -n hw.physicalcpu` and `sysctl -n hw.ncpu`, the latter corresponding to Linux's `nproc`).

```
~/Code/tmp/bundler_jobs_bench ☠  ./bundler_jobs_bench.sh

Installing 29 gems repeatedly...

===============================================
Using Bundler version 1.15.1 (current release version)
===============================================

--jobs 1  5.58435780 seconds on average (across 10 runs)
--jobs 2  5.35010690 seconds on average (across 10 runs)
--jobs 3  3.93493610 seconds on average (across 10 runs)
--jobs 4  3.86082760 seconds on average (across 10 runs)
--jobs 5  3.24673650 seconds on average (across 10 runs)
--jobs 6  3.49340190 seconds on average (across 10 runs)
--jobs 7  3.26473430 seconds on average (across 10 runs)
--jobs 8  3.34560500 seconds on average (across 10 runs)

===============================================
Using development version (no more n - 1 jobs)
===============================================

--jobs 1  4.32629540 seconds on average (across 10 runs)
--jobs 2  3.48100690 seconds on average (across 10 runs)
--jobs 3  3.30937880 seconds on average (across 10 runs)
--jobs 4  3.30868200 seconds on average (across 10 runs)
--jobs 5  3.54932920 seconds on average (across 10 runs)
--jobs 6  3.36123440 seconds on average (across 10 runs)
--jobs 7  3.96490630 seconds on average (across 10 runs)
--jobs 8  3.39955640 seconds on average (across 10 runs)
```

From the above, you can see:

1. In the first block, no notable change between `--jobs 1` and `--jobs 2` — that's because they're currently the same.

2. In both, a best time corresponding to the value that (effectively) matches nproc, _not_ nproc - 1.

3. Regardless of nproc coming out best in this run, there is close enough performance among the range of `nproc - 1` through to `nproc * 2` that it doesn't seem like anything in particular (like the `- 1` removed in this commit) should be codified — people seeking to particularly optimize their bundler runtimes should do their own tweaking of the value, and it should be respected as given.

(cherry picked from commit 4d5246d)
colby-swandale pushed a commit that referenced this pull request Oct 5, 2018
[2.0] Auto-configure job count

Closes #5808.

The description of that issue, copied verbatim:

This argument comes in two parts, but luckily, the first one is both easier to understand and hopefully to agree with.

Background: Bundler 1.4.0 added support for parallel installation via the `--jobs` param.  Soon after, [this blog post](http://archlever.blogspot.com/2013/09/lies-damned-lies-and-truths-backed-by.html) (probably greatly amplified by [this Thoughtbot blog post](https://robots.thoughtbot.com/parallel-gem-installing-using-bundler)) popularized the recommendation "set `--jobs` to `nproc - 1`".

Not long after, probably also inspired by the popularity of this tip, this "n - 1 jobs" advice got codified into Bundler itself: 66acd02

However, my assertion here is _Bundler should not do that_.

The first argument (the easy one) is just that it's not doing what the user asks for.  For all the people following the (seemingly popular) tip to set their jobs to `nproc - 1`, they're actually ending up with the probably-worse `- 2`.  Even worse than that, if a user does a conservative `--jobs 2`, they're getting _no benefit_ — Bundler is quietly taking their parallelization back down to "no parallelization".

Hopefully that's a sufficient argument on its own, but the part II is that this blanket advice is probably out-of-date anyway.

Using [this script](https://gist.github.com/tjschuck/ca1d37a8869d1cc01313171b4b318094), I repeatedly installed 29 gems (installing them to a `vendor/` dir and deleting it in between runs).  I averaged the time over 10 runs per --jobs value, but the trend holds regardless of how many runs you do.

Note that these numbers are for a machine with 2 physical cores and 4 virtual ones (a Mac, reporting 2 and 4 respectively from `sysctl -n hw.physicalcpu` and `sysctl -n hw.ncpu`, the latter corresponding to Linux's `nproc`).

```
~/Code/tmp/bundler_jobs_bench ☠  ./bundler_jobs_bench.sh

Installing 29 gems repeatedly...

===============================================
Using Bundler version 1.15.1 (current release version)
===============================================

--jobs 1  5.58435780 seconds on average (across 10 runs)
--jobs 2  5.35010690 seconds on average (across 10 runs)
--jobs 3  3.93493610 seconds on average (across 10 runs)
--jobs 4  3.86082760 seconds on average (across 10 runs)
--jobs 5  3.24673650 seconds on average (across 10 runs)
--jobs 6  3.49340190 seconds on average (across 10 runs)
--jobs 7  3.26473430 seconds on average (across 10 runs)
--jobs 8  3.34560500 seconds on average (across 10 runs)

===============================================
Using development version (no more n - 1 jobs)
===============================================

--jobs 1  4.32629540 seconds on average (across 10 runs)
--jobs 2  3.48100690 seconds on average (across 10 runs)
--jobs 3  3.30937880 seconds on average (across 10 runs)
--jobs 4  3.30868200 seconds on average (across 10 runs)
--jobs 5  3.54932920 seconds on average (across 10 runs)
--jobs 6  3.36123440 seconds on average (across 10 runs)
--jobs 7  3.96490630 seconds on average (across 10 runs)
--jobs 8  3.39955640 seconds on average (across 10 runs)
```

From the above, you can see:

1. In the first block, no notable change between `--jobs 1` and `--jobs 2` — that's because they're currently the same.

2. In both, a best time corresponding to the value that (effectively) matches nproc, _not_ nproc - 1.

3. Regardless of nproc coming out best in this run, there is close enough performance among the range of `nproc - 1` through to `nproc * 2` that it doesn't seem like anything in particular (like the `- 1` removed in this commit) should be codified — people seeking to particularly optimize their bundler runtimes should do their own tweaking of the value, and it should be respected as given.

(cherry picked from commit 4d5246d)
netbsd-srcmastr pushed a commit to NetBSD/pkgsrc that referenced this pull request Dec 17, 2018
pkgsr change
* Remove @Prefix@ from ALTERNATIVES file.

## 1.17.2 (2018-12-11)

 - Add compatability for bundler merge with Ruby 2.6

## 1.17.1 (2018-10-25)

 - Convert `Pathname`s to `String`s before sorting them, fixing #6760 and #6758 ([#6761](rubygems/bundler#6761), @alexggordon)

## 1.17.0 (2018-10-25)

No new changes.

## 1.17.0.pre.2 (2018-10-13)

Features:

  - Configure Bundler home, cache, config and plugin directories with `BUNDLE_USER_HOME`, `BUNDLE_USER_CACHE`, `BUNDLE_USER_CONFIG` and `BUNDLE_USER_PLUGIN` env vars ([#4333](rubygems/bundler#4333), @gwerbin)
  - Add `--all` option to `bundle binstubs` that will generate an executable file for all gems with commands in the bundle
  - Add `bundle remove` command to remove gems from the Gemfile via the CLI
  - Improve checking file permissions and asking for `sudo` in Bundler when it doesn't need to
  - Add error message to `bundle add` to check adding duplicate gems to the Gemfile
  - When asking for `sudo`, Bundler will show a list of folders/files that require elevated permissions to write to.

The following new features are available but are not enabled by default. These are intended to be tested by users for the upcoming release of Bundler 2.

  - Improve deprecation warning message for `bundle show` command
  - Improve deprecation warning message for the `--force` option in `bundle install`

## 1.17.0.pre.1 (2018-09-24)

Features:

  - Check folder/file permissions of the Bundle home directory in the `bundle doctor` command ([#5786](rubygems/bundler#5786), @ajwann)
  - Remove compiled gem extensions when running `bundle clean` ([#5596](rubygems/bundler#5596), @akhramov)
  - Add `--paths` option to `bundle list` command ([#6172](rubygems/bundler#6172), @colby-swandale)
  - Add base error class to gems generated from `bundle gem` ([#6260](rubygems/bundler#6260), @christhekeele)
  - Correctly re-install gem extensions with a git source when running `bundle pristine` ([#6294](rubygems/bundler#6294), @wagenet)
  - Add config option to disable platform warnings ([#6124](rubygems/bundler#6124), @agrim123)
  - Add `--skip-install` option to `bundle add` command to add gems to the Gemfile without installation ([#6511](rubygems/bundler#6511), @agrim123)
  - Add `--only-explicit` option to `bundle outdated` to list only outdated gems in the Gemfile ([#5366](rubygems/bundler#5366), @peret)
  - Support adding multiple gems to the Gemfile with `bundle add` ([#6543](rubygems/bundler#6543), @agrim123)
  - Make registered plugin events easier to manage in the Plugin API (@jules2689)
  - Add new gem install hooks to the Plugin API (@jules2689)
  - Add `--optimistic` and `--strict` options to `bundle add` ([#6553](https://github.com/bundler/bundler/issues/6553), @agrim123)
  - Add `--without-group` and `--only-group` options to `bundle list` ([#6564](rubygems/bundler#6564), @agrim123)
  - Add `--gemfile` option to the `bundle exec` command ([#5924](rubygems/bundler#5924), @ankitkataria)

The following new features are available but are not enabled by default. These are intended to be tested by users for the upcoming release of Bundler 2.

  - Make `install --path` relative to the current working directory ([#2048](rubygems/bundler#2048), @igorbozato)
  - Auto-configure job count ([#5808](rubygems/bundler#5808), @segiddins)
  - Use the Gem Version Promoter for major gem updates ([#5993](rubygems/bundler#5993), @segiddins)
  - Add config option to add the Ruby scope to `bundle config path` when configured globally (@segiddins)

## 1.16.6 (2018-10-05)

Changes:

  - Add an error message when adding a gem with `bundle add` that's already in the bundle ([#6341](rubygems/bundler#6341), @agrim123)
  - Add Homepage, Source Code and Chanagelog URI metadata fields to the `bundle gem` gemspec template (@walf443)

Bugfixes:

  - Fix issue where updating a gem resulted in the gem's version being downgraded when `BUNDLE_ONLY_UPDATE_TO_NEWER_VERSIONS` was set ([#6529](rubygems/bundler#6529), @theflow)
  - Fix some rescue calls that don't specifiy error type (@utilum)
  - Fix an issue when the Lockfile would contain platform-specific gems that it didn't need ([#6491](rubygems/bundler#6491), @segiddins)
  - Improve handlding of adding new gems with only a single group to the Gemfile in `bundle add` (@agrim123)
  - Refactor check for OpenSSL in `bundle env` (@voxik)
  - Remove an unnecessary assignment in Metadata (@voxik)

Documentation:

  - Update docs to reflect revised guidance to check in Gemfile.lock into version control for gems ([#5879](https://github.com/bundler/bundler/issues/5879), @arbonap)
  - Add documentation for the `--all` flag in `bundle update` (@agrim123)
  - Update README to use `bundle add` in usage examples (@hdf1986)
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants