Configure Bundler for parallel gem installs #166

Merged
croaky merged 1 commit into master from dc-bundler-parallelization Dec 3, 2013

9 participants

@croaky
thoughtbot, inc. member
  • Set it globally for OS X.
  • Determine number of cores dynamically.
  • Pick one number less than number of cores to avoid deadlock errors.
@mxie mxie and 1 other commented on an outdated diff Nov 24, 2013
@@ -119,7 +119,11 @@ fancy_echo "Updating to latest Rubygems version ..."
gem update --system
fancy_echo "Installing critical Ruby gems for Rails development ..."
- gem install bundler pg rails unicorn --no-document
+ gem install bundler pg rails unicorn --no-document --pre
@mxie
mxie Nov 24, 2013

Do we want to install the pre-releases of all of these gems? Or just bundler so that we can use the parallel feature?

@croaky
croaky Nov 24, 2013

Good point. Pushed 7c3ed30 to fix.

@mxie
thoughtbot, inc. member

Looks ok to me.

@croaky
thoughtbot, inc. member

Sweet. @djcp You want to take a crack at the Linux version as part of this PR? We could combine our avatars when squashing and merging...

http://robots.thoughtbot.com/how-to-create-github-avatars-for-pairs/

@samnang samnang and 2 others commented on an outdated diff Nov 30, 2013
@@ -118,8 +118,15 @@ fancy_echo "Setting Ruby 2.0.0-p247 as global default Ruby ..."
fancy_echo "Updating to latest Rubygems version ..."
gem update --system
-fancy_echo "Installing critical Ruby gems for Rails development ..."
- gem install bundler pg rails unicorn --no-document
+fancy_echo "Installing Bundler to install project-specific Ruby gems ..."
+ gem install bundler --no-document --pre
+
+fancy_echo "Configuring Bundler for faster, parallel gem installation ..."
+ number_of_cores=`sysctl -n hw.ncpu`
+ bundle config --global jobs `expr $number_of_cores - 1`
@samnang
samnang Nov 30, 2013

Why do we have to subtract 1 instead of using all CPUs here?

@gabebw
gabebw Nov 30, 2013

According to the PR description:

Pick one number less than number of cores to avoid deadlock errors.

@croaky
croaky Dec 1, 2013

The theory is:

Bundler suffers a (relative) drop in efficiency if it has to swap out use of one of the CPU cores between bundling Gems and everything else.

@crossaidi

What about Bundler parallelization on Ubuntu?
Is `nproc` the correct replacement for `sysctl -n hw.ncpu` on Mac?

@croaky
thoughtbot, inc. member

@crossaidi That does appear to be the correct command. @djcp How does ef81f6e look to you?

@croaky
thoughtbot, inc. member

I ended up splitting up the number of cores logic into two files, mac-components/bundler and linux-components/bundler, which as I'm typing feels sort of strange. Do we have a convention for placing functions somewhere? I just found this method in ruby-build that we could potentially re-use:

num_cpu_cores() {
  local num=""
  if [ "Darwin" = "$(uname -s)" ]; then
    num="$(sysctl -n hw.ncpu 2>/dev/null || true)"
  elif [ -r /proc/cpuinfo ]; then
    num="$(grep ^processor /proc/cpuinfo | wc -l)"
    [ "$num" -gt 0 ] || num=""
  fi
  echo "${num:-2}"
}
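
A minimal sketch of how a shared helper could turn a core count into the `jobs` value used in this PR. `bundler_jobs` is a hypothetical name, not something in the PR or in ruby-build. The fallback of 2 cores mirrors the default in `num_cpu_cores` above, and the floor of 1 is an added assumption so that a single-core machine does not end up with `jobs 0`.

```shell
# Hypothetical helper (not part of the PR): derive a Bundler job count
# from a core count. The body runs in a subshell "( ... )" so its
# variables do not leak into the calling script.
bundler_jobs() (
  cores="$1"
  # Fall back to 2 cores when the input is empty or non-numeric,
  # mirroring the default in num_cpu_cores.
  case "$cores" in
    ''|*[!0-9]*) cores=2 ;;
  esac
  # One fewer job than cores, as in the PR description.
  job_count=$((cores - 1))
  # Assumption: never go below one job (covers 0 or 1 cores).
  [ "$job_count" -ge 1 ] || job_count=1
  echo "$job_count"
)
```

With both functions defined in one place, the OS X and Linux components could each call `bundle config --global jobs "$(bundler_jobs "$(num_cpu_cores)")"` instead of carrying their own core-counting logic.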
@croaky croaky Configure Bundler for parallel gem installs
* Set it globally for OS X.
* Determine number of cores dynamically.
* Pick one number less than number of cores to avoid deadlock errors.
  http://archlever.blogspot.com/2013/09/lies-damned-lies-and-truths-backed-by.html
* Only install `--pre` for Bundler.
* Remove `pg` and `unicorn` gems as they will be installed during `bundle`
  for a Rails project.
8b21a94
@sgrif

👍

@croaky croaky merged commit 8b21a94 into master Dec 3, 2013
@croaky croaky deleted the dc-bundler-parallelization branch Dec 3, 2013
@pbrisbin pbrisbin commented on the diff Dec 3, 2013
fancy_echo "Installing GitHub CLI client ..."
curl http://hub.github.com/standalone -sLo ~/.bin/hub
chmod +x ~/.bin/hub
### end common-components/ruby-environment
+fancy_echo "Configuring Bundler for faster, parallel gem installation ..."
+ number_of_cores=`nproc`
+ bundle config --global jobs `expr $number_of_cores - 1`
@pbrisbin
pbrisbin Dec 3, 2013

FWIW:

'expr' is a program used in ancient shell code to do math. In Posix shells like bash, use
$(( expression )). In bash and ksh93, you can also use '(( expression ))' or 'let
expression' if you don't need to use the result in an expansion.

The backquote (`) is used in the old-style command substitution, e.g. foo=`command`. The
foo=$(command) syntax is recommended instead. Backslash handling inside $() is less
surprising, and $() is easier to nest. See http://mywiki.wooledge.org/BashFAQ/082

I wouldn't worry about it now, but I'd like to change these when I rebase the bash branch.
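
For reference, a sketch of the Linux lines from the diff rewritten with `$()` and `$(( ))` as suggested above. The variable name `bundler_job_count` and the fallback of 2 cores when `nproc` is unavailable are assumptions, not part of the PR. The sketch prints the `bundle config` command rather than running it, so it is safe to try without changing global Bundler settings.

```shell
# $() replaces the backquotes; the fallback to 2 is an assumption
# for systems without nproc (for example OS X).
number_of_cores=$(nproc 2>/dev/null || echo 2)

# $(( )) replaces the call to expr.
bundler_job_count=$((number_of_cores - 1))

# Dry run: print the command the script would execute.
echo "bundle config --global jobs $bundler_job_count"
```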

@croaky
croaky Dec 3, 2013
@as-cii

Why should setting the number of Bundler "threads" to n_cores - 1 avoid deadlocks? Is it something related to Bundler, or am I missing something obvious about multithreading?

Thank you :)

@exalted

Anyone got an answer to @as-cii please?

@as-cii

@croaky thank you for your answer.

That post highlights the efficiency of using the number of cores - 1, but it never mentions deadlocks or anything similar (except that Bundler might use one of the CPU cores to do some extra work, which then causes contention and probably context switching).

So can we assume this change was introduced exclusively for performance reasons and not for deadlock errors (as mentioned on the PR description)?

@croaky
thoughtbot, inc. member

@as-cii Sure.
