nokogiri 1.6.0 is very large when installed #952

Closed
danp opened this Issue Aug 1, 2013 · 6 comments

danp commented Aug 1, 2013

This is probably a general RubyGems issue, but nokogiri 1.6.0 is an extreme case.

After running gem install nokogiri -v 1.6.0 (or letting bundler do the equivalent), the installed nokogiri-1.6.0 directory is very large. On a Heroku dyno using ruby 1.9.3 it is 120M. There are several reasons for this:

  • the ports directory contains tarballs for libxml and libxslt, as included in the gem
  • the ports directory contains extracted contents of those tarballs
  • the ext directory contains a duplicate set of extracted contents of those tarballs
  • the ext directory contains build artifacts for libxml, libxslt, and nokogiri itself

Here's du output for the nokogiri-1.6.0 directory on the same Heroku dyno:

~/vendor/bundle/ruby/1.9.1/gems/nokogiri-1.6.0 $ du --max-depth=1 | sort -n
8   ./bin
16  ./tasks
804 ./test
896 ./lib
22592   ./ports
98364   ./ext
122836  .

This translates to nearly 120M of data (the ports and ext directories) that is not needed to actually use nokogiri. In Heroku's case this data goes into the compiled slug, which isn't ideal. There are likely other cases beyond building slugs for Heroku where this extra data causes subtle size-related issues.
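
For anyone without GNU du at hand, the per-directory breakdown above can be approximated in Ruby. This is only a minimal sketch: the helper names tree_size and subdir_sizes are made up for illustration, and it sums apparent file sizes rather than disk blocks, so the numbers will not match du exactly.

```ruby
require 'find'

# Hypothetical helper: sum the apparent size, in bytes, of every regular
# file under +dir+. Symlinks are not followed and are not counted.
def tree_size(dir)
  total = 0
  Find.find(dir) do |path|
    stat = File.lstat(path)
    total += stat.size if stat.file?
  end
  total
end

# Hypothetical helper: map each immediate subdirectory of +root+ to its
# total size in bytes. Files sitting directly in +root+ are ignored.
def subdir_sizes(root)
  Dir.children(root)
     .map { |name| File.join(root, name) }
     .select { |path| File.directory?(path) && !File.symlink?(path) }
     .to_h { |path| [File.basename(path), tree_size(path)] }
end

# Only runs when a gem directory is passed on the command line.
if __FILE__ == $PROGRAM_NAME && ARGV[0]
  subdir_sizes(ARGV[0]).sort_by { |_, bytes| bytes }.each do |name, bytes|
    puts format('%8.1fM  %s', bytes / (1024.0 * 1024.0), name)
  end
end
```

Run it with the path to the installed nokogiri-1.6.0 directory as the only argument; it prints one line per subdirectory, smallest first.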

If possible, it would be great if nokogiri could clean up after itself a bit after building its extensions. I am not immediately sure what that would look like within current gem conventions but I am happy to help figure it out.

Ref heroku/heroku-buildpack-ruby#122

Contributor

donpdonp commented Aug 2, 2013

What would help a lot is documentation on how to specify in the Gemfile that nokogiri should use system libraries. Currently I have to use: "$ NOKOGIRI_USE_SYSTEM_LIBRARIES=true bundle install --path gems"

ghost assigned knu Aug 8, 2013

Owner

knu commented Aug 8, 2013

The development branch I mentioned in #923 includes a measure for this issue. Please check it out.

Contributor

donpdonp commented Aug 23, 2013

I'm not sure what's applicable in #923, but I remembered that a Gemfile is Ruby, so this is a simple way to avoid having the libraries built:

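source 'https://rubygems.org'

# A Gemfile is plain Ruby, so this is set before bundler builds nokogiri's extension
ENV['NOKOGIRI_USE_SYSTEM_LIBRARIES'] = 'true'
gem 'nokogiri'

Owner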

knu commented Oct 22, 2013

Merged the static_clean branch in ab984ce.

knu closed this Oct 22, 2013

rkh commented Jan 31, 2014

When will this make it into a release?

Phrogz commented Mar 16, 2014

Related: the 7+ MB of libxml2 and libxslt documentation installed in ports/…/share/doc/ is generally unwanted when the libraries have to be compiled.
