Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

Already on GitHub? Sign in to your account

Performance fix to avoid loading every gemspec #435

Merged
merged 5 commits into from Apr 17, 2013

Conversation

Projects
None yet
7 participants
Contributor

jonleighton commented Jan 25, 2013

Problem

Rubygems currently loads every single gemspec in basically all situations. This is slow and gets slower as you install more gems.

Previous solutions

  • #376 avoids evalling the gemspec by keeping a marshalled copy of it. This was deemed to be quite complex to get right in all edge cases, and didn't seem to show a significant improvement.
  • #396 infers the gem name and version from the name of the gemspec. This data can then be used to compare gem versions before needing to load their gemspecs. This was deemed to be too brittle. #396 also only improves Gem::Dependency#matching_specs, which means that other code paths (for instance via Kernel#require) can still end up loading all gemspecs and therefore not being any faster.

This solution

My patch is a proof of concept. It is not production ready but I want to get feedback on the approach first, and will then make all the code nice and pretty and non-hacky.

It's a continuation of #396, taking forward some of the suggestions in there. It also tackles the Kernel#require code path.

When a gem is installed, the gemspec is written out with the second line like this:

# stub: rails 3.2.11 ruby lib

The second line contains: name, version, platform, require_paths. require_paths are delimited by NUL bytes. (One limitation is that require_paths are assumed not to contain any newlines in the path.)

There is now a Gem::Specification.stubs method which returns a Gem::StubSpecification for each gemspec on the system. The StubSpecification objects can be used to perform a bunch of work without having to actually load the gemspec. This provides significant performance improvements.

My aim is to completely remove the use of Gem::Specification._all from the codebase, but I stop short of that in this PoC. As explained on the comments, I am currently leaning on Gem::Specification to provide some behaviour to StubSpecification. I would intend to extract this shared behaviour into a module which will be nicer and will make it more explicit about what code is or isn't used by StubSpecification.

At the moment all the test pass except for two to do with the uninstall command (I haven't looked at why, but the point is it mostly works).

How to try it

$ cd /path/to/rubygems-repo
$ export RUBYOPT="-Ilib"
$ gem pristine --all

Now run some stuff

My tests

I have 348 gems installed:

$ ls `ruby -e "puts Gem.dir"`/specifications | wc -l
348

I experimented with two tests:

Test One

Before

$ time ruby -e ""

real    0m0.120s
user    0m0.063s
sys 0m0.041s

$ time m --help > /dev/null

real    0m0.450s
user    0m0.373s
sys 0m0.057s

After

$ time m --help > /dev/null

real    0m0.177s
user    0m0.100s
sys 0m0.061s

Test Two

Before

$ time ruby -r active_record -e ""

real    0m0.853s
user    0m0.750s
sys 0m0.080s

After

$ time ruby -r active_record -e ""

real    0m0.656s
user    0m0.554s
sys 0m0.085s

Conclusion

This speed up will be most beneficial for small command line scripts. But it will also give a modest boost to bigger tasks like booting rails.

I would like to get feedback on whether this is an acceptable approach. If so I will polish up and finish off the patch and hopefully get it merged.

Thanks!

Owner

evanphx commented Jan 25, 2013

Given that this PR implements my suggestion in #396 I'm all for it. It doesn't have the brittle issues associated with extra files and can easily degrade by simply loading the whole Spec if the summary line is missing (which will be all gems when someone upgrades).

@evanphx evanphx and 1 other commented on an outdated diff Jan 25, 2013

lib/rubygems/stub_specification.rb
+ def platform
+ stub_spec.platform
+ end
+
+ ##
+ # Require paths of the gem
+
+ def require_paths
+ stub_spec.require_paths
+ end
+
+ ##
+ # A Gem::Specification for this gem containing only the stub data
+ # (name, version, platform, require_paths)
+
+ def stub_spec
@evanphx

evanphx Jan 25, 2013

Owner

This method needs to deal with files not having the stub prefix in the file. In that cause, it should just fallback to calling spec.

@jonleighton

jonleighton Jan 25, 2013

Contributor

Actually it does, but it's kinda unobvious. Here data may return either a StubLine or a full Gem::Specification, depending on whether the stubline exists.

But I'm probably going to get rid of the #stub_spec method in due course anyway...

jonleighton added some commits Jan 25, 2013

@jonleighton jonleighton Add a Gem::StubSpecification object to parse specification stublines 8fc2d18
@jonleighton jonleighton Write stubline out to gemspecs f32a727
@jonleighton jonleighton Avoid loading all gemspecs
Parse out "stublines", where available, to retrieve the information
necessary to determine whether to activate a particular gemspec.

This requires sharing some code between Gem::Specification and
Gem::StubSpecification, so I created an abstract BasicSpecification
class to achieve this.
46f0acf
@jonleighton jonleighton DRY up Specification._all/.stubs
Populate _all from stubs.
bd6a143
@jonleighton jonleighton Memoize StubSpecification values
Prevents us parsing the stub line / building the objects on each call.
2bbe0c4
Contributor

jonleighton commented Feb 8, 2013

@evanphx I've finished this off now, and the build is passing. Looking forward to your feedback! Cheers

@jonleighton jonleighton referenced this pull request Feb 8, 2013

Merged

Perf tweaks #447

Contributor

grosser commented Feb 18, 2013

👍 Let's get this merged, won't get any cleaner then this :)

👍 on getting this merged as well.

Contributor

jonleighton commented Feb 22, 2013

@evanphx is there anything else I can do to get this merged? thanks.

Owner

evanphx commented Feb 22, 2013

I'd like @drbrain to weigh in about the additional superclass and the worry about the Marshal format (which I believe is unaffected, but I'd like to do double check.)

Owner

drbrain commented Feb 25, 2013

I was hoping to review this today, but due to various ruby 2.0/rubygems 2.0 bugs my review may be delayed.

Contributor

tenderlove commented Mar 26, 2013

bump

👍 This looks fairly clean and I'm ready for the speed bump :)

@evanphx evanphx added a commit that referenced this pull request Apr 17, 2013

@evanphx evanphx Merge pull request #435 from jonleighton/speed
Performance fix to avoid loading every gemspec
fea4786

@evanphx evanphx merged commit fea4786 into rubygems:master Apr 17, 2013

1 check passed

default The Travis build passed
Details
Contributor

grosser commented Apr 17, 2013

❤️

@marknadig marknadig referenced this pull request in rails/spring Aug 20, 2013

Closed

readme - update for Rubygems performance problem #183

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment