Perf tweaks #447

Merged
merged 2 commits into from Jun 26, 2013

3 participants

@jonleighton

Here are a couple of performance tweaks to address hot spots that are much more noticeable after #435 is applied.

master

$ time ruby -r active_record -e ""

real    0m0.603s
user    0m0.507s
sys 0m0.078s
$ time ruby -r active_record -e ""

real    0m0.603s
user    0m0.507s
sys 0m0.079s
$ time ruby -r active_record -e ""

real    0m0.602s
user    0m0.510s
sys 0m0.073s

master + #435

$ time ruby -r active_record -e ""

real    0m0.426s
user    0m0.336s
sys 0m0.075s
$ time ruby -r active_record -e ""

real    0m0.425s
user    0m0.340s
sys 0m0.069s
$ time ruby -r active_record -e ""

real    0m0.429s
user    0m0.327s
sys 0m0.085s

master + #435 + these patches

$ time ruby -r active_record -e ""

real    0m0.378s
user    0m0.284s
sys 0m0.079s
$ time ruby -r active_record -e ""

real    0m0.387s
user    0m0.297s
sys 0m0.075s
$ time ruby -r active_record -e ""

real    0m0.378s
user    0m0.291s
sys 0m0.071s
jonleighton added some commits Feb 8, 2013
@jonleighton jonleighton Cache Gem::Version instances for each version string
This prevents having numerous instances of the same value object. In
particular, the use of String#scan in Version#segments is a hot spot. We
could perform the caching there but it seems reasonable to cache the
entire object.
c4223a2
@jonleighton jonleighton Implement Gem::Specification#to_ary
This is a perf fix. Array#flatten is called in places which contain
nested arrays of spec objects. (For example in
Gem::Specification.find_in_unresolved.) Due to Ruby's implementation of
Array#flatten, it will hit method_missing to try to dispatch the #to_ary
method, which is slow. Implementing a #to_ary like this prevents
method_missing getting called.
54cf153
@evanphx evanphx commented on the diff Feb 8, 2013
lib/rubygems/version.rb
@@ -183,6 +186,12 @@ def self.create input
end
end
+ @@all = {}
+
+ def self.new version
@evanphx
RubyGems member
evanphx added a note Feb 8, 2013

Using a Version cache has far reaching implications that I'm not sure we're ready for. It's been discussed before.

Would you accept a global cache for the result of Version#segments as a first step? That's where the real hotspot is.

Also, what sort of implications are you referring to? Out of interest. (Happy to read the discussion if you can point me to it.)

I guess both have thread safety issues. I'm not clear about what the thread safety requirements are for rubygems. Can you fill me in?

@evanphx
RubyGems member
evanphx added a note Feb 8, 2013

There are a couple of considerations:

1) Versions weren't always immutable. I'm not certain we have finished cleaning things up so they are. If they're not immutable, a global cache is bad news.
2) For anything that is long lived, this will cause a slowly ballooning memory because there is no way to clean up all the Version objects even if they're not in use.

Both could be dealt with by having an explicit VersionCache object that is used to memoize the Version objects into. It could be that by default, there is no such object and we just create a new object each time. But if one is installed into Specification, it is used. This would allow gem loading to create one and install it for use.

@evanphx
RubyGems member
evanphx added a note Feb 9, 2013

Per, I mean that Version, the class, could be told "use this VersionCache object", not Specification.

Regarding (1), I am pretty sure it's not an issue. From looking at the code, Version is immutable so long as you don't do crazy things like version.hash << "wtf". Incidentally you can see I changed #version to return a dup of the @version as not doing this did lead to a test failure.

Regarding (2), I agree that this is a problem in theory, although in practise I can't see a situation where any application will use an infinite number of different version objects. However, people do do unexpected things so I'm not averse to addressing this concern. I think the simplest thing to do would be to just make the cache bounded to a certain size. We'd have to do some research to find out what a good default for this would be but that would prevent the memory usage growing unbounded. Regarding your VersionCache proposal, I don't see how this would prevent the memory leak specifically? It seems more to do with how to structure the code?

Anyway, maybe we should just return to these when #435 is done, as that's certainly the lowest hanging fruit right now.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
@drbrain drbrain added a commit that referenced this pull request Apr 4, 2013
@drbrain drbrain Update missed tests for rebase of #447 410e5f3
@drbrain drbrain added a commit that referenced this pull request Apr 4, 2013
@drbrain drbrain Merge branch 'pietro-key_passphrase'
* pietro-key_passphrase: (154 commits)
  Update missed tests for rebase of #447
  Fixed pull request number type for #461
  Improve documentation of DependencyInstaller
  Alphabetize Gem::DependencyInstaller
  Removed commented out DependencyInstaller code
  Alter #489 to use GEM_SPEC_CACHE
  fix tests when GEM_SPEC is set in environment
  add support for ENV GEM_SPEC, fix #430
  Updated history for #443
  Don't alter Gem::Specification.dirs during install
  Default to Gem.dir as late as possible.
  Updated history for #455
  Update history for #456
  Update history for #462
  Add tests and alter output of #514
  add PATH to gem env
  Update History for #514
  Restore backwards-compatibility for #517
  Alphabetize RequestSet
  Undent RequestSet
  ...

Conflicts:
	test/rubygems/test_gem_commands_cert_command.rb
	test/rubygems/test_gem_package.rb
03f371e
@evanphx evanphx was assigned Jun 14, 2013
@evanphx evanphx merged commit 34fe654 into rubygems:master Jun 26, 2013

1 check passed

Details default The Travis build passed
@jonleighton

Sweet, thanks for the merge.

@drbrain drbrain added a commit that referenced this pull request Jul 8, 2013
@drbrain drbrain Added #447 to history 1ae1ace
@drbrain
RubyGems member

Due to problems in ruby 1.8, I had to revert @54cf153 in @8ec0591 (but I typo'd the pull request reference there). The commit message describes the reversion.

@drbrain drbrain added a commit that referenced this pull request Feb 4, 2014
@drbrain drbrain Restore Gem::Version::new behavior from < 2.1
Gem::Version::new used to return the same class as the input object but
now returns a Gem::Version if called with the same input from a
subclass.  This broke backward compatibility.

This broke from #447 which was a performance improvement change.
This commit maintains the same behavior except when Gem::Version was
subclassed and the version cache (which reduces GC) is skipped.

See #447 for the original commits

Fixes #805
81d806d
@drbrain drbrain added a commit that referenced this pull request Feb 4, 2014
@drbrain drbrain Restore Gem::Version::new behavior from < 2.1
Gem::Version::new used to return the same class as the input object but
now returns a Gem::Version if called with the same input from a
subclass.  This broke backward compatibility.

This broke from #447 which was a performance improvement change.
This commit maintains the same behavior except when Gem::Version was
subclassed and the version cache (which reduces GC) is skipped.

See #447 for the original commits

Fixes #805
77f75f4
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment