Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Could boot performance be improved? Caching default gems perhaps? #3799

Open
casperisfine opened this issue Jul 8, 2020 · 4 comments
Open
Labels

Comments

@casperisfine
Copy link

Sorry if this was discussed elsewhere, I tried to search for previous discussions but given how generic the subject is I couldn't find anything.

Context

As many people know, Rubygems initialization is the biggest part of MRI's boot:

$ time ruby -e '1'

real	0m0.076s
user	0m0.051s
sys	0m0.021s
$ time ruby --disable-gems -e '1'

real	0m0.029s
user	0m0.011s
sys	0m0.015s

So (on my machine) "raw" MRI initialization is 29ms, and rubygems adds 47ms to it.

When profiling, we can see that a large part of that comes from loading default gems:

==================================
  Mode: cpu(10)
  Samples: 545 (52.98% miss rate)
  GC: 25 (4.59%)
==================================
     TOTAL    (pct)     SAMPLES    (pct)     FRAME
       237  (43.5%)         190  (34.9%)     Gem::Specification.load
       419  (76.9%)          40   (7.3%)     <require:/opt/rubies/2.8.0-dev/lib/ruby/2.8.0/rubygems.rb>
        66  (12.1%)          34   (6.2%)     <require:/opt/rubies/2.8.0-dev/lib/ruby/2.8.0/did_you_mean.rb>
        70  (12.8%)          33   (6.1%)     <require:/opt/rubies/2.8.0-dev/lib/ruby/2.8.0/rubygems/specification.rb>
        30   (5.5%)          30   (5.5%)     Gem::BasicSpecification#internal_init
        22   (4.0%)          22   (4.0%)     (sweeping)
        93  (17.1%)          21   (3.9%)     <module:Gem>
       106  (19.4%)          16   (2.9%)     Kernel#require
        15   (2.8%)          14   (2.6%)     Gem::Specification#files
        10   (1.8%)          10   (1.8%)     Gem::PathSupport#expand
        11   (2.0%)           9   (1.7%)     <require:/opt/rubies/2.8.0-dev/lib/ruby/2.8.0/did_you_mean/spell_checker.rb>
        10   (1.8%)           9   (1.7%)     <require:/opt/rubies/2.8.0-dev/lib/ruby/2.8.0/rubygems/user_interaction.rb>
        11   (2.0%)           9   (1.7%)     <require:/opt/rubies/2.8.0-dev/lib/ruby/2.8.0/did_you_mean/spell_checkers/name_error_checkers.rb>
         8   (1.5%)           8   (1.5%)     Dir.glob
        14   (2.6%)           7   (1.3%)     Gem::Specification#add_dependency_with_type
        17   (3.1%)           7   (1.3%)     <require:/opt/rubies/2.8.0-dev/lib/ruby/2.8.0/rubygems/specification_policy.rb>
        23   (4.2%)           5   (0.9%)     Gem.register_default_spec
        10   (1.8%)           4   (0.7%)     <class:Specification>
         5   (0.9%)           4   (0.7%)     <require:/opt/rubies/2.8.0-dev/lib/ruby/2.8.0/rubygems/core_ext/kernel_require.rb>
         5   (0.9%)           4   (0.7%)     <require:/opt/rubies/2.8.0-dev/lib/ruby/2.8.0/did_you_mean/spell_checkers/require_path_checker.rb>
         5   (0.9%)           4   (0.7%)     <require:/opt/rubies/2.8.0-dev/lib/ruby/2.8.0/rubygems/dependency.rb>

If I comment out Gem::Specification.load_defaults, the overhead is down to 26ms:

$ time ruby -e '1'

real	0m0.058s
user	0m0.034s
sys	0m0.020s
$ time ruby --disable-gems  -e '1'

real	0m0.032s
user	0m0.012s
sys	0m0.016s

And most of that is in each_spec([Gem.default_specifications_dir]).

Ideas

This makes me wonder what kind of assumptions we could make about default gems. Can we assume they are immutable? If so couldn't we cache all these specs in a single marshal dump file rather than evaluate all the default gems (55 on 2.8.0-dev) ?

I tried a very crude hack on my machine:

  def self.load_defaults
    cached_specs.each do |spec|
      # #load returns nil if the spec is bad, so we just ignore
      # it at this stage
      Gem.register_default_spec(spec)
    end
  end

  def self.cached_specs
    Marshal.load(File.read('/tmp/default_specs.marshal'))
  rescue SystemCallError
    specs = []
    each_spec([Gem.default_specifications_dir]) { |s| specs << s }
    File.write('/tmp/default_specs.marshal', Marshal.dump(specs))
    specs
  end

With the following result:

$ time ruby -e '1'

real	0m0.060s
user	0m0.036s
sys	0m0.020s

So 16ms faster, which is about 34% of rubygems load time.

Questions

So the main question here is what would it take to move this from a crude hack to an acceptable patch.

  • Can we assume default gems are immutable?
  • If not could we do a mtime comparison? on the directory or all individual specs?
  • Is there an existing location where it would be acceptable to store that cache?

@deivid-rodriguez any thoughts on this?

@deivid-rodriguez
Copy link
Member

So funny, I'm working on that exact path 🤣. I have this, which I believe is the same idea:

commit 0441af11bcf6d1418f52f0c0dd2c87c60d694e8b
Author: David Rodríguez <deivid.rodriguez@riseup.net>
Date:   Wed Jul 8 09:09:31 2020 +0200

    Load a marshaled version of the default specs map if available

diff --git a/lib/rubygems.rb b/lib/rubygems.rb
index 8bc0ef4b74..12936671db 100644
--- a/lib/rubygems.rb
+++ b/lib/rubygems.rb
@@ -1242,8 +1242,14 @@ def register_default_spec(spec)
     # Loads the default specifications. It should be called only once.
 
     def load_default_specs
-      Gem::Specification.each_spec([default_specifications_dir]) do |spec|
-        register_default_spec(spec)
+      if File.exist?(marshaled_default_spec_file)
+        @path_to_default_spec_map = Marshal.load read_binary(marshaled_default_spec_file)
+      else
+        Gem::Specification.each_spec([default_specifications_dir]) do |spec|
+          register_default_spec(spec)
+        end
+
+        File.open(marshaled_default_spec_file, "w") {|io| Marshal.dump @path_to_default_spec_map, io }
       end
     end
 
@@ -1319,6 +1325,9 @@ def default_gem_load_paths
       @default_gem_load_paths ||= $LOAD_PATH[load_path_insert_index..-1]
     end
 
+    def marshaled_default_spec_file
+      File.join(dir, "default_specs.#{marshal_version}")
+    end
   end
 
   ##

commit 1ee905c8eea7efa2257f1acdde3a27074c5ac3e5
Author: David Rodríguez <deivid.rodriguez@riseup.net>
Date:   Tue Jul 7 20:11:30 2020 +0200

    Move `Gem::Specification.load_defaults` to `Gem.load_default_specs`
    
    And deprecate the former. Makes sense I believe for upcoming work.

diff --git a/lib/rubygems.rb b/lib/rubygems.rb
index 1a70ee9d0a..8bc0ef4b74 100644
--- a/lib/rubygems.rb
+++ b/lib/rubygems.rb
@@ -1238,6 +1238,15 @@ def register_default_spec(spec)
       end
     end
 
+    ##
+    # Loads the default specifications. It should be called only once.
+
+    def load_default_specs
+      Gem::Specification.each_spec([default_specifications_dir]) do |spec|
+        register_default_spec(spec)
+      end
+    end
+
     ##
     # Find a Gem::Specification of default gem from +path+
 
@@ -1358,7 +1367,7 @@ def default_gem_load_paths
 
 ##
 # Loads the default specs.
-Gem::Specification.load_defaults
+Gem.load_default_specs
 
 require 'rubygems/core_ext/kernel_gem'
 require 'rubygems/core_ext/kernel_require'
diff --git a/lib/rubygems/specification.rb b/lib/rubygems/specification.rb
index e6ecc7649a..3c83494b8e 100644
--- a/lib/rubygems/specification.rb
+++ b/lib/rubygems/specification.rb
@@ -845,15 +845,14 @@ def self._resort!(specs) # :nodoc:
     end
   end
 
-  ##
-  # Loads the default specifications. It should be called only once.
+  def self.load_defaults # :nodoc:
+    Gem.load_default_specs
+  end
 
-  def self.load_defaults
-    each_spec([Gem.default_specifications_dir]) do |spec|
-      # #load returns nil if the spec is bad, so we just ignore
-      # it at this stage
-      Gem.register_default_spec(spec)
-    end
+  class << self
+    extend Gem::Deprecate
+
+    rubygems_deprecate :load_defaults, "Gem.load_default_specs"
   end

@casperisfine
Copy link
Author

Oh! That's awesome. I'll ignore that area then, and focus on other ideas. Should I close this issue?

@deivid-rodriguez
Copy link
Member

deivid-rodriguez commented Jul 8, 2020

I'll definitely tag you on the PR, which will close this, so we call leave this opened 👍.

I believe we can assume default specs are immutable, yeah.

Regarding the cache location, we could use Gem.spec_cache_dir, and put ruby version specific files in there like 2.7.1/default_specs.4.8, or something like that. I think that could make things align quite nicely.

@eregon
Copy link
Contributor

eregon commented Feb 3, 2022

Refs: https://bugs.ruby-lang.org/issues/18568 and #4199

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

3 participants