Put all compilation cache files in separate directories.

There are still two things to do for this:

1. Use a separate thread for writing compilation cache files.
2. Prune the contents of ~/.rbx when it exceeds a threshold.

Background:

Rubinius uses .rbc files to cache on disk the compiled bytecode for a Ruby
source code file. Typically, these cache files exist alongside the
corresponding .rb file. However, it is possible to collect all the cache files
into a single directory (and subdirectories) by hashing the full path to the
Ruby source file and using the hash as the key to find a file in the cache
directory.
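
As a rough illustration of the hashed layout described above, here is a
minimal sketch in plain Ruby. The helper name cache_path_for is hypothetical,
and Digest::SHA1 stands in for whatever hashing primitive Rubinius uses
internally. The two-character subdirectory mirrors the hash[0,2] split that
appears in the diff below.

```ruby
require "digest/sha1"

# Hypothetical helper (not part of Rubinius): map a Ruby source file to a
# cache file under cache_dir by hashing the file's full path.
def cache_path_for(source, cache_dir)
  full = File.expand_path(source)      # same file => same key, however it was named
  hash = Digest::SHA1.hexdigest(full)  # 40 hex characters
  # Fan out on the first two hex characters so that no single directory
  # accumulates every cache file.
  File.join(cache_dir, hash[0, 2], hash)
end
```

Because the key is the expanded path, relative and absolute references to the
same source file resolve to the same cache entry.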

In Rubinius 2.0, we have multiple language modes. The bytecode for 1.8
language mode differs from the bytecode for 1.9 language mode. The .rbc file
format was extended to include language version. This ensures that running the
same Ruby file in different modes will not use the wrong version of bytecode.

When Rubinius is installed, we pre-compile all the Ruby files in the lib/
directory. This ensures that if Rubinius is installed to a directory where a
user does not have write access, the cache files will still be created and can
be used to speed loading of standard library code.

If the .rbc files are placed alongside the .rb files, the existing arrangement
must be changed to provide different .rbc files depending on language mode. In
other words, just versioning the .rbc file is no longer sufficient as the
version of the .rbc files created for lib/**/*.rb files would be one or the
other. The same situation exists for the pre-installed gems, which are not
split into different gem directories for 1.8 and 1.9 mode.

An additional problem with creating .rbc files alongside the .rb files is that
people object to cluttering their source with the cache files.

There have repeatedly been requests for distributing Ruby applications without
Ruby source code. The existing .rbc files can be used for this, but they are
quite primitive and do not make it easy to abstract over other storage
configurations (e.g. encryption).

Finally, there are potentially numerous advantages to storing the compilation
cache in a proper database that would permit storing a great deal of
additional metadata for building tools for Ruby. Abstracting the cache from
the existing .rbc files to the directory of files using the -Xrbc.db option is
a good first step.

To summarize the problems with the existing .rbc mechanism:

1. Multiple different files are required to permit .rbc files in different
language modes to exist alongside a single .rb file, as is the case with
pre-compiling the standard library files on install.

2. People object to the files cluttering their source code.

3. The files don't easily permit extending them to store additional, valuable
metadata.

4. Related to 3, the files don't provide a suitably powerful mechanism for
distributing Ruby applications without source code.

The existing -Xrbc.db option is a direct replacement for storing the .rbc
files alongside the .rb files and immediately solves problem #1 above. One
issue with the rbc.db option is what to provide as a default value. This is
my proposal:

1. If the user explicitly provides a path with -Xrbc.db, cache all files in
   that path.

2. If the user does not provide a path, use two separate paths as follows:

  a. on boot, record the current working directory (referred to as CWD below).

  b. if the file being loaded has CWD as a prefix, store the cache for the
  file in CWD/.rbx/<wherever>

  c. if the file being loaded does not have CWD as a prefix, store the cache
  for the file in ~/.rbx/<wherever>

3. When hashing the file path to determine the cache file, add the language
   mode so that 1.8 and 1.9 files are separated. This does not replace the use
   of the language version information embedded in the .rbc format, but avoids
   recompile thrashing for e.g. running the specs under 1.8 mode and then
   under 1.9 mode.

4. Only read and write to the cache if the cache directory is owned by the
   user. This avoids a potential security hole where a superuser could be
   running bytecode that was put into the cache maliciously and prevents the
   superuser from creating files that the user would not be able to overwrite.
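
The four rules above can be sketched in plain Ruby roughly as follows. This is
an illustration under stated assumptions, not the code in this commit: the
method name cache_file_for and its keyword parameters are hypothetical,
Digest::SHA1 stands in for the Rubinius hashing primitive, and the ownership
check is applied to an explicit rbc.db path as well, which is one reading of
rule 4.

```ruby
require "digest/sha1"

# Hypothetical sketch of the proposed defaults. Returns the cache file path,
# or nil when the cache directory exists but is not owned by the current user.
def cache_file_for(source, startup_dir:, language_mode:, explicit_db: nil, home: Dir.home)
  name = File.expand_path(source)

  db = if explicit_db
         explicit_db                                   # rule 1: explicit path wins
       elsif name.start_with?(File.join(startup_dir, ""))
         File.join(startup_dir, ".rbx")                # rule 2b: file is under CWD
       else
         File.join(home, ".rbx")                       # rule 2c: everything else
       end

  # Rule 4: skip the cache entirely if someone else owns the directory.
  return nil if File.exist?(db) && !File.owned?(db)

  # Rule 3: mix the language mode into the key so that 1.8 and 1.9 entries
  # for the same source file never overwrite each other.
  hash = Digest::SHA1.hexdigest("#{name}#{language_mode}")
  File.join(db, hash[0, 2], hash)
end
```

The prefix test appends a path separator to the startup directory so that a
sibling directory such as /app2 is not mistaken for a child of /app. That is a
choice made for this sketch.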

With the changes above, we have a reasonable default for all files. The
standard library files cache would exist in ~/.rbx/, which is reasonable for a
file installed with Rubinius that isn't going to be changing. The application
files would by default be cached within the application directory, but would
not litter files where source code files are. If the user explicitly requests
an rbc.db directory, all files are written there, but are still segregated
based on language version.

As a related but separate change, since we have full Ruby concurrency in
Rubinius 2.0, I propose making the Writer stage of the bytecode compiler use a
separate thread. Once the CompiledMethod is created, it is enqueued for
writing to the cache and immediately returned. The program can start executing
the method while the separate cache thread figures out where to put it and
marshals the contents to disk.
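
A minimal sketch of that idea in plain Ruby, assuming a single background
thread fed by a queue. The class name CacheWriter and its methods are
hypothetical and are not part of this commit, which does not implement the
threaded writer. The bytes argument stands in for the marshaled
CompiledMethod.

```ruby
require "fileutils"

# Hypothetical background cache writer: callers enqueue and return at once,
# while one worker thread performs the disk I/O.
class CacheWriter
  def initialize
    @queue = Queue.new
    @thread = Thread.new { drain }
  end

  # Hand the serialized bytecode to the worker without blocking the caller.
  def enqueue(path, bytes)
    @queue << [path, bytes]
    nil
  end

  # Flush pending writes and stop the worker, e.g. at process exit.
  def shutdown
    @queue << :stop
    @thread.join
  end

  private

  def drain
    while (job = @queue.pop) != :stop
      path, bytes = job
      begin
        FileUtils.mkdir_p(File.dirname(path))
        File.binwrite(path, bytes)
      rescue SystemCallError
        # A failed cache write is not fatal: the file is simply recompiled
        # the next time it is loaded.
      end
    end
  end
end
```

The compiler would call enqueue right after producing the CompiledMethod and
return it to the loader immediately. Calling shutdown before exit ensures that
queued entries reach the disk.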
commit 9c094f667cd94248d4dbdfc6bd7d6b39b61562fa 1 parent 5308a44
@brixen brixen authored
2  kernel/delta/codeloader.rb
@@ -131,7 +131,7 @@ def load_file(wrap=false)
else
compiled_name = Compiler.compiled_name @load_path
- if File.exists? compiled_name
+ if compiled_name and File.exists? compiled_name
if @stat.mtime > File.mtime(compiled_name)
cm = compile_file @load_path, compiled_name
else
30 lib/compiler/compiler.rb
@@ -15,26 +15,32 @@ def self.compiler_error(msg, orig)
end
end
- if @object_db = Rubinius::Config['rbc.db']
- @object_db = ".rbx" unless @object_db.kind_of? String
- end
+ if RBC_DB = Rubinius::Config['rbc.db']
+ def self.compiled_name(file)
+ full = "#{File.expand_path(file)}#{Rubinius::RUBY_LIB_VERSION}"
+ hash = Rubinius.invoke_primitive :sha1_hash, full
+ dir = hash[0,2]
- def self.compiled_name(file)
- if file.suffix? ".rb"
- path = file + "c"
- else
- path = file + ".compiled.rbc"
+ path = "#{RBC_DB}/#{dir}/#{hash}"
end
+ else
+ def self.compiled_name(file)
+ name = File.expand_path file
- if db = @object_db and !File.exists?(path)
- full = File.expand_path(file)
+ if name.prefix? Rubinius::OS_STARTUP_DIR
+ db = "#{Rubinius::OS_STARTUP_DIR}/.rbx"
+ else
+ db = File.expand_path "~/.rbx"
+ end
+
+ return if File.exists?(db) and !File.owned?(db)
+
+ full = "#{name}#{Rubinius::RUBY_LIB_VERSION}"
hash = Rubinius.invoke_primitive :sha1_hash, full
dir = hash[0,2]
path = "#{db}/#{dir}/#{hash}"
end
-
- path
end
def self.compile(file, output=nil, line=1, transforms=:default)
2  lib/compiler/stages.rb
@@ -70,7 +70,7 @@ def initialize(compiler, last)
end
def run
- @name = "#{@input.file}c" unless @name
+ return @input unless @name
dir = File.dirname(@name)
unless File.directory?(dir)
20 spec/core/kernel/load_spec.rb
@@ -1,20 +0,0 @@
-require File.expand_path('../../../spec_helper', __FILE__)
-require File.expand_path('../../../ruby/fixtures/code_loading', __FILE__)
-require File.expand_path('../../../fixtures/code_loading', __FILE__)
-require File.expand_path('../shared/load', __FILE__)
-
-describe "Kernel#load" do
- it_behaves_like :rbx_kernel_load, :load, CodeLoadingSpecs::Method.new
-end
-
-describe "Kernel#load" do
- it_behaves_like :rbx_kernel_load_no_ext, :load, CodeLoadingSpecs::Method.new
-end
-
-describe "Kernel.load" do
- it_behaves_like :rbx_kernel_load, :load, Kernel
-end
-
-describe "Kernel.load" do
- it_behaves_like :rbx_kernel_load_no_ext, :load, Kernel
-end
58 spec/core/kernel/shared/load.rb
@@ -1,58 +0,0 @@
-describe :rbx_kernel_load, :shared => true do
- before :each do
- CodeLoadingSpecs.spec_setup
- @rb, @rbc = CodeLoadingSpecs.rbc_fixture "load_fixture.rb"
- end
-
- after :each do
- CodeLoadingSpecs.spec_cleanup
- rm_r @rb, @rbc
- end
-
- it "saves a .rbc file when loading a .rb file" do
- rm_r @rbc
- @object.send(@method, @rb).should be_true
- File.exists?(@rbc).should be_true
- end
-
- it "loads a .rbc file if it is newer than the related .rb file" do
- touch(@rb) { |f| f.puts "ScratchPad << :not_loaded" }
-
- now = Time.now
- File.utime now, now, @rbc
-
- @object.send(@method, @rb).should be_true
- ScratchPad.recorded.should == [:loaded]
- end
-end
-
-describe :rbx_kernel_load_no_ext, :shared => true do
- before :each do
- CodeLoadingSpecs.spec_setup
- end
-
- after :each do
- CodeLoadingSpecs.spec_cleanup
- rm_r @rb, @rbc
- end
-
- it "saves a Ruby source file with no extension to <name>.compiled.rbc" do
- @rb = tmp("no_ext_fixture")
- @rbc = @rb + ".compiled.rbc"
-
- touch(@rb) { |f| f.puts "ScratchPad << :loaded_no_ext" }
- @object.load(@rb).should be_true
- ScratchPad.recorded.should == [:loaded_no_ext]
- File.exists?(@rbc).should be_true
- end
-
- it "saves a Ruby source file with arbitrary extension to <name>.compiled.rbc" do
- @rb = tmp("no_ext_fixture.ext")
- @rbc = @rb + ".compiled.rbc"
-
- touch(@rb) { |f| f.puts "ScratchPad << :loaded_no_ext" }
- @object.load(@rb).should be_true
- ScratchPad.recorded.should == [:loaded_no_ext]
- File.exists?(@rbc).should be_true
- end
-end
2  spec/fixtures/code_loading.rb
@@ -9,7 +9,7 @@ def self.rbc_fixture(name)
touch(rb) { |f| f.puts "ScratchPad << :loaded" }
end
- Rubinius::Compiler.compile rb
+ Rubinius::Compiler.compile rb, rbc
return rb, rbc
end
end