Making Rubinius .rbc Files Disappear
Rubinius is rather unusual as a Ruby implementation. It both compiles Ruby source code to bytecode and saves the compiled code to a cache so it does not need to recompile unless the source code changes. This can be great for utilities that are run often from the command line (including IRB). Rubinius merely reloads the cached file and runs the bytecode directly rather than needing to parse and compile the file. Sounds like a real win!
Unfortunately, it is not that simple. We need some place to store that cache and this is where the thorns on that pretty rose start poking us in the thumbs. The solution we have been using since forever is to store the cached file alongside the source file in the same directory, like so:
$ echo 'puts "hello!"' > hello.rb $ ls hello.* hello.rb $ rbx hello.rb hello! $ ls hello.* hello.rb hello.rbc
That doesn't look too crazy, but it can get more complicated:
$ mv hello.rb hello $ rbx hello $ ls hello.* hello.compiled.rbc hello.rbc
Whoa, what is
hello did not have an extension,
we add that longer
compiled.rbc to make it clear which file the cache is
for. Also, note that we have that
hello.rbc hanging about even though the
hello.rb is gone.
To summarize the issues with our caching scheme:
- It requires an additional file for every Ruby source file.
- It requires some potentially complicated naming scheme to associate the cache file with the source and not clash with other names.
- Removing or renaming the Ruby source file leaves the cache file behind.
Again, the advantage of the cache file is that you do not have to wait for Rubinius to recompile the file if you have not changed the source. Let's see if we can get all the advantages with none of the disadvantages. That old saying comes to mind, Having your cake and eating it, too, so we may not be successful, but it is worth a shot.
First, let's take a step back. This issue is not unique to Rubinius. Python
.pyo files. Java has
.class files. C/C++ has
Lots of things need a place to store a compiled or cached representation of
some data. Every SCM worth mention has some mechanism to ignore the files you
don't want to track. The same is generally true of editors. So in some sense,
this is a solved problem. However, we have always received complaints about
.rbc files, so we thought we would try to make other, hopefully better,
Solution 1: No Cache
One simple solution is just to never ever ever create the compiled cache files in any form anywhere. We have an option for that:
$ ls hello.* hello.rb $ rbx -Xcompiler.no_rbc hello.rb hello! $ ls hello.* hello.rb
Win! Not one lousy
.rbc file in sight. Although, that's quite the option to
type. Never fear, we have a solution to that below.
Here is our scorecard for solution 1:
Use Case: Use when you never want any compiler cache files created. For example, on a server where startup time is not really a concern.
.rbc files at all.
Cons: Startup will be slightly slower depending on what Ruby code you are running. It will be more noticeable in a Rails application, for example. However, the Rubinius bytecode compiler is several times faster than it was a couple years ago so it may not be an issue for you.
Solution 2: Cache Database
What if we could put all the compilation data in a single cache location, something like a database? We have an option for that.
This option is a little more complex, so let's take it in two steps.
$ ls hello.* hello.rb $ rbx -Xrbc.db hello.rb hello! $ ls hello.* hello.rb $ ls -R .rbx 60 .rbx/60: 60c091c3ed34c1b93ffbb33d82d810772902d3f9
.rbc files here. But what's with all the numbers in the
directory and how did that directory get there?
-Xrbc.db option without any argument will store the compilation cache in
.rbx directory in the current working directory. The cache files
themselves are split into subdirectories to avoid creating too many entries
for the file system to handle in one directory.
What if you have a special location where you would prefer all compilation
cache files be saved? No problem, just give
-Xrbc.db a path as follows:
$ ls hello.* hello.rb $ rbx -Xrbc.db=$HOME/.my_special_place hello.rb hello! $ ls hello.* hello.rb $ ls -R $HOME/.my_special_place 60 /Users/brian/.my_special_place/60: 60c091c3ed34c1b93ffbb33d82d810772902d3f9
If you primarily work with projects, putting the
.rbx directory in the
current working directory may be the best solution because it keeps the
compilation cache with the project. It is easy to add an SCM ignore for the
directory and easy to remove the directory to clear the cache (e.g. in a clean
However, if you are frequently running scripts in many directories, you may
not want to litter
.rbx directories everywhere. In this case, putting the
directory in your
$HOME dir or
/tmp may be preferable. Additionally,
/tmp may be cleared on every reboot so you will not accumulate many stale
Note that, right now, Rubinius does not clear the cache directory. It will happily continue adding to it indefinitely. However, this may not be an issue unless you are cycling through a bunch of Ruby files, for example, working on a number of Ruby projects in series. In that case, using a per-project (per current working directory) cache is probably the best option.
Here is how solution 2 shakes out:
Use Case: You want to combine all compilation cache files in one location.
.rbc files mixed in with the rest of your files.
Cons: You may still need a per-project or per-working-directory cache directory. However, you can easily specify where to put that directory.
Using RBXOPT for Options
As mentioned above, the
-X options can get a little long and you certainly
don't want to retype them constantly. We have added support for the
environment variable, which is an analog of the
RUBYOPT environment variable
that we already support.
RBXOPT to specify
-X options that Rubinius should use. For example:
You can check out all the
-X options with
rbx -Xconfig.print or
-Xconfig.print=2 for more verbose output. If you want to use multiple
RBXOPT, use quotes and separate the options with a space:
export RBXOPT='-Xrbc.db -Xagent.start'
Rubinius saves a compilation cache for compiled Ruby code to avoid wasting time and resources recompiling source that has not changed. However, we need some place to store the cache. Rubinius provides options for omitting the cache altogether or for storing it in a directory of your choosing. Note that the format of the compilation cache is an implementation detail and we reserve the right to change it at any time, so please don't rely on it being in any particular format.
We have not turned on
-Xrbc.db by default yet because we don't know what a
good default is. So give us feedback on your use cases and what you would find
Finally, whenever we discuss the compilation cache we are inevitably asked if you can run directly from the cache and not use the Ruby source at all after it has been compiled. The short answer is "Yes", the long answer is "It depends". I will be writing a post exploring this question in detail shortly. For now, get out there and write more Ruby code!