Biogem is a Ruby code generator for bioinformatics. It generates a plugin, in the form of a gem, which is published automatically on both GitHub and rubygems.org.
In this document we discuss ways to modify Biogem, so you can generate your own code, and avoid repetitious work. The design of the biogem code generator is based on templates, and there are accessible ways to hack it, or even add your own templates.
This document is divided into two sections. In the first section we will create a directory, generate a file through a template, and add a test through a helper. In the second section we will modify some undesired behaviour in biogem through meta-programming.
To change biogem, check out the source tree to your local machine, e.g.
git clone https://github.com/helios/bioruby-gem.git
cd bioruby-gem
bundle
Make sure you are running a supported version of Ruby (check the README). Now you can invoke biogem with
bundle exec ./bin/biogem foo
which will create the bioruby-foo plugin for testing. Every time you rerun biogem, make sure to remove the bioruby-foo directory first:
rm -rf bioruby-foo
Recommended biogem switches are --with-bin and --rspec, e.g.
bundle exec ./bin/biogem --with-bin --rspec --cucumber --no-create-repo foo
For a foreign function interface use
bundle exec ./bin/biogem --with-bin --rspec --cucumber --with-ffi foo
There are options for ActiveRecord, Ruby on Rails etc. Naturally, you can see all available options with
bundle exec ./bin/biogem --help
Note that we are running biogem here within bundler. This may cause a Rakefile error after generating the plugin (nesting bundler is not a good idea). It is easily fixed by running the rake tasks by hand in the plugin directory after generating:
cd bioruby-foo
bundle
bundle exec rake version:write
bundle exec rake gemspec
Alternatively, install an updated version of biogem on your machine as a gem, and run biogem without bundler with
bundle exec rake install
biogem --with-bin --rspec foo
In the file ./bin/biogem rake, jeweler and bundler support are loaded and Bio::Gem::Generator::Application invoked, which generates the new directory and files. After generating code biogem changes directory and runs some rake commands in the newly generated plugin.
In the first step we want to add a switch to the biogem command line. For our purpose we will add --with-ffi, a switch which will create a template for a foreign function interface. Switches are defined in options.rb. We add a switch with
o.on('--with-ffi', 'generate a foreign function interface (FFI)') do
  self[:biogem_ffi] = true
end
This switch will be available as options[:biogem_ffi] further on.
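To see how such a switch ends up in the options hash, here is a minimal, self-contained sketch. It assumes the options object is a Hash subclass that drives Ruby's standard OptionParser; the class name DemoOptions and the :project_name key are hypothetical and only serve the illustration, they are not biogem's actual code.

```ruby
require 'optparse'

# Hypothetical stand-in for the options class: a Hash that fills itself
# from command line switches.
class DemoOptions < Hash
  def initialize(args)
    super()
    parser = OptionParser.new do |o|
      # Same switch as in the text; the block runs only when the switch is given
      o.on('--with-ffi', 'generate a foreign function interface (FFI)') do
        self[:biogem_ffi] = true
      end
    end
    # parse! strips the recognised switches; what is left is the plugin name
    remaining = parser.parse!(args.dup)
    self[:project_name] = remaining.first # hypothetical key
  end
end

options = DemoOptions.new(%w[--with-ffi foo])
p options[:biogem_ffi]
p options[:project_name]
```

Because the on block is a closure, self inside it still refers to the options object, which is why self[:biogem_ffi] works there.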
Directories and files get created in the create_files method of Biogem's jeweler.rb. For example the plugin library file is generated with
# Fill lib/bio-plugin.rb with some default comments
output_template_in_target_generic 'lib', File.join(lib_dir, lib_filename)
which also creates the directory. We explicitly add a directory to store C source files and headers with
create_ffi_structure if options[:biogem_ffi]
and
def create_ffi_structure
  # create ./ext/src and ./ext/include for the .c and .h files
  mkdir_in_target(ext_dir)
  mkdir_in_target(File.join(ext_dir,"src"))
  # create ./lib/ffi for the Ruby ffi
  mkdir_in_target(File.join(lib_dir,"ffi"))
end
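The effect of create_ffi_structure can be tried outside biogem with a small stand-in. This is a sketch only: DemoGenerator is hypothetical, the values of ext_dir and lib_dir are assumed to be 'ext' and 'lib', and mkdir_in_target is assumed to create a directory relative to the plugin root.

```ruby
require 'fileutils'
require 'tmpdir'

# Hypothetical stand-in for the generator; only the directory logic is modelled.
class DemoGenerator
  attr_reader :target_dir, :options

  def initialize(target_dir, options = {})
    @target_dir = target_dir
    @options = options
  end

  def ext_dir
    'ext' # assumed value
  end

  def lib_dir
    'lib' # assumed value
  end

  # Assumed behaviour: create a directory below the plugin root
  def mkdir_in_target(directory)
    FileUtils.mkdir_p(File.join(target_dir, directory))
  end

  # Mirrors the calls shown in the text
  def create_ffi_structure
    mkdir_in_target(ext_dir)
    mkdir_in_target(File.join(ext_dir, 'src'))
    mkdir_in_target(File.join(lib_dir, 'ffi'))
  end

  def create_files
    create_ffi_structure if options[:biogem_ffi]
  end
end

# Generate into a throw-away directory and list what was created
def demo_layout(options)
  Dir.mktmpdir do |dir|
    DemoGenerator.new(dir, options).create_files
    Dir.glob('**/*', base: dir).sort
  end
end

p demo_layout({ biogem_ffi: true })
```

Without the :biogem_ffi option nothing is created, so existing plugins are not affected by the new switch.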
Templates are stored in lib/bio-gem/templates. We create a template for our C extension named ext.c, e.g. the C function
int add_one(int number) {
  return number + 1;
}
which gets copied into the plugin's ./ext/src directory with
output_template_in_target_generic File.join('ffi','ext.c'), File.join(src_dir, "ext.c" )
Likewise, an include file ext.h gets copied, a Makefile, and the Ruby ffi file, which defines the bindings to ext.c.
(to be continued)
Similar to Ruby on Rails, we use erb to modify templates based on input parameters. Erb is part of the Ruby standard library. To use erb we create helper functions. A good example is the main library file that gets included by everyone using your plugin using
require 'bio-myawesomeplugin'
The generated file is in your module lib directory. The template for that file can be found here. Another example generates the binary from a template, when generating with the --with-bin switch. It contains the line
require '<%= project_name %>'
project_name is a helper, a method which is defined in lib/bio-gem/mod/jeweler.rb as
alias original_project_name project_name
def project_name
  name = original_project_name
  return 'bio-'+name if name !~ /^bio-/
  name
end
Here, original_project_name is the original method in jeweler. The main thing to note is that you can easily create your own helpers - they are available in the erb based templates when they exist in the Jeweler::Generator namespace.
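The interplay between a helper and an erb template can be tried in isolation. The sketch below uses a hypothetical BaseGenerator class in place of Jeweler::Generator, and a render method that exists only for this illustration; the project_name override is the one quoted above.

```ruby
require 'erb'

# Hypothetical stand-in for Jeweler::Generator
class BaseGenerator
  attr_reader :project_name

  def initialize(project_name)
    @project_name = project_name
  end

  # Render a template string; binding exposes the instance methods
  # (our helpers) to the template
  def render(template_text)
    ERB.new(template_text).result(binding)
  end
end

# Reopen the class and wrap the original method, as biogem does
class BaseGenerator
  alias original_project_name project_name

  def project_name
    name = original_project_name
    return 'bio-' + name if name !~ /^bio-/
    name
  end
end

template = "require '<%= project_name %>'"
puts BaseGenerator.new('foo').render(template)     # prints: require 'bio-foo'
puts BaseGenerator.new('bio-bar').render(template) # prints: require 'bio-bar'
```

Any method defined on the generator this way is visible in the template, which is all that is needed to add a helper of your own.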
For our FFI (DRY) the Rakefile needs to be adapted to compile the C file(s) properly. For this implementation I am adding a working example of a C binding, similar to what we do for tests.
(to be continued)
The following section discusses surgical changes to biogem.
''Warning, the rest of this document is about Ruby meta-programming. It is not for the faint of heart.''
Biogem builds on Jeweler.
Jeweler comes with a library for managing and releasing RubyGem projects, and a scaffold generator for starting new RubyGem projects. Using typical Ruby overrides of jeweler methods, also known as meta-programming, Biogem subverts Jeweler for our bioinformatics needs (see the Jeweler::Generator.new example below).
In the file ./bin/biogem rake, jeweler and bundler support are loaded and Bio::Gem::Generator::Application invoked, which generates the new directory and files. Thereafter biogem changes directory and runs some rake commands.
First Jeweler::Generator.run is run, so the basic scaffolding exists for Rake, tests etc. Nothing special so far. Where it gets interesting is that biogem overrides Jeweler classes in ./lib/bio-gem/mod/jeweler.rb. In this file, at runtime, Jeweler::Generator.new is replaced with our own version, which calls the original first, but continues to plug in information. Any time Jeweler::Generator.new is called, our edition is called. Even from within jeweler!
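The claim that the override is picked up "even from within jeweler" is easy to verify with a toy example. In this sketch Vendor::Generator is a hypothetical third-party class and the steps list is made up; only the alias-and-override pattern corresponds to what biogem does.

```ruby
# Hypothetical third-party library
module Vendor
  class Generator
    attr_reader :steps

    def initialize(options = {})
      @options = options
      @steps = ['basic scaffold']
    end

    # The library constructs its own generator here
    def self.run(options = {})
      new(options)
    end
  end
end

# Our override: reopen the class, keep the original under another name,
# call it first, then plug in extra information
module Vendor
  class Generator
    alias original_initialize initialize

    def initialize(options = {})
      original_initialize(options)
      steps << 'bioinformatics extras'
    end
  end
end

# Vendor code calls new, yet our version of initialize runs
generator = Vendor::Generator.run
p generator.steps # prints: ["basic scaffold", "bioinformatics extras"]
```

Ruby looks up initialize at call time, so it does not matter that Vendor::Generator.run was defined before the override was loaded.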
It is important to check out this file, as many overrides are defined here. Also have a look at the create_files function. That is where directories and files are generated from templates.
The application generator is programmed from biogem command line options. These options are listed in jeweler/options.rb.
Biogem templates are listed in ./lib/bio-gem/templates. These templates use erb to tune content within.
Templates are rendered in the jeweler.rb override (described above). For example the Rakefile is generated with
output_template_in_target 'Rakefile'
It is all fairly straightforward.
From the above you can see how we reprogram jeweler for our needs. To find new ways of generating code, we strongly suggest you also check out the jeweler source code. The jeweler code base is well thought out, and stable.
Just as an example we are going to override code generated by Jeweler. Jeweler generates a dependency for rcov, a Ruby code coverage analyzer. We are going to remove this dependency, without touching the Jeweler code base.
In the Jeweler source code tree rcov is used in two files:
grep -r rcov *
jeweler/generator.rb: development_dependencies << ["rcov", ">= 0"]
jeweler/templates/other_tasks.erb:RSpec::Core::RakeTask.new(:rcov) do |spec|
jeweler/templates/other_tasks.erb: spec.rcov = true
jeweler/templates/other_tasks.erb:Micronaut::RakeTask.new(:rcov) do |examples|
jeweler/templates/other_tasks.erb: examples.rcov_opts = '-Ilib -I<%= test_dir %>'
jeweler/templates/other_tasks.erb: examples.rcov = true
jeweler/templates/other_tasks.erb:require 'rcov/rcovtask'
jeweler/templates/other_tasks.erb: <%= test_task %>.rcov_opts << '--exclude "gems/*"'
The first step is to remove the rcov entry from development_dependencies. This can be done by adding a line in Biogem's lib/bio-gem/mod/jeweler.rb. Change it to
class Jeweler
  class Generator
    alias original_initialize initialize
    def initialize(options = {})
      original_initialize(options)
      development_dependencies << ["bio", ">= 1.4.2"]
      development_dependencies.delete_if { |k,v| k == "rcov" }
      (...)
You can see here that BioRuby support is always added. The next step is to change the behaviour of jeweler/templates/other_tasks.erb. The template code that generates the Rakefile tasks reads
<% case testing_framework %>
<% when :rspec %>
(...)
<% when :micronaut %>
(...)
<% else %>
require 'rcov/rcovtask'
Rcov::RcovTask.new do |<%= test_task %>|
(...)
end
<% end %>
and, annoyingly, shows that rcov is always added by default (in the final 'else'). We should communicate with the author of Jeweler to fix this. However, we also have the option to override the Rakefile generator. The jeweler Rakefile template has the form
require 'rubygems'
<%= render_template 'bundler_setup.erb' %>
require 'rake'
<%= render_template 'jeweler_tasks.erb' %>
<%= render_template 'other_tasks.erb' %>
The two important functions in jeweler.rb are:
def render_template(source)
  template_contents = File.read(File.join(template_dir, source))
  template = ERB.new(template_contents, nil, '<>')
  # squish extraneous whitespace from some of the conditionals
  template.result(binding).gsub(/\n\n\n+/, "\n\n")
end
def output_template_in_target(source, destination = source)
  final_destination = File.join(target_dir, destination)
  template_result = render_template(source)
  File.open(final_destination, 'w') {|file| file.write(template_result)}
  $stdout.puts "\tcreate\t#{destination}"
end
These find the templates and render them through ERB.
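What the '<>' trim mode and the final gsub do to a template is perhaps easiest to see on a tiny example. The sketch below is not jeweler code: the template text and the render_fragment method are made up, and ERB is given the keyword form of trim_mode that newer Ruby versions expect, rather than the positional form quoted above.

```ruby
require 'erb'

# Hypothetical miniature of a Rakefile template, with a conditional
# and a run of blank lines
FRAGMENT_TEMPLATE = <<~TEXT
  require 'rake'
  <% if use_bundler %>
  require 'bundler'
  <% end %>



  task default: :test
TEXT

def render_fragment(use_bundler)
  # '<>' drops the newline of lines that consist of a single <% ... %> tag
  template = ERB.new(FRAGMENT_TEMPLATE, trim_mode: '<>')
  # squish runs of blank lines down to one blank line
  template.result(binding).gsub(/\n\n\n+/, "\n\n")
end

puts render_fragment(true)
puts '---'
puts render_fragment(false)
```

In both cases the conditional lines leave no empty lines behind, and the three blank lines before the task collapse into one.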
Naturally, Biogem has needed some overriding behaviour. In this case Biogem's jeweler.rb has
def output_template_in_target_generic_update(source, destination = source, template_dir = template_dir_biogem)
  final_destination = File.join(target_dir, destination)
  template_result = render_template_generic(source, template_dir)
  File.open(final_destination, 'a') {|file| file.write(template_result)}
  $stdout.puts "\tcreate\t#{destination}"
end
and, in the case of the --with-db option, the Rakefile already gets modified by Biogem
output_template_in_target_generic 'rakefile', 'Rakefile', template_dir_biogem
So, what would be the best route here, to change biogem behaviour? We have to rewrite the Rakefile template to remove the rcov lines. We can change the render_template to allow rewriting the template. Unfortunately there is no existing hook for that in jeweler. So, let us inject a hook named after_render_template to a render_template override. First we open the Jeweler::Generator class and move the method to biogem jeweler.rb, renaming the original method to original_render_template:
class Jeweler
  class Generator
    alias original_render_template render_template
    def render_template(source)
      buf = original_render_template(source)
      # call hook (returns edited buf)
      after_render_template(source,buf)
    end
    # new hook for removing stuff
    def after_render_template(source,buf)
      if source == 'other_tasks.erb'
        # remove rcov related lines (the slash inside the regex must be escaped)
        buf.gsub!(/require 'rcov\/rcovtask'/,'')
        (...)
      end
    end
You probably get the gist (the stuff you can do with Ruby meta-programming!). The solution chosen overrides original jeweler behaviour without touching jeweler itself. Naturally, if the change can be made in jeweler itself, that is strongly preferred. With our solution a small change in jeweler may now break biogem (in software engineering terms: the fix is brittle).
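For experimenting with such a hook outside biogem, a runnable sketch follows. TemplateRenderer and its two canned templates are hypothetical stand-ins for Jeweler::Generator and its erb files. One deliberate difference from the fragment above: the hook uses gsub rather than gsub!, because gsub! returns nil when nothing matches and the hook has to return the buffer in every case.

```ruby
# Hypothetical stand-in for Jeweler::Generator with canned templates
class TemplateRenderer
  TEMPLATES = {
    'other_tasks.erb' => "require 'rcov/rcovtask'\ntask :test\n",
    'bundler_setup.erb' => "require 'bundler'\n"
  }.freeze

  def render_template(source)
    TEMPLATES.fetch(source).dup
  end
end

# Reopen the class and wrap render_template with a hook
class TemplateRenderer
  alias original_render_template render_template

  def render_template(source)
    buf = original_render_template(source)
    # call hook (returns edited buf)
    after_render_template(source, buf)
  end

  # Hook: always returns the (possibly edited) buffer
  def after_render_template(source, buf)
    return buf unless source == 'other_tasks.erb'
    # %r{...} avoids having to escape the slash in the path
    buf.gsub(%r{^require 'rcov/rcovtask'\n}, '')
  end
end

renderer = TemplateRenderer.new
puts renderer.render_template('other_tasks.erb')   # prints: task :test
puts renderer.render_template('bundler_setup.erb') # prints: require 'bundler'
```

Keeping the edits in one hook method means that all template surgery lives in a single place, which makes the brittleness at least easy to locate.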
In fact, the jeweler author has responded that the default behaviour for rcov will change now. I.e. our fix will go upstream.
Still, for stuff that will not go into jeweler, this is a way of changing behaviour.
This document should help you avoid repeating yourself. Code generation can be very useful. When you have something that is useful to yourself, or others, and is bioinformatics related, add it to biogem. When it is more generic, add it to jeweler. You may make a lot of people happy.
Thanks to Ruby meta-programming we do not have to change jeweler. With another computer language, we would have cloned jeweler and modified the source code for our purposes. This would imply a fork of the code base - and the projects would have diverged irrevocably. As it stands, we can build on the existing jeweler project. Some 'brittleness' may get introduced, as explained above, but in general we should normally be able to continue adapting our code base to that of jeweler.
The Pragmatic Programmers book on Ruby metaprogramming is recommended reading.
Copyright (C) 2012 Pjotr Prins pjotr.prins@thebird.nl