Rodrigo Botafogo edited this page Jan 9, 2015 · 1 revision

How SciCom Works

Almost everything is dynamically dispatched to Renjin. The magic is basically done by Ruby's 'method_missing' method. I'll describe a bit how the whole thing works.

Let's begin with a simple example:

> vec = R.c(1, 2.45, 3)

Ruby tries to find a method 'c' on the R class. Since there is no such method, Ruby calls 'method_missing', passing it the method name and the argument list. 'method_missing' calls 'parse' to parse the arguments. In this case every argument is a Numeric, so 'parse' simply converts the argument list to a string: "(1, 2.45, 3)". 'method_missing' then makes the following call:

# R.eval(" <method_name> + <arguments> "), which in this case is:
> R.eval("c(1, 2.45, 3)")  # giving us the desired result
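The dispatch described above can be sketched in plain Ruby. This is a minimal illustration, not SciCom's actual source: the class name `NumericDispatchR` and the `last_code` reader are hypothetical, and `eval` is a stub that only records the R string instead of handing it to Renjin.

```ruby
# Hypothetical stand-in for SciCom's R class (not the real implementation).
class NumericDispatchR
  attr_reader :last_code

  # Stub for the real eval: it only stores the R source it would have run.
  def eval(code)
    @last_code = code
  end

  # Any method Ruby cannot find (such as `c`) ends up here.
  def method_missing(name, *args)
    eval("#{name}#{parse(args)}")
  end

  def respond_to_missing?(_name, _include_private = false)
    true
  end

  private

  # Numeric arguments are simply joined into "(a, b, c)".
  def parse(args)
    "(#{args.map(&:to_s).join(', ')})"
  end
end

r = NumericDispatchR.new
r.c(1, 2.45, 3)
puts r.last_code # prints c(1, 2.45, 3)
```

Nothing named `c` is defined on the class, yet the call succeeds because 'method_missing' builds the R call from the method name and its arguments.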

Now let's do:

> vec2 = R.c(vec, 4, 5)

Again, 'method_missing' is called, which calls 'parse'. The first argument to parse is now a Ruby::Vector and parse does the following:

# create a temporary variable, let's call it sc_1234
# let vec.sexp be the actual Java SEXP for this vector, then:

> R.eval("sc_1234 = vec.sexp")  # this stores the java vector in the sc_1234 variable

# finally return all parameters as a string: "(sc_1234, 4, 5)"
# Now let Renjin do the work:

> R.eval("c(sc_1234, 4, 5)")

# the above returns the vector [1, 2.45, 3, 4, 5], as expected
# finally, we remove the temporary variable sc_1234 from the Renjin name space, in the hope that
# this will allow the garbage collector to reclaim it.
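The temporary-variable steps above can be sketched as follows. Everything here is an assumption made for illustration: `FakeEngine` stands in for Renjin's name space, `WrappedVector` for a SciCom vector holding a Java SEXP, and the `sc_<n>` counter for however SciCom actually generates temporary names.

```ruby
# Hypothetical stand-in for the Renjin engine: it records bindings and code.
class FakeEngine
  attr_reader :bindings, :evaluated, :visible_at_eval

  def initialize
    @bindings = {}
    @evaluated = []
  end

  def assign(name, value)
    @bindings[name] = value
  end

  def remove(name)
    @bindings.delete(name)
  end

  # Records the code and which temporaries were bound when it ran.
  def eval(code)
    @visible_at_eval = @bindings.keys
    @evaluated << code
    code
  end
end

# Hypothetical wrapper; `sexp` plays the role of the Java SEXP.
WrappedVector = Struct.new(:sexp)

class TempVarR
  def initialize(engine)
    @engine = engine
    @counter = 0
  end

  def method_missing(name, *args)
    temps = []
    params = args.map do |arg|
      next arg.to_s unless arg.is_a?(WrappedVector)
      @counter += 1
      temp = "sc_#{@counter}"
      @engine.assign(temp, arg.sexp) # bind the SEXP under a temporary name
      temps << temp
      temp
    end
    @engine.eval("#{name}(#{params.join(', ')})")
  ensure
    # Remove the temporaries so the engine no longer references them.
    temps.each { |temp| @engine.remove(temp) }
  end

  def respond_to_missing?(_name, _include_private = false)
    true
  end
end

engine = FakeEngine.new
r = TempVarR.new(engine)
vec = WrappedVector.new([1, 2.45, 3])
r.c(vec, 4, 5)
puts engine.evaluated.last  # prints c(sc_1, 4, 5)
puts engine.bindings.empty? # prints true
```

Putting the cleanup in `ensure` means the temporary is removed even if the evaluation raises, which is one way to keep stale names out of the engine.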

Every statement in SciCom is converted to an evaluable string that can be passed to Renjin's 'eval'. This adds some overhead, but unless we put a SciCom statement inside a large loop it should be reasonable. One way the integration could be improved is if there were some way of calling 'eval' on a SEXP and not on a string. Maybe you have this already, but I couldn't find a way of doing it.

Now, if we have

> vec.mean

Then again we call 'method_missing' of the Vector class. In this case, first we check if 'mean' is a named element of the vector. If it is, then we call:

> R.eval("vec[mean]")

If mean is not a named element of the vector, then we call:

> R.eval("mean(vec)")

This calculates the mean of the vector. Again, this is just translating Ruby statements into strings that R can evaluate.
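The two branches can be illustrated with a small sketch. It is not SciCom's Vector class: `NamedVector`, its constructor arguments, and the fact that it returns the R string rather than evaluating it are all assumptions for the example. The sketch also quotes the element name so that the generated string is valid R, which is my choice rather than something the text above confirms.

```ruby
# Hypothetical vector wrapper that only builds the R string for each case.
class NamedVector
  def initialize(r_name, element_names)
    @r_name = r_name               # name of the variable on the R side
    @element_names = element_names # names of the vector's elements, if any
  end

  def method_missing(name, *_args)
    if @element_names.include?(name.to_s)
      # The method name matches a named element: index the vector.
      "#{@r_name}[\"#{name}\"]"
    else
      # Otherwise treat the method name as an R function applied to the vector.
      "#{name}(#{@r_name})"
    end
  end

  def respond_to_missing?(_name, _include_private = false)
    true
  end
end

plain = NamedVector.new("vec", [])
named = NamedVector.new("vec", ["mean", "sd"])
puts plain.mean # prints mean(vec)
puts named.mean # prints vec["mean"]
```

Because element names take priority, a vector with an element called 'mean' hides the R function of the same name for that call.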

Some new methods had to be defined. For instance, the method '_' simulates R operators such as %in%. So here we do:

> vec._ :in, vec2

This is a call to the '_' method with 2 arguments ':in' and 'vec2'. This translates to the correct R call:

> R.eval("vec %in% vec2")

In place of ':in' we can put any other string.
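A minimal sketch of how a '_' method could build such a call is shown below. `InfixVector` and its `r_name` attribute are hypothetical, and the method returns the R string instead of evaluating it.

```ruby
# Hypothetical wrapper that knows the name of its variable on the R side.
class InfixVector
  attr_reader :r_name

  def initialize(r_name)
    @r_name = r_name
  end

  # Wraps the operator name in percent signs to form an R infix operator.
  def _(operator, other)
    "#{r_name} %#{operator}% #{other.r_name}"
  end
end

vec = InfixVector.new("vec")
vec2 = InfixVector.new("vec2")
puts vec._ :in, vec2 # prints vec %in% vec2
```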

Methods and variables that have a '.' in their names are written in SciCom with '__' notation. So, read.csv in R is written as read__csv. 'method_missing' converts '__' to '.' before calling eval. As a consequence, if there is any variable in R whose name contains '__', there is no way to access it other than calling R.eval directly.
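The name conversion amounts to a string substitution. The helper below is a hypothetical sketch (`to_r_name` is not a SciCom function) that shows the mapping, assuming a double underscore on the Ruby side:

```ruby
# Hypothetical helper: turns a Ruby-side name into the R name it stands for.
def to_r_name(ruby_name)
  ruby_name.to_s.gsub("__", ".")
end

puts to_r_name(:read__csv)       # prints read.csv
puts to_r_name(:as__data__frame) # prints as.data.frame
puts to_r_name(:mean)            # prints mean
```

Since every double underscore is rewritten, an R name that really contains one cannot be produced by this mapping, which is why such names need a direct call to R.eval.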

'method_missing' is really the great magic!