Rodrigo Botafogo edited this page Jan 2, 2015 · 11 revisions

Welcome to the scicom wiki!

SciCom (Scientific Computing) for Ruby brings the power of R to the Ruby community. SciCom is based on Renjin, a JVM-based interpreter for the R language for statistical computing.

R on the JVM

Over the past two decades, the R language for statistical computing has emerged as the de facto standard for analysts, statisticians, and scientists. Today, a wide range of enterprises, from pharmaceuticals to insurance, depend on R for key business uses. Renjin is a new implementation of the R language and environment for the Java Virtual Machine (JVM), whose goal is to enable transparent analysis of big data sets and seamless integration with other enterprise systems such as databases and application servers.

Renjin is still under development, but it is already being used in production for a number of client projects, and supports most CRAN packages, including some with C/Fortran dependencies.

SciCom and Renjin

SciCom integrates with Renjin and allows the use of R inside a Ruby script. In this sense, SciCom is similar to other solutions such as Rinruby, Rpy2, and PipeR. However, since SciCom and Renjin both target the JVM, there is no need for a separate layer to connect the two and no need to send data between Ruby and R: everything resides in the same JVM. Further, installing SciCom does not require installing GNU R; Renjin is the interpreter and comes with SciCom. Finally, although SciCom provides a basic interface to Renjin similar to Rinruby, a much tighter integration is also possible.

Basic Interface

SciCom provides a basic interface between Ruby and R very similar to Rinruby.

require 'scicom'

R.eval("vec = c(10, 20, 30, 40, 50)")
R.eval("print(vec)")

>> [1] 10 20 30 40 50

R.eval("print(vec[1])")

>> [1] 10

R.eval also accepts Ruby here documents (heredocs):

# Ruby variables can be interpolated into the R code passed to eval:
val = "10L"
R.eval <<EOF
	r.i3 = #{val}
	print(r.i3)
EOF
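Because the heredoc is an ordinary Ruby string, the interpolation happens in Ruby before any R code runs. The sketch below illustrates this in plain Ruby, without calling SciCom at all; the names script and literal are only illustrative and are not part of SciCom.

```ruby
# Plain Ruby, no SciCom required: shows what text a heredoc produces.
val = "10L"

# An unquoted heredoc expands #{...}, so the value of val is pasted
# into the text before anything else sees it.
script = <<EOF
r.i3 = #{val}
print(r.i3)
EOF

# A single-quoted heredoc tag disables interpolation; the text stays literal.
literal = <<'EOF'
r.i3 = #{val}
EOF

puts script
puts literal
```

The first string is what an eval call would receive; the second form is useful when the R code itself must contain the characters #{ unchanged.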

A slightly more complex example:

R.eval <<EOF

  # This dataset comes from Baseball-Reference.com.
  baseball = read.csv("baseball.csv")
  # str does not work correctly on Renjin, so the call is commented out
  # str(baseball)

  # Print the indices of the maximum and minimum years in the dataset
  print(which.max(baseball$Year))
  print(which.min(baseball$Year))

  # Let's look at the data available for Moneyball.
  moneyball = subset(baseball, Year < 2002)

  # Let's see if we can predict the number of wins, by looking at
  # runs allowed (RA) and runs scored (RS).  RD is the runs difference.
  # We are making a linear model for predicting wins (W) based on RD
  moneyball$RD = moneyball$RS - moneyball$RA
  WinsReg = lm(W ~ RD, data=moneyball)
  print(summary(WinsReg))

EOF

The SciCom Ruby syntax

As stated before, SciCom allows an integration with R that is much tighter than other similar solutions. The same program for baseball analysis can be written as:

require 'scicom'

# This dataset comes from Baseball-Reference.com.
baseball = R.read__csv("baseball.csv")
# Let's look at the data available for Moneyball.
moneyball = R.subset(baseball, baseball.Year < 2002)

# Let's see if we can predict the number of wins, by looking at
# runs allowed (RA) and runs scored (RS).  RD is the runs difference.
# We are making a linear model for predicting wins (W) based on RD
moneyball.RD = moneyball.RS - moneyball.RA
wins_reg = R.lm("W ~ RD", data: moneyball)
R.summary(wins_reg).pp

As can be seen, this is plain Ruby code. Any function available in R can be called through the R. namespace, so an R programmer can migrate to SciCom by following a few simple rules. For instance, the function 'lm' is called in SciCom as R.lm. Functions in R that have a '.' in their names need special treatment. For example, the function 'read.csv' is accessed in SciCom by calling R.read__csv, with a double underscore in place of the '.'. The next pages describe SciCom classes and functionality in more detail.
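The naming rule can be illustrated with a small plain-Ruby sketch. It assumes that every '.' in an R function name is written as a double underscore, which is a generalization from the single 'read.csv' example above. The module RNameSketch and its to_r_name method are hypothetical helpers written for this page; they are not part of SciCom and do not show how SciCom resolves names internally.

```ruby
# Hypothetical helper, NOT part of SciCom: converts a Ruby-style method
# name into the R function name it would stand for, assuming that each
# "__" in the Ruby name represents one "." in the R name.
module RNameSketch
  def self.to_r_name(ruby_name)
    ruby_name.to_s.gsub("__", ".")
  end
end

# Names taken from the examples on this page.
%i[lm subset read__csv which__max which__min].each do |name|
  puts "R.#{name} -> #{RNameSketch.to_r_name(name)}"
end
```

Names without a double underscore, such as lm and subset, pass through unchanged, which matches the way R.lm and R.subset are used in the baseball example.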