rbotafogo edited this page Nov 19, 2014 · 2 revisions

Creating a variable in R and assigning a value to it

There are two ways to assign variables in R: through method 'assign' or with the '=' method. To retrieve an R variable, just access it in the R namespace.

# The NULL value
# Variable 'null' is NULL.  It exists in the R namespace and can be
# accessed normally in a call to 'eval'.
R.eval("null = NULL")
R.eval("print(null)")

> NULL

# Basic integration with R can always be done by calling eval and passing it a valid
# R expression.  Creating variable 'r.i' in R.
R.eval("r.i = 10L")
R.eval("print(r.i)")

> [1] 10

R.eval("vec = c(10, 20, 30, 40, 50)")
R.eval("print(vec)")

> [1] 10 20 30 40 50

R.eval("print(vec[1])")

> [1] 10

Using methods 'assign' and 'pull'

# Use assign and pull to set and get data from R.

# Use method assign to assign NULL to variable 'null' in the R namespace.
R.assign("null", nil)
R.eval("print(null)")

> NULL     

# Variable 'res' is available only in the Ruby namespace, not in the R namespace.
# A NULL object in R is converted to nil in Ruby.
res = R.pull("null")
p res

> nil

Using accessor-like methods to assign a value to an R variable

# Assign a value to an R variable, 'n2'.  
R.n2 = nil
R.eval("print(n2)")

> NULL

One can access variables created in the R namespace by using R.<name>. Variables in R that have a '.' in their name, such as 'r.i3', need to have the '.' substituted by '__':

R.eval("r.i3 = 10.235")
R.r__i3.pp

> [1] 10,235
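The renaming rule above can be sketched in plain Ruby. The helper name `r_accessor_name` is hypothetical and is not part of the library; it only illustrates the substitution of '.' by '__' that the text describes.

```ruby
# Hypothetical helper (not part of the library): builds the Ruby accessor
# name for an R variable by replacing every '.' with '__'.
def r_accessor_name(r_name)
  r_name.gsub(".", "__")
end

puts r_accessor_name("r.i3")   # prints r__i3
puts r_accessor_name("vec")    # prints vec (no '.', so nothing changes)
```

With that name in hand, the variable 'r.i3' is reached as R.r__i3, as in the example above.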

Usage of "here docs"

R.eval <<EOF
  r.i2 = 10L
  print(r.i2)  
EOF

Variables created in Ruby can be used in an eval clause through string interpolation:

val = "10L"
R.eval <<EOF
  r.i3 = #{val}
  print(r.i3)
EOF
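To see what R actually receives, it can help to build the here doc as an ordinary Ruby string first. The sketch below needs no R at all; the variable names `script` and `raw` are only for illustration. It also shows that a here doc with a single-quoted delimiter does not interpolate.

```ruby
val = "10L"

# Unquoted delimiter: Ruby interpolates #{val} before the text is used.
script = <<EOF
  r.i3 = #{val}
  print(r.i3)
EOF

# Single-quoted delimiter: the text is kept literally, no interpolation.
raw = <<'EOF'
  r.i3 = #{val}
EOF

puts script
puts raw
```

The first string contains `r.i3 = 10L`, which is valid R. The second still contains the literal `#{val}`, which R would not understand.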

Using a linear model with the basic R interface

This example uses a dataset from Baseball-Reference.com. In it, we try to predict the number of wins of a baseball team based on the number of runs allowed (RA) and runs scored (RS). The model tests whether the run difference (RD), i.e., RS - RA, is a good predictor of the number of wins. The dataset also contains later seasons, but we only look at the seasons before 2002 (Year < 2002), which is the data used for the book Moneyball (Michael Lewis).

R.eval <<EOF

  # This dataset comes from Baseball-Reference.com.
  baseball = read.csv("baseball.csv")

  # Let's look at the data available for Moneyball.
  moneyball = subset(baseball, Year < 2002)

  # Let's see if we can predict the number of wins by looking at
  # runs allowed (RA) and runs scored (RS).  RD is the run difference.
  # We are making a linear model for predicting wins (W) based on RD.
  moneyball$RD = moneyball$RS - moneyball$RA
  WinsReg = lm(W ~ RD, data=moneyball)
  print(summary(WinsReg))
EOF

> Call:
> lm(data = moneyball, formula = W ~ RD)

> Residuals:
>     Min      1Q  Median      3Q     Max
>  -14,266  -2,651   0,123   2,936  11,657

>  Coefficients:
>             Estimate   Std. Error t value    Pr(>|t|)             
>  (Intercept) 80,881     0,131      616,675    <0         ***       
>           RD 0,106      0,001       81,554    <0         ***       
>  ---
>  Signif. codes:  0 '***' 0,001 '**' 0,01 '*' 0,05 '.' 0,1 ' ' 1 

> Residual standard error: 3,939 on 900 degrees of freedom
> Multiple R-squared: 0,8808,	Adjusted R-squared: 0,8807 
> F-statistic: 6.650,9926 on 1 and 900 DF,  p-value: < 0
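
The fitted model can be read off the coefficients table: W is approximately 80.881 + 0.106 * RD. The output above appears to use a comma as the decimal separator, so the sketch below assumes 80,881 means 80.881 and 0,106 means 0.106. The method name `predicted_wins` is hypothetical and only illustrates how the coefficients are applied.

```ruby
# Coefficients copied from the summary above, assuming the comma is a
# decimal separator.
INTERCEPT = 80.881
RD_SLOPE  = 0.106

# Hypothetical helper: expected number of wins for a given run difference.
def predicted_wins(run_difference)
  INTERCEPT + RD_SLOPE * run_difference
end

puts predicted_wins(0).round(1)     # prints 80.9
puts predicted_wins(100).round(1)   # prints 91.5
```

Under this model, a team that scores 100 more runs than it allows is expected to win about 91 games.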