In this blog post, we will show how to run `R` code in `Julia` using the package `RCall`. The Jupyter notebook for this post can be found [here](https://raw.githubusercontent.com/Shuvomoy/blog/gh-pages/codes/running_R_code_in_Julia.ipynb).

`R` has many convenient packages for statistics and data processing, so `RCall` lets us do the data processing in `R` without leaving the `Julia` environment, and then do the computationally intensive part in `Julia`.

**Install the `RCall` package.**

First, we install the `RCall` package in `Julia`. I am assuming that `R` is already installed, following the instructions at [https://www.r-project.org/](https://www.r-project.org/).

In [1]:
# using Pkg  # Uncomment these two lines 
# Pkg.add("RCall") # if the RCall package is not installed

In [2]:
using RCall

**The macro to run `R` code in `Julia`**

Any `R` code (it can be a large chunk of code) can be run in `Julia` using the macro
  
  ```
  R"""
  code
  goes
  here
  """
  ```
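For short snippets, the same macro also works with a single-line string, and a `Julia` variable can be interpolated into the `R` code with `$`. The following is a minimal sketch, assuming `R` and `RCall` are set up as above; the names `x`, `r_mean` and `m` are made up for illustration, and `rcopy` converts the returned `RObject` into a `Julia` value.

```julia
using RCall

x = [1.0, 2.0, 3.0, 4.0]   # a plain Julia vector (illustrative data)

# `$x` passes the Julia vector to R; the result is an RObject
r_mean = R"mean($x)"

# rcopy converts the RObject into a native Julia value
m = rcopy(r_mean)
println(m)
```

Note that `$` is only treated as interpolation where it is not valid `R` syntax, so `R` expressions such as `food_fa$loadings` keep their usual meaning inside the macro.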
  
  It is best understood through an example. We will run a simple factor analysis model in `R` from `Julia`.
  
  First, we download a dataset in `R` from `Julia` for doing factor analysis. 

In [3]:
## Download dataset from R
R"""
food <- read.csv("https://userpage.fu-berlin.de/soga/300/30100_data_sets/food-texture.csv",
                 row.names = "X")
"""

RObject{VecSxp}
      Oil Density Crispy Fracture Hardness
B110 16.5    2955     10       23       97
B136 17.7    2660     14        9      139
B171 16.2    2870     12       17      143
B192 16.7    2920     10       31       95
B225 16.3    2975     11       26      143
B237 19.1    2790     13       16      189
B261 18.4    2750     13       17      114
B264 17.5    2770     10       26       63
B353 15.7    2955     11       23      123
B360 16.4    2945     11       24      132
B366 18.0    2830     12       15      121
B377 17.4    2835     12       18      172
B391 18.4    2860     14       11      170
B397 13.9    2965     12       19      169
B404 15.8    2930      9       26       65
B437 16.4    2770     15       16      183
B445 18.9    2650     14       20      114
B462 17.3    2890     12       17      142
B485 16.7    2695     13       13      111
B488 19.1    2755     14       10      140
B502 13.7    3000     10       27      177
B554 14.7    2980     10       20     

We can bring `food` back from the `R` environment to the `Julia` environment using the `@rget` macro. The copied variable will have the same name as the original.

In [4]:
@rget food

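| Row | Oil | Density | Crispy | Fracture | Hardness |
|-----|---------|-------|-------|-------|-------|
|     | Float64 | Int64 | Int64 | Int64 | Int64 |
| 1 | 16.5 | 2955 | 10 | 23 | 97 |
| 2 | 17.7 | 2660 | 14 | 9 | 139 |
| 3 | 16.2 | 2870 | 12 | 17 | 143 |
| 4 | 16.7 | 2920 | 10 | 31 | 95 |
| 5 | 16.3 | 2975 | 11 | 26 | 143 |
| 6 | 19.1 | 2790 | 13 | 16 | 189 |
| 7 | 18.4 | 2750 | 13 | 17 | 114 |
| 8 | 17.5 | 2770 | 10 | 26 | 63 |
| 9 | 15.7 | 2955 | 11 | 23 | 123 |
| 10 | 16.4 | 2945 | 11 | 24 | 132 |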
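
If a different name is wanted on the `Julia` side, `rcopy` can be used instead of `@rget`. A minimal sketch, assuming `RCall` is loaded as above; the `R` variable `scores` and the `Julia` name `julia_scores` are made up for illustration.

```julia
using RCall

# Create a small numeric vector in the R environment (illustrative data)
R"scores <- c(1.5, 2.5, 4.0)"

# @rget would create a Julia variable called `scores`;
# rcopy lets us choose the Julia-side name ourselves
julia_scores = rcopy(R"scores")
println(julia_scores)
```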


Let us continue running our code in the `R` environment. First, let us take a look at the correlation matrix of `food`.

In [5]:
R"""
Sigma <- cor(food)
"""

RObject{RealSxp}
                 Oil    Density     Crispy   Fracture    Hardness
Oil       1.00000000 -0.7500240  0.5930863 -0.5337392 -0.09604521
Density  -0.75002399  1.0000000 -0.6709460  0.5721324  0.10793720
Crispy    0.59308631 -0.6709460  1.0000000 -0.8439650  0.41109340
Fracture -0.53373917  0.5721324 -0.8439650  1.0000000 -0.37335844
Hardness -0.09604521  0.1079372  0.4110934 -0.3733584  1.00000000


Now run the factor analysis from `R` using the `factanal` function.

In [6]:
R"""
food_fa <- factanal(covmat = Sigma, factors = 2)
"""

RObject{VecSxp}

Call:
factanal(factors = 2, covmat = Sigma)

Uniquenesses:
     Oil  Density   Crispy Fracture Hardness 
   0.334    0.156    0.042    0.256    0.407 

Loadings:
         Factor1 Factor2
Oil      -0.816         
Density   0.919         
Crispy   -0.745   0.635 
Fracture  0.645  -0.573 
Hardness          0.764 

               Factor1 Factor2
SS loadings      2.490   1.316
Proportion Var   0.498   0.263
Cumulative Var   0.498   0.761

The degrees of freedom for the model is 1 and the fit was 0.006 


As we can see in the output above, the `R` object `food_fa` contains, among other things, the `uniquenesses` and the `loadings`. We extract them in `R` using the following commands, storing the loadings in `Lambda` and the uniquenesses as a diagonal matrix `D`.

In [7]:
R"""
Lambda <- food_fa$loadings
D <- diag(food_fa$uniquenesses)
"""

RObject{RealSxp}
          [,1]      [,2]      [,3]      [,4]      [,5]
[1,] 0.3338599 0.0000000 0.0000000 0.0000000 0.0000000
[2,] 0.0000000 0.1555255 0.0000000 0.0000000 0.0000000
[3,] 0.0000000 0.0000000 0.0422238 0.0000000 0.0000000
[4,] 0.0000000 0.0000000 0.0000000 0.2560235 0.0000000
[5,] 0.0000000 0.0000000 0.0000000 0.0000000 0.4069459


Now we can bring the variables `Lambda`, `D` and `Sigma` back into `Julia` using `@rget`, and perform residual analysis.

In [8]:
@rget Lambda
@rget D
@rget Sigma

5×5 Array{Float64,2}:
  1.0        -0.750024   0.593086  -0.533739  -0.0960452
 -0.750024    1.0       -0.670946   0.572132   0.107937
  0.593086   -0.670946   1.0       -0.843965   0.411093
 -0.533739    0.572132  -0.843965   1.0       -0.373358
 -0.0960452   0.107937   0.411093  -0.373358   1.0

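Now let us check how well the fitted model reproduces `Sigma`. We form the model-implied matrix `Sigma_fa = Lambda*Lambda' + D` and then look at the residual `Sigma_fa - Sigma`, which should be close to zero if the fit is good.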

In [9]:
X = Lambda*Lambda'
Sigma_fa = X + D

5×5 Array{Float64,2}:
  1.0        -0.750025   0.595699  -0.515519  -0.0952689
 -0.750025    1.0       -0.669865   0.579672   0.108257
  0.595699   -0.669865   1.0       -0.843965   0.411088
 -0.515519    0.579672  -0.843965   1.0       -0.373391
 -0.0952689   0.108257   0.411088  -0.373391   1.0

In [10]:
Sigma_fa - Sigma

5×5 Array{Float64,2}:
 -3.32556e-7   -6.08148e-7    0.00261311   0.0182198    0.000776357
 -6.08148e-7   -3.0673e-8     0.00108144   0.00753941   0.000320227
  0.00261311    0.00108144   -1.18922e-9  -1.25235e-7  -4.98666e-6
  0.0182198     0.00753941   -1.25235e-7   1.37223e-7  -3.25606e-5
  0.000776357   0.000320227  -4.98666e-6  -3.25606e-5  -2.06679e-7
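
To summarize the residual in a single number, we can look at its largest absolute entry and its Frobenius norm. The sketch below is self-contained: it re-enters the rounded values of `Sigma` and `Sigma_fa` printed above instead of calling `R`, so its results agree with the notebook only up to that rounding. The names `residual`, `max_abs` and `fro` are made up for illustration.

```julia
using LinearAlgebra

# Rounded values copied from the outputs above
Sigma = [ 1.0        -0.750024   0.593086  -0.533739  -0.0960452;
         -0.750024    1.0       -0.670946   0.572132   0.107937;
          0.593086   -0.670946   1.0       -0.843965   0.411093;
         -0.533739    0.572132  -0.843965   1.0       -0.373358;
         -0.0960452   0.107937   0.411093  -0.373358   1.0]

Sigma_fa = [ 1.0        -0.750025   0.595699  -0.515519  -0.0952689;
            -0.750025    1.0       -0.669865   0.579672   0.108257;
             0.595699   -0.669865   1.0       -0.843965   0.411088;
            -0.515519    0.579672  -0.843965   1.0       -0.373391;
            -0.0952689   0.108257   0.411088  -0.373391   1.0]

residual = Sigma_fa - Sigma
max_abs = maximum(abs.(residual))   # largest entrywise error
fro = norm(residual)                # norm of a matrix defaults to the Frobenius norm

println("largest absolute residual: ", max_abs)
println("Frobenius norm of residual: ", fro)
```

The largest entry comes from the `Oil`/`Fracture` pair, the same entry that stands out in the residual matrix above.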

**Taking objects from `Julia` to `R`**

What if we want to do the opposite: we have an object in `Julia` that we want to move to the `R` environment and analyze in `R`? We can do that using the `@rput` macro. A simple example follows, where we send the `Julia` matrix `X` to the `R` environment and compute its SVD there. Of course, computing the SVD directly in `Julia` will be faster, so we would not do this in practice; this is just for illustration.

In [11]:
@rput X

5×5 Array{Float64,2}:
  0.66614    -0.750025   0.595699  -0.515519  -0.0952689
 -0.750025    0.844474  -0.669865   0.579672   0.108257
  0.595699   -0.669865   0.957776  -0.843965   0.411088
 -0.515519    0.579672  -0.843965   0.743977  -0.373391
 -0.0952689   0.108257   0.411088  -0.373391   0.593054
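
The two macros combine into a simple round trip: push a `Julia` object to `R` with `@rput`, compute on it there, and pull the result back with `@rget`. A minimal sketch with made-up names (`A`, `A_det`), assuming `RCall` is loaded as above:

```julia
using RCall

A = [2.0 0.0; 0.0 3.0]   # small illustrative Julia matrix

@rput A                  # creates an R variable named A
R"A_det <- det(A)"       # compute the determinant in R
@rget A_det              # copies A_det back into Julia

println(A_det)
```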

In [12]:
R"""
X_svd <- svd(X)
"""

RObject{VecSxp}
$d
[1] 2.840147e+00 9.652738e-01 1.835730e-16 6.889660e-17 5.316257e-17

$u
           [,1]       [,2]        [,3]       [,4]        [,5]
[1,] -0.4318280  0.3760766  0.57192547 -0.2534798 -0.52984927
[2,]  0.4858567 -0.4246170 -0.06708423 -0.5805591 -0.49203098
[3,] -0.5613017 -0.2553929 -0.13261665 -0.6397742  0.43910886
[4,]  0.4919747  0.2420423  0.57970383 -0.2832291  0.53207255
[5,] -0.1427205 -0.7446186  0.56103230  0.3304079  0.03531956

$v
           [,1]       [,2]        [,3]        [,4]       [,5]
[1,] -0.4318280  0.3760766  0.24156539 -0.77208923 -0.1327229
[2,]  0.4858567 -0.4246170  0.65568309 -0.31382198  0.2350299
[3,] -0.5613017 -0.2553929  0.57315684  0.42583932 -0.3314618
[4,]  0.4919747  0.2420423  0.09751987  0.01599846 -0.8304278
[5,] -0.1427205 -0.7446186 -0.41678166 -0.35185069 -0.3573069



In [13]:
@rget X_svd

OrderedCollections.OrderedDict{Symbol,Any} with 3 entries:
  :d => [2.84015, 0.965274, 1.83573e-16, 6.88966e-17, 5.31626e-17]
  :u => [-0.431828 0.376077 … -0.25348 -0.529849; 0.485857 -0.424617 … -0.58055…
  :v => [-0.431828 0.376077 … -0.772089 -0.132723; 0.485857 -0.424617 … -0.3138…
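
For comparison, the same decomposition can be computed natively with `svd` from `Julia`'s `LinearAlgebra` standard library, without the round trip through `R`. The sketch below re-enters the rounded entries of `X` printed above, so the singular values should match the `d` component from `R` only up to that rounding. The names `F` and `X_rebuilt` are just for illustration.

```julia
using LinearAlgebra

# Rounded entries of X = Lambda*Lambda', copied from the output above
X = [ 0.66614    -0.750025   0.595699  -0.515519  -0.0952689;
     -0.750025    0.844474  -0.669865   0.579672   0.108257;
      0.595699   -0.669865   0.957776  -0.843965   0.411088;
     -0.515519    0.579672  -0.843965   0.743977  -0.373391;
     -0.0952689   0.108257   0.411088  -0.373391   0.593054]

F = svd(X)       # F.U, F.S and F.Vt hold the factors
println(F.S)     # singular values in decreasing order

# Sanity check: rebuild X from its factors
X_rebuilt = F.U * Diagonal(F.S) * F.Vt
println(norm(X_rebuilt - X))
```

Only the first two singular values are far from zero, as expected for a matrix built from two factors.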