# Py - `rpy2` (Python 3.6 `rpy2` 2.9.1)

https://rpy2.bitbucket.io/  
`rpy2` is an interface to R running embedded in a Python process.  
This notebook is for **Python 3.6** and **`rpy2` 2.9.1.**    
This runs locally, but not on Azure.

### Purpose:
Mix Python and R code in a single notebook.  
https://stackoverflow.com/questions/39008069/r-and-python-in-one-jupyter-notebook  
https://rpy2.github.io/doc/v2.9.x/html/interactive.html?highlight=magic#overview   

#### Problems
On Azure Notebooks, `rpy2` may only work with Python 2 (unconfirmed): https://github.com/Microsoft/AzureNotebooks/issues/35  
Related `librblas.so` problems with `rpy2` on Ubuntu: https://stats.stackexchange.com/questions/6056/problems-with-librblas-so-on-ubuntu-with-rpy2  

### Install & load library

In [1]:
!pip install rpy2



In [2]:
import rpy2
rpy2.__path__

['C:\\Program Files (x86)\\Microsoft Visual Studio\\Shared\\Anaconda3_64\\lib\\site-packages\\rpy2-2.9.1-py3.6-win-amd64.egg\\rpy2']

In [3]:
print(rpy2.__version__)

2.9.1
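
Since this notebook is only known to work with `rpy2` 2.9.1, a small guard can flag a different release series before any R code runs. The sketch below is plain Python; `version_tuple` and `same_series` are hypothetical helper names, not part of `rpy2`. In the notebook, the string passed in would be `rpy2.__version__`.

```python
# Hypothetical helpers (not part of rpy2) for comparing version strings.
TESTED = (2, 9, 1)  # the rpy2 version this notebook was run with


def version_tuple(version):
    """Turn '2.9.1' into (2, 9, 1), stopping at the first non-numeric part."""
    parts = []
    for piece in version.split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


def same_series(version, tested=TESTED):
    """True when major and minor numbers match the tested release."""
    return version_tuple(version)[:2] == tested[:2]


print(same_series("2.9.1"))  # True
print(same_series("3.0.5"))  # False
```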


In [4]:
%load_ext rpy2.ipython

### `sessionInfo`

In [6]:
%R sessionInfo()

R version 3.4.3 (2017-11-30), nickname 'Kite-Eating Tree'  
Platform: x86_64-w64-mingw32/x64 (64-bit); arch x86_64; os mingw32  
Language: R  
Locale: LC_COLLATE=English_United Kingdom.1252;LC_CTYPE=...


### `robjects`

In [7]:
import rpy2.robjects as robjects

#### Package module

In [8]:
# import rpy2's package module
import rpy2.robjects.packages as rpackages

# import R's utility package
utils = rpackages.importr('utils')

# select a mirror for R packages
utils.chooseCRANmirror(ind=1) # select the first mirror in the list

rpy2.rinterface.NULL

### Installed packages

In [5]:
%%R
ip <- as.data.frame(installed.packages()[,c(1,3:4)])
rownames(ip) <- NULL
ip <- ip[is.na(ip$Priority),1:2,drop=FALSE]

In [6]:
%R print(ip, row.names=FALSE)

| Row | Package | Version |
|---|---|---|
| 1 | assertthat | 0.2.0 |
| 3 | BH | 1.65.0-1 |
| 4 | bindr | 0.1 |
| 5 | bindrcpp | 0.2 |
| 6 | bit | 1.1-12 |
| 7 | bit64 | 0.9-7 |
| 8 | blob | 1.1.0 |
| 9 | cli | 1.0.0 |
| 11 | crayon | 1.3.4 |
| 13 | DBI | 0.7 |
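
A list like the one above can be used to decide which packages still need installing before a later cell calls `library()`. The sketch below is plain Python and runs without R: `missing_packages` is a hypothetical helper, and the `installed` set is a stand-in built from a few names in the listing. With `rpy2` loaded, the assumption is that `rpackages.isinstalled` could be passed as the predicate instead.

```python
from typing import Callable, Iterable, List


def missing_packages(wanted: Iterable[str],
                     is_installed: Callable[[str], bool]) -> List[str]:
    """Return the wanted package names for which is_installed() is False."""
    return [name for name in wanted if not is_installed(name)]


# Stand-in for the R library: a few names taken from the listing above.
installed = {"assertthat", "BH", "bit", "DBI"}

to_install = missing_packages(["DBI", "ggplot2"],
                              lambda name: name in installed)
print(to_install)  # ['ggplot2']
```

The resulting names could then be handed to R's `install.packages`, for example through the `utils` object imported earlier.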


### Magics

https://rpy2.github.io/doc/v2.9.x/html/interactive.html?highlight=magic#module-rpy2.ipython.rmagic

In [7]:
%load_ext rpy2.ipython

The rpy2.ipython extension is already loaded. To reload it, use:
  %reload_ext rpy2.ipython


In [13]:
%R X=c(1,4,5,7)

array([ 1.,  4.,  5.,  7.])

In [14]:
%%R
X2=c(1,2,3,4)
mean(X2)

In [15]:
%R mean(X)

array([ 4.25])

In [17]:
%%R
Y = c(2,4,3,9)
s = summary(lm(Y~X))

In [18]:
%R s

`summary.lm` object for `lm(Y ~ X)`; components visible in the rendered output:  
residuals: 0.88, -0.24, -2.28, 1.64  
adj.r.squared: 0.548966  
fstatistic: 4.651376 (value), 1 (numdf), 2 (dendf)  
cov.unscaled: 1.213333, -0.226667, -0.226667, 0.053333
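
To check that the values coming back from R are what they appear to be, the same simple regression can be worked by hand. This is a sketch in plain Python using the textbook least-squares formulas, not a description of how R's `lm` computes its result. It should reproduce the residuals, adjusted R-squared and F statistic shown in the summary output above.

```python
X = [1, 4, 5, 7]
Y = [2, 4, 3, 9]
n = len(X)

x_bar = sum(X) / n
y_bar = sum(Y) / n

# Least-squares slope and intercept for Y ~ X
sxx = sum((x - x_bar) ** 2 for x in X)
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(X, Y))
slope = sxy / sxx
intercept = y_bar - slope * x_bar

# Residuals and the sums of squares derived from them
residuals = [y - (intercept + slope * x) for x, y in zip(X, Y)]
rss = sum(e ** 2 for e in residuals)      # residual sum of squares
tss = sum((y - y_bar) ** 2 for y in Y)    # total sum of squares

r_squared = 1 - rss / tss
adj_r_squared = 1 - (1 - r_squared) * (n - 1) / (n - 2)
f_statistic = (tss - rss) / (rss / (n - 2))  # 1 and n - 2 degrees of freedom

print([round(e, 2) for e in residuals])
print(round(adj_r_squared, 6), round(f_statistic, 6))
```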


In [19]:
%R sd(X)

array([ 2.5])
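
The two summary statistics returned by R can be cross-checked with Python's standard library. A minimal sketch: `statistics.stdev` is the sample standard deviation (divisor n - 1), which is the same definition R's `sd` uses.

```python
import statistics

X = [1, 4, 5, 7]

x_mean = statistics.mean(X)   # compare with %R mean(X)
x_sd = statistics.stdev(X)    # compare with %R sd(X)

print(x_mean, x_sd)
```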

### Matloff

http://heather.cs.ucdavis.edu/~matloff/rpy2.html

In [None]:
from rpy2.robjects import r
r('x <- rnorm(100)')  # generate x at R
r('y <- x + rnorm(100,sd=0.5)')  # generate y at R

In [26]:
r('plot(x,y)')  # have R plot them

rpy2.rinterface.NULL

In [30]:
r('lmout <- lm(y~x)')  # run the regression
r('print(lmout)')  # print from R

`lm` object for `lm(y ~ x)`; components visible in the rendered output (the data are random, so values change between runs):  
coefficients: 0.035741 (intercept), 1.029290 (x)  
residuals (100 values): -0.127094, -0.649659, -0.192780, 0.070212, ..., -0.152575, -0.437929, 0.266562, 0.877631  
effects (100 values): -2.298883, 11.303722, -0.182198, 0.111087, ..., -0.291888, -0.489297, 0.274761, 0.885565  
model frame, first rows (y, x): (-0.03603, 0.053749), (-1.928528, -1.2772)
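
The fitted coefficients are close to 0 and 1 because `y` was generated as `x` plus noise with standard deviation 0.5. The sketch below repeats the experiment in plain Python as an illustration only: the seed and the `random.Random` generator are assumptions, so the numbers will not match R's `rnorm` draws, but the slope should again land near 1 and the intercept near 0.

```python
import random

rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Mirror the R calls: x <- rnorm(100); y <- x + rnorm(100, sd=0.5)
x = [rng.gauss(0, 1) for _ in range(100)]
y = [xi + rng.gauss(0, 0.5) for xi in x]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# Ordinary least-squares estimates for y ~ x
sxx = sum((xi - x_bar) ** 2 for xi in x)
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
slope = sxy / sxx
intercept = y_bar - slope * x_bar

print(round(intercept, 3), round(slope, 3))
```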


In [28]:
loclmout = r('lmout') # download lmout from R to Python
print(loclmout)  # print locally
# print(loclmout.rx2('coefficients'))  # print one component; rpy2 2.x uses rx2() rather than .r[...]



Call:  
lm(formula = y ~ x)

Coefficients:  
(Intercept)            x  
    0.03574      1.02929  





In [29]:
loclmout

Same rendered `lm` object as returned by `r('print(lmout)')` above: coefficients 0.035741 (intercept) and 1.029290 (x), followed by the residuals, effects and model frame.


### Plotting example

https://stackoverflow.com/questions/39008069/r-and-python-in-one-jupyter-notebook

In [10]:
#%R library("ggplot2", lib.loc="C:/Users/Ian Buckley/Documents/R/win-library/3.3")
%R install.packages("ggplot2")


Error in install.packages("ggplot2") : unable to install packages


PermissionError: [WinError 32] The process cannot access the file because it is being used by another process: 'C:\\Users\\IANBUC~1\\AppData\\Local\\Temp\\tmpf3hdvr3s\\Rplots001.png'

In [8]:
%R library("ggplot2")


Error in library("ggplot2") : there is no package called 'ggplot2'


PermissionError: [WinError 32] The process cannot access the file because it is being used by another process: 'C:\\Users\\IANBUC~1\\AppData\\Local\\Temp\\tmptz9d7_a9\\Rplots001.png'

In [None]:
# enables the %%R magic, not necessary if you've already done this
%load_ext rpy2.ipython

import pandas as pd
df = pd.DataFrame({
    'cups_of_coffee': [0, 1, 2, 3, 4, 5, 6, 7, 8, 9],
    'productivity': [2, 5, 6, 8, 9, 8, 0, 1, 0, -1]
})

In [44]:
# Import df from the Python (global) namespace with -i.
# Set the figure size to 5 by 5 inches at 200 dpi resolution.
# Note: %%R must be the first line of its cell, so these comments sit in a separate cell.

In [None]:
%%R -i df -w 5 -h 5 --units in -r 200
install.packages("ggplot2", repos='http://cran.us.r-project.org', quiet=TRUE)
library(ggplot2)
ggplot(df, aes(x=cups_of_coffee, y=productivity)) + geom_line()

Objects can be passed between Python and R with the `-i` (input) and `-o` (output) flags of the `%R` line magic:

In [None]:
import numpy as np
Z = np.array([1,4,5,10])

In [None]:
%R -i Z mean(Z)

In [None]:
%R -o W W=Z*mean(Z)

In [None]:
W
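
The cells above have no recorded output, so as a cross-check here is the same arithmetic in plain Python. It shows the values R should hand back in `W` for `W=Z*mean(Z)`; it does not show the type `rpy2` returns, which is not recorded in this notebook.

```python
Z = [1, 4, 5, 10]

z_mean = sum(Z) / len(Z)        # R: mean(Z)
W = [z * z_mean for z in Z]     # R: Z*mean(Z), applied element by element

print(z_mean)  # 5.0
print(W)       # [5.0, 20.0, 25.0, 50.0]
```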