## BCRA R Package in Python Flask App

- Review in notebook before pivoting to Docker!

In [38]:
import flask
from flask import request, jsonify

import pandas as pd
import os

# Allow R package import and install
from rpy2.robjects.packages import importr
from rpy2.robjects.vectors import StrVector
import rpy2.robjects.packages as rpackages

# Allow conversion for dataframes
import rpy2.robjects as ro
from rpy2.robjects import pandas2ri


## Using `rpy2` for R in Python


- https://pypi.org/project/rpy2/

- https://rpy2.github.io/doc/latest/html/introduction.html


- *On my local MacBook I needed to set the `R_HOME` environment variable*


In [31]:
pip install rpy2

Collecting rpy2
  Downloading https://files.pythonhosted.org/packages/f5/53/bebbaa532ce64c5fb9bd99c37c7ed5ec7c1fd870e5a2d0f370e8a089c72e/rpy2-3.4.4-cp37-cp37m-macosx_10_14_x86_64.whl (235kB)
Installing collected packages: rpy2
Successfully installed rpy2-3.4.4
Note: you may need to restart the kernel to use updated packages.


In [21]:
os.environ['R_HOME'] = '/Library/Frameworks/R.framework/Resources/'
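
The hard-coded path above is specific to one macOS install. As a rough sketch (not what the notebook ran), `R_HOME` could instead be taken from the `R RHOME` command when the `R` executable is on `PATH`, with the hard-coded path as a fallback. The helper names `detect_r_home` and `ensure_r_home` are hypothetical, and this should run before `rpy2` is imported.

```python
import os
import shutil
import subprocess


def detect_r_home():
    """Return the directory printed by `R RHOME`, or None if R is not on PATH."""
    r_binary = shutil.which("R")
    if r_binary is None:
        return None
    try:
        result = subprocess.run(
            [r_binary, "RHOME"],
            capture_output=True,
            text=True,
            check=True,
            timeout=30,
        )
    except (subprocess.SubprocessError, OSError):
        # R exists but could not be queried; let the caller fall back.
        return None
    lines = result.stdout.strip().splitlines()
    return lines[-1] if lines else None


def ensure_r_home(fallback=None):
    """Set R_HOME only if it is unset; return the value in effect (or None)."""
    if os.environ.get("R_HOME"):
        return os.environ["R_HOME"]
    r_home = detect_r_home() or fallback
    if r_home:
        os.environ["R_HOME"] = r_home
    return r_home


# Fallback is the path used in this notebook (an assumption for other machines).
ensure_r_home(fallback="/Library/Frameworks/R.framework/Resources/")
```

An existing `R_HOME` is left untouched, so a value set in the shell or a container image still wins.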

### Install the R packages

I think I had to install the package from R itself rather than through Python ...

- more info here: https://rpy2.github.io/doc/v2.9.x/html/robjects_rpackages.html#installing-removing-r-packages

- Helpful example here as well: https://towardsdatascience.com/an-introduction-to-working-with-r-and-python-1c51fac0b16f



In [24]:
# Choosing a CRAN mirror
import rpy2.robjects.packages as rpackages
utils = rpackages.importr('utils')
utils.chooseCRANmirror(ind=1)

# Installing required packages
from rpy2.robjects.vectors import StrVector
# The trailing comma makes this a one-element tuple; ('BCRA') is just the string 'BCRA'
packages = ('BCRA',)
utils.install_packages(StrVector(packages))

<rpy2.rinterface_lib.sexp.NULLType object at 0x7fe3e052ccd0> [RTYPES.NILSXP]

### Load the BCRA R Package

In [47]:
bcra = importr('BCRA')

### List all functions in the BCRA R package

In [48]:
bcra.__dict__['_rpy2r']

{'___NAMESPACE___': '.__NAMESPACE__.',
 '___S3MethodsTable___': '.__S3MethodsTable__.',
 '_packageName': '.packageName',
 'absolute_risk': 'absolute.risk',
 'check_summary': 'check.summary',
 'error_table': 'error.table',
 'error_table_all': 'error.table.all',
 'list_constants': 'list.constants',
 'recode_check': 'recode.check',
 'relative_risk': 'relative.risk',
 'risk_summary': 'risk.summary'}

## Load our functions for R

- For this example we mainly need `absolute_risk`, but running the error checks (`check_summary`) is recommended

In [50]:
absolute_risk = bcra.absolute_risk
check_summary = bcra.check_summary
relative_risk = bcra.relative_risk
risk_summary = bcra.risk_summary

## Convert Pandas df to R dataframe

- this is sample input BCRA data for testing

In [62]:
df = pd.DataFrame({'id': 1,
                   'T1': 40,
                   'T2': 45,
                   'N_Biop': 1,
                   'HypPlas': 99,
                   'AgeMen': 14,
                   'Age1st': 24,
                   'N_Rels': 1,
                   'Race': 1},
                  index=['id'])

In [81]:
pandas2ri.activate()

# Convert to R dataframe
r_dt = ro.conversion.py2rpy(df) # df is a pd.DataFrame object

# Convert back to pandas DataFrame        
# pd_dt = ro.conversion.rpy2py(r_dt)

In [82]:
r_dt

id,T1,T2,...,Age1st,N_Rels,Race
...,...,...,...,...,...,...


## Pass `r_dt` to the BCRA function for the absolute risk calculation

In [78]:
try:
    print('absolute risk for provided sample df -->\n\n', absolute_risk(data=r_dt))
except Exception as exc:  # show the underlying R/rpy2 error instead of hiding it
    print('failed to get absolute risk:', exc)


absolute risk for provided sample df -->

 [1.64707903]


## Okay cool so it worked! We can also get relative risk & summary 

*Spend some time thinking about containerization for this project*

- Likely a slim Python image with R installed on top? Or maybe there is a slim Linux distro that we can install both on ... we just need to think through the pip dependencies vs. the R dependencies

In [86]:
relative_risk(data=r_dt)

Unnamed: 0,RR_Star1,RR_Star2,PatternNumber
1,4.551158,3.412139,41.0


In [87]:
check_summary(r_dt)

Unnamed: 0,Variable,Label,Mean,StdDev,N,NMiss
1,Error_Ind,"If mean not 0, implies ERROR in file",0.0,,1,0
2,AbsRisk,"Abs risk(%) of BrCa in age interval [T1,T2)",1.64707903057814,,1,0
3,RR_Star1,Relative risk age lt 50,4.55115837466802,,1,0
4,RR_Star2,Relative risk age ge 50,3.41213878855182,,1,0


In [90]:
check_summary(r_dt).to_dict()

{'Variable': {'1': 'Error_Ind',
  '2': 'AbsRisk',
  '3': 'RR_Star1',
  '4': 'RR_Star2'},
 'Label': {'1': 'If mean not 0, implies ERROR in file',
  '2': 'Abs risk(%) of BrCa in age interval [T1,T2)',
  '3': 'Relative risk age lt 50',
  '4': 'Relative risk age ge 50'},
 'Mean': {'1': '0',
  '2': '1.64707903057814',
  '3': '4.55115837466802',
  '4': '3.41213878855182'},
 'StdDev': {'1': None, '2': None, '3': None, '4': None},
 'N': {'1': '1', '2': '1', '3': '1', '4': '1'},
 'NMiss': {'1': '0', '2': '0', '3': '0', '4': '0'}}
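
Note that `to_dict()` returns the `Mean` values as strings, keyed by row label. Before returning this from an API it may help to flatten it into a small numeric dict. The following is a minimal sketch of that step, using a trimmed copy of the output above; the function name `summary_to_payload` and the output keys are hypothetical, not part of BCRA or rpy2.

```python
import json


def summary_to_payload(summary):
    """Flatten a check_summary-style dict into a small JSON-friendly dict."""
    # Map each variable name (e.g. 'AbsRisk') to its row key ('1', '2', ...).
    row_by_variable = {name: row for row, name in summary["Variable"].items()}
    means = summary["Mean"]

    def mean_of(variable):
        # The notebook output stores the means as strings, so convert them.
        return float(means[row_by_variable[variable]])

    return {
        # Per the label above, a non-zero mean for Error_Ind implies an error.
        "has_error": mean_of("Error_Ind") != 0.0,
        "absolute_risk_pct": mean_of("AbsRisk"),
        "rr_star1": mean_of("RR_Star1"),
        "rr_star2": mean_of("RR_Star2"),
    }


# Trimmed copy of the dict printed above (only the columns used here).
sample = {
    "Variable": {"1": "Error_Ind", "2": "AbsRisk", "3": "RR_Star1", "4": "RR_Star2"},
    "Mean": {
        "1": "0",
        "2": "1.64707903057814",
        "3": "4.55115837466802",
        "4": "3.41213878855182",
    },
}

payload = summary_to_payload(sample)
print(json.dumps(payload, indent=2))
```

Looking rows up by variable name rather than by position keeps the sketch working if the row order ever changes.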

### WOW okay so now that we have this deployed ... 

- Here is our sample URL: http://0.0.0.0:5000/api/example_df?id=0&T1=40&T2=45&N_Biop=1&HypPlas=99&AgeMen=14&Age1st=24&N_Rels=1&Race=1
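
The Flask view itself is not shown in this notebook, so the following is only a sketch of the parsing step such an endpoint would need: turning the query string into the single-row record that the sample `df` is built from. It uses the standard library so it runs without Flask; inside a Flask view, `request.args` would presumably take the place of the parsed query. The function name `record_from_url`, the assumption that every field is an integer, and the `T1 < T2` check are mine.

```python
from urllib.parse import parse_qs, urlsplit

# Column names taken from the sample DataFrame earlier in the notebook.
BCRA_FIELDS = ("id", "T1", "T2", "N_Biop", "HypPlas", "AgeMen", "Age1st", "N_Rels", "Race")

SAMPLE_URL = (
    "http://0.0.0.0:5000/api/example_df?id=0&T1=40&T2=45&N_Biop=1"
    "&HypPlas=99&AgeMen=14&Age1st=24&N_Rels=1&Race=1"
)


def record_from_url(url):
    """Parse the BCRA query parameters of a URL into a dict of integers."""
    query = parse_qs(urlsplit(url).query)
    missing = [name for name in BCRA_FIELDS if name not in query]
    if missing:
        raise ValueError("missing query parameters: " + ", ".join(missing))
    try:
        # parse_qs returns a list per key; take the first value of each.
        record = {name: int(query[name][0]) for name in BCRA_FIELDS}
    except ValueError as exc:
        raise ValueError(f"non-integer query parameter: {exc}") from exc
    # The risk is projected over the age interval [T1, T2), so T1 must come first.
    if record["T1"] >= record["T2"]:
        raise ValueError("T1 must be smaller than T2")
    return record


record = record_from_url(SAMPLE_URL)
print(record)
```

The resulting dict could then be passed to `pd.DataFrame(record, index=[0])` and converted with `py2rpy` as in the cells above.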