# Working with ECoL in Python

ECoL is written in R, so to use it from Python we need a bridge that lets Python call R packages. Here we use a package called `rpy2`.

## Setup of `rpy2`

### Installing `rpy2`

In [36]:
import sys

!{sys.executable} -m pip install rpy2



### Importing `rpy2`

In [37]:
import rpy2
print(rpy2.__version__)

3.4.5


### Verifying the setup with `rpy2.situation`

In [38]:
!{sys.executable} -m rpy2.situation


rpy2 version:
3.4.5
Python version:
3.9.7 (tags/v3.9.7:1016ef3, Aug 30 2021, 20:19:38) [MSC v.1929 64 bit (AMD64)]
Looking for R's HOME:
    Environment variable R_HOME: C:/Program Files/R/R-4.1.1
    InstallPath in the registry: C:\Program Files\R\R-4.1.1
    Environment variable R_USER: C:\Users\steff\Documents
    Environment variable R_LIBS_USER: C:\Users\steff\Documents/R/win-library/4.1
R version:
    In the PATH: 
    Loading R library from rpy2: OK
Additional directories to load R packages from:
None
C extension compilation:


R version 4.1.1 (2021-08-10) -- "Kick Things"
Copyright (C) 2021 The R Foundation for Statistical Computing
Platform: x86_64-w64-mingw32/x64 (64-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under the terms of the
GNU General Public License versions 2 or 3.
For more information about these matters see
https://www.gnu.org/licenses/.

'sh' is not recognized as an internal or external command,
operable program or batch file.
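The report above shows `R_HOME` set but nothing listed under "In the PATH", and the final step fails because `sh` is not available on this Windows machine. A minimal sketch for checking the same settings from Python, using only the standard library, is shown below. The helper name `describe_r_environment` is hypothetical and not part of `rpy2`.

```python
import os
import shutil


def describe_r_environment(env=None):
    """Collect the R-related settings that rpy2 relies on to locate R.

    `env` is a mapping of environment variables; it defaults to os.environ.
    This is a hypothetical helper for illustration, not an rpy2 function.
    """
    env = os.environ if env is None else env
    r_home = env.get("R_HOME")
    return {
        "R_HOME": r_home,
        # True only if R_HOME is set and points at an existing directory
        "R_HOME_exists": bool(r_home) and os.path.isdir(r_home),
        # Full path of the R executable if it is found on PATH, else None
        "R_on_PATH": shutil.which("R", path=env.get("PATH", "")),
    }


if __name__ == "__main__":
    for key, value in describe_r_environment().items():
        print(f"{key}: {value}")
```

If `R_HOME_exists` is false, `rpy2` will probably not be able to load R, so this is worth checking before importing `rpy2.robjects`.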


## Setup of `ECoL` package

### Setting up a CRAN mirror

In [39]:
# import rpy2's package module
import rpy2.robjects.packages as rpackages

# import R's utility package
utils = rpackages.importr('utils')

# select a mirror for R packages
utils.chooseCRANmirror(ind=1)  # select the first mirror in the list

<rpy2.rinterface_lib.sexp.NULLType object at 0x000001F6905495C0> [RTYPES.NILSXP]

### Installing the `ECoL` package

In [40]:
utils.install_packages("ECoL")

R[write to console]:  package 'ECoL' is in use and will not be installed



<rpy2.rinterface_lib.sexp.NULLType object at 0x000001F6905495C0> [RTYPES.NILSXP]
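The message above indicates that `ECoL` was already loaded in the running R session, so the install was skipped. One way to avoid calling the installer unnecessarily is to install only the packages that are missing. The sketch below assumes `rpackages.isinstalled` is available in the installed `rpy2` version; the helper names `missing_packages` and `install_missing` are hypothetical.

```python
def missing_packages(names, is_installed):
    """Return the names for which the `is_installed` predicate is false."""
    return [name for name in names if not is_installed(name)]


def install_missing(names):
    """Install only the R packages that are not yet available.

    Requires rpy2 and a working R installation. Hypothetical helper.
    """
    # Imported lazily so this module can be loaded without rpy2 present
    import rpy2.robjects.packages as rpackages
    from rpy2.robjects.vectors import StrVector

    to_install = missing_packages(names, rpackages.isinstalled)
    if to_install:
        utils = rpackages.importr("utils")
        utils.chooseCRANmirror(ind=1)  # first mirror in the list, as above
        utils.install_packages(StrVector(to_install))
    return to_install
```

In the notebook this would be called as `install_missing(["ECoL"])`, which returns an empty list when the package is already installed.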

### Importing the `ECoL` package

In [41]:
ecol = rpackages.importr('ECoL')

### Testing `ECoL`

First, we create a DataFrame with pandas:

In [42]:
import pandas as pd
import numpy as np
from sklearn.datasets import load_iris

iris = load_iris()

data = pd.DataFrame(data=np.c_[iris['data'], iris['target']],
                     columns=iris['feature_names'] + ['target'])
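The iris feature names, such as `sepal length (cm)`, contain spaces and parentheses, so they are not syntactic names in R. The formula `target ~ .` used later does not name individual columns, so this is optional here, but a formula that refers to specific features would be easier to write with cleaned names. The sketch below is a hypothetical helper, `r_safe_names`, and it does not handle duplicate names or R reserved words.

```python
import re


def r_safe_names(columns):
    """Map column labels to names usable unquoted in an R formula (sketch)."""
    safe = []
    for label in columns:
        # Collapse each run of characters other than letters, digits,
        # or underscores into a single underscore, then trim the ends
        name = re.sub(r"[^0-9A-Za-z_]+", "_", str(label)).strip("_")
        # Prefix names that are empty or do not start with a letter
        if not name or not name[0].isalpha():
            name = "x" + name
        safe.append(name)
    return safe
```

With the DataFrame above, `data.columns = r_safe_names(data.columns)` would rename `sepal length (cm)` to `sepal_length_cm` and leave `target` unchanged.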

Then we convert that pandas DataFrame to an R data frame:

In [43]:
from rpy2.robjects import r, pandas2ri

pandas2ri.activate()

r_data = pandas2ri.py2rpy(data)

And now we can use `ECoL` to measure the complexity of the `iris` dataset:

In [44]:
from rpy2.robjects import Formula

fml = Formula('target ~ .')

ecol.complexity(fml, r_data)

array([0.77799435, 0.23444487, 0.25166667, 0.41527768, 0.20666667,
       0.08265206, 0.06929402, 0.01160101,        nan, 0.01000491,
       0.01400609, 0.03355705, 0.12532019, 0.11967219, 0.07268981,
       0.015     , 0.05957061, 0.00103078, 0.00987399, 0.02666667,
       0.01333333, 0.5       ])

In [45]:
ecol.complexity(fml, r_data, groups="linearity")

array([0.08265206, 0.06929402, 0.01160101,        nan, 0.00982387,
       0.01316999])
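Both results come back as unnamed arrays, and each contains one `nan`. The sketch below copies the printed values into plain Python lists, locates the undefined entry, and compares the linearity-only result with the matching slice of the full result. The offset `5` is read off the printed output above and is an assumption, not a documented ordering of the `ECoL` measures.

```python
import math

# Values copied from the two outputs above
full = [0.77799435, 0.23444487, 0.25166667, 0.41527768, 0.20666667,
        0.08265206, 0.06929402, 0.01160101, float("nan"), 0.01000491,
        0.01400609, 0.03355705, 0.12532019, 0.11967219, 0.07268981,
        0.015, 0.05957061, 0.00103078, 0.00987399, 0.02666667,
        0.01333333, 0.5]
linearity = [0.08265206, 0.06929402, 0.01160101, float("nan"),
             0.00982387, 0.01316999]

# Positions of undefined (NaN) measures in the full result
nan_positions = [i for i, v in enumerate(full) if math.isnan(v)]

# Slice of the full result that appears to line up with the linearity call
start = 5  # assumption: inferred from the printed values
segment = full[start:start + len(linearity)]


def same(a, b, tol=1e-8):
    """Compare two floats, treating a pair of NaNs as equal."""
    return (math.isnan(a) and math.isnan(b)) or math.isclose(a, b, abs_tol=tol)


matches = [same(a, b) for a, b in zip(segment, linearity)]
print(nan_positions)
print(matches)
```

The first four values agree between the two calls and the last two do not. That would be consistent with a measure that involves random sampling, but the output alone does not confirm it, so repeated runs may be needed before comparing these values across experiments.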