# Tutorial: INTEGRATE using the ENGRO2 model

This tutorial explains how to use the INTEGRATE pipeline, running each step's script in turn and inspecting the output it produces.

In [1]:
import pandas as pd
import cobra as cb

### Step 1: getGPRsFromModel

To run the script from the Jupyter notebook:  

```python
!python pipeline/getGPRsFromModel.py
```

Alternatively, to run the script from the terminal:

```bash
python pipeline/getGPRsFromModel.py
```

The script produces the following output:

In [2]:
dfOutput1 = pd.read_csv('./outputs/ENGRO2_GPR.csv', sep = '\t')
dfOutput1.head()

Unnamed: 0,id,rule
0,EX_lac__L_e,
1,EX_glc__D_e,
2,EX_gluIN__L_e,
3,EX_gluOUT__L_e,
4,EX_gln__L_e,
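
The `Unnamed: 0` column in the output above is the row index that was saved with the file and read back as ordinary data. As a minimal sketch, assuming the file is tab-separated with the index in its first column (as the output suggests), passing `index_col=0` restores it as the index. The in-memory `sample` string and the name `dfGPR` are illustrative stand-ins, not part of the pipeline.

```python
import io
import pandas as pd

# In-memory stand-in for './outputs/ENGRO2_GPR.csv' (tab-separated, first
# column is the saved row index); the rows mirror the output shown above.
sample = (
    "\tid\trule\n"
    "0\tEX_lac__L_e\t\n"
    "1\tEX_glc__D_e\t\n"
    "2\tEX_gluIN__L_e\t\n"
)

# index_col=0 uses the first column as the index instead of 'Unnamed: 0'.
dfGPR = pd.read_csv(io.StringIO(sample), sep="\t", index_col=0)

# These exchange reactions have an empty rule, which pandas reads as NaN.
print(dfGPR.columns.tolist())
print(dfGPR["rule"].isna().all())
```

The same `index_col=0` argument can be passed to the other `pd.read_csv` calls in this tutorial, since their outputs show the same `Unnamed: 0` column.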


### Step 2: getRASscore

To run the script from the Jupyter notebook:  

```python
!python pipeline/getRASscore.py
```

Alternatively, to run the script from the terminal:

```bash
python pipeline/getRASscore.py
```

The script produces the following output:

In [3]:
dfOutput1 = pd.read_csv('./outputs/ENGRO2_RAS.csv', sep = '\t')
dfOutput1.head()

Unnamed: 0,Rxn,Cellule361_Rep1,Cellule361_Rep2,Cellule361_Rep3,MDA-MB-231_Rep1,MDA-MB-231_Rep2,MDA-MB-231_Rep3,MCF-7_Rep1,MCF-7_Rep2,MCF-7_Rep3,SK-BR-3_Rep1,SK-BR-3_Rep2,SK-BR-3_Rep3,MCF102A_Rep1,MCF102A_Rep2,MCF102A_Rep3
0,EX_lac__L_e,,,,,,,,,,,,,,,
1,EX_glc__D_e,,,,,,,,,,,,,,,
2,EX_gluIN__L_e,,,,,,,,,,,,,,,
3,EX_gluOUT__L_e,,,,,,,,,,,,,,,
4,EX_gln__L_e,,,,,,,,,,,,,,,
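
The rows above are empty, presumably because these exchange reactions have no GPR rule (their `rule` field in Step 1 is empty). For reactions that do have a rule, a Reaction Activity Score can be derived from the expression levels of the genes in the rule. The sketch below assumes one common convention, `and` resolved as the minimum and `or` as the sum; this convention, the function name `ras_from_rule`, and the gene names are assumptions for illustration and may differ from what `getRASscore.py` actually does.

```python
import ast

def ras_from_rule(rule, expression):
    """Score a GPR rule: 'and' -> min, 'or' -> sum (assumed convention)."""
    if not rule or not rule.strip():
        return None  # no GPR rule, so no score can be computed

    def score(node):
        if isinstance(node, ast.BoolOp):
            values = [score(v) for v in node.values]
            return min(values) if isinstance(node.op, ast.And) else sum(values)
        if isinstance(node, ast.Name):
            return expression[node.id]  # expression level of a single gene
        raise ValueError("unsupported GPR syntax")

    # GPR rules use Python-like 'and'/'or' syntax, so the ast module can
    # parse them as long as gene ids are valid Python identifiers.
    return score(ast.parse(rule, mode="eval").body)

# Hypothetical gene expression values for one sample
expression = {"geneA": 4.0, "geneB": 1.5, "geneC": 2.0}
print(ras_from_rule("(geneA and geneB) or geneC", expression))  # 3.5
print(ras_from_rule("", expression))  # None
```

Gene identifiers that are not valid Python names (for example ones containing a colon) would need a dedicated parser rather than `ast`.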


### Step 3: getNormalizedRAS

To run the script from the Jupyter notebook:  

```python
!python pipeline/getNormalizedRAS.py
```

Alternatively, to run the script from the terminal:

```bash
python pipeline/getNormalizedRAS.py
```

The script produces the following output:

In [4]:
dfOutput1 = pd.read_csv('./outputs/ENGRO2_wNormalizedRAS.csv', sep = '\t')
dfOutput1.head()

Unnamed: 0,Rxn,media_MDAMB361,media_MDAMB231,media_MCF7,media_SKBR3,media_MCF102A,norm_MDAMB361,norm_MDAMB231,norm_MCF7,norm_SKBR3,norm_MCF102A
0,EX_lac__L_e,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0
1,EX_glc__D_e,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0
2,EX_gluIN__L_e,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0
3,EX_gluOUT__L_e,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0
4,EX_gln__L_e,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0


### Step 4: rasIntegration

To run the script from the Jupyter notebook:  

```python
!python pipeline/rasIntegration.py
```

Alternatively, to run the script from the terminal:

```bash
python pipeline/rasIntegration.py
```

As a proof of concept, one of the resulting SBML models is:

In [5]:
model1 = cb.io.read_sbml_model('./models/ENGRO2_MCF102A.xml')
model1

0,1
Name,dc
Memory address,0x017c588ab340
Number of metabolites,422
Number of reactions,496
Number of groups,0
Objective expression,1.0*Biomass - 1.0*Biomass_reverse_57a34
Compartments,"cytosol, mitochondrium, extracellular"


### Step 5: Model splitting

Each model must be converted to a `.mat` file so that the MATLAB function that converts a model from the reversible to the irreversible format can be applied to it.
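
To illustrate what the irreversible format means, the sketch below splits each reaction into non-negative forward and backward parts based on its flux bounds. It is a conceptual illustration in Python, not the MATLAB function used by the pipeline; the function name `split_reversible` and the `_f`/`_r` suffixes are hypothetical.

```python
def split_reversible(reactions):
    """Split reactions (id -> (lb, ub)) into non-negative directions."""
    irreversible = {}
    for rxn_id, (lb, ub) in reactions.items():
        if lb < 0:
            # Backward part: carries the negative flux range as positive flux
            irreversible[rxn_id + "_r"] = (max(-ub, 0.0), -lb)
        if ub > 0:
            # Forward part: carries the positive flux range
            irreversible[rxn_id + "_f"] = (max(lb, 0.0), ub)
    return irreversible

# Hypothetical bounds: R1 is reversible, R2 is already forward-only
bounds = {"R1": (-10.0, 5.0), "R2": (0.0, 8.0)}
for rxn_id, (lb, ub) in split_reversible(bounds).items():
    print(rxn_id, lb, ub)
```

After splitting, every reaction has a lower bound of at least zero, so a flux value directly indicates the direction in which the reaction runs.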

### Step 6: randomSampling

To run the script from the Jupyter notebook:  

```python
!python pipeline/randomSampling.py
```

Alternatively, to run the script from the terminal:

```bash
python pipeline/randomSampling.py
```

As a proof of concept, one of the resulting files is:

In [6]:
dfOutput1 = pd.read_csv('./outputs/randomSampling_ENGRO2_nSol_5_MCF102A.csv', sep = '\t')
dfOutput1.head()

Unnamed: 0,EX_lac__L_e,EX_glc__D_e,EX_gluIN__L_e,EX_gluOUT__L_e,EX_gln__L_e,EX_asp__L_e,DM_asp__L_e,EX_co2_e,EX_h_e,EX_h2o_e,...,EX_ala_B,Transport_octanoyl_ACP_c_e,EX_octanoyl_ACP,TMDK1,THYMDt1,EX_thymd,Transport_HC00576_c_e,EX_HC00576,Transport_4abut_c_e,EX_4abut
0,2.464945e-08,-0.004107,0.003891,0.003966,-0.039431,0.04898,0.033254,126.251215,-32.214053,-62.441649,...,27.068089,1.799048e-15,0.0,3e-06,-3e-06,3e-06,0.074591,0.074591,2.073726,2.073726
1,0.001361104,-0.002574,0.002315,0.003113,-0.0325,0.037759,0.432276,112.28776,-38.757004,-80.997182,...,28.692506,7.512879e-15,0.0,1.6e-05,-1.6e-05,1.6e-05,0.977523,0.977523,2.914501,2.914501
2,0.001049598,-0.00279,0.000872,0.001581,-0.021034,0.030392,0.197926,127.319655,-62.67246,-99.36884,...,36.543349,7.288174e-15,0.0,7e-06,-7e-06,7e-06,1.892707,1.892707,7.621385,7.621385
3,0.0001251162,-0.003596,0.002493,0.002614,-0.025945,0.026382,0.43092,112.14203,-62.474618,-105.077645,...,37.093366,2.747854e-15,0.0,2.3e-05,-2.3e-05,2.3e-05,0.739704,0.739704,7.797594,7.797594
4,0.0001216471,-0.003724,0.001844,0.002004,-0.025776,0.022455,0.438283,103.890213,-48.056616,-100.932832,...,34.307676,5.749341e-17,0.0,2.1e-05,-2.1e-05,2.1e-05,1.281558,1.281558,8.426598,8.426598


### Step 7: mannWhitneyUTest

To run the script from the Jupyter notebook:  

```python
!python pipeline/mannWhitneyUTest.py nSamples
```

Alternatively, to run the script from the terminal:

```bash
python pipeline/mannWhitneyUTest.py nSamples
```

As a proof of concept, one of the resulting files is:

In [7]:
dfOutput1 = pd.read_csv('./outputs/mwuTest_MCF102A_vs_SKBR3.csv', sep = '\t')
dfOutput1.head()

Unnamed: 0,Rxn,statistic_less,pvalue_less,statistic_greater,pvalue_greater,mean_MCF102A,median_MCF102A,std_MCF102A,mean_SKBR3,median_SKBR3,std_SKBR3
0,EX_lac__L_e,5.5,0.086609,5.5,0.941963,0.000532,0.00013,0.000626,0.002572,0.00263,0.002423
1,EX_glc__D_e,25.0,0.996692,25.0,0.006093,-0.003358,-0.0036,0.000652,-0.01255,-0.0121,0.001954
2,EX_gluIN__L_e,25.0,0.998115,25.0,0.003747,0.00228,0.00231,0.001098,0.0,0.0,0.0
3,EX_gluOUT__L_e,0.0,0.006093,0.0,0.996692,0.002654,0.00261,0.000938,0.029212,0.02965,0.004571
4,EX_gln__L_e,0.0,0.006093,0.0,0.996692,-0.028936,-0.02594,0.007146,-0.01508,-0.01521,0.000504
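
The column names above suggest that, for each reaction, two one-sided Mann-Whitney U tests are run on the sampled fluxes of the two models, alongside per-model summary statistics. The sketch below reproduces that idea for a single reaction with `scipy.stats.mannwhitneyu`; using SciPy is an assumption, since the tutorial does not show how `mannWhitneyUTest.py` is implemented. `flux_a` holds the five `EX_glc__D_e` values from the Step 6 output for MCF102A, while `flux_b` is hypothetical.

```python
import statistics
from scipy.stats import mannwhitneyu

# EX_glc__D_e fluxes sampled for MCF102A (taken from the Step 6 output)
flux_a = [-0.004107, -0.002574, -0.00279, -0.003596, -0.003724]
# Hypothetical fluxes for a second model (not taken from the pipeline)
flux_b = [-0.0101, -0.0121, -0.0128, -0.0125, -0.0152]

# One-sided tests: is flux_a stochastically less / greater than flux_b?
res_less = mannwhitneyu(flux_a, flux_b, alternative="less")
res_greater = mannwhitneyu(flux_a, flux_b, alternative="greater")

row = {
    "pvalue_less": res_less.pvalue,
    "pvalue_greater": res_greater.pvalue,
    "mean_a": statistics.mean(flux_a),
    "median_a": statistics.median(flux_a),
    "std_a": statistics.stdev(flux_a),  # sample standard deviation
}
print(row)
```

A small `pvalue_greater` together with a large `pvalue_less` indicates that the flux through the reaction tends to be higher in the first model than in the second.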
