Welcome! This is a tutorial about RASpy (Reaction Activity Scores in Python). 
In this notebook, we will show how to compute the RAS matrix, using a gene count matrix and a metabolic model.

## Load the data

Load the metabolic model

In [1]:
from cobra.io import read_sbml_model
model=read_sbml_model('../metabolic_models/RECON3_ensg.xml')
model

0,1
Name,Recon3D
Memory address,0x01726b9c86a0
Number of metabolites,5835
Number of reactions,10600
Number of groups,0
Objective expression,1.0*BIOMASS_maintenance - 1.0*BIOMASS_maintenance_reverse_5b3f9
Compartments,"cytosol, lysosome, mitochondria, endoplasmic reticulum, extracellular space, peroxisome/glyoxysome, nucleus, golgi apparatus, inner mitochondrial compartment"


Load the count matrix (h5ad format). The dataset is reported as TPM (transcripts per million) and was downloaded from the EBI Single Cell Expression Atlas (https://www.ebi.ac.uk/gxa/sc/experiments/E-GEOD-86618/downloads).

In [2]:
import scanpy as sc
adata=sc.read_h5ad("../datasets/E-GEOD-86618_tpm")
adata

AnnData object with n_obs × n_vars = 540 × 23909
    obs: 'Sample Characteristic[organism]', 'Sample Characteristic Ontology Term[organism]', 'Sample Characteristic[individual]', 'Sample Characteristic Ontology Term[individual]', 'Sample Characteristic[organism part]', 'Sample Characteristic Ontology Term[organism part]', 'Sample Characteristic[cell type]', 'Sample Characteristic Ontology Term[cell type]', 'Sample Characteristic[facs marker]', 'Sample Characteristic Ontology Term[facs marker]', 'Sample Characteristic[disease]', 'Sample Characteristic Ontology Term[disease]', 'Factor Value[single cell identifier]', 'Factor Value Ontology Term[single cell identifier]', 'Factor Value[disease]', 'Factor Value Ontology Term[disease]'

## Compute RAS values

In [3]:
import sys
sys.path.insert(1, '../raspy/')

In [4]:
from ras import RAS_computation as rc

In [5]:
ras_object=rc(adata,model)

The default functions used to evaluate the OR and AND operators are the sum and the min functions, respectively. When the expression value of a gene is missing from the count matrix, it is referred to as NaN (Not a Number). If such a gene is joined with an AND operator in a given GPR rule, the user can choose to resolve the rule ‘A AND NaN’ as A, or to disregard the rule altogether (i.e., treat its result as NaN).

In [6]:
import numpy as np
import time

t0 = time.time()
ras_adata = ras_object.compute(
    or_expression=np.nansum,    # function used for the OR operator. Default is np.nansum
    and_expression=np.nanmin,   # function used for the AND operator. Default is np.nanmin
    drop_na_rows=True,          # drop rows containing NaN. Default is True
    drop_duplicates=False       # drop duplicated RAS values. Default is False
)

t1 = time.time() - t0
print("Time elapsed: ", t1)  # wall-clock seconds elapsed (floating point)

100%|#########################################################################|


Time elapsed:  65.00678515434265
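
To make the NaN handling described above concrete, here is a minimal sketch of how a single GPR rule could be scored for one cell. It is not RASpy's implementation: the `evaluate` helper, the nested-tuple rule format, and the gene names and values are all hypothetical and used only for illustration. Only `np.nansum`, `np.nanmin` and `np.min` are real NumPy functions.

```python
import numpy as np

# Hypothetical expression values for three genes in one cell; gene B is missing.
expression = {"A": 4.0, "B": np.nan, "C": 1.5}

def evaluate(rule, values, or_func=np.nansum, and_func=np.nanmin):
    """Score a rule given as a gene name or as (operator, [sub-rules])."""
    if isinstance(rule, str):
        return values[rule]
    operator, operands = rule
    scores = [evaluate(r, values, or_func, and_func) for r in operands]
    func = and_func if operator == "and" else or_func
    return float(func(scores))

# Hypothetical rule: (A and B) or C
rule = ("or", [("and", ["A", "B"]), "C"])

# NaN-ignoring AND: 'A and NaN' resolves to A, so the score is 4.0 + 1.5.
ignoring = evaluate(rule, expression)
print(ignoring)  # 5.5

# NaN-propagating AND: 'A and NaN' becomes NaN, and the NaN-ignoring sum
# then leaves only C.
propagating = evaluate(rule, expression, and_func=np.min)
print(propagating)  # 1.5
```

The two calls differ only in the function used for AND, which is the choice the text above describes: `np.nanmin` keeps the known operand, while a NaN-propagating minimum discards the whole AND clause.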


In [7]:
ras_adata.to_df()

REACTIONS,ATPS4mi,ATPasel_1,CYOOm2i,CYOR_u10mi,NDPK10n,NDPK1n,NDPK2n,NDPK3n,NDPK4n,NDPK5n,...,XOL7AH3ATP,XOL7AONEATP,XOLDIOLONEATP,PROD2m,HMR_1976,SR5ARr,BILDGLCURte,SPHS1Pt2e,G6PPer,NAPRT
SRR4216351,0.975053,7.255475,0.0,3.111485,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,...,10.728455,10.728455,10.728455,0.000000,6.759773,6.759773,10.728455,0.0,0.871777,1.995025
SRR4216352,1.061796,4.137387,0.0,0.813871,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,...,75.175941,75.175941,75.175941,0.000000,0.589880,0.589880,74.452377,0.0,2.447423,1.311673
SRR4216353,0.000000,0.000000,0.0,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,...,1.368465,1.368465,1.368465,0.000000,3.287407,0.000000,1.368465,0.0,7.365785,0.919579
SRR4216354,0.000000,0.000000,0.0,11.344500,559.655457,559.655457,559.655457,559.655457,559.655457,559.655457,...,6.741586,6.741586,6.741586,0.000000,1.092281,1.092281,5.089143,0.0,0.070871,0.209422
SRR4216355,0.000000,0.000000,0.0,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,...,0.000000,0.000000,0.000000,0.000000,1.258906,0.990033,0.000000,0.0,131.215683,0.000000
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
SRR4216886,0.000000,0.000000,0.0,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,...,8.872428,8.872428,8.872428,0.000000,1.731347,1.731347,9.394437,0.0,0.000000,0.000000
SRR4216887,0.000000,0.000000,0.0,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,...,1.888376,1.888376,1.888376,0.000000,0.515009,0.515009,1.888376,0.0,0.779124,0.449395
SRR4216888,0.000000,6.421671,0.0,11.331868,38.300667,38.300667,38.300667,38.300667,38.300667,38.300667,...,180.325226,180.325226,180.325226,0.492105,3.208438,2.316352,181.016052,0.0,13.948687,7.008633
SRR4216889,0.000000,0.000000,0.0,2.386348,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,...,261.131287,261.131287,261.131287,2.042693,151.433182,1.971192,263.376862,0.0,7.748308,7.995377


In [8]:
ras_adata.obs

Unnamed: 0,countmatrix_Sample Characteristic[organism],countmatrix_Sample Characteristic Ontology Term[organism],countmatrix_Sample Characteristic[individual],countmatrix_Sample Characteristic Ontology Term[individual],countmatrix_Sample Characteristic[organism part],countmatrix_Sample Characteristic Ontology Term[organism part],countmatrix_Sample Characteristic[cell type],countmatrix_Sample Characteristic Ontology Term[cell type],countmatrix_Sample Characteristic[facs marker],countmatrix_Sample Characteristic Ontology Term[facs marker],countmatrix_Sample Characteristic[disease],countmatrix_Sample Characteristic Ontology Term[disease],countmatrix_Factor Value[single cell identifier],countmatrix_Factor Value Ontology Term[single cell identifier],countmatrix_Factor Value[disease],countmatrix_Factor Value Ontology Term[disease]
SRR4216351,Homo sapiens,http://purl.obolibrary.org/obo/NCBITaxon_9606,CC019,,lung,http://purl.obolibrary.org/obo/UBERON_0002048,epithelial cell,http://purl.obolibrary.org/obo/CL_0000066,"7AAD-, CD45-, CD31-, CD326+, HTII-280+",,normal,http://purl.obolibrary.org/obo/PATO_0000461,01-H12.CC019,,normal,http://purl.obolibrary.org/obo/PATO_0000461
SRR4216352,Homo sapiens,http://purl.obolibrary.org/obo/NCBITaxon_9606,CC002,,lung,http://purl.obolibrary.org/obo/UBERON_0002048,epithelial cell,http://purl.obolibrary.org/obo/CL_0000066,"7AAD-, CD45-, CD31-, CD326+, HTII-280+",,normal,http://purl.obolibrary.org/obo/PATO_0000461,01-N701-S517-A1.CC002,,normal,http://purl.obolibrary.org/obo/PATO_0000461
SRR4216353,Homo sapiens,http://purl.obolibrary.org/obo/NCBITaxon_9606,IPF009,,lung,http://purl.obolibrary.org/obo/UBERON_0002048,epithelial cell,http://purl.obolibrary.org/obo/CL_0000066,"7AAD-, CD45-, CD31-, CD326+, HTII-280+",,idiopathic pulmonary fibrosis,http://www.ebi.ac.uk/efo/EFO_0000768,01-N701-S517-B5.IPF009,,idiopathic pulmonary fibrosis,http://www.ebi.ac.uk/efo/EFO_0000768
SRR4216354,Homo sapiens,http://purl.obolibrary.org/obo/NCBITaxon_9606,IPF010,,lung,http://purl.obolibrary.org/obo/UBERON_0002048,epithelial cell,http://purl.obolibrary.org/obo/CL_0000066,"7AAD-, CD45-, CD31-, CD326+, HTII-280+",,idiopathic pulmonary fibrosis,http://www.ebi.ac.uk/efo/EFO_0000768,01-N701-S517-H1.IPF010,,idiopathic pulmonary fibrosis,http://www.ebi.ac.uk/efo/EFO_0000768
SRR4216355,Homo sapiens,http://purl.obolibrary.org/obo/NCBITaxon_9606,CC019,,lung,http://purl.obolibrary.org/obo/UBERON_0002048,epithelial cell,http://purl.obolibrary.org/obo/CL_0000066,"7AAD-, CD45-, CD31-, CD326+, HTII-280+",,normal,http://purl.obolibrary.org/obo/PATO_0000461,02-B3.CC019,,normal,http://purl.obolibrary.org/obo/PATO_0000461
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
SRR4216886,Homo sapiens,http://purl.obolibrary.org/obo/NCBITaxon_9606,CC006,,lung,http://purl.obolibrary.org/obo/UBERON_0002048,epithelial cell,http://purl.obolibrary.org/obo/CL_0000066,"7AAD-, CD45-, CD31-, CD326+, HTII-280+",,normal,http://purl.obolibrary.org/obo/PATO_0000461,H9-1-C48_S10.CC006,,normal,http://purl.obolibrary.org/obo/PATO_0000461
SRR4216887,Homo sapiens,http://purl.obolibrary.org/obo/NCBITaxon_9606,IL006,,lung,http://purl.obolibrary.org/obo/UBERON_0002048,epithelial cell,http://purl.obolibrary.org/obo/CL_0000066,"7AAD-, CD45-, CD31-, CD326+, HTII-280+",,idiopathic pulmonary fibrosis,http://www.ebi.ac.uk/efo/EFO_0000768,H9-1-C48_S35.IL006,,idiopathic pulmonary fibrosis,http://www.ebi.ac.uk/efo/EFO_0000768
SRR4216888,Homo sapiens,http://purl.obolibrary.org/obo/NCBITaxon_9606,IPF012,,lung,http://purl.obolibrary.org/obo/UBERON_0002048,epithelial cell,http://purl.obolibrary.org/obo/CL_0000066,"7AAD-, CD45-, CD31-, CD326+, HTII-280+",,idiopathic pulmonary fibrosis,http://www.ebi.ac.uk/efo/EFO_0000768,H9-1-C48_S40.IPF012,,idiopathic pulmonary fibrosis,http://www.ebi.ac.uk/efo/EFO_0000768
SRR4216889,Homo sapiens,http://purl.obolibrary.org/obo/NCBITaxon_9606,IPF002,,lung,http://purl.obolibrary.org/obo/UBERON_0002048,epithelial cell,http://purl.obolibrary.org/obo/CL_0000066,"7AAD-, CD45-, CD31-, CD326+, HTII-280+",,idiopathic pulmonary fibrosis,http://www.ebi.ac.uk/efo/EFO_0000768,H9-1-C48_S41.IPF002,,idiopathic pulmonary fibrosis,http://www.ebi.ac.uk/efo/EFO_0000768


In [9]:
ras_adata.var

REACTIONS,common_gprs,compartments,GPR rule
ATPS4mi,ATPS4mi,"i,m",( ( ENSG00000099624 and ENSG00000152234 and EN...
ATPasel_1,ATPasel_1,"c,l",( ( ENSG00000105929 and ENSG00000117410 and EN...
CYOOm2i,CYOOm2i,"i,m",( ENSG00000010256 and ENSG00000173660 and ENSG...
CYOR_u10mi,CYOR_u10mi,"i,m",( ENSG00000010256 and ENSG00000173660 and ENSG...
NDPK10n,"NDPK10n,NDPK1n,NDPK2n,NDPK3n,NDPK4n,NDPK5n,NDP...",n,( ENSG00000011052 and ENSG00000239672 ) or ( E...
...,...,...,...
SR5ARr,SR5ARr,r,ENSG00000277893 or ENSG00000145545
BILDGLCURte,BILDGLCURte,"c,e",ENSG00000278183 or ENSG00000023839 or ENSG0000...
SPHS1Pt2e,SPHS1Pt2e,"c,e",ENSG00000278183 or ENSG00000165029
G6PPer,G6PPer,r,ENSG00000278373 or ENSG00000131482 or ENSG0000...


## Save the results

In [10]:
from scipy.sparse import csr_matrix
ras_adata.X = csr_matrix(ras_adata.X)
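
The conversion to a sparse matrix is presumably motivated by the many zero entries visible in the RAS table above. The sketch below uses a small, made-up array as a stand-in for the RAS matrix to show what `csr_matrix` stores; the values are illustrative only.

```python
import numpy as np
from scipy.sparse import csr_matrix

# Toy stand-in for a RAS matrix (cells x reactions) with many zero entries.
dense = np.array([
    [0.0, 7.3, 0.0, 0.0],
    [0.0, 0.0, 0.0, 2.1],
    [0.0, 0.0, 0.0, 0.0],
])

sparse = csr_matrix(dense)

# Only the non-zero entries are stored explicitly.
print(sparse.nnz)   # 2
print(dense.size)   # 12

# Converting back to a dense array recovers the original values.
restored = sparse.toarray()
print(np.array_equal(restored, dense))  # True
```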

In [11]:
ras_adata.write("../datasets/E-GEOD-86618_ras_adata")

... storing 'common_gprs' as categorical
... storing 'compartments' as categorical
... storing 'GPR rule' as categorical
