# 4. Model extraction methods

PipeGEM implements several model extraction methods for integrating genomics/transcriptomics data, including:

**MBA-like algorithms**
- rFASTCORMICS
- CORDA
- FASTCORE
- mCADRE
- MBA

**iMAT-like algorithms**
- INIT
- iMAT

**GIMME-like algorithms**
- RIPTiDe
- GIMME ([Becker, Scott A., and Bernhard O. Palsson. 2008](https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1000082))

**EFlux-like algorithms**
- E-Flux ([Colijn, Caroline, et al. 2009](https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1000489))
- SPOT (in the next version)

In this session, we showcase two of them: rFASTCORMICS and GIMME. The other methods and further details are covered in tutorial/data_integration.

In [1]:
import sys
from pathlib import Path
sys.path.append(str(Path("../../../").resolve()))

%load_ext autoreload
%autoreload 2

In [2]:
import numpy as np
import seaborn as sns
import pipeGEM as pg

from pipeGEM.data import GeneData, get_syn_gene_data
from pipeGEM import load_remote_model

In [6]:
# load the consistent template model saved in the previous session
human1 = pg.Model("human", 
                  load_remote_model("Human-GEM", format="mat"))

Model Human-GEM is already downloaded

In [None]:
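# load synthetic gene data (three samples) and apply a log2(x + 1) transform
data = np.log2(get_syn_gene_data(human1, n_sample=3) + 1)

data_name = "sample_0"
# note: `data` is already log2(x + 1)-transformed above, so `data_transform`
# would apply a second log2 here; drop one of the two if that is not intended
gene_data = GeneData(data=data[data_name],
                     data_transform=lambda x: np.log2(x),
                     absent_expression=0)
# attach the gene data to the model under the sample name
human1.add_gene_data(data_name, gene_data)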
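
The `+ 1` pseudocount in the cell above keeps zero counts finite under the log transform. A minimal standalone illustration, using only the standard library and made-up values (the gene names and numbers are hypothetical, not PipeGEM output):

```python
import math

# Hypothetical raw expression values; 0 means the gene was not detected.
raw_counts = {"gene_a": 0.0, "gene_b": 1.0, "gene_c": 7.0, "gene_d": 1023.0}

def log2_pseudocount(value, pseudocount=1.0):
    """Return log2(value + pseudocount); a zero count maps to 0 when pseudocount is 1."""
    return math.log2(value + pseudocount)

log_counts = {gene: log2_pseudocount(v) for gene, v in raw_counts.items()}
print(log_counts)
```

Without the pseudocount, `math.log2(0)` raises a `ValueError`, and `np.log2(0)` returns `-inf` with a warning.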

In [None]:
# get thresholds using the data
rFASTCORMICS_th = gene_data.get_threshold("rFASTCORMICS")

exp_th, nexp_th = rFASTCORMICS_th.exp_th, rFASTCORMICS_th.non_exp_th
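
To make the role of the two thresholds concrete, here is a minimal sketch of a three-way discretization: values at or above the expression threshold count as expressed, values at or below the non-expression threshold count as not expressed, and everything in between is left undetermined. The function name, the example values, and the handling of ties are assumptions for illustration; PipeGEM's internal scoring may differ.

```python
def discretize_expression(values, exp_th, non_exp_th):
    """Map each expression value to 1 (expressed), -1 (not expressed) or 0 (undetermined)."""
    if non_exp_th > exp_th:
        raise ValueError("non_exp_th must not exceed exp_th")
    scores = {}
    for gene, value in values.items():
        if value >= exp_th:
            scores[gene] = 1
        elif value <= non_exp_th:
            scores[gene] = -1
        else:
            scores[gene] = 0
    return scores

# Hypothetical log2 expression values and thresholds.
example_values = {"g1": 8.2, "g2": 0.4, "g3": 3.1}
example_scores = discretize_expression(example_values, exp_th=5.0, non_exp_th=1.0)
print(example_scores)
```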

In [None]:
rFASTCORMICS_th.plot()

In [None]:
# load task analysis result (optional)
from pipeGEM.analysis import TaskAnalysis

# task_analysis_result = TaskAnalysis.load("")

# get supporting reactions
# task_supps = human1.get_activated_task_sup_rxns(data_name=data_name, 
#                                                task_analysis=task_analysis_result, 
#                                                score_threshold=exp_th)

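# without a task analysis result, no task-supporting reactions are protected
task_supps = []

# spontaneous reactions, protected from removal during extraction
spon_rxns = ['MAR04740', 'MAR04250', 'MAR06875', 'MAR06876', 'MAR04840',
             'MAR04771', 'MAR06997', 'MAR07008', 'MAR07011', 'MAR07015',
             'MAR07016', 'MAR05127', 'MAR08749', 'MAR08750']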

## rFASTCORMICS

In [None]:
result = human1.integrate_gene_data(data_name, 
                                    integrator="rFASTCORMICS", 
                                    protected_rxns=list(set(task_supps+spon_rxns)), 
                                    consistent_checking_method=None,
                                    predefined_threshold={"exp_th": exp_th, "non_exp_th": nexp_th})

# result model
print(result.result_model)

And that's it. You can save the result with `result.save("saved_folder")`.

Some of these parameters, such as `protected_rxns`, can be reused with other integration methods, as in the GIMME example below.
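
`protected_rxns` is given as a plain list, and `list(set(...))` in the call above removes duplicates when the task-supporting and spontaneous reactions overlap. The sketch below shows that deduplication and, assuming protected reactions are always kept, how they could override a removal decision. The helper `keep_protected`, the shortened lists, and the IDs `MAR00001` and `MAR09999` are hypothetical and not part of PipeGEM.

```python
task_supps_demo = ["MAR04740", "MAR00001"]  # hypothetical task-supporting reactions
spon_rxns_demo = ["MAR04740", "MAR04250"]   # shortened spontaneous-reaction list

# Merge both lists; sorted() gives a reproducible order, which set() alone does not.
protected = sorted(set(task_supps_demo + spon_rxns_demo))

def keep_protected(removal_candidates, protected_ids):
    """Drop protected reaction IDs from a list of removal candidates."""
    protected_lookup = set(protected_ids)
    return [r for r in removal_candidates if r not in protected_lookup]

to_remove = keep_protected(["MAR04250", "MAR09999"], protected)
print(protected, to_remove)
```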

## GIMME

In [None]:
result = human1.integrate_gene_data(data_name, 
                                    integrator="GIMME", 
                                    protected_rxns=list(set(task_supps+spon_rxns)), 
                                    high_exp=exp_th)

# result model
print(result.result_model)
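
For intuition about the threshold passed to GIMME: as described by Becker and Palsson, GIMME penalizes flux through reactions whose expression falls below a cutoff, with a penalty proportional to the shortfall. The sketch below computes that inconsistency score for a fixed flux vector and does not solve the optimization problem. It assumes `high_exp` plays the role of the cutoff; the reaction names, the values, and the function `gimme_inconsistency` are illustrative and not PipeGEM's implementation.

```python
def gimme_inconsistency(fluxes, expression, cutoff):
    """Sum of |flux| * max(0, cutoff - expression) over all reactions."""
    score = 0.0
    for rxn, flux in fluxes.items():
        penalty = max(0.0, cutoff - expression[rxn])
        score += penalty * abs(flux)
    return score

# Hypothetical fluxes and reaction-level expression values.
fluxes = {"r1": 2.0, "r2": -1.0, "r3": 0.0}
expression = {"r1": 6.0, "r2": 3.0, "r3": 1.0}
score = gimme_inconsistency(fluxes, expression, cutoff=5.0)
print(score)
```

Only `r2` contributes here: `r1` is above the cutoff, and `r3` carries no flux.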