# Multi-sample examples using `batch_run` from *Cobamp*

Several integration algorithms were introduced in the previous tutorials.
However, the demonstrated approach was limited to a single sample.
In some cases, multiple samples are available and a context-specific model is required for each of them, which makes the integration of multiple samples in one workflow a necessity.

`batch_run` is a function from *Cobamp* that supports multiprocessing and is fully compatible with the *Troppo* framework, so multiple samples can be integrated in a single run.
This function requires four parameters:

- `function`: the function that runs the reconstruction and is to be parallelized.
- `sequence`: a list with the omics containers, one per sample.
- `paramargs`: a dictionary with the parameters passed to the function.
- `threads`: the number of parallel processes to run.
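
The calling convention implied by these four parameters can be illustrated with a small stand-in. This is only a sketch: `batch_run_sketch` and `scale` are hypothetical names, and the stand-in uses threads from the standard library rather than whatever *Cobamp* does internally. It assumes, as the reconstruction function later in this tutorial suggests, that `function` is called once per item of `sequence` with `paramargs` as its second argument.

```python
from concurrent.futures import ThreadPoolExecutor


def batch_run_sketch(function, sequence, paramargs, threads):
    """Hypothetical stand-in: call function(item, paramargs) for every item."""
    with ThreadPoolExecutor(max_workers=threads) as pool:
        # pool.map returns results in the same order as `sequence`
        return list(pool.map(lambda item: function(item, paramargs), sequence))


def scale(sample, parameters):
    # Toy per-sample job: multiply every value by a shared factor
    return [value * parameters['factor'] for value in sample]


results = batch_run_sketch(scale, [[1, 2], [3, 4]], {'factor': 10}, threads=2)
print(results)  # [[10, 20], [30, 40]]
```

Each entry of the returned list corresponds to the sample at the same position in `sequence`.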

### Initial setup

In [1]:
import pandas as pd
import cobra
import re

from troppo.omics.readers.generic import TabularReader
from troppo.methods_wrappers import ReconstructionWrapper
from cobamp.utilities.parallel import batch_run

The wrappers.external_wrappers module will be deprecated in a future release in favour of the wrappers module. 
    Available ModelObjectReader classes can still be loaded using cobamp.wrappers.<class>. An appropriate model 
    reader can also be created using the get_model_reader function on cobamp.wrappers


In [2]:
patt = re.compile('__COBAMPGPRDOT__[0-9]{1}')
replace_alt_transcripts = lambda x: patt.sub('', x)
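
As a quick check of what this helper does, the sketch below applies it to two made-up identifiers. The identifiers are hypothetical; the only assumption is the one the pattern itself encodes, namely that a gene identifier may carry a `__COBAMPGPRDOT__` marker followed by a single digit.

```python
import re

patt = re.compile('__COBAMPGPRDOT__[0-9]{1}')
replace_alt_transcripts = lambda x: patt.sub('', x)

# Hypothetical identifier carrying the marker and a one-digit suffix
with_suffix = replace_alt_transcripts('ENSG00000000419__COBAMPGPRDOT__1')
# Identifiers without the marker pass through unchanged
without_suffix = replace_alt_transcripts('ENSG00000000460')

print(with_suffix)     # ENSG00000000419
print(without_suffix)  # ENSG00000000460
```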

### Load the model

In [3]:
model = cobra.io.read_sbml_model(r'data\HumanGEM_Consistent_COVID19_HAM.xml')
model

| Attribute | Value |
|---|---|
| Name | HumanGEM |
| Memory address | 1b9f8ac5108 |
| Number of metabolites | 6149 |
| Number of reactions | 10347 |
| Number of genes | 2976 |
| Number of groups | 142 |
| Objective expression | 1.0*biomass_human - 1.0*biomass_human_reverse_fb2f2 |
| Compartments | Cytosol, Lysosome, Endoplasmic reticulum, Extracellular, Mitochondria, Peroxisome, Golgi apparatus, Nucleus, Inner mitochondria |


In [4]:
model_wrapper = ReconstructionWrapper(model=model, ttg_ratio=9999, gpr_gene_parse_function=replace_alt_transcripts)
model_wrapper



<troppo.methods_wrappers.ReconstructionWrapper at 0x1b9ab0a7d48>

### Load the data

In [5]:
omics_data = pd.read_csv(filepath_or_buffer=r'data\Desai-GTEx_ensembl.csv', index_col=0)
omics_data = omics_data.loc[['Lung_Healthy','Lung_COVID19']]
omics_data

ensembl_gene_id,ENSG00000000419,ENSG00000000460,ENSG00000000938,ENSG00000000971,ENSG00000001036,ENSG00000001084,ENSG00000001167,ENSG00000001461,ENSG00000001497,ENSG00000001561,...,ENSG00000271321,ENSG00000271605,ENSG00000272047,ENSG00000272325,ENSG00000272333,ENSG00000272414,ENSG00000272573,ENSG00000272968,ENSG00000273045,ENSG00000273079
Lung_Healthy,5.022368,0.584963,6.444601,6.213347,4.82273,3.0,3.776104,3.336283,4.343408,3.722466,...,0.137504,3.070389,1.847997,3.432959,2.944858,3.350497,5.074677,0.378512,0.847997,0.0
Lung_COVID19,2.988018,1.551051,5.77763,7.134232,4.429446,3.593211,4.770509,3.824891,3.566066,4.433298,...,0.0,4.669531,2.331411,3.326899,4.985126,4.696205,0.0,0.0,0.0,0.381678


In [6]:
omics_container = TabularReader(path_or_df=omics_data, nomenclature='entrez_id', omics_type='transcriptomics').to_containers()
omics_container

[<troppo.omics.core.OmicsContainer at 0x1b9b248aec8>,
 <troppo.omics.core.OmicsContainer at 0x1b9b2401e48>]

### Define the function to run the reconstruction

The reconstruction function calls the `run_from_omics` method of the `ReconstructionWrapper` class, which takes the following parameters:

- `omics_data`: the omics data container for the sample.
- `algorithm`: a string containing the algorithm to use for the reconstruction.
- `and_or_funcs`: a tuple with the functions to use for the AND and OR operations of the GPR.
- `integration_strategy`: a tuple with the integration strategy and the function to apply to the scores.
- `solver`: the solver to use for the optimization.
- `**kwargs`: additional parameters for the reconstruction that are specific to the algorithm used.
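
To make the role of `and_or_funcs` concrete, the sketch below scores a toy gene-protein-reaction (GPR) rule with `min` for AND and `sum` for OR, the pair used later in this tutorial. The `evaluate_gpr` helper, the nested-tuple rule format, and the gene scores are all hypothetical and only illustrate the idea; this is not how *Troppo* represents or evaluates rules internally.

```python
def evaluate_gpr(rule, scores, and_func=min, or_func=sum):
    """Score a nested rule: a gene ID string or a tuple (operator, [sub-rules])."""
    if isinstance(rule, str):
        # Leaf: look up the expression score of a single gene
        return scores[rule]
    operator, operands = rule
    values = [evaluate_gpr(op, scores, and_func, or_func) for op in operands]
    return and_func(values) if operator == 'and' else or_func(values)


gene_scores = {'G1': 2.0, 'G2': 5.0, 'G3': 1.5}
# Toy rule: (G1 and G2) or G3
rule = ('or', [('and', ['G1', 'G2']), 'G3'])

reaction_score = evaluate_gpr(rule, gene_scores)
print(reaction_score)  # 3.5
```

With `min` for AND, a complex is limited by its least expressed subunit (2.0 here); with `sum` for OR, alternative isozymes add up (2.0 + 1.5).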

In [7]:
def reconstruction_function_gimme(omics_container, parameters: dict):
    """Run the reconstruction for a single sample using the shared parameters."""

    def score_apply(reaction_map_scores):
        # Reactions without a score (None) are given a score of 0
        return {k: 0 if v is None else v for k, v in reaction_map_scores.items()}

    # Unpack the parameters shared by all samples
    flux_threshold, obj_frac, rec_wrapper, method = [parameters[parameter] for parameter in
                                      ['flux_threshold', 'obj_frac', 'reconstruction_wrapper', 'algorithm']]

    reac_ids = rec_wrapper.model_reader.r_ids
    metab_ids = rec_wrapper.model_reader.m_ids
    # GPR evaluation: AND takes the minimum score, OR takes the sum
    AND_OR_FUNCS = (min, sum)

    return rec_wrapper.run_from_omics(omics_data=omics_container, algorithm=method, and_or_funcs=AND_OR_FUNCS,
                                      integration_strategy=('continuous', score_apply), solver='CPLEX', obj_frac=obj_frac,
                                      objectives=[{'biomass_human': 1}], preprocess=True, flux_threshold=flux_threshold,
                                      reaction_ids=reac_ids, metabolite_ids=metab_ids)
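
The inner `score_apply` helper can be tried in isolation. The sketch below runs it on a made-up score dictionary; the reaction IDs and values are hypothetical, and the reading that `None` marks a reaction for which no score could be computed is an assumption.

```python
def score_apply(reaction_map_scores):
    # Replace missing scores (None) with 0 and keep all other values
    return {k: 0 if v is None else v for k, v in reaction_map_scores.items()}


# Hypothetical reaction scores; R2 has no score
toy_scores = {'R1': 4.2, 'R2': None, 'R3': 0.7}
filled = score_apply(toy_scores)
print(filled)  # {'R1': 4.2, 'R2': 0, 'R3': 0.7}
```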

Considering the function above, the parameters for the reconstruction are defined in a dictionary as follows:

In [8]:
parameters = {'flux_threshold': 0.8, 'obj_frac': 0.8, 'reconstruction_wrapper': model_wrapper, 'algorithm': 'gimme'}

### Run the `batch_run` function

In [9]:
batch_gimme_res = batch_run(reconstruction_function_gimme, omics_container, parameters, threads=2)
batch_gimme_res

[{'HMR_4097': True,
  'HMR_4099': True,
  'HMR_4108': True,
  'HMR_4133': True,
  'HMR_4137': False,
  'HMR_4281': True,
  'HMR_4388': True,
  'HMR_4283': True,
  'HMR_8357': True,
  'HMR_4379': True,
  'HMR_4301': True,
  'HMR_4355': True,
  'HMR_4358': True,
  'HMR_4360': False,
  'HMR_4363': True,
  'HMR_4365': True,
  'HMR_4368': True,
  'HMR_4370': True,
  'HMR_4371': True,
  'HMR_4372': True,
  'HMR_4373': True,
  'HMR_4375': True,
  'HMR_4377': True,
  'HMR_4381': True,
  'HMR_4391': False,
  'HMR_4394': True,
  'HMR_4396': True,
  'HMR_4521': True,
  'HMR_6410': True,
  'HMR_6412': True,
  'HMR_7745': True,
  'HMR_7746': True,
  'HMR_7747': True,
  'HMR_7748': True,
  'HMR_7749': True,
  'HMR_4122': True,
  'HMR_5395': True,
  'HMR_5396': True,
  'HMR_9727': True,
  'HMR_5397': True,
  'HMR_5398': True,
  'HMR_5399': True,
  'HMR_5400': True,
  'HMR_5401': True,
  'HMR_8585': True,
  'HMR_3944': False,
  'HMR_4128': True,
  'HMR_4130': True,
  'HMR_4131': True,
  'HMR_4132': Tr