# PyCoMo Basics #
PyCoMo is a **Py**thon **Co**mmunity metabolic **Mo**delling package. This tutorial presents its core features.

The expected runtime for this notebook is approximately 10-30 minutes.
## Setting up PyCoMo ##
Clone the package from GitHub. Next, we are going to import all the packages we need in this tutorial.

In [1]:
from pathlib import Path
import sys
import cobra
import os

### Importing PyCoMo ###
As PyCoMo is currently only available as a local package, the direct path to the package directory needs to be used on import.

In [2]:
path_root = "../pycomo"  # Change path according to your PyCoMo location
sys.path.append(str(path_root))
import pycomo as pycomo

Now we will check if PyCoMo was loaded correctly. For this, we can run the help function on the PyCoMo package (uncomment the line below to print the full help text).

In [3]:
#help(pycomo)
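
Since the `help` call above is commented out, a lighter check is to ask Python whether the package can be located at all. The sketch below uses the standard-library `importlib.util.find_spec`; the helper name `is_importable` is only an illustration and is not part of PyCoMo.

```python
import importlib.util


def is_importable(package_name):
    """Return True if Python can locate a top-level package on the current sys.path."""
    return importlib.util.find_spec(package_name) is not None


# Prints True only if the PyCoMo directory has been added to sys.path.
print(is_importable("pycomo"))
```

If this prints `False`, the path appended to `sys.path` most likely does not point to the directory that contains the package.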

## Creating a Community Model ##
The creation of a community model consists of 3 steps:
1. Loading the member models
2. Preparing the member models for merging
3. Creating a community model
### Loading the member models ###
The community model creation process starts with models of the individual members. Note that the quality of the community model heavily depends on the quality of the member models!

In this tutorial, we are using metabolic models from the AGORA collection. The models were retrieved from www.vmh.life and are stored in the data folder of the repository. The selection of models and the resulting community represent a cystic fibrosis airway community, as done by Henson et al. (www.doi.org/10.1128/mSystems.00026-19).

In [4]:
test_model_dir = "../data/use_case/henson"
named_models = pycomo.load_named_models_from_dir(test_model_dir)
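
To illustrate how model names can be tied to file names, the sketch below maps each file stem in a directory to its path using only `pathlib`. The helper name `model_files_by_name` and the `.xml` suffix are assumptions made for this example; it does not claim to reproduce how `load_named_models_from_dir` works internally.

```python
from pathlib import Path


def model_files_by_name(model_dir, suffix=".xml"):
    """Map each file stem (used here as the model name) to its path."""
    return {
        path.stem: path
        for path in sorted(Path(model_dir).iterdir())
        if path.is_file() and path.suffix == suffix
    }
```

Listing the files this way before loading is a quick way to see which names will end up as dictionary keys.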

The models and file names were extracted and stored in named_models. Let's check the contents:

In [5]:
named_models

{'Achromobacter_xylosoxidans_NBRC_15126': <Model Achromobacter_xylosoxidans_NBRC_15126 at 0x1d37c4ea370>,
 'Actinomyces_naeslundii_str_Howell_279': <Model Actinomyces_naeslundii_str_Howell_279 at 0x1d305463d60>,
 'Burkholderia_cepacia_GG4': <Model Burkholderia_cepacia_GG4 at 0x1d306ba0c40>,
 'Escherichia_coli_str_K_12_substr_MG1655': <Model Escherichia_coli_str_K_12_substr_MG1655 at 0x1d307a8a2e0>,
 'Fusobacterium_nucleatum_subsp_nucleatum_ATCC_25586': <Model Fusobacterium_nucleatum_subsp_nucleatum_ATCC_25586 at 0x1d30988d760>,
 'Gemella_haemolysans_ATCC_10379': <Model Gemella_haemolysans_ATCC_10379 at 0x1d309ce47c0>,
 'Granulicatella_adiacens_ATCC_49175': <Model Granulicatella_adiacens_ATCC_49175 at 0x1d30a340460>,
 'Haemophilus_influenzae_R2846': <Model Haemophilus_influenzae_R2846 at 0x1d309b374c0>,
 'Neisseria_flavescens_SK114': <Model Neisseria_flavescens_SK114 at 0x1d30b26e670>,
 'Porphyromonas_endodontalis_ATCC_35406': <Model Porphyromonas_endodontalis_ATCC_35406 at 0x1d30b9286a

### Preparing the models for merging ###
With the models loaded, the next step is preparing them for merging. This is done by creating SingleOrganismModel objects. Using them, the models will be formatted for compliance with the SBML format. Further, an exchange compartment will be generated under the name _exchg_.

One of the requirements for a community metabolic model is a common biomass function. To construct it, PyCoMo requires the biomass of each member represented as a single metabolite. This biomass metabolite ID can be specified when constructing the SingleOrganismModel objects. However, it can also be found or generated automatically, by setting the biomass reaction as the objective of the model. Let's check if the biomass function is the objective in all the models.

In [6]:
for model in named_models.values():
    print(model.objective)

Maximize
1.0*biomass489 - 1.0*biomass489_reverse_62d1a
Maximize
1.0*biomass492 - 1.0*biomass492_reverse_bc961
Maximize
1.0*biomass479 - 1.0*biomass479_reverse_1d1b2
Maximize
1.0*biomass525 - 1.0*biomass525_reverse_5c178
Maximize
1.0*biomass237 - 1.0*biomass237_reverse_f032e
Maximize
1.0*biomass027 - 1.0*biomass027_reverse_af8dc
Maximize
1.0*biomass091 - 1.0*biomass091_reverse_7b6db
Maximize
1.0*biomass252 - 1.0*biomass252_reverse_f6948
Maximize
1.0*biomass339 - 1.0*biomass339_reverse_45ed6
Maximize
1.0*biomass326 - 1.0*biomass326_reverse_02060
Maximize
1.0*biomass276 - 1.0*biomass276_reverse_7f92e
Maximize
1.0*biomass345 - 1.0*biomass345_reverse_e128f
Maximize
1.0*biomass525 - 1.0*biomass525_reverse_5c178
Maximize
1.0*biomass429 - 1.0*biomass429_reverse_9caa0
Maximize
1.0*biomass042 - 1.0*biomass042_reverse_2a02b
Maximize
1.0*biomass164 - 1.0*biomass164_reverse_ca493
Maximize
1.0*biomass116 - 1.0*biomass116_reverse_02324


With the objective being the biomass function in all models, the biomass metabolite does not need to be specified.

In [7]:
single_org_models = []
for name, model in named_models.items():
    print(name)
    single_org_model = pycomo.SingleOrganismModel(model, name)
    single_org_models.append(single_org_model)

Achromobacter_xylosoxidans_NBRC_15126
Actinomyces_naeslundii_str_Howell_279
Burkholderia_cepacia_GG4
Escherichia_coli_str_K_12_substr_MG1655
Fusobacterium_nucleatum_subsp_nucleatum_ATCC_25586
Gemella_haemolysans_ATCC_10379
Granulicatella_adiacens_ATCC_49175
Haemophilus_influenzae_R2846
Neisseria_flavescens_SK114
Porphyromonas_endodontalis_ATCC_35406
Prevotella_melaninogenica_ATCC_25845
Pseudomonas_aeruginosa_NCGM2_S1
Ralstonia_sp_5_7_47FAA
Rothia_mucilaginosa_DY_18
Staphylococcus_aureus_subsp_aureus_USA300_FPR3757
Streptococcus_sanguinis_SK36
Veillonella_atypica_ACS_049_V_Sch6


### Creating a community model ###
With the member models prepared, the community model can be generated. The first step is to create a CommunityModel object from the member models. The matching of the exchange metabolites can be achieved in two ways: matching via identical metabolite IDs, or via annotation fields. As all the models in this tutorial come from the same source, matching via identical metabolite IDs will be used.

In [8]:
community_name = "henson_community_model"
com_model_obj = pycomo.CommunityModel(single_org_models, community_name)
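
Matching via identical metabolite IDs can be pictured as grouping exchange metabolites by ID across the members. The sketch below does this on plain Python sets. The function name, member names and metabolite IDs are made up for illustration, and this is not PyCoMo's internal code.

```python
from collections import defaultdict


def shared_exchange_metabolites(exchange_ids_by_member):
    """Return metabolite IDs found in more than one member, with the members carrying them."""
    members_by_metabolite = defaultdict(list)
    for member, metabolite_ids in exchange_ids_by_member.items():
        for met_id in metabolite_ids:
            members_by_metabolite[met_id].append(member)
    return {
        met_id: members
        for met_id, members in members_by_metabolite.items()
        if len(members) > 1
    }


# Toy input: hypothetical member names and exchange metabolite IDs.
toy_members = {
    "member_a": {"ac", "glc_D", "h2o"},
    "member_b": {"ac", "h2o", "succ"},
    "member_c": {"ac", "for"},
}
shared = shared_exchange_metabolites(toy_members)
print(sorted(shared))
```

Metabolites that appear in several members are the ones that end up sharing a single exchange reaction in the community model.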

The cobra model of the community will be generated the first time it is needed. We can enforce this now by accessing it via .community_model

In [None]:
com_model_obj.community_model

No constrained community model set yet. Using the unconstrained model instead.
No unconstrained community model generated yet. Generating now:
Note: no products in the objective function, adding biomass to it.
Note: no products in the objective function, adding biomass to it.


Ignoring reaction 'EX__4abz_exchg' since it already exists.
Ignoring reaction 'EX_Lcyst_exchg' since it already exists.
Ignoring reaction 'EX_ac_exchg' since it already exists.
Ignoring reaction 'EX_acgam_exchg' since it already exists.
Ignoring reaction 'EX_ala_L_exchg' since it already exists.
Ignoring reaction 'EX_alaasp_exchg' since it already exists.
Ignoring reaction 'EX_alagln_exchg' since it already exists.
Ignoring reaction 'EX_alaglu_exchg' since it already exists.
Ignoring reaction 'EX_alagly_exchg' since it already exists.
Ignoring reaction 'EX_alahis_exchg' since it already exists.
Ignoring reaction 'EX_alaleu_exchg' since it already exists.
Ignoring reaction 'EX_alathr_exchg' since it already exists.
Ignoring reaction 'EX_alltn_exchg' since it already exists.
Ignoring reaction 'EX_arab_L_exchg' since it already exists.
Ignoring reaction 'EX_arbt_exchg' since it already exists.
Ignoring reaction 'EX_arg_L_exchg' since it already exists.
Ignoring reaction 'EX_asn_L_exchg' s

Note: no products in the objective function, adding biomass to it.


Ignoring reaction 'EX__26dap_M_exchg' since it already exists.
Ignoring reaction 'EX__2hyoxplac_exchg' since it already exists.
Ignoring reaction 'EX__34dhpha_exchg' since it already exists.
Ignoring reaction 'EX__34dhphe_exchg' since it already exists.
Ignoring reaction 'EX__3mop_exchg' since it already exists.
Ignoring reaction 'EX__4abz_exchg' since it already exists.
Ignoring reaction 'EX__5htrp_exchg' since it already exists.
Ignoring reaction 'EX_Lcyst_exchg' since it already exists.
Ignoring reaction 'EX_Lkynr_exchg' since it already exists.
Ignoring reaction 'EX_ac_exchg' since it already exists.
Ignoring reaction 'EX_acac_exchg' since it already exists.
Ignoring reaction 'EX_acgam_exchg' since it already exists.
Ignoring reaction 'EX_adocbl_exchg' since it already exists.
Ignoring reaction 'EX_akg_exchg' since it already exists.
Ignoring reaction 'EX_ala_D_exchg' since it already exists.
Ignoring reaction 'EX_ala_L_exchg' since it already exists.
Ignoring reaction 'EX_alaasp_e

Ignoring reaction 'EX_taur_exchg' since it already exists.
Ignoring reaction 'EX_thm_exchg' since it already exists.
Ignoring reaction 'EX_thr_L_exchg' since it already exists.
Ignoring reaction 'EX_tre_exchg' since it already exists.
Ignoring reaction 'EX_trp_L_exchg' since it already exists.
Ignoring reaction 'EX_trypta_exchg' since it already exists.
Ignoring reaction 'EX_tsul_exchg' since it already exists.
Ignoring reaction 'EX_tym_exchg' since it already exists.
Ignoring reaction 'EX_tyr_L_exchg' since it already exists.
Ignoring reaction 'EX_ura_exchg' since it already exists.
Ignoring reaction 'EX_urea_exchg' since it already exists.
Ignoring reaction 'EX_val_L_exchg' since it already exists.
Ignoring reaction 'EX_xan_exchg' since it already exists.
Ignoring reaction 'EX_zn2_exchg' since it already exists.


Note: no products in the objective function, adding biomass to it.


Ignoring reaction 'EX__12ppd_S_exchg' since it already exists.
Ignoring reaction 'EX__15dap_exchg' since it already exists.
Ignoring reaction 'EX__2ddglcn_exchg' since it already exists.
Ignoring reaction 'EX__3hpppn_exchg' since it already exists.
Ignoring reaction 'EX__4hbz_exchg' since it already exists.
Ignoring reaction 'EX_Lcyst_exchg' since it already exists.
Ignoring reaction 'EX_ac_exchg' since it already exists.
Ignoring reaction 'EX_acac_exchg' since it already exists.
Ignoring reaction 'EX_acald_exchg' since it already exists.
Ignoring reaction 'EX_acgam_exchg' since it already exists.
Ignoring reaction 'EX_adn_exchg' since it already exists.
Ignoring reaction 'EX_adocbl_exchg' since it already exists.
Ignoring reaction 'EX_akg_exchg' since it already exists.
Ignoring reaction 'EX_ala_D_exchg' since it already exists.
Ignoring reaction 'EX_ala_L_exchg' since it already exists.
Ignoring reaction 'EX_alaasp_exchg' since it already exists.
Ignoring reaction 'EX_alagln_exchg' s

Ignoring reaction 'EX_pydx_exchg' since it already exists.
Ignoring reaction 'EX_pydxn_exchg' since it already exists.
Ignoring reaction 'EX_rib_D_exchg' since it already exists.
Ignoring reaction 'EX_salcn_exchg' since it already exists.
Ignoring reaction 'EX_sbt_D_exchg' since it already exists.
Ignoring reaction 'EX_ser_D_exchg' since it already exists.
Ignoring reaction 'EX_ser_L_exchg' since it already exists.
Ignoring reaction 'EX_so4_exchg' since it already exists.
Ignoring reaction 'EX_spmd_exchg' since it already exists.
Ignoring reaction 'EX_succ_exchg' since it already exists.
Ignoring reaction 'EX_sucr_exchg' since it already exists.
Ignoring reaction 'EX_sulfac_exchg' since it already exists.
Ignoring reaction 'EX_taur_exchg' since it already exists.
Ignoring reaction 'EX_thm_exchg' since it already exists.
Ignoring reaction 'EX_thr_L_exchg' since it already exists.
Ignoring reaction 'EX_thymd_exchg' since it already exists.
Ignoring reaction 'EX_tma_exchg' since it alread

Note: no products in the objective function, adding biomass to it.


Ignoring reaction 'EX__15dap_exchg' since it already exists.
Ignoring reaction 'EX__2dmmq8_exchg' since it already exists.
Ignoring reaction 'EX__2obut_exchg' since it already exists.
Ignoring reaction 'EX__3mop_exchg' since it already exists.
Ignoring reaction 'EX_ac_exchg' since it already exists.
Ignoring reaction 'EX_acac_exchg' since it already exists.
Ignoring reaction 'EX_acgam_exchg' since it already exists.
Ignoring reaction 'EX_adocbl_exchg' since it already exists.
Ignoring reaction 'EX_ala_L_exchg' since it already exists.
Ignoring reaction 'EX_alaasp_exchg' since it already exists.
Ignoring reaction 'EX_alagln_exchg' since it already exists.
Ignoring reaction 'EX_alaglu_exchg' since it already exists.
Ignoring reaction 'EX_alagly_exchg' since it already exists.
Ignoring reaction 'EX_alahis_exchg' since it already exists.
Ignoring reaction 'EX_alaleu_exchg' since it already exists.
Ignoring reaction 'EX_alathr_exchg' since it already exists.
Ignoring reaction 'EX_arbt_exchg

The output of the community model creation contains quite a few lines of info and warnings. This is to be expected. Let's have a look at the different types of info:
1. _Ignoring reaction 'EX__4abz_exchg' since it already exists._ This line comes up if a reaction is present in two different community member models under the same ID. This only happens for exchange reactions in the exchange compartment and is therefore correct behaviour.
2. _WARNING: no annotation overlap found for matching metabolite mn2. Please make sure that the metabolite with this ID is indeed representing the same substance in all models!_ This warning comes up if exchange metabolites do not contain any matching annotation field. This can be an indicator that metabolites with the same ID are merged although they represent different chemicals. Another common cause is that no annotation was given for this metabolite in one of the models.
3. _WARNING: matching of the metabolite CO2_EX is unbalanced (mass and/or charge). Please manually curate this metabolite for a mass and charge balanced model!_ This warning means that the formula of an exchange metabolite differs between member models. This can be due to the formula being omitted in some of the models. The other reason is that the metabolites differ in their mass or charge. As this would lead to generation or loss of matter from nothing, these issues need to be resolved for a consistent metabolic model.
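
The third warning type can be reproduced by hand by comparing the formula recorded for the same metabolite ID in each member. The following is a minimal sketch on plain dictionaries. The function name and the toy formulas are assumptions for illustration, and PyCoMo's own check may work differently.

```python
def find_formula_conflicts(formulas_by_member):
    """Return metabolite IDs whose formula differs (or is missing) between members."""
    seen = {}
    for member, formulas in formulas_by_member.items():
        for met_id, formula in formulas.items():
            seen.setdefault(met_id, {})[member] = formula
    return {
        met_id: by_member
        for met_id, by_member in seen.items()
        if len(set(by_member.values())) > 1
    }


# Toy input: None stands for a formula that was omitted in a model.
toy_formulas = {
    "member_a": {"co2": "CO2", "nh4": "H4N", "h2o": "H2O"},
    "member_b": {"co2": "CO2", "nh4": "H3N", "h2o": "H2O"},
    "member_c": {"co2": None},
}
conflicts = find_formula_conflicts(toy_formulas)
print(sorted(conflicts))
```

Here `co2` is flagged because one member omits the formula, and `nh4` because the formulas disagree, which mirrors the two causes described in the list.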

### Summary and report ###
The community model object has two utility methods to display information on the model. 
- Summary behaves the same as the summary method of COBRApy, displaying the solution of FBA and its exchange metabolites. In the CommunityModel summary, the exchange reactions of metabolites responsible for scaling the flux bounds to the community composition are hidden.
- The report function displays information on the model structure: the number of metabolites, reactions, genes, etc., but also quality control measures on mass and charge balance and internal loops.

In [None]:
com_model_obj.summary()

In [None]:
com_model_obj.report()

### Setting the growth rate ###
By default, the community model object has the fixed growth rate structure. This means that the abundance fractions of the community members are allowed to vary during simulations, but the individual and community growth rates are set to a fixed value (default: 1.0). The next thing we will try is to set the community growth rate to a different value and run an FBA.

In [None]:
com_model_obj.apply_fixed_growth_rate(0.5)
com_model_obj.summary()

### Setting the community member composition ###
The model structure can be changed to fixed abundance with variable growth rate. To do so, a conversion function needs to be called. Here, we then set all community members to equal abundance.

In [None]:
com_model_obj.convert_to_fixed_abundance()
abundance_dict = com_model_obj.generate_equal_abundance_dict()
com_model_obj.apply_fixed_abundance(abundance_dict)
com_model_obj.summary()
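
An abundance dictionary maps each member name to its fraction of the community, as in the `{"dv": ..., "mh": ..., "mb": ...}` dictionaries used later in this tutorial. The sketch below builds an equal-abundance dictionary by hand. The helper name `equal_abundances` is hypothetical, and the exact return format of `generate_equal_abundance_dict` is assumed rather than confirmed here.

```python
def equal_abundances(member_names):
    """Assign every member the same fraction so that the fractions sum to 1."""
    names = list(member_names)
    if not names:
        raise ValueError("At least one community member is required.")
    fraction = 1.0 / len(names)
    return {name: fraction for name in names}


print(equal_abundances(["dv", "mh", "mb"]))
```

Building the dictionary manually like this is also a convenient starting point for custom compositions, as long as the fractions still sum to 1.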

## Saving and loading community models ##
Community model objects can be saved to and loaded from SBML files. This differs from saving only the cobra model of the community model object, as the abundance fractions of the organisms are written into the file as well. Saving and loading the community model can be done like this:

In [None]:
com_model_obj.save("../data/toy/output/henson_com_model.xml")

In [None]:
com_model_obj_loaded = pycomo.CommunityModel.load("../data/toy/output/henson_com_model.xml")

In [None]:
com_model_obj_loaded

In [None]:
com_model_obj_loaded.community_model.optimize()

### Quality Checks ###
One of the quality checks that should be done is to look into all unbalanced reactions (mass and charge) in the entire model. Such reactions should only exist in the case of boundary reactions, such as exchange, sink and source reactions.

In [None]:
com_model_obj.get_unbalanced_reactions()
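
To make concrete what "mass balanced" means, the sketch below sums the element counts over a reaction's stoichiometry. It is a simplified, stand-alone illustration: the function names are hypothetical, the formula parser only handles simple formulas without brackets, and charge is ignored. COBRApy offers `Reaction.check_mass_balance()` for real models.

```python
import re
from collections import Counter


def parse_formula(formula):
    """Count elements in a simple formula such as 'C6H12O6' (no brackets or charges)."""
    counts = Counter()
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] += int(number) if number else 1
    return counts


def mass_imbalance(stoichiometry, formulas):
    """Return the net element counts of a reaction; an empty dict means balanced."""
    net = Counter()
    for met_id, coefficient in stoichiometry.items():
        for element, count in parse_formula(formulas[met_id]).items():
            net[element] += coefficient * count
    return {element: total for element, total in net.items() if total != 0}


# Toy data: negative coefficients are consumed, positive ones are produced.
formulas = {"glc": "C6H12O6", "etoh": "C2H6O", "co2": "CO2"}
fermentation = {"glc": -1, "etoh": 2, "co2": 2}
co2_exchange = {"co2": -1}
print(mass_imbalance(fermentation, formulas))
print(mass_imbalance(co2_exchange, formulas))
```

The internal reaction comes out balanced, while the exchange reaction reports a net loss of carbon and oxygen, which is the expected pattern for boundary reactions.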

## Analysis of community models ##
PyCoMo offers the option to calculate all potential exchange metabolites and cross-feeding interactions in a community, independent of the community composition. The example for this part is a three-member community published by Koch et al. 2019 (https://doi.org/10.1371/journal.pcbi.1006759). The three member organisms are representatives of functional guilds in a biogas community.
### Creating the community model ###
We repeat the steps as before.

In [None]:
test_model_dir = "../data/use_case/koch"
named_models = pycomo.load_named_models_from_dir(test_model_dir)

In [None]:
named_models

In [None]:
single_org_models = []
for name, model in named_models.items():
    single_org_model = pycomo.SingleOrganismModel(model, name)
    single_org_models.append(single_org_model)
    
community_name = "koch_community_model"
com_model_obj = pycomo.CommunityModel(single_org_models, community_name)

With the community model generated, we set the medium for the analysis, as done by Koch et al.

In [None]:
medium = {
    'EX_CO2_EX_exchg': 1000.0,
    'EX_Eth_EX_exchg': 1000.0,
    'EX_BM_tot_exchg': 1000.0
}
com_model_obj.medium = medium
com_model_obj.apply_medium()

# Some metabolites are not allowed to accumulate in the medium.
com_model_obj.community_model.reactions.get_by_id("EX_Form_EX_exchg").upper_bound = 0.
com_model_obj.community_model.reactions.get_by_id("EX_H2_EX_exchg").upper_bound = 0.

### Calculating potential metabolite exchange ###
All potential exchange metabolite fluxes and cross-feeding interactions can be calculated with the _potential_metabolite_exchanges_ method. This is a single FVA, but with a minimum objective of 0 and relaxed constraints. All reaction constraints are changed to include the value 0, which circumvents cases where a specific flux through a reaction is required, leading to infeasible solutions for certain community compositions.

In [None]:
com_model_obj.potential_metabolite_exchanges()
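
The phrase "changed to include the value 0" can be read as widening each pair of flux bounds just enough that a flux of 0 becomes feasible. The sketch below shows that reading on plain numbers. The function name is hypothetical and the actual relaxation inside PyCoMo may differ in detail.

```python
def relax_bounds_to_include_zero(lower_bound, upper_bound):
    """Widen a (lower, upper) bound pair only as far as needed to allow a flux of 0."""
    return min(lower_bound, 0.0), max(upper_bound, 0.0)


# A forced positive flux (for example a maintenance requirement) becomes optional.
print(relax_bounds_to_include_zero(0.5, 10.0))
# Bounds that already contain 0 are left unchanged.
print(relax_bounds_to_include_zero(-5.0, 5.0))
```

With every reaction allowed to carry zero flux, the all-zero flux vector is always feasible, which is why forced fluxes no longer make some compositions infeasible.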

### Plotting the maximum growth rate over the composition space ###

In [None]:
import pandas as pd

# Iterate over the fractions in steps of 0.01
com_model_obj.convert_to_fixed_abundance()
rows = []
for i in range(0, 100, 1):  # fraction of D. vulgaris (in percent)
    for j in range(0, 100 - i, 1):  # fraction of M. hungatei (in percent)
        # The remainder (100 - i - j) is always at least 1 here and is assigned to "mb"
        abundances = {"dv": i/100., "mh": j/100., "mb": (100-i-j)/100.}
        
        # Apply the abundances
        com_model_obj.apply_fixed_abundance(abundances)
        
        # Reapply the bound restrictions of the exchange reactions
        com_model_obj.community_model.reactions.get_by_id("EX_Form_EX_exchg").upper_bound = 0.
        com_model_obj.community_model.reactions.get_by_id("EX_H2_EX_exchg").upper_bound = 0.
        
        # Calculate the optimal growth rate
        solution = com_model_obj.community_model.optimize()
        growth = 0. if str(solution.status) == "infeasible" else solution.objective_value
        rows.append({"dv": i/100., "mh": j/100., "growth": growth})
        
growth_df = pd.DataFrame(rows)
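
The nested loops above are specific to three members. As a sketch of a more general approach, the generator below enumerates all compositions on a regular grid for any number of members. The function name and the `min_count` parameter are assumptions for this example, and it is not a PyCoMo feature.

```python
from itertools import product


def composition_grid(members, steps, min_count=0):
    """Yield abundance dicts whose fractions are multiples of 1/steps and sum to 1.

    min_count is the smallest number of grid steps each member must receive
    (use 1 to exclude compositions where a member is absent).
    """
    members = list(members)
    for counts in product(range(min_count, steps + 1), repeat=len(members) - 1):
        last = steps - sum(counts)  # the final member takes the remainder
        if last < min_count:
            continue
        yield {m: c / steps for m, c in zip(members, (*counts, last))}


for abundances in composition_grid(["dv", "mh", "mb"], steps=4, min_count=1):
    print(abundances)
```

Each yielded dictionary has the same shape as the `abundances` dictionary in the loop, so it could be passed to `apply_fixed_abundance` in the same way.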

In [None]:
import matplotlib.pyplot as plt
import seaborn as sns
sns.set_theme()

# Restructure dataframe for heatmap (keyword arguments are required by pandas >= 2.0)
growth_df_pivot = growth_df.pivot(index="mh", columns="dv", values="growth")

# Draw a heatmap with the numeric values in each cell
f, ax = plt.subplots(figsize=(9, 6))
sns.heatmap(growth_df_pivot, ax=ax)
ax.invert_yaxis()