# Building a universal bacterial metabolic model

To reconstruct genome-scale models from genome data, we first need a universal metabolic model that serves as a curated scaffold for the reconstruction process. Think of the universal model as a collection of curated metabolic reactions that are known to occur within a taxonomic group (in this case, the domain Bacteria), with the added advantage of being a fully functional metabolic model in itself: it contains compartments, metabolites, a biomass reaction, and all the components necessary to simulate growth.

In this notebook, we are going to create a universal metabolic model representative of all bacteria, which we will use to reconstruct species-specific models [later on](1_model_reconstructions.ipynb). We will begin with the universal metabolic model contained in the [BIGG database](http://bigg.ucsd.edu/) and adapt it to our needs. Specifically, we will:

1. Remove all compartments that are exclusively eukaryotic (e.g. mitochondria, chloroplasts, etc.) and transfer their reactions to the cytosol.

2. Annotate metabolites with meta information (e.g. charge, formula, etc.).

3. Add a biomass reaction representative of bacterial growth.

These steps have already been implemented in the `GEM` class of the Python package [Phycogem](https://github.com/Robaina/Phycogem), which we will use here for convenience.

Let's go ahead and load the universal model from the BIGG database.

## Removing eukaryotic compartments and transferring reactions

In [1]:
from phycogem.reconstruction import GEM

In [2]:
unibigg = GEM("../data/carveme_universes/BIGG_universal_model/universal_model_cobrapy.xml")
unibigg

| Attribute | Value |
|---|---|
| Name | bigg_universal |
| Memory address | 7f8f4917a850 |
| Number of metabolites | 15638 |
| Number of reactions | 28301 |
| Number of genes | 0 |
| Number of groups | 0 |
| Objective expression | `1.0*BIOMASS_reaction - 1.0*BIOMASS_reaction_reverse_5a818` |
| Compartments | cytoplasm, extracellular, periplasm, mitochondrion, peroxisome, unknown, nucleus, vacuole, golgi, thylakoid, lysosome, chloroplast, eyespot, flagellum, mitochondrial intermembrane space, unknown, unknown, unknown, unknown, mitochondrial membrane, cell wall, unknown |


We see that the universal model in the BIGG database contains 28301 reactions, 15638 metabolites, and a number of compartments that are exclusively eukaryotic. We will first remove all shuttle reactions that involve non-prokaryotic compartments (that is, any compartment other than cytosol, periplasm, and extracellular).

In [3]:
unibigg.remove_shuttle_reactions(allowed_compartments={"e", "c", "p"})
unibigg

| Attribute | Value |
|---|---|
| Name | bigg_universal |
| Memory address | 7ff173b16550 |
| Number of metabolites | 15467 |
| Number of reactions | 25787 |
| Number of genes | 0 |
| Number of groups | 0 |
| Objective expression | 0 |
| Compartments | cytoplasm, extracellular, periplasm, mitochondrion, peroxisome, unknown, nucleus, vacuole, golgi, thylakoid, lysosome, chloroplast, eyespot, flagellum, unknown, unknown, cell wall, unknown |
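
As a rough illustration of what counts as an unwanted shuttle reaction, the sketch below flags reactions that span more than one compartment and touch at least one compartment outside the allowed set. This is not Phycogem's implementation: the helper names, the toy reactions, and the assumption that the compartment is the suffix after the last underscore of a BiGG-style metabolite ID are all illustrative.

```python
def compartment_of(met_id: str) -> str:
    """Return the compartment suffix of a BiGG-style ID, e.g. 'atp_c' -> 'c'."""
    return met_id.rsplit("_", 1)[-1]


def is_unwanted_shuttle(stoichiometry: dict, allowed: set) -> bool:
    """True if the reaction spans several compartments and touches a disallowed one."""
    compartments = {compartment_of(m) for m in stoichiometry}
    return len(compartments) > 1 and not compartments <= allowed


# Toy reactions (hypothetical IDs and stoichiometries, not taken from the BIGG model)
reactions = {
    "ATP_mito_transport": {"atp_c": -1, "atp_m": 1},  # cytoplasm <-> mitochondrion
    "GLC_uptake": {"glc__D_e": -1, "glc__D_p": 1},    # extracellular <-> periplasm
    "HEX1": {"glc__D_c": -1, "atp_c": -1, "g6p_c": 1, "adp_c": 1, "h_c": 1},
    "MITO_internal": {"atp_m": -1, "adp_m": 1},       # confined to one compartment
}

allowed = {"e", "c", "p"}
kept = {rid: s for rid, s in reactions.items() if not is_unwanted_shuttle(s, allowed)}
print(sorted(kept))
```

Under this rule, a reaction confined to a single eukaryotic compartment (`MITO_internal` in the toy data) is not a shuttle and is kept.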


We can see that the total number of reactions has decreased considerably (from 28301 to 25787) after removing all unwanted shuttle reactions. However, we still need to deal with the eukaryotic compartments. Let's remove those and transfer their reactions to the cytoplasm.

In [4]:
unibigg.move_reactions_to_cytoplasm()
unibigg

| Attribute | Value |
|---|---|
| Name | bigg_universal |
| Memory address | 7ff173b16550 |
| Number of metabolites | 11970 |
| Number of reactions | 25787 |
| Number of genes | 0 |
| Number of groups | 0 |
| Objective expression | 0 |
| Compartments | cytoplasm, extracellular, periplasm |


Alright, we can see that the final model contains only three compartments: cytosol, periplasm, and extracellular.
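
The output above shows the reaction count unchanged at 25787 while the metabolite count drops from 15467 to 11970. One plausible reading is that relabeled organelle metabolites merge with cytosolic metabolites that already exist. The sketch below illustrates that idea on a single reaction; it is an assumption about the general approach, not the code behind `move_reactions_to_cytoplasm`, and the function name and toy reaction are hypothetical.

```python
def move_to_cytoplasm(stoichiometry: dict, keep=("c", "e", "p")) -> dict:
    """Relabel metabolites from non-kept compartments as cytosolic ('_c')."""
    moved = {}
    for met_id, coeff in stoichiometry.items():
        base, comp = met_id.rsplit("_", 1)
        new_id = met_id if comp in keep else f"{base}_c"
        # Coefficients add up if the relabeled ID collides with an existing one
        moved[new_id] = moved.get(new_id, 0) + coeff
    # Drop metabolites whose net coefficient cancels out after merging
    return {m: c for m, c in moved.items() if c != 0}


# Toy mitochondrial ATP hydrolysis (hypothetical example)
mito_reaction = {"atp_m": -1, "h2o_m": -1, "adp_m": 1, "pi_m": 1, "h_m": 1}
print(move_to_cytoplasm(mito_reaction))
```

A reaction that is already cytosolic, periplasmic, or extracellular passes through unchanged.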

## Annotating metabolites

Metabolites in the universal model from the BIGG database don't contain meta information about their chemical composition and charge. We may need this information later on (for instance, when defining a growth medium), so we will annotate the metabolites using the database available in the [CarveME repo](https://github.com/cdanielmachado/carveme/tree/master/carveme/data/input). Finally, we will rename the model compartments for compatibility with CarveME and write the model to an XML file so that we can import it in the final curation step.

In [13]:
cpd_annotations = "../data/compounds/mnx_compounds.tsv"
unibigg.annotate_compounds(cpd_annotations)
unibigg.prepare_for_carveme("../data/carveme_universes/universal_bacteria.csv")
unibigg.write("../data/carveme_universes/universal_bacteria.xml")
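
To make the annotation step concrete, here is a minimal sketch that reads a tab-separated compound table and looks up formula and charge for each metabolite. The column names (`id`, `formula`, `charge`), the inline table, and both helper functions are assumptions made for illustration; the actual layout of `mnx_compounds.tsv` and the behavior of `annotate_compounds` may differ.

```python
import csv
import io

# Hypothetical annotation table; the real file's columns may be named differently
tsv_text = (
    "id\tformula\tcharge\n"
    "glc__D\tC6H12O6\t0\n"
    "atp\tC10H12N5O13P3\t-4\n"
)


def load_annotations(handle) -> dict:
    """Map each compound ID to its formula and integer charge."""
    reader = csv.DictReader(handle, delimiter="\t")
    return {
        row["id"]: {"formula": row["formula"], "charge": int(row["charge"])}
        for row in reader
    }


def annotate(metabolite_ids, annotations: dict) -> dict:
    """Look up each metabolite by its ID without the compartment suffix."""
    annotated = {}
    for met_id in metabolite_ids:
        base = met_id.rsplit("_", 1)[0]
        annotated[met_id] = annotations.get(base)  # None when no annotation exists
    return annotated


annotations = load_annotations(io.StringIO(tsv_text))
result = annotate(["glc__D_c", "atp_e", "unknown_c"], annotations)
print(result)
```

Metabolites without an entry in the table come back as `None`, which makes gaps in the annotation easy to count before moving on to curation.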

## Adding a biomass pseudo-reaction

Finally, we will curate our universal model by removing stoichiometrically unbalanced reactions, fixing proton and charge balances, and removing blocked reactions and dead-end metabolites. We will also add a [biomass pseudo-reaction](https://github.com/cdanielmachado/carveme/blob/master/carveme/data/input/biomass_db.tsv) to the universal model that is representative of bacterial growth. All these steps are implemented in the `curate` function within [CarveME](https://github.com/Robaina/carveme_expanded_universe/blob/master/carveme/cli/curate_universe.py). Note, however, that we have already removed the eukaryotic organelles and transferred their reactions to the cytoplasm. We did this because CarveME eliminates reactions in eukaryotic compartments by default. While this is a reasonable default, we want to keep those reactions, because many of them also occur in prokaryotes.

In [17]:
%%bash

curate_universe \
    --input ../data/carveme_universes/universal_bacteria.xml \
    --output ../data/carveme_universes/universal_bacteria_curated.xml \
    --biomass bacteria \
    --biomass-db ../data/biomass_reactions/biomass_db.tsv --gramneg

Curating gramneg universe...
Initial model size: 11970 x 25787
Removing compartments that do not belong to gramneg..
Current model size: 11970 x 25787
Removing reactions that do not belong to gramneg..
Current model size: 11970 x 25787
Computing missing formulae...
Set parameter Username
Academic license - for non-commercial use only - expires 2023-11-05
Removing unbalanced reactions..
found 5235 reactions
Current model size: 10425 x 20055
Trying to correct proton and charge balance...
Trying to fix hydrogen stoichiometry...
Removing blocked reactions and dead-end metabolites...
Final model size: 6339 x 17002
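
The "Removing unbalanced reactions" step in the log relies on the metabolite formulae annotated earlier. The sketch below shows the basic idea of an elemental mass-balance check in plain Python. It is not CarveME's code: the function names are hypothetical, and the formula parser only handles simple formulae without parentheses or generic groups.

```python
import re
from collections import defaultdict


def parse_formula(formula: str) -> dict:
    """Count atoms per element in a simple formula such as 'C6H12O6'."""
    counts = defaultdict(int)
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] += int(number) if number else 1
    return dict(counts)


def mass_imbalance(stoichiometry: dict, formulas: dict) -> dict:
    """Return the net atom count per element; an empty dict means balanced."""
    totals = defaultdict(float)
    for met_id, coeff in stoichiometry.items():
        for element, n_atoms in parse_formula(formulas[met_id]).items():
            totals[element] += coeff * n_atoms
    return {el: net for el, net in totals.items() if abs(net) > 1e-9}


# Toy example: water formation with correct and incorrect stoichiometry
formulas = {"h2_c": "H2", "o2_c": "O2", "h2o_c": "H2O"}
balanced = {"h2_c": -2, "o2_c": -1, "h2o_c": 2}
unbalanced = {"h2_c": -1, "o2_c": -1, "h2o_c": 1}

print(mass_imbalance(balanced, formulas))    # {}
print(mass_imbalance(unbalanced, formulas))  # {'O': -1.0}
```

A reaction with a non-empty result would be a candidate for removal or for stoichiometry correction, depending on which elements are off.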


## Testing the model

That's it. We have successfully reconstructed our bacterial universal metabolic model. Let's test it by simulating growth in a minimal medium. Note that this is just a test of the model's operability; the growth simulations themselves are not biologically meaningful, since the universal model does not represent any specific organism.

In [20]:
unibac = GEM("../data/carveme_universes/universal_bacteria_curated.xml")
unibac.model.summary()

| Metabolite | Reaction | Flux | C-Number | C-Flux |
|---|---|---|---|---|
| 12ppd__S_e | EX_12ppd__S_e | 1000.0 | 3 | 0.91% |
| 2obut_e | EX_2obut_e | 435.8 | 4 | 0.53% |
| 3mb_e | EX_3mb_e | 435.8 | 5 | 0.66% |
| 3mob_e | EX_3mob_e | 1000.0 | 5 | 1.52% |
| 3mop_e | EX_3mop_e | 1000.0 | 6 | 1.82% |
| 4mop_e | EX_4mop_e | 564.2 | 6 | 1.03% |
| 5fthf_e | EX_5fthf_e | 0.1106 | 20 | 0.00% |
| 5mta_e | EX_5mta_e | 1000.0 | 11 | 3.34% |
| LalaDgluMdap_e | EX_LalaDgluMdap_e | 49.59 | 15 | 0.23% |
| ahcys_e | EX_ahcys_e | 0.1106 | 14 | 0.00% |

| Metabolite | Reaction | Flux | C-Number | C-Flux |
|---|---|---|---|---|
| 2mba_e | EX_2mba_e | -1000.0 | 5 | 1.62% |
| 2mpa_e | EX_2mpa_e | -1000.0 | 4 | 1.30% |
| 3bcrn_e | EX_3bcrn_e | -654.5 | 11 | 2.34% |
| 3hpp_e | EX_3hpp_e | -500.0 | 3 | 0.49% |
| 3mba_e | EX_3mba_e | -1000.0 | 5 | 1.62% |
| 56dura_e | EX_56dura_e | -1000.0 | 4 | 1.30% |
| 5mdru1p_e | EX_5mdru1p_e | -1000.0 | 6 | 1.95% |
| HC00342_e | EX_HC00342_e | -591.6 | 6 | 1.15% |
| M01966_e | EX_M01966_e | -1000.0 | 6 | 1.95% |
| M01989_e | EX_M01989_e | -1000.0 | 26 | 8.44% |
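
The first table appears to list uptake fluxes (positive) and the second secretion fluxes (negative). The `C-Flux` column looks like each exchange's share of the total carbon flux in its table, that is, flux times carbon number divided by the sum over all rows. The sketch below reproduces that arithmetic on made-up numbers; it is an assumption about how the column is derived, and since only ten rows per table are shown above, the printed percentages cannot be recomputed from them.

```python
def carbon_flux_shares(rows):
    """Return each metabolite's percentage of the total carbon flux.

    rows: iterable of (metabolite_id, flux, carbon_number) tuples.
    """
    carbon_flux = {met: abs(flux) * n_carbon for met, flux, n_carbon in rows}
    total = sum(carbon_flux.values())
    if total == 0:
        return {met: 0.0 for met in carbon_flux}
    return {met: 100 * value / total for met, value in carbon_flux.items()}


# Made-up exchange fluxes (hypothetical metabolites, not from the model)
toy_rows = [("a_e", 10, 6), ("b_e", 20, 1), ("c_e", 5, 4)]
shares = carbon_flux_shares(toy_rows)
print(shares)
```

Weighting by carbon number explains why a metabolite with many carbon atoms, such as `M01989_e` with 26, takes a larger share than others exchanged at the same flux.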
