# Reconstruction of genome-scale metabolic models

In the following exercise you are going to:

* Reconstruct a draft of a genome-scale metabolic model using [carveme](https://github.com/cdanielmachado/carveme).
* Analyze its characteristics (number of reactions, metabolites and genes; predicted growth rate etc.).
* Validate your model using [memote](https://memote.io/).
* Validate three other genome-scale metabolic models (GSMs) of your choice and compare their quality metrics with each other and with the model that you reconstructed. Use PubMed or Google Scholar to find publications of GSMs and download the models from the supplementary materials if available.

***
**A.**
Use [carveme](https://github.com/cdanielmachado/carveme) to generate a draft reconstruction for a bacterium of your choice.

For example,

    carve --refseq GCF_000166295.1 --output Marinobacter-adhaerens-HP15.xml --gapfill LB --init LB

will generate a GSM for the bacterium *Marinobacter adhaerens* HP15 by accessing its [sequence](https://www.ncbi.nlm.nih.gov/nuccore/NC_017506.1) on [RefSeq](https://www.ncbi.nlm.nih.gov/refseq/) (`GCF_000166295.1` is its RefSeq accession number). After "carving" out a model from the universal reaction model, carveme will gap-fill the model so that it can grow on rich medium (`--gapfill LB`) and initialize the final model with that medium (`--init LB`). The final model is written to a file named `Marinobacter-adhaerens-HP15.xml` (`--output Marinobacter-adhaerens-HP15.xml`).

Hints:
* You can find carveme's documentation [here](https://carveme.readthedocs.io/en/latest/usage.html).
* You can execute command-line commands (e.g. `carve`) in a Jupyter notebook by prepending them with a `!` in a code cell.

As an example, let's reconstruct a draft GSM for [*Marinobacter adhaerens HP15*](https://www.ncbi.nlm.nih.gov/assembly/GCF_000166295.1) (this should take a few minutes).

In [1]:
%%time
!carve --refseq GCF_000166295.1 --output Marinobacter-adhaerens-HP15-LB.xml --gapfill LB --init LB

Running diamond for the first time, please wait while we build the internal database...
diamond v0.9.14.115 | by Benjamin Buchfink <buchfink@gmail.com>
Licensed under the GNU AGPL <https://www.gnu.org/licenses/agpl.txt>
Check http://github.com/bbuchfink/diamond for updates.

#CPU threads: 8
Scoring parameters: (Matrix=BLOSUM62 Lambda=0.267 K=0.041 Penalties=11/1)
Database file: /Users/s143838/.pyenv/versions/anaconda3-2020.07/envs/carveme_env/lib/python3.6/site-packages/carveme/data/generated/bigg_proteins.faa
Opening the database file...  [0.001324s]
Loading sequences...  [0.088402s]
Masking sequences...  [0.604364s]
Writing sequences...  [0.094553s]
Loading sequences...  [1.9e-05s]
Writing trailer...  [0.002401s]
Closing the input file...  [4.1e-05s]
Closing the database file...  [0.000328s]
Processed 26727 sequences, 11170577 letters.
Total time = 0.791755s
CPU times: user 1.01 s, sys: 333 ms, total: 1.34 s
Wall time: 1min 11s


***
**B.**
Try to answer the following questions.

* How many reactions, metabolites and genes does the model contain?
* What is the percentage of genes covered?
* How fast does your model/organism grow?
* What is the predicted growth rate if you gap-fill the model based on a minimal M9 glucose medium (`M9` instead of `LB`)?

Hints:
* You can use `cobra.io.read_sbml_model` to load your model (see also previous exercise).
* Reactions, metabolites and genes are accessible via `model.reactions`, `model.metabolites` and `model.genes` (see also previous exercise).
* You can simulate the model by running `model.optimize()` (the objective value corresponds to $\mu_{max}$).
* You can look at the medium of a model by running `model.medium`. This will return a dictionary of all the exchange reactions that enable uptake, together with the bound that has been set on each uptake.
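
The hints above can be combined into a small helper. This is a minimal sketch rather than a reference solution: the function names `summarize_model`, `predicted_growth_rate` and `load_and_summarize` are made up for illustration, and `n_genome_genes` (the number of protein-coding genes in the genome annotation) is a value you have to look up yourself. It also assumes that "genes covered" means the share of annotated genes that appear in the model.

```python
def summarize_model(model, n_genome_genes=None):
    """Return basic size statistics for a COBRApy-style model object."""
    summary = {
        "reactions": len(model.reactions),
        "metabolites": len(model.metabolites),
        "genes": len(model.genes),
    }
    if n_genome_genes:
        # Assumption: coverage = genes in the model / annotated genes in the genome.
        summary["gene_coverage_percent"] = 100.0 * summary["genes"] / n_genome_genes
    return summary


def predicted_growth_rate(model):
    """Optimize the model; the objective value is the predicted growth rate."""
    return model.optimize().objective_value


def load_and_summarize(sbml_path, n_genome_genes=None):
    """Load an SBML file with COBRApy and report size, growth rate and medium."""
    # Imported here so the helpers above can be used without COBRApy installed.
    from cobra.io import read_sbml_model

    model = read_sbml_model(sbml_path)
    summary = summarize_model(model, n_genome_genes)
    summary["growth_rate"] = predicted_growth_rate(model)
    summary["medium"] = dict(model.medium)
    return summary
```

Calling `load_and_summarize('Marinobacter-adhaerens-HP15-LB.xml')` in a code cell should return a dictionary with the counts, the predicted growth rate and the medium; pass `n_genome_genes` once you know the gene count of your genome.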

In [2]:
# For some reason I'm not able to install the cobra package in my conda env. A possible fix is to use carveme only on the command line and analyse the model in the regular venv.
from cobra.io import read_sbml_model
model = read_sbml_model('Marinobacter-adhaerens-HP15-LB.xml')
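
For the M9 question, the draft has to be carved again with a different medium. The following is a sketch of how the command could be assembled from Python, assuming `carve` is available on your `PATH`. The helper names and the `-M9` output file name are illustrative; the flags are the ones already used in part A.

```python
import subprocess


def build_carve_command(refseq_accession, output_path, medium):
    """Assemble the carve call from part A for a given gap-filling medium."""
    return [
        "carve",
        "--refseq", refseq_accession,
        "--output", output_path,
        "--gapfill", medium,
        "--init", medium,
    ]


def run_carve(refseq_accession, output_path, medium):
    """Run carveme; raises CalledProcessError if the command fails."""
    command = build_carve_command(refseq_accession, output_path, medium)
    subprocess.run(command, check=True)
    return output_path


# Only builds and prints the command; call run_carve(...) to execute it.
m9_command = build_carve_command(
    "GCF_000166295.1", "Marinobacter-adhaerens-HP15-M9.xml", "M9"
)
print(" ".join(m9_command))
```

The printed line can also be pasted into a code cell with a leading `!`. Loading the resulting file and optimizing it then gives the growth rate on M9, which you can compare with the LB value.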

***
**C.**

You can generate a report for a model with memote by running 

    memote report snapshot model.xml

In the interest of time we're going to skip a few of the more computationally demanding tests and also write the report to a file named after the model (`--filename`).

    !memote report snapshot --skip test_stoichiometric_consistency \
        --skip test_find_metabolites_not_produced_with_open_bounds \
        --skip test_find_metabolites_not_consumed_with_open_bounds Marinobacter-adhaerens-HP15-LB.xml --filename Marinobacter-adhaerens-HP15-LB.html

This will take a few minutes (depending on your model's size and the computational load on the Jupyter Classroom) and generate a file called `Marinobacter-adhaerens-HP15-LB.html` that contains the report (without `--filename`, the report is written to `index.html`). You can double-click it in the file browser to open it. You can continue to **D.** while you're waiting for the result.

You can adapt the following to match your model.

In [None]:
%%time
!memote report snapshot --skip test_stoichiometric_consistency \
    --skip test_find_metabolites_not_produced_with_open_bounds \
    --skip test_find_metabolites_not_consumed_with_open_bounds Marinobacter-adhaerens-HP15-LB.xml --filename Marinobacter-adhaerens-HP15-LB.html

***
**D.**
Upload three models that you found in the literature and test them with memote. Compare how the different quality metrics vary between those models and the model that you constructed with carveme.
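
To run the same report for several models, the memote call from part C can be generated per file. This is a sketch under a few assumptions: the file names `model_a.xml`, `model_b.xml` and `model_c.xml` are placeholders for the models you downloaded, the helper `build_memote_command` is hypothetical, and the skipped tests are simply the three from part C.

```python
from pathlib import Path

# The three tests skipped in part C to save time.
SKIPPED_TESTS = (
    "test_stoichiometric_consistency",
    "test_find_metabolites_not_produced_with_open_bounds",
    "test_find_metabolites_not_consumed_with_open_bounds",
)


def build_memote_command(model_path, skipped_tests=SKIPPED_TESTS):
    """Assemble a memote snapshot call that writes <model name>.html."""
    command = ["memote", "report", "snapshot"]
    for test_name in skipped_tests:
        command += ["--skip", test_name]
    report_path = Path(model_path).with_suffix(".html")
    command += [str(model_path), "--filename", str(report_path)]
    return command


# Placeholder file names; replace them with the models you downloaded.
model_files = ["model_a.xml", "model_b.xml", "model_c.xml"]
commands = [build_memote_command(path) for path in model_files]
for command in commands:
    print(" ".join(command))
```

Each printed line can be run in a code cell with a leading `!`, or passed to `subprocess.run(command, check=True)`. Using the same set of skipped tests for every model keeps the resulting reports comparable.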