# Reconstruction of genome-scale metabolic models

In the following exercise you are going to:

* Reconstruct a draft of a genome-scale metabolic model using [carveme](https://github.com/cdanielmachado/carveme).
* Analyze its characteristics (number of reactions, metabolites and genes; predicted growth rate etc.).
* Validate your model using [memote](https://memote.io/).
* Validate three other models using memote and compare the results to your model.

Before you start, make sure that the following returns version 1.4.4. If it states 1.4.2, please open a command line terminal (the + icon in the top left) and run `python3.6 -m pip install optlang==1.4.4 --upgrade`.

In [15]:
import optlang
optlang.__version__

'1.4.4'

***
**A.**
Use [carveme](https://github.com/cdanielmachado/carveme) to generate a draft reconstruction for a bacterium of your choice.

For example,

    carve --refseq GCF_000166295.1 --output Marinobacter-adhaerens-HP15.xml --gapfill LB --init LB

will generate a GSM for the bacterium *Marinobacter adhaerens* HP15 by accessing its [sequence](https://www.ncbi.nlm.nih.gov/nuccore/NC_017506.1) on [RefSeq](https://www.ncbi.nlm.nih.gov/refseq/) (`GCF_000166295.1` is its RefSeq accession number). After "carving" out a model from the universal reaction model, carveme will gap-fill the model so that it can grow on rich medium (`--gapfill LB`) and initialize the final model with that medium (`--init LB`). The final model is written to a file named `Marinobacter-adhaerens-HP15.xml` (`--output Marinobacter-adhaerens-HP15.xml`).

Hints:
* You can find carveme's documentation [here](https://carveme.readthedocs.io/en/latest/usage.html).
* You can execute command-line commands (e.g. `carve`) in a Jupyter notebook by prepending them with a `!` in a code cell.

As an example, let's reconstruct a draft GSM for [*Marinobacter adhaerens HP15*](https://www.ncbi.nlm.nih.gov/assembly/GCF_000166295.1) (this should take a few minutes).

In [9]:
!carve --refseq GCF_000166295.1 --output Marinobacter-adhaerens-HP15-LB.xml --gapfill LB --init LB

***
**B.**

* How many reactions, metabolites and genes does the model contain?
* What percentage of the organism's genes is covered by the model?
* How fast does your model/organism grow?

Hints:
* You can use `cobra.io.read_sbml_model` to load your model (see also previous exercise)
* Reactions, metabolites and genes are accessible via `model.reactions`, `model.metabolites` and `model.genes` (see also previous exercise).
* You can simulate the model by running `model.optimize()` (the objective value corresponds to $\mu_{max}$).
* What is the predicted growth rate if you gap-fill the model based on a minimal M9 glucose medium (`M9` instead of `LB`)?
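The hints above can be combined into one small helper. This is only a sketch: `summarize_model` is a hypothetical name, and it assumes the object passed in behaves like a cobrapy model (it exposes `reactions`, `metabolites`, `genes` and `optimize()`).

```python
def summarize_model(model):
    """Collect basic size statistics and the predicted growth rate.

    Assumes a cobrapy-like model: `reactions`, `metabolites` and `genes`
    support len(), and `optimize()` returns a solution object with an
    `objective_value` attribute.
    """
    solution = model.optimize()
    return {
        "reactions": len(model.reactions),
        "metabolites": len(model.metabolites),
        "genes": len(model.genes),
        "growth_rate": solution.objective_value,
    }

# Possible usage in the notebook (requires cobrapy and the carved model file):
# from cobra.io import read_sbml_model
# model = read_sbml_model('Marinobacter-adhaerens-HP15-LB.xml')
# summarize_model(model)
```

Returning a dictionary makes it easy to tabulate several models side by side later on.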

In [10]:
from cobra.io import read_sbml_model

In [11]:
model = read_sbml_model('Marinobacter-adhaerens-HP15-LB.xml')

The model includes 1089 genes.

In [12]:
len(model.genes)

1089

That accounts for roughly 26% of the genes in the organism's genome (4,197 coding genes in total, as can be seen on the genome's [summary page](https://www.ncbi.nlm.nih.gov/nuccore/NC_017506.1)).

In [6]:
(len(model.genes) / 4197) * 100

25.94710507505361
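The same calculation can be wrapped in a function so that it is reusable for other organisms. A minimal sketch, where `gene_coverage` is a hypothetical helper and the genome gene count still has to be looked up manually:

```python
def gene_coverage(n_model_genes, n_genome_genes):
    """Return the percentage of genome genes that appear in the model."""
    if n_genome_genes <= 0:
        raise ValueError("n_genome_genes must be positive")
    return 100.0 * n_model_genes / n_genome_genes

# Values taken from the example above: 1089 model genes, 4197 coding genes.
print(round(gene_coverage(1089, 4197), 1))  # prints 25.9
```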

In [7]:
model.medium

{'EX_glc__D_e': 10.0,
 'EX_h2o_e': 10.0,
 'EX_h_e': 10.0,
 'EX_leu__L_e': 10.0,
 'EX_ala__L_e': 10.0,
 'EX_cl_e': 10.0,
 'EX_pi_e': 10.0,
 'EX_nh4_e': 10.0,
 'EX_gly_e': 10.0,
 'EX_ser__L_e': 10.0,
 'EX_thr__L_e': 10.0,
 'EX_arg__L_e': 10.0,
 'EX_fe3_e': 10.0,
 'EX_lys__L_e': 10.0,
 'EX_asp__L_e': 10.0,
 'EX_aso3_e': 10.0,
 'EX_k_e': 10.0,
 'EX_pro__L_e': 10.0,
 'EX_ca2_e': 10.0,
 'EX_mg2_e': 10.0,
 'EX_mn2_e': 10.0,
 'EX_cobalt2_e': 10.0,
 'EX_zn2_e': 10.0,
 'EX_cu2_e': 10.0,
 'EX_o2_e': 10.0,
 'EX_glu__L_e': 10.0,
 'EX_fe2_e': 10.0,
 'EX_h2s_e': 10.0,
 'EX_pheme_e': 10.0,
 'EX_his__L_e': 10.0,
 'EX_hxan_e': 10.0,
 'EX_ile__L_e': 10.0,
 'EX_met__L_e': 10.0,
 'EX_mobd_e': 10.0,
 'EX_so4_e': 10.0,
 'EX_val__L_e': 10.0,
 'EX_thm_e': 10.0,
 'EX_ura_e': 10.0}
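The medium is a plain dictionary that maps exchange reaction IDs to maximum uptake rates. To read it as a list of compounds, the IDs can be reduced to metabolite IDs. The sketch below assumes the naming pattern visible in the output above (`EX_` prefix, `_e` compartment suffix); `medium_metabolites` is a hypothetical helper.

```python
def medium_metabolites(medium):
    """Turn exchange reaction IDs such as 'EX_glc__D_e' into 'glc__D'.

    Assumes the 'EX_<metabolite>_e' pattern; IDs that do not follow it
    are returned with whatever part does not match left untouched.
    """
    ids = []
    for reaction_id in medium:
        metabolite_id = reaction_id
        if metabolite_id.startswith("EX_"):
            metabolite_id = metabolite_id[len("EX_"):]
        if metabolite_id.endswith("_e"):
            metabolite_id = metabolite_id[:-len("_e")]
        ids.append(metabolite_id)
    return sorted(ids)

# Small excerpt of the medium shown above.
example_medium = {"EX_glc__D_e": 10.0, "EX_o2_e": 10.0, "EX_nh4_e": 10.0}
print(medium_metabolites(example_medium))  # prints ['glc__D', 'nh4', 'o2']
```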

In [8]:
model.optimize()

Unnamed: 0,fluxes,reduced_costs
1PPDCRc,0.000128,-1.387779e-17
2AGPE120tipp,0.000000,0.000000e+00
2AGPE140tipp,0.000000,-7.819936e-02
2AGPE141tipp,0.000000,-5.604288e-02
2AGPE160tipp,0.000000,0.000000e+00
...,...,...
sink_hemeO_c,-0.000000,0.000000e+00
sink_lipopb_c,-0.000000,0.000000e+00
sink_sheme_c,-0.000000,0.000000e+00
Growth,1.319614,-3.330669e-15


In [None]:
!carve --refseq GCF_000166295.1 --output Marinobacter-adhaerens-HP15-M9.xml --gapfill M9 --init M9
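Once both drafts exist, their predicted growth rates can be compared directly. The following is a sketch under the assumption that each model offers cobrapy's `slim_optimize()`, which returns only the objective value; `compare_growth` and the labels are hypothetical.

```python
def compare_growth(models):
    """Map each label to the model's maximal objective value.

    `models` is a dict of label -> cobrapy-like model; every model must
    provide `slim_optimize()`, which returns the objective value as a float.
    """
    return {label: model.slim_optimize() for label, model in models.items()}

# Possible usage in the notebook (requires cobrapy and both carved files):
# from cobra.io import read_sbml_model
# compare_growth({
#     "LB": read_sbml_model('Marinobacter-adhaerens-HP15-LB.xml'),
#     "M9": read_sbml_model('Marinobacter-adhaerens-HP15-M9.xml'),
# })
```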

***
**C.**

You can generate a report for a model by running 

    memote report snapshot model.xml

This will take a few minutes (depending on your model's size) and generate a file called `index.html` that contains the report.
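When several models are validated, each report needs its own file name, otherwise every run writes to the same `index.html`. The sketch below builds one command per model using memote's `--filename` option; `memote_snapshot_command` and the `-report.html` naming scheme are assumptions made for illustration.

```python
from pathlib import Path

def memote_snapshot_command(model_path, report_dir="."):
    """Build the argument list for a memote snapshot report.

    The report is named after the model file, e.g. 'model.xml' becomes
    'model-report.html' inside `report_dir` (a hypothetical convention).
    """
    model_path = Path(model_path)
    report_path = Path(report_dir) / (model_path.stem + "-report.html")
    return [
        "memote", "report", "snapshot",
        "--filename", str(report_path),
        str(model_path),
    ]

# Possible usage (requires memote to be installed and on PATH):
# import subprocess
# subprocess.run(memote_snapshot_command("Marinobacter-adhaerens-HP15-LB.xml"), check=True)
print(memote_snapshot_command("Marinobacter-adhaerens-HP15-LB.xml"))
```

Passing the command as a list avoids shell quoting problems with unusual file names.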


In [16]:
%%time
!memote report snapshot --skip test_stoichiometric_consistency --skip test_find_metabolites_not_produced_with_open_bounds --skip test_find_metabolites_not_consumed_with_open_bounds Marinobacter-adhaerens-HP15-LB.xml

pyenv: memote: command not found

The `memote' command exists in these Python versions:
  27410-course-materials
  3.6.11/envs/27410-course-materials

Note: See 'pyenv help global' for tips on allowing both
      python2 and python3 to be found.
CPU times: user 4.48 ms, sys: 9.65 ms, total: 14.1 ms
Wall time: 226 ms
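The output above suggests that the `memote` executable was not visible to the active Python environment, so no report was produced. Before starting a long run, one way to check this is with the standard library's `shutil.which`. A minimal sketch (`find_tool` is a hypothetical helper):

```python
import shutil

def find_tool(name):
    """Return the full path of a command-line tool, or None if it is not on PATH."""
    return shutil.which(name)

path = find_tool("memote")
if path is None:
    print("memote is not on PATH; activate the environment that provides it.")
else:
    print("memote found at", path)
```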
