# COBRApy

COBRApy is a package for constraint-based modeling of biological networks written in Python.

This tool allows loading and inspecting Genome-Scale Metabolic (GEM) models written in the Systems Biology Markup Language (SBML) format.

Using COBRApy, one can analyze the following model contents:
1. Reactions
2. Metabolites
3. Genes
4. Exchange reactions (Environmental Conditions)

COBRApy allows manipulating the contents of a GEM model. For instance, one can edit reactions' flux bounds, knock out a metabolic gene, or change environmental conditions.

COBRApy contains flux analysis methods to simulate an organism's phenotypic behavior. These include Flux Balance Analysis (FBA), Parsimonious FBA (pFBA), and Flux Variability Analysis (FVA).

The simulation of gene and reaction deletions for a given GEM model is a straightforward process. One can simulate single or double knockouts using one of the flux analysis methods mentioned above.

## Installation


### Requirements
COBRApy requires:
- Python 3.6 or higher
- pip
- A linear programming solver: GLPK is the default, but CPLEX is preferred


### How to install COBRApy?
```
pip install cobra
```

# Exercise 7 - Phenotype prediction

For this practical session, we will be using the following model:
- _E. coli_ core model, which contains the central carbon metabolism of _Escherichia coli_ (file: `../data/e_coli_core.xml`)

You can read more about the _E. coli_ core model (Orth et al., 2010) at the following links:
- https://journals.asm.org/doi/10.1128/ecosalplus.10.2.1
- http://bigg.ucsd.edu/models/e_coli_core

This exercise consists of searching for candidate genes whose knockout would lead to increased ethanol production. We will follow these steps:
- Perform an FBA simulation using an anaerobic medium;
- Load the Escher map with the FBA simulation to visualize the flux distribution;
- Identify candidate mutants to increase the production of a compound of interest;
- Determine the essential genes of the model;
- Check the candidates for their essentiality;
- Perform the gene knockouts.

In [1]:
# imports
import cobra
import escher

In [2]:
# Loading a model
model_path = '../data/e_coli_core.xml'
model = cobra.io.read_sbml_model(model_path)

model

| Attribute | Value |
|---|---|
| Name | e_coli_core |
| Memory address | 28e293b2c48 |
| Number of metabolites | 72 |
| Number of reactions | 95 |
| Number of genes | 137 |
| Number of groups | 0 |
| Objective expression | `1.0*BIOMASS_Ecoli_core_w_GAM - 1.0*BIOMASS_Ecoli_core_w_GAM_reverse_712e5` |
| Compartments | extracellular space, cytosol |


### Phenotype Simulation

The default environmental conditions of the _E. coli_ core model represent an aerobic medium. However, the production of ethanol in _E. coli_ is often associated with mixed-acid fermentation. Hence, we should use anaerobic conditions.

In [3]:
# Set the environmental conditions to replicate an anaerobic medium:
# a lower bound of 0 on the oxygen exchange reaction blocks oxygen uptake
o2 = model.reactions.get_by_id('EX_o2_e')
o2.bounds = (0, 1000)
o2

| Attribute | Value |
|---|---|
| Reaction identifier | EX_o2_e |
| Name | O2 exchange |
| Memory address | 0x28e5e4dd1c8 |
| Stoichiometry | `o2_e -->` |
| | O2 O2 --> |
| GPR | |
| Lower bound | 0 |
| Upper bound | 1000 |
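
The bounds above follow the usual sign convention for an exchange reaction written as `o2_e -->`: a negative flux means uptake from the medium and a positive flux means secretion. The minimal sketch below, which does not use COBRApy, makes that reading explicit. The helper `describe_exchange_bounds` is a hypothetical name introduced for illustration, and the aerobic bounds `(-1000, 1000)` are an assumption used only for comparison.

```python
def describe_exchange_bounds(lower, upper):
    """Describe what an exchange reaction written as `met_e -->` may do.

    Negative flux corresponds to uptake, positive flux to secretion.
    """
    can_take_up = lower < 0
    can_secrete = upper > 0
    if can_take_up and can_secrete:
        return "uptake and secretion allowed"
    if can_take_up:
        return "uptake only"
    if can_secrete:
        return "secretion only"
    return "blocked"


# Bounds set in the cell above for EX_o2_e (anaerobic medium).
anaerobic = describe_exchange_bounds(0, 1000)
# Assumed aerobic bounds, shown for comparison only.
aerobic = describe_exchange_bounds(-1000, 1000)

print("anaerobic:", anaerobic)
print("aerobic:", aerobic)
```

With a lower bound of 0, the oxygen exchange can no longer carry a negative flux, so the model cannot take up oxygen.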


In [4]:
# Perform an FBA simulation
wt_solution = model.optimize()
wt_solution

| | fluxes | reduced_costs |
|---|---|---|
| PFK | 9.789459 | 2.602085e-18 |
| PFL | 17.804674 | 0.000000e+00 |
| PGI | 9.956609 | 0.000000e+00 |
| PGK | -19.437336 | -0.000000e+00 |
| PGL | 0.000000 | 0.000000e+00 |
| ... | ... | ... |
| NADH16 | 0.000000 | -5.538015e-03 |
| NADTRHD | 0.000000 | -1.107603e-02 |
| NH4t | 1.154156 | 0.000000e+00 |
| O2t | 0.000000 | 0.000000e+00 |


### Metabolic Pathway Visualization

Now that we have performed an FBA simulation under anaerobic conditions, we can create an `escher.Builder` object to display the flux distribution on the _E. coli_ core metabolism map. This may help to identify reactions that divert flux away from ethanol production.

In [5]:
# create the builder object which contains the metabolic map
builder = escher.Builder(map_name='e_coli_core.Core metabolism', model=model, reaction_data=wt_solution.fluxes)
builder

Downloading Map from https://escher.github.io/1-0-0/6/maps/Escherichia%20coli/e_coli_core.Core%20metabolism.json


Builder(reaction_data={'PFK': 9.789458863898286, 'PFL': 17.804674217935304, 'PGI': 9.95660909530426, 'PGK': -1…

The Citrate synthase and Phosphotransacetylase reactions seem to be diverting flux away from ethanol production: both consume acetyl-CoA, the precursor of ethanol.

### Search for essential Genes

Now that we have two candidate reactions, we can use the `find_essential_genes` function of the `cobra.flux_analysis` package to identify the essential genes of this model. The result contains all the genes whose knockout would result in no growth.

Reactions that depend on essential genes are therefore not suitable targets for a metabolic engineering strategy.

In [6]:
# Search for essential genes
essential = cobra.flux_analysis.find_essential_genes(model)
essential

{<Gene b0720 at 0x28e5e3eefc8>,
 <Gene b1136 at 0x28e5e40cd48>,
 <Gene b1779 at 0x28e5e400c08>,
 <Gene b2415 at 0x28e5e3fd5c8>,
 <Gene b2416 at 0x28e5e3faf88>,
 <Gene b2779 at 0x28e5e3f4648>,
 <Gene b2926 at 0x28e5e4224c8>,
 <Gene b3919 at 0x28e5e432fc8>,
 <Gene b3956 at 0x28e5e427c08>,
 <Gene b4025 at 0x28e5e421f48>}

### Check the mutant candidates

After discovering the essential genes of the model, one can now check if the target reactions are associated with any of those essential genes.

In [7]:
model.reactions.PTAr

| Attribute | Value |
|---|---|
| Reaction identifier | PTAr |
| Name | Phosphotransacetylase |
| Memory address | 0x28e5e47f2c8 |
| Stoichiometry | `accoa_c + pi_c <=> actp_c + coa_c` |
| | Acetyl-CoA + Phosphate <=> Acetyl phosphate + Coenzyme A |
| GPR | b2297 or b2458 |
| Lower bound | -1000.0 |
| Upper bound | 1000.0 |


In [8]:
model.reactions.CS

| Attribute | Value |
|---|---|
| Reaction identifier | CS |
| Name | Citrate synthase |
| Memory address | 0x28e5e485a48 |
| Stoichiometry | `accoa_c + h2o_c + oaa_c --> cit_c + coa_c + h_c` |
| | Acetyl-CoA + H2O H2O + Oxaloacetate --> Citrate + Coenzyme A + H+ |
| GPR | b0720 |
| Lower bound | 0.0 |
| Upper bound | 1000.0 |


As we can see, the Phosphotransacetylase reaction is a viable knockout target: its GPR is `b2297 or b2458`, and neither gene is in the list of essential genes.

On the other hand, Citrate synthase should not be knocked out. Its only gene, b0720, is essential, so the organism would not be able to grow without it.
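
The check described above can also be done programmatically rather than by eye. The following is a minimal sketch that works on plain strings copied from the outputs in this section, so it runs without the model. The helper `genes_in_gpr` is a hypothetical name introduced here, and it assumes that gene identifiers are simple alphanumeric tokens combined with `and`/`or`.

```python
import re

# Essential gene identifiers copied from the find_essential_genes output above.
essential_ids = {"b0720", "b1136", "b1779", "b2415", "b2416",
                 "b2779", "b2926", "b3919", "b3956", "b4025"}

# GPR strings copied from the two reaction summaries above.
candidate_gprs = {"PTAr": "b2297 or b2458", "CS": "b0720"}


def genes_in_gpr(gpr):
    """Return the set of gene identifiers mentioned in a GPR string."""
    tokens = re.findall(r"[A-Za-z_]\w*", gpr)
    # Drop the boolean operators, keep only the gene identifiers.
    return {token for token in tokens if token.lower() not in {"and", "or"}}


for reaction_id, gpr in candidate_gprs.items():
    blocking = sorted(genes_in_gpr(gpr) & essential_ids)
    if blocking:
        print(reaction_id, "-> avoid, essential genes:", ", ".join(blocking))
    else:
        print(reaction_id, "-> viable candidate")
```

Inside the notebook, the same sets could presumably be built from the live objects, for example `{gene.id for gene in essential}` and `{gene.id for gene in model.reactions.PTAr.genes}`, instead of copying identifiers by hand.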

### Gene knock-outs

In [9]:
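# Ethanol secretion flux of the wild type (computed above)
print('Wt strain ethanol production:', wt_solution['EX_etoh_e'])

# Changes made inside the `with` block are reverted when it exits
with model:
    # PTAr has the GPR "b2297 or b2458", so both genes must be knocked out
    model.genes.b2297.knock_out()
    model.genes.b2458.knock_out()
    mutant_solution = model.optimize()
    print('Mutant strain ethanol production:', mutant_solution['EX_etoh_e'])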

Wt strain ethanol production: 8.279455380486581
Mutant strain ethanol production: 16.584255740929652
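
Both genes are knocked out because the GPR of PTAr, `b2297 or b2458`, describes two alternative genes: the reaction stays active as long as either of them is functional. The sketch below illustrates this logic with a small boolean evaluator built on Python's `ast` module. It is an illustration under simplifying assumptions, not COBRApy's own implementation, and `gpr_is_active` is a hypothetical helper name.

```python
import ast


def gpr_is_active(gpr, knocked_out):
    """Evaluate a boolean GPR string given a set of knocked-out gene ids."""
    def evaluate(node):
        if isinstance(node, ast.BoolOp):
            results = [evaluate(value) for value in node.values]
            if isinstance(node.op, ast.And):
                return all(results)
            return any(results)
        if isinstance(node, ast.Name):
            # A gene is considered functional unless it was knocked out.
            return node.id not in knocked_out
        raise ValueError("unsupported GPR element: " + ast.dump(node))

    return evaluate(ast.parse(gpr, mode="eval").body)


ptar_gpr = "b2297 or b2458"  # GPR of PTAr as shown in its reaction summary

print(gpr_is_active(ptar_gpr, set()))                 # no knockout
print(gpr_is_active(ptar_gpr, {"b2297"}))             # single knockout
print(gpr_is_active(ptar_gpr, {"b2297", "b2458"}))    # double knockout
```

Only the double knockout evaluates to `False`, which is why a single deletion would not be enough to block the Phosphotransacetylase reaction.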


In [10]:
# Display the mutant flux distribution on the same metabolic map
builder = escher.Builder(map_name='e_coli_core.Core metabolism', model=model, reaction_data=mutant_solution.fluxes)
builder

Downloading Map from https://escher.github.io/1-0-0/6/maps/Escherichia%20coli/e_coli_core.Core%20metabolism.json


Builder(reaction_data={'PFK': 9.271232684579616, 'PFL': 3.9563473930779764, 'PGI': 8.339429028836893, 'PGK': -…
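
Besides inspecting the two maps visually, the wild-type and mutant flux distributions can be compared numerically. The sketch below ranks reactions by the absolute change in flux, using only the values visible in the outputs above. The function `flux_changes` is a hypothetical helper introduced for illustration.

```python
# Flux values copied from the outputs above (wild type vs. mutant).
wt_fluxes = {"PFK": 9.789458863898286, "PFL": 17.804674217935304,
             "PGI": 9.95660909530426, "EX_etoh_e": 8.279455380486581}
mutant_fluxes = {"PFK": 9.271232684579616, "PFL": 3.9563473930779764,
                 "PGI": 8.339429028836893, "EX_etoh_e": 16.584255740929652}


def flux_changes(reference, perturbed):
    """Return (reaction, change) pairs sorted by decreasing absolute change."""
    shared = reference.keys() & perturbed.keys()
    changes = {rid: perturbed[rid] - reference[rid] for rid in shared}
    return sorted(changes.items(), key=lambda item: abs(item[1]), reverse=True)


for reaction_id, change in flux_changes(wt_fluxes, mutant_fluxes):
    print(f"{reaction_id}: {change:+.3f}")

# Ratio of mutant to wild-type ethanol secretion.
fold_change = mutant_fluxes["EX_etoh_e"] / wt_fluxes["EX_etoh_e"]
print(f"Ethanol production fold change: {fold_change:.2f}")
```

In the notebook, the full dictionaries could presumably be obtained with `wt_solution.fluxes.to_dict()` and `mutant_solution.fluxes.to_dict()`, since the fluxes are stored as a pandas Series.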