# COBRApy

COBRApy is a package for constraint-based modeling of biological networks written in Python.

This tool allows loading and inspecting Genome-Scale Metabolic (GEM) models written in the Systems Biology Markup Language (SBML) format.

Using COBRApy, one can analyse the following model contents:
1. Reactions
2. Metabolites
3. Genes
4. Exchange reactions (Environmental Conditions)

COBRApy allows manipulating the contents of a GEM model. For instance, one can edit reactions' flux bounds, knock out a metabolic gene, or change the environmental conditions.

Phenotypes can be predicted with several flux analysis methods implemented in COBRApy, including Flux Balance Analysis (FBA), Parsimonious FBA (pFBA), and Flux Variability Analysis (FVA).

Simulating gene and reaction deletions for a given GEM model is straightforward. One can simulate single or double knockouts using any of the flux analysis methods mentioned above.

## Installation


### Requirements
The following requirements need to be installed to use COBRApy:
- Python 3.6 or higher
- pip must be installed
- A solver: GLPK is used by default, but CPLEX is preferred


### How to install COBRApy?
```
pip install cobra
```

# Exercise 3 - Flux Analysis

## Working with a GEM model

COBRApy can be used to read a GEM model from an SBML file.
For this practical session, we will be using the following model:
- _E. coli_ core model which contains the central carbon metabolism of _Escherichia coli_ -> file: ../data/e_coli_core.xml

The reactions, metabolites, and genes encoded in an SBML file can be parsed by COBRApy. These contents are loaded into easy-to-use Python objects, namely `cobra.Reaction`, `cobra.Metabolite`, and `cobra.Gene`.

The model itself will be available as a `cobra.Model` object containing all these attributes.

In [1]:
# importing cobra
import cobra

# Loading a model
model_path = '../data/e_coli_core.xml'
model = cobra.io.read_sbml_model(model_path)

model

Name,e_coli_core
Memory address,0x01f644517848
Number of metabolites,72
Number of reactions,95
Number of groups,0
Objective expression,1.0*BIOMASS_Ecoli_core_w_GAM - 1.0*BIOMASS_Ecoli_core_w_GAM_reverse_712e5
Compartments,"extracellular space, cytosol"


In [2]:
#retrieving the first five reactions of the model
model.reactions[0:5]

[<Reaction PFK at 0x1f6455ee908>,
 <Reaction PFL at 0x1f6455ee8c8>,
 <Reaction PGI at 0x1f6455f5cc8>,
 <Reaction PGK at 0x1f6455ffbc8>,
 <Reaction PGL at 0x1f6455ffc48>]

In [3]:
#inspecting the first reaction of the model
model.reactions[0]

Reaction identifier,PFK
Name,Phosphofructokinase
Memory address,0x1f6455ee908
Stoichiometry,"atp_c + f6p_c --> adp_c + fdp_c + h_c  ATP C10H12N5O13P3 + D-Fructose 6-phosphate --> ADP C10H12N5O10P2 + D-Fructose 1,6-bisphosphate + H+"
GPR,b3916 or b1723
Lower bound,0.0
Upper bound,1000.0


In [4]:
#inspecting a reaction by its ID
model.reactions.get_by_id("ACALD")

Reaction identifier,ACALD
Name,Acetaldehyde dehydrogenase (acetylating)
Memory address,0x1f6455fbd88
Stoichiometry,acald_c + coa_c + nad_c <=> accoa_c + h_c + nadh_c  Acetaldehyde + Coenzyme A + Nicotinamide adenine dinucleotide <=> Acetyl-CoA + H+ + Nicotinamide adenine dinucleotide - reduced
GPR,b0351 or b1241
Lower bound,-1000.0
Upper bound,1000.0


In [5]:
#retrieving the first five metabolites of the model
model.metabolites[0:5]

[<Metabolite glc__D_e at 0x1f64456b148>,
 <Metabolite gln__L_c at 0x1f64456b108>,
 <Metabolite gln__L_e at 0x1f64456b8c8>,
 <Metabolite glu__L_c at 0x1f645544408>,
 <Metabolite glu__L_e at 0x1f645546088>]

In [6]:
#inspecting the first metabolite of the model
model.metabolites[0]

Metabolite identifier,glc__D_e
Name,D-Glucose
Memory address,0x1f64456b148
Formula,C6H12O6
Compartment,e
In 2 reaction(s),"EX_glc__D_e, GLCpts"


In [7]:
#inspecting a metabolite by its ID
model.metabolites.get_by_id("13dpg_c")

Metabolite identifier,13dpg_c
Name,3-Phospho-D-glyceroyl phosphate
Memory address,0x1f6455545c8
Formula,C3H4O10P2
Compartment,c
In 2 reaction(s),"PGK, GAPD"


In [8]:
#retrieving the first five genes of the model
model.genes[0:5]

[<Gene b1241 at 0x1f645586b88>,
 <Gene b0351 at 0x1f645586988>,
 <Gene s0001 at 0x1f645589f48>,
 <Gene b1849 at 0x1f645594748>,
 <Gene b3115 at 0x1f645594d88>]

In [9]:
#inspecting the first gene of the model
model.genes[0]

Gene identifier,b1241
Name,adhE
Memory address,0x1f645586b88
Functional,True
In 2 reaction(s),"ACALD, ALCD2x"


In [10]:
#inspecting a gene by its ID
model.genes.get_by_id('b1241')

Gene identifier,b1241
Name,adhE
Memory address,0x1f645586b88
Functional,True
In 2 reaction(s),"ACALD, ALCD2x"


### Reactions

In COBRApy, reactions are objects that can be inspected. They hold useful information about each reaction in the model, such as:
- name;
- metabolites;
- stoichiometry;
- genes;
- formula;
- reversibility;
- flux bounds;
- gpr;

In [11]:
#inspecting the reaction name, formula, metabolites, and stoichiometry.
reaction = model.reactions.get_by_id('ACALD')

print(reaction.name, '\n')
print(reaction.reaction, '\n')
for metabolite, coefficient in reaction.metabolites.items():
    print(metabolite, '->', coefficient)

Acetaldehyde dehydrogenase (acetylating) 

acald_c + coa_c + nad_c <=> accoa_c + h_c + nadh_c 

acald_c -> -1.0
coa_c -> -1.0
nad_c -> -1.0
accoa_c -> 1.0
h_c -> 1.0
nadh_c -> 1.0
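
The sign convention in the output above (negative coefficients for consumed metabolites, positive for produced ones) is enough to rebuild the reaction string. The following is a minimal pure-Python sketch of that idea. It assumes a plain dictionary in place of `reaction.metabolites`, and the helper `build_reaction_string` is hypothetical, not part of COBRApy:

```python
# Coefficients copied from the ACALD output above.
# Plain strings stand in for cobra.Metabolite objects (an assumption of this sketch).
acald_coefficients = {
    "acald_c": -1.0, "coa_c": -1.0, "nad_c": -1.0,
    "accoa_c": 1.0, "h_c": 1.0, "nadh_c": 1.0,
}


def _format_side(terms):
    """Join metabolite ids, omitting coefficients equal to 1."""
    parts = []
    for met_id, coeff in terms:
        amount = abs(coeff)
        parts.append(met_id if amount == 1 else f"{amount:g} {met_id}")
    return " + ".join(parts)


def build_reaction_string(coefficients, reversible=True):
    """Hypothetical helper: substrates (negative) go left, products (positive) go right."""
    substrates = [(m, c) for m, c in coefficients.items() if c < 0]
    products = [(m, c) for m, c in coefficients.items() if c > 0]
    arrow = "<=>" if reversible else "-->"
    return f"{_format_side(substrates)} {arrow} {_format_side(products)}"


print(build_reaction_string(acald_coefficients))
```

With the coefficients above, the printed string matches the `reaction.reaction` output shown for ACALD.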


In [12]:
#inspecting reversibility and flux bounds.
print(reaction.lower_bound, "< ACALD <", reaction.upper_bound, '\n')
print(reaction.reversibility, '\n')
print(reaction.bounds)

-1000.0 < ACALD < 1000.0 

True 

(-1000.0, 1000.0)


In [13]:
#change reaction bounds
reaction.bounds = (0, 1000)
print(reaction.lower_bound, "< ACALD <", reaction.upper_bound, '\n')
print(reaction.reversibility)

0 < ACALD < 1000 

False
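
The switch from `True` to `False` follows from the bounds alone: a reaction counts as reversible when flux may run in both directions, that is, when the lower bound is negative and the upper bound is positive. A minimal sketch of that rule, where `is_reversible` is a hypothetical helper written for illustration:

```python
def is_reversible(lower_bound, upper_bound):
    """True when the flux may be negative as well as positive."""
    return lower_bound < 0 < upper_bound


print(is_reversible(-1000.0, 1000.0))  # original ACALD bounds -> True
print(is_reversible(0, 1000))          # bounds after the edit above -> False
```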


### Metabolites

In COBRApy, metabolites are objects that can be inspected. They hold useful information about each metabolite in the model, such as:
- name;
- chemical formula;
- compartment;
- reactions;

In [14]:
#inspecting metabolite name, chemical formula, compartment and reactions.
metabolite = model.metabolites.get_by_id('13dpg_c')

print(metabolite.name, '\n')
print(metabolite.formula, '\n')
print('Metabolite Compartment:',metabolite.compartment, '\n')
for _reaction in metabolite.reactions:
    print(_reaction.id, ':', _reaction.reaction)

3-Phospho-D-glyceroyl phosphate 

C3H4O10P2 

Metabolite Compartment: c 

PGK : 3pg_c + atp_c <=> 13dpg_c + adp_c
GAPD : g3p_c + nad_c + pi_c <=> 13dpg_c + h_c + nadh_c
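
A chemical formula such as `C3H4O10P2` can be broken down into element counts, which is useful when checking mass balance by hand. The sketch below uses only the standard library and assumes simple formulas without parentheses, charges, or fractional counts. `parse_formula` is a hypothetical helper; COBRApy exposes comparable information through `Metabolite.elements`.

```python
import re


def parse_formula(formula):
    """Hypothetical helper: map each element symbol to its atom count."""
    counts = {}
    # An element is an uppercase letter, optionally followed by a lowercase one,
    # then an optional integer count (a missing count means 1).
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] = counts.get(element, 0) + (int(number) if number else 1)
    return counts


print(parse_formula("C3H4O10P2"))  # {'C': 3, 'H': 4, 'O': 10, 'P': 2}
```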


### Genes

In COBRApy, genes are objects that can be inspected. They hold useful information about each gene in the model, such as:
- name;
- reactions;

In [15]:
#inspect gene name and reactions.
gene = model.genes.get_by_id('b0351')

print(gene.name, '\n')
for reaction_ in gene.reactions:
    print(reaction_.id, ':', reaction_.gene_reaction_rule)

mhpF 

ACALD : b0351 or b1241
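
A gene-protein-reaction (GPR) rule such as `b0351 or b1241` is a Boolean expression over gene identifiers, so it can be evaluated to decide whether a reaction survives a gene knockout. The sketch below is a simplified stand-in for that logic, not COBRApy's own implementation. It assumes gene identifiers are valid Python names and that rules use only `and`, `or`, and parentheses. `gpr_is_active` is a hypothetical helper.

```python
import ast


def gpr_is_active(rule, knocked_out):
    """Hypothetical helper: True if the rule still holds after the knockouts."""

    def evaluate(node):
        if isinstance(node, ast.Expression):
            return evaluate(node.body)
        if isinstance(node, ast.BoolOp):
            values = [evaluate(value) for value in node.values]
            return all(values) if isinstance(node.op, ast.And) else any(values)
        if isinstance(node, ast.Name):
            # A gene is functional unless it was knocked out.
            return node.id not in knocked_out
        raise ValueError(f"Unsupported GPR element: {ast.dump(node)}")

    if not rule.strip():
        return True  # reactions without a GPR are not affected by knockouts
    return evaluate(ast.parse(rule, mode="eval"))


print(gpr_is_active("b0351 or b1241", {"b0351"}))           # True: b1241 remains
print(gpr_is_active("b0351 or b1241", {"b0351", "b1241"}))  # False: both isozymes lost
```

Under an `or` rule the genes act as alternatives, so ACALD is lost only when both are removed. Under an `and` rule a single knockout would be enough.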


### Exchanges

In COBRApy, exchanges are reaction objects that can be inspected in the model. These special reactions define the environmental conditions (e.g., the medium) of the model. From the lower and upper bounds of each exchange reaction, one can tell which metabolites the model may take up and which it may secrete.

In [16]:
#inspecting the exchange reactions.
for exchange in model.exchanges:
    print(exchange.name, '->', exchange.bounds)

Acetate exchange -> (0.0, 1000.0)
Acetaldehyde exchange -> (0.0, 1000.0)
2-Oxoglutarate exchange -> (0.0, 1000.0)
CO2 exchange -> (-1000.0, 1000.0)
Ethanol exchange -> (0.0, 1000.0)
Formate exchange -> (0.0, 1000.0)
D-Fructose exchange -> (0.0, 1000.0)
Fumarate exchange -> (0.0, 1000.0)
D-Glucose exchange -> (-10.0, 1000.0)
L-Glutamine exchange -> (0.0, 1000.0)
L-Glutamate exchange -> (0.0, 1000.0)
H+ exchange -> (-1000.0, 1000.0)
H2O exchange -> (-1000.0, 1000.0)
D-lactate exchange -> (0.0, 1000.0)
L-Malate exchange -> (0.0, 1000.0)
Ammonia exchange -> (-1000.0, 1000.0)
O2 exchange -> (-1000.0, 1000.0)
Phosphate exchange -> (-1000.0, 1000.0)
Pyruvate exchange -> (0.0, 1000.0)
Succinate exchange -> (0.0, 1000.0)
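
Reading the bounds above, a negative lower bound means the metabolite may be taken up (at most at the absolute value of that bound), while a lower bound of zero means it can only be secreted. The sketch below applies this reading to a few of the bounds listed above. `split_exchanges` is a hypothetical helper, and COBRApy reports similar information through `model.medium`.

```python
# A subset of the exchange bounds printed above.
exchange_bounds = {
    "D-Glucose exchange": (-10.0, 1000.0),
    "O2 exchange": (-1000.0, 1000.0),
    "Ammonia exchange": (-1000.0, 1000.0),
    "Acetate exchange": (0.0, 1000.0),
    "Ethanol exchange": (0.0, 1000.0),
}


def split_exchanges(bounds):
    """Hypothetical helper: separate uptake-enabled exchanges from secretion-only ones."""
    # Maximum uptake rate is the absolute value of a negative lower bound.
    uptake = {name: -lower for name, (lower, upper) in bounds.items() if lower < 0}
    secretion_only = sorted(
        name for name, (lower, upper) in bounds.items() if lower >= 0 and upper > 0
    )
    return uptake, secretion_only


uptake, secretion_only = split_exchanges(exchange_bounds)
print("Available in the medium:", uptake)
print("Secretion only:", secretion_only)
```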


### Phenotype Prediction

COBRApy includes three different algorithms for phenotype prediction: Flux Balance Analysis (FBA), Parsimonious Flux Balance Analysis (pFBA), and Flux Variability Analysis (FVA).

To perform a simulation using one of these methods, you should first define an objective function. This can be any reaction in the model, including an exchange, whose flux will be maximized or minimized. In this model, the biomass reaction is set as the objective function by default, reflecting the common assumption that microorganisms grow as fast as their environment allows.
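
In its usual textbook form, FBA is a linear program. The notation below is an assumption introduced here for illustration: $S$ is the stoichiometric matrix (metabolites by reactions), $v$ the vector of reaction fluxes, $c$ the vector of objective coefficients, and $v^{\text{lb}}$, $v^{\text{ub}}$ the vectors of lower and upper flux bounds.

$$
\begin{aligned}
\max_{v} \quad & c^{\top} v \\
\text{subject to} \quad & S\,v = 0 \\
& v^{\text{lb}}_j \le v_j \le v^{\text{ub}}_j \quad \text{for every reaction } j
\end{aligned}
$$

The constraint $S\,v = 0$ expresses the steady-state assumption: every metabolite is produced as fast as it is consumed. When biomass is the objective, $c$ has a single nonzero entry at the biomass reaction.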

In [17]:
model.objective.expression

1.0*BIOMASS_Ecoli_core_w_GAM - 1.0*BIOMASS_Ecoli_core_w_GAM_reverse_712e5

#### Flux Balance Analysis (FBA)

FBA simulations can be performed using `model.optimize()`, which returns a solution object holding the result of the simulation, including:
- objective_value;
- status;
- fluxes;

In [18]:
#performing a FBA simulation
fba_solution = model.optimize()
fba_solution

Reaction,fluxes,reduced_costs
PFK,7.477382,8.673617e-19
PFL,0.000000,-1.527746e-02
PGI,4.860861,0.000000e+00
PGK,-16.023526,3.469447e-18
PGL,4.959985,0.000000e+00
...,...,...
NADH16,38.534610,0.000000e+00
NADTRHD,0.000000,-2.546243e-03
NH4t,4.765319,0.000000e+00
O2t,21.799493,0.000000e+00


Models solved using the FBA method can be further analysed using the `model.summary()` method.


This method returns the flux value for the reaction defined as the objective function, together with the fluxes of the exchange reactions. In the resulting tables, uptake fluxes correspond to the rates at which the model consumes metabolites, whereas secretion fluxes correspond to the rates at which it produces them.

In [19]:
model.summary()

Uptake
Metabolite,Reaction,Flux,C-Number,C-Flux
glc__D_e,EX_glc__D_e,10.0,6,100.00%
nh4_e,EX_nh4_e,4.765,0,0.00%
o2_e,EX_o2_e,21.8,0,0.00%
pi_e,EX_pi_e,3.215,0,0.00%

Secretion
Metabolite,Reaction,Flux,C-Number,C-Flux
co2_e,EX_co2_e,-22.81,1,100.00%
h2o_e,EX_h2o_e,-29.18,0,0.00%
h_e,EX_h_e,-17.53,0,0.00%
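
One plausible reading of the `C-Flux` column is the share of total carbon carried by each metabolite, that is, flux times carbon number divided by the sum over the table. The sketch below recomputes that share from the uptake rows above. It is an interpretation for illustration, not COBRApy's own code, and `carbon_flux_shares` is a hypothetical helper.

```python
# (metabolite, flux, carbon atoms per molecule), copied from the uptake table above.
uptake_rows = [
    ("glc__D_e", 10.0, 6),
    ("nh4_e", 4.765, 0),
    ("o2_e", 21.8, 0),
    ("pi_e", 3.215, 0),
]


def carbon_flux_shares(rows):
    """Hypothetical helper: fraction of the total carbon flux carried by each metabolite."""
    carbon = {met: abs(flux) * c_number for met, flux, c_number in rows}
    total = sum(carbon.values())
    return {met: (value / total if total else 0.0) for met, value in carbon.items()}


shares = carbon_flux_shares(uptake_rows)
for met, share in shares.items():
    print(f"{met}: {share:.2%}")
```

Glucose is the only carbon-containing metabolite taken up, so it accounts for all of the incoming carbon, in agreement with the table.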


### Pathway Visualization

COBRApy does not include a pathway visualization method. Nevertheless, other tools work with COBRApy models, such as Escher.

Escher allows the visualization of metabolic pathway maps. Escher maps can be built using the `escher.Builder` class, which takes the model as an argument. In addition, Escher maps can be overlaid with a flux distribution obtained with COBRApy's `optimize()` method.

To obtain the metabolic map for our model, use Escher as follows:

In [20]:
# first import escher
import escher

In [21]:
# create the builder object which contains the metabolic map
builder = escher.Builder(map_name='e_coli_core.Core metabolism', model=model, reaction_data=fba_solution.fluxes)
builder

Downloading Map from https://escher.github.io/1-0-0/6/maps/Escherichia%20coli/e_coli_core.Core%20metabolism.json


Builder(reaction_data={'PFK': 7.477381962160285, 'PFL': 0.0, 'PGI': 4.860861146496822, 'PGK': -16.023526143167…