# COBRApy

COBRApy is a package for constraint-based modeling of biological networks written in Python.

This tool allows loading and inspecting Genome-Scale Metabolic (GEM) models written in the Systems Biology Markup Language (SBML) format.

Using COBRApy, one can analyse the following model contents:
1. Reactions
2. Metabolites
3. Genes
4. Exchange reactions (Environmental Conditions)

COBRApy allows manipulating the contents of a GEM model. For instance, one can edit reactions' flux bounds, knock out a metabolic gene, or change the environmental conditions.

Phenotype prediction can be simulated with several flux analysis methods implemented in COBRApy. These include Flux Balance Analysis (FBA), Parsimonious FBA (pFBA), and Flux Variability Analysis (FVA).

The simulation of gene and reaction deletions for a given GEM model is a simple and straightforward process. One can simulate single or double knock outs using one of the flux analysis methods mentioned above.

## Installation


### Requirements
The following requirements need to be installed to use COBRApy:
- Python 3.6 or higher
- pip must be installed
- GLPK solver is used by default, but CPLEX is preferred


### How to install COBRApy?
```
pip install cobra
```

# Exercise 5 - Phenotype prediction

## Working with a GEM model

For this practical session, we will be using the following model:
- _E. coli_ core model which contains the central carbon metabolism of _Escherichia coli_ -> file: e_coli_core.xml

This exercise consists of exploring the phenotype prediction tools of COBRApy. Thus, the following steps will be followed:
- Loading the model with COBRApy;
- Performing an FBA simulation using an aerobic/anaerobic medium;
- Performing reaction and gene deletions;
- Performing other flux analysis methods, such as pFBA, FVA, ROOM and MOMA;
- Analyzing the model's essential genes and reactions.


In [3]:
# importing cobra
import cobra

# Loading a model
model_path = '../data/e_coli_core.xml'
model = cobra.io.read_sbml_model(model_path)

model

0,1
Name,e_coli_core
Memory address,0x0184ab3843c8
Number of metabolites,72
Number of reactions,95
Number of groups,0
Objective expression,1.0*BIOMASS_Ecoli_core_w_GAM - 1.0*BIOMASS_Ecoli_core_w_GAM_reverse_712e5
Compartments,"extracellular space, cytosol"


In [None]:
#retrieving the first five reactions of the model
model.reactions[0:5]

In [None]:
#inspecting the first reaction of the model
model.reactions[0]

In [None]:
#inspecting a reaction by its ID
model.reactions.get_by_id("ACALD")

In [None]:
#retrieving the first five metabolites of the model
model.metabolites[0:5]

In [None]:
#inspecting the first metabolite of the model
model.metabolites[0]

In [None]:
#inspecting a metabolite by its ID
model.metabolites.get_by_id("13dpg_c")

In [None]:
#retrieving the first five genes of the model
model.genes[0:5]

In [None]:
#inspecting the first gene of the model
model.genes[0]

In [None]:
#inspecting a gene by its ID
model.genes.get_by_id('b1241')

### Reactions

In COBRApy, reactions are objects that can be inspected. These objects contain useful information about each reaction in the model, such as:
- name;
- metabolites;
- stoichiometry;
- genes;
- formula;
- reversibility;
- flux bounds;
- gpr;

In [None]:
#inspecting the reaction name, formula, metabolites, and stoichiometry.
reaction = model.reactions.get_by_id('ACALD')

print(reaction.name, '\n')
print(reaction.reaction, '\n')
for metabolite, coefficient in reaction.metabolites.items():
    print(metabolite, '->', coefficient)

In [None]:
#inspecting reversibility and flux bounds.
print(reaction.lower_bound, "< ACALD <", reaction.upper_bound, '\n')
print(reaction.reversibility, '\n')
print(reaction.bounds)

In [None]:
#change reaction bounds
reaction.bounds = (0, 1000)
print(reaction.lower_bound, "< ACALD <", reaction.upper_bound, '\n')
print(reaction.reversibility)

In [None]:
#inspect gene reaction rule
reaction.gene_reaction_rule

### Metabolites

In COBRApy, metabolites are objects that can be inspected. These objects contain useful information about each metabolite in the model, such as:
- name;
- chemical formula;
- compartment;
- reactions;

In [None]:
#inspecting metabolite name, chemical formula, compartment and reactions.
metabolite = model.metabolites.get_by_id('13dpg_c')

print(metabolite.name, '\n')
print(metabolite.formula, '\n')
print('Metabolite Compartment:',metabolite.compartment, '\n')
for _reaction in metabolite.reactions:
    print(_reaction.id, ':', _reaction.reaction)

### Genes

In COBRApy, genes are objects that can be inspected. These objects contain useful information about each gene in the model, such as:
- name;
- reactions;

In [None]:
#inspect gene name and reactions.
gene = model.genes.get_by_id('b0351')

print(gene.name, '\n')
for reaction_ in gene.reactions:
    print(reaction_.id, ':', reaction_.gene_reaction_rule)

### Exchanges

In COBRApy, exchanges are reaction objects that can be inspected in the model. These special reactions define the environmental conditions (e.g. the medium) of the model. The lower and upper bounds of each exchange reaction determine which metabolites the model can take up and secrete.

In [None]:
#inspecting the name and flux bounds of each exchange reaction.
for exchange in model.exchanges:
    print(exchange.name, '->', exchange.bounds)

### Phenotype Prediction

COBRApy includes several algorithms for phenotype prediction. Among them are Flux Balance Analysis (FBA), Parsimonious Flux Balance Analysis (pFBA), and Flux Variability Analysis (FVA).

To perform a simulation using one of these methods, you should first define an objective function. This can be a reaction or an exchange, which will be maximized or minimized. By default, the biomass reaction is set as the model objective function as this mimics the biological behavior of most organisms.

In [None]:
#the biomass reaction ID matches the objective expression shown in the model overview above
model.objective = 'BIOMASS_Ecoli_core_w_GAM'

#### Flux Balance Analysis (FBA)

FBA simulations can be performed using `model.optimize()`. This returns a solution object holding the results of the simulation, including:
- objective_value;
- status;
- fluxes;
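
For reference, the optimization problem behind FBA can be written as a linear program. The notation is introduced here and does not come from COBRApy itself: $S$ is the stoichiometric matrix, $v$ the vector of reaction fluxes, $c$ the vector of objective coefficients, and $v^{\mathrm{lb}}$, $v^{\mathrm{ub}}$ the flux bounds.

```latex
\begin{aligned}
\max_{v}\quad & c^{\top} v \\
\text{s.t.}\quad & S\,v = 0 \\
& v^{\mathrm{lb}} \le v \le v^{\mathrm{ub}}
\end{aligned}
```

The optimal value of $c^{\top} v$ corresponds to `objective_value`, and the optimal $v$ to `fluxes`.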

In [4]:
#performing a FBA simulation
fba_solution = model.optimize()
fba_solution

Unnamed: 0,fluxes,reduced_costs
PFK,7.477382,8.673617e-19
PFL,0.000000,-1.527746e-02
PGI,4.860861,0.000000e+00
PGK,-16.023526,3.469447e-18
PGL,4.959985,0.000000e+00
...,...,...
NADH16,38.534610,0.000000e+00
NADTRHD,0.000000,-2.546243e-03
NH4t,4.765319,0.000000e+00
O2t,21.799493,0.000000e+00


Models solved using the FBA method can be further analysed using the `model.summary()` method.


This method returns the flux value of the reaction defined as the objective function, together with the fluxes of the exchange reactions. In the resulting table, uptake fluxes correspond to the rates of metabolite consumption, whereas secretion fluxes correspond to the rates of metabolite production.

In [None]:
model.summary()

#### Parsimonious Flux Balance Analysis (pFBA)

pFBA gives the optimal growth rate while minimizing the total sum of fluxes.
It is available in the flux analysis package as `cobra.flux_analysis.pfba(model)`.
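
As a sketch, the standard pFBA formulation can be written as follows. The symbols are notation assumed here: $S$ is the stoichiometric matrix, $v$ the flux vector, $c$ the objective coefficients, and $Z_{\mathrm{FBA}}$ the optimal objective value obtained from a preceding FBA.

```latex
\begin{aligned}
\min_{v}\quad & \sum_{i} |v_i| \\
\text{s.t.}\quad & S\,v = 0 \\
& c^{\top} v = Z_{\mathrm{FBA}} \\
& v^{\mathrm{lb}} \le v \le v^{\mathrm{ub}}
\end{aligned}
```

The growth rate is therefore fixed at its FBA optimum, and the value being minimized is the total absolute flux.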

In [5]:
#performing pfba simulation
pfba_solution = cobra.flux_analysis.pfba(model)
pfba_solution

Unnamed: 0,fluxes,reduced_costs
PFK,7.477382,-2.000000
PFL,0.000000,5.733333
PGI,4.860861,-2.000000
PGK,-16.023526,2.000000
PGL,4.959985,-2.000000
...,...,...
NADH16,38.534610,-2.000000
NADTRHD,0.000000,1.422222
NH4t,4.765319,-2.000000
O2t,21.799493,-2.000000


The objective value of the pFBA solution is considerably different from that of the FBA solution. This happens because the pFBA objective value is the sum of all absolute flux values (`sum(abs(pfba_solution.fluxes.values))`), whereas the FBA objective value is the flux of the reaction being optimized (`fba_solution.fluxes["BIOMASS_Ecoli_core_w_GAM"]`).

In [None]:
#calculating the objective value of a pFBA solution
sum(abs(pfba_solution.fluxes.values))

#### Flux Variability Analysis (FVA)

FBA does not yield a unique flux distribution for a given objective function, but rather a space of multiple optimal solutions. FVA is a simulation method that finds the possible flux range of each reaction while keeping the objective function at its optimum. FVA is available in the flux analysis package as `cobra.flux_analysis.flux_variability_analysis(model)`.

In [None]:
#performing fva simulation
fva_solution = cobra.flux_analysis.flux_variability_analysis(model)
fva_solution

### Simulating Deletions

As previously mentioned, COBRApy can be used to simulate gene or reaction knock outs or deletions.

#### Single Reaction and Gene Knock outs

The `knock_out()` method can be used to assess what happens when a specific reaction is knocked out and not allowed to carry any flux. Taking the PFK reaction as an example:

In [None]:
#knock out the PFK reaction
with model:
    model.reactions.PFK.knock_out()
    print('PFK knocked out: ', model.optimize())

In [None]:
#knock out the b1723 and b3916 genes, which are associated with the pfk reaction
with model:
    model.genes.b1723.knock_out()
    print('b1723 knocked out: ', model.optimize())
    model.genes.b3916.knock_out()
    print('b3916 knocked out: ', model.optimize())

Moreover, COBRApy incorporates two simulation methods that are used to predict the flux distribution after a gene knock out. These are the Minimization of Metabolic Adjustment (MOMA), which can be called using `cobra.flux_analysis.moma()`, and Regulatory On/Off Minimization (ROOM), using `cobra.flux_analysis.room()`.
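
As a sketch of the idea behind MOMA, its linear variant looks for the mutant flux distribution closest to a wild-type reference. The notation is assumed here: $v^{\mathrm{wt}}$ is the reference flux vector (for example an FBA solution), and $v^{\mathrm{lb,ko}}$, $v^{\mathrm{ub,ko}}$ are the flux bounds after the knock out has been applied.

```latex
\begin{aligned}
\min_{v}\quad & \sum_{i} \left| v_i - v_i^{\mathrm{wt}} \right| \\
\text{s.t.}\quad & S\,v = 0 \\
& v^{\mathrm{lb,ko}} \le v \le v^{\mathrm{ub,ko}}
\end{aligned}
```

The original quadratic formulation replaces the absolute differences with squared differences.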

In [None]:
#using MOMA with COBRApy
with model:
    model.genes.b1723.knock_out()
    model.genes.b3916.knock_out()
    moma_result = cobra.flux_analysis.moma(model, fba_solution)
    print('MOMA Result: ', moma_result)

In [None]:
#using ROOM with COBRApy
with model:
    model.genes.b1723.knock_out()
    model.genes.b3916.knock_out()
    room_result = cobra.flux_analysis.room(model, fba_solution)
    print('ROOM Result: ', room_result)

#### Single Deletions

Single gene and reaction deletions can also be simulated with the flux analysis package of COBRApy. To do so, use the `cobra.flux_analysis.single_gene_deletion()` and `cobra.flux_analysis.single_reaction_deletion()` functions.

In [None]:
#single reaction deletion
reaction_deletion_results = cobra.flux_analysis.single_reaction_deletion(model)
reaction_deletion_results

In [None]:
#single gene deletion
gene_deletion_results = cobra.flux_analysis.single_gene_deletion(model)
gene_deletion_results

It is worth noting that genes and reactions whose deletion results in a growth value of zero can be considered essential genes or essential reactions, respectively.

#### Double Deletions

Double gene and reaction deletions can also be simulated with the flux analysis package of COBRApy. To do so, use the `cobra.flux_analysis.double_gene_deletion()` and `cobra.flux_analysis.double_reaction_deletion()` functions. By default, these functions test all possible pairwise combinations.

In [None]:
#double reaction deletion
double_reaction_deletion_results = cobra.flux_analysis.double_reaction_deletion(model)
double_reaction_deletion_results

In [None]:
#double gene deletion
double_gene_deletion_results = cobra.flux_analysis.double_gene_deletion(model)
double_gene_deletion_results

### Production envelopes

Production envelopes (also known as phenotype phase planes) show distinct phases of optimal growth as the uptake of two substrates varies. For example, to compute a phenotype phase plane for the uptake of glucose and oxygen:

In [None]:
#how to compute a production envelope
prod_env = cobra.flux_analysis.production_envelope(model, ["EX_glc__D_e", "EX_o2_e"])
prod_env.head()

Moreover, if the carbon source is specified, `production_envelope()` also returns the carbon and mass yields. For instance, when the objective is to produce acetate, a production envelope can be obtained and quickly plotted as follows:

In [None]:
#obtain a plot of a production envelope using glucose as carbon source and optimizing Acetate production
prod_env = cobra.flux_analysis.production_envelope(model, ["EX_o2_e"], objective="EX_ac_e", carbon_sources="EX_glc__D_e")
prod_env.plot(kind='line', x='EX_o2_e', y='carbon_yield_maximum')

Finally, the production envelope can also be used to obtain a flux variation plot. This can be achieved as follows:

In [None]:
#obtain a flux variation plot (the biomass reaction ID matches the model objective shown above)
prod_env_co2 = cobra.flux_analysis.production_envelope(model, ["BIOMASS_Ecoli_core_w_GAM"], objective=['EX_co2_e'])
prod_env_co2.plot(kind='line', x='BIOMASS_Ecoli_core_w_GAM', y=['flux_minimum', 'flux_maximum'])

### Pathway Visualization

COBRApy does not include any pathway visualization method. Nevertheless, independent tools that work with cobra models can be used for that goal. One such tool is Escher.

Escher allows the visualization of metabolic pathway maps. Escher maps can be built with the `escher.Builder` class, which here receives the model and the fluxes of an FBA solution.

To obtain the metabolic map for the model in question, Escher can be run as follows:

In [None]:
import escher

builder = escher.Builder(map_name='e_coli_core.Core metabolism', model=model, reaction_data=fba_solution.fluxes)
builder