# Getting Started

## Install Cobrapy Package

In [3]:
!pip install cobra

Collecting cobra
  Downloading https://files.pythonhosted.org/packages/d3/94/12ae5e185ad8e6bb71cd240988df3cb1378261d5bedcca3f28229cb0eb59/cobra-0.14.2-py2.py3-none-any.whl (1.6MB)
Collecting depinfo (from cobra)
  Downloading https://files.pythonhosted.org/packages/1d/a3/c5d91c1e91a0e3c61c19563e224bda4f22fc24b6a86ef1c7e00f66ea9c4e/depinfo-1.5.1-py2.py3-none-any.whl
Collecting swiglpk (from cobra)
  Downloading https://files.pythonhosted.org/packages/ff/37/b0375c9e9a1263820637050cdf8f9bf5cedff4e7fafd2f7f393b26c14150/swiglpk-4.65.0-cp36-cp36m-manylinux1_x86_64.whl (627kB)
Collecting ruamel.yaml>=0.15 (from cobra)
  Downloading https://files.pythonhosted.org/packages/36/e1/cc2fa400fa5ffde3efa834ceb15c464075586de05ca3c553753dcd6f1d3b/ruamel.yaml-0.15.89-cp36-cp36m-manylinux1_x86_64.whl (651kB)

## Loading a model and inspecting it

cobrapy comes with bundled models for _Salmonella_ and _E. coli_, as well as a "textbook" model of _E. coli_ core metabolism. To load a test model, type:

In [0]:
from __future__ import print_function

import cobra
import cobra.test

# "ecoli" and "salmonella" are also valid arguments
model = cobra.test.create_test_model("textbook")

The `reactions`, `metabolites`, and `genes` attributes of the cobrapy model are a special type of list called a `cobra.DictList`. They contain `cobra.Reaction`, `cobra.Metabolite`, and `cobra.Gene` objects, respectively.

In [5]:
print(len(model.reactions))
print(len(model.metabolites))
print(len(model.genes))

95
72
137


When using [Jupyter notebook](https://jupyter-notebook-beginner-guide.readthedocs.io/en/latest/) this type of information is rendered as a table.

In [6]:
model

| | |
|---|---|
| Name | e_coli_core |
| Memory address | 0x07f3e3b99a080 |
| Number of metabolites | 72 |
| Number of reactions | 95 |
| Objective expression | `1.0*Biomass_Ecoli_core - 1.0*Biomass_Ecoli_core_reverse_2cdba` |
| Compartments | cytosol, extracellular |


Just like a regular list, objects in the `DictList` can be retrieved by index. For example, to get the 30th reaction in the model (at index 29 because of [0-indexing](https://en.wikipedia.org/wiki/Zero-based_numbering)):

In [7]:
model.reactions[29]

| | |
|---|---|
| Reaction identifier | EX_glu__L_e |
| Name | L-Glutamate exchange |
| Memory address | 0x07f3e38248390 |
| Stoichiometry | `glu__L_e -->`<br>L-Glutamate --> |
| GPR | |
| Lower bound | 0.0 |
| Upper bound | 1000.0 |


Additionally, items can be retrieved by their `id` using the `DictList.get_by_id()` method. For example, to get the cytosolic ATP metabolite object (the id is `"atp_c"`), we can do the following:

In [8]:
model.metabolites.get_by_id("atp_c")

| | |
|---|---|
| Metabolite identifier | atp_c |
| Name | ATP |
| Memory address | 0x07f3e382b0828 |
| Formula | C10H12N5O13P3 |
| Compartment | c |
| In 13 reaction(s) | ACKr, PFK, PGK, Biomass_Ecoli_core, PYK, GLNS, SUCOAS, ATPM, ADK1, PPCK, GLNabc, ATPS4r, PPS |


As an added bonus, users with an interactive shell such as IPython can tab-complete the elements of a `DictList` as attributes. This is not recommended for most code, because ids may contain characters such as "-" that are not valid in Python attribute names, but it is very useful at an interactive prompt:

In [9]:
model.reactions.EX_glc__D_e.bounds

(-10.0, 1000.0)
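
The three access styles above can be pictured with a small stand-in container. This is a minimal sketch, not cobrapy's `DictList` implementation: `Item` and `MiniDictList` are hypothetical names, and the code only illustrates why attribute-style access breaks down for ids that are not valid Python identifiers.

```python
class Item:
    """Hypothetical stand-in for a cobra object that carries an `id`."""

    def __init__(self, identifier):
        self.id = identifier

    def __repr__(self):
        return "<Item %s>" % self.id


class MiniDictList(list):
    """A list that also supports lookup by id (simplified illustration)."""

    def get_by_id(self, identifier):
        # A linear scan keeps the sketch short; a production container
        # would more likely keep an id -> index dictionary.
        for item in self:
            if item.id == identifier:
                return item
        raise KeyError(identifier)

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails, so list
        # methods such as append() are unaffected.
        try:
            return self.get_by_id(name)
        except KeyError:
            raise AttributeError(name)


items = MiniDictList([Item("PGI"), Item("EX_glc__D_e"), Item("my-reaction")])

print(items[0])                        # access by index
print(items.get_by_id("my-reaction"))  # access by id works for any string
print(items.EX_glc__D_e)               # attribute access needs a valid identifier
# `items.my-reaction` would be parsed as `items.my - reaction`.
print("my-reaction".isidentifier())
```

In scripts, the `get_by_id()` style is the safer choice because it does not depend on what characters an id contains.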

## Reactions

We will consider the reaction glucose 6-phosphate isomerase, which interconverts glucose 6-phosphate and fructose 6-phosphate. The id for this reaction in our test model is `PGI`.

In [10]:
pgi = model.reactions.get_by_id("PGI")
pgi

| | |
|---|---|
| Reaction identifier | PGI |
| Name | glucose-6-phosphate isomerase |
| Memory address | 0x07f3e381aa240 |
| Stoichiometry | `g6p_c <=> f6p_c`<br>D-Glucose 6-phosphate <=> D-Fructose 6-phosphate |
| GPR | b4025 |
| Lower bound | -1000.0 |
| Upper bound | 1000.0 |


We can view the full name and the catalyzed reaction as strings:

In [11]:
print(pgi.name)
print(pgi.reaction)

glucose-6-phosphate isomerase
g6p_c <=> f6p_c


We can also view the reaction's upper and lower bounds. Because `pgi.lower_bound` < 0 and `pgi.upper_bound` > 0, `pgi` is reversible.

In [12]:
print(pgi.lower_bound, "< pgi <", pgi.upper_bound)
print(pgi.reversibility)

-1000.0 < pgi < 1000.0
True
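
The reversibility rule stated above can be written as a one-line predicate. This is a sketch under the assumption that reversibility depends only on the signs of the two bounds, as the text describes; `is_reversible` is a hypothetical helper, not part of cobrapy.

```python
def is_reversible(lower_bound, upper_bound):
    """Return True when flux may be negative as well as positive."""
    return lower_bound < 0 < upper_bound


print(is_reversible(-1000.0, 1000.0))  # bounds reported for PGI above
print(is_reversible(0.0, 1000.0))      # forward-only reaction
```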


We can also check that the reaction is mass balanced. `check_mass_balance()` returns the elements that violate mass balance. If it comes back empty, the reaction is mass balanced.

In [13]:
pgi.check_mass_balance()

{}

To add a metabolite, we pass in a `dict` that maps the metabolite object to its coefficient:

In [14]:
pgi.add_metabolites({model.metabolites.get_by_id("h_c"): -1})
pgi.reaction

'g6p_c + h_c <=> f6p_c'

The reaction is no longer mass balanced:

In [15]:
pgi.check_mass_balance()

{'H': -1.0, 'charge': -1.0}

We can remove the metabolite, and the reaction will be balanced once again.

In [16]:
pgi.subtract_metabolites({model.metabolites.get_by_id("h_c"): -1})
print(pgi.reaction)
print(pgi.check_mass_balance())

g6p_c <=> f6p_c
{}
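
To make the balance check used above more concrete, here is a minimal pure-Python sketch of the arithmetic involved: each element count and the charge are multiplied by the stoichiometric coefficient and summed over the reaction. It is not cobrapy's implementation. `parse_formula` and `mass_balance` are hypothetical helpers, and the formulas and charges in the example are assumptions chosen for illustration.

```python
import re
from collections import defaultdict


def parse_formula(formula):
    """Turn a formula such as 'C6H11O9P' into {'C': 6, 'H': 11, 'O': 9, 'P': 1}."""
    counts = defaultdict(int)
    for element, number in re.findall(r"([A-Z][a-z]*)(\d*)", formula):
        counts[element] += int(number) if number else 1
    return dict(counts)


def mass_balance(stoichiometry, metabolites):
    """Sum coefficient * (element count or charge) over all metabolites.

    stoichiometry: {metabolite id: coefficient}, negative for reactants
    metabolites:   {metabolite id: (formula, charge)}
    Returns only the entries that do not cancel; {} means balanced.
    """
    totals = defaultdict(float)
    for met_id, coefficient in stoichiometry.items():
        formula, charge = metabolites[met_id]
        totals["charge"] += coefficient * charge
        for element, count in parse_formula(formula).items():
            totals[element] += coefficient * count
    return {key: value for key, value in totals.items() if value != 0}


# Formulas and charges below are assumptions made for this illustration.
metabolites = {
    "g6p_c": ("C6H11O9P", -2),
    "f6p_c": ("C6H11O9P", -2),
    "h_c": ("H", 1),
}

print(mass_balance({"g6p_c": -1, "f6p_c": 1}, metabolites))
print(mass_balance({"g6p_c": -1, "h_c": -1, "f6p_c": 1}, metabolites))
```

Because the proton sits on the reactant side with coefficient -1, one hydrogen and one unit of charge are left unmatched, which is the same imbalance reported above.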


It is also possible to build the reaction from a string. However, care must be taken to ensure that the metabolite ids in the string match those in the model, because any unknown id creates a new metabolite. The direction of the arrow is also used to update the upper and lower bounds.

In [17]:
pgi.reaction = "g6p_c --> f6p_c + h_c + green_eggs + ham"

unknown metabolite 'green_eggs' created
unknown metabolite 'ham' created


In [18]:
pgi.reaction

'g6p_c --> f6p_c + green_eggs + h_c + ham'

In [19]:
pgi.reaction = "g6p_c <=> f6p_c"
pgi.reaction

'g6p_c <=> f6p_c'
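
The string-based construction above can be approximated with a small parser. The sketch below is a simplified illustration, not cobrapy's parser: `parse_side`, `parse_reaction`, and the `ARROWS` table are hypothetical, and the mapping from arrow to bounds (with a magnitude of 1000) is an assumption based on the bounds shown earlier in this section.

```python
import re

# Arrow -> (lower bound, upper bound); assumed mapping for this sketch.
ARROWS = {"<=>": (-1000.0, 1000.0), "-->": (0.0, 1000.0), "<--": (-1000.0, 0.0)}


def parse_side(text, sign):
    """Parse 'a + 2 b' into {'a': sign * 1.0, 'b': sign * 2.0}."""
    coefficients = {}
    for term in text.split(" + "):
        term = term.strip()
        if not term:
            continue  # empty side, as in an exchange reaction
        match = re.fullmatch(r"(?:(\d+(?:\.\d+)?)\s+)?(\S+)", term)
        if match is None:
            raise ValueError("cannot parse term %r" % term)
        amount = float(match.group(1)) if match.group(1) else 1.0
        coefficients[match.group(2)] = sign * amount
    return coefficients


def parse_reaction(reaction_string):
    """Return (stoichiometry dict, (lower bound, upper bound))."""
    for arrow, bounds in ARROWS.items():
        if arrow in reaction_string:
            left, right = reaction_string.split(arrow)
            stoichiometry = parse_side(left, -1)
            stoichiometry.update(parse_side(right, +1))
            return stoichiometry, bounds
    raise ValueError("no arrow found in %r" % reaction_string)


stoichiometry, bounds = parse_reaction("g6p_c --> f6p_c + h_c + green_eggs + ham")
known = {"g6p_c", "f6p_c", "h_c"}  # ids assumed to exist in the model
print(sorted(set(stoichiometry) - known))  # ids that would have to be created
print(bounds)
```

Comparing the parsed ids against the ids already in the model, as in the last lines, is one way to catch a typo before it silently creates a new metabolite.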

## Metabolites

We will consider cytosolic ATP as our metabolite, which has the id `"atp_c"` in our test model.

In [20]:
atp = model.metabolites.get_by_id("atp_c")
atp

| | |
|---|---|
| Metabolite identifier | atp_c |
| Name | ATP |
| Memory address | 0x07f3e382b0828 |
| Formula | C10H12N5O13P3 |
| Compartment | c |
| In 13 reaction(s) | ACKr, PFK, PGK, Biomass_Ecoli_core, PYK, GLNS, SUCOAS, ATPM, ADK1, PPCK, GLNabc, ATPS4r, PPS |


We can print the metabolite name and compartment (cytosol in this case) directly as strings.

In [21]:
print(atp.name)
print(atp.compartment)

ATP
c


We can see that ATP is a charged molecule in our model.

In [22]:
atp.charge

-4

We can see the chemical formula for the metabolite as well.

In [23]:
print(atp.formula)

C10H12N5O13P3


The `reactions` attribute gives a `frozenset` of all reactions that use the given metabolite. We can use this to count the number of reactions that use ATP.

In [24]:
len(atp.reactions)

13

A metabolite like glucose 6-phosphate will participate in fewer reactions.

In [25]:
model.metabolites.get_by_id("g6p_c").reactions

frozenset({<Reaction Biomass_Ecoli_core at 0x7f3e3821fc88>,
           <Reaction G6PDH2r at 0x7f3e381e3198>,
           <Reaction GLCpts at 0x7f3e381e3be0>,
           <Reaction PGI at 0x7f3e381aa240>})

## Genes

The `gene_reaction_rule` is a Boolean representation of the gene requirements for a reaction to be active, as described in [Schellenberger et al. 2011 Nature Protocols 6(9):1290-307](http://dx.doi.org/doi:10.1038/nprot.2011.308).

The GPR is stored as a string in the `gene_reaction_rule` attribute of a `Reaction` object.

In [26]:
gpr = pgi.gene_reaction_rule
gpr

'b4025'

Corresponding gene objects also exist. These objects are tracked by the reaction itself, as well as by the model:

In [27]:
pgi.genes

frozenset({<Gene b4025 at 0x7f3e3827f278>})

In [28]:
pgi_gene = model.genes.get_by_id("b4025")
pgi_gene

| | |
|---|---|
| Gene identifier | b4025 |
| Name | pgi |
| Memory address | 0x07f3e3827f278 |
| Functional | True |
| In 1 reaction(s) | PGI |


Each gene keeps track of the reactions it catalyzes:

In [29]:
pgi_gene.reactions

frozenset({<Reaction PGI at 0x7f3e381aa240>})

Altering the `gene_reaction_rule` will create new gene objects if necessary and update all relationships.

In [30]:
pgi.gene_reaction_rule = "(spam or eggs)"
pgi.genes

frozenset({<Gene eggs at 0x7f3e38206208>, <Gene spam at 0x7f3e38206be0>})

In [31]:
pgi_gene.reactions

frozenset()

Newly created genes are also added to the model:

In [32]:
model.genes.get_by_id("spam")

| | |
|---|---|
| Gene identifier | spam |
| Name | |
| Memory address | 0x07f3e38206be0 |
| Functional | True |
| In 1 reaction(s) | PGI |


The `delete_model_genes` function will evaluate the GPR and set the upper and lower bounds to 0 if the reaction is knocked out. This function can preserve existing deletions or reset them using the `cumulative_deletions` flag.

In [33]:
cobra.manipulation.delete_model_genes(
    model, ["spam"], cumulative_deletions=True)
print("after 1 KO: %4d < flux_PGI < %4d" % (pgi.lower_bound, pgi.upper_bound))

cobra.manipulation.delete_model_genes(
    model, ["eggs"], cumulative_deletions=True)
print("after 2 KO:  %4d < flux_PGI < %4d" % (pgi.lower_bound, pgi.upper_bound))

after 1 KO: -1000 < flux_PGI < 1000
after 2 KO:     0 < flux_PGI <    0
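
The knockout logic above comes down to evaluating the GPR as a Boolean expression in which deleted genes are false. The following is a minimal sketch of that idea, not cobrapy's implementation: `gpr_is_active` and `bounds_after_knockout` are hypothetical helpers, and the sketch assumes gene ids are valid Python identifiers so that the standard `ast` module can parse the rule.

```python
import ast


def gpr_is_active(rule, knocked_out):
    """Evaluate a GPR string, treating knocked-out genes as False."""
    if not rule.strip():
        return True  # no gene requirement, e.g. an exchange reaction

    def evaluate(node):
        if isinstance(node, ast.BoolOp):
            values = [evaluate(child) for child in node.values]
            return all(values) if isinstance(node.op, ast.And) else any(values)
        if isinstance(node, ast.Name):
            return node.id not in knocked_out
        raise ValueError("unsupported GPR syntax")

    return evaluate(ast.parse(rule, mode="eval").body)


def bounds_after_knockout(rule, knocked_out, original=(-1000, 1000)):
    """Return (0, 0) if the rule evaluates to False, else the original bounds."""
    return original if gpr_is_active(rule, knocked_out) else (0, 0)


rule = "(spam or eggs)"
deleted = set()  # grows across iterations, like cumulative deletions
for gene in ["spam", "eggs"]:
    deleted.add(gene)
    print(sorted(deleted), bounds_after_knockout(rule, deleted))
```

With an `or` rule the reaction stays active until every alternative gene is deleted, which is why the bounds only drop to zero after the second knockout. An `and` rule would be disabled by the first one.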


The `undelete_model_genes` function can be used to reset a gene deletion:

In [34]:
cobra.manipulation.undelete_model_genes(model)
print(pgi.lower_bound, "< pgi <", pgi.upper_bound)

-1000 < pgi < 1000
