## Introduction to COBRApy

First, we will load a model for E. coli, and then we will take a look at a few reactions and metabolites. Later, we will run linear programming optimizations and interpret the results. Then we will perform experiments and see how metabolism is affected.

To install cobrapy and the visualization tools, create a new conda environment and use pip:

```
conda create -n cobra matplotlib numpy scipy pandas sympy jupyterlab nodejs
conda activate cobra
pip install cobra
pip install escher
jupyter labextension install @jupyter-widgets/jupyterlab-manager
jupyter labextension install escher
```

In [None]:
from cobra.io import read_sbml_model

This module can read models in SBML format. See the docs for functions to load from .mat format among others.

In [None]:
model = read_sbml_model('data/iJO1366.xml.gz')

In [None]:
model

We see that this model contains over two thousand reactions. It is set to optimize a biomass equation.

While you can put commands in your notebook, it will be a lot easier to create an interactive console linked to the notebook while you figure out how to manipulate the model. Right click on the notebook name above, and select "New console for notebook." You can type commands or whole code blocks there, and run them with `shift+enter`.

A model is an object that is a collection of metabolite objects, reaction objects, compartment objects, and methods (like optimize). You can search for metabolites or reactions with the query method.

`model.metabolites` lists all metabolites in the model. There are a lot! Let's try to find the code for fructose.

We can search for those containing fru in the name as shown below.

In [None]:
model.metabolites.query("fru")

Ok, this narrows it down. But it still isn't clear exactly what each code is. Each metabolite has a list of attributes, including name. Let's use a list comprehension to iterate through the results in the cell above and print the name for each.

In [None]:
[print(meta,":",meta.name) for meta in model.metabolites.query("fru")]

We see that fru_c, fru_e, and fru_p are all fructose. What is the difference? "_c", "_e", and "_p" tell us that they represent fructose in the cytoplasm, in the extracellular space, and in the periplasm, respectively. Let's look up the reactions that involve fructose in the cytoplasm. These are listed as an attribute of the metabolite.

In [None]:
model.metabolites.fru_c.reactions

Ok, four reactions, but the names aren't always so useful. Let's use another loop to get names. This time, I'm using a for loop, but really it is the same as list comprehension.

In [None]:
for reaction in model.metabolites.fru_c.reactions:
    print(reaction, reaction.name)

We see some nice transformations here, but how does this link to fructose in the periplasm? Let's see what reactions that state is involved in.

In [None]:
for reaction in model.metabolites.fru_p.reactions:
    print(reaction, reaction.name)

These reactions tell us something interesting about E. coli: there are no direct fructose transporters! Instead, we see in the FRUptspp and FRUpts2pp reactions that periplasmic fructose is transported into the cytosol coincident with its phosphorylation. This uses cytoplasmic phosphoenolpyruvate (PEP). For completeness, we should expect to see a zero-order exchange reaction introducing fru_e into the system. Let's check.

In [None]:
for reaction in model.metabolites.fru_e.reactions:
    print(reaction, reaction.name)

Many models use negative values for metabolite uptake into the system, so here, if the flux through `fru_e -->` were -10, it would mean fructose is entering the system.
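The sign convention can be spelled out with a small helper. This is a minimal sketch, assuming the exchange reaction is written as `met_e -->` so that negative flux means uptake. The function name `describe_exchange`, the tolerance, and the flux values in the loop are illustrative and are not part of cobrapy or output from the model.

```python
def describe_exchange(flux, tol=1e-9):
    """Interpret the flux of an exchange reaction written as `met_e -->`."""
    if flux < -tol:
        return "uptake"      # reaction runs in reverse: metabolite enters the system
    if flux > tol:
        return "secretion"   # reaction runs forward: metabolite leaves the system
    return "inactive"        # flux is zero within tolerance


# Hypothetical flux values, chosen only to show the three cases
for rxn_id, flux in {"EX_fru_e": -10.0, "EX_ac_e": 2.5, "EX_lac__L_e": 0.0}.items():
    print(rxn_id, flux, describe_exchange(flux))
```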

# Performing flux balance analysis

Let's take a closer look at the default objective function for this model, and then run a simulation and look at the resulting fluxes.

In [None]:
model.objective.expression

We seem to be optimizing for a core biomass equation. Notice how we are optimizing for the net forward direction. Let's take a closer look at what the biomass equation is:

In [None]:
model.reactions.BIOMASS_Ec_iJO1366_core_53p95M.reaction

Complicated! Biomass functions like this are very carefully considered and tuned to match experimental conditions. This expression captures what it takes for *E. coli* to grow and divide. 

Notice how this equation is not mass balanced! Metabolites like amino acids (met, his, ile, gln, etc.) are consumed and are lost from the system.
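To make "not mass balanced" concrete, here is a minimal sketch of an elemental balance check on toy reactions. It assumes simple formulas without brackets or charges, and the helpers `parse_formula` and `element_imbalance` are written for this illustration only. For a real model, cobrapy reactions provide a `check_mass_balance()` method.

```python
import re
from collections import Counter


def parse_formula(formula):
    """Turn a simple formula such as 'C6H12O6' into element counts."""
    counts = Counter()
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] += int(number) if number else 1
    return counts


def element_imbalance(stoichiometry):
    """Sum elements over (formula, coefficient) pairs; reactants are negative."""
    total = Counter()
    for formula, coefficient in stoichiometry:
        for element, count in parse_formula(formula).items():
            total[element] += coefficient * count
    # Keep only the elements that do not cancel out
    return {element: n for element, n in total.items() if n != 0}


# glucose + ATP --> glucose 6-phosphate + ADP + H+ (formulas of the charged species)
hexokinase = [("C6H12O6", -1), ("C10H12N5O13P3", -1),
              ("C6H11O9P", 1), ("C10H12N5O10P2", 1), ("H", 1)]
print(element_imbalance(hexokinase))  # balanced: empty dict

# A drain that consumes alanine and produces nothing, like one term of a biomass equation
alanine_drain = [("C3H7NO2", -1)]
print(element_imbalance(alanine_drain))
```

An empty result means every element cancels. The drain leaves negative counts, which is the sense in which biomass precursors are "lost from the system."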

## Constraints

Ok, so our stoichiometric matrix derives from the reactions, and our objective function is defined. What about our constraints? These are encoded in the reaction objects. You can view or modify them. Let's look at the bounds for the F6PP reaction.

In [None]:
model.reactions.F6PP.bounds

1000 is a fairly high upper bound, practically unbounded. Scroll up a few blocks to where we looked at the F6PP reaction. Notice that it is irreversible? That is consistent with the lower bound 0.

For fun, set the lower bound to a negative number (you can use the `=` operator and assign a tuple like `(-10, 1000)` to `model.reactions.F6PP.bounds`). Then, view the reaction again with `model.reactions.F6PP.reaction` and see what happens.
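The link between bounds and the arrow in the reaction string can be sketched in a few lines. This is an illustration of the idea, not cobrapy's own code: `arrow_for_bounds` is a hypothetical helper, and it assumes a reaction counts as reversible when its lower bound is negative and its upper bound is positive.

```python
def arrow_for_bounds(lower, upper):
    """Pick a reaction arrow from flux bounds (illustrative, not cobrapy's code)."""
    if lower < 0 < upper:
        return "<=>"   # flux may run in either direction
    if lower < 0 and upper <= 0:
        return "<--"   # flux may only run in reverse
    return "-->"       # flux may only run forward


print(arrow_for_bounds(0, 1000))     # irreversible, as F6PP starts out
print(arrow_for_bounds(-10, 1000))   # after allowing negative flux
```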

## Media

The growth media or surrounding environment can be defined by exchange reactions. You can view these reactions as below.

In [None]:
model.medium

We see that many micronutrients are in abundance. Glucose, however, is limiting, with a maximum uptake of 10, and no other carbon source is supplied. Importantly, O2 is not limiting, so this is aerobic growth. EX_cbl1_e is also interesting; let's see what this is.

In [None]:
model.metabolites.cbl1_e.name

Cobalamin is vitamin B12.
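It can help to see how a medium relates to the bounds discussed earlier. The sketch below assumes each exchange reaction is written as `met_e -->`, so a maximum uptake of 10 corresponds to a lower bound of -10. The helper `medium_to_bounds`, its `max_secretion` default, and the two-entry medium are illustrative and are not taken from cobrapy.

```python
def medium_to_bounds(medium, max_secretion=1000.0):
    """Map {exchange_id: max_uptake} to {exchange_id: (lower_bound, upper_bound)}."""
    return {
        rxn_id: (0.0 - float(max_uptake), max_secretion)  # uptake is negative flux
        for rxn_id, max_uptake in medium.items()
    }


# Illustrative medium: limited glucose, effectively unlimited oxygen
toy_medium = {"EX_glc__D_e": 10.0, "EX_o2_e": 1000.0}
print(medium_to_bounds(toy_medium))

# Removing oxygen from the medium closes the uptake direction only
toy_medium["EX_o2_e"] = 0.0
print(medium_to_bounds(toy_medium)["EX_o2_e"])
```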

Enough chit-chat, let's optimize this cell.

In [None]:
solution = model.optimize()

In [None]:
solution

The solution object contains fluxes and reduced costs for each reaction, and shadow prices for each metabolite. We see that the objective solved to 0.982. That's cool, but without some graphs or variables, it doesn't tell us much.
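For reference, the problem solved by `model.optimize()` is usually written as the linear program below. This is the standard textbook form of flux balance analysis; the symbols are introduced here for illustration and are not names used elsewhere in this notebook.

$$
\begin{aligned}
\max_{v}\quad & c^{\top} v \\
\text{subject to}\quad & S v = 0, \\
& l \le v \le u
\end{aligned}
$$

Here $S$ is the stoichiometric matrix (metabolites by reactions), $v$ is the vector of reaction fluxes, $c$ selects the objective (in this model, the biomass reaction), and $l$ and $u$ are the lower and upper bounds stored on each reaction. The constraint $S v = 0$ is the steady-state assumption: every metabolite is produced as fast as it is consumed.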

If you installed escher and used the Jupyter widget, we can plot our fluxes.

In [None]:
import escher

In [None]:
escher.list_available_maps() # Let's see what models come with escher by default

In [None]:
escher.Builder(map_name='iJO1366.Central metabolism',
                   reaction_data=dict(solution.fluxes))

Pretty cool, eh? Let's try something. What if the cell didn't have oxygen?

In [None]:
newmodel = model.copy()


In [None]:
medium = newmodel.medium
medium["EX_o2_e"] = 0.0
newmodel.medium = medium

newmodel.medium

In [None]:
anaerobic = newmodel.optimize()

In [None]:
anaerobic

Growth rate took a big hit! Let's map it.

In [None]:
escher.Builder(map_name='iJO1366.Central metabolism',
                   reaction_data=dict(anaerobic.fluxes))

Pretty cool. We see a huge increase in glycolysis. Makes sense!
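A map is good for an overview, but a ranked list of the biggest changes can be easier to read. The sketch below is one way to compare two flux dictionaries, such as `dict(solution.fluxes)` and `dict(anaerobic.fluxes)`. The helper `largest_flux_changes` is hypothetical, and the numbers are toy values for illustration, not output from iJO1366.

```python
def largest_flux_changes(before, after, n=5):
    """Return the n reactions whose flux changed most between two flux dicts."""
    changes = {
        rxn_id: after.get(rxn_id, 0.0) - before.get(rxn_id, 0.0)
        for rxn_id in set(before) | set(after)
    }
    # Rank by the size of the change, largest first
    return sorted(changes.items(), key=lambda item: abs(item[1]), reverse=True)[:n]


# Toy numbers for illustration only
aerobic_toy = {"PGI": 5.0, "PFK": 7.0, "CYTBO3_4pp": 30.0}
anaerobic_toy = {"PGI": 9.0, "PFK": 9.5, "CYTBO3_4pp": 0.0}
print(largest_flux_changes(aerobic_toy, anaerobic_toy, n=2))
```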

Now let's extract specific fluxes from the simulation and see how they vary with constraints. Let's test the effect of glucose availability in the medium. Let's also see how much oxygen is taken up by the cells, and how much acetate and lactate are secreted.

In [None]:
medium = model.medium
testvalues = [0.01, 0.1, 0.5, 1, 5, 10, 50, 100, 200, 500, 1000]
output = [] # empty lists to start collecting output
oxygen = []
acetate = []
lactate = []
with model: # doesn't overwrite the original model
    for i in testvalues:
        medium['EX_glc__D_e'] = i
        model.medium = medium
        solution = model.optimize()
        output.append(solution.objective_value)
        oxygen.append(-solution.fluxes['EX_o2_e'])
        acetate.append(solution.fluxes['EX_ac_e'])
        lactate.append(solution.fluxes['EX_lac__L_e'])

In [None]:
import matplotlib.pyplot as plt

In [None]:
plt.plot(testvalues, output, "o-")
plt.xlabel("Maximum Glucose Uptake")
plt.ylabel("Growth rate")
plt.title("Media composition and growth")

In [None]:
plt.plot(testvalues, oxygen, "o-", label = "Oxygen uptake")
plt.plot(testvalues, lactate, "o-", label = "Lactate sec.")
plt.plot(testvalues, acetate, "o-", label = "Acetate sec.")
plt.legend()
plt.xlabel("Maximum Glucose Uptake")
plt.ylabel("Flux")
plt.title("Media composition and oxygen consumption")

Let's try something else. Let's keep the glucose availability in the medium constant, and then force the cells to take up increasing amounts of oxygen. Oxygen is good, right?

In [None]:
testvalues = [0.01, 0.1, 0.5, 1, 5, 10, 50, 100, 200, 250, 260, 280, 300]
output = [] # growth rate at each forced oxygen uptake
with model: # doesn't overwrite the original model
    for i in testvalues:
        model.reactions.EX_o2_e.bounds = (-i, -i) # Negative means O2 is entering the cell
        solution = model.optimize()
        output.append(solution.objective_value)

In [None]:
plt.plot(testvalues, output, "o-")
plt.xlabel("Oxygen Uptake")
plt.ylabel("Growth")
plt.title("Oxygen uptake and growth")

In [None]:
from cobra.flux_analysis import production_envelope
prod_env = production_envelope(
    model, ["EX_o2_e"], objective="EX_ac_e", carbon_sources="EX_glc__D_e")

In [None]:
prod_env.plot(
    kind='line', x='EX_o2_e', y='carbon_yield_maximum');

In [None]:
prod_env

The production envelope gives us the same information as our manual experiment. Note that the sign for oxygen uptake is inverted in the COBRA function; in my code I added a negative sign so that uptake appears as a positive value. The exchange reaction is written with metabolites leaving the system, so fluxes appear negative when metabolites enter.