# Using an SBML model

## Getting started

### Installing libraries

Before you start, you will need to install a couple of libraries:

The [ModelSEEDDatabase](https://github.com/ModelSEED/ModelSEEDDatabase) has all the biochemistry we'll need. You can install that with `git clone`.

The [PyFBA](http://linsalrob.github.io/PyFBA) library has detailed [installation instructions](http://linsalrob.github.io/PyFBA/installation.html). Don't be scared, it's mostly just `pip install`.

(Optional) Also, get the [SEED Servers](https://github.com/linsalrob/SEED_Servers_Python), as you can get a lot of information from them. You can clone the Python repository from GitHub. Make sure that the SEED_Servers_Python directory is in your PYTHONPATH.
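Before going further, it can help to confirm that Python can actually find the libraries. The following is a minimal sketch that uses only the standard library; the helper name `is_installed` is made up for this example, and `PyFBA` is the module name imported later in this notebook.

```python
import importlib.util


def is_installed(module_name):
    """Return True if Python can locate the named top-level module."""
    return importlib.util.find_spec(module_name) is not None


# "PyFBA" is the module imported in the next cell of this notebook
for name in ["PyFBA"]:
    status = "found" if is_installed(name) else "NOT found (check your install or PYTHONPATH)"
    print(f"{name}: {status}")
```

If a module is reported as not found, revisit the installation instructions or your PYTHONPATH before continuing.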

We start by importing the modules that we are going to use.

We import *sys* so that we can use standard out and standard error if we have some error messages.<br>
We import *copy* so that we can make a deep copy of data structures for later comparisons.<br>
We import *pickle* so that we can export parts of the model and load them again later.<br>
Then we import the *PyFBA* module to get started.

In [1]:
import sys
import os
import copy
import PyFBA
import pickle

## Sharing the data

If you set this variable to `True`, we will export some of the data as either `txt` or `pickle` files, which you can then import into other notebooks to explore the data.

In [2]:
share_data = True

## Running an SBML model

If you have run your genome through RAST, you can download the [SBML](http://www.sbml.org/) model and use that directly.

We have provided an [SBML model of *Citrobacter sedlakii*](https://raw.githubusercontent.com/linsalrob/PyFBA/master/example_data/Citrobacter/Citrobacter_sedlakii.sbml) that you can download and use. You can right-click (or ctrl-click) on this link and save the SBML file in the same location you are running this iPython notebook.

We use this SBML model to demonstrate the key points of the FBA approach: defining the reactions, including the boundary, or drain flux, reactions; the compounds, including the drain compounds; the media; and the reaction bounds.

We'll take it step by step!

We start by parsing the model:

In [3]:
sbml = PyFBA.parse.parse_sbml_file("../example_data/Citrobacter/Citrobacter_sedlakii.sbml")

We are logging to /home/redwards/GitHubsLinux/PyFBA/iPythonNotebooks/logs/PyFBA.2021-06-16T10:19:22.389049.log


### Find all the reactions and identify those that are boundary reactions

We need a set of reactions to run in the model. In this case, we are going to run all the reactions in our SBML file. However, you can change this set if you want to knock out reactions, add reactions, or generally modify the model. We store those in the `reactions_to_run` set.

The boundary reactions involve compounds that are taken up or secreted, and they need to be kept out of the `reactions_to_run` set. We usually include an open-ended consumption of those compounds, as if they are draining away. We store those reactions in the `uptake_secretion_reactions` dictionary.


In [4]:
# Get a dict of reactions.
# The key is the reaction ID, and the value is a metabolism.reaction.Reaction object
reactions = sbml.reactions
reactions_to_run = set()
uptake_secretion_reactions = {}
biomass_equation = None
for r in reactions:
    if 'biomass_equation' == r:
        biomass_equation = reactions[r]
        print(f"Our biomass equation is {biomass_equation.readable_name}")
        continue
    is_boundary = False
    for c in reactions[r].all_compounds():
        if c.uptake_secretion:
            is_boundary = True
            break
    if is_boundary:
        reactions[r].is_uptake_secretion = True
        uptake_secretion_reactions[r] = reactions[r]
    else:
        reactions_to_run.add(r)

Our biomass equation is Citrobacter_sedlakii_119_auto_biomass
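As noted above, you can change the set of reactions to run if you want to knock out reactions. Because it is a plain Python set, a knockout can be sketched as a set difference. The reaction IDs below are hypothetical and only stand in for the real IDs from the SBML file.

```python
# Hypothetical reaction IDs, used only for illustration
toy_reactions_to_run = {"rxn_A", "rxn_B", "rxn_C", "rxn_D"}
knockouts = {"rxn_B", "rxn_X"}  # rxn_X is not in the toy set, so it has no effect

# Set difference returns a new set and leaves the original untouched,
# so the two can still be compared afterwards
reactions_after_knockout = toy_reactions_to_run - knockouts
print(sorted(reactions_after_knockout))
```

Keeping the original set unchanged makes it easy to run the model with and without the knockout and compare the results.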


At this point, we can take a look at how many reactions are in the model. The total includes the biomass reaction, which is in neither the uptake/secretion dictionary nor the `reactions_to_run` set:

In [5]:
print(f"The biomass equation is {biomass_equation}")
print(f"There are {len(reactions)} reactions in the model")
print(f"There are {len(uptake_secretion_reactions)} uptake/secretion reactions in the model")
print(f"There are {len(reactions_to_run)} reactions to be run in the model")

The biomass equation is biomass_equation: Citrobacter_sedlakii_119_auto_biomass
There are 1574 reactions in the model
There are 174 uptake/secretion reactions in the model
There are 1399 reactions to be run in the model
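The counts above are consistent with each other: every reaction is either an uptake/secretion reaction, a reaction to run, or the single biomass equation that the loop skips with `continue`. A small check, using the numbers copied from the output above:

```python
# Counts copied from the output above
total_reactions = 1574
uptake_secretion = 174
to_run = 1399
biomass = 1  # the single biomass equation, skipped by `continue` in the loop

assert uptake_secretion + to_run + biomass == total_reactions
print(f"{uptake_secretion} + {to_run} + {biomass} = {total_reactions}")
```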


In [6]:
if share_data:
    with open('sbml_reactions.txt', 'w') as out:
        for r in reactions:
            out.write(f"{r}\n")
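To use the exported list in another notebook, the file can be read back one ID per line. This is a minimal sketch with hypothetical reaction IDs, written to a temporary directory so that it does not touch your real `sbml_reactions.txt`.

```python
import os
import tempfile

toy_reaction_ids = ["rxn_A", "rxn_B", "rxn_C"]  # hypothetical IDs

with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "sbml_reactions.txt")
    # write one reaction ID per line, as in the cell above
    with open(path, "w") as out:
        for r in toy_reaction_ids:
            out.write(f"{r}\n")
    # read the IDs back, dropping the newline and skipping empty lines
    with open(path) as fin:
        loaded_ids = [line.strip() for line in fin if line.strip()]

print(loaded_ids)
```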

### Find all the compounds in the model, and filter out those that are secreted

We need to filter out uptake and secretion compounds from our list of all compounds before we can make a stoichiometric matrix.

In [7]:
all_compounds = sbml.compounds
# Filter out compounds that are boundary (uptake/secretion) compounds
filtered_compounds = set()
for c in all_compounds:
    if not c.uptake_secretion:
        filtered_compounds.add(c)

Again, we can see how many compounds there are in the model.

In [8]:
print(f"There are {len(all_compounds)} total compounds in the model")
print(f"There are {len(filtered_compounds)} compounds that are not involved in uptake and secretion")

There are 1475 total compounds in the model
There are 1301 compounds that are not involved in uptake and secretion
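The same filter can be written as a set comprehension. The sketch below uses a toy `ToyCompound` type invented for this example; only the `uptake_secretion` attribute mirrors the attribute used in the loop above. It also shows that the two counts above differ by 174, the same as the number of uptake/secretion reactions.

```python
from collections import namedtuple

# Toy stand-in for a compound object (hypothetical, for illustration only)
ToyCompound = namedtuple("ToyCompound", ["name", "uptake_secretion"])

toy_compounds = {
    ToyCompound("glucose_c", False),
    ToyCompound("glucose_b", True),
    ToyCompound("atp_c", False),
}

# Keep only the compounds that are not uptake/secretion compounds
toy_filtered = {c for c in toy_compounds if not c.uptake_secretion}
print(f"{len(toy_compounds)} total, {len(toy_filtered)} after filtering")

# With the counts printed above, this many compounds were filtered out
removed = 1475 - 1301
print(removed)
```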


And now we have the size of our stoichiometric matrix! Notice that the stoichiometric matrix is composed of the reactions that we are going to run and the compounds that are in those reactions (but not the uptake/secretion reactions and compounds).

In [9]:
print(f"The stoichiometric matrix will be {len(reactions_to_run):,} reactions by {len(filtered_compounds):,} compounds")

The stoichiometric matrix will be 1,399 reactions by 1,301 compounds
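To make the idea of a stoichiometric matrix concrete, here is a minimal sketch in plain Python with three hypothetical reactions. It is not how PyFBA builds its matrix internally; it only illustrates the layout, with one row per compound and one column per reaction.

```python
# Hypothetical reactions: compound -> stoichiometric coefficient
# (negative = consumed, positive = produced)
toy_reactions = {
    "rxn_1": {"A": -1, "B": 1},
    "rxn_2": {"B": -1, "C": 2},
    "rxn_3": {"A": -1, "C": 1},
}

compound_order = sorted({c for eq in toy_reactions.values() for c in eq})
reaction_order = sorted(toy_reactions)

# One row per compound, one column per reaction;
# a compound that is absent from a reaction gets a coefficient of 0
matrix = [
    [toy_reactions[r].get(c, 0) for r in reaction_order]
    for c in compound_order
]

for c, row in zip(compound_order, matrix):
    print(c, row)
```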


### Read the media file, and correct the media names

In our [media](https://github.com/linsalrob/PyFBA/tree/master/media) directory, we have a lot of different media formulations, most of which we use with the Genotype-Phenotype project. For this example, we are going to use Lysogeny Broth (LB). There are many different formulations of LB, but we have included the recipe created by the folks at Argonne so that it is comparable with their analysis. You can download [ArgonneLB.txt](https://raw.githubusercontent.com/linsalrob/PyFBA/master/media/ArgonneLB.txt) and put it in the same directory as this iPython notebook to run it.

Once we have read the file we need to correct the names in the compounds. Sometimes when compound names are exported to the SBML file they are modified slightly. This just corrects those names.

In [10]:
# Read the media file
#media = PyFBA.parse.read_media_file("/home/redwards/.local/lib/python3.9/site-packages/PyFBA-2.1-py3.9.egg/PyFBA/Biochemistry/media/ArgonneLB.txt")
# mediafile = "MOPS_NoC_L-Methionine"
mediafile = 'ArgonneLB'
# mediafile = 'MOPS_NoC_D-Glucose'
media = PyFBA.parse.pyfba_media(mediafile)
# Correct the names
media = sbml.correct_media(media)
print(f"The media has {len(media)} compounds")

The media has 65 compounds


Checking media compounds: Our compounds do not include  Vitamin B12
Checking media compounds: Our compounds do not include  chromate
Checking media compounds: Our compounds do not include  Molybdate
Checking media compounds: Our compounds do not include  L-Cystine
Checking media compounds: Our compounds do not include  Fe3+
Checking media compounds: Our compounds do not include  Ni2+
Checking media compounds: Our compounds do not include  Thiamine phosphate

These messages just mean that we did not find those compounds anywhere in the reactions, and so they are unlikely to be needed or used. We typically see a few of these in rich media.
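The check behind these messages amounts to a set difference between the media compounds and the compounds in the model. The following sketch uses hypothetical compound names and plain strings; it illustrates the idea and is not PyFBA's own implementation.

```python
# Hypothetical names, for illustration only
toy_media = {"D-Glucose", "Vitamin B12", "Fe3+", "L-Alanine"}
toy_model_compounds = {"D-Glucose", "L-Alanine", "ATP", "H2O"}

# Media compounds that do not appear anywhere in the model
missing = sorted(toy_media - toy_model_compounds)
for name in missing:
    print(f"Our compounds do not include {name}")
```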


### Set the reaction bounds for uptake/secretion compounds

The uptake and secretion compounds typically have reaction bounds that allow them to be consumed (i.e. diffuse away from the cell) but not produced. However, our media components can also increase in concentration (i.e. diffuse to the cell), so their reactions keep a lower bound of -1000 while all other uptake/secretion reactions have their lower bound set to 0. Whenever you change the growth media, you also need to adjust the reaction bounds to ensure that the media can be consumed!


In [11]:
# Adjust the lower bounds of uptake secretion reactions
# for things that are not in the media
mcr = 0
for u in uptake_secretion_reactions:
    # just reset the bounds in case we change media and re-run this block
    reactions[u].lower_bound = -1000.0
    uptake_secretion_reactions[u].lower_bound = -1000.0
    reactions[u].upper_bound = 1000.0
    uptake_secretion_reactions[u].upper_bound = 1000.0
    # a reaction counts as a media reaction if any of its compounds is in the media
    is_media_component = any(c in media for c in uptake_secretion_reactions[u].all_compounds())

    if is_media_component:
        mcr += 1
    else:
        reactions[u].lower_bound = 0.0
        uptake_secretion_reactions[u].lower_bound = 0.0
        # these are the reactions that allow the media components to flux
        # print(f"{u} {sbml.reactions[u].equation}  ({sbml.reactions[u].lower_bound}, {sbml.reactions[u].upper_bound})")
print(f"There are {mcr} reactions (out of {len(uptake_secretion_reactions)}) with a media component")

There are 46 reactions (out of 174) with a media component
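The rule applied in the cell above can be summarised as a small function. This is a sketch with hypothetical reaction and compound names; the helper `bounds_for` is invented for this example, and the bound values are the ones used in the cell above.

```python
def bounds_for(reaction_compounds, media):
    """Return (lower, upper) bounds for one uptake/secretion reaction."""
    if any(c in media for c in reaction_compounds):
        # media component: flux is allowed in both directions
        return (-1000.0, 1000.0)
    # not in the media: the lower bound is closed
    return (0.0, 1000.0)


# Hypothetical media and boundary reactions
toy_media = {"glucose", "oxygen"}
toy_boundary_reactions = {
    "EX_glucose": {"glucose"},
    "EX_lactate": {"lactate"},
}

toy_bounds = {r: bounds_for(cs, toy_media) for r, cs in toy_boundary_reactions.items()}
print(toy_bounds)
```

Recomputing the bounds from scratch each time, rather than editing them incrementally, is what makes it safe to change the media and re-run the cell.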


### Run the FBA

Now that we have constructed our model, we can run the FBA!

In [12]:
ms = PyFBA.model_seed.ModelData(compounds = filtered_compounds, reactions = reactions)
status, value, growth = PyFBA.fba.run_fba(ms, reactions_to_run, media, biomass_equation,
                                          uptake_secretion_reactions, verbose=True)
print("The FBA completed with a flux value of {} --> growth: {}".format(value, growth))


csm did not find media compound Vitamin B12 in the compounds database
csm did not find media compound chromate in the compounds database
csm did not find media compound Molybdate in the compounds database
csm did not find media compound L-Cystine in the compounds database
csm did not find media compound Fe3+ in the compounds database
csm did not find media compound Thiamine phosphate in the compounds database
csm did not find media compound Ni2+ in the compounds database
We are loading 1320 rows and 1574 columns


The FBA completed with a flux value of 281.7387771116904 --> growth: True


In parsing the bounds we found 0 media uptake and secretion reactions and 0 other u/s reactions
Length of the media: 65
Number of reactions to run: 1399
Number of compounds in SM: 1320
Number of reactions in SM: 1574
Number of uptake/secretion reactions 174
SMat dimensions: 1320 x 1574
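For reference, the calculation behind the flux value can be written as a linear program. This is the standard textbook formulation of FBA, not a description of PyFBA's internals, and the symbols are assumptions introduced here: $S$ is the stoichiometric matrix, $v$ is the vector of reaction fluxes, and $l_j$ and $u_j$ are the lower and upper bounds of reaction $j$.

```latex
\begin{aligned}
\max_{v} \quad & v_{\text{biomass}} \\
\text{subject to} \quad & S\,v = 0, \\
& l_j \le v_j \le u_j \quad \text{for every reaction } j .
\end{aligned}
```

The constraint $S\,v = 0$ says that no compound in the matrix accumulates or is depleted at steady state, and the bounds are the ones set for the uptake/secretion reactions in the previous step.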


## Export the components of the model

This demonstrates how to export and import the components of this model, so you can do other things with it!

In [13]:
if share_data:
    # map each output file to the object we want to save
    exports = {
        'compounds.pickle': filtered_compounds,
        'reactions.pickle': reactions,
        'reactions_to_run.pickle': reactions_to_run,
        'media.pickle': media,
        'sbml_biomass.pickle': biomass_equation,
        'uptake_secretion_reactions.pickle': uptake_secretion_reactions,
    }
    for filename, obj in exports.items():
        # the context manager closes each file once it has been written
        with open(filename, 'wb') as out:
            pickle.dump(obj, out)
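If you have not used `pickle` before, the round trip looks like this. The sketch below uses a toy set of hypothetical reaction IDs and a temporary directory, so it runs without PyFBA and does not overwrite any of the files written above.

```python
import os
import pickle
import tempfile

toy_reactions_to_run = {"rxn_A", "rxn_B"}  # hypothetical IDs

with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "reactions_to_run.pickle")
    # write the object in binary mode
    with open(path, "wb") as out:
        pickle.dump(toy_reactions_to_run, out)
    # read it back in binary mode
    with open(path, "rb") as fin:
        restored = pickle.load(fin)

print(restored == toy_reactions_to_run)
```

Pickle stores instances of custom classes by reference to their class, so the notebook that loads the real files needs to be able to import PyFBA as well.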

In [14]:
if share_data:
    def load_pickle(filename):
        # use a context manager so the file is closed after loading
        with open(filename, 'rb') as fin:
            return pickle.load(fin)

    sbml_filtered_compounds = load_pickle('compounds.pickle')
    sbml_reactions = load_pickle('reactions.pickle')
    sbml_reactions_to_run = load_pickle('reactions_to_run.pickle')
    sbml_media = load_pickle('media.pickle')
    sbml_biomass_equation = load_pickle('sbml_biomass.pickle')
    sbml_uptake_secretion_reactions = load_pickle('uptake_secretion_reactions.pickle')
    
    ms = PyFBA.model_seed.ModelData(compounds = sbml_filtered_compounds, reactions = sbml_reactions)
    status, value, growth = PyFBA.fba.run_fba(ms, sbml_reactions_to_run, sbml_media, sbml_biomass_equation,
                                          sbml_uptake_secretion_reactions, verbose=True)
    print("The FBA completed with a flux value of {} --> growth: {}".format(value, growth))

csm did not find media compound Vitamin B12 in the compounds database
csm did not find media compound chromate in the compounds database
csm did not find media compound Molybdate in the compounds database
csm did not find media compound L-Cystine in the compounds database
csm did not find media compound Fe3+ in the compounds database
csm did not find media compound Thiamine phosphate in the compounds database
csm did not find media compound Ni2+ in the compounds database
We are loading 1320 rows and 1574 columns


The FBA completed with a flux value of 281.7387771116904 --> growth: True


In parsing the bounds we found 0 media uptake and secretion reactions and 0 other u/s reactions
Length of the media: 65
Number of reactions to run: 1399
Number of compounds in SM: 1320
Number of reactions in SM: 1574
Number of uptake/secretion reactions 174
SMat dimensions: 1320 x 1574
