# How to create and run a gap-filled FBA from PATRIC

PATRIC (the Pathosystems Resource Integration Center) contains the best collection of well-annotated genomes. They have also been annotated by RAST, so we should be able to use those annotations directly.

Here we'll walk through taking a genome from PATRIC, building a model, and running it. PATRIC also has model reconstruction built in, but when I tried it (05/24/16) it was not working.

As usual, we'll start by loading some modules that we'll need for our analysis.

In [38]:
import sys
import os
import copy
import PyFBA
import re

In [2]:
import inspect
inspect.getfile(PyFBA)

'/home/redwards/.local/lib/python3.8/site-packages/PyFBA-2.1-py3.8.egg/PyFBA/__init__.py'

# Find a genome and download the annotations

You need to find your genome in PATRIC and download the annotations.

Once you have identified the genome you would like to build the model for, choose _Feature Table_ from the menu bar:
<img src="img/patric_ft.png">

Next, choose _Download_ and save as a _text file (.txt)_. 

<img src="img/patric_dl.png">

That will save a file called _FeatureTable.txt_ to your Downloads location. That file has the following columns:


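| Column (0-based) | Name |
| --- | --- |
| 0 | Genome |
| 1 | Genome ID |
| 2 | Accession |
| 3 | PATRIC ID |
| 4 | RefSeq Locus Tag |
| 5 | Alt Locus Tag |
| 6 | Feature ID |
| 7 | Annotation |
| 8 | Feature Type |
| 9 | Start |
| 10 | End |
| 11 | Length |
| 12 | Strand |
| 13 | FIGfam ID |
| 14 | PATRIC genus-specific families (PLfams) |
| 15 | PATRIC cross-genus families (PGfams) |
| 16 | Protein ID |
| 17 | AA Length |
| 18 | Gene Symbol |
| 19 | Product |
| 20 | GO |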



The key columns are PATRIC ID (column 3) and Product (column 19). Column numbers are 0-based!
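As a minimal sketch of picking out those two columns by name rather than by position, the snippet below reads a tab-separated table with `csv.DictReader`. The two sample rows and the helper name `read_products` are made up for illustration, and the sketch assumes the file starts with a header row that uses the column names listed above.

```python
import csv
import io

# Two made-up rows standing in for FeatureTable.txt (only the two key columns are shown)
sample = io.StringIO(
    "PATRIC ID\tProduct\n"
    "fig|0000.1.peg.1\tHypothetical enzyme A / Hypothetical enzyme B\n"
    "fig|0000.1.peg.2\thypothetical protein\n"
)

def read_products(handle):
    """Return {PATRIC ID: Product}, using the header row to locate the columns."""
    reader = csv.DictReader(handle, delimiter="\t")
    return {row["PATRIC ID"]: row["Product"] for row in reader}

products = read_products(sample)
print(len(products))
```

Because `DictReader` consumes the header row, the header never ends up in the result as if it were a feature.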



Now that we know that, we need to convert these feature names into functional roles. The key here is to split on adjoiners, such as ' / ', ' # ', and ' @ '.
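To make the splitting concrete, here is a simplified stand-in for what a role splitter does. It is only a sketch: `split_roles` is a hypothetical helper, and the real `PyFBA.parse.roles_of_function` may handle more cases than the three adjoiners named above.

```python
import re

def split_roles(function):
    """Split a functional annotation into roles on ' / ', ' # ', and ' @ '."""
    parts = re.split(r" / | # | @ ", function)
    # Drop empty fragments and surrounding whitespace
    return {part.strip() for part in parts if part.strip()}

print(sorted(split_roles("Enzyme A / Enzyme B @ Enzyme C")))
```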




In [3]:
# Map each PATRIC ID (column 3) to the roles parsed from its Product (column 19)
assigned_functions = {}
with open('workspace/Citrobacter_sedlakii_genome_features.txt', 'r') as f:
    for l in f:
        p = l.strip().split("\t")
        assigned_functions[p[3]] = PyFBA.parse.roles_of_function(p[19])
# Note: list(j)[0] keeps a single role per feature
roles = set([i[0] for i in [list(j) for j in assigned_functions.values()]])
print("There are {} unique roles in this genome".format(len(roles)))

There are 3509 unique roles in this genome


Next, we convert those roles to reactions. We start with a dict that maps roles to reactions, but we only need the unique reactions, so later we collect the dict's values into a set.

In [4]:
roles_to_reactions = PyFBA.filters.roles_to_reactions(roles, organism_type="Gram_Negative", verbose=False)

If you toggle `verbose=True`, you will see that there are a lot of roles that we skip, even though we have an EC number for them: for whatever reason, the annotation is not quite right. We can check for those too, because the parsed ModelSEED data associates EC numbers with reactions.

In [5]:
# ecr2r = PyFBA.filters.roles_to_ec_reactions(roles, organism_type="Gram_Negative", verbose=False)
# The EC lookup above is disabled; an empty mapping keeps the update below a no-op
ecr2r = {}

We combine `roles_to_reactions` and `ecr2r` and figure out what the unique set of reactions is for our genome.

In [6]:
roles_to_reactions.update(ecr2r)
reactions_to_run = set()
for role in roles_to_reactions:
    reactions_to_run.update(roles_to_reactions[role])
print("There are {} unique reactions associated with this genome".format(len(reactions_to_run)))

There are 1065 unique reactions associated with this genome


### Read all the reactions and compounds in our database

We read all the reactions, compounds, and enzymes in the [ModelSEEDDatabase](https://github.com/ModelSEED/ModelSEEDDatabase) into three data structures. Note, the first time you call this it is a bit slow as it has to parse the files, but if we've parsed them once, we don't need to do it again!

We modify the reactions specifically for Gram negative models (there are also options for Gram positive models, Mycobacterial models, general microbial models, and plant models).

In [7]:
compounds, reactions, enzymes = \
    PyFBA.parse.model_seed.compounds_reactions_enzymes('gramnegative')
print(f"There are {len(compounds):,} compounds, {len(reactions):,} reactions, and {len(enzymes):,} enzymes in total")

There are 33,992 compounds, 43,774 reactions, and 9,423 enzymes in total


In [8]:
for r in reactions:
    for c in reactions[r].all_compounds():
        if c.uptake_secretion:
            print(f"US: {c}")

#### Update reactions to run, making sure that all reactions are in the list!

There are some reactions that come from functional roles that do not appear in the reactions list. We're working on tracking these down, but for now we just check that all reaction IDs in *reactions_to_run* are in *reactions*, too.

In [9]:
tempset = set()
for r in reactions_to_run:
    if r in reactions:
        tempset.add(r)
    else:
        sys.stderr.write("Reaction ID {} is not in our reactions list. Skipped\n".format(r))
reactions_to_run = tempset
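The same check can be written with set operations, which also gives you the missing IDs in one place. This is a sketch on made-up IDs; `partition_reactions` is a hypothetical helper, not part of PyFBA.

```python
def partition_reactions(wanted, known):
    """Split wanted IDs into those present in known and those missing from it."""
    wanted = set(wanted)
    present = wanted & set(known)  # set(dict) yields the dict's keys
    missing = wanted - present
    return present, missing

# Toy data: 'known' plays the role of the reactions dict
known = {'rxn00001': object(), 'rxn00002': object()}
present, missing = partition_reactions({'rxn00001', 'rxn99999'}, known)
print(sorted(present), sorted(missing))
```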

### Test whether these reactions grow on ArgonneLB media

We can test whether this set of reactions grows on ArgonneLB media. The media is the same one we used above, and you can download the [ArgonneLB.txt](https://raw.githubusercontent.com/linsalrob/PyFBA/master/media/ArgonneLB.txt) text file and put it in the same directory as this iPython notebook to run it. The cell below reads the file from an absolute path, so adjust the path to match where you saved your copy.

(Note: we don't need to convert the media components, because the media and compounds come from the same source.)

In [10]:
media = PyFBA.parse.read_media_file("/home/redwards/test_media/ArgonneLB.txt")
print("Our media has {} components".format(len(media)))

Our media has 65 components


### Define a biomass equation

The biomass equation is the part that says whether the model will grow! This is a [metabolism.reaction.Reaction](https://github.com/linsalrob/PyFBA/blob/master/PyFBA/metabolism/reaction.py) object.

In [11]:
biomass_equation = PyFBA.metabolism.biomass_equation()

In [12]:
biomass_equation.equation

'(40.11) ATP + (0.008) CO2 + (0.084) CTP + (0.008) CoA + (0.135) GTP + (0.511) Glycine + (0.429) L-Alanine + (0.247) L-Arginine + (0.201) L-Asparagine + (0.201) L-Aspartate + (0.076) L-Cysteine + (0.219) L-Glutamate + (0.219) L-Glutamine + (0.079) L-Histidine + (0.242) L-Isoleucine + (0.376) L-Leucine + (0.286) L-Lysine + (0.128) L-Methionine + (0.155) L-Phenylalanine + (0.185) L-Proline + (0.18) L-Serine + (0.211) L-Threonine + (0.047) L-Tryptophan + (0.115) L-Tyrosine + (0.353) L-Valine + (0.008) NAD + (0.008) NADP + (0.008) S-Adenosyl-L-methionine + (0.015) TTP + (0.091) UTP + (0.015) dATP + (0.015) dCTP + (0.015) dGTP > (40.0) ADP + (1.0) Biomass + (40.0) H + (0.406) PPi  + (39.992) Phosphate + (0.008) apo-ACP'

In [13]:
with open('rbad.txt', 'w') as out:
    for r in reactions_to_run:
        out.write(f"{r}\n")

### Run the FBA

With the reactions, compounds, reactions_to_run, media, and biomass model, we can test whether the model grows on this media.
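For orientation, this is the standard textbook form of the linear program that flux balance analysis solves. The symbols are my own labels rather than names from the notebook, and PyFBA's solver setup may differ in detail: $S$ is the stoichiometric matrix (compounds by reactions), $v$ is the vector of reaction fluxes, and $l_j$, $u_j$ are the lower and upper bounds on each flux.

```latex
\begin{aligned}
\text{maximize}\quad   & v_{\text{biomass}} \\
\text{subject to}\quad & S\,v = 0, \\
                       & l_j \le v_j \le u_j \quad \text{for every reaction } j .
\end{aligned}
```

The constraint $S\,v = 0$ says that every internal compound is produced exactly as fast as it is consumed. The model "grows" when the optimal biomass flux is positive.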

In [14]:
print(f"Before running FBA there are {len(reactions)} reactions")
status, value, growth = PyFBA.fba.run_fba(compounds, reactions, reactions_to_run,
                                          media, biomass_equation)
print(f"After running FBA there are {len(reactions)} reactions")
print("Initial run has a biomass flux value of {} --> Growth: {}".format(value, growth))

Before running FBA there are 43774 reactions
After running FBA there are 43939 reactions
Initial run has a biomass flux value of 0.0 --> Growth: False


In [15]:
print(f"There are {len(reactions_to_run)} reactions to run")

There are 1065 reactions to run


In [16]:
upsr = 0
for r in reactions_to_run:
    if r.startswith('upsr'):
        upsr += 1
print(f"There are {upsr} uptake secretion reactions in reactions_to_run")
upsr = 0
for r in reactions:
    if r.startswith('upsr'):
        upsr += 1
print(f"There are {upsr} uptake secretion reactions in reactions")

There are 0 uptake secretion reactions in reactions_to_run
There are 165 uptake secretion reactions in reactions


# Will gap filling work?

These are the reactions from the C. sedlakii SBML file, and so if we add these, we should get growth!

In [17]:
sbml_addnl = {'rxn00868', 'rxn01923', 'rxn02268', 'rxn10215', 'rxn10219', 'rxn08089', 'rxn10212', 'rxn08083', 'rxn10214', 'rxn10211', 'rxn10218', 'rxn08086', 'rxn10217', 'rxn08087', 'rxn08088', 'rxn08085', 'rxn10216', 'rxn08084', 'rxn10213', 'rxn05572', 'rxn05565', 'rxn00541', 'rxn10155', 'rxn10157', 'rxn05536', 'rxn05544', 'rxn12848', 'rxn12851', 'rxn05539', 'rxn05541', 'rxn05537', 'rxn05543', 'rxn12849', 'rxn05533', 'rxn05540', 'rxn05534', 'rxn05547', 'rxn05546', 'rxn05542', 'rxn05535', 'rxn12850', 'rxn05545', 'rxn05538', 'rxn05168', 'rxn05179', 'rxn05161', 'rxn03061', 'rxn09313', 'rxn08354', 'rxn08356', 'rxn09315', 'rxn05549', 'rxn05160', 'rxn05644', 'rxn05330', 'rxn05335', 'rxn05334', 'rxn05329', 'rxn05333', 'rxn05332', 'rxn05331', 'rxn05415', 'rxn05381', 'rxn05386', 'rxn05427', 'rxn05431', 'rxn05373', 'rxn05377', 'rxn05398', 'rxn05419', 'rxn05402', 'rxn05369', 'rxn05361', 'rxn05394', 'rxn05406', 'rxn05365', 'rxn05390', 'rxn05423', 'rxn05462', 'rxn05411', 'rxn03492', 'rxn04050', 'rxn08258', 'rxn04713', 'rxn00990', 'rxn00875', 'rxn08471', 'rxn05737', 'rxn08467', 'rxn10067', 'rxn08468', 'rxn08469', 'rxn08470', 'rxn02160', 'rxn05422', 'rxn05372', 'rxn05341', 'rxn05376', 'rxn05342', 'rxn05337', 'rxn05385', 'rxn05397', 'rxn05340', 'rxn05461', 'rxn05368', 'rxn05418', 'rxn05393', 'rxn05336', 'rxn05426', 'rxn05364', 'rxn05430', 'rxn05410', 'rxn05339', 'rxn05401', 'rxn05338', 'rxn05360', 'rxn05414', 'rxn05405', 'rxn05389', 'rxn05380', 'rxn03164', 'rxn05229', 'rxn07586', 'rxn05054', 'rxn04384', 'rxn00503', 'rxn00183', 'rxn05187', 'rxn05515', 'rxn02056', 'rxn09134', 'rxn09125', 'rxn09157', 'rxn09128', 'rxn09142', 'rxn09161', 'rxn09147', 'rxn09164', 'rxn09152', 'rxn09124', 'rxn09131', 'rxn09133', 'rxn09138', 'rxn09143', 'rxn09153', 'rxn09160', 'rxn09158', 'rxn09148', 'rxn09144', 'rxn09150', 'rxn09130', 'rxn09149', 'rxn09163', 'rxn09159', 'rxn09132', 'rxn09127', 'rxn09140', 'rxn09145', 'rxn09137', 'rxn09154', 'rxn09151', 'rxn09146', 'rxn09123', 'rxn09139', 'rxn09126', 
'rxn09141', 'rxn09135', 'rxn09136', 'rxn09155', 'rxn09162', 'rxn09129', 'rxn09156', 'rxn02949', 'rxn03241', 'rxn03245', 'rxn02911', 'rxn02167', 'rxn03250', 'rxn02934', 'rxn03240', 'rxn03247', 'rxn05316', 'rxn09687', 'rxn05198', 'rxn09688', 'rxn05199', 'rxn05200', 'rxn09685', 'rxn05318', 'rxn05205', 'rxn05621', 'rxn05656', 'rxn05585', 'rxn05172', 'rxn05594', 'rxn05552', 'rxn05599', 'rxn05512', 'rxn05620', 'rxn01277', 'rxn05518', 'rxn05145', 'rxn05460', 'rxn05396', 'rxn05363', 'rxn05359', 'rxn05367', 'rxn05417', 'rxn05421', 'rxn05392', 'rxn05413', 'rxn05349', 'rxn05388', 'rxn05429', 'rxn05371', 'rxn05400', 'rxn05425', 'rxn05409', 'rxn05404', 'rxn05375', 'rxn05379', 'rxn05384', 'rxn04139', 'rxn00640', 'rxn05507', 'rxn05506', 'rxn01893', 'rxn00671', 'rxn00501', 'rxn10340', 'rxn10334', 'rxn10337', 'rxn10338', 'rxn10341', 'rxn10335', 'rxn10342', 'rxn10339', 'rxn10336', 'rxn00160', 'rxn01285', 'rxn04143', 'rxn01847', 'rxn01103', 'rxn00227', 'rxn05175', 'rxn05163', 'rxn05958', 'rxn05683', 'rxn05484', 'rxn02933', 'rxn04750', 'rxn03244', 'rxn01451', 'rxn03239', 'rxn03246', 'rxn03242', 'rxn03249', 'rxn06777', 'rxn05500', 'rxn01637', 'rxn01122', 'rxn04602', 'rxn02416', 'rxn04601', 'rxn04928', 'rxn05596', 'rxn02775', 'rxn04046', 'rxn07589', 'rxn03491', 'rxn10117', 'rxn10119', 'rxn08333', 'rxn04673', 'rxn10308', 'rxn10311', 'rxn10315', 'rxn10309', 'rxn10307', 'rxn10312', 'rxn10310', 'rxn10314', 'rxn08040', 'rxn10313', 'rxn12147', 'rxn03931', 'rxn03916', 'rxn04674', 'rxn03397', 'rxn10094', 'rxn02286', 'rxn00555', 'rxn08709', 'rxn04052', 'rxn03512', 'rxn04045', 'rxn12224', 'rxn09188', 'rxn02359', 'rxn02008', 'rxn03643', 'rxn09177', 'rxn12512', 'rxn07587', 'rxn02507', 'rxn05202', 'rxn08291', 'rxn06865', 'rxn00303', 'rxn00222', 'rxn09978', 'rxn09979', 'rxn07588', 'rxn03919', 'rxn03435', 'rxn02187', 'rxn02186', 'rxn03436', 'rxn03068', 'rxn05317', 'rxn01219', 'rxn00364', 'rxn03514', 'rxn04048', 'rxn02792', 'rxn00350', 'rxn02791', 'rxn00171', 'rxn01000', 'rxn00675', 'rxn00175', 
'rxn00986', 'rxn03932', 'rxn08712', 'rxn04113', 'rxn04996', 'rxn08756', 'rxn08352', 'rxn06023', 'rxn03136', 'rxn00800', 'rxn05165', 'rxn05181', 'rxn08194', 'rxn09180', 'rxn00670', 'rxn00173', 'rxn03644', 'rxn08619', 'rxn09289', 'rxn00776', 'rxn01360', 'rxn08335', 'rxn08336', 'rxn12500', 'rxn02287', 'rxn02774', 'rxn09167', 'rxn08708', 'rxn05156', 'rxn05151', 'rxn01629', 'rxn12146', 'rxn01123', 'rxn05147', 'rxn05173', 'rxn08707', 'rxn00927', 'rxn01299', 'rxn01226', 'rxn01545', 'rxn02476', 'rxn02011', 'rxn05201', 'rxn01895', 'rxn04604', 'rxn00830', 'rxn01403', 'rxn00179', 'rxn03991', 'rxn03990', 'rxn03975', 'rxn03974', 'rxn00818', 'rxn03838', 'rxn00817', 'rxn02596', 'rxn05555', 'rxn00056', 'rxn00212', 'rxn06979', 'rxn11544', 'rxn03918', 'rxn05559', 'rxn08345', 'rxn00509', 'rxn00006', 'rxn00834', 'rxn05293', 'rxn00634', 'rxn08618', 'rxn06848', 'rxn09997', 'rxn05938', 'rxn04783', 'rxn05206', 'rxn00102', 'rxn05937', 'rxn01644', 'rxn02938', 'rxn00792', 'rxn08711', 'rxn03513', 'rxn04047', 'rxn01265', 'rxn03394', 'rxn00777', 'rxn01106', 'rxn07492', 'rxn03538', 'rxn01480', 'rxn00119', 'rxn01517', 'rxn01966', 'rxn01132', 'rxn05162', 'rxn02277', 'rxn08257', 'rxn01352', 'rxn03540', 'rxn00789', 'rxn00508', 'rxn04386', 'rxn10481', 'rxn05528', 'rxn06077', 'rxn01671', 'rxn02929', 'rxn03917', 'rxn03135', 'rxn00469', 'rxn00791', 'rxn00756', 'rxn03087', 'rxn01329', 'rxn01917', 'rxn01879', 'rxn02285', 'rxn08710', 'rxn07438', 'rxn02321', 'rxn00787', 'rxn01289', 'rxn00851', 'rxn05297', 'rxn00062', 'rxn04132', 'rxn04133', 'rxn05319', 'rxn05467', 'rxn05468', 'rxn02374', 'rxn03012', 'rxn05064', 'rxn02666', 'rxn04457', 'rxn04456', 'rxn01664', 'rxn02916', 'rxn05667', 'rxn10571', 'rxn05195', 'rxn05645', 'rxn05144', 'rxn02988', 'rxn01256', 'rxn12604', 'rxn05039', 'rxn10904', 'rxn05499', 'rxn01152', 'rxn05691', 'rxn12893', 'rxn11116', 'rxn00880', 'rxn05593', 'rxn05469', 'rxn00186', 'rxn05694', 'rxn05491', 'rxn05682', 'rxn01748', 'rxn00327', 'rxn01746', 'rxn09656'}

In [18]:
r2r_plussbml = copy.copy(reactions_to_run)
print(f"Before adding sbml reactions there were {len(r2r_plussbml)}")
r2r_plussbml.update(sbml_addnl)
print(f"After adding sbml reactions there were {len(r2r_plussbml)}")

Before adding sbml reactions there were 1065
After adding sbml reactions there were 1544


In [19]:
print(f"Before running FBA there are {len(reactions)} reactions")
status, value, growth = PyFBA.fba.run_fba(compounds, reactions, r2r_plussbml,
                                          media, biomass_equation, verbose=True)
print(f"After running FBA there are {len(reactions)} reactions")
print("Initial run has a biomass flux value of {} --> Growth: {}".format(value, growth))

Before running FBA there are 43939 reactions
After running FBA there are 44014 reactions
Initial run has a biomass flux value of 0.0 --> Growth: False


In [20]:
print(f"Before adding upsr reactions there were {len(r2r_plussbml)} reactions")
for r in reactions:
    if r.startswith('upsr'):
        r2r_plussbml.add(r)
print(f"After adding upsr reactions there were {len(r2r_plussbml)} reactions")

Before adding upsr reactions there were 1544 reactions
After adding upsr reactions there were 1784 reactions


In [21]:
print(f"Before running FBA there are {len(reactions)} reactions")
status, value, growth = PyFBA.fba.run_fba(compounds, reactions, r2r_plussbml,
                                          media, biomass_equation, verbose=True)
print(f"After running FBA there are {len(reactions)} reactions")
print("Initial run has a biomass flux value of {} --> Growth: {}".format(value, growth))

Before running FBA there are 44014 reactions


In the model there are : 1804 compounds and 2026 reactions
We are logging to /home/redwards/GitHubsLinux/PyFBA/iPythonNotebooks/PyFBA.2021-06-01T08:42:47.350536.log
We are loading 1804 rows and 2026 columns


After running FBA there are 44015 reactions
Initial run has a biomass flux value of 0.0 --> Growth: False


Length of the media: 65
Number of reactions to run: 1784
Number of compounds in SM: 1804
Number of reactions in SM: 2026
Revised number of total reactions: 44015
Number of total compounds: 33992
SMat dimensions: 1804 x 2026
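The "SMat" in the log is the stoichiometric matrix: one row per compound and one column per reaction. As a rough sketch of how such a matrix is assembled (toy reactions and a hypothetical helper name, not PyFBA's internal code), negative coefficients mark compounds that are consumed and positive ones mark compounds that are produced.

```python
def stoichiometric_matrix(reactions):
    """Build a dense matrix from {reaction_id: {compound_id: coefficient}}."""
    compounds = sorted({c for stoich in reactions.values() for c in stoich})
    rxn_ids = sorted(reactions)
    # Rows are compounds, columns are reactions; absent compounds get 0.0
    matrix = [[reactions[r].get(c, 0.0) for r in rxn_ids] for c in compounds]
    return compounds, rxn_ids, matrix

toy = {
    'rxn_a': {'cpd_x': -1.0, 'cpd_y': 1.0},   # x -> y
    'rxn_b': {'cpd_y': -2.0, 'cpd_z': 1.0},   # 2 y -> z
}
rows, cols, smat = stoichiometric_matrix(toy)
print(f"SMat dimensions: {len(rows)} x {len(cols)}")
```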


In [22]:
# seems like we need EX_cpd00034

In [23]:
upsr = 0
for r in reactions_to_run:
    if r.startswith('EX'):
        upsr += 1
print(f"There are {upsr} EX reactions in reactions_to_run")
upsr = 0
for r in reactions:
    if r.startswith('EX'):
        upsr += 1
print(f"There are {upsr} EX reactions in reactions")

There are 0 EX reactions in reactions_to_run
There are 0 EX reactions in reactions


In [24]:
biomass_equation = PyFBA.metabolism.biomass_equation('standard')
biomass_equation.equation

'(40.11) ATP + (0.008) CO2 + (0.084) CTP + (0.008) CoA + (0.135) GTP + (0.511) Glycine + (0.429) L-Alanine + (0.247) L-Arginine + (0.201) L-Asparagine + (0.201) L-Aspartate + (0.076) L-Cysteine + (0.219) L-Glutamate + (0.219) L-Glutamine + (0.079) L-Histidine + (0.242) L-Isoleucine + (0.376) L-Leucine + (0.286) L-Lysine + (0.128) L-Methionine + (0.155) L-Phenylalanine + (0.185) L-Proline + (0.18) L-Serine + (0.211) L-Threonine + (0.047) L-Tryptophan + (0.115) L-Tyrosine + (0.353) L-Valine + (0.008) NAD + (0.008) NADP + (0.008) S-Adenosyl-L-methionine + (0.015) TTP + (0.091) UTP + (0.015) dATP + (0.015) dCTP + (0.015) dGTP > (40.0) ADP + (1.0) Biomass + (40.0) H + (0.406) PPi  + (39.992) Phosphate + (0.008) apo-ACP'

In [25]:
print(f"Before running FBA there are {len(reactions)} reactions")
status, value, growth = PyFBA.fba.run_fba(compounds, reactions, r2r_plussbml,
                                          media, biomass_equation, verbose=True)
print(f"After running FBA there are {len(reactions)} reactions")
print("Initial run has a biomass flux value of {} --> Growth: {}".format(value, growth))

Before running FBA there are 44015 reactions


In the model there are : 1803 compounds and 2026 reactions
We are loading 1803 rows and 2026 columns


After running FBA there are 44015 reactions
Initial run has a biomass flux value of 0.0 --> Growth: False


Length of the media: 65
Number of reactions to run: 1784
Number of compounds in SM: 1803
Number of reactions in SM: 2026
Revised number of total reactions: 44015
Number of total compounds: 33992
SMat dimensions: 1803 x 2026


In [26]:
uptake_secretion_reactions

NameError: name 'uptake_secretion_reactions' is not defined

In [27]:
all_compounds = compounds
# Keep only compounds that are NOT flagged as uptake/secretion (boundary) compounds
filtered_compounds = set()
for c in all_compounds:
    if not compounds[c].uptake_secretion:
        filtered_compounds.add(c)
print(f"There are {len(all_compounds)} total compounds and {len(filtered_compounds)} filtered compounds")

There are 33992 total compounds and 33992 filtered compounds


In [28]:
without_ex = set()
with open('rwex.txt', 'r') as fin:
    for l in fin:
        l = l.strip()
        without_ex.add(l)
without_ex

{'rxn01997',
 'rxn08296',
 'rxn03080',
 'rxn05187',
 'rxn08200',
 'rxn00245',
 'rxn03962',
 'rxn03060',
 'rxn01879',
 'rxn00395',
 'rxn05312',
 'rxn03147',
 'rxn10256',
 'rxn00756',
 'rxn00206',
 'rxn05347',
 'rxn01300',
 'rxn01101',
 'rxn06600',
 'rxn05365',
 'rxn00365',
 'rxn05150',
 'rxn02596',
 'rxn00615',
 'rxn12147',
 'rxn05337',
 'rxn00772',
 'rxn00874',
 'rxn09199',
 'rxn00293',
 'rxn05594',
 'rxn01675',
 'rxn01116',
 'rxn06729',
 'rxn01847',
 'rxn00966',
 'rxn06591',
 'rxn05325',
 'rxn08356',
 'rxn04413',
 'rxn01967',
 'rxn05163',
 'rxn05402',
 'rxn05401',
 'rxn09205',
 'rxn02331',
 'rxn00143',
 'rxn03248',
 'rxn05162',
 'rxn09127',
 'rxn08470',
 'rxn05551',
 'rxn06510',
 'rxn01068',
 'rxn01520',
 'rxn12646',
 'rxn01538',
 'rxn03436',
 'rxn05359',
 'rxn05569',
 'rxn10313',
 'rxn00100',
 'rxn05361',
 'rxn00856',
 'rxn00139',
 'rxn10268',
 'rxn02277',
 'rxn08205',
 'rxn01360',
 'rxn05559',
 'rxn01146',
 'rxn05533',
 'rxn01851',
 'rxn00584',
 'rxn03136',
 'rxn05607',
 'rxn05528',

In [29]:
print(f"Before running FBA there are {len(reactions)} reactions")
status, value, growth = PyFBA.fba.run_fba(compounds, reactions, without_ex,
                                          media, biomass_equation, verbose=True)
print(f"After running FBA there are {len(reactions)} reactions")
print("Initial run has a biomass flux value of {} --> Growth: {}".format(value, growth))

Before running FBA there are 44015 reactions


In the model there are : 1406 compounds and 1638 reactions
We are loading 1406 rows and 1638 columns


After running FBA there are 44015 reactions
Initial run has a biomass flux value of 0.0 --> Growth: False


Length of the media: 65
Number of reactions to run: 1399
Number of compounds in SM: 1406
Number of reactions in SM: 1638
Revised number of total reactions: 44015
Number of total compounds: 33992
SMat dimensions: 1406 x 1638


In [30]:
len(without_ex)

1399

In [33]:
len(reactions_to_run)

1065

# It is the biomass model that is the problem

Let's take the biomass equation from the SBML file and see if this works.

In [34]:
sbml_equation = '(0.00778132482043096) cpd00063: Ca2 (location: c) +  (0.352889948968272) cpd00156: L_Valine (location: e) +  (0.00778132482043096) cpd00030: Mn2 (location: e) +  (0.00778132482043096) cpd00205: K (location: c) +  (0.428732289454499) cpd00035: L_Alanine (location: e) +  (0.128039715997337) cpd00060: L_Methionine (location: e) +  (0.15480760087483) cpd00066: L_Phenylalanine (location: c) +  (0.00778132482043096) cpd00017: S_Adenosyl_L_methionine (location: c) +  (0.00778132482043096) cpd00010: CoA (location: c) +  (0.0609084652443221) cpd15665: Peptidoglycan_polymer_n_subunits (location: c) +  (0.0841036156544863) cpd00052: CTP (location: c) +  (0.00778132482043096) cpd10516: fe3 (location: e) +  (0.01468498342018) cpd00357: TTP (location: c) +  (0.00778132482043096) cpd00099: Cl_ (location: e) +  (0.01468498342018) cpd00356: dCTP (location: c) +  (0.00778132482043096) cpd10515: Fe2 (location: e) +  (0.00778132482043096) cpd00254: Mg (location: c) +  (0.242249358141304) cpd00322: L_Isoleucine (location: e) +  (0.00778132482043096) cpd00058: Cu2 (location: e) +  (0.00778132482043096) cpd00149: Co2 (location: c) +  (0.201205267995816) cpd00041: L_Aspartate (location: e) +  (1) cpd17043: RNA_transcription (location: c) +  (0.219496655995436) cpd00023: L_Glutamate (location: e) +  (0.219496655995436) cpd00053: L_Glutamine (location: e) +  (0.376088782528765) cpd00107: L_Leucine (location: e) +  (0.00778132482043096) cpd00220: Riboflavin (location: e) +  (0.179790960093822) cpd00054: L_Serine (location: e) +  (0.0472899299502361) cpd00065: L_Tryptophan (location: e) +  (0.0609084652443221) cpd02229: Bactoprenyl_diphosphate (location: c) +  (0.00778132482043096) cpd11493: ACP (location: c) +  (1) cpd17041: Protein_biosynthesis (location: c) +  (0.184698405654696) cpd00129: L_Proline (location: e) +  (0.135406821203723) cpd00038: GTP (location: c) +  (0.01468498342018) cpd00241: dGTP (location: c) +  (1) cpd17042: DNA_replication (location: c) +  (0.211466290532188) cpd00161: L_Threonine (location: e) +  (40.1101757365074) cpd00002: ATP (location: c) +  (0.00778132482043096) cpd00016: Pyridoxal_phosphate (location: c) +  (0.00778132482043096) cpd00048: Sulfate (location: e) +  (0.00778132482043096) cpd00003: NAD (location: c) +  (0.01468498342018) cpd00115: dATP (location: c) +  (0.115101904973216) cpd00069: L_Tyrosine (location: e) +  (0.00778132482043096) cpd00015: FAD (location: c) +  (0.201205267995816) cpd00132: L_Asparagine (location: e) +  (0.00778132482043096) cpd00006: NADP (location: c) +  (35.5386858537513) cpd00001: H2O (location: e) +  (0.0762884719008526) cpd00084: L_Cysteine (location: c) +  (0.0794113918032267) cpd00119: L_Histidine (location: e) +  (0.285970236774541) cpd00039: L_Lysine (location: e) +  (0.0908319049068452) cpd00062: UTP (location: c) +  (0.00778132482043096) cpd00034: Zn2 (location: e) +  (0.247156803702178) cpd00051: L_Arginine (location: e) +  (0.510820469745475) cpd00033: Glycine (location: e) >  (40) cpd00008: ADP (location: c) +  (39.9922186751796) cpd00009: Phosphate (location: e) +  (0.00778132482043096) cpd12370: apo_ACP (location: c) +  (1) cpd11416: Biomass (location: c) +  (40) cpd00067: H (location: e) +  (0.0609084652443221) cpd15666: Peptidoglycan_polymer_n_1_subunits (location: c) +  (0.405833094852252) cpd00012: PPi (location: e)'

In [36]:
sbml_left_compounds = {'cpd00066: L_Phenylalanine (location: c)' : 0.15480760087483,  'cpd00016: Pyridoxal_phosphate (location: c)' : 0.00778132482043096,  'cpd00132: L_Asparagine (location: e)' : 0.201205267995816,  'cpd00156: L_Valine (location: e)' : 0.352889948968272,  'cpd00099: Cl_ (location: e)' : 0.00778132482043096,  'cpd00038: GTP (location: c)' : 0.135406821203723,  'cpd00003: NAD (location: c)' : 0.00778132482043096,  'cpd17041: Protein_biosynthesis (location: c)' : 1.0,  'cpd00033: Glycine (location: e)' : 0.510820469745475,  'cpd00322: L_Isoleucine (location: e)' : 0.242249358141304,  'cpd00254: Mg (location: c)' : 0.00778132482043096,  'cpd17043: RNA_transcription (location: c)' : 1.0,  'cpd00048: Sulfate (location: e)' : 0.00778132482043096,  'cpd10515: Fe2 (location: e)' : 0.00778132482043096,  'cpd02229: Bactoprenyl_diphosphate (location: c)' : 0.0609084652443221,  'cpd11493: ACP (location: c)' : 0.00778132482043096,  'cpd00161: L_Threonine (location: e)' : 0.211466290532188,  'cpd00006: NADP (location: c)' : 0.00778132482043096,  'cpd00060: L_Methionine (location: e)' : 0.128039715997337,  'cpd00119: L_Histidine (location: e)' : 0.0794113918032267,  'cpd00052: CTP (location: c)' : 0.0841036156544863,  'cpd00051: L_Arginine (location: e)' : 0.247156803702178,  'cpd15665: Peptidoglycan_polymer_n_subunits (location: c)' : 0.0609084652443221,  'cpd00017: S_Adenosyl_L_methionine (location: c)' : 0.00778132482043096,  'cpd00030: Mn2 (location: e)' : 0.00778132482043096,  'cpd10516: fe3 (location: e)' : 0.00778132482043096,  'cpd00065: L_Tryptophan (location: e)' : 0.0472899299502361,  'cpd00084: L_Cysteine (location: c)' : 0.0762884719008526,  'cpd00023: L_Glutamate (location: e)' : 0.219496655995436,  'cpd17042: DNA_replication (location: c)' : 1.0,  'cpd00356: dCTP (location: c)' : 0.01468498342018,  'cpd00035: L_Alanine (location: e)' : 0.428732289454499,  'cpd00069: L_Tyrosine (location: e)' : 0.115101904973216,  'cpd00220: Riboflavin (location: e)' : 0.00778132482043096,  'cpd00129: L_Proline (location: e)' : 0.184698405654696,  'cpd00357: TTP (location: c)' : 0.01468498342018,  'cpd00205: K (location: c)' : 0.00778132482043096,  'cpd00149: Co2 (location: c)' : 0.00778132482043096,  'cpd00063: Ca2 (location: c)' : 0.00778132482043096,  'cpd00054: L_Serine (location: e)' : 0.179790960093822,  'cpd00001: H2O (location: e)' : 35.5386858537513,  'cpd00010: CoA (location: c)' : 0.00778132482043096,  'cpd00015: FAD (location: c)' : 0.00778132482043096,  'cpd00062: UTP (location: c)' : 0.0908319049068452,  'cpd00107: L_Leucine (location: e)' : 0.376088782528765,  'cpd00241: dGTP (location: c)' : 0.01468498342018,  'cpd00053: L_Glutamine (location: e)' : 0.219496655995436,  'cpd00039: L_Lysine (location: e)' : 0.285970236774541,  'cpd00034: Zn2 (location: e)' : 0.00778132482043096,  'cpd00058: Cu2 (location: e)' : 0.00778132482043096,  'cpd00002: ATP (location: c)' : 40.1101757365074,  'cpd00041: L_Aspartate (location: e)' : 0.201205267995816,  'cpd00115: dATP (location: c)' : 0.01468498342018}

In [37]:
sbml_right_compounds = {'cpd00067: H (location: e)' : 40.0,  'cpd00012: PPi (location: e)' : 0.405833094852252,  'cpd00008: ADP (location: c)' : 40.0,  'cpd11416: Biomass (location: c)' : 1.0,  'cpd12370: apo_ACP (location: c)' : 0.00778132482043096,  'cpd00009: Phosphate (location: e)' : 39.9922186751796,  'cpd15666: Peptidoglycan_polymer_n_1_subunits (location: c)' : 0.0609084652443221}
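The two dictionaries repeat the coefficients that already appear in the equation string. Here is a minimal sketch of deriving them from such a string instead of typing them by hand. It assumes every term has the form `(coefficient) cpdNNNNN: name (location: x)` and that names contain no `>` character; `parse_sbml_equation` is a hypothetical helper, not part of PyFBA.

```python
import re

# One term: '(0.5) cpd00001: H2O (location: e)'
TERM = re.compile(r'\(([0-9.]+)\) (cpd\d+: .*? \(location: .\))')

def parse_sbml_equation(equation):
    """Return (left, right) dicts mapping each compound string to its coefficient."""
    left_side, right_side = equation.split('>', 1)
    left = {name: float(coef) for coef, name in TERM.findall(left_side)}
    right = {name: float(coef) for coef, name in TERM.findall(right_side)}
    return left, right

# A short made-up equation in the same layout as sbml_equation
toy_equation = ('(0.5) cpd00001: H2O (location: e) +  (1) cpd00002: ATP (location: c)'
                ' >  (1) cpd00008: ADP (location: c)')
left, right = parse_sbml_equation(toy_equation)
print(left)
print(right)
```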

In [62]:
sbml_biomass = PyFBA.metabolism.Reaction('sbml_biomass', 'sbml_biomass')
sbml_biomass.equation = sbml_equation
# Keys look like 'cpd00066: L_Phenylalanine (location: c)': capture the ID, name, and location
parsecomp = re.compile(r'^(cpd\d+): (.*?) \(location: (.)\)')
for c in sbml_left_compounds:
    m = parsecomp.match(c)
    if not m:
        sys.stderr.write(f"Can't parse {c}\n")
        continue  # nothing to look up if the key did not match
    if m.group(1) in compounds:
        # The name check is switched off here (if False)
        if False and compounds[m.group(1)] != m.group(2):
            sys.stderr.write(f"We had |{compounds[m.group(1)]}| for {m.group(1)} in the SBML, but now have |{m.group(2)}|\n")
        # Copy the database compound into the location given in the SBML
        newcomp = PyFBA.metabolism.CompoundWithLocation.from_compound(compounds[m.group(1)], m.group(3))
        sbml_biomass.add_left_compounds({newcomp})
        sbml_biomass.set_left_compound_abundance(newcomp, sbml_left_compounds[c])
    else:
        print(f"{m.group(1)} not found")

In [70]:
for c in sbml_right_compounds:
    m = parsecomp.match(c)
    if not m:
        sys.stderr.write(f"Can't parse {c}\n")
        continue  # nothing to look up if the key did not match
    if m.group(1) in compounds:
        # The name check is switched on here (if True), so mismatches are reported
        if True and compounds[m.group(1)] != m.group(2):
            sys.stderr.write(f"We had |{compounds[m.group(1)]}| for {m.group(1)} in the SBML, but now have |{m.group(2)}|\n")
        # Copy the database compound into the location given in the SBML
        newcomp = PyFBA.metabolism.CompoundWithLocation.from_compound(compounds[m.group(1)], m.group(3))
        sbml_biomass.add_right_compounds({newcomp})
        sbml_biomass.set_right_compound_abundance(newcomp, sbml_right_compounds[c])
    else:
        print(f"{m.group(1)} not found")

We had |cpd00067: H+| for cpd00067 in the SBML, but now have |H|
We had |cpd00012: PPi| for cpd00012 in the SBML, but now have |PPi|
We had |cpd00008: ADP| for cpd00008 in the SBML, but now have |ADP|
We had |cpd11416: Biomass| for cpd11416 in the SBML, but now have |Biomass|
We had |cpd12370: apo-ACP| for cpd12370 in the SBML, but now have |apo_ACP|
We had |cpd00009: Phosphate| for cpd00009 in the SBML, but now have |Phosphate|
We had |cpd15666: Peptidoglycan polymer (n-1 subunits)| for cpd15666 in the SBML, but now have |Peptidoglycan_polymer_n_1_subunits|
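Judging by this output, the check appears to compare the database compound as a whole (printed with its ID, such as `cpd00067: H+`) against the bare SBML name, so it fires even for ADP. If the goal is to spot genuinely different names, one option is to compare normalized name strings instead. The sketch below is an assumption about what "the same name" should mean here, and both helper names are hypothetical.

```python
import re

def normalize_name(name):
    """Collapse every run of non-alphanumeric characters to '_' and trim the ends."""
    return re.sub(r'[^A-Za-z0-9]+', '_', name).strip('_')

def names_match(database_name, sbml_name):
    """True when the two names agree after normalization."""
    return normalize_name(database_name) == normalize_name(sbml_name)

# Name pairs taken from the messages above: (database name, SBML name)
pairs = [
    ('H+', 'H'),
    ('apo-ACP', 'apo_ACP'),
    ('Peptidoglycan polymer (n-1 subunits)', 'Peptidoglycan_polymer_n_1_subunits'),
]
for database_name, sbml_name in pairs:
    print(database_name, '<->', sbml_name, names_match(database_name, sbml_name))
```

Note that this normalization also discards charge signs (`H+` becomes `H`), so it is only suitable as a loose sanity check on names. The compound ID remains the reliable key.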


In [65]:
print(f"Before running FBA there are {len(reactions)} reactions")
status, value, growth = PyFBA.fba.run_fba(compounds, reactions, reactions_to_run,
                                          media, sbml_biomass, verbose=True)
print(f"After running FBA there are {len(reactions)} reactions")
print("Initial run has a biomass flux value of {} --> Growth: {}".format(value, growth))

Before running FBA there are 44015 reactions


In the model there are : 1226 compounds and 1238 reactions
We are loading 1226 rows and 1238 columns


After running FBA there are 44015 reactions
Initial run has a biomass flux value of 0.0 --> Growth: False


Length of the media: 65
Number of reactions to run: 1065
Number of compounds in SM: 1226
Number of reactions in SM: 1238
Revised number of total reactions: 44015
Number of total compounds: 33992
SMat dimensions: 1226 x 1238


# Add the missing reactions

In [66]:
all_reactions = {'rxn00868', 'rxn01923', 'rxn02268', 'rxn10215', 'rxn10219', 'rxn08089', 'rxn10212', 'rxn08083', 'rxn10214', 'rxn10211', 'rxn10218', 'rxn08086', 'rxn10217', 'rxn08087', 'rxn08088', 'rxn08085', 'rxn10216', 'rxn08084', 'rxn10213', 'rxn05572', 'rxn05565', 'rxn00541', 'rxn10155', 'rxn10157', 'rxn05536', 'rxn05544', 'rxn12848', 'rxn12851', 'rxn05539', 'rxn05541', 'rxn05537', 'rxn05543', 'rxn12849', 'rxn05533', 'rxn05540', 'rxn05534', 'rxn05547', 'rxn05546', 'rxn05542', 'rxn05535', 'rxn12850', 'rxn05545', 'rxn05538', 'rxn05168', 'rxn05179', 'rxn05161', 'rxn09313', 'rxn08354', 'rxn08356', 'rxn09315', 'rxn05549', 'rxn05160', 'rxn05644', 'rxn05330', 'rxn05335', 'rxn05334', 'rxn05329', 'rxn05333', 'rxn05332', 'rxn05331', 'rxn05415', 'rxn05381', 'rxn05386', 'rxn05427', 'rxn05431', 'rxn05373', 'rxn05377', 'rxn05398', 'rxn05419', 'rxn05402', 'rxn05369', 'rxn05361', 'rxn05394', 'rxn05406', 'rxn05365', 'rxn05390', 'rxn05423', 'rxn05462', 'rxn05411', 'rxn03492', 'rxn04050', 'rxn08258', 'rxn04713', 'rxn00990', 'rxn00875', 'rxn08471', 'rxn05737', 'rxn08467', 'rxn10067', 'rxn08468', 'rxn08469', 'rxn08470', 'rxn01302', 'rxn01301', 'rxn05422', 'rxn05372', 'rxn05341', 'rxn05376', 'rxn05342', 'rxn05337', 'rxn05385', 'rxn05397', 'rxn05340', 'rxn05461', 'rxn05368', 'rxn05418', 'rxn05393', 'rxn05336', 'rxn05426', 'rxn05364', 'rxn05430', 'rxn05410', 'rxn05339', 'rxn05401', 'rxn05338', 'rxn05360', 'rxn05414', 'rxn05405', 'rxn05389', 'rxn05380', 'rxn03164', 'rxn05229', 'rxn07586', 'rxn05054', 'rxn04384', 'rxn00503', 'rxn00183', 'rxn05187', 'rxn05515', 'rxn02056', 'rxn09134', 'rxn09125', 'rxn09157', 'rxn09128', 'rxn09142', 'rxn09161', 'rxn09147', 'rxn09164', 'rxn09152', 'rxn09124', 'rxn09131', 'rxn09133', 'rxn09138', 'rxn09143', 'rxn09153', 'rxn09160', 'rxn09158', 'rxn09148', 'rxn09144', 'rxn09150', 'rxn09130', 'rxn09149', 'rxn09163', 'rxn09159', 'rxn09132', 'rxn09127', 'rxn09140', 'rxn09145', 'rxn09137', 'rxn09154', 'rxn09151', 'rxn09146', 'rxn09123', 'rxn09139', 'rxn09126', 
'rxn09141', 'rxn09135', 'rxn09136', 'rxn09155', 'rxn09162', 'rxn09129', 'rxn09156', 'rxn02949', 'rxn03241', 'rxn03245', 'rxn02911', 'rxn02167', 'rxn03250', 'rxn02934', 'rxn03240', 'rxn03247', 'rxn05316', 'rxn09687', 'rxn05198', 'rxn09688', 'rxn05199', 'rxn05200', 'rxn09685', 'rxn05318', 'rxn05205', 'rxn05621', 'rxn05656', 'rxn05585', 'rxn05172', 'rxn05594', 'rxn05552', 'rxn05599', 'rxn05512', 'rxn05620', 'rxn01277', 'rxn05518', 'rxn05145', 'rxn05460', 'rxn05396', 'rxn05363', 'rxn05359', 'rxn05367', 'rxn05417', 'rxn05421', 'rxn05392', 'rxn05413', 'rxn05349', 'rxn05388', 'rxn05429', 'rxn05371', 'rxn05400', 'rxn05425', 'rxn05409', 'rxn05404', 'rxn05375', 'rxn05379', 'rxn05384', 'rxn04139', 'rxn00640', 'rxn05507', 'rxn05506', 'rxn01893', 'rxn00671', 'rxn00501', 'rxn10340', 'rxn10334', 'rxn10337', 'rxn10338', 'rxn10341', 'rxn10335', 'rxn10342', 'rxn10339', 'rxn10336', 'rxn00160', 'rxn01285', 'rxn04143', 'rxn01847', 'rxn01103', 'rxn00227', 'rxn05175', 'rxn05163', 'rxn05683', 'rxn05484', 'rxn02933', 'rxn04750', 'rxn03244', 'rxn01451', 'rxn03239', 'rxn03246', 'rxn03242', 'rxn03249', 'rxn06777', 'rxn05500', 'rxn01637', 'rxn01122', 'rxn04602', 'rxn02416', 'rxn04601', 'rxn04928', 'rxn05596', 'rxn02762', 'rxn02521', 'rxn02522', 'rxn03483', 'rxn02775', 'rxn04046', 'rxn07589', 'rxn03491', 'rxn10117', 'rxn10119', 'rxn08333', 'rxn04673', 'rxn10308', 'rxn10311', 'rxn10315', 'rxn10309', 'rxn10307', 'rxn10312', 'rxn10310', 'rxn10314', 'rxn08040', 'rxn10313', 'rxn12147', 'rxn03931', 'rxn03916', 'rxn04674', 'rxn03397', 'rxn10094', 'rxn02286', 'rxn02474', 'rxn00555', 'rxn08709', 'rxn04052', 'rxn03512', 'rxn12224', 'rxn09188', 'rxn02359', 'rxn02008', 'rxn08179', 'rxn08178', 'rxn03643', 'rxn09177', 'rxn12512', 'rxn07587', 'rxn02507', 'rxn08291', 'rxn06865', 'rxn00303', 'rxn00222', 'rxn09978', 'rxn09979', 'rxn07588', 'rxn04413', 'rxn03537', 'rxn03536', 'rxn03919', 'rxn03435', 'rxn02187', 'rxn02186', 'rxn03436', 'rxn03068', 'rxn05317', 'rxn01219', 'rxn00364', 'rxn03514', 'rxn04048', 
'rxn00544', 'rxn02792', 'rxn00350', 'rxn02791', 'rxn05221', 'rxn00675', 'rxn00175', 'rxn00986', 'rxn01507', 'rxn02400', 'rxn01670', 'rxn00363', 'rxn00708', 'rxn01218', 'rxn01521', 'rxn01445', 'rxn00913', 'rxn01145', 'rxn00132', 'rxn01961', 'rxn00831', 'rxn08712', 'rxn04113', 'rxn04996', 'rxn08756', 'rxn08352', 'rxn06023', 'rxn02449', 'rxn05165', 'rxn05181', 'rxn08194', 'rxn01093', 'rxn09180', 'rxn03644', 'rxn08619', 'rxn09289', 'rxn00776', 'rxn01360', 'rxn08335', 'rxn08336', 'rxn12500', 'rxn02287', 'rxn02774', 'rxn09167', 'rxn08708', 'rxn05156', 'rxn05151', 'rxn01629', 'rxn12146', 'rxn01123', 'rxn05147', 'rxn05173', 'rxn08707', 'rxn00927', 'rxn01299', 'rxn01226', 'rxn01545', 'rxn02476', 'rxn02011', 'rxn05201', 'rxn01895', 'rxn04604', 'rxn00830', 'rxn00179', 'rxn03991', 'rxn03990', 'rxn03975', 'rxn03974', 'rxn00818', 'rxn03838', 'rxn00817', 'rxn02596', 'rxn05555', 'rxn00056', 'rxn06979', 'rxn11544', 'rxn03918', 'rxn05559', 'rxn08345', 'rxn00509', 'rxn00205', 'rxn00006', 'rxn02473', 'rxn00834', 'rxn05293', 'rxn00105', 'rxn00634', 'rxn08618', 'rxn06848', 'rxn09997', 'rxn05938', 'rxn04783', 'rxn05206', 'rxn00102', 'rxn01644', 'rxn02938', 'rxn00792', 'rxn08711', 'rxn03513', 'rxn04047', 'rxn01265', 'rxn01404', 'rxn03394', 'rxn00777', 'rxn01106', 'rxn07492', 'rxn03538', 'rxn01480', 'rxn00119', 'rxn01517', 'rxn01966', 'rxn01132', 'rxn05162', 'rxn02277', 'rxn08257', 'rxn05197', 'rxn01352', 'rxn03540', 'rxn00789', 'rxn00508', 'rxn04386', 'rxn10481', 'rxn05528', 'rxn06077', 'rxn01671', 'rxn02929', 'rxn03917', 'rxn03135', 'rxn00469', 'rxn00756', 'rxn03087', 'rxn01329', 'rxn01917', 'rxn01879', 'rxn01538', 'rxn02285', 'rxn08710', 'rxn07438', 'rxn02321', 'rxn00787', 'rxn01289', 'rxn00851', 'rxn05297', 'rxn00062', 'rxn04132', 'rxn04133', 'rxn05319', 'rxn05467', 'rxn05468', 'rxn02374', 'rxn03012', 'rxn05064', 'rxn02666', 'rxn04457', 'rxn04456', 'rxn01664', 'rxn02916', 'rxn05667', 'rxn10571', 'rxn05195', 'rxn05645', 'rxn05144', 'rxn02988', 'rxn01256', 'rxn12604', 'rxn05039', 
'rxn10904', 'rxn05499', 'rxn01152', 'rxn05691', 'rxn12893', 'rxn11116', 'rxn00880', 'rxn05593', 'rxn05469', 'rxn00186', 'rxn05694', 'rxn05491', 'rxn05682', 'rxn01748', 'rxn00327', 'rxn01746', 'rxn09656'}

In [68]:
print(f"Before updating there are {len(reactions_to_run)} reactions")
r2ra = copy.copy(reactions_to_run)
r2ra.update(all_reactions)
print(f"After updating there are {len(r2ra)} reactions")

Before updating there are 1065 reactions
After updating there are 1579 reactions


In [69]:
print(f"Before running FBA there are {len(reactions)} reactions")
status, value, growth = PyFBA.fba.run_fba(compounds, reactions, reactions_to_run,
                                          media, sbml_biomass, verbose=True)
print(f"After running FBA there are {len(reactions)} reactions")
print("Initial run has a biomass flux value of {} --> Growth: {}".format(value, growth))

Before running FBA there are 44015 reactions


In the model there are : 1226 compounds and 1238 reactions
We are loading 1226 rows and 1238 columns


After running FBA there are 44015 reactions
Initial run has a biomass flux value of 0.0 --> Growth: False


Length of the media: 65
Number of reactions to run: 1065
Number of compounds in SM: 1226
Number of reactions in SM: 1238
Revised number of total reactions: 44015
Number of total compounds: 33992
SMat dimensions: 1226 x 1238


In [31]:
new_reactions = PyFBA.gapfill.suggest_from_media(compounds, reactions,
                                                   reactions_to_run, media, verbose=False)

In [32]:
print(f"There are {len(new_reactions)} new reactions to add")

There are 30663 new reactions to add


In [None]:
transrct = set()
for r in new_reactions:
    if reactions[r].is_transport:
        transrct.add(r)
print(f"There are {len(transrct)} new transport reactions")

In [None]:
reactions_to_run.update(transrct)

In [None]:
print(f"Before running FBA there are {len(reactions)} reactions")
status, value, growth = PyFBA.fba.run_fba(compounds, reactions, reactions_to_run,
                                          media, biomass_equation)
print(f"After running FBA there are {len(reactions)} reactions")
print("After adding transport reactions the biomass flux value is {} --> Growth: {}".format(value, growth))

In [None]:
print(f"There are {len(reactions_to_run)} reactions to run")

## Gap-fill the model

Since the model does not grow on ArgonneLB we need to gap-fill it to ensure growth. There are several ways that we can gap-fill, and we will work through them until we get growth.

As you will see, we update the *reactions_to_run* set each time, and keep the media and everything else unchanged. Then we just need to run the FBA as we have done above and see if we get growth.

We also keep a copy of the original *reactions_to_run*, and a list with all the reactions that we are adding, so once we are done we can go back and bisect the reactions that are added.
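The bookkeeping described above can be sketched with plain Python sets. This is a toy illustration only: the reaction IDs and the `apply_step` helper are hypothetical and are not part of PyFBA.

```python
import copy

def apply_step(reactions_to_run, added_reactions, update_type, new_reactions):
    """Record one gap-fill step and add its reactions to the running set.

    Hypothetical helper: it mirrors the append/update pattern used in the
    cells of this notebook, nothing more.
    """
    new_reactions = set(new_reactions)
    added_reactions.append((update_type, new_reactions))
    reactions_to_run.update(new_reactions)
    return len(reactions_to_run)

# Toy data: made-up reaction IDs
reactions_to_run = {"rxnA", "rxnB"}
original_reactions_to_run = copy.copy(reactions_to_run)
added_reactions = []

apply_step(reactions_to_run, added_reactions, "media", {"rxnB", "rxnC"})
apply_step(reactions_to_run, added_reactions, "essential", {"rxnD", "rxnE"})

# The original set is untouched, and every addition is attributable to a step
print(sorted(original_reactions_to_run))
print([(how, sorted(rxns)) for how, rxns in added_reactions])
```

Because each entry in `added_reactions` carries the name of the step that produced it, the trimming stage can later test one step at a time.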

In [None]:
added_reactions = []
original_reactions_to_run = copy.copy(reactions_to_run)

### Media import reactions

We need to make sure that the cell can import everything that is in the media... otherwise it won't be able to grow. Be sure to only do this step if you are certain that the cell can grow on the media you are testing.
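The idea behind this step can be shown with a toy example. The sketch below is not how `suggest_from_media` is implemented; the data structures (a dict mapping each transport reaction to the compounds it imports) and the `suggest_importers` function are assumptions made purely for illustration.

```python
def suggest_importers(media, transporters, reactions_to_run):
    """Suggest transport reactions for media compounds the model cannot import.

    media: set of compound names in the medium
    transporters: dict of reaction ID -> set of compounds that reaction imports
    reactions_to_run: set of reaction IDs already in the model
    """
    # Everything the current model can already bring into the cell
    imported = set()
    for rid in reactions_to_run:
        imported |= transporters.get(rid, set())

    # Media compounds with no importer yet
    missing = media - imported

    # Transporters outside the model that import at least one missing compound
    suggestions = {rid for rid, cpds in transporters.items()
                   if rid not in reactions_to_run and cpds & missing}
    return missing, suggestions

# Toy data: made-up compounds and reaction IDs
media = {"glucose", "phosphate", "sulfate"}
transporters = {
    "t1": {"glucose"},
    "t2": {"phosphate"},
    "t3": {"sulfate"},
    "t4": {"lactose"},  # imports a compound that is not in the medium
}
missing, suggestions = suggest_importers(media, transporters, {"t1", "rxnX"})
print(sorted(missing))
print(sorted(suggestions))
```

Note that `t4` is never suggested: a transporter for a compound that is absent from the medium cannot help growth on that medium.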

In [None]:
update_type = 'media'
new_reactions = PyFBA.gapfill.suggest_from_media(compounds, reactions,
                                                   reactions_to_run, media, verbose=True)
added_reactions.append((update_type, new_reactions))

print(f"Before adding {update_type} reactions, we had {len(reactions_to_run)} reactions.")
reactions_to_run.update(new_reactions)
print(f"After adding {update_type} reactions, we have {len(reactions_to_run)} reactions.")

In [None]:
for r in reactions:
    if reactions[r].is_transport:
        print(r)

In [None]:
for r in reactions:
    for c in reactions[r].left_compounds:
        if c.location == 'e':
            if not reactions[r].is_transport:
                print(f"Check {r}")

In [None]:
status, value, growth = PyFBA.fba.run_fba(compounds, reactions, reactions_to_run,
                                          media, biomass_equation)
print("Run has a biomass flux value of {} --> Growth: {}".format(value, growth))

### Essential reactions

There are ~100 reactions that are in every model we have tested. We treat these as essential for all models, so we typically add them next.
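The notion of reactions shared by every model can be pictured as a set intersection. The sketch below uses made-up models and reaction IDs; it illustrates the concept only and is not a description of how the list returned by `suggest_essential_reactions` was compiled.

```python
# Toy models: each one is just a set of made-up reaction IDs
models = {
    "model_1": {"rxn1", "rxn2", "rxn3", "rxn4"},
    "model_2": {"rxn1", "rxn2", "rxn5"},
    "model_3": {"rxn1", "rxn2", "rxn3", "rxn6"},
}

# Reactions present in every model
core = set.intersection(*models.values())
print(sorted(core))

# A new draft model only needs the core reactions it does not already have
draft = {"rxn2", "rxn7"}
to_add = core - draft
print(sorted(to_add))
```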

In [None]:
update_type = 'essential'
new_reactions = PyFBA.gapfill.suggest_essential_reactions()
added_reactions.append((update_type, new_reactions))
print(f"Before adding {update_type} reactions, we had {len(reactions_to_run)} reactions.")
reactions_to_run.update(new_reactions)
print(f"After adding {update_type} reactions, we have {len(reactions_to_run)} reactions.")

In [None]:
status, value, growth = PyFBA.fba.run_fba(compounds, reactions, reactions_to_run,
                                          media, biomass_equation)
print("Run has a biomass flux value of {} --> Growth: {}".format(value, growth))

### Subsystems

The reactions connect us to subsystems (see [Overbeek et al. 2014](http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3965101/)). This step checks whether the subsystems represented in the model are complete, and adds the reactions required to complete them.
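A toy version of subsystem completion is sketched below. It assumes that `threshold` means "the fraction of a subsystem's reactions that must already be in the model before the rest are added"; that reading, the `>=` comparison, and the `complete_subsystems` function are assumptions for illustration, not PyFBA's implementation.

```python
def complete_subsystems(subsystems, reactions_to_run, threshold=0.5):
    """Suggest missing reactions for subsystems that are mostly present.

    subsystems: dict of subsystem name -> set of reaction IDs
    reactions_to_run: set of reaction IDs already in the model
    threshold: minimum fraction of a subsystem that must be present
    """
    suggestions = set()
    for name, rxns in subsystems.items():
        if not rxns:
            continue  # nothing to complete in an empty subsystem
        coverage = len(rxns & reactions_to_run) / len(rxns)
        if coverage >= threshold:
            # Add whatever is still missing from this subsystem
            suggestions |= rxns - reactions_to_run
    return suggestions

# Toy data: made-up subsystems and reaction IDs
subsystems = {
    "glycolysis": {"a", "b", "c", "d"},  # 3 of 4 present, coverage 0.75
    "histidine": {"e", "f", "g", "h"},   # 1 of 4 present, coverage 0.25
}
model = {"a", "b", "c", "e"}
suggested = complete_subsystems(subsystems, model, threshold=0.5)
print(sorted(suggested))
```

Lowering the threshold makes the step more permissive: with `threshold=0.25` the three missing histidine reactions would be suggested as well.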

In [None]:
update_type = 'subsystems'
new_reactions = \
    PyFBA.gapfill.suggest_reactions_from_subsystems(reactions,
                                                    reactions_to_run,
                                                    threshold=0.5)
added_reactions.append((update_type, new_reactions))
print(f"Before adding {update_type} reactions, we had {len(reactions_to_run)} reactions.")
reactions_to_run.update(new_reactions)
print(f"After adding {update_type} reactions, we have {len(reactions_to_run)} reactions.")

In [None]:
status, value, growth = PyFBA.fba.run_fba(compounds, reactions, reactions_to_run,
                                          media, biomass_equation)
print("Run has a biomass flux value of {} --> Growth: {}".format(value, growth))

In [None]:
pre_orphan=copy.copy(reactions_to_run)
pre_o_added=copy.copy(added_reactions)
print("Pre orphan has {} reactions".format(len(pre_orphan)))

### Orphan compounds

Orphan compounds are compounds that are associated with only one reaction in the model: they are either produced but never consumed, or consumed but never produced. We need to add reaction(s) that complete the network around those compounds.

You can change the maximum number of reactions that a compound can be in and still be considered an orphan with the `max_reactions` argument (try increasing it to 2 or 3).
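To make the definition concrete, here is a small sketch that finds orphan compounds in a toy network and suggests database reactions that touch them. The representation of a reaction as a `(left, right)` pair of compound sets and both function names are hypothetical; this is not the code behind `suggest_by_compound`.

```python
from collections import defaultdict

def find_orphans(reactions, max_reactions=1):
    """Return compounds that appear in at most max_reactions reactions.

    reactions: dict of reaction ID -> (left compounds, right compounds)
    """
    membership = defaultdict(set)
    for rid, (left, right) in reactions.items():
        for cpd in left | right:
            membership[cpd].add(rid)
    return {cpd for cpd, rids in membership.items() if len(rids) <= max_reactions}

def suggest_for_orphans(model, database, max_reactions=1):
    """Suggest database reactions, not yet in the model, that use an orphan."""
    orphans = find_orphans(model, max_reactions)
    return {rid for rid, (left, right) in database.items()
            if rid not in model and (left | right) & orphans}

# Toy linear pathway A -> B -> C -> D: A and D each occur in one reaction
model = {
    "r1": ({"A"}, {"B"}),
    "r2": ({"B"}, {"C"}),
    "r3": ({"C"}, {"D"}),
}
# The "database" holds the model reactions plus two that are not in the model
database = dict(model, r4=({"D"}, {"E"}), r5=({"X"}, {"Y"}))

orphans = find_orphans(model)
suggested = suggest_for_orphans(model, database)
print(sorted(orphans))
print(sorted(suggested))
```

In this toy case, raising `max_reactions` to 2 also classes B and C as orphans, which is why larger values tend to pull in many more candidate reactions.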

In [None]:
update_type = 'orphan compounds'
new_reactions = PyFBA.gapfill.suggest_by_compound(compounds, reactions,
                                                     reactions_to_run,
                                                     max_reactions=1)
added_reactions.append((update_type, new_reactions))
print(f"Before adding {update_type} reactions, we had {len(reactions_to_run)} reactions.")
reactions_to_run.update(new_reactions)
print(f"After adding {update_type} reactions, we have {len(reactions_to_run)} reactions.")

In [None]:
status, value, growth = PyFBA.fba.run_fba(compounds, reactions, reactions_to_run,
                                          media, biomass_equation)
print("Run has a biomass flux value of {} --> Growth: {}".format(value, growth))

## Trimming the model
Now that the model has been shown to grow on ArgonneLB media after several gap-filling steps, we should trim the added reactions down to only those required for growth.
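One way to trim a set of added reactions by bisection is sketched below. It is a generic illustration, not the algorithm inside `PyFBA.gapfill.minimize_additional_reactions`. The `grows` callable is a stand-in for running FBA, and the sketch assumes that adding reactions never turns a growing model into a non-growing one.

```python
def minimize_additions(base, candidates, grows):
    """Return a subset of candidates that is still sufficient for growth.

    base: set of reactions that are always kept
    candidates: iterable of gap-filled reactions to trim
    grows: callable taking a set of reactions and returning True/False
    """
    candidates = sorted(candidates)
    required = set()
    if not grows(base | set(candidates)):
        raise ValueError("No growth even with every candidate added")

    while not grows(base | required):
        # Binary search for the shortest prefix of candidates that gives growth
        lo, hi = 1, len(candidates)
        while lo < hi:
            mid = (lo + hi) // 2
            if grows(base | required | set(candidates[:mid])):
                hi = mid
            else:
                lo = mid + 1
        # The last reaction of that prefix is needed; later ones are not
        required.add(candidates[lo - 1])
        candidates = candidates[:lo - 1]
    return required

# Toy growth test: the "cell" grows only if r2 and r5 are both present
needed = {"r2", "r5"}
grows = lambda rxns: needed <= rxns

kept = minimize_additions({"r0"}, {"r1", "r2", "r3", "r4", "r5", "r6"}, grows)
print(sorted(kept))
```

Each pass of the outer loop identifies one required reaction with a logarithmic number of growth tests, which matters when every test is a full FBA run.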

In [None]:
reqd_additional = set()

# Begin loop through all gap-filled reactions
while added_reactions:
    ori = copy.copy(original_reactions_to_run)
    ori.update(reqd_additional)
    # Test next set of gap-filled reactions
    # Each set is based on a method described above
    how, new = added_reactions.pop()
    sys.stderr.write("Testing reactions from {}\n".format(how))
    
    # Get all the other gap-filled reactions we need to add
    for tple in added_reactions:
        ori.update(tple[1])
    
    # Use minimization function to determine the minimal
    # set of gap-filled reactions from the current method
    new_essential = PyFBA.gapfill.minimize_additional_reactions(ori, new, compounds,
                                                                reactions, media,
                                                                biomass_equation)
    sys.stderr.write("Saved {} reactions from {}\n".format(len(new_essential), how))
    for r in new_essential:
        sys.stderr.write(r + "\n")
    # Record the method used to determine
    # how the reaction was gap-filled
    for new_r in new_essential:
        reactions[new_r].is_gapfilled = True
        reactions[new_r].gapfill_method = how
    reqd_additional.update(new_essential)

# Combine old and new reactions
all_reactions = original_reactions_to_run.union(reqd_additional)

In [None]:
status, value, growth = PyFBA.fba.run_fba(compounds, reactions, all_reactions,
                                          media, biomass_equation)
print("The biomass reaction has a flux of {} --> Growth: {}".format(value, growth))