# How to gap-fill a genome-scale metabolic model

## Getting started

### Installing libraries

Before you start, you will need to install a couple of libraries:

The [PyFBA](http://linsalrob.github.io/PyFBA) library has detailed [installation instructions](http://linsalrob.github.io/PyFBA/installation.html). Don't be scared: it's mostly just `pip install`.

(Optional) Also get the [SEED Servers](https://github.com/linsalrob/SEED_Servers_Python), as you can get a lot of information from them. You can install the Python repository from GitHub. Make sure that `SEED_Servers_Python` is in your `PYTHONPATH`.

We start with importing some modules that we are going to use. 

We import *sys* so that we can use standard out and standard error if we have some error messages.<br>
We import *copy* so that we can make a deep copy of data structures for later comparisons.<br>
Then we import the *PyFBA* module to get started.

In [1]:
import sys
import copy
import PyFBA

In [2]:
modeldata = PyFBA.parse.model_seed.parse_model_seed_data('gramnegative', verbose=True)

We are logging to /home/redwards/GitHubsLinux/PyFBA/iPythonNotebooks/logs/PyFBA.2021-06-16T16:09:16.229866.log
Reading compounds from PyFBA.Biochemistry.ModelSEEDDatabase.Biochemistry.compounds.json
Reading reactions from PyFBA.Biochemistry.ModelSEEDDatabase.Biochemistry.reactions.json
Creating enzymes with complexes and reactions


## Running a basic model

The SBML model is great if you have built a model elsewhere, but what if you want to build a model from a genome?

We typically start with an *assigned_functions* file from RAST. The easiest way to find that is in the RAST directory by choosing `Genome Directory` from the `Downloads` menu on the job details page.

For this example, [here is an *assigned_functions* file](https://raw.githubusercontent.com/linsalrob/PyFBA/master/example_data/Citrobacter/ungapfilled_model/citrobacter.assigned_functions) from our *Citrobacter* model that you can download to the same directory as this iPython notebook. Notice that it has two tab-separated columns. The first column is the feature ID, using SEED standard IDs: these start with `fig|`, followed by the taxonomy ID and version number of the genome, then a feature type (`peg` for *protein encoding gene*, `rna` for *RNA*, `crispr_spacer` for *CRISPR spacers*, or another acronym), and finally the feature number. After the tab is the *functional role* of that feature.
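To make the ID layout concrete, here is a minimal sketch that splits one line into its parts. The helper name `parse_feature_line`, the regular expression, and the sample line are all invented for illustration; the notebook itself relies on PyFBA's `read_assigned_functions`.

```python
import re

# Pattern for SEED feature IDs as described above:
# fig|<taxonomy id>.<version>.<feature type>.<feature number>
FIG_ID = re.compile(r"^fig\|(\d+)\.(\d+)\.([A-Za-z_]+)\.(\d+)$")


def parse_feature_line(line):
    """Split one tab-separated line into its ID components and functional role."""
    feature_id, role = line.rstrip("\n").split("\t", 1)
    match = FIG_ID.match(feature_id)
    if match is None:
        raise ValueError(f"Not a SEED feature ID: {feature_id}")
    taxon, version, feature_type, number = match.groups()
    return {
        "taxonomy_id": int(taxon),
        "version": int(version),
        "feature_type": feature_type,
        "feature_number": int(number),
        "role": role,
    }


# A made-up example line, not taken from the Citrobacter file
example = "fig|12345.6.peg.42\tHypothetical example role (EC 1.1.1.1)"
parsed = parse_feature_line(example)
print(parsed["feature_type"], parsed["feature_number"], parsed["role"])
```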

We start by converting this *assigned_functions* file to a list of reactions.

In [3]:
assigned_functions = PyFBA.parse.read_assigned_functions("citrobacter.assigned_functions")
roles = set([i[0] for i in [list(j) for j in assigned_functions.values()]])
print("There are {} unique roles in this genome".format(len(roles)))

There are 3591 unique roles in this genome


Convert those roles to reactions. We start with a dict that maps each role to its reactions, but we only need the unique reactions, so we collect all of the dict's values into a single set.

In [5]:
roles_to_reactions = PyFBA.filters.roles_to_reactions(roles, organism_type="Gram_Negative", verbose=True)
reactions_to_run = set()
for role in roles_to_reactions:
    reactions_to_run.update(roles_to_reactions[role])
print(f"There are {len(reactions_to_run)} unique reactions associated with this genome")
with open("evaluated_reactions.txt", "w") as out:
    for r in reactions_to_run:
        out.write(f"{r}\n")

There are 1346 unique reactions associated with this genome
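The number of unique reactions is much smaller than the number of roles, which is what you would expect if several roles share a reaction. As a small sketch of the flattening step on invented data (the role names and reaction IDs below are not real):

```python
# Toy mapping of roles to reaction IDs (all names are invented)
toy_roles_to_reactions = {
    "role A": {"rxn00001", "rxn00002"},
    "role B": {"rxn00002", "rxn00003"},
    "role C": {"rxn00003"},
}

# set().union(*iterables) merges every value into one set of unique IDs,
# which is equivalent to calling update() in a loop as in the cell above
unique_reactions = set().union(*toy_roles_to_reactions.values())
print(f"{len(toy_roles_to_reactions)} roles map to {len(unique_reactions)} unique reactions")
```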


In [None]:
print(f"roles: {len(roles)} r2r: {len(roles_to_reactions)} r2run: {len(reactions_to_run)}")

### Read all the reactions and compounds in our database

We read all the reactions, compounds, and enzymes in the [ModelSEEDDatabase](https://github.com/ModelSEED/ModelSEEDDatabase) into three data structures. Each one is a dictionary with a string representation of the object as the key and the PyFBA object as the value.

We modify the reactions specifically for Gram negative models (there are also options for Gram positive models, Mycobacterial models, general microbial models, and plant models).

In [6]:
modeldata = PyFBA.parse.model_seed.parse_model_seed_data('gramnegative', verbose=True)
print(f"There are {len(modeldata.compounds):,} compounds, {len(modeldata.reactions):,} reactions, and {len(modeldata.enzymes):,} enzymes in total")

There are 33,845 compounds, 43,774 reactions, and 9,423 enzymes in total


In [None]:
# for known in ['rxn05514', 'rxn05541', 'rxn01103', 'rxn05533', 'rxn09137', 'rxn00837']:
#    modeldata.reactions[known].lower_bound = -1000
#    modeldata.reactions[known].upper_bound = 1000

#### Update reactions to run, making sure that all reactions are in the list!

There are some reactions that come from functional roles that do not appear in the reactions list. We're working on tracking these down, but for now we just check that all reaction IDs in *reactions_to_run* are in *reactions*, too.

In [7]:
tempset = set()
for r in reactions_to_run:
    if r in modeldata.reactions:
        tempset.add(r)
    else:
        print(f"Reaction ID {r} is not in our reactions list. Skipped", file=sys.stderr)
reactions_to_run = tempset

### Test whether these reactions grow on ArgonneLB media

We can test whether this set of reactions grows on ArgonneLB media. The media is the same one we used above, and you can download the [ArgonneLB.txt](https://raw.githubusercontent.com/linsalrob/PyFBA/master/media/ArgonneLB.txt) text file and put it in the same directory as this iPython notebook to run it.

(Note: we don't need to convert the media components, because the media and compounds come from the same source.)

In [8]:
media = PyFBA.parse.pyfba_media("ArgonneLB")
media = PyFBA.parse.correct_media_names(media, modeldata.compounds)
print(f"Our media has {len(media)} components")

Our media has 65 components


### Define a biomass equation

The biomass equation is the part that says whether the model will grow! This is a [metabolism.reaction.Reaction](https://github.com/linsalrob/PyFBA/blob/master/PyFBA/metabolism/reaction.py) object.

In [10]:
biomass_equation = PyFBA.metabolism.biomass_equation('gramnegative')

### Run the FBA

With the reactions, compounds, *reactions_to_run*, media, and biomass equation, we can test whether the model grows on this media.

In [14]:
status, value, growth = PyFBA.fba.run_fba(modeldata, reactions_to_run,
                                          media, biomass_equation)
print("Initial run has a biomass flux value of {} --> Growth: {}".format(value, growth))

Initial run has a biomass flux value of -2.093436047086711e-14 --> Growth: False
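A biomass flux of about -2e-14 is zero up to solver rounding error, which is why it is reported as no growth. The sketch below shows the general idea of comparing the flux against a cutoff instead of against zero. The function name and the threshold of 1.0 are assumptions made for illustration; check `run_fba` for the cutoff PyFBA actually applies.

```python
def looks_like_growth(flux, threshold=1.0):
    """Treat the model as growing only if the biomass flux clearly exceeds zero.

    The threshold is an illustrative assumption, not a documented PyFBA value.
    """
    return flux > threshold


# The first value is the flux reported above; the second is a made-up positive flux
for flux in (-2.093436047086711e-14, 250.0):
    print(f"{flux:.3g} -> growth: {looks_like_growth(flux)}")
```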


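## Can we gap-fill it to success?

The file `sbml_reactions.txt` lists a set of reactions, one reaction ID per line, that does grow on this media (we confirm this below). The reactions in that set that we do not have yet are the ones we need to add.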

In [15]:
sbml_reactions = set()
with open('sbml_reactions.txt', 'r') as f:
    for l in f:
        if l.startswith('rxn'):
            sbml_reactions.add(l.strip())

In [16]:
status, value, growth = PyFBA.fba.run_fba(modeldata, reactions_to_run.union(sbml_reactions),
                                          media, biomass_equation)
print("Initial run has a biomass flux value of {} --> Growth: {}".format(value, growth))

Initial run has a biomass flux value of 273.02622233524255 --> Growth: True


In [17]:
# these are the reactions we are looking for
missing_reactions = sbml_reactions.difference(reactions_to_run)
print(f"There are {len(missing_reactions)} missing reactions yet to find")

There are 244 missing reactions yet to find
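The same set difference is reused after every gap-filling step to see how many of the target reactions are still missing. A minimal sketch on invented IDs, with a hypothetical helper name:

```python
def count_still_missing(reference, current):
    """Number of reference reactions that are not yet in the current model."""
    return len(set(reference) - set(current))


# Invented reaction IDs for illustration
reference = {"rxn1", "rxn2", "rxn3", "rxn4"}
model = {"rxn1", "rxn9"}

before = count_still_missing(reference, model)
model.update({"rxn2", "rxn3"})          # pretend a gap-fill step added these
after = count_still_missing(reference, model)
print(f"missing before: {before}, missing after: {after}")
```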


## Gap-fill the model

Since the model does not grow on ArgonneLB we need to gap-fill it to ensure growth. There are several ways that we can gap-fill, and we will work through them until we get growth.

As you will see, we update the *reactions_to_run* set each time, and keep the media and everything else consistent. Then we just need to run the FBA like we have done above and see if we get growth.

We also keep a copy of the original *reactions_to_run*, and a list with all the reactions that we are adding, so once we are done we can go back and bisect the reactions that are added.

In [18]:
added_reactions = []
original_reactions_to_run = copy.copy(reactions_to_run)
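The copy matters because `set.update` changes the set in place. A plain assignment would only create a second name for the same set, so the "original" would grow along with *reactions_to_run*. A shallow copy is enough here because the elements are immutable strings. A small demonstration on toy data:

```python
import copy

reactions = {"rxn1", "rxn2"}
alias = reactions                  # same object: later updates show up here too
snapshot = copy.copy(reactions)    # independent set holding the same strings

reactions.update({"rxn3"})

print(sorted(alias))     # ['rxn1', 'rxn2', 'rxn3']
print(sorted(snapshot))  # ['rxn1', 'rxn2']
```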

### Essential reactions

There are ~100 reactions that are in every model we have tested, and we construe these to be essential for all models, so we typically add these next!

In [19]:
essential_reactions = PyFBA.gapfill.suggest_essential_reactions()
for r in essential_reactions:
    modeldata.reactions[r].reset_bounds()
added_reactions.append(("essential", essential_reactions))
print(f"Before updating we have {len(reactions_to_run)} reactions, and after updating we have ", end="")
reactions_to_run.update(essential_reactions)
print(f"{len(reactions_to_run)} reactions")

Before updating we have 1346 reactions, and after updating we have 1394 reactions


In [20]:
status, value, growth = PyFBA.fba.run_fba(modeldata, reactions_to_run, media, biomass_equation)
print(f"The biomass reaction has a flux of {value} --> Growth: {growth}")

The biomass reaction has a flux of -2.4960681287600593e-12 --> Growth: False


In [21]:
print(f"There are still {len(missing_reactions.difference(reactions_to_run))} reactions we have not found")

There are still 196 reactions we have not found
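Every gap-filling step in this notebook follows the same bookkeeping: get suggestions, record them in *added_reactions*, merge them into *reactions_to_run*, and test for growth. The sketch below captures only the bookkeeping part on toy data. The helper `apply_gapfill_step` is hypothetical and deliberately leaves out the `reset_bounds()` and `run_fba` calls, which need real PyFBA objects.

```python
def apply_gapfill_step(name, suggested, reactions_to_run, added_reactions):
    """Record one gap-fill step and merge its suggestions into the running set.

    Returns the number of reactions that were actually new.
    """
    suggested = set(suggested)
    before = len(reactions_to_run)
    added_reactions.append((name, suggested))   # keep the history for trimming later
    reactions_to_run.update(suggested)          # in-place merge, duplicates are ignored
    return len(reactions_to_run) - before


# Toy data standing in for real reaction IDs
running = {"rxn1", "rxn2"}
history = []
gained = apply_gapfill_step("essential", {"rxn2", "rxn3", "rxn4"}, running, history)
print(f"essential: suggested 3, gained {gained}, total {len(running)}")
```

The number gained can be smaller than the number suggested, because some suggested reactions are already in the model.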


### Linked Reactions

The ModelSEED has a notion of `linked reactions` that do things together. Here we add all of the linked reactions.

In [22]:
linked_reactions = PyFBA.gapfill.suggest_linked_reactions(modeldata, reactions_to_run, verbose=True)
for r in linked_reactions:
    modeldata.reactions[r].reset_bounds()
added_reactions.append(("linked_reactions", linked_reactions))

Gapfill by linked reactions found 1443 new reactions


In [23]:
print(f"Before updating we have {len(reactions_to_run)} reactions, and after updating we have ", end="")
reactions_to_run.update(linked_reactions)
print(f"{len(reactions_to_run)} reactions")

Before updating we have 1394 reactions, and after updating we have 2837 reactions


In [24]:
status, value, growth = PyFBA.fba.run_fba(modeldata, reactions_to_run, media, biomass_equation)
print(f"The biomass reaction has a flux of {value} --> Growth: {growth}")

The biomass reaction has a flux of -1.0847732382836745e-12 --> Growth: False


In [25]:
print(f"There are still {len(missing_reactions.difference(reactions_to_run))} reactions we have not found")

There are still 196 reactions we have not found


### EC Numbers

Make sure we have added all the EC numbers that we know about from our roles!

In [26]:
ecnumber_reactions = PyFBA.gapfill.suggest_reactions_using_ec(roles, modeldata, reactions_to_run, verbose=True)
for r in ecnumber_reactions:
    modeldata.reactions[r].reset_bounds()
added_reactions.append(("ec_numbers_brief", ecnumber_reactions))

Gapfilling by EC number found 145 new reactions


In [27]:
print(f"Before updating we have {len(reactions_to_run)} reactions, and after updating we have ", end="")
reactions_to_run.update(ecnumber_reactions)
print(f"{len(reactions_to_run)} reactions")

Before updating we have 2837 reactions, and after updating we have 2982 reactions


In [28]:
status, value, growth = PyFBA.fba.run_fba(modeldata, reactions_to_run, media, biomass_equation)
print(f"The biomass reaction has a flux of {value} --> Growth: {growth}")

The biomass reaction has a flux of -1.3589565316186557e-12 --> Growth: False


In [29]:
print(f"There are still {len(missing_reactions.difference(reactions_to_run))} reactions we have not found")

There are still 184 reactions we have not found


### Media import reactions

We need to make sure that the cell can import everything that is in the media... otherwise it won't be able to grow. Be sure to only do this step if you are certain that the cell can grow on the media you are testing.
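As a rough sketch of the idea, the check below looks for media compounds that no reaction in the running set moves from outside the cell to inside. The data layout (pairs of compound and location) and the function name are invented for this example, and the test is much simpler than whatever `suggest_from_media` does with real PyFBA objects.

```python
def unimportable_compounds(media, running, reaction_compounds):
    """Return media compounds that no running reaction moves across the membrane.

    reaction_compounds maps a reaction ID to a set of (compound, location)
    pairs, where location is 'e' (extracellular) or 'c' (cytoplasm).
    """
    imported = set()
    for rxn in running:
        parts = reaction_compounds[rxn]
        for compound, location in parts:
            # A simple transporter has the same compound on both sides
            if location == "e" and (compound, "c") in parts:
                imported.add(compound)
    return set(media) - imported


# Invented reactions for illustration
toy_reactions = {
    "transport_glucose": {("glucose", "e"), ("glucose", "c")},
    "transport_niacin": {("niacin", "e"), ("niacin", "c")},
    "glycolysis_step": {("glucose", "c"), ("g6p", "c")},
}
toy_media = {"glucose", "niacin"}
toy_running = {"transport_glucose", "glycolysis_step"}

stranded = unimportable_compounds(toy_media, toy_running, toy_reactions)
print(sorted(stranded))  # ['niacin']
```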

In [30]:
media_reactions = PyFBA.gapfill.suggest_from_media(modeldata, reactions_to_run, media, verbose=True)
for r in media_reactions:
    modeldata.reactions[r].reset_bounds()
added_reactions.append(("media", media_reactions))
print(f"Before updating we have {len(reactions_to_run)} reactions, and after updating we have ", end="")
reactions_to_run.update(media_reactions)
print(f"{len(reactions_to_run)} reactions")

ERROR: upsr_10: UPTAKE_SECRETION_REACTION 10 was not found in our reactions
ERROR: upsr_8: UPTAKE_SECRETION_REACTION 8 was not found in our reactions
Adding from media: For Niacin added 4 reactions
ERROR: upsr_76: UPTAKE_SECRETION_REACTION 76 was not found in our reactions
ERROR: upsr_82: UPTAKE_SECRETION_REACTION 82 was not found in our reactions
Adding from media: For O2 added 11 reactions
ERROR: upsr_126: UPTAKE_SECRETION_REACTION 126 was not found in our reactions
ERROR: upsr_125: UPTAKE_SECRETION_REACTION 125 was not found in our reactions
ERROR: upsr_127: UPTAKE_SECRETION_REACTION 127 was not found in our reactions
ERROR: upsr_139: UPTAKE_SECRETION_REACTION 139 was not found in our reactions
Adding from media: For Cd2+ added 11 reactions
ERROR: upsr_44: UPTAKE_SECRETION_REACTION 44 was not found in our reactions
ERROR: upsr_40: UPTAKE_SECRETION_REACTION 40 was not found in our reactions
Adding from media: For Folate added 4 reactions
ERROR: upsr_116: UPTAKE_SECRETION_REACTION 116

ERROR: upsr_165: UPTAKE_SECRETION_REACTION 165 was not found in our reactions
ERROR: upsr_164: UPTAKE_SECRETION_REACTION 164 was not found in our reactions
ERROR: upsr_179: UPTAKE_SECRETION_REACTION 179 was not found in our reactions
Adding from media: For H+ added 1319 reactions
ERROR: upsr_43: UPTAKE_SECRETION_REACTION 43 was not found in our reactions
ERROR: upsr_47: UPTAKE_SECRETION_REACTION 47 was not found in our reactions
Adding from media: For Uracil added 4 reactions
ERROR: upsr_95: UPTAKE_SECRETION_REACTION 95 was not found in our reactions
ERROR: upsr_93: UPTAKE_SECRETION_REACTION 93 was not found in our reactions
ERROR: upsr_104: UPTAKE_SECRETION_REACTION 104 was not found in our reactions
ERROR: upsr_94: UPTAKE_SECRETION_REACTION 94 was not found in our reactions
Adding from media: For L-Tyrosine added 14 reactions
ERROR: upsr_60: UPTAKE_SECRETION_REACTION 60 was not found in our reactions
ERROR: upsr_66: UPTAKE_SECRETION_REACTION 66 was not found in our reactions
Adding f

Before updating we have 2982 reactions, and after updating we have 4508 reactions


ERROR: upsr_81: UPTAKE_SECRETION_REACTION 81 was not found in our reactions
ERROR: upsr_75: UPTAKE_SECRETION_REACTION 75 was not found in our reactions
Adding from media: For L-Arginine added 19 reactions
ERROR: upsr_158: UPTAKE_SECRETION_REACTION 158 was not found in our reactions
ERROR: upsr_171: UPTAKE_SECRETION_REACTION 171 was not found in our reactions
ERROR: upsr_157: UPTAKE_SECRETION_REACTION 157 was not found in our reactions
ERROR: upsr_156: UPTAKE_SECRETION_REACTION 156 was not found in our reactions
Adding from media: For Co2+ added 8 reactions


In [31]:
status, value, growth = PyFBA.fba.run_fba(modeldata, reactions_to_run, media, biomass_equation)
print(f"The biomass reaction has a flux of {value} --> Growth: {growth}")

The biomass reaction has a flux of 4.3860264727446435e-13 --> Growth: False


In [32]:
print(f"There are still {len(missing_reactions.difference(reactions_to_run))} reactions we have not found")

There are still 160 reactions we have not found


### Reactions from closely related organisms

We also gap-fill on closely related organisms. We assume that an organism is most likely to have reactions in its genome that are similar to those in closely related organisms. 

You will need to download the [closest.genomes.roles](https://raw.githubusercontent.com/linsalrob/PyFBA/master/example_data/Citrobacter/ungapfilled_model/closest.genomes.roles) file and put it in the same directory as this iPython notebook.

In [33]:
reactions_from_other_orgs = PyFBA.gapfill.suggest_from_roles("closest.genomes.roles", modeldata.reactions)
for r in reactions_from_other_orgs:
    modeldata.reactions[r].reset_bounds()
added_reactions.append(("close genomes", reactions_from_other_orgs))
print(f"Before updating we have {len(reactions_to_run)} reactions, and after updating we have ", end="")
reactions_to_run.update(reactions_from_other_orgs)
print(f"{len(reactions_to_run)} reactions")

Before updating we have 4508 reactions, and after updating we have 4863 reactions


In [34]:
status, value, growth = PyFBA.fba.run_fba(modeldata, reactions_to_run, media, biomass_equation)
print(f"The biomass reaction has a flux of {value} --> Growth: {growth}")

The biomass reaction has a flux of 999.9999999999991 --> Growth: True


In [35]:
print(f"There are still {len(missing_reactions.difference(reactions_to_run))} reactions we have not found")

There are still 109 reactions we have not found


### Subsystems

The reactions connect us to subsystems (see [Overbeek et al. 2014](http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3965101/)), and this test ensures that all the subsystems are complete. We add reactions required to complete the subsystem.
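One way to picture this step is sketched below on toy data. It assumes that the threshold is the fraction of a subsystem's reactions that must already be present before the rest are suggested. That reading of `threshold`, the function name, and the data layout are assumptions for illustration and may not match `suggest_reactions_from_subsystems`.

```python
def suggest_from_subsystems(subsystems, running, threshold=0.5):
    """Suggest the missing reactions of every subsystem that is mostly present."""
    suggestions = set()
    for name, members in subsystems.items():
        members = set(members)
        if not members:
            continue
        present = members & running
        # Only complete a subsystem when enough of it is already in the model
        if len(present) / len(members) >= threshold:
            suggestions |= members - running
    return suggestions


# Invented subsystems and reaction IDs
toy_subsystems = {
    "subsystem A": {"rxn1", "rxn2", "rxn3", "rxn4"},   # 3 of 4 present
    "subsystem B": {"rxn5", "rxn6", "rxn7", "rxn8"},   # 1 of 4 present
}
toy_running = {"rxn1", "rxn2", "rxn3", "rxn5"}

completed = suggest_from_subsystems(toy_subsystems, toy_running)
print(sorted(completed))  # ['rxn4']
```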

In [36]:
if not growth:
    subsystem_reactions = \
        PyFBA.gapfill.suggest_reactions_from_subsystems(modeldata.reactions,
                                                        reactions_to_run,
                                                        threshold=0.5)
    for r in subsystem_reactions:
        modeldata.reactions[r].reset_bounds()
    added_reactions.append(("subsystems", subsystem_reactions))
    print(f"Before updating we have {len(reactions_to_run)} reactions, and after updating we have ", end="")
    reactions_to_run.update(subsystem_reactions)
    print(f"{len(reactions_to_run)} reactions")
    status, value, growth = PyFBA.fba.run_fba(modeldata, reactions_to_run, media, biomass_equation)
    print(f"The biomass reaction has a flux of {value} --> Growth: {growth}")
    print(f"There are still {len(missing_reactions.difference(reactions_to_run))} reactions we have not found")

### Revisit EC Numbers

When we added the EC numbers before, we were a little conservative, only adding those EC numbers that appeared in two or fewer reactions (the default). If we get here, let's be aggressive and add any EC number regardless of how many reactions it brings in. To do so, we set the `maxnumrx` parameter to 0.
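The effect of such a limit can be sketched on toy data as follows. The function, the mapping from EC numbers to reactions, and the reaction IDs are invented for illustration; only the behaviour described above (a small default limit, and 0 meaning no limit) is taken from the text.

```python
def reactions_for_ec_numbers(ec_to_reactions, running, maxnumrx=2):
    """Suggest reactions for EC numbers, skipping EC numbers with too many reactions.

    A maxnumrx of 0 disables the limit.
    """
    suggestions = set()
    for ec, reactions in ec_to_reactions.items():
        if maxnumrx and len(reactions) > maxnumrx:
            continue   # too promiscuous for the conservative pass
        suggestions |= set(reactions) - running
    return suggestions


# Illustrative mapping; the reaction IDs are made up
toy_ec = {
    "1.1.1.1": {"rxnA", "rxnB"},
    "2.7.1.2": {"rxnC", "rxnD", "rxnE", "rxnF"},
}
toy_running = {"rxnA"}

conservative = reactions_for_ec_numbers(toy_ec, toy_running)
aggressive = reactions_for_ec_numbers(toy_ec, toy_running, maxnumrx=0)
print(sorted(conservative))  # ['rxnB']
print(sorted(aggressive))    # ['rxnB', 'rxnC', 'rxnD', 'rxnE', 'rxnF']
```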

In [37]:
if not growth:
    ecnumber_reactions = PyFBA.gapfill.suggest_reactions_using_ec(roles, modeldata, reactions_to_run, maxnumrx=0, verbose=True)
    for r in ecnumber_reactions:
        modeldata.reactions[r].reset_bounds()
    added_reactions.append(("ec_numbers_full", ecnumber_reactions))
    # Compute a compound-limited subset for comparison; note that the full set is still added below
    lec = PyFBA.gapfill.limit_reactions_by_compound(modeldata.reactions, reactions_to_run, ecnumber_reactions, 50)
    print(f"Before limiting we wanted to add {len(ecnumber_reactions)}, and after, we wanted to add {len(lec)} reactions")
    print(f"Before updating we have {len(reactions_to_run)} reactions, and after updating we have ", end="")
    reactions_to_run.update(ecnumber_reactions)
    print(f"{len(reactions_to_run)} reactions")
    status, value, growth = PyFBA.fba.run_fba(modeldata, reactions_to_run, media, biomass_equation)
    print(f"The biomass reaction has a flux of {value} --> Growth: {growth}")
    print(f"There are still {len(missing_reactions.difference(reactions_to_run))} reactions we have not found")

### Revisit linked reactions

We revisit linked reactions once more, because now we have many more reactions in our set to run!

In [38]:
if not growth:
    linked_reactions = PyFBA.gapfill.suggest_linked_reactions(modeldata, reactions_to_run, verbose=True)
    for r in linked_reactions:
        modeldata.reactions[r].reset_bounds()
    added_reactions.append(("linked_reactions_addition", linked_reactions))
    print(f"Before updating we have {len(reactions_to_run)} reactions, and after updating we have ", end="")
    reactions_to_run.update(linked_reactions)
    print(f"{len(reactions_to_run)} reactions")
    status, value, growth = PyFBA.fba.run_fba(modeldata, reactions_to_run, media, biomass_equation)
    print(f"The biomass reaction has a flux of {value} --> Growth: {growth}")
    print(f"There are still {len(missing_reactions.difference(reactions_to_run))} reactions we have not found")

### Orphan compounds

Orphan compounds are those compounds that are associated with only one reaction in the model. Such a compound is either produced but never consumed, or consumed but never produced. We need to add reaction(s) that connect those compounds to the rest of the network.

You can change the maximum number of reactions that a compound is in to be considered an orphan (try increasing it to 2 or 3).
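A minimal sketch of the idea on toy data: count how many running reactions each compound takes part in, call the rarely used ones orphans, and suggest unused reactions that touch them. The function names and the data layout are invented here, and this ignores reaction direction, so treat it as an outline of the concept, not of `suggest_by_compound` itself.

```python
from collections import defaultdict


def find_orphan_compounds(reaction_compounds, running, max_reactions=1):
    """Compounds that appear in at most max_reactions of the running reactions."""
    counts = defaultdict(int)
    for rxn in running:
        for compound in reaction_compounds[rxn]:
            counts[compound] += 1
    return {c for c, n in counts.items() if n <= max_reactions}


def suggest_for_orphans(reaction_compounds, running, max_reactions=1):
    """Reactions outside the model that involve at least one orphan compound."""
    orphans = find_orphan_compounds(reaction_compounds, running, max_reactions)
    return {rxn for rxn, compounds in reaction_compounds.items()
            if rxn not in running and compounds & orphans}


# Invented reactions: each maps to the set of compounds it involves
toy = {
    "rxn1": {"A", "B"},
    "rxn2": {"B", "C"},
    "rxn3": {"C", "D"},   # not running; would connect compound C
    "rxn4": {"E", "F"},   # not running; unrelated to the model
}
toy_running = {"rxn1", "rxn2"}

print(sorted(find_orphan_compounds(toy, toy_running)))  # ['A', 'C']
print(sorted(suggest_for_orphans(toy, toy_running)))    # ['rxn3']
```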

In [39]:
if not growth:
    orphan_reactions = PyFBA.gapfill.suggest_by_compound(modeldata,
                                                         reactions_to_run,
                                                         max_reactions=1)
    # Compute a compound-limited subset for comparison; note that the full set is still added below
    lor = PyFBA.gapfill.limit_reactions_by_compound(modeldata.reactions, reactions_to_run, orphan_reactions, 50)
    print(f"Before limiting we wanted to add {len(orphan_reactions)}, and after, we wanted to add {len(lor)} reactions")
    for r in orphan_reactions:
        modeldata.reactions[r].reset_bounds()
    added_reactions.append(("orphans", orphan_reactions))
    print(f"Before updating we have {len(reactions_to_run)} reactions, and after updating we have ", end="")
    reactions_to_run.update(orphan_reactions)
    print(f"{len(reactions_to_run)} reactions")

In [None]:
status, value, growth = PyFBA.fba.run_fba(modeldata, reactions_to_run, media, biomass_equation)
print(f"The biomass reaction has a flux of {value} --> Growth: {growth}")

In [None]:
print(f"There are still {len(missing_reactions.difference(reactions_to_run))} reactions we have not found")

## Trimming the model

Now that the model has been shown to grow on ArgonneLB media after several gap-fill iterations, we should trim the added reactions down to only those required to observe growth.

In this example, we start with the most recently added set of reactions and work backwards, bisecting each set to find the minimum number of reactions that will ensure growth. In the run shown here the most recent set is `close genomes`, because the later steps only run if the model still does not grow.

We strongly recommend enabling verbose output on the `PyFBA.gapfill.minimize_additional_reactions` function, as it will demonstrate the power of `O(log n)` functions.
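The bisection idea can be sketched without PyFBA. The version below is a simplification under stated assumptions: `grows` is a stand-in for running the FBA, the function name is invented, and the fallback is a plain one-at-a-time removal. The verbose log from the real function shows that it also tries uneven splits, which this sketch does not.

```python
def minimize_reactions(base, optional, grows):
    """Shrink `optional` to a small subset that still gives growth with `base`.

    grows(reactions) is any callable returning True or False for a set of
    reaction IDs; here it stands in for an FBA run.
    """
    optional = list(optional)

    # Bisection phase: keep whichever half is sufficient on its own
    while len(optional) > 1:
        mid = len(optional) // 2
        left, right = optional[:mid], optional[mid:]
        if grows(base | set(left)):
            optional = left
        elif grows(base | set(right)):
            optional = right
        else:
            break   # the required reactions are split across both halves

    # Fallback phase: drop reactions one at a time if growth survives without them
    required = list(optional)
    for rxn in list(required):
        trial = [r for r in required if r != rxn]
        if grows(base | set(trial)):
            required = trial
    return set(required)


# Toy "model": growth needs reactions a and b in addition to the base set
def toy_grows(reactions):
    return {"x", "a", "b"} <= reactions


kept = minimize_reactions({"x"}, ["a", "c", "d", "b", "e", "f"], toy_grows)
print(sorted(kept))  # ['a', 'b']
```

When one half is sufficient, each round discards half of the candidates, which is where the logarithmic behaviour comes from. The one-at-a-time fallback is linear in the number of candidates that remain.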

In [40]:
reqd_additional = set()
print(f"Before we begin, our original reactions were {len(original_reactions_to_run)}")
# Begin loop through all gap-filled reactions
while added_reactions:
    ori = copy.copy(original_reactions_to_run)
    ori.update(reqd_additional)
    # Test next set of gap-filled reactions
    # Each set is based on a method described above
    how, new = added_reactions.pop()
    sys.stderr.write(f"Testing reactions from {how}\n")
    
    # Get all the other gap-filled reactions we need to add
    for tple in added_reactions:
        ori.update(tple[1])
    
    # Use minimization function to determine the minimal
    # set of gap-filled reactions from the current method
    new_essential = PyFBA.gapfill.minimize_additional_reactions(ori, new, modeldata, media,
                                                                biomass_equation, verbose=True)
    sys.stderr.write(f"Saved {len(new_essential)} reactions from {how}\n")
    # Record the method used to determine
    # how the reaction was gap-filled
    for new_r in new_essential:
        modeldata.reactions[new_r].is_gapfilled = True
        modeldata.reactions[new_r].gapfill_method = how
    reqd_additional.update(new_essential)

# Combine old and new reactions
all_reactions = original_reactions_to_run.union(reqd_additional)

Before we begin, our original reactions were 1346


Testing reactions from close genomes
Successfully limited the reactions by compound and reduced from 1670 to 301
At the beginning the base list has 4508 and the optional list has 301 reactions
Iteration: 1 Try: 0 Length: 151 and 150 Growth: False and False
Iteration: 1 Try: 0 Length: 120 and 181 Growth: False and True
Iteration: 2 Try: 0 Length: 91 and 90 Growth: False and False
Iteration: 2 Try: 0 Length: 72 and 109 Growth: False and False
Iteration: 2 Try: 0 Length: 36 and 145 Growth: False and True
Iteration: 3 Try: 0 Length: 73 and 72 Growth: False and False
Iteration: 3 Try: 0 Length: 58 and 87 Growth: False and False
Iteration: 3 Try: 0 Length: 29 and 116 Growth: False and False
Iteration: 3 Try: 0 Length: 14 and 131 Growth: False and True
Iteration: 4 Try: 0 Length: 66 and 65 Growth: False and False
Iteration: 4 Try: 0 Length: 52 and 79 Growth: False and False
Iteration: 4 Try: 0 Length: 26 and 105 Growth: False and False
Iteration: 4 Try: 0 Length: 13 and 118 Growth: False and 

Result: NOT REQUIRED
Single reaction iteration 17 of 17: Attempting without rxn00162: (1) Oxaloacetate[c] + (1) H+[c] <=> (1) CO2[c] + (1) Pyruvate[c]
Result: NOT REQUIRED
There are 3 reactions remaining: {'rxn01513', 'rxn12512', 'rxn03087'}
Saved 3 reactions from close genomes
Testing reactions from media
Successfully limited the reactions by compound and reduced from 1526 to 978
At the beginning the base list has 2985 and the optional list has 978 reactions
Iteration: 1 Try: 0 Length: 489 and 489 Growth: False and True
Iteration: 2 Try: 0 Length: 245 and 244 Growth: True and NOT TESTED
Iteration: 3 Try: 0 Length: 123 and 122 Growth: True and NOT TESTED
Iteration: 4 Try: 0 Length: 62 and 61 Growth: True and NOT TESTED
Iteration: 5 Try: 0 Length: 31 and 31 Growth: True and NOT TESTED
Iteration: 6 Try: 0 Length: 16 and 15 Growth: True and NOT TESTED
Iteration: 7 Try: 0 Length: 8 and 8 Growth: True and NOT TESTED
Iteration: 8 Try: 0 Length: 4 and 4 Growth: True and NOT TESTED
Iteration: 

Iteration: 9 Try: 0 Length: 3 and 3 Growth: False and True
Iteration: 10 Try: 0 Length: 2 and 1 Growth: True and NOT TESTED
Iteration: 11 Try: 0 Length: 1 and 0 Growth: True and NOT TESTED
There are 1 reactions remaining: {'rxn32076'}
Saved 1 reactions from linked_reactions
Testing reactions from essential
Successfully limited the reactions by compound and reduced from 109 to 38
At the beginning the base list has 1355 and the optional list has 38 reactions
Iteration: 1 Try: 0 Length: 19 and 19 Growth: False and False
Iteration: 1 Try: 0 Length: 15 and 23 Growth: False and False
Iteration: 1 Try: 0 Length: 7 and 31 Growth: False and False
Iteration: 1 Try: 0 Length: 3 and 35 Growth: False and True
Iteration: 2 Try: 0 Length: 18 and 17 Growth: False and False
Iteration: 2 Try: 0 Length: 14 and 21 Growth: False and False
Iteration: 2 Try: 0 Length: 7 and 28 Growth: False and False
Iteration: 2 Try: 0 Length: 3 and 32 Growth: False and False
Iteration: 2 Try: 0 Length: 1 and 34 Growth: Fal

In [41]:
with open('gapfilled_reactions.txt', 'w') as out:
    for a in all_reactions:
        out.write(f"{a}\n")

In [42]:
print(f"After completing reaction trimming, we have {len(all_reactions)} reactions")
status, value, growth = PyFBA.fba.run_fba(modeldata, all_reactions, media, biomass_equation)
print(f"The biomass reaction has a flux of {value} --> Growth: {growth}")

After completing reaction trimming, we have 1359 reactions
The biomass reaction has a flux of 290.97497947966707 --> Growth: True


## Other gap-filling techniques

Besides those methods we have described above, listed here are other methods that can be used to gap-fill your model. This list will continue to grow over time as we create new techniques to identify reactions and compounds that should be added to your model.

### Probable reactions
Each candidate reaction is given a probability based on whether there is a protein associated with the reaction and whether the reaction's compounds are currently present in the model. Reactions above a chosen probability threshold are added to the model.

In [None]:
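# Note: this cell has not been run here. We pass modeldata.reactions because
# no standalone `reactions` variable is defined earlier in this notebook.
probable_reactions = PyFBA.gapfill.compound_probability(modeldata.reactions, reactions_to_run,
                                                        cutoff=0, rxn_with_proteins=True)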
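To give a feel for this kind of scoring, here is a toy sketch. The score used below (the fraction of a reaction's compounds that are already in the model) is an assumption chosen for illustration and is not PyFBA's actual formula. The function name and data layout are likewise invented; only the parameter names `cutoff` and `rxn_with_proteins` are borrowed from the call above.

```python
def probable_reactions_sketch(candidates, model_compounds, cutoff=0.0, rxn_with_proteins=True):
    """Select candidate reactions whose toy score exceeds the cutoff.

    candidates maps a reaction ID to (set of compounds, has_protein).
    The score is an illustrative stand-in for a real probability.
    """
    selected = set()
    for rxn, (compounds, has_protein) in candidates.items():
        if rxn_with_proteins and not has_protein:
            continue   # require protein evidence when asked to
        score = len(compounds & model_compounds) / len(compounds) if compounds else 0.0
        if score > cutoff:
            selected.add(rxn)
    return selected


# Invented candidates
toy_candidates = {
    "rxnX": ({"A", "B"}, True),    # half of its compounds are in the model
    "rxnY": ({"Q", "R"}, True),    # none of its compounds are in the model
    "rxnZ": ({"A", "C"}, False),   # no protein evidence
}
toy_model_compounds = {"A", "C"}

chosen = probable_reactions_sketch(toy_candidates, toy_model_compounds)
print(sorted(chosen))  # ['rxnX']
```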