# Verifying the Accuracy of iGEM Toronto's Flux Scanning Based on Enforced objective Flux (FSEOF) Implementation

As part of their 2023 project, the Dry Lab sub-team of the University of Toronto's iGEM team implemented the Flux Scanning Based on Enforced objective Flux (FSEOF) method, based on the original FSEOF framework presented in the paper by Choi et al., titled: *In silico identification of gene amplification targets for improvement of lycopene production*. FSEOF is used to identify candidate genes that can be overexpressed in order to achieve a metabolic engineering objective.

<in order to validate our implementation, we decided to replicate so and so experiment (from the paper) to see if we got the same candidate genes for improved lycopene biosynthesis



*Implementation of FSEOF on lycopene-producing E. coli*
Choi et al. 2010: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2869140/

In [1]:
import sys
sys.path.append("..")

## Construction of the lycopene-producing *E. coli*

The parental *E. coli* strain that was used in the paper to experimentally validate the targets identified by FSEOF (paper's FSEOF) is a "recombinant E. coli DH5 alpha strain... which contains the *Erwinia uredovora crtEIB* (lycopene biosynthesis) genes and the *E. coli dxs* gene." (Choi et al. 2010)

< with the exception of dxs (they added it for the purposes of overexpression, but we cant simulate that), since it was already present in the model)>

The genome scale metabolic model used in the paper for the *in-silico* metabolic modeling and analysis of the recombinant E. coli DH5 alpha strain (using MetaFluxNet) was EcoMBEL979, which was expanded to include the lycopene biosynthetic pathway genes (crtEIB).

Instead of using the expanded EcoMBEL979 model for the *in-silico* metabolic modeling and analysis of the recombinant E. coli DH5 alpha strain, this example uses the already available BiGG model for E. coli DH5 alpha (BiGG ID: iEC1368_DH5a), and expands it to include the necessary lycopene biosynthetic pathway genes, crtE, crtB and crtI.

In addition to the lycopene biosynthetic genes, a lycopene demand reaction was added. [Should I explain why here?]
Ref: https://cnls.lanl.gov/external/qbio2018/Slides/FBA%202/qBio-FBA-lab-slides.pdf

Reference for the exact genes (and their reactions) added can be found in choi et. al supplementary file 3, page 11.

In [2]:
import cobra
import cobra
from cobra import Reaction


model = cobra.io.read_sbml_model("iEC1368_DH5a.xml")


# ==============================
# Additional genes and reactions 
# ==============================


def add_single_gene_reaction_pair_lyc(
    model,
    gene_id,
    reaction_id,
    reaction_name,
    metabolites,
    gene_name=None,
):
    assert not model.genes.query(lambda k: k == gene_id, attribute="id")
    assert not model.reactions.query(lambda k: k == reaction_id, attribute="id")

    rxn = cobra.Reaction(id=reaction_id)

    if gene_name is None:
        gene_name = gene_id
    gene = cobra.Gene(gene_id, name=gene_name)

    model.add_reactions([rxn])
    model.genes.add(gene)

    rxn.name = reaction_name
    rxn.bounds = (-1000, 1000)
    rxn.add_metabolites(metabolites)
    rxn.gene_reaction_rule = gene_id


# Add crtE gene
ggpp = cobra.Metabolite(
    id="ggpp",
    formula="C20H33O7P2",
    name="geranylgeranyl diphosphate",
    charge=-3,
    compartment="c",
)
model.add_metabolites([ggpp])

phyto = cobra.Metabolite(
    id="phyto",
    formula="C40H64",
    name="phytoene",
    charge=0,
    compartment="c",
)
model.add_metabolites([phyto])

lyco = cobra.Metabolite(
    id="lyco",
    formula="C40H56",
    name="lycopene",
    charge=0,
    compartment="c",
)
model.add_metabolites([lyco])

add_single_gene_reaction_pair_lyc(
    model=model,
    gene_id="crtE",
    reaction_id="ZCRTE",
    reaction_name="Synthesis of geranylgeranyl pyrophosphate",
    metabolites={
        "ipdp_c": -1.0,
        "frdp_c": -1.0,
        "ggpp": 1.0,
        "ppi_c": 1.0
    },
)


# Add crtB gene 
add_single_gene_reaction_pair_lyc(
    model=model,
    gene_id="crtB",
    reaction_id="ZCRTB",
    reaction_name="Synthesis of phytoene",
    metabolites={
        "ggpp": -2.0,
        "phyto": 1.0,
        "ppi_c": 1.0
    },
)

# Add crtI gene 
add_single_gene_reaction_pair_lyc(
    model=model,
    gene_id="crtI",
    reaction_id="ZCRTI",
    reaction_name="Synthesis of lycopene from phytoene (dehydrogenation reaction)",
    metabolites={
        "phyto": -1.0,
        "fad_c": -8.0,
        "lyco": 1.0,
        "fadh2_c": 8.0
    },
)


# Add lycopene demand 
# https://cnls.lanl.gov/external/qbio2018/Slides/FBA%202/qBio-FBA-lab-slides.pdf (slide 21)
lyco_dem = Reaction("LYCOdem")
model.add_reactions([lyco_dem])
lyco_dem.name = "Lycopene demand reaction"
lyco_dem.lower_bound = 0
lyco_dem.upper_bound = 1000
lyco_dem.add_metabolites({"lyco": -1.0})

## Media Setup

Next, the media must be set. [Explain media setup]


In [3]:
#set LB media
#lb medium ref: https://github.com/cdanielmachado/carveme/blob/master/carveme/data/input/media_db.tsv
#TODO: what should the flux of all the metabolites here be set to?
#TODO: simulated LB media here, but paper's media is '2 x YT'

LB_MEDIA_COMP = [
    "adn", "ala__L", "amp", "arg__L",
    "aso3", "asp__L", "ca2", "cbl1",
    "cd2", "cl", "cmp", "cobalt2",
    "cro4", "cu2", "cys__L", "dad_2",
    "dcyt", "fe2", "fe3", "fol", "glc__D",
    "glu__L", "gly", "gmp", "gsn", "h2o",
    "h2s", "h", "hg2", "his__L", "hxan",
    "ile__L", "ins", "k", "leu__L", "lipoate",
    "lys__L", "met__L", "mg2", "mn2", "mobd",
    "na1", "nac", "nh4", "ni2", "o2",
    "phe__L", "pheme", "pi", "pnto__R",
    "pro__L", "pydx", "ribflv", "ser__L",
    "so4", "thm", "thr__L", "thymd", "trp__L",
    "tyr__L", "ump", "ura", "uri", "val__L",
    "zn2",
]

for metabolite in LB_MEDIA_COMP:
    model.medium[f"EX_{metabolite}_e"] = 5

In [4]:
# Check objective to see if matches the one presented in paper

# Max. theoretical lycopene flux
model.objective = "LYCOdem"
model.slim_optimize()

0.9657492354740086

Now, we will run our implementation of FSEOF on the model created above.

In [None]:
import cobra_fseof

results = cobra_fseof.fseof(model, 10, "LYCOdem", "BIOMASS_Ec_iJO1366_core_53p95M", "max")
print(results.scan)

FSEOF; Scanning: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 10/10 [00:00<00:00, 29.30it/s]
FSEOF; Running FVA:  30%|██████████████████████████████████▏                                                                               | 3/10 [00:48<01:53, 16.19s/it]

Let's print the list of target reactions that FSEOF identified.

In [None]:
scan = results.scan 
targets = set(scan[scan.target].index)
targets

In [None]:
# Choi et al., 2010 Supplemental Table 4A (All reactions that were identified by their FSEOF)

# TCA cycle reactions
# "FUM_rxn" not in model, so replaced it with "FUM"
tableFourTCA = {"ACONT", "CS", "FUM", "ICDHyr", "MDH", "SUCDli", "SUCOAS", "AKGDH", "SUCD4"}

# "MECDPDH5" from table 4A, is replaced with "MECDPDH2" and "MECDPDH5" here
tableFourLyc = {"CDPMEK", "DMATT", "DXPRIi", "DXPS","GRTT", "IPDDI", "IPDPS", "MECDPDH5", "MECDPDH2", "MECDPS", "MEPCT", "ZCRTE", "ZCRTB", "ZCRTI"}

# TODO: PFK_3==PFK (but theres also PFK in model...)
# FBA is replaced with "FBA3" http://bigg.ucsd.edu/models/iEC1368_DH5a/genes/locus_2187
tableFourGlyc = {"FBA3", "PFK_3", "PGI", "TPI"}

# Disclaimer: "CACTP" is replaced by "CRNabcpp" (since model did not have CACTP rxn)
# CAITP is replaced by CRNt8pp (since model did not have CAITP rxn)
tableFourOther = {"ADK1", "ADK4", "CYTK1", "CRNabcpp", "CRNt8pp", "PPA", "CO2t", "H2Ot", "PIt2rpp"}

In [None]:
# Of the genes in supp file 4, how many was our algorithm able to identify?

choi_targets = tableFourTCA | tableFourLyc | tableFourGlyc | tableFourOther
print(f"Number of reactions we identified: {len(targets)}")
print(f"Number of reactions Choi et al. identified: {len(choi_targets)}")
print(
    f"Percentage of reactions Choi et al. identified that we also identified: "
    f"{len(targets & choi_targets) / len(choi_targets):0.2%}"
)
print(
    f"Percentage of reactions we identified that Choi et al. also identified: "
    f"{len(targets & choi_targets) / len(targets):0.2%}"
)