# Heterologous pathway implementation

Yeast cells naturally produce steroids. The main steroid in yeast is ergosterol, which is produced from the precursor squalene through a long biosynthetic pathway (Figure 2.1). Zymosterol and 5-dehydroepisterol are intermediates of this pathway, and in theory progesterone could be produced from either of them by implementing new pathways. However, the biosynthesis of progesterone from 5-dehydroepisterol relies on an enzymatic reaction that, to our knowledge, has not been validated. Therefore, the production of progesterone from zymosterol is of most interest.

To produce progesterone in yeast, a new pathway needs to be established. This pathway relies on four heterologous enzymes of mammalian origin and two native yeast enzymes. Zymosterol is converted into lathosterol by the yeast enzyme ERG2 and the heterologous enzyme DHCR24, which can act in either order and thereby form a diamond-shaped path. The yeast enzyme ERG3 converts lathosterol into 7-dehydrocholesterol, and three mammalian enzymes subsequently convert 7-dehydrocholesterol into progesterone: 7-dehydrocholesterol is converted to cholesterol by DHCR7, cholesterol is converted to pregnenolone by CYP11A1, and finally HSD3B converts pregnenolone into progesterone.

The co-enzymes NADP(H) and NAD(H) are used multiple times, either directly or indirectly, by enzymes in the progesterone pathway. For simplicity, we treated all co-enzymes as direct substrates of the enzymes. Furthermore, the conversion of cholesterol into pregnenolone by the mammalian CYP11A1 enzyme, which proceeds through three separate reactions, was lumped into one overall reaction.

![figures/pathway_med_strukturer_v3.png](figures/pathway_med_strukturer_v3.png)
**Figure 2.1.** Steroid biosynthesis. The natural ergosterol pathway is shown in a green box and the implemented progesterone pathway is shown in a blue box. Enzymes are represented by their gene names: endogenous yeast genes are shown in black and heterologous genes in red. Arrows indicate the direction of each reaction. Co-enzymes and co-substrates are shown in light grey.

The new reactions and metabolites needed to implement the progesterone pathway in the iMM904 model were added to the reactions.csv and metabolites.csv files in the data folder. Subsequently, the following code was used to load the model, add the reactions and metabolites from the csv files, and finally save a new model:

In [125]:
# Load libraries
import numpy as np
from cobra.io import read_sbml_model, write_sbml_model
from cobra.util import create_stoichiometric_matrix
from cobra import Reaction, Metabolite
from cobra.core.gene import GPR

In [126]:
# Loading model
infilename = 'models/iMM904.xml'
print(f"Loading {infilename}")
model = read_sbml_model(infilename)

Loading models/iMM904.xml


In [127]:
# Model statistics
before_add = {"metabolites": len(model.metabolites), "reactions": len(model.reactions), "genes": len(model.genes)}
print("Metabolites:", before_add['metabolites']) #1226
print("Reactions:", before_add['reactions']) #1577
print("Genes:", before_add['genes']) #905

Metabolites: 1226
Reactions: 1577
Genes: 905


In [128]:
# Add all new metabolites from metabolites.csv
new_metabolites = dict()
with open("data/metabolites.csv","r") as infile:
    infile.readline()
    for line in infile:
        line = line.rstrip().split(",")
        m = Metabolite(
            line[0],
            formula=line[1],
            name=line[2],
            compartment=line[3])
        print(f'Adding metabolite {m.name} (id: {m})')
        new_metabolites[line[0]] = m

Adding metabolite cholesterol (id: cholesterol_c)
Adding metabolite 7-dehydrocholesterol (id: dehydrocholesterol_c)
Adding metabolite pregnenolone (id: pregnenolone_c)
Adding metabolite progesterone (id: progesterone_c)
Adding metabolite 5alpha-Cholesta-7_24-dien-3beta-ol (id: cholesta724dien3betaol_c)
Adding metabolite 5alpha-Cholest-8-en-3beta-ol (id: cholesta8en3betaol_c)
Adding metabolite lathosterol (id: lathosterol_c)
Adding metabolite 4-methylpentanal (id: methylpentanal_c)


In [129]:
# Add all new reactions from reactions.csv
with open("data/reactions.csv","r") as infile:
    infile.readline()
    for line in infile:
        line = line.rstrip().split(",")
        r = Reaction(line[0])
        r.name = line[2]
        if line[3] != "":
            r.subsystem = line[3]
        r.lower_bound = float(line[4])
        r.upper_bound = float(line[5])
        if line[1] != "":
            r.gpr = GPR.from_string(line[1])
        for i in range(int(len(line[6:])/2)):
            metaboliteID = line[6+i*2]
            if metaboliteID != "":
                if metaboliteID in model.metabolites:
                    metaboliteID = model.metabolites.get_by_id(metaboliteID)
                else:
                    metaboliteID = new_metabolites[metaboliteID]
                coefficient = float(line[7+i*2])  # avoid shadowing the built-in bin()
                r.add_metabolites({metaboliteID:coefficient})
        print(f'Adding reaction {r} | enzyme {r.gpr}')
        model.add_reactions([r])

model.reactions.CHLSTI.gpr = GPR.from_string("EBP")
print(f'Adding gene annotation (EBP) to reaction: {model.reactions.CHLSTI}')

model.add_boundary(model.metabolites.get_by_id("methylpentanal_c"), type="sink")
print(f'Adding reaction {model.reactions.SK_methylpentanal_c.name}: {model.reactions.SK_methylpentanal_c.reaction}')


Adding reaction R07498: h_c + nadph_c + zymst_c <=> cholesta8en3betaol_c + nadp_c | enzyme DHCR24
Adding reaction R05703: cholesta724dien3betaol_c + h_c + nadph_c <=> lathosterol_c + nadp_c | enzyme DHCR24
Adding reaction R01456: dehydrocholesterol_c + h_c + nadph_c --> cholesterol_c + nadp_c | enzyme DHCR7
Adding reaction ECYP11A1: cholesterol_c + 6.0 h_c + 6.0 nadph_c + 3.0 o2_c --> 4.0 h2o_c + methylpentanal_c + 6.0 nadp_c + pregnenolone_c | enzyme CYP11A1
Adding reaction R02216: nad_c + pregnenolone_c <=> h_c + nadh_c + progesterone_c | enzyme HSD3B
Adding reaction R03353: cholesta8en3betaol_c --> lathosterol_c | enzyme YMR202W
Adding reaction R07215: h_c + lathosterol_c + nadph_c + o2_c --> dehydrocholesterol_c + 2.0 h2o_c + nadp_c | enzyme YLR056W
Adding reaction R04804: zymst_c --> cholesta724dien3betaol_c | enzyme YMR202W
Adding gene annotation (EBP) to reaction: CHLSTI: amet_c + o2_c + zymst_c --> ahcys_c + ergtetrol_c + 2.0 h2o_c + h_c
Adding reaction 4-methylpentanal sink: m
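
The reaction-loading loop above splits each line on commas by hand, which works as long as no field contains a comma. As a minimal sketch of an alternative, the row layout used by the loop (id, GPR, name, subsystem, lower bound, upper bound, then metabolite/coefficient pairs) can be parsed with the standard `csv` module. The helper `parse_reaction_row`, the header names, and the example row are hypothetical; the real `data/reactions.csv` is not shown here.

```python
import csv
import io

def parse_reaction_row(row):
    """Split one reactions.csv-style row into metadata and stoichiometry."""
    rxn_id, gpr, name, subsystem, lower, upper = row[:6]
    stoichiometry = {}
    # Remaining columns are (metabolite id, coefficient) pairs; skip empty padding.
    for met_id, coeff in zip(row[6::2], row[7::2]):
        if met_id:
            stoichiometry[met_id] = float(coeff)
    return {
        "id": rxn_id,
        "gpr": gpr,
        "name": name,
        "subsystem": subsystem,
        "bounds": (float(lower), float(upper)),
        "stoichiometry": stoichiometry,
    }

# Hypothetical file content (assumption): header names and the reaction name are
# made up; the stoichiometry follows the R01456 line printed above.
example_csv = io.StringIO(
    "id,gpr,name,subsystem,lb,ub,met1,coef1,met2,coef2,met3,coef3,met4,coef4,met5,coef5\n"
    'R01456,DHCR7,"7-dehydrocholesterol reductase, example",,0,1000,'
    "dehydrocholesterol_c,-1,h_c,-1,nadph_c,-1,cholesterol_c,1,nadp_c,1\n"
)
reader = csv.reader(example_csv)
next(reader)  # skip the header row
parsed = [parse_reaction_row(row) for row in reader]
print(parsed[0]["stoichiometry"])
```

Because `csv.reader` honours quoted fields, a reaction name containing a comma stays in one column. The resulting dictionary could then be passed to `Reaction.add_metabolites` after looking up the metabolite objects, as the loop above does.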

In [130]:
# New model statistics
print("Metabolites:", len(model.metabolites), f"(change: {len(model.metabolites) - before_add['metabolites']})") 
print("Reactions:", len(model.reactions), f"(change: {len(model.reactions) - before_add['reactions']})") 
print("Genes:",len(model.genes), f"(change: {len(model.genes) - before_add['genes']})") 

Metabolites: 1234 (change: 8)
Reactions: 1586 (change: 9)
Genes: 910 (change: 5)


In [131]:
# Saving new model
outfilename = "models/iMM904_progesterone.xml"
model.id = outfilename.split("/")[-1].split(".")[0]
print(f"Saving to {outfilename}")
write_sbml_model(model, outfilename)

Saving to models/iMM904_progesterone.xml


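As seen from the code output and the statistics of the new model, we successfully added eight new metabolites, nine new reactions (eight pathway reactions plus the 4-methylpentanal sink), and five new genes (the four heterologous pathway genes DHCR24, DHCR7, CYP11A1, and HSD3B, plus the EBP annotation on CHLSTI) to the model.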
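
One way to sanity-check newly added reactions is to confirm that they are elementally balanced. The sketch below does this in plain Python for the R01456 reaction (DHCR7). The chemical formulas are assumptions made for illustration (the usual formulas for the sterols and the protonation states commonly used for NADPH/NADP in BiGG-style models) and should be compared with the values in metabolites.csv and the model itself. The helper names are hypothetical; cobrapy offers `Reaction.check_mass_balance()` for the same purpose on a loaded model.

```python
import re
from collections import Counter

def parse_formula(formula):
    """Return element counts for a simple formula such as 'C27H46O'."""
    counts = Counter()
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] += int(number) if number else 1
    return counts

def element_imbalance(stoichiometry, formulas):
    """Sum coefficient * element count over all metabolites; {} means balanced."""
    total = Counter()
    for met_id, coeff in stoichiometry.items():
        for element, n in parse_formula(formulas[met_id]).items():
            total[element] += coeff * n
    return {element: n for element, n in total.items() if n != 0}

# Assumed formulas (not taken from the source files).
formulas = {
    "dehydrocholesterol_c": "C27H44O",
    "cholesterol_c": "C27H46O",
    "h_c": "H",
    "nadph_c": "C21H26N7O17P3",
    "nadp_c": "C21H25N7O17P3",
}

# R01456 as printed above: dehydrocholesterol_c + h_c + nadph_c --> cholesterol_c + nadp_c
r01456 = {
    "dehydrocholesterol_c": -1,
    "h_c": -1,
    "nadph_c": -1,
    "cholesterol_c": 1,
    "nadp_c": 1,
}
print(element_imbalance(r01456, formulas))

# Dropping the proton leaves one hydrogen unaccounted for.
without_proton = {k: v for k, v in r01456.items() if k != "h_c"}
print(element_imbalance(without_proton, formulas))
```

An empty dictionary indicates that every element cancels out. Charge balance is not covered by this sketch.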

## Alternative pathways


The "pathway_prediction" algorithm from cameo was used to find four different heterologous pathways producing progesterone in the iMM904 model.

In [132]:
# Generating progesterone pathways with cameo
from cameo import models
from cameo.strain_design import pathway_prediction

bigg_model = models.bigg.iMM904

predictor = pathway_prediction.PathwayPredictor(bigg_model)

pathways = predictor.run(product="progesterone", max_predictions=4)

Unnamed: 0,equation,lower_bound,upper_bound
MNXR82275,H(+) + cholesterol + FAD <=> FADH2 + cholesta-...,-1000.0,1000.0
MNXR1551,"H(+) + NADPH + cholesta-5,7,24-trien-3beta-ol ...",-1000.0,1000.0
MNXR82170,"H(+) + FAD + 5alpha-cholesta-7,24-dien-3beta-o...",-1000.0,1000.0
MNXR1851,"5alpha-cholesta-8,24-dien-3beta-ol <=> 5alpha-...",-1000.0,1000.0
MNXR3037,H(+) + NADH + progesterone <=> pregnenolone + ...,-1000.0,1000.0
MNXR4011,H(+) + cholesterol + 2.0 O2 + NADPH <=> 2.0 H2...,-1000.0,1000.0


Unnamed: 0,equation,lower_bound,upper_bound
MNXR1552,cholesterol + NADP(+) <=> H(+) + NADPH + chole...,-1000.0,1000.0
MNXR82170,"H(+) + FAD + 5alpha-cholesta-7,24-dien-3beta-o...",-1000.0,1000.0
MNXR1851,"5alpha-cholesta-8,24-dien-3beta-ol <=> 5alpha-...",-1000.0,1000.0
MNXR3037,H(+) + NADH + progesterone <=> pregnenolone + ...,-1000.0,1000.0
MNXR11345,"H(+) + NADPH + cholesta-5,7,24-trien-3beta-ol ...",-1000.0,1000.0
MNXR4011,H(+) + cholesterol + 2.0 O2 + NADPH <=> 2.0 H2...,-1000.0,1000.0


Unnamed: 0,equation,lower_bound,upper_bound
MNXR5471,cholesterol + NADP(+) <=> H(+) + NADPH + chole...,-1000.0,1000.0
MNXR82170,"H(+) + FAD + 5alpha-cholesta-7,24-dien-3beta-o...",-1000.0,1000.0
MNXR82403,"H(+) + FAD + cholesta-5,24-dien-3beta-ol <=> F...",-1000.0,1000.0
MNXR1851,"5alpha-cholesta-8,24-dien-3beta-ol <=> 5alpha-...",-1000.0,1000.0
MNXR3037,H(+) + NADH + progesterone <=> pregnenolone + ...,-1000.0,1000.0
MNXR4011,H(+) + cholesterol + 2.0 O2 + NADPH <=> 2.0 H2...,-1000.0,1000.0


Unnamed: 0,equation,lower_bound,upper_bound
MNXR1552,cholesterol + NADP(+) <=> H(+) + NADPH + chole...,-1000.0,1000.0
MNXR3338,"H(+) + O2 + NADPH + 5alpha-cholesta-7,24-dien-...",-1000.0,1000.0
MNXR1851,"5alpha-cholesta-8,24-dien-3beta-ol <=> 5alpha-...",-1000.0,1000.0
MNXR3037,H(+) + NADH + progesterone <=> pregnenolone + ...,-1000.0,1000.0
MNXR11345,"H(+) + NADPH + cholesta-5,7,24-trien-3beta-ol ...",-1000.0,1000.0
MNXR4011,H(+) + cholesterol + 2.0 O2 + NADPH <=> 2.0 H2...,-1000.0,1000.0


The reactions from the above pathways were extracted, and those that differ from the already implemented reactions were visualized (Figure 2.2). Reactions containing the FAD(H2) co-factor are not included, since FADH2 does not exist in the cytoplasm in model iMM904.
Interestingly, cameo found another reaction (MNXR4011) between cholesterol and pregnenolone that requires only one NADPH, compared with the six NADPH consumed by the implemented ECYP11A1 reaction. Therefore, a new model (iMM904_progesterone_pathway1) was constructed in which this reaction replaces the existing CYP11A1 reaction.
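
The filtering step described above can be sketched in a few lines. The dictionary below holds the distinct reactions from the four cameo predictions, with the equation strings copied exactly as they were printed (including the truncation), and drops every reaction whose visible equation mentions FAD or FADH2. The variable and function names are hypothetical. Because the printed equations are truncated, a real filter would need to inspect the full reaction objects rather than these strings.

```python
# Distinct reactions from the four predicted pathways, as displayed above.
predicted = {
    "MNXR82275": "H(+) + cholesterol + FAD <=> FADH2 + cholesta-...",
    "MNXR1551": "H(+) + NADPH + cholesta-5,7,24-trien-3beta-ol ...",
    "MNXR82170": "H(+) + FAD + 5alpha-cholesta-7,24-dien-3beta-o...",
    "MNXR1851": "5alpha-cholesta-8,24-dien-3beta-ol <=> 5alpha-...",
    "MNXR3037": "H(+) + NADH + progesterone <=> pregnenolone + ...",
    "MNXR4011": "H(+) + cholesterol + 2.0 O2 + NADPH <=> 2.0 H2...",
    "MNXR1552": "cholesterol + NADP(+) <=> H(+) + NADPH + chole...",
    "MNXR11345": "H(+) + NADPH + cholesta-5,7,24-trien-3beta-ol ...",
    "MNXR5471": "cholesterol + NADP(+) <=> H(+) + NADPH + chole...",
    "MNXR82403": "H(+) + FAD + cholesta-5,24-dien-3beta-ol <=> F...",
    "MNXR3338": "H(+) + O2 + NADPH + 5alpha-cholesta-7,24-dien-...",
}

def uses_fad(equation):
    """True if FAD or FADH2 appears as a separate term in the equation."""
    return bool({"FAD", "FADH2"} & set(equation.split()))

kept = [rxn_id for rxn_id, eq in predicted.items() if not uses_fad(eq)]
dropped = [rxn_id for rxn_id, eq in predicted.items() if uses_fad(eq)]
print("kept:", kept)
print("dropped:", dropped)
```

Matching on whole whitespace-separated terms rather than on substrings avoids accidental hits inside longer metabolite names.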


![figures/heterologous_new.png](figures/heterologous_new.png)
**Figure 2.2.** Heterologous pathways. Enzymes are represented by their gene names: implemented genes are shown in black, and genes found using the cameo PathwayPredictor algorithm that differ from the implemented genes are shown in red. Arrows indicate the direction of each reaction. Co-enzymes and co-substrates are shown in light grey. Large blue, green, and yellow arrows indicate the routes of pathway 1, 2, and 3, respectively.

In [133]:
# Knock out of CYP11A1 gene
KOs = ["CYP11A1"]
model_pathway1 = model.copy()
model_pathway1.id = "iMM904_progesterone_pathway1"
print(f"Model {model_pathway1.id} was made as a copy of {model.id}")
print(f"For model {model_pathway1.id}:")
for KO in KOs:
    model_pathway1.genes.get_by_id(KO).knock_out()
    print(f"{KO} was knocked out")

Model iMM904_progesterone_pathway1 was made as a copy of iMM904_progesterone
For model iMM904_progesterone_pathway1:
CYP11A1 was knocked out


In [134]:
# Add all new metabolites from metabolites_new.csv
new_metabolites = dict()
with open("data/metabolites_new.csv","r") as infile:
    infile.readline()
    for line in infile:
        line = line.rstrip().split(",")
        m = Metabolite(
            line[0],
            formula=line[1],
            name=line[2],
            compartment=line[3])
        print(f'Adding metabolite {m.name} (id: {m})')
        new_metabolites[line[0]] = m

Adding metabolite 7-dehydrodesmosterol (id: dehydrodesmosterol_c)
Adding metabolite desmosterol (id: desmosterol_c)


In [135]:
# Add the MNXR4011 reaction
print(f"For model {model_pathway1.id}:")
import_reactions = ("MNXR4011",)  # trailing comma makes this a tuple, so `in` tests whole ids rather than substrings
with open("data/reactions_new.csv","r") as infile:
    infile.readline()
    for line in infile:
        line = line.rstrip().split(",")
        r = Reaction(line[0])
        if line[0] in import_reactions:
            r.name = line[2]
            if line[3] != "":
                r.subsystem = line[3]
            r.lower_bound = float(line[4])
            r.upper_bound = float(line[5])
            if line[1] != "":
                r.gpr = GPR.from_string(line[1])
            for i in range(int(len(line[6:])/2)):
                metaboliteID = line[6+i*2]
                if metaboliteID != "":
                    if metaboliteID in model_pathway1.metabolites:
                        metaboliteID = model_pathway1.metabolites.get_by_id(metaboliteID)
                    else:
                        metaboliteID = new_metabolites[metaboliteID]
                    coefficient = float(line[7+i*2])  # avoid shadowing the built-in bin()
                    r.add_metabolites({metaboliteID:coefficient})
            print(f'Adding reaction {r} | enzyme {r.gpr}')
            model_pathway1.add_reactions([r])

For model iMM904_progesterone_pathway1:
Adding reaction MNXR4011: cholesterol_c + h_c + nadph_c + o2_c <=> 2.0 h2o_c + methylpentanal_c + nadp_c + pregnenolone_c | enzyme MNXR4011


All cameo pathways agree with the implemented pathway in that zymosterol is converted into cholesterol in four steps, after which cholesterol is converted into progesterone in two steps. However, they take different routes towards cholesterol, and together the routes form a grid-like pathway structure (Figure 2.2). To test which route through the grid is optimal, new models were constructed with pathway 1, 2, 3, or all combined. Pathway 2 goes from zymosterol via 7-dehydrodesmosterol and 7-dehydrocholesterol to cholesterol, and pathway 3 goes from zymosterol via desmosterol to cholesterol (Figure 2.2). All new models contain the MNXR4011 reaction instead of CYP11A1.
To avoid futile cycling, all reactions pointing downwards in Figure 2.2 are unidirectional.
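
The routes through the grid can be traced with a small graph search, which makes it easy to confirm that each model variant still connects zymosterol to cholesterol in four steps. The sketch below is not part of the original workflow. It lists only the sterol substrate and product of each reaction, read left to right from the "Adding reaction" output in this section, and ignores co-factors and reversibility. The names `EDGES` and `find_route`, and the lists of active reactions per model, are illustrative assumptions derived from the knock-outs and additions described in the text.

```python
from collections import deque

# reaction id -> (substrate sterol, product sterol); co-factors omitted.
EDGES = {
    "R07498": ("zymst_c", "cholesta8en3betaol_c"),                 # DHCR24
    "R04804": ("zymst_c", "cholesta724dien3betaol_c"),             # YMR202W
    "R03353": ("cholesta8en3betaol_c", "lathosterol_c"),           # YMR202W
    "R05703": ("cholesta724dien3betaol_c", "lathosterol_c"),       # DHCR24
    "R07215": ("lathosterol_c", "dehydrocholesterol_c"),           # YLR056W
    "R01456": ("dehydrocholesterol_c", "cholesterol_c"),           # DHCR7
    "MNXR3338": ("cholesta724dien3betaol_c", "dehydrodesmosterol_c"),
    "MNXR11345": ("dehydrodesmosterol_c", "dehydrocholesterol_c"),
    "MNXR1551": ("dehydrodesmosterol_c", "desmosterol_c"),
    "MNXR5471": ("desmosterol_c", "cholesterol_c"),
}

def find_route(active, start="zymst_c", goal="cholesterol_c"):
    """Breadth-first search; returns the reaction ids of one shortest route, or None."""
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        sterol, route = queue.popleft()
        if sterol == goal:
            return route
        for rxn in active:
            substrate, product = EDGES[rxn]
            if substrate == sterol and product not in seen:
                seen.add(product)
                queue.append((product, route + [rxn]))
    return None

# Reactions left after the knock-outs, plus the reactions added per model.
pathway2 = ["R04804", "R03353", "R01456", "MNXR3338", "MNXR11345"]
pathway3 = ["R04804", "R03353", "MNXR3338", "MNXR1551", "MNXR5471"]

print("pathway 2:", find_route(pathway2))
print("pathway 3:", find_route(pathway3))
```

A search like this only checks connectivity. Whether a route can carry flux still depends on co-factor availability and reaction bounds, which is what the flux balance calculations address.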


In [136]:
# Knock out genes in pathway 2 and 3 models
model_pathway2 = model_pathway1.copy()
model_pathway3 = model_pathway1.copy()
model_pathway_combine = model_pathway1.copy()
model_pathway2.id = "iMM904_progesterone_pathway2"
model_pathway3.id = "iMM904_progesterone_pathway3"
model_pathway_combine.id = "iMM904_progesterone_pathway_combine"
print(f"Model {model_pathway2.id} was made as a copy of {model_pathway1.id}")
print(f"Model {model_pathway3.id} was made as a copy of {model_pathway1.id}")
print(f"Model {model_pathway_combine.id} was made as a copy of {model_pathway1.id}")
models = [model_pathway2,model_pathway3,model_pathway_combine]
KOs_p2 = ["DHCR24","YLR056W"]
KOs_p3 = ["DHCR24","YLR056W","DHCR7"]
print(f"For model {model_pathway2.id}:")
for KO in KOs_p2:
    model_pathway2.genes.get_by_id(KO).knock_out()
    if KO == "YLR056W":
        print("ERG3 was knocked out")  # YLR056W is the systematic name of ERG3
    else:
        print(f"{KO} was knocked out")
print(f"For model {model_pathway3.id}:")
for KO in KOs_p3:
    model_pathway3.genes.get_by_id(KO).knock_out()
    if KO == "YLR056W":
        print("ERG3 was knocked out")  # YLR056W is the systematic name of ERG3
    else:
        print(f"{KO} was knocked out")

Model iMM904_progesterone_pathway2 was made as a copy of iMM904_progesterone_pathway1
Model iMM904_progesterone_pathway3 was made as a copy of iMM904_progesterone_pathway1
Model iMM904_progesterone_pathway_combine was made as a copy of iMM904_progesterone_pathway1
For model iMM904_progesterone_pathway2:
DHCR24 was knocked out
ERG3 was knocked out
For model iMM904_progesterone_pathway3:
DHCR24 was knocked out
ERG3 was knocked out
DHCR7 was knocked out


In [137]:
# Add reactions for pathway 2, 3, and combined
models = {model_pathway2:['MNXR3338','MNXR11345'],model_pathway3:['MNXR3338','MNXR1551','MNXR5471'],model_pathway_combine:['MNXR3338','MNXR1551','MNXR5471','MNXR11345']}
for m, import_reactions in models.items():
    print(f"For model {m.id}:")
    with open("data/reactions_new.csv","r") as infile:
        infile.readline()
        for line in infile:
            line = line.rstrip().split(",")
            r = Reaction(line[0])
            if line[0] in import_reactions:
                r.name = line[2]
                if line[3] != "":
                    r.subsystem = line[3]
                r.lower_bound = float(line[4])
                r.upper_bound = float(line[5])
                if line[1] != "":
                    r.gpr = GPR.from_string(line[1])
                for i in range(int(len(line[6:])/2)):
                    metaboliteID = line[6+i*2]
                    if metaboliteID != "":
                        if metaboliteID in m.metabolites:
                            metaboliteID = m.metabolites.get_by_id(metaboliteID)
                        else:
                            metaboliteID = new_metabolites[metaboliteID]
                        coefficient = float(line[7+i*2])  # avoid shadowing the built-in bin()
                        r.add_metabolites({metaboliteID:coefficient})
                print(f'Adding reaction {r} | enzyme {r.gpr}')
                m.add_reactions([r])

For model iMM904_progesterone_pathway2:
Adding reaction MNXR11345: dehydrodesmosterol_c + h_c + nadph_c <=> dehydrocholesterol_c + nadp_c | enzyme MNXR11345
Adding reaction MNXR3338: cholesta724dien3betaol_c + h_c + nadph_c + o2_c --> dehydrodesmosterol_c + 2.0 h2o_c + nadp_c | enzyme MNXR3338
For model iMM904_progesterone_pathway3:
Adding reaction MNXR1551: dehydrodesmosterol_c + h_c + nadph_c --> desmosterol_c + nadp_c | enzyme MNXR1551
Adding reaction MNXR5471: desmosterol_c + h_c + nadph_c <=> cholesterol_c + nadp_c | enzyme MNXR5471
Adding reaction MNXR3338: cholesta724dien3betaol_c + h_c + nadph_c + o2_c --> dehydrodesmosterol_c + 2.0 h2o_c + nadp_c | enzyme MNXR3338
For model iMM904_progesterone_pathway_combine:
Adding reaction MNXR1551: dehydrodesmosterol_c + h_c + nadph_c --> desmosterol_c + nadp_c | enzyme MNXR1551
Adding reaction MNXR11345: dehydrodesmosterol_c + h_c + nadph_c <=> dehydrocholesterol_c + nadp_c | enzyme MNXR11345
Adding reaction MNXR5471: desmosterol_c + h_c 

In [138]:
# Saving new model
models = [model_pathway1,model_pathway2,model_pathway3,model_pathway_combine]
for m in models:
    outfilename = 'models/'+m.id+'.xml'
    print(f"Saving to {outfilename}")
    write_sbml_model(m, outfilename)

Saving to models/iMM904_progesterone_pathway1.xml
Saving to models/iMM904_progesterone_pathway2.xml
Saving to models/iMM904_progesterone_pathway3.xml
Saving to models/iMM904_progesterone_pathway_combine.xml


In [139]:
# Add demand reaction for progesterone
models = [model,model_pathway1,model_pathway2,model_pathway3,model_pathway_combine]
for m in models:
    m.add_boundary(m.metabolites.get_by_id('progesterone_c'), type='demand')
    print(f'Model {m.id}: Adding reaction {m.reactions.DM_progesterone_c.name}: {m.reactions.DM_progesterone_c.reaction}')

Model iMM904_progesterone: Adding reaction progesterone demand: progesterone_c --> 
Model iMM904_progesterone_pathway1: Adding reaction progesterone demand: progesterone_c --> 
Model iMM904_progesterone_pathway2: Adding reaction progesterone demand: progesterone_c --> 
Model iMM904_progesterone_pathway3: Adding reaction progesterone demand: progesterone_c --> 
Model iMM904_progesterone_pathway_combine: Adding reaction progesterone demand: progesterone_c --> 


In [140]:
# Calculate µ
biomass = []
for i, m in enumerate(models):
    biomass.append(m.optimize().objective_value)
    print(f'{m.id}: µ = {biomass[i]}')

iMM904_progesterone: µ = 0.28786570370401837
iMM904_progesterone_pathway1: µ = 0.2878657037040172
iMM904_progesterone_pathway2: µ = 0.28786570370401404
iMM904_progesterone_pathway3: µ = 0.287865703704016
iMM904_progesterone_pathway_combine: µ = 0.28786570370401654


In [141]:
# Calculate flux towards progesterone
pp = []
for i, m in enumerate(models):
    with m:
        objective_reaction = m.reactions.R02216
        m.objective = objective_reaction
        pp.append(m.optimize().objective_value)
    print(f'{m.id}: progesterone flux = {pp[i]}')

iMM904_progesterone: progesterone flux = 0.14285714285714285
iMM904_progesterone_pathway1: progesterone flux = 0.16666666666666666
iMM904_progesterone_pathway2: progesterone flux = 0.16666666666666666
iMM904_progesterone_pathway3: progesterone flux = 0.16666666666666666
iMM904_progesterone_pathway_combine: progesterone flux = 0.16666666666666666


In [142]:
# Calculate percentage increase in flux towards progesterone
print(f"Percentage increase in flux towards progesterone: {round((pp[1]/pp[0]-1)*100,2)}%")

Percentage increase in flux towards progesterone: 16.67%


The results show that, regardless of which route is taken through the “grid” from zymosterol to cholesterol, neither the growth rate nor the flux towards progesterone changes. However, swapping the CYP11A1 reaction for MNXR4011 increases the flux towards progesterone by 16.67%.
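
The size of the increase can be reproduced by hand from the two reported flux values, which correspond to the simple fractions 1/7 and 1/6. The sketch below recovers those fractions from the printed floats and recomputes the relative change. The variable names are illustrative, and the denominator limit of 100 is an arbitrary choice.

```python
from fractions import Fraction

flux_cyp11a1 = 0.14285714285714285   # iMM904_progesterone
flux_mnxr4011 = 0.16666666666666666  # iMM904_progesterone_pathway1

# Recover simple rational values from the floating-point results.
f_old = Fraction(flux_cyp11a1).limit_denominator(100)
f_new = Fraction(flux_mnxr4011).limit_denominator(100)

relative_increase = f_new / f_old - 1
print(f_old, f_new, relative_increase)
print(f"{float(relative_increase) * 100:.2f}%")
```

In fractions, the flux rises from 1/7 to 1/6, so the ratio is 7/6 and the relative increase is 1/6, or about 16.67%.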