# Heterologous pathway implementation

Yeast cells naturally produce steroids. The main steroid in yeast is ergosterol, which is produced from the precursor squalene through a long biosynthetic pathway (**Figure 2.1**). Zymosterol and 5-dehydroepisterol are intermediates of this pathway, and in theory either could serve as a starting point for progesterone production through the implementation of new pathways. However, the biosynthesis of progesterone from 5-dehydroepisterol relies on an enzymatic reaction that, to our knowledge, has not been validated. The production of progesterone from zymosterol is therefore of most interest.

To produce progesterone in yeast, a new pathway needs to be established. This pathway relies on four heterologous enzymes of mammalian origin and two native yeast enzymes. Zymosterol is converted into lathosterol by the yeast enzyme ERG2 and the mammalian enzyme DHCR24, which can act in either order and thereby form a diamond-shaped path. The yeast enzyme ERG3 converts lathosterol into 7-dehydrocholesterol, and three mammalian enzymes subsequently convert 7-dehydrocholesterol into progesterone: DHCR7 converts 7-dehydrocholesterol to cholesterol, CYP11A1 converts cholesterol to pregnenolone, and HSD3B converts pregnenolone into progesterone.

The co-enzymes NADP(H) and NAD(H) are used multiple times, either directly or indirectly, by enzymes in the progesterone pathway. For simplicity, we have treated all co-enzymes as direct substrates of the enzymes. Furthermore, the mammalian CYP11A1 enzyme, which converts cholesterol into pregnenolone in three independent reactions, was simplified to one comprehensive reaction.

![figures/pathway_med_strukturer_v3.png](figures/pathway_med_strukturer_v3.png)
**Figure 2.1.** Steroid biosynthesis. The natural ergosterol pathway is shown in a green box and the implemented progesterone pathway in a blue box. Enzymes are represented by their gene names: endogenous yeast genes in black and heterologous genes in red. Arrows indicate the direction of reaction. Co-enzymes and co-substrates are shown in light grey.
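The diamond-shaped part of the pathway can be made explicit by listing every route from zymosterol to progesterone. The block below is a minimal sketch, not part of the original notebook: it uses the reaction and metabolite IDs under which the pathway is added to the model later in this section, tracks only the sterol backbone, and treats the reversible steps as running forwards. The names `EDGES` and `enumerate_routes` are hypothetical.

```python
# Backbone of the manually implemented pathway: reaction id -> (substrate, product).
# Co-enzymes are deliberately left out; reversible steps are taken as forward.
EDGES = {
    "R07498": ("zymst_c", "cholesta8en3betaol_c"),
    "R04804": ("zymst_c", "cholesta724dien3betaol_c"),
    "R03353": ("cholesta8en3betaol_c", "lathosterol_c"),
    "R05703": ("cholesta724dien3betaol_c", "lathosterol_c"),
    "R07215": ("lathosterol_c", "dehydrocholesterol_c"),
    "R01456": ("dehydrocholesterol_c", "cholesterol_c"),
    "ECYP11A1": ("cholesterol_c", "pregnenolone_c"),
    "R02216": ("pregnenolone_c", "progesterone_c"),
}


def enumerate_routes(edges, source, target):
    """Return every acyclic route from source to target as a list of reaction ids."""
    routes = []

    def walk(metabolite, path, visited):
        if metabolite == target:
            routes.append(list(path))
            return
        for rxn_id, (substrate, product) in edges.items():
            if substrate == metabolite and product not in visited:
                path.append(rxn_id)
                walk(product, path, visited | {product})
                path.pop()

    walk(source, [], {source})
    return routes


routes = enumerate_routes(EDGES, "zymst_c", "progesterone_c")
for route in routes:
    print(" -> ".join(route))
```

Both routes contain six reactions and differ only in the order in which the ERG2 and DHCR24 steps are applied.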

The new reactions and metabolites needed to implement the progesterone pathway in the iMM904 model were added to the reactions.csv and metabolites.csv files in the data folder. The following code was then used to load the model, add the reactions and metabolites from the csv files, and save a new model:

In [1]:
# Load libraries
import numpy as np
from cobra.io import read_sbml_model, write_sbml_model
from cobra.util import create_stoichiometric_matrix
from cobra import Reaction, Metabolite
from cobra.core.gene import GPR

In [2]:
# Loading model
infilename = 'models/iMM904.xml'
print(f"Loading {infilename}")
model = read_sbml_model(infilename)

Loading models/iMM904.xml


In [3]:
# Model statistics
before_add = {"metabolites": len(model.metabolites), "reactions": len(model.reactions), "genes": len(model.genes)}
print("Metabolites:", before_add['metabolites']) #1226
print("Reactions:", before_add['reactions']) #1577
print("Genes:", before_add['genes']) #905

Metabolites: 1226
Reactions: 1577
Genes: 905


In [4]:
# Add all new metabolites from metabolites.csv
new_metabolites = dict()
with open("data/metabolites.csv","r") as infile:
    infile.readline()
    for line in infile:
        line = line.rstrip().split(",")
        m = Metabolite(
            line[0],
            formula=line[1],
            name=line[2],
            compartment=line[3])
        print(f'Adding metabolite {m.name} (id: {m})')
        new_metabolites[line[0]] = m

Adding metabolite cholesterol (id: cholesterol_c)
Adding metabolite 7-dehydrocholesterol (id: dehydrocholesterol_c)
Adding metabolite pregnenolone (id: pregnenolone_c)
Adding metabolite progesterone (id: progesterone_c)
Adding metabolite 5alpha-Cholesta-7_24-dien-3beta-ol (id: cholesta724dien3betaol_c)
Adding metabolite 5alpha-Cholest-8-en-3beta-ol (id: cholesta8en3betaol_c)
Adding metabolite lathosterol (id: lathosterol_c)
Adding metabolite 4-methylpentanal (id: methylpentanal_c)
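Splitting each line on `","` works only as long as no field contains a comma, which may be why the name above is written with `7_24` rather than `7,24`. As a hedged alternative, the sketch below parses the same column order (id, formula, name, compartment) with the standard `csv` module, which honours quoted fields. The header names and the two sample rows are assumptions made for illustration; the real `data/metabolites.csv` may differ.

```python
import csv
import io

# Hypothetical sample in the column order used by the cell above.
# Header names and rows are assumptions, not the real data file.
SAMPLE = '''id,formula,name,compartment
progesterone_c,C21H30O2,progesterone,c
cholesta724dien3betaol_c,C27H44O,"5alpha-Cholesta-7,24-dien-3beta-ol",c
'''


def read_metabolite_rows(handle):
    """Yield one dict per metabolite row, keeping commas inside quoted fields."""
    for row in csv.DictReader(handle):
        yield {key: value.strip() for key, value in row.items()}


rows = list(read_metabolite_rows(io.StringIO(SAMPLE)))
for row in rows:
    print(row["id"], "|", row["name"])
```

Each resulting dict could then be passed to `Metabolite(row["id"], formula=row["formula"], name=row["name"], compartment=row["compartment"])` in the same way as in the cell above.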


In [5]:
# Add new reactions from reactions.csv
def import_reactions(reaction_dict,metabolite_dict,infilename):
    for m, reactions in reaction_dict.items():
        print(f"For model {m.id}:")
        with open(infilename,"r") as infile:
            infile.readline()
            for line in infile:
                line = line.rstrip().split(",")
                if line[0] in reactions:
                    r = Reaction(line[0])
                    r.name = line[2]
                    if line[3] != "":
                        r.subsystem = line[3]
                    r.lower_bound = float(line[4])
                    r.upper_bound = float(line[5])
                    if line[1] != "":
                        r.gpr = GPR.from_string(line[1])
                    # Metabolite ids and coefficients alternate from column 6 onwards
                    for i in range(len(line[6:]) // 2):
                        metaboliteID = line[6+i*2]
                        if metaboliteID != "":
                            if metaboliteID in m.metabolites:
                                metabolite = m.metabolites.get_by_id(metaboliteID)
                            else:
                                metabolite = metabolite_dict[metaboliteID]
                            coefficient = float(line[7+i*2])
                            r.add_metabolites({metabolite: coefficient})
                    print(f'Adding reaction {r} | enzyme {r.gpr}')
                    m.add_reactions([r])
        m.reactions.CHLSTI.gpr = GPR.from_string("EBP")
        print(f'Adding gene annotation (EBP) to reaction: {m.reactions.CHLSTI}')
        m.add_boundary(m.metabolites.get_by_id("methylpentanal_c"), type="sink")
        print(f'Adding reaction {m.reactions.SK_methylpentanal_c.name}: {m.reactions.SK_methylpentanal_c.reaction}')


reaction_dict = {model:("R07498","R05703","R01456","ECYP11A1","R02216","R03353","R07215","R04804")}
import_reactions(reaction_dict,new_metabolites,"data/reactions.csv")


For model iMM904:
Adding reaction R07498: h_c + nadph_c + zymst_c <=> cholesta8en3betaol_c + nadp_c | enzyme DHCR24
Adding reaction R05703: cholesta724dien3betaol_c + h_c + nadph_c <=> lathosterol_c + nadp_c | enzyme DHCR24
Adding reaction R01456: dehydrocholesterol_c + h_c + nadph_c --> cholesterol_c + nadp_c | enzyme DHCR7
Adding reaction ECYP11A1: cholesterol_c + 6.0 h_c + 6.0 nadph_c + 3.0 o2_c --> 4.0 h2o_c + methylpentanal_c + 6.0 nadp_c + pregnenolone_c | enzyme CYP11A1
Adding reaction R02216: nad_c + pregnenolone_c <=> h_c + nadh_c + progesterone_c | enzyme HSD3B
Adding reaction R03353: cholesta8en3betaol_c --> lathosterol_c | enzyme YMR202W
Adding reaction R07215: h_c + lathosterol_c + nadph_c + o2_c --> dehydrocholesterol_c + 2.0 h2o_c + nadp_c | enzyme YLR056W
Adding reaction R04804: zymst_c --> cholesta724dien3betaol_c | enzyme YMR202W
Adding gene annotation (EBP) to reaction: CHLSTI: amet_c + o2_c + zymst_c --> ahcys_c + ergtetrol_c + 2.0 h2o_c + h_c
Adding reaction 4-meth
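The row layout that `import_reactions` expects can be read off its indexing: reaction id, GPR string, name, subsystem, lower bound, upper bound, followed by alternating metabolite ids and coefficients, with empty ids acting as padding. The sketch below shows that layout on its own, without cobra. The function name `parse_reaction_row`, the sample row, its reaction name, and its bounds are hypothetical; only the stoichiometry is taken from the R01456 output above.

```python
def parse_reaction_row(fields):
    """Split one reactions.csv row into its fixed columns and stoichiometry."""
    rxn_id, gpr, name, subsystem, lower, upper = fields[:6]
    pairs = fields[6:]
    stoichiometry = {}
    # Metabolite ids sit at even offsets, coefficients at odd offsets.
    for met_id, coefficient in zip(pairs[0::2], pairs[1::2]):
        if met_id != "":  # empty ids are padding and are skipped
            stoichiometry[met_id] = float(coefficient)
    return {
        "id": rxn_id,
        "gpr": gpr,
        "name": name,
        "subsystem": subsystem,
        "bounds": (float(lower), float(upper)),
        "stoichiometry": stoichiometry,
    }


# Hypothetical row written to match the printed R01456 reaction.
row = (
    "R01456,DHCR7,7-dehydrocholesterol reductase,,0,1000,"
    "dehydrocholesterol_c,-1,h_c,-1,nadph_c,-1,"
    "cholesterol_c,1,nadp_c,1,,"
).split(",")
parsed = parse_reaction_row(row)
print(parsed["bounds"], parsed["stoichiometry"])
```

Negative coefficients mark substrates and positive coefficients mark products, which is the convention `Reaction.add_metabolites` uses.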

In [6]:
# New model statistics
print("Metabolites:", len(model.metabolites), f"(change: {len(model.metabolites) - before_add['metabolites']})") 
print("Reactions:", len(model.reactions), f"(change: {len(model.reactions) - before_add['reactions']})") 
print("Genes:",len(model.genes), f"(change: {len(model.genes) - before_add['genes']})") 

Metabolites: 1234 (change: 8)
Reactions: 1586 (change: 9)
Genes: 910 (change: 5)


In [7]:
# Saving new model
outfilename = "models/iMM904_progesterone_manual.xml"
model.id = outfilename.split("/")[-1].split(".")[0]
print(f"Saving to {outfilename}")
write_sbml_model(model, outfilename)

Saving to models/iMM904_progesterone_manual.xml


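As seen from the code output and from the statistics of the new model, we successfully added eight new metabolites, nine new reactions (eight pathway reactions plus the 4-methylpentanal sink), and five new genes (the four heterologous genes plus the EBP annotation on CHLSTI) to the model.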
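Counting added entities does not show whether the new reactions are elementally balanced. The following sketch, which is not part of the original notebook, checks the element balance of one reaction (R02216) from plain formula strings. The formulas for pregnenolone and progesterone are standard; those for `nad_c`, `nadh_c`, and `h_c` are written as we recall them from BiGG and should be treated as assumptions to verify against the actual model. The helper names are hypothetical.

```python
import re
from collections import Counter

# Formulas are assumptions to be checked against the model (see text above).
FORMULAS = {
    "pregnenolone_c": "C21H32O2",
    "progesterone_c": "C21H30O2",
    "nad_c": "C21H26N7O14P2",
    "nadh_c": "C21H27N7O14P2",
    "h_c": "H",
}

# R02216 as printed above: nad_c + pregnenolone_c <=> h_c + nadh_c + progesterone_c
R02216 = {"nad_c": -1, "pregnenolone_c": -1, "h_c": 1, "nadh_c": 1, "progesterone_c": 1}


def parse_formula(formula):
    """Turn a simple formula such as 'C21H30O2' into element counts."""
    counts = Counter()
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] += int(number) if number else 1
    return counts


def element_imbalance(stoichiometry, formulas):
    """Return the net element counts that do not cancel; empty means balanced."""
    total = Counter()
    for met_id, coefficient in stoichiometry.items():
        for element, count in parse_formula(formulas[met_id]).items():
            total[element] += coefficient * count
    return {element: value for element, value in total.items() if value != 0}


balanced = element_imbalance(R02216, FORMULAS)
# Dropping the proton on the product side leaves one hydrogen unaccounted for.
without_proton = {k: v for k, v in R02216.items() if k != "h_c"}
unbalanced = element_imbalance(without_proton, FORMULAS)
print(balanced, unbalanced)
```

On a loaded model, cobra's `Reaction.check_mass_balance()` performs a comparable check using the formulas and charges stored with each metabolite.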

## Alternative pathways


The "pathway_prediction" algorithm from cameo was used to find four different heterologous pathways producing progesterone using the iMM904 model.

In [8]:
# Generating progesterone pathways with cameo
# from cameo import models
# from cameo.strain_design import pathway_prediction

# bigg_model = models.bigg.iMM904

# predictor = pathway_prediction.PathwayPredictor(bigg_model)

# pathways = predictor.run(product="progesterone", max_predictions=4)

The reactions from the above pathways were extracted, and those that differ from the already implemented reactions were visualized (**Figure 2.2**). Reactions containing the FAD(H2) co-factor are not included, since FADH2 does not exist in the cytoplasm in model iMM904.


All cameo pathways agree with the implemented pathway in that zymosterol is converted into cholesterol in four steps, and cholesterol is then converted into progesterone in two steps. However, they take different paths towards cholesterol, which together form a grid-like pathway structure (**Figure 2.2**). To test which route through the grid is optimal, new models were constructed with pathway 1, 2, 3, or all combined. Pathway 2 goes from zymosterol via 7-dehydrodesmosterol and 7-dehydrocholesterol to cholesterol, and pathway 3 goes from zymosterol via desmosterol to cholesterol (**Figure 2.2**). Interestingly, cameo found another reaction (MNXR4011) between cholesterol and pregnenolone that requires only one NADP(H) instead of the six needed in the manually implemented pathway. This reaction was therefore implemented in place of the existing reaction (CYP11A1).
To avoid futile cycling, all reactions pointing downwards in Figure 2.2 are unidirectional.


![figures/heterologous_new.png](figures/heterologous_new.png)
**Figure 2.2.** Heterologous pathways. Enzymes are represented by their gene names: implemented genes in black, and genes found using the cameo PathwayPredictor algorithm that differ from the implemented genes in red. Arrows indicate the direction of reaction. Co-enzymes and co-substrates are shown in light grey. Large blue, green, and yellow arrows indicate the path of pathway 1, 2, and 3, respectively.
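The difference in NADPH demand between the two cholesterol side-chain cleavage reactions can be tallied per molecule of progesterone. The sketch below is an illustration only: the NADPH coefficients are read off the reaction strings printed in this section for the manual pathway and pathway 1, and the dictionary and function names are hypothetical. Pathways 2 and 3 are left out because the stoichiometry of their additional reactions is not printed here.

```python
# NADPH consumed per reaction turnover, read off the printed reaction strings.
NADPH_COST = {
    "R07498": 1, "R04804": 0, "R03353": 0, "R05703": 1,
    "R07215": 1, "R01456": 1, "R02216": 0,
    "ECYP11A1": 6, "MNXR4011": 1,
}

# The two ways through the diamond from zymosterol to cholesterol.
UPSTREAM_ROUTES = {
    "via cholesta8en3betaol_c": ["R07498", "R03353", "R07215", "R01456"],
    "via cholesta724dien3betaol_c": ["R04804", "R05703", "R07215", "R01456"],
}


def nadph_per_progesterone(upstream, cleavage):
    """Sum the direct NADPH use from zymosterol to progesterone for one route."""
    return sum(NADPH_COST[rxn_id] for rxn_id in upstream + [cleavage, "R02216"])


tally = {
    (label, cleavage): nadph_per_progesterone(route, cleavage)
    for label, route in UPSTREAM_ROUTES.items()
    for cleavage in ("ECYP11A1", "MNXR4011")
}
for (label, cleavage), cost in tally.items():
    print(f"{label} with {cleavage}: {cost} NADPH")
```

Under these coefficients both routes through the diamond need the same amount of NADPH, and the only difference comes from the cleavage step. The tally counts direct NADPH use only and is not a substitute for the flux balance analysis performed on the models.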

In [9]:
# Loading model
infilename = 'models/iMM904.xml'
print(f"Loading {infilename}")
iMM904 = read_sbml_model(infilename)


Loading models/iMM904.xml


In [10]:
# Copying models
model_pathway1 = iMM904.copy()
model_pathway2 = iMM904.copy()
model_pathway3 = iMM904.copy()
model_pathway_combine = iMM904.copy()
model_pathway1.id = "iMM904_progesterone_pathway1"
model_pathway2.id = "iMM904_progesterone_pathway2"
model_pathway3.id = "iMM904_progesterone_pathway3"
model_pathway_combine.id = "iMM904_progesterone_pathway_combine"
print(f"Model {model_pathway1.id} was made as a copy of {iMM904.id}")
print(f"Model {model_pathway2.id} was made as a copy of {iMM904.id}")
print(f"Model {model_pathway3.id} was made as a copy of {iMM904.id}")
print(f"Model {model_pathway_combine.id} was made as a copy of {iMM904.id}")

Model iMM904_progesterone_pathway1 was made as a copy of iMM904
Model iMM904_progesterone_pathway2 was made as a copy of iMM904
Model iMM904_progesterone_pathway3 was made as a copy of iMM904
Model iMM904_progesterone_pathway_combine was made as a copy of iMM904


In [11]:
# Add all new metabolites from metabolites_new.csv
with open("data/metabolites_new.csv","r") as infile:
    infile.readline()
    for line in infile:
        line = line.rstrip().split(",")
        m = Metabolite(
            line[0],
            formula=line[1],
            name=line[2],
            compartment=line[3])
        print(f'Adding metabolite {m.name} (id: {m})')
        new_metabolites[line[0]] = m

Adding metabolite 7-dehydrodesmosterol (id: dehydrodesmosterol_c)
Adding metabolite desmosterol (id: desmosterol_c)


In [12]:
# Add new reactions from reactions.csv to each pathway model
reaction_dict = {model_pathway1:("MNXR4011","R07498","R05703","R01456","R02216","R03353","R07215","R04804"),
                 model_pathway2:("MNXR4011","R01456","R02216","R04804","MNXR3338","MNXR11345"),
                 model_pathway3:("MNXR4011","R02216","R04804","MNXR3338","MNXR1551","MNXR5471"),
                 model_pathway_combine:("MNXR4011","R07498","R05703","R01456","R02216","R03353","R07215","R04804","MNXR3338","MNXR1551","MNXR5471","MNXR11345")
}
import_reactions(reaction_dict,new_metabolites,"data/reactions.csv")

For model iMM904_progesterone_pathway1:
Adding reaction R07498: h_c + nadph_c + zymst_c <=> cholesta8en3betaol_c + nadp_c | enzyme DHCR24
Adding reaction R05703: cholesta724dien3betaol_c + h_c + nadph_c <=> lathosterol_c + nadp_c | enzyme DHCR24
Adding reaction R01456: dehydrocholesterol_c + h_c + nadph_c --> cholesterol_c + nadp_c | enzyme DHCR7
Adding reaction R02216: nad_c + pregnenolone_c <=> h_c + nadh_c + progesterone_c | enzyme HSD3B
Adding reaction R03353: cholesta8en3betaol_c --> lathosterol_c | enzyme YMR202W
Adding reaction R07215: h_c + lathosterol_c + nadph_c + o2_c --> dehydrocholesterol_c + 2.0 h2o_c + nadp_c | enzyme YLR056W
Adding reaction R04804: zymst_c --> cholesta724dien3betaol_c | enzyme YMR202W
Adding reaction MNXR4011: cholesterol_c + h_c + nadph_c + o2_c <=> 2.0 h2o_c + methylpentanal_c + nadp_c + pregnenolone_c | enzyme MNXR4011
Adding gene annotation (EBP) to reaction: CHLSTI: amet_c + o2_c + zymst_c --> ahcys_c + ergtetrol_c + 2.0 h2o_c + h_c
Adding reaction

In [13]:
# Add demand reaction for progesterone
models = [model,model_pathway1,model_pathway2,model_pathway3,model_pathway_combine]
for m in models:
    m.add_boundary(m.metabolites.get_by_id('progesterone_c'), type='demand')
    print(f'Model {m.id}: Adding reaction {m.reactions.DM_progesterone_c.name}: {m.reactions.DM_progesterone_c.reaction}')

Model iMM904_progesterone_manual: Adding reaction progesterone demand: progesterone_c --> 
Model iMM904_progesterone_pathway1: Adding reaction progesterone demand: progesterone_c --> 
Model iMM904_progesterone_pathway2: Adding reaction progesterone demand: progesterone_c --> 
Model iMM904_progesterone_pathway3: Adding reaction progesterone demand: progesterone_c --> 
Model iMM904_progesterone_pathway_combine: Adding reaction progesterone demand: progesterone_c --> 


In [14]:
# Calculate maximum growth
biomass = []
for i, m in enumerate(models):
    biomass.append(m.optimize().objective_value)
    print(f'{m.id} | Maximum growth: {round(biomass[i],3)} 1/h')

iMM904_progesterone_manual | Maximum growth: 0.288 1/h
iMM904_progesterone_pathway1 | Maximum growth: 0.288 1/h
iMM904_progesterone_pathway2 | Maximum growth: 0.288 1/h
iMM904_progesterone_pathway3 | Maximum growth: 0.288 1/h
iMM904_progesterone_pathway_combine | Maximum growth: 0.288 1/h


In [15]:
# Calculate maximum progesterone productivity
pp = []
for i, m in enumerate(models):
    with m:
        objective_reaction = m.reactions.DM_progesterone_c
        m.objective = objective_reaction
        pp.append(m.optimize().objective_value)
    print(f'{m.id} | Maximum progesterone productivity: {round(pp[i],3)} mmol/gDW*h')

iMM904_progesterone_manual | Maximum progesterone productivity: 0.143 mmol/gDW*h
iMM904_progesterone_pathway1 | Maximum progesterone productivity: 0.167 mmol/gDW*h
iMM904_progesterone_pathway2 | Maximum progesterone productivity: 0.167 mmol/gDW*h
iMM904_progesterone_pathway3 | Maximum progesterone productivity: 0.167 mmol/gDW*h
iMM904_progesterone_pathway_combine | Maximum progesterone productivity: 0.167 mmol/gDW*h


In [16]:
# Calculate percentage increase in progesterone productivity
print(f"Percentage increase in progesterone productivity: {round((pp[1]/pp[0]-1)*100,2)}%")

Percentage increase in progesterone productivity: 16.67%
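The percentage above is computed from the unrounded optima, so it cannot be reproduced exactly from the three-decimal values printed in the previous cell. The short sketch below illustrates the difference. The exact values 1/7 and 1/6 are an assumption: they are consistent with the printed 0.143, 0.167, and 16.67%, but the solver output itself is not shown here.

```python
from fractions import Fraction

# Values as printed (rounded to three decimals).
printed_manual, printed_new = 0.143, 0.167
from_printed = round((printed_new / printed_manual - 1) * 100, 2)

# Assumed exact optima that round to the printed values.
exact_manual, exact_new = Fraction(1, 7), Fraction(1, 6)
from_exact = round(float((exact_new / exact_manual - 1) * 100), 2)

print(f"From printed values: {from_printed}%")
print(f"From assumed exact values: {from_exact}%")
```

Recomputing from the rounded values gives about 16.78%, whereas the assumed exact fractions give 16.67%, matching the notebook output.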


The results show that, no matter which route is taken through the “grid” from zymosterol to cholesterol, neither the maximum growth rate nor the flux towards progesterone changes. However, swapping the CYP11A1 reaction for MNXR4011 increases progesterone productivity by 16.67%. We will therefore use one of the new models for further research. Since pathway 1 is closest to the manually implemented pathway, differing only in the substitution of CYP11A1 with MNXR4011, we chose it for further research. The pathway 1 model is saved as “iMM904_progesterone.xml”:

In [17]:
# Saving new model
model_pathway1.id = "iMM904_progesterone"
for m in models:
    outfilename = 'models/'+m.id+'.xml'
    print(f"Saving to {outfilename}")
    write_sbml_model(m, outfilename)

Saving to models/iMM904_progesterone_manual.xml
Saving to models/iMM904_progesterone.xml
Saving to models/iMM904_progesterone_pathway2.xml
Saving to models/iMM904_progesterone_pathway3.xml
Saving to models/iMM904_progesterone_pathway_combine.xml
