Add Frederik's pathways from Gene_Targets_Test.ipynb

# Heterologous pathway implementation

Yeast cells naturally produce steroids. The main steroid in yeast is ergosterol, which is produced from the precursor squalene through a long biosynthetic pathway (Figure 1). Zymosterol and 5-dehydroepisterol are intermediates of this pathway, and in theory progesterone can be produced from either compound by implementing new pathways. However, the biosynthesis of progesterone from 5-dehydroepisterol relies on an enzymatic reaction that, to our knowledge, has not been validated. The production of progesterone from zymosterol is therefore of most interest.

To produce progesterone in yeast, a new pathway needs to be established. This pathway relies on four heterologous enzymes of mammalian origin and two native yeast enzymes. Zymosterol is converted into lathosterol by the yeast enzyme ERG2 and the mammalian enzyme DHCR24, which can act in either order and thereby form a diamond-shaped path. The yeast enzyme ERG3 converts lathosterol into 7-dehydrocholesterol, and three mammalian enzymes subsequently convert 7-dehydrocholesterol into progesterone: 7-dehydrocholesterol is converted to cholesterol by DHCR7, cholesterol is converted to pregnenolone by CYP11A1, and finally HSD3B converts pregnenolone into progesterone.
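
The diamond-shaped path described above can be sketched as a small graph search. This is a minimal illustration, not part of the model-building code: the edges are transcribed from the reactions added to the model later in this notebook, the metabolite identifiers are the ones used there, and the helper name `enumerate_routes` is our own.

```python
# Each edge is (substrate, product, enzyme), transcribed from the reactions
# added to the model later in this notebook (YMR202W = ERG2, YLR056W = ERG3).
EDGES = [
    ("zymst_c", "cholesta8en3betaol_c", "DHCR24"),
    ("zymst_c", "cholesta724dien3betaol_c", "ERG2"),
    ("cholesta8en3betaol_c", "lathosterol_c", "ERG2"),
    ("cholesta724dien3betaol_c", "lathosterol_c", "DHCR24"),
    ("lathosterol_c", "dehydrocholesterol_c", "ERG3"),
    ("dehydrocholesterol_c", "cholesterol_c", "DHCR7"),
    ("cholesterol_c", "pregnenolone_c", "CYP11A1"),
    ("pregnenolone_c", "progesterone_c", "HSD3B"),
]


def enumerate_routes(source, target, edges):
    """Return every enzyme sequence leading from source to target (acyclic graph assumed)."""
    routes = []

    def walk(node, enzymes):
        if node == target:
            routes.append(enzymes)
            return
        for substrate, product, enzyme in edges:
            if substrate == node:
                walk(product, enzymes + [enzyme])

    walk(source, [])
    return routes


routes = enumerate_routes("zymst_c", "progesterone_c", EDGES)
for route in routes:
    print(" -> ".join(route))
```

Both routes use the same six enzymes and differ only in whether ERG2 or DHCR24 acts first on zymosterol.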

The co-enzymes NADP(H) and NAD(H) are used multiple times, either directly or indirectly, by enzymes in the progesterone pathway. For simplicity, we have treated all co-enzymes as direct substrates of the enzymes. Furthermore, the mammalian CYP11A1 enzyme converts cholesterol into pregnenolone in three independent reactions, which were simplified to one comprehensive reaction.
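
Lumping a multi-step conversion into one reaction amounts to summing the stoichiometries of the steps so that the intermediates cancel. The sketch below illustrates the idea only. The per-step stoichiometries and the two intermediate identifiers are assumptions, chosen so that their sum matches the ECYP11A1 reaction added to the model in this notebook. They are not taken from the model files.

```python
# Hypothetical per-step stoichiometries (negative = consumed, positive = produced).
# The intermediate ids "hydroxycholesterol_c" and "dihydroxycholesterol_c" are
# illustrative placeholders, not metabolites in the model.
STEPS = [
    {"cholesterol_c": -1, "o2_c": -1, "nadph_c": -2, "h_c": -2,
     "hydroxycholesterol_c": 1, "h2o_c": 1, "nadp_c": 2},
    {"hydroxycholesterol_c": -1, "o2_c": -1, "nadph_c": -2, "h_c": -2,
     "dihydroxycholesterol_c": 1, "h2o_c": 1, "nadp_c": 2},
    {"dihydroxycholesterol_c": -1, "o2_c": -1, "nadph_c": -2, "h_c": -2,
     "pregnenolone_c": 1, "methylpentanal_c": 1, "h2o_c": 2, "nadp_c": 2},
]


def lump(steps):
    """Sum step stoichiometries and drop metabolites whose net coefficient is zero."""
    net = {}
    for step in steps:
        for metabolite, coefficient in step.items():
            net[metabolite] = net.get(metabolite, 0) + coefficient
    return {m: c for m, c in net.items() if c != 0}


net_reaction = lump(STEPS)
print(net_reaction)
```

The two intermediates drop out of the net reaction, which is why they do not need to be added to the model as metabolites.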

![figures/pathway_med_strukturer_v3.png](figures/pathway_med_strukturer_v3.png)

Figure 1. Steroid biosynthesis. The natural ergosterol pathway is shown in a green box and the implemented progesterone pathway in a blue box. Enzymes are represented by their gene names: endogenous yeast genes are shown in black and heterologous genes in red. Arrows indicate the direction of each reaction. Co-enzymes and co-substrates are shown in light grey.

The new reactions and metabolites needed to implement the progesterone pathway in the iMM904 model were added to the reactions.csv and metabolites.csv files in the data folder. The following code was then used to load the model, add the reactions and metabolites from the csv files, and save a new model:

In [15]:
# Load libraries
import numpy as np
from cobra.io import read_sbml_model, write_sbml_model
from cobra.util import create_stoichiometric_matrix
from cobra import Reaction, Metabolite
from cobra.core.gene import GPR

In [16]:
# Loading model
infilename = 'models/iMM904.xml'
print(f"Loading {infilename}")
model = read_sbml_model(infilename)

Loading models/iMM904.xml


In [17]:
# Model statistics
before_add = {"metabolites": len(model.metabolites), "reactions": len(model.reactions), "genes": len(model.genes)}
print("Metabolites:", before_add['metabolites']) #1226
print("Reactions:", before_add['reactions']) #1577
print("Genes:", before_add['genes']) #905

Metabolites: 1226
Reactions: 1577
Genes: 905


In [18]:
# Add all new metabolites from metabolites.csv
new_metabolites = dict()
with open("data/metabolites.csv","r") as infile:
    infile.readline()
    for line in infile:
        line = line.rstrip().split(",")
        m = Metabolite(
            line[0],
            formula=line[1],
            name=line[2],
            compartment=line[3])
        print(f'Adding metabolite {m.name} (id: {m})')
        new_metabolites[line[0]] = m

Adding metabolite cholesterol (id: cholesterol_c)
Adding metabolite 7-dehydrocholesterol (id: dehydrocholesterol_c)
Adding metabolite pregnenolone (id: pregnenolone_c)
Adding metabolite progesterone (id: progesterone_c)
Adding metabolite 5alpha-Cholesta-7_24-dien-3beta-ol (id: cholesta724dien3betaol_c)
Adding metabolite 5alpha-Cholest-8-en-3beta-ol (id: cholesta8en3betaol_c)
Adding metabolite lathosterol (id: lathosterol_c)
Adding metabolite 4-methylpentanal (id: methylpentanal_c)


In [19]:
# Add all new reactions from reactions.csv
with open("data/reactions.csv","r") as infile:
    infile.readline()
    for line in infile:
        line = line.rstrip().split(",")
        r = Reaction(line[0])
        r.name = line[2]
        if line[3] != "":
            r.subsystem = line[3]
        r.lower_bound = float(line[4])
        r.upper_bound = float(line[5])
        if line[1] != "":
            r.gpr = GPR.from_string(line[1])
        for i in range(int(len(line[6:])/2)):
            metaboliteID = line[6+i*2]
            if metaboliteID != "":
                if metaboliteID in model.metabolites:
                    metaboliteID = model.metabolites.get_by_id(metaboliteID)
                else:
                    metaboliteID = new_metabolites[metaboliteID]
                # Stoichiometric coefficient (named to avoid shadowing the built-in bin())
                coefficient = float(line[7+i*2])
                r.add_metabolites({metaboliteID:coefficient})
        print(f'Adding reaction {r} | enzyme {r.gpr}')
        model.add_reactions([r])

model.reactions.CHLSTI.gpr = GPR.from_string("EBP")
print(f'Adding gene annotation (EBP) to reaction: {model.reactions.CHLSTI}')

model.add_boundary(model.metabolites.get_by_id("methylpentanal_c"), type="sink")
print(f'Adding reaction {model.reactions.SK_methylpentanal_c.name}: {model.reactions.SK_methylpentanal_c.reaction}')


Adding reaction R07498: h_c + nadph_c + zymst_c <=> cholesta8en3betaol_c + nadp_c | enzyme DHCR24
Adding reaction R05703: cholesta724dien3betaol_c + h_c + nadph_c <=> lathosterol_c + nadp_c | enzyme DHCR24
Adding reaction R01456: dehydrocholesterol_c + h_c + nadph_c --> cholesterol_c + nadp_c | enzyme DHCR7
Adding reaction ECYP11A1: cholesterol_c + 6.0 h_c + 6.0 nadph_c + 3.0 o2_c --> 4.0 h2o_c + methylpentanal_c + 6.0 nadp_c + pregnenolone_c | enzyme CYP11A1
Adding reaction R02216: nad_c + pregnenolone_c <=> h_c + nadh_c + progesterone_c | enzyme HSD3B
Adding reaction R03353: cholesta8en3betaol_c --> lathosterol_c | enzyme YMR202W
Adding reaction R07215: h_c + lathosterol_c + nadph_c + o2_c --> dehydrocholesterol_c + 2.0 h2o_c + nadp_c | enzyme YLR056W
Adding reaction R04804: zymst_c --> cholesta724dien3betaol_c | enzyme YMR202W
Adding gene annotation (EBP) to reaction: CHLSTI: amet_c + o2_c + zymst_c --> ahcys_c + ergtetrol_c + 2.0 h2o_c + h_c
Adding reaction 4-methylpentanal sink: m
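
The reaction loader above implies a row layout for reactions.csv: id, GPR, name, subsystem, lower bound, upper bound, followed by alternating metabolite id and coefficient columns. The sketch below parses one such row with the standard `csv` module. The sample row is hypothetical: its stoichiometry follows the R01456 output above, but the header names, the reaction name, the empty subsystem, and the bounds are assumptions and not the contents of the actual file.

```python
import csv
import io

# Hypothetical sample; only the column order is inferred from the loader above.
SAMPLE = (
    "id,gpr,name,subsystem,lower_bound,upper_bound,"
    "met1,coef1,met2,coef2,met3,coef3,met4,coef4,met5,coef5\n"
    "R01456,DHCR7,example name,,0,1000,"
    "dehydrocholesterol_c,-1,h_c,-1,nadph_c,-1,cholesterol_c,1,nadp_c,1\n"
)


def parse_reaction_row(row):
    """Turn one csv row into a plain dict describing the reaction."""
    pairs = row[6:]
    stoichiometry = {}
    # Columns after the sixth alternate between metabolite id and coefficient.
    for metabolite_id, coefficient in zip(pairs[0::2], pairs[1::2]):
        if metabolite_id:  # skip empty padding columns
            stoichiometry[metabolite_id] = float(coefficient)
    return {
        "id": row[0],
        "gpr": row[1],
        "name": row[2],
        "subsystem": row[3],
        "lower_bound": float(row[4]),
        "upper_bound": float(row[5]),
        "stoichiometry": stoichiometry,
    }


reader = csv.reader(io.StringIO(SAMPLE))
next(reader)  # skip the header line
parsed = [parse_reaction_row(row) for row in reader]
print(parsed[0])
```

One reason to prefer `csv.reader` over `str.split(",")` is that it handles quoted fields containing commas, which can occur in metabolite or reaction names.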

In [20]:
# New model statistics
print("Metabolites:", len(model.metabolites), f"(change: {len(model.metabolites) - before_add['metabolites']})") 
print("Reactions:", len(model.reactions), f"(change: {len(model.reactions) - before_add['reactions']})") 
print("Genes:",len(model.genes), f"(change: {len(model.genes) - before_add['genes']})") 

Metabolites: 1234 (change: 8)
Reactions: 1586 (change: 9)
Genes: 910 (change: 5)


In [21]:
# Saving new model
outfilename = "models/iMM904_progesterone.xml"
model.id = outfilename.split("/")[-1].split(".")[0]
print(f"Saving to {outfilename}")
write_sbml_model(model, outfilename)

Saving to models/iMM904_progesterone.xml


As seen from the code output and from the statistics of the new model, we successfully added eight new metabolites, nine new reactions, and five new genes to the model (the four heterologous genes DHCR24, DHCR7, CYP11A1 and HSD3B, plus the EBP annotation added to CHLSTI). However, in order to verify that this pathway is the optimal...

The "pathway_prediction" algorithm from cameo was used to find four different pathways produ

In [22]:
# Generating progesterone pathways with cameo
from cameo import models
from cameo.strain_design import pathway_prediction

model = models.bigg.iMM904

predictor = pathway_prediction.PathwayPredictor(model)

pathways = predictor.run(product="progesterone", max_predictions=4)

Unnamed: 0,equation,lower_bound,upper_bound
MNXR82275,H(+) + cholesterol + FAD <=> FADH2 + cholesta-...,-1000.0,1000.0
MNXR1551,"H(+) + NADPH + cholesta-5,7,24-trien-3beta-ol ...",-1000.0,1000.0
MNXR82170,"H(+) + FAD + 5alpha-cholesta-7,24-dien-3beta-o...",-1000.0,1000.0
MNXR1851,"5alpha-cholesta-8,24-dien-3beta-ol <=> 5alpha-...",-1000.0,1000.0
MNXR3037,H(+) + NADH + progesterone <=> pregnenolone + ...,-1000.0,1000.0
MNXR4011,H(+) + cholesterol + 2.0 O2 + NADPH <=> 2.0 H2...,-1000.0,1000.0


Unnamed: 0,equation,lower_bound,upper_bound
MNXR1552,cholesterol + NADP(+) <=> H(+) + NADPH + chole...,-1000.0,1000.0
MNXR82170,"H(+) + FAD + 5alpha-cholesta-7,24-dien-3beta-o...",-1000.0,1000.0
MNXR1851,"5alpha-cholesta-8,24-dien-3beta-ol <=> 5alpha-...",-1000.0,1000.0
MNXR3037,H(+) + NADH + progesterone <=> pregnenolone + ...,-1000.0,1000.0
MNXR11345,"H(+) + NADPH + cholesta-5,7,24-trien-3beta-ol ...",-1000.0,1000.0
MNXR4011,H(+) + cholesterol + 2.0 O2 + NADPH <=> 2.0 H2...,-1000.0,1000.0


Unnamed: 0,equation,lower_bound,upper_bound
MNXR5471,cholesterol + NADP(+) <=> H(+) + NADPH + chole...,-1000.0,1000.0
MNXR82170,"H(+) + FAD + 5alpha-cholesta-7,24-dien-3beta-o...",-1000.0,1000.0
MNXR82403,"H(+) + FAD + cholesta-5,24-dien-3beta-ol <=> F...",-1000.0,1000.0
MNXR1851,"5alpha-cholesta-8,24-dien-3beta-ol <=> 5alpha-...",-1000.0,1000.0
MNXR3037,H(+) + NADH + progesterone <=> pregnenolone + ...,-1000.0,1000.0
MNXR4011,H(+) + cholesterol + 2.0 O2 + NADPH <=> 2.0 H2...,-1000.0,1000.0


Unnamed: 0,equation,lower_bound,upper_bound
MNXR1552,cholesterol + NADP(+) <=> H(+) + NADPH + chole...,-1000.0,1000.0
MNXR3338,"H(+) + O2 + NADPH + 5alpha-cholesta-7,24-dien-...",-1000.0,1000.0
MNXR1851,"5alpha-cholesta-8,24-dien-3beta-ol <=> 5alpha-...",-1000.0,1000.0
MNXR3037,H(+) + NADH + progesterone <=> pregnenolone + ...,-1000.0,1000.0
MNXR11345,"H(+) + NADPH + cholesta-5,7,24-trien-3beta-ol ...",-1000.0,1000.0
MNXR4011,H(+) + cholesterol + 2.0 O2 + NADPH <=> 2.0 H2...,-1000.0,1000.0
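
To compare the four predictions, it can help to see which reactions they share. The sketch below uses the MetaNetX reaction identifiers transcribed from the output tables above. The variable names are our own, and the grouping is only a reading aid, not an analysis performed by cameo.

```python
from collections import Counter

# Reaction ids transcribed from the four predicted pathways above.
predicted = {
    1: ["MNXR82275", "MNXR1551", "MNXR82170", "MNXR1851", "MNXR3037", "MNXR4011"],
    2: ["MNXR1552", "MNXR82170", "MNXR1851", "MNXR3037", "MNXR11345", "MNXR4011"],
    3: ["MNXR5471", "MNXR82170", "MNXR82403", "MNXR1851", "MNXR3037", "MNXR4011"],
    4: ["MNXR1552", "MNXR3338", "MNXR1851", "MNXR3037", "MNXR11345", "MNXR4011"],
}

# Reactions present in every predicted pathway.
shared = set.intersection(*(set(ids) for ids in predicted.values()))

# Reactions that occur in exactly one predicted pathway.
counts = Counter(rid for ids in predicted.values() for rid in ids)
unique = {
    number: [rid for rid in ids if counts[rid] == 1]
    for number, ids in predicted.items()
}

print("Shared by all pathways:", sorted(shared))
for number, ids in unique.items():
    print(f"Only in pathway {number}:", ids)
```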


Some comments