In [2]:
from IPython.display import FileLink, FileLinks

# Group 69420: Lycopene production in *Saccharomyces cerevisiae*

## 1. Introduction

### 1.1 Literature review of the compound
#### Applications of the product
Lycopene (C40H56) is a red carotenoid pigment containing 13 double bonds, with strong antioxidant activity. It also has considerable industrial value and is widely used in the pharmaceutical, food, feed, cosmetic, and nutritional supplement industries, for example as a natural colorant (Shi et al. 2019; Hong et al. 2019). Its antioxidant properties are associated with favorable physiological effects such as anti-aging and anti-cancer activity, and these effects, along with lycopene's vibrant red color, are the underlying reasons for its wide use in these industries (Shi et al. 2019; Hong et al. 2019).

Lycopene is found extensively in fruits and vegetables such as tomatoes, and numerous microorganisms, including the yeast *Xanthophyllomyces dendrorhous* and the bacterium *Pantoea agglomerans*, naturally produce lycopene (Shi et al. 2019). Lycopene production therefore currently consists of extracting the pigment from plants with nonpolar solvents or producing it by microbial fermentation. Lycopene can also be synthesized chemically; however, this is only done to a limited extent. Because of the risks associated with chemical synthesis, the low yields obtained when extracting lycopene from natural plant sources, and the unstable supply of natural plant sources (caused by climate changes and seasonal shifts as well as rising pollution), microbial production of lycopene is a more economical and sustainable alternative (Shi et al. 2019; Chen et al., 2016). Moreover, successful industrial production of lycopene by microbial fermentation would both decrease the consumption of natural plant sources for lycopene extraction and increase the market supply of lycopene (Shi et al. 2019).

#### Evaluation of market potential
As a result of the many applications of lycopene, the pigment has a high market value. The global lycopene market was valued at USD 107.2 million in 2020, and its compound annual growth rate (CAGR) is forecast at 5.2% from 2021 to 2030 (Himanshu et al., 2021). This growth is mainly driven by the rising demand for natural colorants in ready-to-eat food products and for natural antioxidants, as well as by the growing use of carotenoids in the food, cosmetic and pharmaceutical industries. Furthermore, increasing research into the development of anti-cancer drugs is anticipated to drive the lycopene market further in the coming years (Himanshu et al., 2021).

#### Biosynthetic pathway 
Lycopene can be produced from the MVA pathway (endogenous to eukaryotes and native to *Saccharomyces cerevisiae*) and the MEP pathway (endogenous to prokaryotes and plants). Both pathways are shown in Figure 1A. The two pathways are very similar; however, the MEP pathway produces both IPP and DMAPP, whereas the MVA pathway yields only IPP and requires an isomerase (IDI) to generate DMAPP. As seen in Figure 1B, IPP and DMAPP are then condensed to geranylgeranyl diphosphate (GGPP) by GGPP synthase (GGPPS/CrtE), after which phytoene synthase (CrtB) condenses two GGPP molecules to form phytoene. Finally, phytoene desaturase (CrtI) converts phytoene to lycopene (Shi et al. 2019). Note that Figures 1A and 1B do not show every detail of each intermediate reaction step.

![260624651_407270717796037_2421228233972692860_n.png](attachment:260624651_407270717796037_2421228233972692860_n.png)
Figure 1. The biosynthetic pathway of lycopene. Figure 1A shows the MVA and MEP pathway (the figure is adapted from Dissook et al., 2021), whereas Figure 1B shows the rest of the biosynthetic pathway of lycopene from IPP/DMAPP to lycopene (the figure is adapted from Hong et al., 2019).
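
As a rough illustration of the carbon bookkeeping behind this pathway, the sketch below counts how many C5 units, acetyl-CoA molecules and glucose molecules are needed per lycopene via the MVA route. It assumes the textbook stoichiometry of three acetyl-CoA per IPP (one carbon lost as CO2) and two acetyl-CoA per glucose, and it ignores ATP and redox costs. The result is therefore only a stoichiometric ceiling for this route and not a prediction of the Yeast8 model. All variable names are illustrative.

```python
from fractions import Fraction

# Carbon atoms per molecule along the pathway described above.
CARBONS = {"IPP": 5, "GGPP": 20, "lycopene": 40, "glucose": 6}

# Assumed textbook stoichiometries (not taken from the Yeast8 model):
ACETYL_COA_PER_IPP = 3      # MVA pathway: 3 acetyl-CoA -> 1 IPP + 1 CO2
ACETYL_COA_PER_GLUCOSE = 2  # glycolysis followed by pyruvate decarboxylation

# GGPP is built from four C5 units; phytoene (and lycopene) from two GGPP.
c5_units_per_ggpp = CARBONS["GGPP"] // CARBONS["IPP"]
c5_units_per_lycopene = 2 * c5_units_per_ggpp

acetyl_coa_per_lycopene = c5_units_per_lycopene * ACETYL_COA_PER_IPP
glucose_per_lycopene = Fraction(acetyl_coa_per_lycopene, ACETYL_COA_PER_GLUCOSE)

# Upper bounds implied by carbon stoichiometry alone.
max_molar_yield = 1 / glucose_per_lycopene  # mol lycopene per mol glucose
carbon_efficiency = Fraction(CARBONS["lycopene"]) / (
    glucose_per_lycopene * CARBONS["glucose"]
)

print(f"C5 units per lycopene:    {c5_units_per_lycopene}")
print(f"Acetyl-CoA per lycopene:  {acetyl_coa_per_lycopene}")
print(f"Glucose per lycopene:     {glucose_per_lycopene}")
print(f"Max molar yield:          {float(max_molar_yield):.3f} mol/mol glucose")
print(f"Carbon retained:          {float(carbon_efficiency):.1%}")
```

Exact fractions are used so that the ratios are not blurred by floating-point rounding.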

### 1.2 Literature review of the cell factory

The cell factory utilized in this report is *S. cerevisiae*, and its general advantages and disadvantages are discussed below. The suitability of this cell factory for lycopene production and possible alternative cell factories are also discussed.

#### General advantages
*S. cerevisiae* is one of the most widely used microorganisms in industry and has been used to produce a wide variety of biological compounds. Because it has been used in alcoholic fermentation and baking for centuries, it remains one of the most intensively studied eukaryotic organisms. As a well-known workhorse, its genome sequence is thoroughly annotated in several genome databases and molecular cloning techniques are well established, which enables easy knock-out of genes and introduction of recombinant gene constructs. This makes it possible to engineer *S. cerevisiae* for heterologous production of high-value compounds and fine chemicals that are not naturally produced by the organism.
*S. cerevisiae* is generally recognized as safe (GRAS), which often makes it the preferred chassis for industrial production (Chen et al., 2016). The wild-type (WT) yeast strain shows a maximum specific growth rate of 0.44 h<sup>-1</sup> on glucose (Paalme et al., 1997). This is almost half the growth rate of simpler prokaryotic organisms such as *E. coli*, which makes *S. cerevisiae* a sufficiently fast-growing eukaryote.
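
To make the quoted growth rate more tangible, the sketch below converts a specific growth rate into a doubling time using the standard relation t_d = ln(2) / mu for exponential growth. The function name is illustrative.

```python
import math


def doubling_time_h(mu_per_h: float) -> float:
    """Doubling time in hours for exponential growth at specific rate mu (1/h)."""
    return math.log(2) / mu_per_h


MU_MAX_YEAST = 0.44  # 1/h, maximum specific growth rate on glucose quoted above

td = doubling_time_h(MU_MAX_YEAST)
print(f"Doubling time at mu = {MU_MAX_YEAST} 1/h: {td:.2f} h ({td * 60:.0f} min)")
```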

#### General disadvantages
Even though *S. cerevisiae* is generally a good chassis for the production of several heterologous products, some issues remain to be addressed. As eukaryotic cells, unlike prokaryotic cells, are divided into different compartments, various metabolites and enzymes are separated from each other. This has to be taken into consideration when optimizing introduced recombinant pathways, as an enzyme may be located in a different organelle than its substrate. This may lead to bottlenecks and hence limit the synthesis of the end product. On the other hand, eukaryotic organisms generally tend to be better at expressing heterologous genes from other eukaryotes.

#### Suitability of the cell factory for the product
*S. cerevisiae* does not naturally produce lycopene; however, the host is generally well suited, as the MVA pathway is endogenous to (most) WT strains. The pathway ends with the metabolite geranylgeranyl pyrophosphate, which makes synthetic pathway extension necessary in order to enable *S. cerevisiae* to produce lycopene (Shi et al. 2019). By using *S. cerevisiae* as chassis for heterologous lycopene production, only three additional enzymes are required. These enzymes are geranylgeranyl pyrophosphate (GGPP) synthase, phytoene synthase, and phytoene desaturase, the latter also known as lycopene synthase (Hong et al. 2019).
*S. cerevisiae* is therefore considered a promising host for heterologous lycopene production, and previous studies have demonstrated that heterologous production of lycopene in *S. cerevisiae* is possible. However, one limiting factor for high-level production is the toxicity of lycopene (Hong et al. 2019), which may be due to the hydrophobic nature of the tetraterpene. Additionally, the current low yield of lycopene produced in *S. cerevisiae* might be attributed to incompatibility between the endogenous and heterologous pathways (Shi et al. 2019).

#### What would be suitable alternative cell factories, and why is the selected one more interesting/suitable?
At present, heterologous production of lycopene has been successful in *Blakeslea trispora*, *Escherichia coli* and *S. cerevisiae*. However, as both *B. trispora* and *E. coli* release endotoxins, their industrial application is limited due to food safety issues (Chen et al., 2016). Currently, the yield of lycopene obtained from production in *S. cerevisiae* is lower than in *E. coli*, and the downstream extraction process is more difficult. Thus, further pathway engineering is needed to achieve overproduction in *S. cerevisiae* (Chen et al., 2016).

## 2. Problem definition

Currently, lycopene production depends on extraction from plants with nonpolar solvents or on microbial fermentation, of which microbial production is more economical and sustainable (Shi et al. 2019; Chen et al., 2016). In this report we examine how to achieve overproduction of lycopene in *S. cerevisiae* through further pathway engineering. We perform this assessment in silico using the genome-scale metabolic model (GSM) Yeast8.

For *S. cerevisiae* to be a successful production platform, the fermentation process needs to result in adequate titers, rates and yields, so that production is economically viable. With this in mind, our approach focuses on optimizing the production of lycopene by increasing the flux of carbon towards lycopene. We will investigate different methods to achieve this, including media optimization, knockout and upregulation of genes, phenotypic phase plane analysis, and co-factor swapping.

The heterologous lycopene synthetic downstream pathway shown in Figure 1 will be added to the GSM, after which we will engineer the cell factory by using different computational methods to identify metabolic changes that will result in the (over)production of lycopene in *S. cerevisiae*.

## 3. Selection and assessment of existing GSM

*S. cerevisiae* is extensively used as a cell factory and model organism in basic biological research. Due to recent developments in systems biology and strong research interest in yeast, there are multiple GSMs available for *S. cerevisiae*, which have undergone multiple rounds of curation and improvement since the first version was published. This has led to *S. cerevisiae* GSMs contributing significantly to studies of yeast, both as platforms for multi-omics integration and as tools for in silico strain design (Lu et al., 2019).

The GSM with the identifier iMM904 (MM is the abbreviation for Monica Mo, the main model developer and first author of its publication; 904 is the number of ORFs accounted for in the model) is one example of a *S. cerevisiae* GSM (Mo et al., 2009). The iMM904 GSM was reconstructed based on an already existing GSM, iND750, and includes 1,577 reactions and 905 genes. The network model was validated by comparing 2,888 in silico single-gene deletion strain growth phenotype predictions to published experimental data, and the predicted intracellular flux changes were shown to be consistent with published measurements of intracellular metabolite fluxes (Mo et al., 2009). Because of this, iMM904 was considered a promising candidate model for this project.

Another promising candidate model for this project is the Yeast8 GSM developed by Lu et al. (2019). All functional gene annotations of *S. cerevisiae* from databases such as KEGG, SGD and UniProt were collected and compared for the reconstruction of the GSM, and the gene coverage and performance of the GSM were improved during the iterative update process, so that the model now includes 4,058 reactions and 1,150 genes. The Yeast8 GSM was developed with the aid of version control and open collaboration, which provides a platform for continued expansion of the model and can greatly accelerate iterative updates of the GSM. Yeast8 is therefore currently the most comprehensive reconstruction of yeast metabolism and is well suited for simulations (Lu et al., 2019).


#### Performance comparison with Memote

Since both GSMs were deemed promising, the performance of the two models was compared with Memote in order to select the best model for our project. This analysis is summarized in the table below and can be found in the file “02 Comparison of models” in the main folder; the resulting Memote reports can be found in the “model assessment” folder.

__Model__ | __Total score (%)__ | __Total reactions__ | __Total metabolites__ | __Total genes__ | __Metabolic coverage__ | __Consistency score (%)__ | __Annotation-metabolites score (%)__ | __Annotation-reactions score (%)__
------------ | ------------- | ------------- | ------------- | ------------- | ------------- | ------------- | ------------- | -------------
iMM904 | 68 | 1,577 | 1,226 | 905 | 1.74 | 53 | 80 | 82
Yeast8 | 65 | 4,058 | 2,742 | 1,150 | 3.53 | 50 | 41 | 65

From the table we see that the iMM904 model has a better total score; however, we chose to work with the Yeast8 GSM. This choice was made because Yeast8 is more comprehensive, with double the metabolic coverage, and has almost the same total score while containing significantly more reactions, metabolites, and genes. Moreover, because of the model's comprehensiveness, we expect it to be better suited for cell engineering aimed at (over)production of lycopene. Yeast8 mainly scores lower in the Memote "Annotation" categories because it has fewer annotations linking to metabolic pathway databases (such as BRENDA and KEGG). This should not, however, hinder our use of the model.
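
The metabolic coverage column can be reproduced from the other columns of the table. The sketch below assumes Memote's definition of metabolic coverage as the number of reactions divided by the number of genes; the dictionary layout and function name are illustrative.

```python
# Reaction and gene counts copied from the comparison table above.
models = {
    "iMM904": {"reactions": 1577, "genes": 905},
    "Yeast8": {"reactions": 4058, "genes": 1150},
}


def metabolic_coverage(reactions: int, genes: int) -> float:
    """Ratio of reactions to genes (assumed Memote definition)."""
    return reactions / genes


coverage = {
    name: metabolic_coverage(counts["reactions"], counts["genes"])
    for name, counts in models.items()
}

for name, value in coverage.items():
    print(f"{name}: metabolic coverage = {value:.2f}")

ratio = coverage["Yeast8"] / coverage["iMM904"]
print(f"Yeast8 / iMM904 coverage ratio = {ratio:.2f}")
```

With these counts the ratio between the two models comes out at roughly two, in line with the "double the metabolic coverage" statement.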


#### Reliable predictions

We expect the Yeast8 GSM to facilitate reliable predictions, based on our assessment with Memote, its high number of publications and thorough experimental validation, and the fact that it is the most comprehensive reconstruction of yeast metabolism; the metabolic scope of the consensus GEM of *S. cerevisiae* is significantly improved with this model. Moreover, the quality of the Yeast8 model is continuously controlled in a standardized manner with Memote, and the model will continue to be developed together with its ecosystem of models, which further supports our expectation of reliable predictions. However, as the numbers of reactions, metabolites, and genes in the model are still a great deal smaller than in *S. cerevisiae* itself, the model predictions are not expected to be completely accurate.

In [3]:
print("Memote outputs given as .html files:")
FileLinks("model assessment")

Memote outputs given as .html files:


## 4. Computer-Aided Cell Factory Engineering


### Cell factory engineering strategies used for lycopene production in yeast

Strategies used to optimize lycopene production in yeast are listed in the following table. Other strategies, such as finding candidates for gene knockout and finding heterologous pathways for lycopene, were also tested, but without success. The main reason these did not work was a lack of computational power: the model contains a large number of reactions, which makes it hard to optimize.

| Strategy                                              | Link to file |
|:--------------------------------------------------------|---|
| Introduction of heterologous pathway                    | ? |
| Phenotypic phase plane analysis                         | ? |
| Identifying overexpression and downregulation targets | [05_overexpression](05_overexpression.ipynb) |
| Media optimization                                      | ? |
| Identifying co-factor swap targets                      | ? |



### Introduction of heterologous pathway

To allow for the simulation of the biosynthesis of lycopene, the following metabolites and reactions were added to the Yeast8 GSM. 


| Metabolite added 	| Metabolite ID 	|
|:---	|:---	|
| Phytoene 	| phytoene 	|
| Lycopene 	| lycopene 	|


| Reaction added 	| Reaction ID 	|
|:---	|:---	|
| Phytoene synthase 	| CrtB 	|
| Lycopene synthase 	| CrtI 	|


In contrast to the synthesis pathway shown in [Figure 1](#figure_cell), the Yeast8 model already includes the reactions and metabolites up until GGPP synthesis. Thus, only the last two reactions (catalysed by CrtB and CrtI) and the metabolites "phytoene" and "lycopene" had to be added to the model. Furthermore, the implicit reactions mentioned earlier, which include the condensation of IPP and DMAPP and the formation of geranyl pyrophosphate and farnesyl pyrophosphate, had to be found and inspected in the model in order to check their viability for our strain design. In addition, lycopene synthesis relies on a cofactor, the electron carrier FAD, which is drawn from the endogenous metabolite pool in the last reaction step.
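
Before adding reactions to a GSM it is useful to confirm that they are mass balanced. The sketch below checks the two added reactions in plain Python, independently of cobra. It assumes neutral molecular formulas, the release of two diphosphate molecules by CrtB, and four FAD per lycopene (one per desaturation step) for CrtI; the stoichiometry and charged formulas actually used in `02_loading_model.ipynb` may differ. The dictionary and function names are hypothetical.

```python
import re
from collections import Counter

# Neutral molecular formulas (assumption: the model may use charged forms).
FORMULAS = {
    "GGPP": "C20H36O7P2",
    "diphosphate": "H4P2O7",
    "phytoene": "C40H64",
    "lycopene": "C40H56",
    "FAD": "C27H33N9O15P2",
    "FADH2": "C27H35N9O15P2",
}


def parse_formula(formula: str) -> Counter:
    """Count the atoms of each element in a simple molecular formula."""
    atoms = Counter()
    for element, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        atoms[element] += int(count) if count else 1
    return atoms


def atom_imbalance(stoichiometry: dict) -> dict:
    """Net atoms (products minus substrates); an empty dict means balanced."""
    net = Counter()
    for metabolite, coefficient in stoichiometry.items():
        for element, count in parse_formula(FORMULAS[metabolite]).items():
            net[element] += coefficient * count
    return {element: n for element, n in net.items() if n != 0}


# Negative coefficients are substrates, positive coefficients are products.
CRTB = {"GGPP": -2, "phytoene": 1, "diphosphate": 2}
CRTI = {"phytoene": -1, "FAD": -4, "lycopene": 1, "FADH2": 4}

for name, reaction in (("CrtB", CRTB), ("CrtI", CRTI)):
    imbalance = atom_imbalance(reaction)
    print(name, "balanced" if not imbalance else f"unbalanced: {imbalance}")
```

Leaving the cofactor out of the CrtI reaction shows why it is needed: phytoene and lycopene differ by eight hydrogen atoms, which have to be accepted by an electron carrier.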


In [4]:
FileLink("02_loading_model.ipynb")

### Determining maximal theoretical yields and productivity

In [5]:
FileLink("03_theoretical_yields.ipynb")

### Phenotypic phase plane analysis
Using the cobra tool for determining production envelopes, we assessed the phenotypic phase planes for different process conditions set for the Yeast8 model with the integrated lycopene pathway. Not surprisingly, the phase plane analysis shows that both biomass formation (biomass drain flux) and lycopene production (flux through the CrtI-catalysed reaction) increase with increasing glucose uptake (glucose exchange rate). However, the flux towards biomass formation reaches an optimum at an uptake of 600 mmol/g DW/h (exchange flux represented as -600) and then starts decreasing when glucose uptake is increased further. This could indicate the occurrence of overflow metabolism, also known as the Crabtree effect, a process known to occur in *S. cerevisiae* under aerobic conditions and high glucose concentrations (Barford and Hall 1979). Lycopene production also increases steadily with glucose uptake, but stalls when the uptake level mentioned above is reached. Less biomass means fewer cells, which in turn leads to lower productivity.

Oxygen uptake is important for lycopene production, as production is less effective under anaerobic conditions. However, a very large oxygen uptake negatively affects both growth and lycopene formation, which could be due to oxygen toxicity. It is probably not physiologically feasible (or even possible) to increase oxygen uptake to that extent: the uptake has to be greater than 260 mmol/g DW/h before we see the drop in biomass and lycopene flux. Lycopene can still be produced even when oxygen uptake is decreased to zero. Anaerobic conditions might be better for regeneration of the FAD pool, since the cofactor is not consumed in the TCA cycle, which is downregulated when *S. cerevisiae* grows under fermentative conditions (Pfeiffer and Morley 2014).
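
The idea behind a production envelope can be illustrated with a deliberately small toy model. The sketch below is not the cobra routine used in the notebook and does not use Yeast8. It assumes a single substrate constraint shared between growth and product, with hypothetical yield coefficients, and reports the maximum product flux at a series of fixed growth rates.

```python
def toy_production_envelope(uptake, biomass_yield, product_yield, points=5):
    """Maximum product flux at evenly spaced growth rates in a toy model.

    The only constraint is that substrate spent on growth and on product
    cannot exceed the uptake:
        growth / biomass_yield + product / product_yield <= uptake
    """
    max_growth = uptake * biomass_yield
    envelope = []
    for i in range(points):
        growth = max_growth * i / (points - 1)
        # Substrate left after growth is converted entirely into product.
        product = product_yield * (uptake - growth / biomass_yield)
        envelope.append((growth, max(product, 0.0)))
    return envelope


# Hypothetical parameters chosen only for illustration.
envelope = toy_production_envelope(uptake=10.0, biomass_yield=0.1, product_yield=0.05)
for growth, product in envelope:
    print(f"growth = {growth:.2f}  max product = {product:.3f}")
```

In this toy case the trade-off is a straight line from maximum product at zero growth to zero product at maximum growth. A genome-scale model has many more constraints, so its envelopes need not be linear, as the behaviour described above shows.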

In [6]:
FileLink("04_phenotype_phase_plane_analysis.ipynb")

### Identifying overexpression and downregulation targets

We want to know which reactions, if overexpressed or downregulated, affect lycopene production. This requires going through all the reactions to see whether they affect lycopene production or are affected by it. Using flux scanning based on enforced objective flux (FSEOF), it is possible to see which fluxes increase or decrease as the product flux increases. This method has been validated for lycopene production in *E. coli*, where FSEOF accurately predicted increased lycopene production when certain genes were overexpressed (Choi et al. 2010).
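
The selection step of FSEOF can be sketched in a few lines. The example below is a simplified version of the criterion and not the cameo implementation: given flux values recorded at increasing enforced product flux, it keeps reactions whose flux magnitude rises (or falls) monotonically without the flux changing direction. The reaction IDs and flux values are hypothetical placeholders.

```python
def fseof_candidates(flux_profiles, tolerance=1e-9):
    """Split reactions into those whose flux rises or falls with product flux.

    flux_profiles maps a reaction ID to a list of fluxes, one per enforced
    product level, ordered from the lowest to the highest level.
    """
    increasing, decreasing = [], []
    for reaction_id, fluxes in flux_profiles.items():
        # Skip reactions that reverse direction along the scan.
        if min(fluxes) < -tolerance and max(fluxes) > tolerance:
            continue
        magnitudes = [abs(value) for value in fluxes]
        steps = [b - a for a, b in zip(magnitudes, magnitudes[1:])]
        net_change = magnitudes[-1] - magnitudes[0]
        if all(step >= -tolerance for step in steps) and net_change > tolerance:
            increasing.append(reaction_id)
        elif all(step <= tolerance for step in steps) and net_change < -tolerance:
            decreasing.append(reaction_id)
    return increasing, decreasing


# Hypothetical flux profiles at four enforced lycopene flux levels.
profiles = {
    "rxn_A": [0.1, 0.4, 0.9, 1.5],    # rises steadily
    "rxn_B": [2.0, 1.2, 0.6, 0.1],    # falls steadily
    "rxn_C": [-0.5, 0.0, 0.3, 0.8],   # changes direction, so it is excluded
    "rxn_D": [1.0, 1.0, 1.0, 1.0],    # unaffected
}

up, down = fseof_candidates(profiles)
print("overexpression candidates: ", up)
print("downregulation candidates:", down)
```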

Out of 110 reactions identified by the algorithm, only about 20 were deemed significant. These reactions show a significant change in flux when the lycopene flux is increased. Most of them are related to central metabolism, the production of precursors for lycopene (i.e. the pentose phosphate pathway, glycolysis, the TCA cycle and mevalonate production) and transport to and from the mitochondrion. This can be explained by the fact that the precursor for lycopene in our heterologous pathway is acetyl-CoA.

Nevertheless, a few potential targets for overexpression were found, for example:

#### soluble fumarate reductase (r_0455)
FADH2 [cytoplasm] + fumarate [cytoplasm] &rarr; FAD [cytoplasm] + H<sup>+</sup> [cytoplasm] + succinate [cytoplasm]

#### succinate-fumarate transport (r_1265)
fumarate [mitochondrion] + succinate [cytoplasm] &rarr; fumarate [cytoplasm] + succinate [mitochondrion]

#### succinate dehydrogenase (ubiquinone-6) (r_1021)
succinate [mitochondrion] + ubiquinone-6 [mitochondrion] &rarr; fumarate [mitochondrion] + ubiquinol-6 [mitochondrion]

These reactions all relate to fumarate, which can be utilized to oxidize FADH<sub>2</sub> back to FAD, the cofactor that accepts electrons when phytoene is desaturated to lycopene. Overexpressing these reactions would lead to more FAD being available.

Another one could be
#### nucleoside diphosphate kinase (r_0800)
ATP [cytoplasm] + GDP [cytoplasm] &rarr; ADP [cytoplasm] + GTP [cytoplasm]

This reaction produces GTP, which participates in the conversion of mevalonate to IPP. Other examples can be found in this separate [analysis document](05_overexpression.ipynb).

In [7]:
FileLink("05_overexpression.ipynb")

### Media optimization
In the Yeast8 model, the default medium is based on the most essential components for growth (i.e. a minimal medium). Glucose is the carbon source, ammonium is the nitrogen source, and the most essential ions and trace metals are also included. In this medium, biosynthesis of all amino acids and various other central metabolites is required. Thus, a lot of the carbon is used for biosynthesis of central metabolites instead of lycopene synthesis. To optimize the medium, we changed it to mimic YEPD medium, which is often used for fungal growth. This was done by adding all amino acids to the medium in order to limit the requirement for biosynthesis. In addition, the glucose uptake was increased to 20 mmol/gDW/h to improve the growth of *S. cerevisiae*.
Changing the medium significantly increases both the biomass productivity and the productivity of lycopene. Biomass productivity is increased 119-fold and lycopene productivity 99-fold. The maximum theoretical yield of lycopene is increased by a factor of 5.
In addition to examining the impact of the amino acids on the growth of *S. cerevisiae*, the carbon source was also changed. The carbon sources tested were glucose, fructose, succinate, pyruvate and citrate. Glucose and fructose resulted in the highest biomass and lycopene productivity, but no major differences between the various carbon sources were observed.

Evaluation parameters | Minimal medium | YEPD
:------------- | :-------------: | -------------
Maximum theoretical biomass productivity (h<sup>-1</sup>) | 0.084 | 10.238
Maximum theoretical productivity of lycopene (mmol &bull; g DW<sup>-1</sup> &bull; h<sup>-1</sup>) | 0.169 | 16.710
Maximum theoretical yield of lycopene (mmol<sub>lycopene</sub> &bull; mmol<sub>carbon</sub><sup>-1</sup>) | 0.170 | 0.8355



In [8]:
FileLink("07_media_optimization.ipynb")
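
The fold changes quoted in the media optimization section can be recomputed from the table. The sketch below does so with the rounded table entries, so small deviations from the quoted figures are expected (for biomass the rounded entries give roughly 122 rather than 119). It also checks that the YEPD yield equals the lycopene productivity divided by the stated glucose uptake of 20 mmol/gDW/h; reading the yield this way is an assumption, and the variable names are illustrative.

```python
# Rounded values copied from the media comparison table.
minimal = {"biomass": 0.084, "lycopene": 0.169, "yield": 0.170}
yepd = {"biomass": 10.238, "lycopene": 16.710, "yield": 0.8355}

# Fold change of each evaluation parameter when switching to the YEPD-like medium.
fold_change = {key: yepd[key] / minimal[key] for key in minimal}
for key, value in fold_change.items():
    print(f"{key:>8}: {value:6.1f}-fold")

# Assumed relation: yield = lycopene productivity / glucose uptake.
GLUCOSE_UPTAKE_YEPD = 20.0  # mmol/gDW/h, as stated in the text
yield_from_rates = yepd["lycopene"] / GLUCOSE_UPTAKE_YEPD
print(f"YEPD yield from rates: {yield_from_rates:.4f}")
```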

### Cofactor swap targets

In the pathway added to our model, FAD is used as a cofactor in the production of lycopene. It is therefore interesting to see whether the yield can be improved by swapping cofactors in some reactions in the model. An algorithm from the cameo package was used; it searches for all reactions containing the cofactors, swaps them and checks the result, so that one can see which reactions benefit from cofactor swapping for the production of lycopene. Interestingly, the analysis recommended swapping out FAD in the following reaction

#### soluble fumarate reductase (r_0455)
FADH2 [cytoplasm] + fumarate [cytoplasm] &rarr; FAD [cytoplasm] + H<sup>+</sup> [cytoplasm] + succinate [cytoplasm]

and replacing it with NADP. This would be done while also exchanging NADP for FAD in another reaction, which indicates that this reaction does not carry enough flux compared to other reactions that regenerate NADP. It should be mentioned that it is not necessarily possible to switch out FAD for NAD(P) in many cases but it seems from the analysis that finding other ways f

In [9]:
FileLink("06_cofactor_swap.ipynb")
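
What a cofactor swap does to a single reaction can be shown on the stoichiometry of r_0455 given in the cofactor swap section. The sketch below is a minimal stand-alone illustration and not the cameo algorithm: it only renames the cofactor pair in a stoichiometry dictionary and does not re-balance protons or charges, which a real model edit would have to handle. The function and variable names are hypothetical.

```python
def swap_cofactors(stoichiometry, swap_pairs):
    """Return a copy of the reaction with each listed cofactor replaced."""
    swapped = {}
    for metabolite, coefficient in stoichiometry.items():
        replacement = swap_pairs.get(metabolite, metabolite)
        # Sum coefficients in case the replacement is already in the reaction.
        swapped[replacement] = swapped.get(replacement, 0) + coefficient
    return swapped


# r_0455 as written in the text (negative = substrate, positive = product).
R_0455 = {"FADH2": -1, "fumarate": -1, "FAD": 1, "H+": 1, "succinate": 1}

# Swap the FAD/FADH2 pair for NADP/NADPH.
FAD_TO_NADP = {"FAD": "NADP", "FADH2": "NADPH"}

swapped = swap_cofactors(R_0455, FAD_TO_NADP)
print(swapped)
```

The original dictionary is left untouched, so the native and swapped variants can be compared side by side.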

## 5. Discussion


During our work simulating an engineered *S. cerevisiae* strain for the production of lycopene, we have successfully implemented the heterologous synthesis pathway, based on current research showing promising results. Although we have confirmed that the model allows for the formation of lycopene, the productivity and yields are merely theoretical, as the engineered strain has not been tested in an experimental setting.

One of the key approaches to increasing lycopene and biomass productivities is optimization of the medium. In this project, the medium content and the carbon source, together with glucose concentrations, were varied to identify the optimal conditions. It was found that with a YEPD-like medium containing all amino acids, lycopene productivity was increased 99-fold. Similarly, biomass productivity was increased by a factor of 119. The analysis showed that glucose and fructose were the best carbon sources, which is not surprising as these feed directly into glycolysis and central metabolism. It was additionally found that high glucose concentrations in the medium resulted in elevated lycopene yields.
Overall, these findings indicate that adjusting the content of the medium is an easy way to increase productivity.

Looking at the phenotypic phase planes, it is worth considering the optimal growth conditions for lycopene production. It is most likely infeasible to increase glucose uptake to the level at which we see the growth and lycopene productivity optima (~580 mmol/g DW/h). Oxygen uptake could perhaps be increased by improving mass transfer during the growth process, but that is more a matter of process adaptation, where the goal is to make oxygen a non-limiting factor. The maximum theoretical lycopene productivity reached is ~26 mmol/g DW/h for each optimal condition, but the yield is greatly decreased (0.03 mmol-lycopene/mmol-glc) since more glucose is needed in the growth medium. Surprisingly, lycopene can be produced anaerobically, although not effectively.

However, it should be noted that these increases are based only on the computational model and not on real data. Hence, actual experiments are likely to give a different outcome, although the modifications are likely to increase the biomass productivity and lycopene yield.



## 6. Conclusion

A summary of your strategy, expected outcome, and the impact of the project, should you achieve what you are trying to do.

## References

1. Chen et al. Lycopene overproduction in *Saccharomyces cerevisiae* through combining pathway engineering with host engineering. Microbial Cell Factories (2016). 15:113. DOI 10.1186/s12934-016-0509-4


2. Dissook et al. Stable isotope and chemical inhibition analyses suggested the existence of a non-mevalonate-like pathway in the yeast *Yarrowia lipolytica*. Scientific Reports (2021). 11:5598. DOI 10.1038/s41598-021-85170-0


3. Himanshu et al. Lycopene Market by Form, Nature and Application, Global Opportunity Analysis and Industry Forecast, 2021–2030. Allied Market Research (2021). The article is from https://www.alliedmarketresearch.com/lycopene-market-A06684


4. Hong et al. Efficient production of lycopene in *Saccharomyces cerevisiae* by enzyme engineering and increasing membrane flexibility and NAPDH production. Applied Microbiology and Biotechnology (2019). 103:211–223. DOI 10.1007/s00253-018-9449-8


5. Lu et al. A consensus *S. cerevisiae* metabolic model Yeast8 and its ecosystem for comprehensively probing cellular metabolism. Nature Communications (2019). 10:3586. DOI 10.1038/s41467-019-11581-3. The github repository: https://github.com/SysBioChalmers/yeast-GEM


6. Mo et al. Connecting extracellular metabolomic measurements to intracellular flux states in yeast. BMC Systems Biology (2009). 3:37. DOI 10.1186/1752-0509-3-37


7. Paalme et al. Growth efficiency of *Saccharomyces cerevisiae* on glucose/ethanol media with a smooth change in the dilution rate (A-stat). Enzyme and Microbial Technology (1997). 20:174-181. DOI 10.1016/S0141-0229(96)00114-7 


8. Shi et al. Systematic Metabolic Engineering of *Saccharomyces cerevisiae* for Lycopene Overproduction.  Journal of Agricultural and Food Chemistry (2019). 67:11148−11157. DOI 10.1021/acs.jafc.9b04519


9. Barford and Hall. An Examination of the Crabtree Effect in *Saccharomyces cerevisiae*: the Role of Respiratory Adaptation. Microbiology (1979). 114:267-275. DOI 10.1099/00221287-114-2-267