# Group 69420: Lycopene production in *Saccharomyces cerevisiae*

## 1. Introduction

### 1.1 Literature review of the compound
#### Applications of the product
Lycopene (C40H56) is a red carotenoid pigment containing 13 double bonds and has strong antioxidant activity. It is of considerable industrial value and is widely used as a natural colorant in the pharmaceutical, food, feed, cosmetic, and nutritional supplement industries (Shi et al., 2019; Hong et al., 2019). Its antioxidant properties are associated with favorable physiological effects such as anti-aging and anti-cancer activity, and these effects, together with lycopene's vibrant red color, are the main reasons for its wide use in these industries (Shi et al., 2019; Hong et al., 2019).

Lycopene is abundant in fruits and vegetables such as tomatoes, and numerous microorganisms, including the yeast *Xanthophyllomyces dendrorhous* and the bacterium *Pantoea agglomerans*, produce it naturally (Shi et al., 2019). Accordingly, lycopene is currently produced either by extraction from plants with nonpolar solvents or by microbial fermentation. Lycopene can also be synthesized chemically; however, the use of chemical synthesis is limited. Because of the risks associated with chemical synthesis, the low yields obtained when extracting lycopene from plants, and the unstable supply of plant material (caused by climatic and seasonal changes), microbial production of lycopene is a more economical and sustainable alternative (Shi et al., 2019; Chen et al., 2016). Moreover, successful industrial production of lycopene by microbial fermentation would both decrease the consumption of plant material for lycopene extraction and increase the market supply of lycopene (Shi et al., 2019).

#### Evaluation of market potential
As a result of its many applications, lycopene has a high market value. The global lycopene market was valued at USD 107.2 million in 2020 and is forecast to grow at a compound annual growth rate (CAGR) of 5.2% from 2021 to 2030 (Himanshu et al., 2021). This growth is mainly driven by the rising demand for natural colorants in ready-to-eat food products and for natural antioxidants, as well as by the growing use of carotenoids in the food, cosmetic and pharmaceutical industries. Furthermore, increasing research activity on the development of anti-cancer drugs is anticipated to drive the lycopene market further in the coming years (Himanshu et al., 2021).
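To make the forecast concrete, the standard compound-growth formula $V_n = V_0 (1 + r)^n$ can be applied to the figures above. The sketch below assumes 2020 as the base year and ten compounding periods up to 2030; the function name `project_market_size` is our own, and the result is an illustration of the arithmetic, not a figure taken from the market report.

```python
def project_market_size(base_value: float, cagr: float, years: int) -> float:
    """Compound `base_value` forward by `years` periods at annual rate `cagr`."""
    return base_value * (1.0 + cagr) ** years


BASE_2020_MUSD = 107.2  # from the text: market size in 2020 (USD million)
CAGR = 0.052            # from the text: 5.2 % per year
YEARS = 2030 - 2020     # assumption: growth compounds from the 2020 base year

projected_2030 = project_market_size(BASE_2020_MUSD, CAGR, YEARS)
print(f"Projected 2030 market size: USD {projected_2030:.1f} million")
```

If the report instead compounds from a 2021 base, `YEARS` would be 9 and the projection correspondingly lower.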

#### Biosynthetic pathway
The lycopene biosynthetic pathway can be divided into two parts at the metabolic node formed by isopentenyl diphosphate (IPP) and dimethylallyl diphosphate (DMAPP). IPP and DMAPP are produced via the mevalonate (MVA) pathway, which is native to *Saccharomyces cerevisiae*. The MVA pathway produces IPP from acetyl-CoA, and IPP is subsequently isomerized to generate DMAPP. IPP and DMAPP are then condensed to geranylgeranyl diphosphate (GGPP) by GGPP synthase (GGPPS/CrtE), after which phytoene synthase (CrtB) condenses two GGPP molecules to form phytoene. Finally, phytoene desaturase (CrtI) converts phytoene to lycopene (Shi et al., 2019).

![Lycopene_biosynthetic.jpg](attachment:Lycopene_biosynthetic.jpg)

Figure 1. The biosynthetic pathway of lycopene in *S. cerevisiae*. The native MVA pathway (pink), the heterologous downstream pathway for lycopene synthesis (green) and the key enzymes for lycopene production (blue font) are shown. The figure is taken from Shi et al. (2019).
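The pathway described above implies a fixed precursor demand per lycopene molecule, which is useful to keep in mind when interpreting yields. The sketch below is a back-of-the-envelope count under textbook assumptions that are not stated in the report: the MVA pathway consumes 3 acetyl-CoA, 3 ATP and 2 NADPH per IPP, DMAPP is obtained from IPP by isomerization at no extra cost, and one GGPP (C20) is built from four C5 units. Cofactors of the desaturation steps are not included.

```python
# Assumed textbook cost of the MVA pathway per C5 unit (IPP or DMAPP):
# 3 acetyl-CoA + 3 ATP + 2 NADPH -> 1 IPP + 1 CO2
MVA_COST_PER_C5 = {"acetyl_coa": 3, "atp": 3, "nadph": 2}

C5_UNITS_PER_GGPP = 4   # 1 DMAPP + 3 IPP -> GGPP (C20)
GGPP_PER_LYCOPENE = 2   # 2 GGPP -> phytoene (C40) -> lycopene (C40)


def precursor_demand(n_lycopene: float = 1.0) -> dict:
    """Return the acetyl-CoA, ATP and NADPH needed for `n_lycopene` molecules."""
    c5_units = n_lycopene * GGPP_PER_LYCOPENE * C5_UNITS_PER_GGPP
    return {name: cost * c5_units for name, cost in MVA_COST_PER_C5.items()}


demand = precursor_demand()
for name, amount in demand.items():
    print(f"{name}: {amount:g} per lycopene")
```

Under these assumptions, each lycopene draws on eight C5 units, so acetyl-CoA supply is a natural first place to look for bottlenecks.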

### 1.2 Literature review of the cell factory

The cell factory used in this report is *S. cerevisiae*, and its general advantages and disadvantages are discussed below. The suitability of this cell factory for lycopene production and suitable alternative cell factories are also discussed.

#### General advantages
*S. cerevisiae* is one of the most widely applied microorganisms in industry and has been used to produce a wide variety of biological compounds. Having been used in alcoholic fermentation and baking for centuries, it is one of the most intensively studied eukaryotic organisms. Because it is a well-known workhorse, cloning techniques are well established, enabling easy gene knock-outs and introduction of recombinant pathways. This makes it possible to engineer *S. cerevisiae* for heterologous production of high-value compounds and fine chemicals that the organism does not produce naturally.
*S. cerevisiae* is generally recognized as safe (GRAS), which often makes it the preferred chassis for industrial production (Chen et al., 2016). The yeast has a maximum specific growth rate of 0.44 h$^{-1}$ on glucose (Paalme et al., 1997). For comparison, this is almost half the growth rate of simpler prokaryotic organisms such as *E. coli*.
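A specific growth rate is easier to compare across organisms when converted to a doubling time, $t_d = \ln 2 / \mu$. The short sketch below applies this standard relation to the growth rate quoted above; the function name is our own and exponential growth is assumed.

```python
import math


def doubling_time(mu_per_hour: float) -> float:
    """Doubling time in hours for exponential growth at rate `mu_per_hour`."""
    if mu_per_hour <= 0:
        raise ValueError("specific growth rate must be positive")
    return math.log(2) / mu_per_hour


MU_MAX_YEAST = 0.44  # h^-1 on glucose, from the text (Paalme et al., 1997)

td_hours = doubling_time(MU_MAX_YEAST)
print(f"Doubling time: {td_hours:.2f} h ({td_hours * 60:.0f} min)")
```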

#### General disadvantages
Even though *S. cerevisiae* is generally a good chassis for recombinant production, some issues remain to be addressed. Unlike prokaryotic cells, eukaryotic cells are divided into compartments, so metabolites and enzymes can be physically separated. This has to be taken into consideration when optimizing recombinant pathways, as an enzyme may be located in a different organelle than its substrate. This may lead to bottlenecks and hence limit the synthesis of the end product. On the other hand, eukaryotic organisms generally tend to be better at expressing heterologous genes from other eukaryotes.

#### Suitability of the cell factory for the product
*S. cerevisiae* does not naturally produce lycopene; however, the host is generally well suited because the MVA pathway is conserved. The native pathway ends at geranylgeranyl pyrophosphate (GGPP), so a synthetic pathway extension is necessary to enable *S. cerevisiae* to produce lycopene (Shi et al., 2019). With *S. cerevisiae* as the chassis for heterologous lycopene production, only three additional enzymes are required: GGPP synthase, phytoene synthase, and phytoene desaturase (Hong et al., 2019).
*S. cerevisiae* is therefore considered a promising host for heterologous lycopene production, and previous studies have demonstrated that such production is possible. However, factors limiting high-level production include the toxicity of lycopene (Hong et al., 2019). Additionally, the currently low yield of lycopene in *S. cerevisiae* might be attributed to incompatibility between the endogenous and heterologous pathways (Shi et al., 2019).

#### What would be suitable alternative cell factories, and why is the selected one more interesting/suitable?
At present, heterologous production of lycopene has been successful in *Blakeslea trispora*, *Escherichia coli* and *S. cerevisiae*. However, as both *B. trispora* and *E. coli* release endotoxins, their industrial application is limited by food safety issues (Chen et al., 2016). Currently, the yield of lycopene obtained in *S. cerevisiae* is lower than in *E. coli*, and the downstream extraction process is more difficult. Thus, further pathway engineering is needed to achieve overproduction in *S. cerevisiae* (Chen et al., 2016).

## 2. Problem definition

Currently, lycopene production relies on extraction from plants with nonpolar solvents or on microbial fermentation, of which microbial production is the more economical and sustainable (Shi et al., 2019; Chen et al., 2016). In this report, we examine how to achieve overproduction of lycopene in *S. cerevisiae* through further pathway engineering. We will perform this assessment in silico using the genome-scale metabolic model Yeast8 for the industrially relevant strain XXX.

For *S. cerevisiae* to be a successful production platform, the fermentation process needs to deliver adequate titers, rates and yields to ensure that production is economically viable. With this in mind, our approach will focus on XXX because this will XXX.

The heterologous lycopene synthetic downstream pathway shown in Figure 1 will be added to the GSM, after which we will engineer the cell factory by using different computational methods to identify metabolic changes that will result in the (over)production of lycopene in *S. cerevisiae*.
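Before adding heterologous reactions to a GSM, it is worth checking that each one is elementally balanced, since unbalanced reactions distort yield predictions. The sketch below is a stand-alone check and does not use any modelling library. Its assumptions are ours, not the report's: neutral chemical formulas are used, GGPP synthase is written as FPP + IPP → GGPP + PPi, and FAD/FADH2 is chosen as the electron carrier for the four desaturation steps. The metabolite identifiers and electron acceptor in the actual GSM may differ.

```python
import re
from collections import Counter

# Neutral formulas (assumption: the GSM may use charged forms instead).
FORMULAS = {
    "FPP": "C15H28O7P2", "IPP": "C5H12O7P2", "GGPP": "C20H36O7P2",
    "PPi": "H4O7P2", "phytoene": "C40H64", "lycopene": "C40H56",
    "FAD": "C27H33N9O15P2", "FADH2": "C27H35N9O15P2",
}

# Negative coefficients are substrates, positive coefficients are products.
REACTIONS = {
    "CrtE": {"FPP": -1, "IPP": -1, "GGPP": 1, "PPi": 1},
    "CrtB": {"GGPP": -2, "phytoene": 1, "PPi": 2},
    # Assumed acceptor: 4 FAD take up the 8 H removed from phytoene.
    "CrtI": {"phytoene": -1, "FAD": -4, "lycopene": 1, "FADH2": 4},
}


def parse_formula(formula: str) -> Counter:
    """Count atoms in a simple formula such as 'C40H56' (no brackets)."""
    counts = Counter()
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] += int(number) if number else 1
    return counts


def imbalance(stoichiometry: dict) -> dict:
    """Return the net atom count per element; empty dict means balanced."""
    net = Counter()
    for metabolite, coeff in stoichiometry.items():
        for element, n in parse_formula(FORMULAS[metabolite]).items():
            net[element] += coeff * n
    return {element: n for element, n in net.items() if n != 0}


for name, stoich in REACTIONS.items():
    result = imbalance(stoich)
    print(name, "balanced" if not result else f"unbalanced: {result}")
```

The same stoichiometry dictionaries could then be translated into reaction objects of whichever modelling toolbox is used.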

## 3. Selection and assessment of existing GSM

*S. cerevisiae* is a widely used cell factory and is extensively used as a model organism in basic biological and medical research. Recently, the emergence of technologies such as CRISPR and single-cell omics data generation has accelerated developments in systems biology. Consistent with the strong research interest in yeast, the relevant genome-scale metabolic models (GEMs) have undergone numerous rounds of curation since the first version was published in 2003. These GEMs have contributed significantly to systems biology studies of yeast, including their use as platforms for multi-omics integration and for in silico strain design. However, the previous version, Yeast7, with only 909 genes, had fallen behind the latest genome annotation, which presented a bottleneck for the use of yeast GEMs as a scaffold for integrating omics datasets (Lu et al., 2019).


Lu et al. (2019) systematically improved the yeast GEM while moving from Yeast7 to Yeast8 through several rounds of updates (their Fig. 1c and Supplementary Fig. 2). To improve the genome coverage, additional genes from iSce926 were added. In addition, all functional gene annotations of *S. cerevisiae* from SGD, BioCyc, Reactome, KEGG and UniProt were collected and compared (their Supplementary Fig. 3) to update gene-protein-reaction relations (GPRs) and to add more GPRs. Based on Biolog experiments, i.e. evaluation of growth on a range of different carbon and nitrogen sources (their Supplementary Data 1), and on metabolomics mapping, extra reactions were added to enable the model to grow on the related substrates and to connect high-confidence metabolites to the GEM. The biomass equation was modified by adding nine trace metal ions and eight cofactors. Additionally, 37 transport reactions were added in order to eliminate 45 dead-end metabolites. To improve lipid constraints, the reactions of lipid metabolism were reformulated using the SLIMEr formalism, which Splits Lipids Into Measurable Entities. As SLIMEr imposes additional constraints on both the lipid classes and the acyl chain distribution from metabolomics data, it improved the model's performance in lipid metabolism.
In each round of model updates, standard quality-control tests, such as reaction mass balance checks and ATP yield analysis, were performed. The results (their Supplementary Figs. 2 and 4) indicate that the gene and reaction coverage of the model and its performance improved during the iterative update process, which was also shown by comparing Yeast8 to Yeast7 (their Fig. 1d-f). To facilitate multi-omics integrative analysis and visualisation, the authors established a map of yeast metabolic pathways in SBGN (Systems Biology Graphical Notation) format (their Supplementary Fig. 5) using CellDesigner.


Lu et al. (2019) developed Yeast8 aided by version control and open collaboration, which has provided a platform for continued community-driven expansion of the model. This platform can greatly accelerate iterative updates of the model, and the authors argue that this approach should become the future standard for developing GEMs for other organisms. Yeast8 is currently the most comprehensive reconstruction of yeast metabolism, and it is also a model that can be used for simulations. The platform provided through the GitHub repository enables the addition of new knowledge as it is acquired, as well as its use for further improving the model for simulations. Yeast8 is in line with the latest trend of performing model quality-control analysis in a standardised manner with memote. Integrating consistent model evaluation with community model development will be instrumental in accelerating high-quality development of GEMs.
Through the development of Yeast8, the metabolic scope of the consensus GEM of *S. cerevisiae* was significantly improved. As more evidence from experiments and bioinformatics analyses is revealed and used to update GEMs, these models will move closer to the in vivo network. Based on Yeast8, the authors developed strain-specific GEMs, enzyme-constrained GEMs (ecYeast8), etc., which together form a model ecosystem around the yeast GEM and improve cellular phenotype predictions. As an example, ecYeast8 verifies that yeast phenotypes are to a large extent determined by protein resource allocation, which is consistent with recent research. The predictions of ecYeast8 are expected to improve further as more organism-specific kinetic data become available, ideally generated in a high-throughput and systematic way. By comparing panYeast8, coreYeast8 and 1,011 strain-specific models, it can be concluded that metabolic capabilities are largely conserved across all *S. cerevisiae* strains, which is consistent with a recent study. However, the strain-specific GEM simulations revealed subtle metabolic differences among the strains in the utilisation of substrates and the maximum yield of 26 chemicals. Exploring these differences, constrained with more physiological data, can guide future metabolic engineering and help to evaluate the potential of any given strain for any desired product, as well as provide clues about the mechanisms of evolutionary adaptation. So far, only in silico simulations have been conducted using the 1,011 yeast strain-specific GEMs (ssGEMs); therefore, future experimental evidence for non-reference strains will be important to evaluate the reliability of the model predictions.
With the increasing use of adaptive laboratory evolution in metabolic engineering, it is important to have methods for rapid assessment of mutations in strains with improved phenotypes. The multi-scale model analysis of Lu et al. (2019) should be readily applicable in this area. As a further extension of Yeast8, the authors developed proYeast8DB by collecting and evaluating yeast metabolic protein 3D structures from public databases. This enabled identification of mutational hotspots associated with specific phenotypes. Furthermore, by combining the predicted targets from ecYeast8 and proYeast8DB, they demonstrated how to identify mutations in enzymes with high flux control over a given pathway, which may also be associated with desirable phenotypes. It should be noted that although proYeast8DB is useful for connecting GEMs to protein structure information such as PDB identifiers and protein parameters, it is still challenging to directly predict phenotypes using the model with protein structure variations as input. High-quality 3D protein structures at genome scale are still scarce, so breakthroughs in 3D protein structure simulation and in the related functional prediction of residues are eagerly awaited. Nevertheless, like Recon3D and the *E. coli* GEM-PRO, proYeast8DB holds value as a means to explore the relation between cell genotype and phenotype with clear evidence.
Yeast8 will continue to be developed together with its ecosystem of models. As a whole, they are expected to be a solid basis for developing a whole-cell model of a eukaryal cell, which may serve as a stepping stone to a wider use of model simulations in life sciences, thereby reducing the costs of developing biotechnology processes and of drug discovery (Lu et al., 2019).



#### Describe the existing GSM that you've selected for your project and how you assessed it:

The old model iMM904
The GSM has the identifier iMM904 (MM -> Monica Mo, the main model developer and 1st author on its publication; 904 -> ORFs accounted for in the model) and was obtained from the Systems Biology Research Group (Mo, M.L., Palsson, B.O., Herrgard, M.J., Connecting extracellular metabolomic profiles to intracellular metabolic states in yeast. BMC Systems Biology. 3:37 (2009)).




 


The GEM is from Lu et al., 2019 (https://github.com/SysBioChalmers/yeast-GEM)

An overview of the model can be seen in the table below (Lu et al., 2019)

Taxonomy | Template model | Reactions | Metabolites | Genes
------------ | ------------- | ------------- | ------------- | -------------
*Saccharomyces cerevisiae* | Yeast 7.6 | 4058 | 2742 | 1150


Various industrial workhorse strains exist of *S. cerevisiae*, and some of the most common ones are XX, used in the production of XXX

Since lycopene is a secondary metabolite, XX was chosen as the GSM

One of the existing GSMs for XX was recently updated with additional information gathered from X publications, including X reactions, X metabolites and X genes meaning that in total the model contains X genes, X metabolites, and X reactions

As the recently added data are drawn from a large number of publications, they are expected to be correct. However, as the numbers of genes, metabolites, and reactions in the model are considerably smaller than in the organism, the model predictions are not expected to be completely accurate.

#### In case there were multiple different GSMs available for your host, how did you choose?
???

Performance comparison with MEMOTE



#### Do you expect it to facilitate reliable predictions based on its publication and experimental validation?
???

#### Do you expect it to facilitate reliable predictions based on your assessment (memote, other considerations)?
???

In the event that the XX model was of significantly higher quality than YY, memote was used to compare the performance of the two available models by running `memote report diff YY XX` in the terminal, which produced the `index.html` report linked below. Memote does this by running the two stoichiometric models through a series of tests based on the current community standard.

The low score was expected due to the high complexity of fungal genomes compared to prokaryotes. Based on x genome data available from KEGG, the gene coverages of the YY model and XX models were y % and x %. However, the total reactions and metabolic coverage is high for each model. The sub-total scores of the model consistency, metabolite annotation, and reaction annotation are sufficient to perform FBA. Although the models were nearly identical, YY model was confirmed as the best choice due to a slightly higher metabolic coverage.

A more recent GSM model (z) for x organism was published. The model contains x unique genes, x unique metabolites, and x reactions.

Upton D. et al compared Z and X models and found that Z contains x genes, x metabolites and x reactions that were not present in X model. Moreover, they found disagreements in x of the reactions that the two models shared. These are lack of reversibility, different compartments, different stoichiometries, disagreements in how to balance the reactions, etc. On the other hand, X model contains x genes, x metabolites and x reactions not present in Z model. The advantages to adding this new information to the X model used in this report is discussed further in Section ?.

![Example%20Performance%20comparison%20with%20MEMOTE.png](attachment:Example%20Performance%20comparison%20with%20MEMOTE.png)

## 4. Computer-Aided Cell Factory Engineering (<1500 words if Category II project; <500 words for Category I project)


#### In this part you're going to describe how you used your GSM to compute cell factory engineering strategies for your chosen product or product category.

#### Expectations (everyone):

Add heterologous pathway to GSM (if needed)

Calculate theoretical maximum yields for chosen product for suitable carbon sources

Plot phenotypic phase planes for relevant process conditions (anaerobic vs aerobic for example, what does make more sense?)

#### Additional expectations for Category II projects:

Computationally enumerate all potential production pathways to your chosen product (even if it is a native product) and score them by different metrics (yield, number of steps added,)

Compute gene knockout strategies using algorithms like OptKnock, OptGene and OptCouple

Compute over-expression and down-regulation targets

Compute co-factor swap targets

Assess your predicted strain designs using simulations and pathway visualizations

Assess manually derived strain designs using model simulations

Simulate batch cultivations using dynamic FBA

Based on your computations, provide a top 10 list of most promising cell factory designs. The criteria for "most promising" are the number of modifications, yield, growth rate and others you might define



#### Media optimization:
In the model, the default medium contains only the components most essential for growth. Glucose is the carbon source, ammonium is the nitrogen source, and the essential ions and trace metals are also included. In this medium, all amino acids and various other central metabolites must be synthesized by the cell. Thus, much of the carbon is used for biosynthesis of central metabolites instead of lycopene synthesis. To optimize the medium, we modified it to mimic YEPD medium, which is often used for fungal growth. This was done by adding all amino acids to the medium in order to limit the requirement for biosynthesis. In addition, the glucose uptake rate was increased to 20 mmol/gDW/h to improve the growth of *S. cerevisiae*.
Changing the medium significantly increases both the biomass productivity and the lycopene productivity. Biomass productivity increases approximately 119-fold and lycopene productivity approximately 99-fold. The maximum theoretical yield of lycopene increases by a factor of about 5.
In addition to examining the impact of the amino acids on the growth of *S. cerevisiae*, the carbon source was also varied. The carbon sources tested were glucose, fructose, succinate, pyruvate and citrate. Glucose and fructose resulted in the highest biomass and lycopene productivity, but no major differences between the carbon sources were observed.

Evaluation parameters | Minimal medium | YEPD
------------- | ------------- | -------------
Maximum theoretical biomass productivity ($\mathrm{h^{-1}}$) | 0.084 | 10.238
Maximum theoretical productivity of lycopene ($\mathrm{mmol\ gDW^{-1}\ h^{-1}}$) | 0.169 | 16.710
Maximum theoretical yield of lycopene ($\mathrm{mmol_{lycopene}\ mmol_{carbon}^{-1}}$) | 0.170 | 0.8355
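The fold changes and the yield can be recomputed directly from the table as a sanity check. The sketch below uses only the rounded table values and the glucose uptake rate of 20 mmol/gDW/h stated above, so the ratios may differ slightly from figures derived from unrounded simulation output. Reading the yield as lycopene flux divided by glucose uptake rate is our interpretation of the table, not something the report states.

```python
# Values copied from the table above (rounded as reported).
minimal = {"growth": 0.084, "lycopene_flux": 0.169, "yield": 0.170}
yepd = {"growth": 10.238, "lycopene_flux": 16.710, "yield": 0.8355}

# Fold change of each quantity when moving from minimal medium to YEPD.
fold_change = {key: yepd[key] / minimal[key] for key in minimal}

# Assumed definition of yield: lycopene flux per unit of glucose uptake.
GLUCOSE_UPTAKE_YEPD = 20.0  # mmol/gDW/h, from the text
yield_yepd = yepd["lycopene_flux"] / GLUCOSE_UPTAKE_YEPD

for key, value in fold_change.items():
    print(f"{key}: {value:.1f}-fold")
print(f"YEPD yield from flux / uptake: {yield_yepd:.4f}")
```

With these inputs, the recomputed YEPD yield matches the tabulated value, which supports reading the yield as being expressed per mmol of glucose taken up.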



## 5. Discussion (<500 words)

In this section, discuss your results. Is it likely that you will be successful? What are the underlying assumptions? How could risks be minimized? What are the next steps towards implementation?

## 6. Conclusion (<200 words)

A summary of your strategy, expected outcome, and the impact of the project, should you achieve what you are trying to do.

## References

1. Chen et al. Lycopene overproduction in *Saccharomyces cerevisiae* through combining pathway engineering with host engineering. Microbial Cell Factories (2016). 15:113. DOI 10.1186/s12934-016-0509-4

2. Himanshu et al. Lycopene Market by Form, Nature and Application, Global Opportunity Analysis and Industry Forecast, 2021–2030. Allied Market Research (2021). Available at https://www.alliedmarketresearch.com/lycopene-market-A06684

3. Hong et al. Efficient production of lycopene in *Saccharomyces cerevisiae* by enzyme engineering and increasing membrane flexibility and NAPDH production. Applied Microbiology and Biotechnology (2019). 103:211–223. DOI 10.1007/s00253-018-9449-8

4. Lu et al. A consensus *S. cerevisiae* metabolic model Yeast8 and its ecosystem for comprehensively probing cellular metabolism. Nature Communications (2019). 10:3586. DOI 10.1038/s41467-019-11581-3 

5. Paalme et al. Growth efficiency of *Saccharomyces cerevisiae* on glucose/ethanol media with a smooth change in the dilution rate (A-stat). Enzyme and Microbial Technology (1997). 20:174-181. DOI 10.1016/S0141-0229(96)00114-7 

6. Shi et al. Systematic Metabolic Engineering of *Saccharomyces cerevisiae* for Lycopene Overproduction. Journal of Agricultural and Food Chemistry (2019). 67:11148–11157. DOI 10.1021/acs.jafc.9b04519