In [2]:
from IPython.display import FileLink, FileLinks

# Eicosapentaenoic Acid (EPA) production in *Yarrowia lipolytica*

## 1. Introduction

### 1.1 Literature review of the compound (<500 words)

Omega-3 fatty acids are known to benefit human health through their anti-inflammatory, antithrombotic and antiarrhythmic properties [1]. A fatty acid has a carboxyl group at one end and a methyl group at the other, with the carboxyl group being the reactive group. A fatty acid that contains a double bond in its hydrocarbon chain is unsaturated; if it contains more than one double bond, it is a polyunsaturated fatty acid (PUFA). Omega-3 fatty acids are PUFAs, named for the position of their first double bond, which is located at the third carbon counted from the methyl (omega) end [2]. Among the omega-3 fatty acids, eicosapentaenoic acid (EPA) has been shown to have beneficial properties. EPA consists of 20 carbons and 5 double bonds and is found in algae and fatty fish such as tuna and salmon [1]. The human body cannot naturally synthesize these fatty acids, so they must be obtained through the diet [2]. Studies have also shown that EPA plays a critical role in fetal retina and brain development, which is why pregnant women are advised to include omega-3 fatty acids in their diet: the developing fetus receives fatty acids only from the mother via the placenta [3]. However, pregnant women have to be careful when eating fatty fish to obtain EPA, since methylmercury (MeHg) in fish can harm the fetus [4]. Given these beneficial properties, it would be advantageous to find ways of taking in EPA other than eating fatty fish. Vegans do not eat fish and would therefore also lack EPA in their diet. Supplements in the form of fish oil pills exist, but these are produced from fish remains from aquaculture [5], so vegans still cannot consume them. A way to produce EPA in cells, without the use of fish, would therefore be beneficial. 
Figure 1 below shows the heterologous pathway that has to be inserted into the model to produce EPA. The pathway consists of the omega-6 fatty acids linoleic acid (LA), eicosadienoic acid (EDA), dihomo-γ-linolenic acid (DGLA) and arachidonic acid (ARA), which are precursors of EPA. The enzymes needed to convert these omega-6 fatty acids into EPA are Δ9-elongase (LA to EDA), Δ8-desaturase (EDA to DGLA), Δ5-desaturase (DGLA to ARA) and Δ17-desaturase (ARA to EPA) [6].

![EPA pathway](Figures/EPA_pathway.png "Figure 1: Biosynthetic pathway of EPA") 
Figure 1: Biosynthetic pathway of EPA. Red numbers, counted from the right, indicate the carbon on which the first double bond is located; blue numbers indicate the carbon number counted from the carboxyl end. Δ9-elongase, Δ8-desaturase, Δ5-desaturase and Δ17-desaturase are all needed to synthesize EPA from LA. Figure adapted from Xue et al. [6].

### 1.2 Literature review of the cell factory (<500 words)
The cell factory chosen for this project is the ascomycete yeast *Yarrowia lipolytica*, a certified GRAS (generally recognized as safe) organism [7]. It is a versatile organism capable of utilizing a wide array of carbon substrates [8], and it has a diverse product portfolio ranging from weak acids and proteins to biofuels and lipids. In general, the products of interest derive from the yeast's ability to secrete enzymes and other proteins and to accumulate high amounts of lipid [9]. 

For the last 50 years, extensive work has been done to genetically engineer *Y. lipolytica*, and several engineering tools have been established, such as CRISPR-Cas9 genome editing, DNA assembly methods and replicative vectors. Among the available tools are three in silico models, which are discussed later [10]. Various strains are now being developed to support these engineering strategies, leading to specialized cell factories.

The yeast also has drawbacks, however, and several challenges remain for *Y. lipolytica* as a cell factory at industrial scale. The most immediate is the growth rate, which is generally low for oleaginous yeasts and results in lower yields of the desired products. The extensive research into *Y. lipolytica* can nevertheless help improve growth rates and production yields for a specific product [11]. 

For this project *Y. lipolytica* was chosen to produce the omega-3 fatty acids, EPA and DHA. As mentioned, the yeast at its core is an outstanding producer of lipids and already has a strong base for genetic engineering. 
Another cell factory that was considered is the oleaginous yeast *Trichosporon oleaginosus*, which can utilize aromatic feedstocks as substrate while accumulating up to 70% of its biomass as lipids [12]. Although this is highly interesting for the project, its industrial applications and genetic engineering strategies are not at the same level as those of *Y. lipolytica*. 

Another contender is probably the most well-researched yeast, *Saccharomyces cerevisiae*. This yeast is well known as a reliable cell factory with robust genetic tools and high industrial relevance. Several papers have been written on fatty acid accumulation and production in this yeast [12]. For this project, however, *Y. lipolytica* is of higher interest, as its naturally high lipid accumulation paves the way for a theoretically higher yield than *S. cerevisiae* would provide, even though *S. cerevisiae* might be easier to engineer in practice.  

## 2. Problem definition (<300 words)

The main sources of omega-3 fatty acids, and of EPA specifically, in a normal diet are fatty seafood products [13]. They can also be obtained through dietary supplements, which are extracted either directly from algae or from fish [14]. One disadvantage of extracting the fatty acids from fish is that it limits who can use these supplements, as vegans and vegetarians will not use them. Another disadvantage is that it is tedious both to farm the fish and to extract the oils from them [15]. Algae bioreactors, in turn, are more challenging than other types of bioreactors and require more space for the same volume. This is because the algae need sunlight for photosynthesis, and sunlight cannot penetrate very deep into the reactor, which forces a reactor design with a large surface area that is not very efficient in terms of volume [16].

To solve these problems, the omega-3 fatty acids could be produced by heterologous expression in *Y. lipolytica*. Using a yeast such as *Y. lipolytica* circumvents the disadvantages of production in both fish and algae. It is easier and less strenuous on the environment than production in fish, and it carries none of the dietary restrictions mentioned above. It would also be more space efficient than production in algae.

## 3. Selection and assessment of existing GSM 

As mentioned in section 1.2, several genome-scale models (GSMs) have been constructed for the oleaginous yeast *Yarrowia lipolytica*. Through an extensive literature review, three GSMs were identified. The first organism-specific model (iNL895) was reported by Loira *et al.* [17] in 2012 and was constructed with a newly developed automatic reconstruction method. Version 4.36 of the *Saccharomyces cerevisiae* consensus model [18] was used as the scaffold, together with two further *S. cerevisiae* models (iMM904 [19], iIN800 [20]) for specific parts of metabolism such as fatty acid metabolism. After manual curation, the model was validated with growth experiments on different media compositions. In 2015 another model (iMK735) was reconstructed by Kavšček *et al.* [21]. For this model, the *S. cerevisiae* model iND750 [22] was used as the scaffold. Additional biomass equations were added based on biomass composition experiments with different lipid concentrations. For validation, FBA was used to determine whether the model predicted growth on different carbon sources, and the predictions were compared with the literature. The third model (iYali4), constructed by Kerkhoven *et al.* [23], was reconstructed from the Yeast 7.11 consensus network model [24], with curation based on iNL895 [17] and the literature. No validation method was described for this model.

#### Performance comparison with MEMOTE

Having obtained these three models, we used the MEMOTE test suite [25] to score them. The resulting HTML reports are attached at the end of the chapter, and some of the metrics are tabulated below.


**Memote Scores for different *Y. lipolytica* models from literature**

| **Model** | **Total score (%)** | **Total reactions** | **Total metabolites** | **Total genes** | **Metabolic Coverage** | **Consistency Score (%)** | **Metabolites annotation score (%)** | **Reactions annotation score (%)** |
|:-:|:-:|:-:|:-:|:-:|:-:|:-:|:-:|:-:|
| iNL895 | 30 | 2262 | 1847 | 899 | 2.52 | 29 | 25 | 50 |
| iMK735 | 19 | 1464 | 1239 | 735 | 1.99 | 28 | 25 | 27 |
| iYali4 | 19 | 1985 | 1683 | 901 | 2.20 | 39 | 25 | 25 |
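
The metabolic coverage column in the table above appears to be the ratio of total reactions to total genes, which is how MEMOTE defines that metric. A minimal sketch that recomputes it from the tabulated counts (the dictionary and function names are illustrative, not taken from the notebooks):

```python
# Reaction and gene counts copied from the MEMOTE table above.
MODELS = {
    "iNL895": {"reactions": 2262, "genes": 899},
    "iMK735": {"reactions": 1464, "genes": 735},
    "iYali4": {"reactions": 1985, "genes": 901},
}


def metabolic_coverage(reactions: int, genes: int) -> float:
    """Return reactions per gene, the ratio reported as metabolic coverage."""
    if genes <= 0:
        raise ValueError("gene count must be positive")
    return reactions / genes


coverage = {name: round(metabolic_coverage(**c), 2) for name, c in MODELS.items()}
for name, value in coverage.items():
    print(f"{name}: {value:.2f}")
```

A value above 1 means the model contains more reactions than genes, which is the case for all three models.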


*Y. lipolytica* is a eukaryote and therefore has a more complex genome than prokaryotes. This may partly explain the low overall score of the three models, even though all three contain a significant number of metabolites and reactions. The consistency score, however, is worryingly low, so a more detailed breakdown of this score is shown in the following table.


**Detailed consistency score for different *Y. lipolytica* models from literature**

| **Model** | **Overall consistency score (%)** | **Stoichiometric consistency (%)** | **Mass balance (%)** | **Charge balance (%)** | **Metabolite connectivity (%)** | **Unbounded flux in default medium (%)** |
|:-:|:-:|:-:|:-:|:-:|:-:|:-:|
| iNL895 | 29 | 0 | 0 | 100 | 100 | **Errored** |
| iMK735 | 28 | 0 | 0 | 95.9 | 99.4 | **Errored** |
| iYali4 | 39 | 0 | 0 | 100 | 97.9 | 77 |


For all three models, stoichiometric consistency and mass balance both scored 0%. Since these two metrics are the most important ones for accurate predictions with FBA and dFBA, we investigated further. It turned out that none of the models provided chemical formulas for their metabolites, which made all three unusable for our purposes. We therefore tried to improve the models by adding a chemical formula to each metabolite. For this purpose we wrote a Python script that adds the formula automatically based on the metabolite name; this, however, was only possible for the iMK735 model. The other two models would have needed manual addition of the formulas, which was deemed infeasible given the number of metabolites in each model. For iMK735, the automatic addition worked, and the MEMOTE scores of the improved model are listed in the following table.


**Detailed consistency score for iMK735 model before and after improvement**

| **Model** | **Overall consistency score (%)** | **Stoichiometric consistency (%)** | **Mass balance (%)** | **Charge balance (%)** | **Metabolite connectivity (%)** | **Unbounded flux in default medium (%)** |
|:-:|:-:|:-:|:-:|:-:|:-:|:-:|
| iMK735  | 28 | 0 | 0 | 95.9 | 99.4 | **Errored** |
| iMK735 improved | 41 | 0 | 92.9 | 95.9 | 99.4 | **Errored** |


These scores demonstrate that the chemical formulas were added to iMK735 successfully, since the mass balance rose to 92.9% from 0.0% before. Although stoichiometric consistency still could not be achieved, the model was deemed good enough to proceed to the next steps. There, however, we were unable to obtain FBA results: all simulations returned 'Insufficient solution'. We were not able to resolve this issue even after considerable time and effort.
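
The name-based formula annotation mentioned above is not shown in this report. As a rough illustration only, such a step could be sketched as a lookup from normalized metabolite names to formulas, with unmatched names collected for manual curation. The lookup table, the compartment-tag convention and all function names here are hypothetical:

```python
import re

# Hypothetical name-to-formula lookup; a real table would come from a database export.
FORMULA_BY_NAME = {
    "d-glucose": "C6H12O6",
    "pyruvate": "C3H3O3",
    "h2o": "H2O",
}


def normalize(name: str) -> str:
    """Lower-case a name and strip a trailing compartment tag such as ' [cytosol]'."""
    return re.sub(r"\s*\[[^\]]*\]\s*$", "", name).strip().lower()


def annotate(names):
    """Split metabolite names into those with a known formula and those without."""
    matched, unmatched = {}, []
    for name in names:
        formula = FORMULA_BY_NAME.get(normalize(name))
        if formula is None:
            unmatched.append(name)
        else:
            matched[name] = formula
    return matched, unmatched


matched, unmatched = annotate(["D-Glucose [cytosol]", "Pyruvate", "unknown lipid X"])
print(matched)
print(unmatched)
```

Reporting the unmatched names explicitly makes it visible how many metabolites would still need manual formulas, which is the kind of gap that keeps a mass-balance score below 100%.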

We then looked at the scaffold models of the three *Y. lipolytica* models and identified iND750 [22], the scaffold of iMK735, as a promising candidate, since good coverage of lipid metabolism was reported for it. MEMOTE scores were also obtained for this model, and the results are listed in the following two tables.


**Memote Scores for iND750 model**

| **Total score (%)** | **Total reactions** | **Total metabolites** | **Total genes** | **Metabolic Coverage** | **Consistency Score (%)** | **Metabolites annotation score (%)** | **Reactions annotation score (%)** |
|:-:|:-:|:-:|:-:|:-:|:-:|:-:|:-:|
| 86 | 1266 | 1059 | 750 | 1.69 | 97 | 83 | 80 |

**Detailed consistency score for iND750 model**

| **Overall consistency score (%)** | **Stoichiometric consistency (%)** | **Mass balance (%)** | **Charge balance (%)** | **Metabolite connectivity (%)** | **Unbounded flux in default medium (%)** |
|:-:|:-:|:-:|:-:|:-:|:-:|
| 97 | 100 | 97.3 | 100 | 100 | 83.2 |


Based on the very good MEMOTE scores for iND750, with reasonable metabolic coverage and nearly perfect consistency, we decided to switch to this model for the remaining tasks in this project. Since the model was developed for *S. cerevisiae* and not for *Y. lipolytica*, it would need to be adapted. This could be done with the help of the iMK735 model, by identifying the differences between the two models and manually curating iND750 to mirror them. Given the significant amount of time already invested, the resulting time constraints and the time still needed for the next steps of the project, this step was omitted. We therefore used the iND750 model as is for heterologous pathway insertion and optimization. The results have to be interpreted with caution, since the model corresponds to *S. cerevisiae* rather than *Y. lipolytica*.

In [5]:
FileLinks('Memote_reports/')

## 4. Computer-Aided Cell Factory Engineering (<1500 words)

## Heterologous insertion of pathway leading to Eicosapentaenoic Acid (EPA)

Inserting the pathway involved annotating the metabolites and enzymes of the reactions mentioned in section 1.1. Each enzyme was simply annotated as its reaction; the metabolites and the corresponding reactions are listed in the table below.

**Heterologous Insert**

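| **Product metabolite** | **Enzyme (reaction)** | **Stoichiometry** |
| ---- | ---- | ---- |
|Linoleate (C18H31O2)|$\Delta$12-desaturase|$\text{Oleate} + 2\,\text{NADPH} + 2\,\text{O}_2 \rightarrow \text{Linoleate} + 2\,\text{NADP}^+ + 2\,\text{H}_2\text{O}$|
|Eicosadienoic acid (EDA) (C20H35O2)|$\Delta$9-elongase|$\text{Malonyl-CoA} + \text{Linoleate} + 2\,\text{NADPH} + 2\,\text{H}^+ \rightarrow \text{EDA} + \text{CoA} + \text{H}_2\text{O} + \text{CO}_2 + 2\,\text{NADP}^+$|
|Dihomo-$\gamma$-linolenic acid (DGLA) (C20H33O2)|$\Delta$8-desaturase|$\text{EDA} + 2\,\text{O}_2 + \text{NADPH} \rightarrow \text{DGLA} + \text{H}_2\text{O} + 2\,\text{NADP}^+$|
|Arachidonic acid (ARA) (C20H31O2)|$\Delta$5-desaturase|$\text{DGLA} + 2\,\text{O}_2 + \text{NADPH} \rightarrow \text{ARA} + \text{H}_2\text{O} + 2\,\text{NADP}^+$|
|Eicosapentaenoic acid (EPA) (C20H29O2)|$\Delta$17-desaturase|$\text{ARA} + 2\,\text{O}_2 + \text{NADPH} \rightarrow \text{EPA} + \text{H}_2\text{O} + 2\,\text{NADP}^+$|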


The pathway was successfully inserted, as can be seen from the maximum yield analysis and from the selection and optimization of media.
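
Because mass balance was the deciding criterion in the model selection, the inserted reactions can be checked the same way. The following is a minimal sketch of an element balance for the Δ12-desaturase row of the table above. The fatty acid formula for linoleate comes from the table; the formulas for oleate and the cofactors are assumptions based on common BiGG-style conventions, and the "commonly cited" stoichiometry is included only for comparison. Charges are not checked.

```python
import re
from collections import Counter


def parse_formula(formula: str) -> Counter:
    """Count atoms in a simple formula such as 'C18H31O2' (no brackets or charges)."""
    counts = Counter()
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] += int(number) if number else 1
    return counts


def imbalance(stoichiometry: dict, formulas: dict) -> dict:
    """Return net atoms per element (products minus reactants); empty means balanced."""
    net = Counter()
    for metabolite, coefficient in stoichiometry.items():
        for element, count in parse_formula(formulas[metabolite]).items():
            net[element] += coefficient * count
    return {element: n for element, n in net.items() if n != 0}


# Linoleate formula is from the table; oleate and cofactor formulas are assumed.
FORMULAS = {
    "oleate": "C18H33O2", "linoleate": "C18H31O2",
    "nadph": "C21H26N7O17P3", "nadp": "C21H25N7O17P3",
    "o2": "O2", "h2o": "H2O", "h": "H",
}

# Delta-12 desaturase with the coefficients as tabulated (negative = consumed).
as_tabulated = {"oleate": -1, "nadph": -2, "o2": -2,
                "linoleate": 1, "nadp": 2, "h2o": 2}
# A commonly cited desaturase stoichiometry, shown only for comparison.
commonly_cited = {"oleate": -1, "nadph": -1, "h": -1, "o2": -1,
                  "linoleate": 1, "nadp": 1, "h2o": 2}

print(imbalance(as_tabulated, FORMULAS))
print(imbalance(commonly_cited, FORMULAS))
```

Under these assumed formulas the tabulated coefficients leave a nonzero oxygen term while the comparison form balances, so the stoichiometry entered in the notebook may be worth re-checking against the table.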

In [2]:
FileLink('Analysis/Scaffold_Y_lipo.ipynb')

## Selection and optimization of media

When selecting a medium for the model, the first step is to check the default medium included in the model. In the iND750 model, the default medium has very high concentrations of all substrates and is not something that would be seen in practice. A new medium therefore had to be constructed and, since this is a yeast model, Yeast Extract Peptone Dextrose (YEPD or YPD) medium was chosen. It should be noted that the model lacks both zinc and iron, which would otherwise have been added with the medium.

With the media constructed we can now find the maximum biomass and EPA productivities, showcased in the table below:


| **Medium** | **Biomass (1/h)** | **EPA (mmol/(gDW⋅h))** |
| ---- | ---- | - |
|Default|0.097|0.159|
|YPD + Glc|6.385|9.797|
|YPD + Gly|5.169|8.062|
|YPD + Suc|4.492|7.214|
|YPD + Xyl|5.941|9.021|

To better visualize these results, they have also been plotted:

<p align="center">
  <img src="Figures/Biomass_and_EPA_productivities.png" alt="Carbon source significance">
  <br>
  <em>Figure 2. Significance of carbon source in media</em>
</p>

As is evident from the figure above, the yeast performs best on glucose, closely followed by xylose. The other tested carbon sources also perform relatively well, meaning that production costs could be reduced (at the expense of yield), or that an alternative carbon source could be used when better ones are unavailable. The alternative carbon sources are also beneficial because they have fewer competing uses in other industries, such as the food industry.
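
To put numbers on the comparison, each medium can be expressed relative to glucose. The sketch below uses only the values from the media table above; the function and variable names are illustrative.

```python
# Maximum rates copied from the media table above: (biomass in 1/h, EPA flux).
RATES = {
    "YPD + Glc": (6.385, 9.797),
    "YPD + Gly": (5.169, 8.062),
    "YPD + Suc": (4.492, 7.214),
    "YPD + Xyl": (5.941, 9.021),
}


def relative_to(reference: str, rates: dict) -> dict:
    """Express each medium's biomass and EPA rate as a fraction of the reference."""
    ref_biomass, ref_epa = rates[reference]
    return {
        medium: {"biomass": biomass / ref_biomass, "epa": epa / ref_epa}
        for medium, (biomass, epa) in rates.items()
    }


relative = relative_to("YPD + Glc", RATES)
for medium, r in relative.items():
    print(f"{medium}: biomass {r['biomass']:.0%}, EPA {r['epa']:.0%}")
```

On these numbers, xylose reaches roughly 93% of the glucose growth rate and 92% of the glucose EPA flux, while sucrose reaches roughly 70% and 74%.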

#### Glucose optimization

With the media chosen, we can find the optimal concentration of glucose, by several iterations of changing glucose concentrations, and plotting them:

<p align="center">
  <img src="Figures/Max_Glucose_concentration.png" alt="Sugar concentration significance">
  <br>
  <em>Figure 3. Significance of carbon source concentration in media</em>
</p>

As can be seen in the figure above, growth stops increasing at approximately 300 mmol/L glucose, and EPA production stops increasing at approximately 400 mmol/L glucose. The conversion from mmol/L to g/L is as follows:

$$
m = n \cdot M
$$
$$
n_{\text{growth}} = 0.3\ \text{mol/L}
$$
$$
n_{\text{EPA}} = 0.4\ \text{mol/L}
$$
$$
M = 180.156\ \text{g/mol}
$$
$$
m_{\text{growth}} = 0.3\ \text{mol/L} \cdot 180.156\ \text{g/mol} \approx 54\ \text{g/L}
$$
$$
m_{\text{EPA}} = 0.4\ \text{mol/L} \cdot 180.156\ \text{g/mol} \approx 72\ \text{g/L}
$$
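
The same steps can be sketched in code: find the concentration at which a scanned rate stops increasing, then convert it to g/L. The scan values below are hypothetical and only shaped like the glucose curve described above; they are not the notebook's data, and both function names are illustrative.

```python
GLUCOSE_MOLAR_MASS = 180.156  # g/mol, as used in the conversion above


def mmol_per_l_to_g_per_l(conc_mmol_per_l: float,
                          molar_mass: float = GLUCOSE_MOLAR_MASS) -> float:
    """Convert a molar concentration in mmol/L to a mass concentration in g/L."""
    return conc_mmol_per_l / 1000.0 * molar_mass


def plateau_onset(concentrations, rates, tolerance=1e-6):
    """Return the first concentration after which the rate no longer increases."""
    for i in range(len(rates) - 1):
        if all(later - rates[i] <= tolerance for later in rates[i + 1:]):
            return concentrations[i]
    return concentrations[-1]


# Hypothetical scan (mmol/L vs. growth rate); not the notebook's actual numbers.
scan_conc = [100, 200, 300, 400, 500]
scan_growth = [2.0, 4.0, 6.0, 6.0, 6.0]

onset = plateau_onset(scan_conc, scan_growth)
print(onset, round(mmol_per_l_to_g_per_l(onset), 1))
print(round(mmol_per_l_to_g_per_l(400), 1))
```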

According to an article about growth and fermentation of *S. cerevisiae* [26], the optimal growth conditions are achieved at a glucose concentration of 200 g/L. This is significantly higher than the results shown above, but their study focused on another product (ethanol) which could be the reason, or it might be due to other limiting factors in the media.

In [3]:
FileLink('Analysis/Medium_Opt_New_Model.ipynb')

## Co-factor Swap analysis

A co-factor swap analysis was performed to assess where in the cell NADPH could be exchanged for NADH, so as to increase the NADPH available to the heterologous pathway. The analysis targeted the EPA formation reaction and showed that, apart from the inserted reactions, no reactions using NADPH were viable for swapping. Optimizing product yield by cofactor swapping might therefore only be possible if one or more of the inserted reactions could be made to use NADH. An ideal candidate would be an inserted reaction that is slower than the others, for which a swap to NADH might benefit cell productivity.

In [3]:
FileLink('Analysis/Cofactor_Swap_withtext.ipynb')

## Phenotypic phase plane 

A Phenotypic Phase Plane (PPP) analysis was made to assess which substrate the cells grew best on and how production of the desired product (EPA) behaved on each substrate at different substrate levels. The medium composition analysis showed that the model grew best on glucose but could also grow on xylose and glycerol. EPA production was also highest on glucose. First, a PPP was made for glycerol, which showed that the cell is able to grow on glycerol and also produce EPA. Next, a PPP was made for oxygen, from which it can be concluded that the cell is able to grow and produce EPA with oxygen present, although at lower rates than in the glycerol PPP. 
Finally, PPPs were made for xylose and glucose, which showed that the cell could grow and produce EPA on both, at higher rates than in the glycerol and oxygen PPPs. Glucose gave the highest biomass and EPA production, in line with the conclusion from the medium composition analysis. 
All plots can be seen in figure 4.

![Glycerol PPP](Figures/Biomass_and_EPA_on_glycerol.png "Figure 4: Plots of biomass and EPA production with glycerol, oxygen, xylose and glucose")
![Oxygen PPP](Figures/Biomass_and_EPA_on_oxygen.png "Figure 4: Plots of biomass and EPA production with glycerol, oxygen, xylose and glucose")
![Xylose PPP](Figures/Biomass_and_EPA_on_xylose.png "Figure 4: Plots of biomass and EPA production with glycerol, oxygen, xylose and glucose")
![Glucose PPP](Figures/Biomass_and_EPA_on_glucose.png "Figure 4: Plots of biomass and EPA production with glycerol, oxygen, xylose and glucose")

Figure 4: Plots of biomass and EPA production with glycerol, oxygen, xylose and glucose

We also wanted to analyze how the cell grew without oxygen in the medium. The same method was used as above, with a changed medium composition. The analysis showed that the cell cannot grow or produce EPA on glycerol when no oxygen is present. It could, however, grow on xylose and glucose without oxygen, but at much lower rates and with little to no EPA production. The plots for this analysis can be found in the linked analysis notebook. 

In [4]:
FileLink('Analysis/PPP_analysis.ipynb')

## Prediction of Genetic Targets

In order to find genes in the model that can be up- or down-regulated *in silico*, an analysis called Flux Variability Scanning based on Enforced Objective Flux (FVSEOF) [27] was utilized. This analysis pushes a flux towards a selected objective, in order to observe a change in flux of reactions that are linked to the objective reaction. In this way, targets for both up- and down-regulation can be found, identified at different levels of forced flux.

For this project, the objective is of course EPA and for this analysis a change of flux in 111 reactions was found. The level of change in flux can be seen in figure 5 below.

![image.png](attachment:f1501e90-d705-47cc-a602-627f6290026b.png)
Figure 5: Flux scanning based on enforced objective flux. Reactions that could be downregulated are on the left, and reactions that could be upregulated are on the right. 

It is quickly apparent that some genetic targets are of high value because of their large increase in flux, while others are candidates for downregulation because their flux decreases. However, figure 5 only gives an overview of how many reactions change their flux, whereas we are interested in those that could contribute most to EPA production. Targets of high interest were therefore defined as the reactions whose flux changes by more than 95% when the flux towards EPA is increased up to tenfold, as shown in figure 6.

![image.png](attachment:054e2e72-576f-4464-994f-d70242a1471b.png)
Figure 6: Reactions with a relative change in flux > 95%. 

This plot may seem somewhat confusing, but a general trend can be seen from this, where not many targets for downregulation are seen at this level of flux change. However, lots of targets are seen for **up-regulation** with the most significant ones being:

**Phosphogluconate dehydrogenase (GND) and 6-phosphogluconolactonase (PGL)**: Both are enzymes of the pentose phosphate pathway, which is why their flux changes are highly similar. 

**L-cysteine reversible transport via proton symport (CYSt2r)**: A reversible transport system of L cysteine. 

**Methionine Synthase (METS_1)**: Responsible for the regeneration of methionine from homocysteine, a part of the biosynthesis cycle. 

**5, 10 methylenetetrahydrofolate reductase (MTHFR3_1)**: Is the main actor in the conversion of 5,10-methylenetetrahydrofolate to 5-methyltetrahydrofolate, which in turn plays a part in the regeneration of methionine from homocysteine.

**Glucokinase (GLU)**: Converts glucose to glucose-6-phosphate by phosphorylation and is a major player in the control of carbohydrate metabolism.

For **down-regulation**, we still see some potential targets, though at high EPA flux some of them seem to return to a relative flux of 0:

**Tyrosine mitochondrial transport via proton symport (TYRt2m)**

**Phosphate transport via hydroxide ion symport, mitochondrial**

**Dicarboxylate transport (DICm) mitochondrial**

What these have in common is that they're highly important reactions for the overall metabolism of the cell, and it would be unwise to down-regulate these targets. 
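
The 95% filter described above can be read in more than one way. One plausible reading, sketched below, compares each reaction's flux at the lowest and highest enforced EPA level and keeps reactions whose relative change exceeds the threshold. The flux values are hypothetical, the reaction IDs are reused from the text purely for illustration, and `UNCHANGED_RXN` is a made-up placeholder.

```python
def relative_change(fluxes):
    """Relative flux change between the first and last enforced level."""
    first, last = fluxes[0], fluxes[-1]
    if first == 0:
        return None  # undefined when the reaction starts with zero flux
    return (last - first) / abs(first)


def select_targets(scan, threshold=0.95):
    """Split reactions into up- and down-regulation candidates by relative change."""
    up, down = [], []
    for reaction, fluxes in scan.items():
        change = relative_change(fluxes)
        if change is None:
            continue
        if change > threshold:
            up.append(reaction)
        elif change < -threshold:
            down.append(reaction)
    return sorted(up), sorted(down)


# Hypothetical fluxes at increasing enforced EPA levels; not the notebook's data.
scan = {
    "GND": [1.0, 1.8, 2.6],
    "METS_1": [0.2, 0.3, 0.5],
    "TYRt2m": [0.4, 0.2, 0.01],
    "UNCHANGED_RXN": [5.0, 5.1, 5.2],
}
up, down = select_targets(scan)
print("up:", up)
print("down:", down)
```

A filter of this kind ignores reactions that start at zero flux and those that peak at an intermediate enforced level, so lowering the threshold or comparing every level would surface additional candidates.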

In [4]:
FileLink('Analysis/Flux_Based_Analysis.ipynb')

## Dynamic Flux Balance Analysis

A dynamic flux balance analysis (dFBA) was performed to assess how different fluxes, such as those for EPA and biomass production, change over time in relation to glucose. This was done to simulate batch conditions for our cell using the optimal medium composition. For biomass, the results were as expected: a lag phase is followed by an exponential phase and finally a growth plateau, all of which track the downward curve of glucose utilization. This is shown in *figure 7*.

![image.png](attachment:f0b8ed44-ffdd-490b-a44b-923fdab7eae1.png)

However, as *figure 7* shows, the biomass values contradict what was expected: a scale between 0 and 1 was expected, whereas the y-axis here runs from 0 to 35. It is not entirely clear what unit the scale represents, so it is uncertain whether the values can be considered meaningful. *Figure 8*, the graph of EPA production versus glucose utilization, shows much the same. The curve has the expected shape, with EPA following biomass production and glucose utilization, but the values cannot be considered meaningful. 

![image.png](attachment:9964f310-aedc-44e8-a43e-d892e60c7865.png)
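
One possible reading of the axis scale, offered as an assumption rather than a finding of this report, is that a dFBA integrates biomass as a concentration (for example gDW/L) rather than reporting a growth rate, in which case values above 1 are not unusual. The sketch below illustrates the batch integration loop only. The FBA solve at each step is replaced by a Monod uptake term with a fixed biomass yield, which is a deliberate simplification and not the method used in the notebook, and all parameter values are hypothetical.

```python
def simulate_batch(x0, s0, q_max, k_s, yield_x_per_s, dt, steps):
    """Euler integration of a batch culture.

    x: biomass (gDW/L), s: glucose (mmol/L), q_max: max uptake (mmol/(gDW*h)),
    k_s: half-saturation constant (mmol/L), yield_x_per_s: gDW per mmol glucose.
    """
    x, s = x0, s0
    trajectory = [(0.0, x, s)]
    for step in range(1, steps + 1):
        # Stand-in for the FBA solve: uptake rate from a Monod term.
        uptake = q_max * s / (k_s + s) if s > 0 else 0.0
        # Never consume more glucose in one step than is left in the medium.
        uptake = min(uptake, s / (x * dt))
        growth_rate = yield_x_per_s * uptake  # 1/h
        new_x = x + growth_rate * x * dt
        new_s = max(s - uptake * x * dt, 0.0)
        x, s = new_x, new_s
        trajectory.append((step * dt, x, s))
    return trajectory


# Hypothetical parameters: 0.1 gDW/L inoculum, 300 mmol/L glucose, 20 h batch.
trajectory = simulate_batch(x0=0.1, s0=300.0, q_max=10.0, k_s=0.5,
                            yield_x_per_s=0.09, dt=0.1, steps=200)
final_time, final_biomass, final_glucose = trajectory[-1]
print(f"t={final_time:.1f} h, biomass={final_biomass:.1f} gDW/L, "
      f"glucose={final_glucose:.1f} mmol/L")
```

With these made-up parameters the biomass concentration ends well above 1 once the glucose is exhausted, which shows that a plateau value in the tens is compatible with a concentration axis.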



## Genetic Manipulation

A knockout analysis could be performed to assess which genes could be removed from the cell to optimize EPA production. However, due to severe problems with OptKnock and OptGene, this analysis could not be carried out. OptGene appeared to be missing a module, while OptKnock ran for days without finishing. 

In [3]:
FileLink('Analysis/Genetic_manipulation.ipynb')

## 5. Discussion (<500 words)

When choosing the model for *Y. lipolytica*, we encountered several problems, including poor consistency and mass balance scores of zero in the MEMOTE reports. This led us to choose another model, iND750, which is the scaffold model used for the construction of one of the *Y. lipolytica* models. The iND750 model is a model of *S. cerevisiae*. In this work, to mimic *Y. lipolytica*, a pathway native to *Y. lipolytica* was added to the scaffold model, which was then treated as a model of *Y. lipolytica*. This is not optimal, but in theory the chosen organism would be better at producing EPA than *S. cerevisiae*, as it produces high amounts of fatty acids.

Another point to consider is that the inserted heterologous pathway is fairly straightforward: it leads directly from the starting point to the end product through a few intermediates. The model does not account for whether any of these intermediates are consumed by other pathways, which in the real world would have to be knocked out to increase yield.

The default medium in the model is not representative of a real-world scenario and was adjusted to more reasonable levels in the constructed medium. This could result in either higher or lower limits, or there could be other limiting factors that were not considered in the constructed medium. For the flux analysis, only reactions whose flux changed by more than 95% were considered for regulation. However, reactions that changed less, or that could have a higher impact at different enforced objective flux levels, could also have been considered. Judging from *figure 5*, there seem to be at least some reactions worth considering in this way, among which down-regulation targets may be found.

Lastly, the dFBA simulation of a batch process could have been set up with a more suitable composition, for example by considering changes in oxygen level or xylose utilization. This could have given a more in-depth understanding of how our model organism utilizes the medium. 

## 6. Conclusion (<200 words)

In theory, *Yarrowia lipolytica* is a prime candidate for producing omega-3 fatty acids such as EPA because of its lipid-producing nature. However, a properly functioning GSM of the organism still needs to be built before accurate simulations and cell factory designs can be made. Several GSMs were analyzed to find one suitable for our organism. In the end, the iND750 model was used in this project to simulate EPA production. iND750 was developed for *S. cerevisiae* rather than *Y. lipolytica*, so additional modelling would have been necessary to turn it into a true *Y. lipolytica* GSM. This was not possible due to time constraints, so a heterologous pathway was inserted into the iND750 model and optimized to analyze its potential to produce EPA. The results showed that the model accommodated the added heterologous pathway and was able to produce EPA on different media and at different fluxes, with the prospect of further optimization via up-regulation.

## References

[1] Covington, Maggie B. "Omega-3 fatty acids." American family physician 70.1 (2004): 133-140. 

[2] Calder, Philip C., and Parveen Yaqoob. "Understanding omega-3 polyunsaturated fatty acids." Postgraduate medicine 121.6 (2009): 148-157. 

[3] Shrestha, Nirajan, et al. "Role Of Omega‐6 and Omega‐3 fatty acids in fetal programming." Clinical and Experimental Pharmacology and Physiology 47.5 (2020): 907-915. 

[4] Jinadasa, B. K. K. K., et al. "Mitigating the impact of mercury contaminants in fish and other seafood—A review." Marine Pollution Bulletin 171 (2021): 112710.

[5] https://maring.org/a-sustainable-industry/fish-meal-and-fish-oil-production/ 

[6] Xue, Zhixiong, et al. "Production of omega-3 eicosapentaenoic acid by metabolic engineering of Yarrowia lipolytica." Nature biotechnology 31.8 (2013): 734-740. 

[7] Zhu, Quinn, et al. "Metabolic engineering of an oleaginous yeast for the production of omega-3 fatty acids." Single cell oils. AOCS press, 2010. 51-73.

[8] https://backend.orbit.dtu.dk/ws/files/124169843/RRB_poster_Patrice.pdf

[9] Madzak, Catherine. "Engineering Yarrowia lipolytica for use in biotechnological applications: a review of major achievements and recent innovations." Molecular Biotechnology 60.8 (2018): 621-635.

[10] Zhang, Tang-Lei, Hong-Wei Yu, and Li-Dan Ye. "Metabolic engineering of yarrowia lipolytica for terpenoid production: tools and strategies." ACS Synthetic Biology 12.3 (2023): 639-656.

[11] Gonçalves, F. A. G., G. Colen, and J. A. Takahashi. "Yarrowia lipolytica and its multiple applications in the biotechnological industry." The Scientific World Journal 2014 (2014).

[12] Yaguchi, Allison, et al. "Metabolism of aromatics by Trichosporon oleaginosus while remaining oleaginous." Microbial Cell Factories 16.1 (2017): 1-12.

[13] Tur, J. A., et al. "Dietary sources of omega 3 fatty acids: public health risks and benefits." British Journal of Nutrition 107.S2 (2012): S23-S52. https://doi.org/10.1017/S0007114512001456  
  
[14] Adarme-Vega, T. Catalina, et al. "Microalgal biofactories: a promising approach towards sustainable omega-3 fatty acid production." Microbial cell factories 11.1 (2012): 1-10. https://doi.org/10.1186/1475-2859-11-96 

[15] Singh, R. N., and Shaishav Sharma. "Development of suitable photobioreactor for algae production–A review." Renewable and Sustainable Energy Reviews 16.4 (2012): 2347-2353. https://doi.org/10.1016/j.rser.2012.01.026 

[16] Wall, Rebecca, et al. "Fatty acids from fish: the anti-inflammatory potential of long-chain omega-3 fatty acids." Nutrition reviews 68.5 (2010): 280-289. https://doi.org/10.1111/j.1753-4887.2010.00287.x
 
[17] Loira, Nicolas, et al. "A genome-scale metabolic model of the lipid-accumulating yeast Yarrowia lipolytica." BMC systems biology 6 (2012): 1-9. doi:10.1186/1752-0509-6-35 

[18] Herrgård, Markus J., et al. "A consensus yeast metabolic network reconstruction obtained from a community approach to systems biology." Nature biotechnology 26.10 (2008): 1155-1160. https://doi.org/10.1038/nbt1492 

[19] Mo, Monica L., Bernhard Ø. Palsson, and Markus J. Herrgård. "Connecting extracellular metabolomic measurements to intracellular flux states in yeast." BMC systems biology 3.1 (2009): 1-17.  https://doi.org/10.1186/1752-0509-3-37 

[20] Nookaew, Intawat, et al. "The genome-scale metabolic model iIN800 of Saccharomyces cerevisiae and its validation: a scaffold to query lipid metabolism." BMC systems biology 2 (2008): 1-15. https://doi.org/10.1186/1752-0509-2-71 

[21] Kavšček, Martin, et al. "Optimization of lipid production with a genome-scale model of Yarrowia lipolytica." BMC systems biology 9.1 (2015): 1-13. DOI 10.1186/s12918-015-0217-4 

[22] Duarte, Natalie C., Markus J. Herrgård, and Bernhard Ø. Palsson. "Reconstruction and validation of Saccharomyces cerevisiae iND750, a fully compartmentalized genome-scale metabolic model." Genome research 14.7 (2004): 1298-1309. https://doi.org/10.1101/gr.2250904 

[23] Kerkhoven, Eduard J., et al. "Regulation of amino-acid metabolism controls flux to lipid accumulation in Yarrowia lipolytica." NPJ systems biology and applications 2.1 (2016): 1-7. http://dx.doi.org/10.1038/npjsba.2016.5 

[24] Aung, Hnin W., Susan A. Henry, and Larry P. Walker. "Revising the representation of fatty acid, glycerolipid, and glycerophospholipid metabolism in the consensus model of yeast metabolism." Industrial biotechnology 9.4 (2013): 215-228. https://doi.org/10.1089%2Find.2013.0013 

[25] Lieven, Christian, et al. "MEMOTE for standardized genome-scale metabolic model testing." Nature biotechnology 38.3 (2020): 272-276. https://doi.org/10.1038/s41587-020-0446-y

[26] Lee, Jong-Sub, et al. "Growth and fermentation characteristics of Saccharomyces cerevisiae NK28 isolated from kiwi fruit." Journal of microbiology and biotechnology 23.9 (2013): 1253-1259. 

[27] Park, Jong Myoung, et al. "Flux variability scanning based on enforced objective flux for identifying gene amplification targets." BMC systems biology 6 (2012): 1-11. 
