# [Title]

## 1. Introduction

### 1.1 Literature review of the compound (<500 words)

The product introduced in this project is the fat-soluble (source) vitamin D3, most commonly referred to simply as vitamin D. The vitamin is essential to the intestinal absorption of calcium, magnesium and phosphate. It can be produced by the skin through a reaction that depends on UVB light from sunlight (Holick 1980), and is also present in egg yolks and fish (Brown #).

The most popular use of vitamin D is as a supplement, especially in countries in the Northern Hemisphere, where long winter nights leave little sunlight available. However, deficiency is also widespread in Asia, especially in China. More than 10% of Europeans have severe vitamin D deficiency, while non-severe deficiency affects <20% of Northern Europeans and 30-60% of the rest of Europe.

A newer use, still as a supplement, is taking vitamin D as a preventative against COVID-19, a respiratory infection whose pandemic began in 2019 and is still ravaging the world, with over one million dead. Some trials have found no effect of taking vitamin D, while other researchers are still conducting experiments to elucidate the matter further. Nonetheless, it is known that vitamin D can reduce the risk of respiratory infections in general.

While research is as yet inconclusive with regard to vitamin D and its effect on COVID-19, the market has reacted to the increased interest. For example, the Danish broadcaster DR (Danmarks Radio) recently published an article reporting that consumption of vitamin D has increased by up to 50% since September 2020 ("Salget af D-vitamin er eksploderet – men Giftlinjen og overlæge advarer om overforbrug"; in English: "Sales of vitamin D have exploded, but the poison hotline Giftlinjen and a chief physician warn against overconsumption").

Furthermore, with increasing wages in Asia, an increase in the use of vitamin D supplements has been noted, which contributes to growth in the vitamin D market. The compound annual growth rate (CAGR) of the vitamin D market is around 7% for 2019-2024. The market is quite competitive, with no single dominant player. Instead, multiple companies produce and sell vitamin D (Pfizer, GlaxoSmithKline, etc.).
While the companies do not release exact information on their production methods, patents suggest a purely chemical rather than biological production process (patent WO2001072286A1).
Although industrial production is chemical, vitamin D is, as mentioned, also produced in the human body, and it has been produced experimentally in Saccharomyces cerevisiae.
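
To make the quoted growth rate concrete, the compounding can be worked through directly. The sketch below assumes that the rate of roughly 7% applies to each of the five yearly steps from 2019 to 2024, and it uses a hypothetical base value of 100 (arbitrary units), since no absolute market size is given here.

```python
def compound_growth(base: float, annual_rate: float, years: int) -> float:
    """Value after compounding `annual_rate` over `years` yearly steps."""
    return base * (1.0 + annual_rate) ** years


# Hypothetical base value of 100 (arbitrary units); only the ratio matters.
base_2019 = 100.0
projected_2024 = compound_growth(base_2019, 0.07, 2024 - 2019)
growth_factor = projected_2024 / base_2019

print(f"Growth factor 2019-2024: {growth_factor:.2f}")
```

Under these assumptions, a 7% annual rate corresponds to a total market growth of roughly 40% over the period.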


### 1.2 Literature review of the cell factory (<500 words)


Budding yeast (Saccharomyces cerevisiae) is one of the most widely used microorganisms in human history [X]. The first evidence of the use of microorganisms is suspected to involve some kind of yeast, plausibly S. cerevisiae, and dates back to XXXX BCE [X]. Today, this organism is used in a wide variety of production processes, ranging from ethanol [X] to biomass [X], and from small organic compounds [X] to cancer medication [X].

One important difference between S. cerevisiae and the commonly used bacterial hosts is that yeasts are eukaryotes, which means that they have, among other things, cellular compartments. This adds considerable complexity, as transport between compartments needs to be factored in, but it also allows more complex pathways to be split between locations where the local environment might be more hospitable. Furthermore, because the organism is so widely used, it has been studied extensively and is popular in industry, making it a good candidate for the production of vitamins and other biochemicals as well as proteins [10.1186/s12934-015-0281-x].
The yeast also has many well-established DNA cassettes, which makes strain construction more efficient [source].

While the organism has many advantages and is widely used, it has some drawbacks as well. Industry and academia have usually used glucose and disaccharides as substrate and carbon source; however, this is neither environmentally sustainable nor cheap. From these perspectives, it would be more advantageous to use a strain that can naturally use cellulose, lignin or xylose, which are abundant in nature, as substrate.
Another issue with S. cerevisiae can be the unwanted production of ethanol, even under aerobic conditions (the Crabtree effect), since the cell factory diverts carbon and energy towards ethanol rather than the desired metabolite [10.1186/s12934-015-0281-x].

There are many reasons why S. cerevisiae is so prevalent in biotechnology, among them its fast growth rate, high production capacity, and relative simplicity for a eukaryote [source?].

Of course, other hosts could be used, such as prokaryotic organisms or other yeasts. However, as outlined above, S. cerevisiae allows for higher complexity, especially when working with more complex molecules such as vitamins and hormones (e.g. insulin). Other yeasts could be used, but S. cerevisiae remains the most researched yeast and is, as mentioned, a model organism.

Yeast does not naturally have the biosynthetic pathway for the vitamin D precursor. However, other researchers have been able to insert the pathway into yeast with some success. For example, the authors of doi.org/10.1186/s13068-018-1194-9 produced 7-dehydrocholesterol through metabolic engineering, by overexpressing genes in the mevalonate pathway and introducing the gene for Δ24-dehydrocholesterol reductase from the chicken (Gallus gallus). The researchers furthermore deleted some genes and introduced specific promoters, leading to a titer of 1.07 g/L of the vitamin D precursor.

## 2. Problem definition (<300 words)

The main focus of this project is on engineering S. cerevisiae as a cell factory to produce a vitamin D precursor. Currently, vitamin D is mainly produced by extracting the precursor from sheep's wool. With a yeast-based method, however, production would be vegan, and possibly better optimised and less dependent on external factors such as healthy sheep.

To produce the vitamin D precursor in S. cerevisiae, the relevant genes must first be introduced into an existing genome-scale metabolic model (GSM). The model must then be optimized with a focus on both biomass production and vitamin D precursor production, so that the yield becomes industrially feasible or near-feasible.


## 3. *If Project category II:* Selection and assessment of existing GSM (<500 words)

For our project, we need a metabolic model of yeast. Currently, Yeast8 seems to be the most comprehensive yeast genome-scale metabolic model (GEM). The model was first published in 2019 by Lu et al. (source) and has since been updated in an open-source manner on GitHub.
Apart from being regularly updated, the Yeast8 framework also contains additional information about enzyme structures and kinetics, which can be used to create increasingly detailed models for varying purposes.

We will be using the ecYeast8 model, which is based on the Yeast8 GEM but with added enzyme constraints based on proteomics data, as generated by GECKO (source). More specifically, we will be using an already specified version of the GEM, as found on the [GECKO GitHub.](https://github.com/SysBioChalmers/GECKO/blob/master/models/prot_constrained/ecYeastGEM_prot/ecYeastGEM_prot.xml)
The specific experimental conditions for this model can be found [here.](https://github.com/SysBioChalmers/GECKO/tree/master/Databases)

### Loading of the model
The first step is to acquire the model and load it into our repository. We took the simple route and downloaded the .xml file from the GitHub repository, although the whole repository could also have been cloned and the model extracted from there.
This simple download was chosen to minimize the size of the project repository.
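
The manual download could also be scripted. The following is a minimal sketch rather than the method actually used here: it assumes the usual GitHub convention that a `github.com/.../blob/...` page has a matching `raw.githubusercontent.com` address, and that the file is still hosted at the path linked above. The helper names `to_raw_url` and `download_model` are hypothetical.

```python
import re
import urllib.request

# Page URL of the model file, as linked in the text above.
BLOB_URL = (
    "https://github.com/SysBioChalmers/GECKO/blob/master/"
    "models/prot_constrained/ecYeastGEM_prot/ecYeastGEM_prot.xml"
)


def to_raw_url(blob_url: str) -> str:
    """Rewrite a github.com 'blob' URL into its raw.githubusercontent.com form."""
    match = re.fullmatch(r"https://github\.com/([^/]+)/([^/]+)/blob/(.+)", blob_url)
    if match is None:
        raise ValueError(f"Not a GitHub blob URL: {blob_url}")
    owner, repo, rest = match.groups()
    return f"https://raw.githubusercontent.com/{owner}/{repo}/{rest}"


def download_model(blob_url: str, destination: str) -> str:
    """Download the file behind `blob_url` to `destination` (needs network access)."""
    urllib.request.urlretrieve(to_raw_url(blob_url), destination)
    return destination


raw_url = to_raw_url(BLOB_URL)
print(raw_url)
# download_model(BLOB_URL, "ecYeastGEM_prot.xml")  # uncomment to fetch the file
```

Fetching a single file this way keeps the project repository small, in the same spirit as the manual download.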

In [25]:
from IPython.display import display
import re

import cplex

In [26]:
# First, we can import some functions so we can use the model
from cobra.io import read_sbml_model
from cobra import Reaction, Metabolite
from cameo.strain_design import pathway_prediction

# Second, we can read the GEM and save it as 'model'
model = read_sbml_model('ecYeastGEM_prot.xml')

# Third, we can show general information about the loaded model
model

| Property | Value |
|---|---|
| Name | M_ecYeastGEM_prot_v8__46__3__46__4 |
| Memory address | 0x07fa43f1528d0 |
| Number of metabolites | 4180 |
| Number of reactions | 8144 |
| Number of groups | 0 |
| Objective expression | -1.0*prot_pool_exchange + 1.0*prot_pool_exchange_reverse_c813a |
| Compartments | cell envelope, cytoplasm, extracellular, mitochondrion, nucleus, peroxisome, endoplasmic reticulum, Golgi, lipid particle, vacuole, endoplasmic reticulum membrane, vacuolar membrane, Golgi membrane, mitochondrial membrane |


From this, we can see that the model has 4180 metabolites, 8144 reactions, and 14 compartments.
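
The summary can also be checked programmatically. The sketch below works only on the strings shown in the model overview and does not need the model itself. It assumes that `__46__` in the model name is an escaped character code (46 being the code point of "."), as appears to be used when SBML identifiers are made alphanumeric. The helper `decode_escaped_id` is a hypothetical name.

```python
import re

# Strings copied from the model overview above.
MODEL_NAME = "M_ecYeastGEM_prot_v8__46__3__46__4"
COMPARTMENTS = (
    "cell envelope, cytoplasm, extracellular, mitochondrion, nucleus, "
    "peroxisome, endoplasmic reticulum, Golgi, lipid particle, vacuole, "
    "endoplasmic reticulum membrane, vacuolar membrane, Golgi membrane, "
    "mitochondrial membrane"
)


def decode_escaped_id(identifier: str) -> str:
    """Replace each '__<code>__' group with the character for that code point."""
    return re.sub(r"__(\d+)__", lambda m: chr(int(m.group(1))), identifier)


decoded_name = decode_escaped_id(MODEL_NAME)
compartment_list = [name.strip() for name in COMPARTMENTS.split(",")]

print(decoded_name)           # model name with the version made readable
print(len(compartment_list))  # number of compartments
```

Under this assumption, the name decodes to version 8.3.4 of the model, and the compartment count agrees with the 14 stated above.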

To get a quick overview of the model and its thoroughness, we can use Memote (source) to analyze the GEM and score its current quality. Hopefully, our changes will improve the model, or at least not make it any worse.

In [18]:
%%time
!memote report snapshot ecYeastGEM_prot.xml --filename ecYeastGEM_prot.html

platform linux -- Python 3.6.12, pytest-6.1.2, py-1.9.0, pluggy-0.13.1
rootdir: /usr/local/lib/python3.6/dist-packages/memote/suite/tests
collected 146 items / 1 skipped / 145 selected

../../../../usr/local/lib/python3.6/dist-packages/memote/suite/tests/test_annotation.py F [  0%]
[progress output truncated at 44%: nearly all tests shown as failed (F), three as passed (.)]

The file "ecYeastGEM_prot.html" contains the Memote analysis.
From this analysis, we can see that the total score is 16%, which is quite low. However, the tests for "Charge Balance" and "Metabolite Connectivity", as well as a few of the reaction annotations, get a score of 100%. This may indicate that the low total score is driven mainly by missing annotations rather than by structural problems in the network.
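
Console output captured from a notebook often carries terminal colour codes, which makes the pytest progress line hard to read. The sketch below is one possible way to strip such codes and tally the progress characters, where "." marks a passed test and "F" a failed one. The sample string is a hypothetical short excerpt, not the real Memote output, and the function names are assumptions. The tally is only meaningful on a progress string, since file paths also contain "." and "F".

```python
import re
from collections import Counter

# Matches colour codes such as "\x1b[31m"; the escape byte is optional because
# it is often lost when output is copied, leaving only "[31m".
ANSI_COLOUR = re.compile(r"\x1b?\[\d+(?:;\d+)*m")


def strip_colour_codes(text: str) -> str:
    """Remove terminal colour codes from captured console output."""
    return ANSI_COLOUR.sub("", text)


def tally_progress(progress: str) -> Counter:
    """Count pytest progress characters: '.' (passed) and 'F' (failed)."""
    cleaned = strip_colour_codes(progress)
    return Counter(ch for ch in cleaned if ch in ".F")


# Hypothetical short sample in the same style as captured output.
sample = "[0m[31mF[0m[31mF[0m[32m.[0m[31mF[0m"
counts = tally_progress(sample)

print(strip_colour_codes(sample))
print(counts["F"], counts["."])
```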

In [27]:
predictor = pathway_prediction.PathwayPredictor(model)

In [28]:
pathways = predictor.run(product="vanillin", max_predictions=4)

In [31]:
pathways = predictor.run(product="ascorbate", max_predictions=2)

ValueError: Specified product 'ascorbate' could not be found. Try searching pathway_predictor_obj.universal_model.metabolites
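
The error message suggests searching the metabolites of the universal model for the exact product name. A minimal sketch of such a search is given below. It assumes only that each metabolite object exposes `id` and `name` attributes, and it uses hypothetical stand-in objects so that it runs without loading any model. The function name `find_metabolites` and the stand-in identifiers are illustrative.

```python
from types import SimpleNamespace


def find_metabolites(metabolites, term):
    """Return metabolites whose name or id contains `term`, ignoring case."""
    needle = term.lower()
    return [
        met for met in metabolites
        if needle in (met.name or "").lower() or needle in met.id.lower()
    ]


# Hypothetical stand-ins exposing `id` and `name`, as metabolite objects do.
toy_metabolites = [
    SimpleNamespace(id="met_a", name="L-Ascorbate"),
    SimpleNamespace(id="met_b", name="Ethanol"),
    SimpleNamespace(id="met_c", name="7-Dehydrocholesterol"),
]

hits = find_metabolites(toy_metabolites, "ascorb")
print([met.name for met in hits])
```

With a loaded predictor, the same function could presumably be called as `find_metabolites(predictor.universal_model.metabolites, "dehydrocholesterol")`, assuming the attribute named in the error message is available on the object.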

In [29]:
#pathways = predictor.run(product="7-Dehydrocholesterol", max_predictions=1)

In [38]:
# Use the predictor object defined above; the ValueError hint points to its universal model
predictor.universal_model.metabolites

In [30]:
pathways

Unnamed: 0,targets


In [41]:
pathways = predictor.run(product="ethanol", max_predictions=1)

## 4. Computer-Aided Cell Factory Engineering (<1500 words if Category II project; <500 words for Category I project)

## 5. Discussion (<500 words)

## 6. Conclusion (<200 words)

## References