# Exercise 2

In this exercise we aim to increase succinate production in yeast using the GECKO and ETFL formulations.

## GECKO model

Start by installing the `geckopy` package, which contains a model for yeast,

In [None]:
!pip install geckopy --upgrade --quiet

and perform the necessary imports.

In [None]:
from geckopy.gecko import GeckoModel
from mewpy.problems import GeckoOUProblem
import pandas as pd

We may now load the GECKO yeast model:

In [None]:
model = GeckoModel('single-pool')

We can use the GECKO model to find genetic modifications that favor the production of succinate. However, instead of applying genetic modifications (deletions, up-regulations and down-regulations) by evaluating GPR rules, the modifications are imposed on the enzyme usage pseudo-reactions.
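
To make the idea concrete, the sketch below shows one plausible way to turn a fold-change level for a protein into flux bounds on its enzyme usage pseudo-reaction. The function name `usage_bounds`, the reference usage `wt_usage` and the cap `max_flux` are assumptions made for illustration; this is not necessarily how `GeckoOUProblem` translates a solution internally.

```python
def usage_bounds(level, wt_usage, max_flux=10000):
    """Translate a fold-change level into (lower, upper) bounds on an
    enzyme usage pseudo-reaction, relative to a wild-type usage flux.

    Hypothetical helper for illustration only.
    """
    if level < 0:
        raise ValueError("level must be non-negative")
    if level == 0:
        # Deletion: the enzyme cannot be used at all.
        return (0.0, 0.0)
    if level < 1:
        # Down-regulation: usage is capped below the wild-type value.
        return (0.0, level * wt_usage)
    if level > 1:
        # Up-regulation: usage is forced above the wild-type value.
        return (level * wt_usage, max_flux)
    # level == 1: the usage reaction is left unconstrained.
    return (0.0, max_flux)


# Example with an arbitrary wild-type usage of 4.0 (units of the usage flux).
for lv in (0, 0.5, 1, 2):
    print(lv, usage_bounds(lv, 4.0))
```

Under this convention a deletion closes the usage reaction, a down-regulation limits how much enzyme can be drawn, and an up-regulation forces a minimum enzyme usage.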

As the optimization tasks can be long, we will simulate previously obtained modification solutions. We will, nonetheless, define a problem but without any optimization objectives.

Start by defining the required identifiers and the medium. We will consider an aerobic medium with maximum glucose and oxygen uptake rates of 10 and 20 mmol/(gDW·h), respectively.

In [None]:
BIOMASS = 'r_2111'
GLC = 'r_1714'
O2 = 'r_1992'
# succinate
PRODUCT = 'r_2056'

medium = {GLC:(-10,10000),O2:(-20,10000)}
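
The medium maps each exchange reaction to a `(lower, upper)` pair of flux bounds, with uptake expressed as a negative lower bound. As a minimal sketch, the hypothetical helper `make_medium` below builds the same dictionary from positive uptake limits; the helper name and the default `max_flux` of 10000 are assumptions taken from the values used above.

```python
def make_medium(uptake_limits, max_flux=10000):
    """Build an environmental-conditions dictionary.

    uptake_limits: dict mapping exchange reaction IDs to the maximum
                   uptake rate, given as a positive number.
    Returns a dict mapping each reaction ID to (lower, upper) bounds,
    where uptake appears as a negative lower bound.
    """
    return {rxn: (-abs(rate), max_flux) for rxn, rate in uptake_limits.items()}


# Glucose (r_1714) limited to 10 and oxygen (r_1992) to 20 mmol/(gDW.h).
medium_sketch = make_medium({'r_1714': 10, 'r_1992': 20})
print(medium_sketch)
```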

Define a GECKO up/down-regulation problem:

In [None]:
problem = GeckoOUProblem(model,[],envcond=medium)

In [None]:
sim = problem.simulator

## Exercise 1

**1.1:** Identify the list of essential proteins.


**1.2:** For one randomly selected essential protein, identify the reactions it catalyzes and their respective turnover numbers, as well as its molecular weight.

**1.3:** Compute the wild-type production of succinate.

**1.4:** Compute the theoretical maximum production rate of succinate with a 95% confidence on growth.

Consider the sets of genetic modifications below, which resulted from GECKO optimizations:

In [None]:
df = pd.read_csv('data/succ_yeast.csv')
df

You can retrieve the first solution using, for example, the command below:

In [None]:
solution = eval(df.iloc[0,0]) # the first 0 identifies the row index
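
Since `eval` executes arbitrary code, a safer alternative for trusted-looking text is `ast.literal_eval`, which only accepts Python literals. The sketch below assumes each cell holds a dictionary literal mapping protein identifiers to fold-change levels; the string `raw` and the identifiers `PROT_A` and `PROT_B` are placeholders, not entries from `succ_yeast.csv`.

```python
import ast


def parse_solution(text):
    """Parse a solution stored as a dictionary literal.

    Raises ValueError if the text is not a plain literal or not a dict.
    """
    try:
        value = ast.literal_eval(text)
    except SyntaxError as exc:
        raise ValueError(f"not a valid literal: {text!r}") from exc
    if not isinstance(value, dict):
        raise ValueError("expected a dictionary of modifications")
    return value


# Placeholder content standing in for one cell of the CSV file.
raw = "{'PROT_A': 0, 'PROT_B': 0.5}"
parsed = parse_solution(raw)
print(parsed)
```

With the data frame loaded above, the same helper could be applied as `parse_solution(df.iloc[0, 0])`, provided the cells are indeed plain dictionary literals.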

### Exercise 2
Briefly analyse the solutions by running phenotypic simulations (pFBA, lMOMA, ROOM and FVA) and plot the production envelopes.

## ETFL model

In [None]:
!pip install https://github.com/EPFL-LCSB/pytfa/archive/refs/heads/master.zip --quiet
!pip install https://github.com/EPFL-LCSB/etfl/archive/refs/heads/master.zip --quiet

In [None]:
from etfl.io.json import load_json_model
from mewpy.problems import ETFLGOUProblem

In [None]:
model = load_json_model('data/yeast8_vEFL_2584_enz_128_bins__20210908_192334.json')

Create an `ETFLGOUProblem` using the ETFL model:

In [None]:
from mewpy.simulation import get_simulator
sim = get_simulator(model,envcond=medium)

In [None]:
problem = ETFLGOUProblem(model,[],envcond=medium)

In [None]:
sim = problem.simulator

You may use the `find` function to identify gene (or protein) transcription, translation and degradation pseudo-reactions.

Genetic modifications on ETFL models use the gene identifier instead of the UniProt identifier. We therefore need to convert the IDs in the GECKO solutions before they can be simulated on ETFL models. The function below does that for you.

In [None]:
gp = pd.read_csv('data/prot_gene.csv')

def prot_to_gene(solution):
    ''' Convert a solution with protein identifier to a solution
    with gene identifiers
    '''
    def p2g(p):
        return gp.loc[gp.Protein==p,'Gene'].item()
    return {p2g(k):v for k,v in solution.items()}
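
Note that `.item()` raises a `ValueError` when a protein has no match or more than one match in the table. As a minimal alternative sketch, the mapping can be loaded once into a dictionary so that missing proteins are reported together. The names `load_mapping` and `prot_to_gene_dict`, and the rows in `SAMPLE`, are hypothetical; only the `Protein` and `Gene` column names come from the function above.

```python
import csv
import io

# Placeholder rows standing in for data/prot_gene.csv.
SAMPLE = "Protein,Gene\nPROT_A,GENE_A\nPROT_B,GENE_B\n"


def load_mapping(handle):
    """Read a Protein -> Gene mapping from an open CSV file handle."""
    return {row['Protein']: row['Gene'] for row in csv.DictReader(handle)}


def prot_to_gene_dict(solution, mapping):
    """Convert a solution keyed by protein IDs into one keyed by gene IDs."""
    missing = [p for p in solution if p not in mapping]
    if missing:
        raise KeyError(f"no gene found for: {missing}")
    return {mapping[p]: level for p, level in solution.items()}


mapping = load_mapping(io.StringIO(SAMPLE))
converted = prot_to_gene_dict({'PROT_A': 0, 'PROT_B': 2}, mapping)
print(converted)
```

For the real file, `load_mapping(open('data/prot_gene.csv', newline=''))` would play the role of the `io.StringIO` handle. If two proteins map to the same gene, the later entry overwrites the earlier one in the converted solution.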

### Exercise 3

We can now simulate the sets of genetic modifications, obtained with the GECKO formulation, on the ETFL model.