# Phenotype prediction Hands-on session

## Exercise 7

For this practical session, we will be using the following model:
- _E. coli_ core model which contains the central carbon metabolism of _Escherichia coli_ -> file: ../data/e_coli_core.xml

This exercise consists of searching for candidate genes whose knockout would lead to increased ethanol production. The following steps will be followed:
- Load the model with COBRApy;
- Perform an FBA simulation using an anaerobic medium;
- Load the Escher map with the FBA simulation to visualize the flux distribution;
- Identify potential reaction targets;
- Determine the essential genes of the model;
- Check the target reaction GPRs for essential genes;
- Perform the gene knockouts.

In [1]:
# importing cobra
import cobra

# Loading a model
model_path = '../data/e_coli_core.xml'
model = cobra.io.read_sbml_model(model_path)

model

0,1
Name,e_coli_core
Memory address,0x01e7a980d100
Number of metabolites,72
Number of reactions,95
Number of groups,0
Objective expression,1.0*BIOMASS_Ecoli_core_w_GAM - 1.0*BIOMASS_Ecoli_core_w_GAM_reverse_712e5
Compartments,"extracellular space, cytosol"


### Phenotype Simulation

By default, the environmental conditions of the _E. coli_ core model represent an aerobic medium. However, for _E. coli_ to produce ethanol, an anaerobic medium must be simulated. Hence, the first step of this exercise is to stop the oxygen supply by setting the lower bound of the O2 exchange reaction (`EX_o2_e`) to zero.

Following that, a standard FBA simulation can be performed with the goal of maximizing biomass production.
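
For reference, FBA can be written as a linear program. The block below is the standard textbook formulation, not something specific to this notebook, and the symbols are notation assumed here: $S$ is the stoichiometric matrix, $v$ the flux vector, $c$ the objective coefficients, and $v^{lb}$, $v^{ub}$ the flux bounds.

```latex
\max_{v} \; c^{T} v
\quad \text{subject to} \quad
S v = 0, \qquad v^{lb} \le v \le v^{ub}
```

In this exercise $c$ selects the biomass reaction, and the anaerobic condition corresponds to setting the lower bound of `EX_o2_e` to zero, so that no oxygen uptake is allowed.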

In [2]:
#setting the environmental conditions to replicate an anaerobic medium
o2 = model.reactions.get_by_id('EX_o2_e')
o2.bounds = (0,1000)
o2

0,1
Reaction identifier,EX_o2_e
Name,O2 exchange
Memory address,0x01e7de241fd0
Stoichiometry,o2_e -->  O2 O2 -->
GPR,
Lower bound,0
Upper bound,1000


In [3]:
#performing a FBA simulation
wt_solution = model.optimize()
wt_solution

Unnamed: 0,fluxes,reduced_costs
PFK,9.789459,2.602085e-18
PFL,17.804674,0.000000e+00
PGI,9.956609,0.000000e+00
PGK,-19.437336,-0.000000e+00
PGL,0.000000,0.000000e+00
...,...,...
NADH16,0.000000,-5.538015e-03
NADTRHD,0.000000,-1.107603e-02
NH4t,1.154156,0.000000e+00
O2t,0.000000,0.000000e+00


### Metabolic Pathway Visualization

Now that the environmental conditions have been constrained and the model has been simulated with FBA, one can create an `escher.Builder` object to obtain the Escher map of the _E. coli_ core model. The map makes the flux distribution easier to inspect, which makes it simpler to manually spot reactions that divert flux away from ethanol production.

In [4]:
# first import escher
import escher

In [5]:
# create the builder object which contains the metabolic map
builder = escher.Builder(map_name='e_coli_core.Core metabolism', model=model, reaction_data=wt_solution.fluxes)
builder

Downloading Map from https://escher.github.io/1-0-0/6/maps/Escherichia%20coli/e_coli_core.Core%20metabolism.json


Builder(reaction_data={'PFK': 9.789458863898286, 'PFL': 17.804674217935304, 'PGI': 9.95660909530426, 'PGK': -1…

From this map, one can quickly identify two reactions that divert flux away from ethanol production: citrate synthase (`CS`) and phosphotransacetylase (`PTAr`).

### Search for essential Genes

Now that we have two candidate reactions, we can use the `find_essential_genes` function of the `cobra.flux_analysis` module to determine the essential genes of this model. The result contains all the genes whose knockout would result in no biomass production.

Hence, a reaction whose GPR contains an essential gene is not a suitable candidate for metabolic engineering.

In [6]:
#searching for essential genes
essential = cobra.flux_analysis.find_essential_genes(model)
essential

{<Gene b0720 at 0x1e7de18f160>,
 <Gene b1136 at 0x1e7de1a92e0>,
 <Gene b1779 at 0x1e7de18fc40>,
 <Gene b2415 at 0x1e7de18fa00>,
 <Gene b2416 at 0x1e7de18f9a0>,
 <Gene b2779 at 0x1e7de18f460>,
 <Gene b2926 at 0x1e7de1bf1c0>,
 <Gene b3919 at 0x1e7de1bfb50>,
 <Gene b3956 at 0x1e7de1bf3a0>,
 <Gene b4025 at 0x1e7de1bf280>}

### Check target candidates

After identifying the essential genes of the model, one can now check whether the target reactions are associated with any of them.

In [7]:
model.reactions.PTAr

0,1
Reaction identifier,PTAr
Name,Phosphotransacetylase
Memory address,0x01e7de1ea820
Stoichiometry,accoa_c + pi_c <=> actp_c + coa_c  Acetyl-CoA + Phosphate <=> Acetyl phosphate + Coenzyme A
GPR,b2297 or b2458
Lower bound,-1000.0
Upper bound,1000.0


In [8]:
model.reactions.CS

0,1
Reaction identifier,CS
Name,Citrate synthase
Memory address,0x01e7de1d7d90
Stoichiometry,accoa_c + h2o_c + oaa_c --> cit_c + coa_c + h_c  Acetyl-CoA + H2O H2O + Oxaloacetate --> Citrate + Coenzyme A + H+
GPR,b0720
Lower bound,0.0
Upper bound,1000.0


As we can see, the phosphotransacetylase reaction is a viable knockout target. Its GPR is `b2297 or b2458`, and neither of those genes is in the list of essential genes, so both can be knocked out.

On the other hand, citrate synthase should not be knocked out: its only gene, `b0720`, is in the list of essential genes, so the organism would not be able to grow.
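
The same check can be done programmatically instead of by eye. The sketch below uses plain Python sets, with the gene IDs and GPR strings copied by hand from the outputs above; the helper `essential_genes_in_gpr` is a hypothetical name introduced for illustration and is not part of COBRApy. In a live session the same idea could be applied to `reaction.genes` and the result of `find_essential_genes`.

```python
import re

# Gene IDs copied from the find_essential_genes output above.
essential_ids = {
    "b0720", "b1136", "b1779", "b2415", "b2416",
    "b2779", "b2926", "b3919", "b3956", "b4025",
}

# GPR strings copied from the reaction summaries above.
target_gprs = {"PTAr": "b2297 or b2458", "CS": "b0720"}


def essential_genes_in_gpr(gpr, essential):
    """Return the essential gene IDs mentioned in a GPR string (hypothetical helper)."""
    # Extract every word-like token, then drop the boolean operators.
    genes = set(re.findall(r"\w+", gpr)) - {"and", "or"}
    return genes & essential


hits = {rid: essential_genes_in_gpr(gpr, essential_ids)
        for rid, gpr in target_gprs.items()}

for rid, found in hits.items():
    print(rid, "->", sorted(found) if found else "no essential genes")
```

A reaction with an empty result is a candidate for knockout; a non-empty result flags the genes that make the knockout lethal in this model.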

### Gene Knock outs
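
Because the PTAr GPR is `b2297 or b2458`, the two genes act as alternatives, so knocking out only one of them would leave the reaction active. The sketch below illustrates this logic with a small boolean evaluator. It is an illustration only: `reaction_is_active` is a hypothetical helper, it assumes GPR strings made of gene IDs, `and`, `or` and parentheses, and COBRApy performs its own GPR evaluation when `knock_out()` is called.

```python
import re


def reaction_is_active(gpr, knocked_out):
    """Evaluate a GPR string after removing the genes in `knocked_out`."""
    if not gpr.strip():
        # No gene association: assumed here to be unaffected by knockouts.
        return True
    # Replace each gene ID with 'True' (present) or 'False' (knocked out),
    # leaving the boolean operators 'and' / 'or' untouched.
    expr = re.sub(
        r"\b(?!and\b|or\b)\w+\b",
        lambda m: str(m.group(0) not in knocked_out),
        gpr,
    )
    # The expression now contains only True, False, and, or and parentheses.
    return eval(expr, {"__builtins__": {}}, {})


ptar_gpr = "b2297 or b2458"
single = reaction_is_active(ptar_gpr, {"b2297"})
double = reaction_is_active(ptar_gpr, {"b2297", "b2458"})
print("PTAr active after single knockout:", single)
print("PTAr active after double knockout:", double)
```

This is why the cell below knocks out both genes inside the same `with model:` block.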

In [9]:
print('Wt strain ethanol production:', wt_solution['EX_etoh_e'])

with model:
    model.genes.b2297.knock_out()
    model.genes.b2458.knock_out()
    mutant_solution = model.optimize()
    print('Mutant strain ethanol production:', mutant_solution['EX_etoh_e'])

Wt strain ethanol production: 8.279455380486581
Mutant strain ethanol production: 16.584255740929652
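
To put the two values in perspective, the short calculation below computes the fold change and the percentage increase from the numbers printed above (copied by hand here, so the variable names are introduced only for illustration). The mutant produces roughly twice as much ethanol as the wild type.

```python
# Ethanol exchange fluxes copied from the output above.
wt_etoh = 8.279455380486581
mutant_etoh = 16.584255740929652

# Ratio of mutant to wild-type production, and the relative increase.
fold_change = mutant_etoh / wt_etoh
percent_increase = (fold_change - 1) * 100

print(f"Fold change: {fold_change:.2f}")
print(f"Increase: {percent_increase:.1f}%")
```

It may also be worth comparing `wt_solution.objective_value` with `mutant_solution.objective_value` to see how the knockouts affect the predicted growth rate; that comparison is not part of the output shown here.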


In [10]:
builder = escher.Builder(map_name='e_coli_core.Core metabolism', model=model, reaction_data=mutant_solution.fluxes)
builder

Downloading Map from https://escher.github.io/1-0-0/6/maps/Escherichia%20coli/e_coli_core.Core%20metabolism.json


Builder(reaction_data={'PFK': 9.271232684579616, 'PFL': 3.9563473930779764, 'PGI': 8.339429028836893, 'PGK': -…