
# Shikimate in E. coli

## Introduction

In this short tutorial, we will use a core model of [Escherichia coli](
http://bigg.ucsd.edu/models/e_coli_core) and extend it to synthesize shikimate.
This compound is an important precursor to aromatic compounds such as
phenylalanine, tyrosine and tryptophan.
The authors of the article
[Metabolic engineering of Escherichia coli for improving shikimate synthesis
from glucose](
https://doi.org/10.1016/j.biortech.2014.05.035) engineered five strains of
*E. coli* in which certain genes for the regulation of shikimate were deactivated.
In this test case, we will compare the control strain (W3110) with the strain
lacking the *aroL*, *aroK* and *ptsG* genes (SA3). Using CobraMod, we can add
the shikimate pathway from EcoCyc and visually compare the strains.

[EcoCyc](https://www.ecocyc.org/) is a specialized database for biochemical
data of *E. coli* and is part of the BioCyc database family. The shikimate
pathway is shown as:

<img src="https://websvc.biocyc.org/ECOLI/diagram-only?type=PATHWAY&object=ARO-PWY&pfontsize=normal"/>

and is divided into two sub-pathways:

<table>
<thead>
  <tr>
    <th>Pathway Identifier</th>
    <th>Pathway name</th>
  </tr>
</thead>
<tbody>
  <tr>
    <td>PWY-6164</td>
    <td>3-dehydroquinate biosynthesis I</td>
  </tr>
  <tr>
    <td>PWY-6163</td>
    <td>chorismate biosynthesis from 3-dehydroquinate</td>
  </tr>
</tbody>
</table>





## Loading CobraMod and defining environment

First, we need to load the core model for *E. coli*. Our package includes it
in its module [cobramod.test], and it is the same `textbook` model that COBRApy
uses. Additionally, we will load the Python standard library module [pathlib](
https://docs.python.org/3/library/pathlib.html) to define the location to store
the data, and the function [cobramod.add_pathway] to add complete pathways
to our model.

In [1]:
from pathlib import Path

from cobramod import add_pathway, __version__
from cobramod.test import textbook

print(__version__)

Scaling...
 A: min|aij| =  1.000e+00  max|aij| =  1.000e+00  ratio =  1.000e+00
Problem data seem to be well scaled
0.5.4


We will define the directory for our data as `dir_data`. The whole pathway with the
corresponding reaction and metabolite information will be downloaded and stored there.
Moreover, we will create `model` as a copy of `textbook`, which represents the core model of *E. coli*.

In [2]:
dir_data = Path.cwd().joinpath("data")

model = textbook.copy()

The article mentions that the medium for the strains included glucose (15 g) and
multiple other substances. Because our model is a core model, it only includes the
fundamental metabolic pathways and lacks the uptake of these substrates.
For this reason, we will only constrain the exchange of glucose and oxygen:

In [3]:
# Limiting glucose uptake (negative flux on an exchange reaction means uptake)
model.exchanges.EX_glc__D_e.bounds = (-15, 0)
# Blocking oxygen exchange (both bounds set to zero)
model.exchanges.EX_o2_e.bounds = (0, 0)
# model.summary(threshold=0.0000001)

## Adding the pathway

To add a pathway to the model, we will use the function `cobramod.add_pathway`.
We need to specify the directory for the data (`directory`), the compartment in
which the reactions take place, the database and the identifier of the pathway
(`pathway`). By default, CobraMod will create a single `cobramod.Pathway` for
every pathway added. Nonetheless, users can merge pathways by using the argument
`group`. Additionally, we will use the method `Pathway.modify_graph()` to join
the pathways visually and `Pathway.visualize()` to show the pathway.


In [4]:
# model = textbook.copy()
# First pathway
add_pathway(
    model=model,
    directory=dir_data,
    pathway="PWY-6164",
    compartment="c",
    database="ECOLI",
    # Shared identifier
    group="PWY-SHIKIMATE",
)

# Second Pathway
add_pathway(
    model=model,
    directory=dir_data,
    pathway="PWY-6163",
    compartment="c",
    database="ECOLI",
    # Shared identifier
    group="PWY-SHIKIMATE",
)

# Merge reactions of the sub-pathways
model.groups.get_by_id("PWY-SHIKIMATE").modify_graph(
    reaction="3_DEHYDROQUINATE_SYNTHASE_RXN_c",
    next_reaction="3_DEHYDROQUINATE_DEHYDRATASE_RXN_c"
)
# Show pathway map
model.groups.get_by_id("PWY-SHIKIMATE").visualize()



Quantity of     new   | removed entities in
Reactions        2    |    0              
Metabolites      2    |    0              
Exchange         0    |    0              
Demand           0    |    0              
Sinks            1    |    0              
Genes            4    |    0              
Groups           1    |    0              





Quantity of     new   | removed entities in
Reactions        5    |    0              
Metabolites      5    |    0              
Exchange         0    |    0              
Demand           0    |    0              
Sinks            1    |    1              
Genes            6    |    0              
Groups           0    |    0              



Builder(reaction_scale={}, reaction_styles=['color', 'text'])

CobraMod will try to avoid creating duplicate reactions and metabolites. In
our case, `textbook` uses the BiGG identifier convention and the data retrieved
uses the BioCyc identifier convention. CobraMod will check the cross-references
from the downloaded data and will try to find corresponding objects in the
model. Warnings will appear if matches are found.
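
The idea behind such a cross-reference lookup can be sketched in plain Python. This is only a conceptual illustration and not CobraMod's actual implementation: the function `find_existing`, the dictionary layout and the cross-reference records are hypothetical and were written by hand rather than downloaded.

```python
# Illustrative cross-reference records keyed by BioCyc identifier.
# These values are hand-written for the example, not downloaded data.
biocyc_xrefs = {
    "PHOSPHO-ENOL-PYRUVATE": {"BIGG": "pep"},
    "ERYTHROSE-4P": {"BIGG": "e4p"},
    "SHIKIMATE": {},  # hypothetical: no usable BiGG cross-reference
}

# Metabolite identifiers already present in a BiGG-style model
model_metabolites = {"pep_c", "e4p_c", "pyr_c"}


def find_existing(biocyc_id, compartment="c"):
    """Return the matching model identifier, or None if there is no match."""
    bigg = biocyc_xrefs.get(biocyc_id, {}).get("BIGG")
    if bigg is None:
        return None
    candidate = f"{bigg}_{compartment}"
    return candidate if candidate in model_metabolites else None


for identifier in biocyc_xrefs:
    print(identifier, "->", find_existing(identifier))
```

In this sketch, a metabolite with a match would reuse the existing model object, while one without a match would be created under its formatted BioCyc identifier.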

Our package will also include the gene information of the reactions and
pathways that are added to the model. We can loop through the members of the
pathway (reactions) and print the new genes:

In [5]:
for reaction in model.groups.get_by_id("PWY-SHIKIMATE").members:
    print(reaction.id, "->", reaction.name, [x.name for x in reaction.genes])

DAHPSYN_RXN_c -> 3-deoxy-7-phosphoheptulonate synthase ['aroG', 'aroF', 'aroH']
3_DEHYDROQUINATE_SYNTHASE_RXN_c -> 3-dehydroquinate synthase ['aroB']
3_DEHYDROQUINATE_DEHYDRATASE_RXN_c -> 3-dehydroquinate dehydratase ['aroD']
SHIKIMATE_5_DEHYDROGENASE_RXN_c -> shikimate dehydrogenase ['aroE']
SHIKIMATE_KINASE_RXN_c -> shikimate kinase ['aroK', 'aroL']
2.5.1.19_RXN_c -> 3-phosphoshikimate 1-carboxyvinyltransferase ['aroA']
CHORISMATE_SYNTHASE_RXN_c -> chorismate synthase ['aroC']
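
As a quick check of which pathway reactions the SA3 deletions (*aroL*, *aroK* and *ptsG*) would remove, the gene lists printed above can be compared against the deleted genes. The sketch below copies the associations by hand and assumes that multiple genes on one reaction are isozymes (an OR relationship), so a reaction is lost only when all of its genes are deleted. That assumption is ours and is not read from the model.

```python
# Gene associations copied from the output above
genes_by_reaction = {
    "DAHPSYN_RXN_c": {"aroG", "aroF", "aroH"},
    "3_DEHYDROQUINATE_SYNTHASE_RXN_c": {"aroB"},
    "3_DEHYDROQUINATE_DEHYDRATASE_RXN_c": {"aroD"},
    "SHIKIMATE_5_DEHYDROGENASE_RXN_c": {"aroE"},
    "SHIKIMATE_KINASE_RXN_c": {"aroK", "aroL"},
    "2.5.1.19_RXN_c": {"aroA"},
    "CHORISMATE_SYNTHASE_RXN_c": {"aroC"},
}

# Genes deleted in the SA3 strain
deleted_genes = {"aroL", "aroK", "ptsG"}

# Assumption: genes on one reaction are isozymes (OR), so the reaction
# is blocked only if every associated gene is in the deleted set.
blocked = sorted(
    reaction
    for reaction, genes in genes_by_reaction.items()
    if genes <= deleted_genes
)
print(blocked)  # ['SHIKIMATE_KINASE_RXN_c']
```

Under this assumption, only the shikimate kinase step of the added pathway is affected by the SA3 deletions.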


## Creating biomass reactions

Table 2 of the article reports the biomass and shikimate production
of the different *E. coli* strains. We will use the wild-type strain (W3110) and the
SA3 strain for our demonstration.

<table>
<thead>
  <tr>
    <th>Strains</th>
    <th>Biomass (g/L)</th>
    <th>Shikimate (mg/L)</th>
  </tr>
</thead>
<tbody>
  <tr>
    <td>W3110</td>
    <td>3.42 ± 0.26</td>
    <td>1.31 ± 0.12</td>
  </tr>
  <tr>
    <td>SA3</td>
    <td>5.24 ± 0.34</td>
    <td>417.20 ± 50.01</td>
  </tr>
</tbody>
</table>

Our current model already contains a biomass reaction. However, it does not
include shikimate in its stoichiometry. For this reason, we will use the table
above to calculate the ratio of shikimate to biomass for both strains and apply
it to two new biomass reactions.

First, we need to calculate the amount of biomass without taking
growth-associated maintenance into
consideration. The original [article](
https://doi.org/10.1038/s41596-018-0098-2) for the *E. coli* core model suggests
that we need to ignore `ATP` and `ADP`. Additionally, we will ignore `NAD`,
`NADH`, `NADPH` and `WATER`.

In [6]:
model.reactions.Biomass_Ecoli_core.reaction

'1.496 3pg_c + 3.7478 accoa_c + 59.81 atp_c + 0.361 e4p_c + 0.0709 f6p_c + 0.129 g3p_c + 0.205 g6p_c + 0.2557 gln__L_c + 4.9414 glu__L_c + 59.81 h2o_c + 3.547 nad_c + 13.0279 nadph_c + 1.7867 oaa_c + 0.5191 pep_c + 2.8328 pyr_c + 0.8977 r5p_c --> 59.81 adp_c + 4.1182 akg_c + 3.7478 coa_c + 59.81 h_c + 3.547 nadh_c + 13.0279 nadp_c + 59.81 pi_c'

In [7]:
# Sum the reactant coefficients, skipping energy and redox cofactors
bio_ecore = 0.0
for (
    metabolite,
    coefficient,
) in model.reactions.Biomass_Ecoli_core.metabolites.items():
    if metabolite.id in ("atp_c", "adp_c", "h_c", "nad_c", "nadph_c", "h2o_c"):
        continue
    # Reactants have negative coefficients
    if coefficient < 0:
        bio_ecore += -1 * coefficient
bio_ecore

17.243100000000002
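
As a cross-check that does not depend on the model object, the same sum can be reproduced from the reactant coefficients in the reaction string printed in cell 6. The dictionary below is transcribed by hand from that string, and the variable names are only illustrative.

```python
# Reactant coefficients transcribed from the Biomass_Ecoli_core reaction string
reactants = {
    "3pg_c": 1.496, "accoa_c": 3.7478, "atp_c": 59.81, "e4p_c": 0.361,
    "f6p_c": 0.0709, "g3p_c": 0.129, "g6p_c": 0.205, "gln__L_c": 0.2557,
    "glu__L_c": 4.9414, "h2o_c": 59.81, "nad_c": 3.547, "nadph_c": 13.0279,
    "oaa_c": 1.7867, "pep_c": 0.5191, "pyr_c": 2.8328, "r5p_c": 0.8977,
}

# Reactants excluded from the sum (energy and redox cofactors, water)
ignored = {"atp_c", "h2o_c", "nad_c", "nadph_c"}

total = sum(
    coefficient
    for metabolite, coefficient in reactants.items()
    if metabolite not in ignored
)
print(round(total, 4))  # 17.2431
```

Only four of the six identifiers skipped in the loop appear here, because `adp_c` and `h_c` are products and would not be summed in any case.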

We are interested in the ratio of shikimate to biomass for each strain. We
therefore take the measured shikimate-to-biomass ratio of a strain and scale it
to the biomass sum of the original reaction. Because the table reports shikimate
in mg/L and biomass in g/L, shikimate is first converted to g/L. Following this
logic, we obtain:

$$
\begin{aligned}
\frac{\text{shikimate}_{\text{strain}}}{\text{bio}_{\text{strain}}} &= \frac{x}{\text{bio}_{\text{ecore}}}\\
x &= \frac{\text{shikimate}_{\text{strain}}}{\text{bio}_{\text{strain}}}\cdot \text{bio}_{\text{ecore}}
\end{aligned}
$$

where $x$ is the shikimate proportion for the original biomass reaction. This
proportion will change depending on the strain.

In [8]:
# Calculating proportions (shikimate converted from mg/L to g/L)
shikimate = {
    "W3110": (1.31 / 1000) / 3.42 * bio_ecore,
    "SA3": (417.20 / 1000) / 5.24 * bio_ecore
}
shikimate

{'W3110': 0.006604813157894738, 'SA3': 1.3728666641221374}
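
The same arithmetic can be written as a small stand-alone function that makes the unit conversion explicit. The function name `shikimate_coefficient` and its parameters are hypothetical helpers introduced for this sketch. The measurements come from the table above and the biomass sum from the output of cell 7.

```python
import math


def shikimate_coefficient(shikimate_mg_per_l, biomass_g_per_l, bio_ecore):
    """Scale the measured shikimate/biomass ratio to the core biomass sum."""
    shikimate_g_per_l = shikimate_mg_per_l / 1000  # mg/L -> g/L
    return shikimate_g_per_l / biomass_g_per_l * bio_ecore


bio_ecore = 17.2431  # biomass sum without maintenance cofactors

# (shikimate in mg/L, biomass in g/L) from Table 2 of the article
measurements = {"W3110": (1.31, 3.42), "SA3": (417.20, 5.24)}

coefficients = {
    strain: shikimate_coefficient(shk, bio, bio_ecore)
    for strain, (shk, bio) in measurements.items()
}
for strain, value in coefficients.items():
    print(strain, round(value, 4))
# W3110 0.0066
# SA3 1.3729
```

The SA3 coefficient is roughly 200 times larger than the wild-type one, which reflects the difference in shikimate titers reported for the two strains.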

After calculating the proportions, we will use the original biomass as a base
and will add the new proportion of shikimate to the equation. For this, we
need the identifier of shikimate. In this case, it is the same one as the [entry in EcoCyc](
https://biocyc.org/compound?orgid=ECOLI&id=SHIKIMATE). CobraMod formats the
identifiers and shikimate becomes `SHIKIMATE_c`.

In [9]:
# Creating biomass for wild type
biomass_W3110 = model.reactions.get_by_id("Biomass_Ecoli_core").copy()
biomass_W3110.id = "Biomass_Ecoli_core_W3110"
biomass_W3110.add_metabolites(
    {model.metabolites.get_by_id("SHIKIMATE_c"): -1 * shikimate["W3110"]}
)

# Creating biomass for SA3 strain
biomass_SA3 = model.reactions.get_by_id("Biomass_Ecoli_core").copy()
biomass_SA3.id = "Biomass_Ecoli_core_SA3"
biomass_SA3.add_metabolites(
    {model.metabolites.get_by_id("SHIKIMATE_c"): -1 * shikimate["SA3"]}
)

# Adding Reaction objects into our model
model.add_reactions([biomass_SA3, biomass_W3110])

## Comparing biomasses

We want to replicate the results by knocking out the corresponding genes for
the SA3 strain. However, because our model is very simple, knocking out an
essential gene yields an infeasible solution. For this reason, and
in contrast to the article, the gene *ptsG* cannot be deactivated. Additionally,
since no reaction other than the biomass reaction demands shikimate, the model
will not activate the reactions downstream of shikimate synthesis,
i.e. there is no need to knock out *aroK* and *aroL*.

We will set each of our new biomass reactions as the objective function in turn and optimize the model.



In [10]:
# Wild type
model.objective = "Biomass_Ecoli_core_W3110"

solution_W3110 = model.optimize()
model.summary(solution=solution_W3110)

Metabolite,Reaction,Flux,C-Number,C-Flux
co2_e,EX_co2_e,0.65,1,0.72%
glc__D_e,EX_glc__D_e,15.0,6,99.28%
h2o_e,EX_h2o_e,10.03,0,0.00%
nh4_e,EX_nh4_e,1.984,0,0.00%
pi_e,EX_pi_e,1.338,0,0.00%

Metabolite,Reaction,Flux,C-Number,C-Flux
ac_e,EX_ac_e,-12.43,2,33.07%
etoh_e,EX_etoh_e,-12.04,2,32.04%
for_e,EX_for_e,-26.22,1,34.89%
h_e,EX_h_e,-45.95,0,0.00%


In [11]:
# SA3 strain
model.objective = "Biomass_Ecoli_core_SA3"
solution_SA3 = model.optimize()
model.summary(solution=solution_SA3)

Metabolite,Reaction,Flux,C-Number,C-Flux
co2_e,EX_co2_e,0.6047,1,0.67%
glc__D_e,EX_glc__D_e,15.0,6,99.33%
h2o_e,EX_h2o_e,8.917,0,0.00%
nh4_e,EX_nh4_e,1.845,0,0.00%
pi_e,EX_pi_e,1.245,0,0.00%

Metabolite,Reaction,Flux,C-Number,C-Flux
ac_e,EX_ac_e,-12.07,2,33.08%
etoh_e,EX_etoh_e,-11.71,2,32.10%
for_e,EX_for_e,-25.41,1,34.83%
h_e,EX_h_e,-44.72,0,0.00%
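
To put numbers on the difference between the two solutions, the fluxes in the second table of each summary (the secreted products) can be compared directly. The sketch below copies the rounded values by hand from the two outputs above; the dictionary names are illustrative and it does not read from the `Solution` objects.

```python
# Secretion fluxes copied from the second table of each summary above
secretion_W3110 = {"ac_e": -12.43, "etoh_e": -12.04, "for_e": -26.22, "h_e": -45.95}
secretion_SA3 = {"ac_e": -12.07, "etoh_e": -11.71, "for_e": -25.41, "h_e": -44.72}

# Relative change in secretion magnitude, SA3 versus W3110, in percent
relative_change = {
    metabolite: (abs(secretion_SA3[metabolite]) - abs(wild)) / abs(wild) * 100
    for metabolite, wild in secretion_W3110.items()
}

for metabolite, change in relative_change.items():
    print(f"{metabolite}: {change:+.1f}%")
```

All four by-products are secreted slightly less with the SA3 biomass reaction, which is consistent with more carbon being directed to shikimate in the biomass equation.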


## Visualization with Escher

The `cobramod.Pathway` object has a special method to visualize the
metabolic pathway through Escher. Users can call `Pathway.visualize()` and
CobraMod will automatically create a map that can be customized.

In [12]:
model.groups.get_by_id("PWY-SHIKIMATE").visualize()

Builder(reaction_scale={}, reaction_styles=['color', 'text'])

For the visual comparison, we will change the color of the positive and
negative fluxes to blue and red, respectively. The orientation of the
map will be horizontal and the colors will depend on the quantiles of
the flux solutions.

In [13]:
model.groups[-1].vertical = False
model.groups[-1].color_positive = "blue"
model.groups[-1].color_negative = "red"
model.groups[-1].color_quantile = True
model.groups[-1].color_min_max = [-1,1]

In [14]:
model.groups[-1].visualize(solution_W3110)

Builder(reaction_data={'ACALD': -12.03976080152985, 'ACALDt': 1.5948213258617017e-15, 'ACKr': -12.425016580979…

In [15]:
model.groups[-1].visualize(solution_SA3)

Builder(reaction_data={'ACALD': -11.706838677509918, 'ACALDt': 1.5948213258617017e-15, 'ACKr': -12.06521412860…

From here, the user can compare the graphs visually and proceed with different
types of analysis.

## References

1. Chen, Xianzhong, Mingming Li, Li Zhou, Wei Shen, Govender Algasan, You Fan, and Zhengxiang Wang. “Metabolic Engineering of Escherichia Coli for Improving Shikimate Synthesis from Glucose.” Bioresource Technology 166 (August 1, 2014): 64–71. https://doi.org/10.1016/j.biortech.2014.05.035.
2. Heirendt, Laurent, Sylvain Arreckx, Thomas Pfau, Sebastián N. Mendoza, Anne Richelle, Almut Heinken, Hulda S. Haraldsdóttir, et al. “Creation and Analysis of Biochemical Constraint-Based Models Using the COBRA Toolbox v.3.0.” Nature Protocols 14, no. 3 (March 2019): 639–702. https://doi.org/10.1038/s41596-018-0098-2.
3. Orth, Jeffrey D., R. M. T. Fleming, and Bernhard Ø. Palsson. “Reconstruction and Use of Microbial Metabolic Networks: The Core Escherichia Coli Metabolic Model as an Educational Guide.” EcoSal Plus 4, no. 1 (February 1, 2010). https://doi.org/10.1128/ecosalplus.10.2.1.
