# Exploring the *E. coli* Genome-Scale Metabolic Model

This notebook walks through inspecting the *Escherichia coli* genome-scale metabolic model (*e_coli_core*) with COBRApy. Genome-scale metabolic models (GSMMs) are detailed reconstructions of metabolic networks that are used to understand cellular metabolism, predict phenotypic behavior, and explore metabolic engineering strategies.

In this workflow, we will:

- Load and examine the *E. coli* metabolic model: explore its structure, including reactions, metabolites, and genes.

#### Loading and Reading the Model

In [2]:
# Importing the COBRApy library, which is widely used for working with genome-scale metabolic models.
import cobra

# Importing the function for reading SBML (Systems Biology Markup Language) files.
# SBML is a standard format for representing metabolic models.
from cobra.io import read_sbml_model

# Loading the E. coli metabolic model from an SBML file.
# The path below is specific to one machine; replace it with the location of 'e_coli_core.xml' on your system.
model = read_sbml_model('C:/Users/User/Downloads/e_coli_core.xml')

# The 'model' object now contains the metabolic model, which can be analyzed and inspected.

#### Reactions, Metabolites, and Genes Attributes

The reactions, metabolites, and genes attributes of a COBRApy model are each a special type of list called a <mark>cobra.DictList</mark>, which supports lookup by position and by ID. They hold <mark>cobra.Reaction</mark>, <mark>cobra.Metabolite</mark>, and <mark>cobra.Gene</mark> objects, respectively.
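To make the idea of a list that also supports lookup by ID concrete, here is a minimal pure-Python sketch. `MiniDictList` and `Item` are hypothetical names invented for this illustration; this is not how COBRApy implements `cobra.DictList`, only a stand-in that mimics the two access patterns used in this notebook.

```python
# Illustrative stand-in for a DictList-style container; NOT COBRApy's implementation.
class MiniDictList(list):
    """A list whose items carry an `id` attribute and can be fetched by that ID."""

    def __init__(self, items=()):
        super().__init__(items)
        # Map each item's ID to its position. The map is built once,
        # so this sketch does not track items appended later.
        self._index = {item.id: pos for pos, item in enumerate(self)}

    def get_by_id(self, item_id):
        """Return the item with the given ID, or raise KeyError if absent."""
        return self[self._index[item_id]]


class Item:
    """Hypothetical placeholder for a Reaction, Metabolite, or Gene."""

    def __init__(self, id):
        self.id = id


reactions = MiniDictList([Item("PFK"), Item("PFL"), Item("PGI")])
print(len(reactions))                 # 3
print(reactions[1].id)                # PFL (access by position)
print(reactions.get_by_id("PGI").id)  # PGI (access by ID)
```

The real `cobra.DictList` additionally allows attribute-style access, as in `model.reactions.EX_glc__D_e`, which this sketch leaves out.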

In [3]:
# Investigating the length of model attributes

print(len(model.reactions))
print(len(model.metabolites))
print(len(model.genes))

95
72
137


In [4]:
# Investigating model properties

model

0,1
Name,e_coli_core
Memory address,21d16aa68c8
Number of metabolites,72
Number of reactions,95
Number of genes,137
Number of groups,0
Objective expression,1.0*BIOMASS_Ecoli_core_w_GAM - 1.0*BIOMASS_Ecoli_core_w_GAM_reverse_712e5
Compartments,"extracellular space, cytosol"


In [5]:
# List of reactions

model.reactions

[<Reaction PFK at 0x21d25693b48>,
 <Reaction PFL at 0x21d25693b08>,
 <Reaction PGI at 0x21d25697f08>,
 <Reaction PGK at 0x21d256a3e08>,
 <Reaction PGL at 0x21d256a3e88>,
 <Reaction ACALD at 0x21d256a30c8>,
 <Reaction AKGt2r at 0x21d256a8548>,
 <Reaction PGM at 0x21d256a3448>,
 <Reaction PIt2r at 0x21d256aa888>,
 <Reaction ALCD2x at 0x21d256b3f88>,
 <Reaction ACALDt at 0x21d256b3f48>,
 <Reaction ACKr at 0x21d256bb788>,
 <Reaction PPC at 0x21d256bb808>,
 <Reaction ACONTa at 0x21d256bef88>,
 <Reaction ACONTb at 0x21d256bb748>,
 <Reaction ATPM at 0x21d256bef48>,
 <Reaction PPCK at 0x21d256c7048>,
 <Reaction ACt2r at 0x21d256c5e08>,
 <Reaction PPS at 0x21d256c52c8>,
 <Reaction ADK1 at 0x21d256cff08>,
 <Reaction AKGDH at 0x21d256cfe08>,
 <Reaction ATPS4r at 0x21d256d5dc8>,
 <Reaction PTAr at 0x21d256dbdc8>,
 <Reaction PYK at 0x21d256e56c8>,
 <Reaction BIOMASS_Ecoli_core_w_GAM at 0x21d256e24c8>,
 <Reaction PYRt2 at 0x21d256e2548>,
 <Reaction CO2t at 0x21d256d5c88>,
 <Reaction RPE at 0x21d256e

In [6]:
# List of metabolites

model.metabolites

[<Metabolite glc__D_e at 0x21d2460e108>,
 <Metabolite gln__L_c at 0x21d2460e0c8>,
 <Metabolite gln__L_e at 0x21d2460ea08>,
 <Metabolite glu__L_c at 0x21d24610908>,
 <Metabolite glu__L_e at 0x21d2461d588>,
 <Metabolite glx_c at 0x21d2461d7c8>,
 <Metabolite h2o_c at 0x21d2461f308>,
 <Metabolite h2o_e at 0x21d2461f408>,
 <Metabolite h_c at 0x21d2461fc88>,
 <Metabolite h_e at 0x21d24621f08>,
 <Metabolite icit_c at 0x21d2461e5c8>,
 <Metabolite lac__D_c at 0x21d24623208>,
 <Metabolite lac__D_e at 0x21d24625848>,
 <Metabolite mal__L_c at 0x21d24625f08>,
 <Metabolite mal__L_e at 0x21d24626748>,
 <Metabolite nad_c at 0x21d24626f48>,
 <Metabolite nadh_c at 0x21d24627748>,
 <Metabolite nadp_c at 0x21d24627ec8>,
 <Metabolite nadph_c at 0x21d2462a888>,
 <Metabolite nh4_c at 0x21d2462b0c8>,
 <Metabolite 13dpg_c at 0x21d2462bac8>,
 <Metabolite nh4_e at 0x21d2462e348>,
 <Metabolite o2_c at 0x21d2462e548>,
 <Metabolite 2pg_c at 0x21d2462da08>,
 <Metabolite o2_e at 0x21d2462dc48>,
 <Metabolite 3pg_c at 

In [7]:
# List of genes

model.genes

[<Gene b1241 at 0x21d25637508>,
 <Gene b0351 at 0x21d25637548>,
 <Gene s0001 at 0x21d25637648>,
 <Gene b1849 at 0x21d25637bc8>,
 <Gene b3115 at 0x21d256400c8>,
 <Gene b2296 at 0x21d25640208>,
 <Gene b1276 at 0x21d256406c8>,
 <Gene b0118 at 0x21d25640c88>,
 <Gene b0474 at 0x21d256411c8>,
 <Gene b0116 at 0x21d25641708>,
 <Gene b0727 at 0x21d25640708>,
 <Gene b0726 at 0x21d25645208>,
 <Gene b2587 at 0x21d256457c8>,
 <Gene b0356 at 0x21d25645d08>,
 <Gene b1478 at 0x21d256462c8>,
 <Gene b3734 at 0x21d25646848>,
 <Gene b3733 at 0x21d25646dc8>,
 <Gene b3736 at 0x21d25648388>,
 <Gene b3737 at 0x21d25648948>,
 <Gene b3739 at 0x21d25648f08>,
 <Gene b3738 at 0x21d2564a508>,
 <Gene b3735 at 0x21d2564aac8>,
 <Gene b3731 at 0x21d2564c088>,
 <Gene b3732 at 0x21d2564c648>,
 <Gene b0720 at 0x21d2564cc08>,
 <Gene b0733 at 0x21d2564d208>,
 <Gene b0734 at 0x21d23b7dd88>,
 <Gene b0979 at 0x21d2564dd48>,
 <Gene b0978 at 0x21d2564e2c8>,
 <Gene b3603 at 0x21d2564e808>,
 <Gene b2975 at 0x21d2564edc8>,
 <Gene b

In [8]:
# Investigating properties of a reaction by Index

model.reactions[5]

0,1
Reaction identifier,ACALD
Name,Acetaldehyde dehydrogenase (acetylating)
Memory address,0x21d256a30c8
Stoichiometry,acald_c + coa_c + nad_c <=> accoa_c + h_c + nadh_c  Acetaldehyde + Coenzyme A + Nicotinamide adenine dinucleotide <=> Acetyl-CoA + H+ + Nicotinamide adenine dinucleotide - reduced
GPR,b0351 or b1241
Lower bound,-1000.0
Upper bound,1000.0


In [9]:
# Investigating properties of a metabolite by ID

model.metabolites.get_by_id("co2_e")

0,1
Metabolite identifier,co2_e
Name,CO2 CO2
Memory address,0x21d25620d08
Formula,CO2
Compartment,e
In 2 reaction(s),"CO2t, EX_co2_e"


### Reaction Bounds

Reaction bounds refer to the constraints on the allowable flux values (reaction rates) of a given metabolic reaction. These bounds define the range within which the flux can vary during simulations, such as Flux Balance Analysis (FBA).

#### Types of Bounds

1) Lower Bound:

- Represents the minimum allowable flux for the reaction.
- Determines whether the reaction can proceed in the reverse direction.
- A negative lower bound allows flux in the reverse direction; combined with a positive upper bound, the reaction is reversible.
- A lower bound of 0 (with a positive upper bound) means the reaction is irreversible and runs only in the forward direction.

2) Upper Bound:

- Represents the maximum allowable flux for the reaction.
- A large upper bound (1000.0 in this model) or an infinite one means the forward flux is effectively unconstrained.
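The rules above can be sketched as a small helper that maps a `(lower_bound, upper_bound)` pair to the directions it allows. `classify_direction` is a hypothetical function written for this illustration, not part of COBRApy; the example bounds are the ones reported for reactions in this notebook.

```python
def classify_direction(lower_bound, upper_bound):
    """Describe which flux directions a pair of bounds allows."""
    if lower_bound > upper_bound:
        raise ValueError("lower bound must not exceed upper bound")
    if lower_bound < 0 < upper_bound:
        return "reversible"       # flux may be negative or positive
    if lower_bound >= 0 and upper_bound > 0:
        return "forward only"     # negative flux is excluded
    if lower_bound < 0 and upper_bound <= 0:
        return "reverse only"     # positive flux is excluded
    return "blocked"              # both bounds are zero


print(classify_direction(-1000.0, 1000.0))  # reversible   (e.g. ACALD)
print(classify_direction(0.0, 1000.0))      # forward only (e.g. PDH)
print(classify_direction(-10.0, 1000.0))    # reversible   (e.g. EX_glc__D_e)
```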

In [10]:
# Bounds of a reaction by ID

model.reactions.EX_glc__D_e.bounds

(-10.0, 1000.0)

In [11]:
# List of all the IDs of the reaction list

for r in model.reactions:
    print(r.id)

PFK
PFL
PGI
PGK
PGL
ACALD
AKGt2r
PGM
PIt2r
ALCD2x
ACALDt
ACKr
PPC
ACONTa
ACONTb
ATPM
PPCK
ACt2r
PPS
ADK1
AKGDH
ATPS4r
PTAr
PYK
BIOMASS_Ecoli_core_w_GAM
PYRt2
CO2t
RPE
CS
RPI
SUCCt2_2
CYTBD
D_LACt2
ENO
SUCCt3
ETOHt2r
SUCDi
SUCOAS
TALA
THD2
TKT1
TKT2
TPI
EX_ac_e
EX_acald_e
EX_akg_e
EX_co2_e
EX_etoh_e
EX_for_e
EX_fru_e
EX_fum_e
EX_glc__D_e
EX_gln__L_e
EX_glu__L_e
EX_h_e
EX_h2o_e
EX_lac__D_e
EX_mal__L_e
EX_nh4_e
EX_o2_e
EX_pi_e
EX_pyr_e
EX_succ_e
FBA
FBP
FORt2
FORt
FRD7
FRUpts2
FUM
FUMt2_2
G6PDH2r
GAPD
GLCpts
GLNS
GLNabc
GLUDy
GLUN
GLUSy
GLUt2r
GND
H2Ot
ICDHyr
ICL
LDH_D
MALS
MALt2_2
MDH
ME1
ME2
NADH16
NADTRHD
NH4t
O2t
PDH


In [12]:
# Retrieving a reaction by ID

model.reactions.get_by_id("PDH")

0,1
Reaction identifier,PDH
Name,Pyruvate dehydrogenase
Memory address,0x21d257ab708
Stoichiometry,coa_c + nad_c + pyr_c --> accoa_c + co2_c + nadh_c  Coenzyme A + Nicotinamide adenine dinucleotide + Pyruvate --> Acetyl-CoA + CO2 CO2 + Nicotinamide adenine dinucleotide - reduced
GPR,b0114 and b0115 and b0116
Lower bound,0.0
Upper bound,1000.0


In [13]:
# Retrieving reaction by ID and storing in a variable

oxygen = model.reactions.get_by_id("O2t")
oxygen

0,1
Reaction identifier,O2t
Name,O2 transport diffusion
Memory address,0x21d257ab848
Stoichiometry,o2_e <=> o2_c  O2 O2 <=> O2 O2
GPR,s0001
Lower bound,-1000.0
Upper bound,1000.0


In [14]:
# Calling upper bound specifically

oxygen.upper_bound

1000.0

In [15]:
# List of all the IDs of the metabolite list

for m in model.metabolites:
    print(m.id)

glc__D_e
gln__L_c
gln__L_e
glu__L_c
glu__L_e
glx_c
h2o_c
h2o_e
h_c
h_e
icit_c
lac__D_c
lac__D_e
mal__L_c
mal__L_e
nad_c
nadh_c
nadp_c
nadph_c
nh4_c
13dpg_c
nh4_e
o2_c
2pg_c
o2_e
3pg_c
oaa_c
pep_c
6pgc_c
pi_c
6pgl_c
pi_e
ac_c
pyr_c
pyr_e
q8_c
q8h2_c
r5p_c
ru5p__D_c
ac_e
acald_c
s7p_c
acald_e
accoa_c
succ_c
succ_e
succoa_c
acon_C_c
xu5p__D_c
actp_c
adp_c
akg_c
akg_e
amp_c
atp_c
cit_c
co2_c
co2_e
coa_c
dhap_c
e4p_c
etoh_c
etoh_e
f6p_c
fdp_c
for_c
for_e
fru_e
fum_c
fum_e
g3p_c
g6p_c


In [16]:
# Retrieving metabolite by ID, storing in a variable and investigating attributes

glucm = model.metabolites.get_by_id('glc__D_e')

In [17]:
glucm.name

'D-Glucose'

In [18]:
glucm.compartment

'e'

In [19]:
glucm.charge

0

In [20]:
# Investigating the reactions a metabolite is involved in

glucm.reactions

frozenset({<Reaction EX_glc__D_e at 0x21d2572bc48>,
           <Reaction GLCpts at 0x21d2575c288>})

In [21]:
glucm.formula

'C6H12O6'

In [22]:
# IDs of the reactions the metabolite is involved in

for r in glucm.reactions:
    print(r.id)

GLCpts
EX_glc__D_e


In [23]:
# Retrieving the genes associated with a reaction

reaction1 = model.reactions.get_by_id('ACALD')
reaction1.genes

frozenset({<Gene b0351 at 0x21d25637548>, <Gene b1241 at 0x21d25637508>})
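The GPR (gene-protein-reaction) rule reported for ACALD, `b0351 or b1241`, links these two genes to the reaction: with `or`, either gene product is sufficient, whereas with `and` (as in the PDH rule `b0114 and b0115 and b0116`) all of them are required. The following is a minimal sketch of how such a rule could be evaluated for a set of inactive genes. `gpr_is_active` is a hypothetical helper for illustration, not COBRApy's own evaluator, and it assumes gene IDs are valid Python identifiers.

```python
import ast


def gpr_is_active(rule, inactive_genes):
    """Return True if the and/or gene rule is satisfied given the inactive genes."""

    def evaluate(node):
        if isinstance(node, ast.BoolOp):
            results = [evaluate(value) for value in node.values]
            # 'and' needs every operand, 'or' needs at least one.
            return all(results) if isinstance(node.op, ast.And) else any(results)
        if isinstance(node, ast.Name):
            # A gene counts as functional unless it is listed as inactive.
            return node.id not in inactive_genes
        raise ValueError(f"Unsupported element in rule: {ast.dump(node)}")

    return evaluate(ast.parse(rule, mode="eval").body)


print(gpr_is_active("b0351 or b1241", {"b0351"}))             # True
print(gpr_is_active("b0351 or b1241", {"b0351", "b1241"}))    # False
print(gpr_is_active("b0114 and b0115 and b0116", {"b0115"}))  # False
```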

In [24]:
model.reactions.get_by_id('ACALD')

0,1
Reaction identifier,ACALD
Name,Acetaldehyde dehydrogenase (acetylating)
Memory address,0x21d256a30c8
Stoichiometry,acald_c + coa_c + nad_c <=> accoa_c + h_c + nadh_c  Acetaldehyde + Coenzyme A + Nicotinamide adenine dinucleotide <=> Acetyl-CoA + H+ + Nicotinamide adenine dinucleotide - reduced
GPR,b0351 or b1241
Lower bound,-1000.0
Upper bound,1000.0


In [25]:
# Checking the reversibility of a reaction
# A reaction is reversible if its lower bound is < 0 and its upper bound is > 0.

reaction1.reversibility

True

In [26]:
# Checking whether a reaction is mass balanced. An empty dict indicates that it is balanced.

reaction1.check_mass_balance()

{}
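To show what an element balance check involves, here is a minimal sketch that sums element counts over the ACALD stoichiometry. It is not COBRApy's implementation (which also accounts for charge). `parse_formula` and `mass_imbalance` are hypothetical helpers, and the formulas in `formulas` were typed in by hand as assumptions; they should be checked against each metabolite's `formula` attribute in the loaded model.

```python
import re
from collections import defaultdict


def parse_formula(formula):
    """Turn a formula such as 'C2H4O' into {'C': 2, 'H': 4, 'O': 1}."""
    counts = defaultdict(int)
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] += int(number) if number else 1
    return dict(counts)


def mass_imbalance(stoichiometry, formulas):
    """Return the net count of each unbalanced element; {} means balanced."""
    totals = defaultdict(int)
    for metabolite, coefficient in stoichiometry.items():
        for element, count in parse_formula(formulas[metabolite]).items():
            totals[element] += coefficient * count
    return {element: net for element, net in totals.items() if net != 0}


# Hand-entered formulas (assumptions; verify against the model).
formulas = {
    "acald_c": "C2H4O", "coa_c": "C21H32N7O16P3S", "nad_c": "C21H26N7O14P2",
    "accoa_c": "C23H34N7O17P3S", "h_c": "H", "nadh_c": "C21H27N7O14P2",
}

# ACALD: acald_c + coa_c + nad_c <=> accoa_c + h_c + nadh_c
# Substrates get negative coefficients, products positive ones.
acald = {"acald_c": -1, "coa_c": -1, "nad_c": -1,
         "accoa_c": 1, "h_c": 1, "nadh_c": 1}
print(mass_imbalance(acald, formulas))  # {}

# Dropping the proton leaves one hydrogen unaccounted for.
unbalanced = {k: v for k, v in acald.items() if k != "h_c"}
print(mass_imbalance(unbalanced, formulas))  # {'H': -1}
```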

In [27]:
# List of all the IDs of the gene list

for g in model.genes:
    print(g.id)

b1241
b0351
s0001
b1849
b3115
b2296
b1276
b0118
b0474
b0116
b0727
b0726
b2587
b0356
b1478
b3734
b3733
b3736
b3737
b3739
b3738
b3735
b3731
b3732
b0720
b0733
b0734
b0979
b0978
b3603
b2975
b2779
b2925
b1773
b2097
b3925
b4232
b2492
b0904
b4152
b4154
b4153
b4151
b1819
b1817
b2416
b2415
b1818
b1611
b4122
b1612
b3528
b1852
b1779
b1101
b2417
b1621
b1297
b3870
b0809
b0811
b0810
b1761
b1524
b0485
b1812
b3213
b3212
b4077
b2029
b0875
b1136
b4015
b1380
b2133
b4014
b2976
b3236
b1479
b2463
b2281
b2277
b2280
b2286
b2287
b2284
b2276
b2282
b2279
b2283
b2285
b2288
b2278
b1603
b3962
b1602
b0451
b0114
b0115
b3916
b1723
b3114
b2579
b3951
b0902
b3952
b0903
b4025
b2926
b0767
b3612
b4395
b0755
b3493
b2987
b3956
b3403
b1702
b2297
b2458
b1676
b1854
b3386
b4301
b2914
b4090
b0721
b0722
b0724
b0723
b0729
b0728
b0008
b2464
b2465
b2935
b3919


In [28]:
# Retrieving gene information by ID

model.genes.get_by_id('b1241')

0,1
Gene identifier,b1241
Name,adhE
Memory address,0x21d25637508
Functional,True
In 2 reaction(s),"ALCD2x, ACALD"
