# Workflow for construction of ECM_eco1515

In [1]:
import sys
sys.path.append(r'./')
from COBRApy_ECM_funcation import *

Input files and parameters.

In [2]:
modelname = 'iML1515.xml'  # Genome-scale metabolic model (GEM) used to construct the enzyme-constrained model.
ptot = 0.56  # Total protein fraction in the cell.
sigma = 0.5  # Approximate average saturation of enzymes.
lowerbound = 0  # Lower bound of the enzyme concentration constraint.
ID_kcat = "./ID_kcat_file.csv"  # Reaction kcat file, e.g. AADDGT,49389.2889
gene_subunit = "./gene_subunit_file.csv"  # Gene subunit file, e.g. b0002,thrA,4
subunit_molecular_weight = "./subunit_molecular_weight_file.csv"  # Subunit molecular weight file, e.g. b0001,thrL,2.13846
gene_abundance = "./gene_abundance_file.csv"  # Gene abundance file, e.g. b0789,1.1
model = cobra.io.read_sbml_model(modelname)
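
The row layouts of the four input files can be illustrated with the example rows quoted in the comments above. This is a minimal sketch: the column names are hypothetical labels chosen for readability, since the actual headers of the files are not shown here.

```python
import csv
import io

# Example rows copied from the comments in the cell above. The column names
# are hypothetical; the real files may use different headers or none at all.
EXAMPLES = {
    "ID_kcat": ("AADDGT,49389.2889", ["reaction_id", "kcat"]),
    "gene_subunit": ("b0002,thrA,4", ["gene_id", "gene_name", "subunit_number"]),
    "subunit_molecular_weight": ("b0001,thrL,2.13846", ["gene_id", "gene_name", "mw"]),
    "gene_abundance": ("b0789,1.1", ["gene_id", "abundance"]),
}


def parse_row(text, columns):
    """Parse one comma-separated line into a dict keyed by column name."""
    reader = csv.DictReader(io.StringIO(text), fieldnames=columns)
    return next(reader)


parsed = {name: parse_row(row, cols) for name, (row, cols) in EXAMPLES.items()}

# Numeric fields arrive as strings and need explicit conversion.
kcat = float(parsed["ID_kcat"]["kcat"])
subunits = int(parsed["gene_subunit"]["subunit_number"])
print(parsed["ID_kcat"]["reaction_id"], kcat, subunits)
```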

Step 1: Preprocessing of the model

Each reversible reaction in the GEM is split into two irreversible reactions. The input is iML1515 with 2712 reactions; the output is a model with 3375 irreversible reactions.

In [3]:
irreversible_model(model)

| | |
|---|---|
| Name | iML1515 |
| Memory address | 0x07fc155664da0 |
| Number of metabolites | 1877 |
| Number of reactions | 3375 |
| Objective expression | `-1.0*BIOMASS_Ec_iML1515_core_75p37M_reverse_35685 + 1.0*BIOMASS_Ec_iML1515_core_75p37M` |
| Compartments | periplasm, extracellular space, cytosol |
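
The splitting step can be sketched on a toy network of plain dictionaries. This is not the implementation of `irreversible_model`; the function `split_reversible`, the data layout, and the bound handling are assumptions made for illustration. Only the `_reverse` suffix is taken from the reaction IDs that appear in this notebook.

```python
def split_reversible(reactions):
    """Split every reaction that can carry negative flux into two halves.

    `reactions` maps a reaction id to (stoichiometry, lower_bound, upper_bound),
    where stoichiometry maps a metabolite id to its coefficient.
    Hypothetical helper, not part of COBRApy or of the notebook's module.
    """
    irreversible = {}
    for rxn_id, (stoich, lb, ub) in reactions.items():
        if lb < 0:
            # Forward half keeps the original direction with non-negative bounds.
            irreversible[rxn_id] = (dict(stoich), 0.0, max(ub, 0.0))
            # Reverse half flips the stoichiometry and mirrors the bounds.
            flipped = {met: -coef for met, coef in stoich.items()}
            irreversible[rxn_id + "_reverse"] = (flipped, max(-ub, 0.0), -lb)
        else:
            irreversible[rxn_id] = (dict(stoich), lb, ub)
    return irreversible


toy = {
    "R1": ({"A": -1, "B": 1}, -1000.0, 1000.0),  # reversible
    "R2": ({"B": -1, "C": 1}, 0.0, 1000.0),      # already irreversible
}
split = split_reversible(toy)
print(sorted(split))
```

Two toy reactions become three because only `R1` is reversible, which mirrors how the reaction count grows from 2712 to 3375 for iML1515.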


Step 2: Retrieving enzyme kinetics and proteomics data

The inputs are the GEM and 'ID_kcat_file.csv'. The outputs are the 'genes' data (all genes in iML1515, used to calculate f) and the 'ID_GPR' data (1767 reaction IDs with their kcat values and GPR relationships, used to calculate the molecular weight of each enzyme).

In [4]:
genes = get_genes_and_GPR(model,ID_kcat)

Calculating the molecular weight (MW) of each enzyme. The inputs are the 'ID_GPR' data, 'gene_subunit_file.csv' and 'subunit_molecular_weight_file.csv'. The output is the 'ID_MW' data, which contains the 1767 reaction IDs and the molecular weight information for each reaction.

In [5]:
ID_GPR = "./ID_GPR_file_new.csv"  # A modified file is re-entered here because we corrected the gene_reaction_rule of a small number of reactions in iML1515 (see Supplementary Table S1 for details). This step can be skipped when the GPR relationships need no correction.
ID_MW = calculate_MW(ID_GPR,gene_subunit,subunit_molecular_weight)
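
One plausible way to combine the GPR rule, the subunit numbers and the subunit weights is sketched below. It is an assumption that `calculate_MW` works this way: the sketch treats genes joined by `and` as subunits of one complex, sums subunit number times subunit weight, and returns one weight per `or`-separated isozyme. The gene IDs and all numbers are made up, and the parser only handles rules already written as "(a and b) or c".

```python
def complex_weight(gene_ids, subunit_number, subunit_mw):
    """Sum subunit_number * subunit_mw over the genes of one enzyme complex."""
    return sum(subunit_number[g] * subunit_mw[g] for g in gene_ids)


def isozyme_weights(gpr, subunit_number, subunit_mw):
    """Return one molecular weight per isozyme of a simple GPR rule.

    Hypothetical helper: only handles rules of the form '(a and b) or c'.
    """
    weights = []
    for isozyme in gpr.split(" or "):
        genes = [g.strip(" ()") for g in isozyme.strip(" ()").split(" and ")]
        weights.append(complex_weight(genes, subunit_number, subunit_mw))
    return weights


# Made-up gene ids, subunit numbers and weights, for illustration only.
subunit_number = {"gA": 2, "gB": 1, "gC": 4}
subunit_mw = {"gA": 50.0, "gB": 30.0, "gC": 25.0}
weights = isozyme_weights("(gA and gB) or gC", subunit_number, subunit_mw)
print(weights)
```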

Calculating kcat/MW. The inputs are the 'ID_kcat' and 'ID_MW' data. When a reaction is catalyzed by several isozymes, the maximum kcat/MW is retained.

In [6]:
ID_kcat_MW = calculate_kcat_MW(ID_kcat,ID_MW) 
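
The rule of keeping the maximum over isozymes can be sketched as follows, assuming one kcat per reaction and one molecular weight per isozyme. The function name and the numbers are hypothetical and do not come from the input files.

```python
def max_kcat_per_mw(kcat, isozyme_mws):
    """Return the largest kcat/MW ratio over all isozymes of one reaction."""
    return max(kcat / mw for mw in isozyme_mws)


# Made-up values: one kcat and two isozyme weights for a single reaction.
ratio = max_kcat_per_mw(1300.0, [130.0, 100.0])
print(ratio)
```

With a single kcat per reaction, the retained value corresponds to the lightest isozyme, i.e. the cheapest enzyme per unit of flux.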

Calculating f. The inputs are the 'genes' data, 'gene_abundance_file.csv' and 'subunit_molecular_weight_file.csv'.

In [7]:
f = calculate_f(genes,gene_abundance,subunit_molecular_weight)
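
A minimal sketch of f, under the assumption that it is the mass fraction of the proteome accounted for by genes in the model: abundance times molecular weight summed over model genes, divided by the same sum over all measured genes. Whether `calculate_f` uses exactly this formula is not confirmed here, and the gene IDs and numbers below are made up.

```python
def enzyme_mass_fraction(model_genes, abundance, subunit_mw):
    """Mass fraction of measured protein that belongs to genes in the model.

    Hypothetical helper; genes lacking an abundance or a weight are skipped.
    """
    def mass(gene):
        return abundance[gene] * subunit_mw[gene]

    measured = [g for g in abundance if g in subunit_mw]
    total = sum(mass(g) for g in measured)
    in_model = sum(mass(g) for g in measured if g in model_genes)
    return in_model / total


# Made-up abundances and molecular weights, for illustration only.
abundance = {"gA": 2.0, "gB": 1.0, "gC": 1.0}
subunit_mw = {"gA": 50.0, "gB": 100.0, "gC": 200.0}
f_example = enzyme_mass_fraction({"gA", "gB"}, abundance, subunit_mw)
print(f_example)
```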

Step 3: Introducing the enzyme concentration constraint with COBRApy

The inputs are the GEM, the 'ID_kcat_MW' data, and the lower and upper bounds of the enzyme concentration constraint. The output is an enzyme-constrained model.

In [8]:
upperbound = round(ptot * f * sigma, 3)  # Upper bound of the enzyme concentration constraint, computed from the values obtained above.
model = set_enzyme_constraint(model, ID_kcat_MW, lowerbound, upperbound)
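
The effect of the constraint can be sketched as a check on a flux distribution, assuming it has the form: the sum over reactions of flux divided by kcat/MW must lie between the lower and upper bound. That this is exactly what `set_enzyme_constraint` adds to the model is an assumption. `ptot` and `sigma` are the values used in this notebook; the value of f, the fluxes and the kcat/MW ratios are made up.

```python
ptot = 0.56      # total protein fraction, as set in this notebook
sigma = 0.5      # average enzyme saturation, as set in this notebook
f_example = 0.4  # HYPOTHETICAL value; the notebook computes f from proteomics data
upperbound_example = round(ptot * f_example * sigma, 3)


def enzyme_usage(fluxes, kcat_mw):
    """Total enzyme cost: sum of flux / (kcat/MW) over reactions with a ratio."""
    return sum(v / kcat_mw[r] for r, v in fluxes.items() if r in kcat_mw)


# Made-up fluxes and kcat/MW ratios for two reactions.
fluxes = {"R1": 10.0, "R2": 5.0}
kcat_mw = {"R1": 200.0, "R2": 100.0}
usage = enzyme_usage(fluxes, kcat_mw)
feasible = 0 <= usage <= upperbound_example
print(upperbound_example, usage, feasible)
```

Reactions without a kcat/MW entry contribute nothing to the sum in this sketch, so they remain unconstrained by the enzyme budget.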

# Results

Simulating overflow metabolism of E. coli

In [9]:
overflow_result = pd.DataFrame()
# Scan growth rates from 0.1 to 0.7 in steps of 0.1. Integer steps avoid the
# accumulated floating-point error of repeated `i = i + 0.1` in the row labels.
for step in range(1, 8):
    i = round(step * 0.1, 1)
    with model as overflow_model:
        # Close the split-off reverse glucose exchange; uptake goes through EX_glc__D_e as negative flux.
        overflow_model.reactions.get_by_id('EX_glc__D_e_reverse').bounds = (0.0, 0.0)
        overflow_model.reactions.get_by_id('EX_glc__D_e').bounds = (-1000.0, 0.0)
        # Fix the growth rate, then maximize EX_glc__D_e, i.e. minimize glucose uptake.
        overflow_model.reactions.get_by_id('BIOMASS_Ec_iML1515_core_75p37M').bounds = (i, i)
        overflow_model.objective = 'EX_glc__D_e'
        pfba_solution = cobra.flux_analysis.pfba(overflow_model)
        overflow_result.loc[i, 'glucose'] = str(-pfba_solution.fluxes['EX_glc__D_e'])
        overflow_result.loc[i, 'ac'] = str(pfba_solution.fluxes['EX_ac_e'])
        overflow_result.loc[i, 'o2_reverse'] = str(pfba_solution.fluxes['EX_o2_e_reverse'])
overflow_result.to_csv("./pfba_overflow_result.csv", sep=',')

Predicting the maximum growth rate of E. coli on different carbon sources

In [10]:
subs = ['EX_acgam_e','EX_ac_e','EX_akg_e','EX_ala__L_e','EX_fru_e','EX_fum_e','EX_g6p_e','EX_gal_e','EX_gam_e','EX_glcn_e','EX_glc__D_e','EX_glyc_e','EX_gsn_e','EX_lac__L_e','EX_malt_e','EX_mal__L_e','EX_man_e','EX_mnl_e','EX_pyr_e','EX_rib__D_e','EX_sbt__D_e','EX_succ_e','EX_tre_e','EX_xyl__D_e']
growth = pd.DataFrame()
# Exchange reactions that are closed in every simulation.
blocked = ['EX_dha_e', 'EX_pyr_e', 'EX_5dglcn_e', 'EX_xan_e', 'EX_fum_e',
           'EX_succ_e', 'EX_for_e', 'EX_glcn_e', 'EX_glc__D_e_reverse']
for sub in subs:
    with model as growth_model:
        for rxn_id in blocked:
            growth_model.reactions.get_by_id(rxn_id).bounds = (0.0, 0.0)
        # Open the tested carbon source last so that it overrides any block set above.
        growth_model.reactions.get_by_id(sub).bounds = (-1000.0, 0.0)
        pfba_solution = cobra.flux_analysis.pfba(growth_model)
        growth.loc[sub, 'pfba_flux'] = str(pfba_solution.fluxes['BIOMASS_Ec_iML1515_core_75p37M'])
        growth.loc[sub, 'sub_flux'] = str(pfba_solution.fluxes[sub])
growth.to_csv("./growth_pfba.csv", sep=',')