# <span style="color: blue;">Part 2.3.1: Curation of model - Transport and catabolism of galactose</span>

In [256]:
import reframed
import pandas as pd
import copy

In [257]:
model = reframed.load_cbmodel('model_c_H10_part2_2_1.xml')

In [258]:
model_universe = reframed.load_cbmodel('bigg_universe.xml')

### <span style="color: blue;">False positive on galactose - Galactokinase (Ccel_3228/G_WP_015926586_1) (This will be handled along with the curation of the transporters) </span>

<span style="color: red;"> **Do something about this?** The protein for galactokinase is involved in two very similar reactions (probably duplicates of each other)</span>
- R_GALK2: M_a_gal__D_c + M_atp_c --> M_adp_c + M_gal1p_c + M_h_c: G_WP_015926586_1
- R_GALKr: M_atp_c + M_gal_c <-> M_adp_c + M_gal1p_c + M_h_c: G_WP_015926586_1 
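One way to see how close the two reactions are is to compare their stoichiometries entry by entry. The sketch below uses plain dictionaries copied from the two lines above; the function name `stoichiometry_difference` is made up for this illustration and is not part of reframed.

```python
# Stoichiometries copied from the two reactions listed above
# (negative = consumed, positive = produced).
R_GALK2 = {"M_a_gal__D_c": -1, "M_atp_c": -1, "M_adp_c": 1, "M_gal1p_c": 1, "M_h_c": 1}
R_GALKr = {"M_gal_c": -1, "M_atp_c": -1, "M_adp_c": 1, "M_gal1p_c": 1, "M_h_c": 1}


def stoichiometry_difference(stoich_a, stoich_b):
    """Return the entries of each stoichiometry that the other one does not share."""
    only_a = {m: c for m, c in stoich_a.items() if stoich_b.get(m) != c}
    only_b = {m: c for m, c in stoich_b.items() if stoich_a.get(m) != c}
    return only_a, only_b


only_galk2, only_galkr = stoichiometry_difference(R_GALK2, R_GALKr)
print(only_galk2)  # entries specific to R_GALK2
print(only_galkr)  # entries specific to R_GALKr
```

Apart from reversibility, the two reactions differ only in which galactose metabolite they consume (`M_a_gal__D_c` versus `M_gal_c`), which is consistent with them being near-duplicates.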

**False positive on galactose (This will be handled along with the curation of the transporters)**
- The model catabolizes galactose through the PTS system, thereby bypassing galactokinase.
    - R_GALpts: M_gal_e + M_pep_c <-> M_dgal6p_c + M_pyr_c: G_WP_015924346_1
    

## <span style="color: blue;"> B: Transport reactions - PTS and ABC</span>

The model contains several transport reactions for sugars. The question is whether they are correctly represented. According to Fosses et al. (2017), *R. cellulolyticum* has no genes for the PTS system, whereas ABC transporters are very active in this bacterium. Only the ABC transporters for xyloglucan oligosaccharides and for the degradation products of cellulose have been studied.


Fosses, A., Maté, M., Franche, N. et al. A seven-gene cluster in Ruminiclostridium cellulolyticum is essential for signalization, uptake and catabolism of the degradation products of cellulose hydrolysis. Biotechnol Biofuels 10, 250 (2017). https://doi.org/10.1186/s13068-017-0933-7

In [259]:
%store -r gene_protein_map

**Find all transport reactions for sugars**

In [260]:
def transport_rxn_dict(model):
    """Collect the transport reactions of each sugar of interest.

    Metabolites are matched by substring, and a reaction is appended once per
    matching metabolite, so the lists can contain duplicates (hence the
    set(...) calls when printing).
    """
    mets_short = ['arab__L_', 'glc__D_', 'cellb_', 'gal_', 'xyl__D_', 'man_']
    all_met_ids = list(model.metabolites)

    # One list of transport reactions per metabolite
    mets_transport_rxns = {met: [] for met in mets_short}
    for met_short in mets_short:
        # All metabolite ids that contain the short id (any compartment)
        mets_long = [s for s in all_met_ids if met_short in s]

        for met in mets_long:
            for met_rxn in model.get_metabolite_reactions(met):
                rxn = model.reactions[met_rxn]
                # Keep transport reactions only
                if rxn.reaction_type == reframed.ReactionType.TRANSPORT:
                    mets_transport_rxns[met_short].append(rxn)

    return mets_transport_rxns
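Because the function matches metabolites by substring, a short id such as `gal_` also picks up metabolites that are not plain galactose. The sketch below shows the effect on a handful of metabolite ids that occur in this notebook and contrasts it with a stricter match. The stricter rule assumes that ids have the form `M_<short id><one-letter compartment>`; both helper names are hypothetical.

```python
# Metabolite ids taken from reactions printed in this notebook.
met_ids = ["M_gal_e", "M_gal_c", "M_gal_p", "M_udpgal_c",
           "M_a_gal__D_c", "M_gal_bD_e", "M_galt_e"]


def substring_ids(met_ids, short):
    """Substring match, as used in transport_rxn_dict."""
    return [m for m in met_ids if short in m]


def strict_ids(met_ids, short):
    """Match only 'M_' + short id + a single compartment letter (assumption)."""
    prefix = "M_" + short
    return [m for m in met_ids
            if m.startswith(prefix) and len(m) == len(prefix) + 1]


print(substring_ids(met_ids, "gal_"))
print(strict_ids(met_ids, "gal_"))
```

Whether the broader match is wanted depends on the aim: it also reports transporters of related forms such as `M_gal_bD_e`, which can be useful when screening, while the strict match reports only the three compartment copies of `M_gal`.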

In [261]:
mets_transport_rxns = transport_rxn_dict(model)

**Transport reactions for arabinose - All proteins are related to ABC transporters**

In [262]:
for rxn in set(mets_transport_rxns['arab__L_']):
    print(str(rxn) + ": " + str(rxn.gpr))

R_ARBabcpp: M_arab__L_p + M_atp_c + M_h2o_c --> M_adp_c + M_arab__L_c + M_h_c + M_pi_c: None
R_ARBabc: M_arab__L_e + M_atp_c + M_h2o_c --> M_adp_c + M_arab__L_c + M_h_c + M_pi_c: (G_WP_015925555_1 and G_WP_015925556_1)
R_ARBt2r: M_arab__L_e + M_h_e --> M_arab__L_c + M_h_c: G_WP_041706401_1
R_ARBt3ipp: M_arab__L_c + M_h_p --> M_arab__L_p + M_h_c: G_WP_015924438_1


***G_WP_041706401_1*** is related to an ABC transporter according to the [annotation in NCBI](https://www.ncbi.nlm.nih.gov/protein/WP_041706401.1?report=genpept#locus_754061182). 

Looking at the annotation data from CarveMe, we see that the protein has similarity with P_CD630_25490 from *Clostridioides difficile* (strain 630) (Peptoclostridium difficile). In [UniProt, however, it is annotated as part of an ABC transporter](https://www.uniprot.org/uniprotkb/Q182N6/entry), indicating that the annotation in the *C. difficile* model is also incorrect.


In [263]:
gene_protein_map[gene_protein_map['Cross-reference (RefSeq)']=='G_WP_041706401_1']

| | Entry | Entry name | Protein names | Gene names | Cross-reference (RefSeq) |
|---|---|---|---|---|---|


It is connected to many other reactions that are not ABC transporters. 

In [264]:
for rxn in model.gene_to_reaction_lookup()['G_WP_041706401_1']:
    print(str(model.reactions[rxn])+": "+str(model.reactions[rxn].gpr))

R_ACNAMt2: M_acnam_e + M_h_e --> M_acnam_c + M_h_c: G_WP_041706401_1
R_ARBt2r: M_arab__L_e + M_h_e --> M_arab__L_c + M_h_c: G_WP_041706401_1
R_CELBt2: M_cellb_e + M_h_e --> M_cellb_c + M_h_c: G_WP_041706401_1
R_GALt2_3: M_gal_bD_e + M_h_e --> M_gal_bD_c + M_h_c: G_WP_041706401_1
R_GLCAt2: M_glc__aD_e + M_h_e --> M_glc__aD_c + M_h_c: G_WP_041706401_1
R_MALTabc: M_atp_c + M_h2o_c + M_malt_e --> M_adp_c + M_h_c + M_malt_c + M_pi_c: (G_WP_015924545_1 or G_WP_015925560_1 or G_WP_015926270_1 or G_WP_242651738_1 or (G_WP_012634654_1 and G_WP_015926592_1) or (G_WP_015924545_1 and G_WP_015924646_1) or (G_WP_015924647_1 and G_WP_041706401_1 and G_WP_041707075_1))
R_MALTt2: M_h_e + M_malt_e --> M_h_c + M_malt_c: G_WP_041706401_1
R_MANt2: M_h_e + M_man_e --> M_h_c + M_man_c: (G_WP_015925129_1 or G_WP_041706401_1)
R_SBTt2: M_h_e + M_sbt__D_e --> M_h_c + M_sbt__D_c: G_WP_041706401_1
R_SUCRt2: M_h_e + M_sucr_e --> M_h_c + M_sucr_c: G_WP_041706401_1
R_XYLt2: M_h_e + M_xyl__D_e --> M_h_c + M_xyl__D_c: 

***G_WP_015924438_1*** seems to be related to a [sugar efflux transporter in *Klebsiella pneumoniae* subsp. *pneumoniae* (strain ATCC 700721 / MGH 78578)](https://www.uniprot.org/uniprotkb?query=KPN_01628).

In [265]:
gene_protein_map[gene_protein_map['Cross-reference (RefSeq)']=='G_WP_015924438_1']

| | Entry | Entry name | Protein names | Gene names | Cross-reference (RefSeq) |
|---|---|---|---|---|---|
| 991 | B8I8P7 | B8I8P7_RUMCH | Major facilitator superfamily MFS_1 | Ccel_0908 | G_WP_015924438_1 |


***G_WP_015925555_1*** Also seems to be related to an [ABC transporter from *B. subtilis*](https://www.uniprot.org/uniprotkb?query=BSU28750). 

In [266]:
gene_protein_map[gene_protein_map['Cross-reference (RefSeq)']=='G_WP_015925555_1']

| | Entry | Entry name | Protein names | Gene names | Cross-reference (RefSeq) |
|---|---|---|---|---|---|
| 1279 | B8I422 | B8I422_RUMCH | Binding-protein-dependent transport systems in... | Ccel_2110 | G_WP_015925555_1 |


***G_WP_015925556_1*** Also seems to be related to an [ABC transporter from *B. subtilis*](https://www.uniprot.org/uniprotkb?query=BSU28750). 

In [267]:
gene_protein_map[gene_protein_map['Cross-reference (RefSeq)']=='G_WP_015925556_1']

| | Entry | Entry name | Protein names | Gene names | Cross-reference (RefSeq) |
|---|---|---|---|---|---|
| 2191 | B8I423 | B8I423_RUMCH | Binding-protein-dependent transport systems in... | Ccel_2111 | G_WP_015925556_1 |


In [268]:
model.remove_reaction('R_ARBabcpp')

***Remove reactions associated with G_WP_041706401_1 that are only linked to this gene***

From the list above, these appear to be exclusively reactions that are not ABC transporters.

In [269]:
model.remove_reactions([rxn for rxn in model.gene_to_reaction_lookup()['G_WP_041706401_1'] if len(model.reactions[rxn].get_genes())==1])
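The filter in the cell above keeps every reaction whose GPR contains at least one other gene. A minimal sketch of that selection on plain gene sets is given below, using three of the reactions printed earlier for this gene; the function name `split_by_sole_gene` is hypothetical and the dictionary is only a subset of the real lookup.

```python
gene = "G_WP_041706401_1"

# Gene sets of three reactions from the listing above (subset for illustration).
gpr_genes = {
    "R_ARBt2r": {"G_WP_041706401_1"},
    "R_CELBt2": {"G_WP_041706401_1"},
    "R_MANt2": {"G_WP_015925129_1", "G_WP_041706401_1"},
}


def split_by_sole_gene(gpr_genes, gene):
    """Separate reactions that depend only on `gene` from those with other genes."""
    removed = sorted(r for r, genes in gpr_genes.items() if genes == {gene})
    kept = sorted(r for r, genes in gpr_genes.items()
                  if gene in genes and genes != {gene})
    return removed, kept


removed, kept = split_by_sole_gene(gpr_genes, gene)
print("removed:", removed)
print("kept:", kept)
```

Note that a kept reaction such as `R_MANt2` still lists the gene in its GPR, so the questionable annotation is not fully gone from the model after this step.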

In [270]:
model.update()

In [271]:
mets_transport_rxns = transport_rxn_dict(model)

**Transport reactions for glucose**

In [272]:
for rxn in set(mets_transport_rxns['glc__D_']):
    print(str(rxn) + ": " + str(rxn.gpr))

R_GLCabcpp: M_atp_c + M_glc__D_p + M_h2o_c --> M_adp_c + M_glc__D_c + M_h_c + M_pi_c: (G_WP_015924906_1 or (G_WP_015924534_1 and G_WP_015926593_1) or (G_WP_015924592_1 and G_WP_015924906_1))
R_GLCabc: M_atp_c + M_glc__D_e + M_h2o_c --> M_adp_c + M_glc__D_c + M_h_c + M_pi_c: (G_WP_242651738_1 or (G_WP_012634654_1 and G_WP_015926592_1) or (G_WP_015924545_1 and G_WP_015924646_1))
R_GLCt2: M_glc__D_e + M_h_e --> M_glc__D_c + M_h_c: G_WP_015925129_1


In [273]:
gene_protein_map[gene_protein_map['Cross-reference (RefSeq)']=='G_WP_015925129_1']

| | Entry | Entry name | Protein names | Gene names | Cross-reference (RefSeq) |
|---|---|---|---|---|---|
| 1260 | B8I2M0 | B8I2M0_RUMCH | Major facilitator superfamily MFS_1 | Ccel_1662 | G_WP_015925129_1 |


***Remove GLCabcpp***

In [274]:
model.remove_reaction('R_GLCabcpp')

In [275]:
model.update()

In [276]:
mets_transport_rxns = transport_rxn_dict(model)

**Transport reactions for cellobiose**

In [277]:
for rxn in set(mets_transport_rxns['cellb_']):
    print(str(rxn) + ": " + str(rxn.gpr))

R_CLBtex: M_cellb_e <-> M_cellb_p: None


Cellobiose transporter is included later. 

In [278]:
mets_transport_rxns = transport_rxn_dict(model)

**Transport reactions for galactose**

In [279]:
for rxn in set(mets_transport_rxns['gal_']):
    print(str(rxn) + ": " + str(rxn.gpr))

R_GALabcpp: M_atp_c + M_gal_p + M_h2o_c --> M_adp_c + M_gal_c + M_h_c + M_pi_c: (G_WP_015924906_1 or (G_WP_015924592_1 and G_WP_015924906_1) or (G_WP_015924730_1 and G_WP_174258523_1) or (G_WP_015924728_1 and G_WP_015924730_1 and G_WP_015924731_1) or (G_WP_015924728_1 and G_WP_015924730_1 and G_WP_015924731_1 and G_WP_174258523_1))
R_GALpts: M_gal_e + M_pep_c <-> M_dgal6p_c + M_pyr_c: G_WP_015924346_1


***Change GALabcpp to GALabc***

Take the GPR from the periplasmic ABC transporter (R_GALabcpp).

In [280]:
R_GALabc=copy.deepcopy(model_universe.reactions.R_GALabc)

In [281]:
R_GALabcpp_gpr = copy.deepcopy(model.reactions.R_GALabcpp.gpr)
R_GALabc.set_gpr_association(R_GALabcpp_gpr)
model.add_reaction(R_GALabc)

In [206]:
gene_protein_map[gene_protein_map['Cross-reference (RefSeq)']=='G_WP_015924346_1']

| | Entry | Entry name | Protein names | Gene names | Cross-reference (RefSeq) |
|---|---|---|---|---|---|
| 2263 | B8I8F1 | B8I8F1_RUMCH | Phosphocarrier, HPr family | Ccel_0806 | G_WP_015924346_1 |


In [282]:
model.update()

In [283]:
mets_transport_rxns = transport_rxn_dict(model)

**Transport reactions for xylose**

In [284]:
for rxn in set(mets_transport_rxns['xyl__D_']):
    print(str(rxn) + ": " + str(rxn.gpr))

R_XYLtex: M_xyl__D_e <-> M_xyl__D_p: None
R_XYLabc: M_atp_c + M_h2o_c + M_xyl__D_e --> M_adp_c + M_h_c + M_pi_c + M_xyl__D_c: ((G_WP_015925436_1 and G_WP_015925438_1) or (G_WP_015925230_1 and G_WP_015925231_1 and G_WP_015925233_1))
R_XYLabcpp: M_atp_c + M_h2o_c + M_xyl__D_p --> M_adp_c + M_h_c + M_pi_c + M_xyl__D_c: (G_WP_012634509_1 and G_WP_015926084_1 and G_WP_049756869_1)


***Remove R_XYLabcpp***

In [285]:
model.remove_reaction('R_XYLabcpp')

In [286]:
mets_transport_rxns = transport_rxn_dict(model)

**Transport reactions for mannose**



In [287]:
for rxn in set(mets_transport_rxns['man_']):
    print(str(rxn) + ": " + str(rxn.gpr))

R_MANt2: M_h_e + M_man_e --> M_h_c + M_man_c: (G_WP_015925129_1 or G_WP_041706401_1)


**All PTS-related reactions in the model, based on string matching**

Several metabolites are transported through the PTS system in the model. Nearly all of these reactions were included with the same gene, which encodes a protein in the HPr family. The HPr protein is only one of several components of the PTS system (McCoy et al. 2015), so its presence alone is probably not sufficient evidence for a functional PTS system. HPr is also not sugar-specific, which probably explains why it is associated with the transport of so many different metabolites.

McCoy JG, Levin EJ, Zhou M. Structural insight into the PTS sugar transporter EIIC. Biochim Biophys Acta. 2015 Mar;1850(3):577-85. doi: 10.1016/j.bbagen.2014.03.013. Epub 2014 Mar 20. PMID: 24657490; PMCID: PMC4169766.

In [288]:
[model.reactions[rxn] for rxn in model.reactions if "pts" in rxn]

[R_ACGApts: M_acgam_e + M_pep_c --> M_acgam6p_c + M_pyr_c,
 R_ACMANApts: M_acmana_e + M_pep_c --> M_acmanap_c + M_pyr_c,
 R_ARBTpts: M_arbt_e + M_pep_c --> M_arbt6p_c + M_pyr_c,
 R_ASCBpts: M_ascb__L_e + M_pep_c --> M_ascb6p_c + M_pyr_c,
 R_FRUpts: M_fru_e + M_pep_c --> M_f1p_c + M_pyr_c,
 R_FUCpts: M_fuc_e + M_pep_c --> M_fc1p_c + M_pyr_c,
 R_GALTpts: M_galt_e + M_pep_c --> M_galt1p_c + M_pyr_c,
 R_GALpts: M_gal_e + M_pep_c <-> M_dgal6p_c + M_pyr_c,
 R_GAMpts: M_gam_e + M_pep_c --> M_gam6p_c + M_pyr_c,
 R_MALTpts: M_malt_e + M_pep_c --> M_malt6p_c + M_pyr_c,
 R_MNLpts: M_mnl_e + M_pep_c --> M_mnl1p_c + M_pyr_c,
 R_SBTpts: M_pep_c + M_sbt__D_e --> M_pyr_c + M_sbt6p_c,
 R_SUCpts: M_pep_c + M_sucr_e --> M_pyr_c + M_suc6p_c,
 R_TREpts: M_pep_c + M_tre_e --> M_pyr_c + M_tre6p_c]

In [289]:
[model.reactions[rxn].gpr for rxn in model.reactions if "pts" in rxn]

[G_WP_015924346_1,
 G_WP_015924346_1,
 G_WP_015924346_1,
 G_WP_015924346_1,
 G_WP_015924346_1,
 None,
 G_WP_015924346_1,
 G_WP_015924346_1,
 G_WP_015924346_1,
 G_WP_015924346_1,
 G_WP_015924346_1,
 G_WP_015924346_1,
 G_WP_015924346_1,
 G_WP_015924346_1]
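The two listings above can be paired up to count how many PTS reactions rest on the single HPr gene. The sketch below copies the reaction ids and GPRs from the outputs and assumes that both listings are in the same order, which should hold because both cells iterate over `model.reactions` with the same filter.

```python
from collections import Counter

HPR = "G_WP_015924346_1"

# Reaction ids and GPRs copied from the two outputs above, in printed order.
pts_rxns = ["R_ACGApts", "R_ACMANApts", "R_ARBTpts", "R_ASCBpts", "R_FRUpts",
            "R_FUCpts", "R_GALTpts", "R_GALpts", "R_GAMpts", "R_MALTpts",
            "R_MNLpts", "R_SBTpts", "R_SUCpts", "R_TREpts"]
pts_gprs = [HPR] * 5 + [None] + [HPR] * 8

gpr_by_rxn = dict(zip(pts_rxns, pts_gprs))
gene_counts = Counter(gpr_by_rxn.values())
no_gpr = [rxn for rxn, gpr in gpr_by_rxn.items() if gpr is None]

print(gene_counts)
print("without GPR:", no_gpr)
```

Under that pairing, 13 of the 14 PTS reactions depend on the HPr gene alone and one has no gene association at all, so none of them is supported by a sugar-specific PTS component.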

[**Arabinose (xylose; galactose):H+ symporter, AraE**](https://tcdb.org/search/result.php?tc=2.A.1.1.2)

![image.png](attachment:b2e5c50a-83a4-4971-a0fd-12ba6397df32.png)

[**Galactose permease of 462 aas and 12 TMSs**](https://tcdb.org/search/result.php?tc=2.A.2.2.3)

![image.png](attachment:5a5e8487-8054-4087-987b-32e5847c561f.png)

In [290]:
for rxn in model.gene_to_reaction_lookup()['G_WP_015925141_1']:
    print(model.reactions[rxn])

R_METGLCURt2pp: M_h_p + M_metglcur_p --> M_h_c + M_metglcur_c


In [291]:
model.metabolites.M_metglcur_p

1-O-methyl-Beta-D-glucuronate

[**Glucose or galactose:Na+ symporter, SGLT1 <span style="color: red;">(Human protein)</span>**](https://tcdb.org/search/result.php?tc=2.A.21.3.1)

![image.png](attachment:686d8d76-7c61-4209-9f20-087c1e494234.png)

[**TMEM214 (Drosophila melanogaster)**](https://tcdb.org/search/result.php?tc=8.A.209.1.2)

![image.png](attachment:e8843fde-6d02-499d-8998-d5a87c07a8c9.png)

### <span style="color: purple;">B: Summary of analysis </span>

There are transporters for all sugars, and most of them (except mannose) have an ABC transporter. The evidence for PTS transporters in *R. cellulolyticum* is poor. 

### <span style="color: purple;">B: Solution </span>

All PTS related reactions will be removed.

A galactose permease will be included. 

**Removing pts reactions**

In [292]:
pts_rxns = [rxn for rxn in model.reactions if "pts" in rxn]
model.remove_reactions(pts_rxns)

In [293]:
model.update()

In [294]:
[rxn for rxn in model.reactions if "pts" in rxn]

[]

### <span style="color: purple;">B: Test </span>

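**Restricting the model to a single carbon source**

Each sugar is tested by closing mannose uptake and allowing an uptake of up to 10 for the tested sugar. Glucose, cellobiose and mannose are taken up at the full rate, but the model is currently unable to use galactose as a carbon source: galactose uptake stays near zero.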
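The constraint dictionaries used in the test cells follow one pattern, which can be sketched as a small helper. The function name `sole_carbon_constraints` and its defaults are assumptions made for this illustration; the returned dictionary has the same form as the `constraints` argument passed to `reframed.pFBA` in the cells.

```python
def sole_carbon_constraints(sugar_ex, default_ex="R_EX_man_e", max_uptake=10):
    """Close the default sugar exchange and open uptake for the tested sugar.

    If the tested sugar is the default one, only its uptake bound is set.
    """
    constraints = {default_ex: 0}
    constraints[sugar_ex] = (-max_uptake, 0)
    return constraints


for ex in ["R_EX_glc__D_e", "R_EX_cellb_e", "R_EX_gal_e", "R_EX_man_e"]:
    print(ex, sole_carbon_constraints(ex))
```

Building the dictionaries in one place makes it harder to forget closing the mannose exchange when a new sugar is added to the test.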

In [296]:
reframed.pFBA(model,constraints={'R_EX_man_e':0,'R_EX_glc__D_e':(-10,0)}).show_values(pattern="R_EX_",sort=True)

R_EX_cys__L_e -10
R_EX_glc__D_e -10
R_EX_h_e     -10
R_EX_no3_e   -9.94322
R_EX_pi_e    -0.766453
R_EX_k_e     -0.10655
R_EX_mg2_e   -0.00473543
R_EX_fe3_e   -0.00426216
R_EX_fe2_e   -0.00366552
R_EX_ca2_e   -0.00284126
R_EX_cl_e    -0.00284126
R_EX_so4_e   -0.00236799
R_EX_cu2_e   -0.000387022
R_EX_mn2_e   -0.000377197
R_EX_ribflv_e -0.000243458
R_EX_zn2_e   -0.000186142
R_EX_cobalt2_e -5.45871e-05
R_EX_4hba_e   0.000121729
R_EX_d23hb_e  0.300875
R_EX_h2s_e    9.86556
R_EX_h2o_e    12.1903
R_EX_nh4_e    14.0661
R_EX_ac_e     21.9172
R_EX_co2_e    22.8663


In [297]:
reframed.pFBA(model,constraints={'R_EX_man_e':0,'R_EX_cellb_e':(-10,0)}).show_values(pattern="R_EX_",sort=True)

R_EX_cellb_e -10
R_EX_h_e     -10
R_EX_no3_e   -10
R_EX_pi_e    -1.32953
R_EX_k_e     -0.184827
R_EX_so4_e   -0.149843
R_EX_nh4_e   -0.107212
R_EX_cys__L_e -0.0874732
R_EX_mg2_e   -0.00821431
R_EX_fe3_e   -0.00739335
R_EX_fe2_e   -0.0063584
R_EX_ca2_e   -0.00492859
R_EX_cl_e    -0.00492859
R_EX_cu2_e   -0.000671348
R_EX_mn2_e   -0.000654304
R_EX_ribflv_e -0.000422315
R_EX_zn2_e   -0.000322891
R_EX_cobalt2_e -9.46895e-05
R_EX_4hba_e   0.000211158
R_EX_d23hb_e  0.521912
R_EX_h2o_e    19.196
R_EX_ac_e     26.0667
R_EX_co2_e    27.713
R_EX_h2_e     27.8466


In [298]:
reframed.pFBA(model,constraints={'R_EX_man_e':0,'R_EX_gal_e':(-10,0)}).show_values(pattern="R_EX_",sort=True)

R_EX_h2o_e   -10
R_EX_h_e     -10
R_EX_cys__L_e -9.96396
R_EX_no3_e   -2.78258
R_EX_pi_e    -0.0547114
R_EX_gal_e   -0.00798797
R_EX_k_e     -0.00760583
R_EX_mg2_e   -0.000338027
R_EX_fe3_e   -0.000304244
R_EX_fe2_e   -0.000261655
R_EX_ca2_e   -0.000202816
R_EX_cl_e    -0.000202816
R_EX_so4_e   -0.000169033
R_EX_cu2_e   -2.76267e-05
R_EX_mn2_e   -2.69253e-05
R_EX_ribflv_e -1.73787e-05
R_EX_zn2_e   -1.32873e-05
R_EX_cobalt2_e -3.89657e-06
R_EX_4hba_e   8.68935e-06
R_EX_d23hb_e  0.0214772
R_EX_ac_e     9.40297
R_EX_co2_e    9.47072
R_EX_h2s_e    9.95437
R_EX_nh4_e    12.327


In [299]:
reframed.pFBA(model,constraints={'R_EX_man_e':(-10,0)}).show_values(pattern="R_EX_",sort=True)

R_EX_cys__L_e -10
R_EX_h_e     -10
R_EX_man_e   -10
R_EX_no3_e   -9.94322
R_EX_pi_e    -0.766453
R_EX_k_e     -0.10655
R_EX_mg2_e   -0.00473543
R_EX_fe3_e   -0.00426216
R_EX_fe2_e   -0.00366552
R_EX_ca2_e   -0.00284126
R_EX_cl_e    -0.00284126
R_EX_so4_e   -0.00236799
R_EX_cu2_e   -0.000387022
R_EX_mn2_e   -0.000377197
R_EX_ribflv_e -0.000243458
R_EX_zn2_e   -0.000186142
R_EX_cobalt2_e -5.45871e-05
R_EX_4hba_e   0.000121729
R_EX_d23hb_e  0.300875
R_EX_h2s_e    9.86556
R_EX_h2o_e    12.1903
R_EX_nh4_e    14.0661
R_EX_ac_e     21.9172
R_EX_co2_e    22.8663
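To compare the four tests at a glance, the sugar exchange fluxes can be expressed as a fraction of the allowed uptake. The values below are copied from the four outputs above; the 1 % threshold and the function name `uptake_fraction` are arbitrary choices for this sketch.

```python
def uptake_fraction(flux, max_uptake=10):
    """Fraction of the allowed uptake that is used (uptake fluxes are negative)."""
    return -flux / max_uptake


# Exchange flux of the tested sugar in each of the four pFBA runs above.
sugar_uptake = {
    "R_EX_glc__D_e": -10,
    "R_EX_cellb_e": -10,
    "R_EX_gal_e": -0.00798797,
    "R_EX_man_e": -10,
}

fractions = {ex: uptake_fraction(flux) for ex, flux in sugar_uptake.items()}
barely_used = [ex for ex, frac in fractions.items() if frac < 0.01]

print(fractions)
print("barely used:", barely_used)
```

Only galactose falls below the threshold, which points to a gap in galactose catabolism rather than in the test setup.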


**Tracing possible pathway for galactose catabolism**


This seems to follow the Leloir pathway. <span style="color: red;"> Find a link for this pathway</span>


In [300]:
for rxn in model.get_metabolite_reactions('M_gal_e'):
    print(model.reactions[rxn])

R_EX_gal_e: M_gal_e --> 
R_GALabc: M_atp_c + M_gal_e + M_h2o_c <-> M_adp_c + M_gal_c + M_h_c + M_pi_c


In [301]:
for rxn in model.metabolite_reaction_lookup()['M_gal_c']:
    if model.reactions[rxn].reaction_type==reframed.ReactionType.ENZYMATIC:
        if 'M_gal_c' in model.reactions[rxn].get_substrates() or model.reactions[rxn].reversible==True:
            print(model.reactions[rxn])

R_GALKr: M_gal_c + M_gtp_c <-> M_gdp_c + M_gal1p_c + M_h_c


In [302]:
for rxn in model.get_metabolite_reactions('M_gal1p_c'):
    if model.reactions[rxn].reaction_type==reframed.ReactionType.ENZYMATIC:
        if 'M_gal1p_c' in model.reactions[rxn].get_substrates() or model.reactions[rxn].reversible==True:
            print(model.reactions[rxn])

R_GALT: M_gal1p_c + M_h_c + M_utp_c <-> M_ppi_c + M_udpgal_c
R_GALKr: M_gal_c + M_gtp_c <-> M_gdp_c + M_gal1p_c + M_h_c


In [303]:
for rxn in model.get_metabolite_reactions('M_udpgal_c'):
    if model.reactions[rxn].reaction_type==reframed.ReactionType.ENZYMATIC:
        if 'M_udpgal_c' in model.reactions[rxn].get_substrates() or model.reactions[rxn].reversible==True:
            print(model.reactions[rxn])

R_GALT: M_gal1p_c + M_h_c + M_utp_c <-> M_ppi_c + M_udpgal_c
R_UDPG4E: M_udpg_c <-> M_udpgal_c


We seem to be missing the reaction that converts UDP-glucose to glucose-1-phosphate, whose existence was indicated in [Kampik et al. (2021)](https://journals.asm.org/doi/10.1128/mBio.02206-21). 

In [304]:
for rxn in model.get_metabolite_reactions('M_udpg_c'):
    if model.reactions[rxn].reaction_type==reframed.ReactionType.ENZYMATIC:
        if 'M_udpg_c' in model.reactions[rxn].get_substrates() or model.reactions[rxn].reversible==True:
            print(model.reactions[rxn])

R_ENTERGLCT1: M_enter_c + M_udpg_c --> M_entermg_c + M_h_c + M_udp_c
R_ENTERGLCT2: M_entermg_c + M_udpg_c --> M_h_c + M_salchs4_c + M_udp_c
R_TECA3S45: 45.0 M_cdpglyc_c + M_h2o_c + M_uacgam_c + M_uacmam_c + 45.0 M_udpg_c --> 45.0 M_cmp_c + M_gtca3_45_BS_c + 91.0 M_h_c + 46.0 M_udp_c + M_ump_c
R_UDPG4E: M_udpg_c <-> M_udpgal_c
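The tracing cells above repeat one query: which reactions can consume a given metabolite, either because it is a substrate or because the reaction is reversible. A minimal sketch of that query on a toy network is shown below. The stoichiometries are copied from the printed reactions; the tuple layout and the function name `can_consume` are assumptions, and the filter on enzymatic reactions is left out.

```python
# reaction id -> (stoichiometry, reversible); copied from the outputs above.
network = {
    "R_GALKr": ({"M_gal_c": -1, "M_gtp_c": -1, "M_gdp_c": 1,
                 "M_gal1p_c": 1, "M_h_c": 1}, True),
    "R_GALT": ({"M_gal1p_c": -1, "M_h_c": -1, "M_utp_c": -1,
                "M_ppi_c": 1, "M_udpgal_c": 1}, True),
    "R_UDPG4E": ({"M_udpg_c": -1, "M_udpgal_c": 1}, True),
    "R_ENTERGLCT1": ({"M_enter_c": -1, "M_udpg_c": -1, "M_entermg_c": 1,
                      "M_h_c": 1, "M_udp_c": 1}, False),
}


def can_consume(network, met):
    """Reactions in which `met` is a substrate, or that are reversible."""
    return sorted(rxn for rxn, (stoich, reversible) in network.items()
                  if met in stoich and (stoich[met] < 0 or reversible))


for met in ["M_gal_c", "M_gal1p_c", "M_udpgal_c", "M_udpg_c"]:
    print(met, can_consume(network, met))
```

A product of an irreversible reaction, such as `M_entermg_c` in `R_ENTERGLCT1`, is not reported as consumable by that reaction.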


#### <span style="color: purple;">B.2. Additional curation of missing reaction in galactose catabolic pathway</span>

We seem to be missing the reaction that converts UDP-glucose to glucose-1-phosphate, whose existence was indicated in [Kampik et al. (2021)](https://journals.asm.org/doi/10.1128/mBio.02206-21). 

This is catalyzed by [UTP--glucose-1-phosphate uridylyltransferase](https://www.genome.jp/dbget-bin/www_bget?enzyme+2.7.7.9).

Strategy:
- Find [protein related to this enzyme in R. cellulolyticum](https://www.uniprot.org/uniprotkb?query=(ec:2.7.7.9)%20AND%20(taxonomy_id:394503))
- Find alternative reaction in the BiGG universe: [GALUi](http://bigg.ucsd.edu/universal/reactions/GALUi)
- Add reaction to model.
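To illustrate why this reaction matters, the sketch below checks whether glucose-1-phosphate can be reached from cytosolic galactose in a toy network built from the reactions printed in the tracing cells, with and without R_GALUi (stoichiometry as listed for GALUi in the BiGG universe). The set of currency metabolites that the search skips and the function name `reachable` are assumptions made for this illustration; it is a connectivity check, not a flux analysis.

```python
from collections import deque

# Metabolites ignored as connections (assumption, to avoid trivial links).
CURRENCY = {"M_h_c", "M_utp_c", "M_ppi_c", "M_gtp_c", "M_gdp_c"}

leloir = {
    "R_GALKr": ({"M_gal_c": -1, "M_gtp_c": -1, "M_gdp_c": 1,
                 "M_gal1p_c": 1, "M_h_c": 1}, True),
    "R_GALT": ({"M_gal1p_c": -1, "M_h_c": -1, "M_utp_c": -1,
                "M_ppi_c": 1, "M_udpgal_c": 1}, True),
    "R_UDPG4E": ({"M_udpg_c": -1, "M_udpgal_c": 1}, True),
}
galui = {"R_GALUi": ({"M_g1p_c": -1, "M_h_c": -1, "M_utp_c": -1,
                      "M_ppi_c": 1, "M_udpg_c": 1}, True)}


def reachable(network, start, target):
    """Breadth-first search over metabolites, respecting reaction direction."""
    seen, queue = {start}, deque([start])
    while queue:
        met = queue.popleft()
        if met == target:
            return True
        for stoich, reversible in network.values():
            coef = stoich.get(met)
            if coef is None or not (reversible or coef < 0):
                continue
            # Metabolites on the opposite side have a coefficient of opposite sign.
            for other, c in stoich.items():
                if c * coef < 0 and other not in CURRENCY and other not in seen:
                    seen.add(other)
                    queue.append(other)
    return False


print(reachable(leloir, "M_gal_c", "M_g1p_c"))
print(reachable({**leloir, **galui}, "M_gal_c", "M_g1p_c"))
```

In this toy network, glucose-1-phosphate becomes reachable from galactose only once R_GALUi is present.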

The BiGG universal model is taken from Daniel Machado's GitHub.

In [305]:
model_universe = reframed.load_cbmodel('bigg_universe.xml')

In [306]:
len(model_universe.reactions)

25348

In [307]:
model_universe.reactions.R_GALUi

R_GALUi: M_g1p_c + M_h_c + M_utp_c <-> M_ppi_c + M_udpg_c

In [308]:
model.add_reaction(copy.deepcopy(model_universe.reactions.R_GALUi))

In [309]:
model.update()

In [310]:
model.reactions.R_GALUi

R_GALUi: M_g1p_c + M_h_c + M_utp_c <-> M_ppi_c + M_udpg_c

In [311]:
prot = reframed.Protein()
prot.genes=['G_WP_015926747_1']
gpr = reframed.GPRAssociation()
gpr.proteins=[prot]


In [312]:
model.set_gpr_association('R_GALUi',gpr=gpr)

In [313]:
model.reactions.R_GALUi.gpr

G_WP_015926747_1

In [314]:
model.update()

In [315]:
reframed.pFBA(model,constraints={'R_EX_man_e':0,'R_EX_gal_e':(-10,0)}).show_values(pattern="R_EX_",sort=True)

R_EX_gal_e   -10
R_EX_h_e     -10
R_EX_no3_e   -10
R_EX_cys__L_e -7.81625
R_EX_pi_e    -0.642706
R_EX_k_e     -0.0893471
R_EX_mg2_e   -0.00397087
R_EX_fe3_e   -0.00357401
R_EX_fe2_e   -0.00307371
R_EX_ca2_e   -0.00238252
R_EX_cl_e    -0.00238252
R_EX_so4_e   -0.00198566
R_EX_cu2_e   -0.000324536
R_EX_mn2_e   -0.000316297
R_EX_ribflv_e -0.000204151
R_EX_zn2_e   -0.000156088
R_EX_cobalt2_e -4.57737e-05
R_EX_4hba_e   0.000102075
R_EX_d23hb_e  0.252297
R_EX_h2s_e    7.70351
R_EX_nh4_e    12.888
R_EX_h2o_e    13.1179
R_EX_ac_e     21.0385
R_EX_co2_e    21.8343


### Summary

**Remove genes/proteins**

In [316]:
[key for key, value in model.gene_to_reaction_lookup().items() if len(value)==0]

['G_WP_015926593_1', 'G_WP_049756869_1', 'G_WP_012634509_1']

In [317]:
model.remove_genes([key for key, value in model.gene_to_reaction_lookup().items() if len(value)==0])

In [318]:
model.update()

In [319]:
[key for key, value in model.gene_to_reaction_lookup().items() if len(value)==0]

[]

**Finish model**

In [320]:
model.id = "model_c_H10_part2_3_1"

In [321]:
reframed.save_cbmodel(model,filename="model_c_H10_part2_3_1.xml")

In [322]:
model_new = reframed.load_cbmodel('model_c_H10_part2_3_1.xml')

In [323]:
model_prev = reframed.load_cbmodel('model_cellulolyticum_H10.xml')

In [324]:
# Compare the saved model with the previous one; a separate loop variable
# avoids overwriting `model`.
models = [model_new, model_prev]
models_dict = {m.id: {} for m in models}
models_rxn_dict = {m.id: {} for m in models}
for m in models:
    models_dict[m.id]['Reactions'] = len(m.reactions)
    models_dict[m.id]['Metabolites'] = len(m.metabolites)
    models_dict[m.id]['Genes'] = len(m.genes)

    models_rxn_dict[m.id]['Enzymatic'] = len(m.get_reactions_by_type(reframed.ReactionType.ENZYMATIC))
    models_rxn_dict[m.id]['Exchange'] = len(m.get_reactions_by_type(reframed.ReactionType.EXCHANGE))
    models_rxn_dict[m.id]['Transport'] = len(m.get_reactions_by_type(reframed.ReactionType.TRANSPORT))
    models_rxn_dict[m.id]['Sink'] = len(m.get_reactions_by_type(reframed.ReactionType.SINK))
    models_rxn_dict[m.id]['Other'] = len(m.get_reactions_by_type(reframed.ReactionType.OTHER))
    

**Overview models**

In [325]:
pd.DataFrame(models_dict)

| | model_c_H10_part2_3_1 | model_cellulolyticum_H10 |
|---|---|---|
| Reactions | 1782 | 1811 |
| Metabolites | 1250 | 1250 |
| Genes | 728 | 733 |


**Overview reactions in models**

In [326]:
pd.DataFrame(models_rxn_dict)

| | model_c_H10_part2_3_1 | model_cellulolyticum_H10 |
|---|---|---|
| Enzymatic | 878 | 883 |
| Exchange | 210 | 210 |
| Transport | 449 | 475 |
| Sink | 0 | 0 |
| Other | 245 | 243 |


**Reactions removed**

In [327]:
set(model_prev.reactions)-set(model_new.reactions)

{'R_ACGApts',
 'R_ACMANApts',
 'R_ACNAMt2',
 'R_ARBTpts',
 'R_ARBabcpp',
 'R_ARBt2r',
 'R_ASCBpts',
 'R_CD6P',
 'R_CELBt2',
 'R_FRUpts',
 'R_FUCpts',
 'R_GALTpts',
 'R_GALpts',
 'R_GALt2_3',
 'R_GAMpts',
 'R_GLCAt2',
 'R_GLCabcpp',
 'R_GLCpts',
 'R_GLUKA_1',
 'R_ID6P',
 'R_MALTpts',
 'R_MALTt2',
 'R_MANpts',
 'R_MNLDHr',
 'R_MNLpts',
 'R_SBTpts',
 'R_SBTt2',
 'R_SUCRt2',
 'R_SUCpts',
 'R_TREpts',
 'R_XYLI2',
 'R_XYLabcpp',
 'R_XYLt2',
 'R_r0191'}
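The removed reactions are easier to read when grouped by type. The sketch below sorts the ids from the output above into rough categories based on their names. The naming rules (`pts`, `abcpp`, `t2`) are assumptions derived from the BiGG-style ids seen in this notebook, and the function name `category` is hypothetical.

```python
from collections import Counter

# Reaction ids copied from the set printed above.
removed = [
    "R_ACGApts", "R_ACMANApts", "R_ACNAMt2", "R_ARBTpts", "R_ARBabcpp",
    "R_ARBt2r", "R_ASCBpts", "R_CD6P", "R_CELBt2", "R_FRUpts", "R_FUCpts",
    "R_GALTpts", "R_GALpts", "R_GALt2_3", "R_GAMpts", "R_GLCAt2",
    "R_GLCabcpp", "R_GLCpts", "R_GLUKA_1", "R_ID6P", "R_MALTpts", "R_MALTt2",
    "R_MANpts", "R_MNLDHr", "R_MNLpts", "R_SBTpts", "R_SBTt2", "R_SUCRt2",
    "R_SUCpts", "R_TREpts", "R_XYLI2", "R_XYLabcpp", "R_XYLt2", "R_r0191",
]


def category(rxn_id):
    """Rough classification of a reaction id by its name (assumed convention)."""
    if "pts" in rxn_id:
        return "PTS"
    if "abcpp" in rxn_id:
        return "periplasmic ABC"
    if "t2" in rxn_id:
        return "proton symport (t2)"
    return "other"


counts = Counter(category(rxn) for rxn in removed)
print(counts)
```

Most of the removed reactions are PTS and proton-symport transporters, in line with the transporter curation above; the reactions in the "other" group are not transporters by name and would need to be checked separately.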

**Reactions added**

In [328]:
set(model_new.reactions)-set(model_prev.reactions)

{'R_FNRR', 'R_FNRR2', 'R_GALabc', 'R_H2td', 'R_PFK_ppi'}