# <span style="color: blue;">Part 2.1.1: Model curation - cofactors of glycolysis and fermentation pathways</span>

- Change the directionality of FNRR3 (ferredoxin reaction) so that it can produce hydrogen. Check that this does not create an energy-producing cycle, and consider the other H2-producing reactions.
- Remove the NAD-dependent malic enzyme (ME1)
- GTP/ATP dependency

In [64]:
import reframed
from reframed import pFBA
import pandas as pd
import warnings
import copy
import collections

In [65]:
model = reframed.load_cbmodel('model_cellulolyticum_H10.xml')

In [66]:
model_universe = reframed.load_cbmodel('bigg_universe.xml')

In [67]:
%store -r gene_protein_map

## Fermentative pathways and other

### Preparing for curation



In [68]:
rxn_add=[]
rxn_rm=[]
gprs={}

#### Reactions to add

- FNRR 
- FNRR2

*FNRR*

In [69]:
rxn_add.append(model_universe.reactions.R_FNRR)

*FNRR2*

In [70]:
rxn_add.append(model_universe.reactions.R_FNRR2)

Gene retrieved from UniProt (explained in a separate document)

In [71]:
gene_protein_map[gene_protein_map['Gene names'].str.contains('Ccel_2546')]

| | Entry | Entry name | Protein names | Gene names | Cross-reference (RefSeq) |
|---|---|---|---|---|---|
| 2881 | B8I6A9 | B8I6A9_RUMCH | Oxidoreductase FAD/NAD(P)-binding domain protein | Ccel_2546 | G_WP_015925963_1 |


In [72]:
prot = reframed.Protein()
prot.genes=['G_WP_015925963_1']
gpr = reframed.GPRAssociation()
gpr.proteins=[prot]
gprs['R_FNRR2']=gpr

#### Reactions to remove
- ME1

In [73]:
rxn_rm.append(model.reactions.R_ME1)

#### Reactions to change

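* None. The FNRR3 change below was drafted but not run: FNRR3 is absent from `rxn_add` and `rxn_rm` in the next section.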

FNRR3 = copy.deepcopy(model.reactions.R_FNRR3)

FNRR3.reversible

FNRR3.reversible=True
FNRR3.lb=-1000

rxn_rm.append(model.reactions.R_FNRR3)
rxn_add.append(FNRR3)

### Curation and feasibility test

In [74]:
rxn_add

[R_FNRR: M_fdxrd_c + M_h_c + M_nad_c <-> M_fdxo_2_2_c + M_nadh_c,
 R_FNRR2: M_fdxrd_c + M_h_c + M_nadp_c <-> M_fdxo_2_2_c + M_nadph_c]

In [75]:
rxn_rm

[R_ME1: M_mal__L_c + M_nad_c --> M_co2_c + M_nadh_c + M_pyr_c]

#### Remove reactions 

In [76]:
for rxn in rxn_rm:
    print("Removing rxn: " + str(rxn))
    model.remove_reaction(rxn.id)

    env_empty = reframed.Environment.empty(model)
    objective= {rxn:0 for rxn in model.reactions}
    objective['R_ATPM']=1
    
    sol = reframed.FBA(model,objective=objective,constraints=env_empty)

    if abs(sol.fobj) <1e-6: #cplex tolerance
        print('There are NO energy producing cycles in the model')
        print("\n")
    else:
        print('There is at least one energy producing cycle in the model')
        print("\n")
        sol_pfba = pFBA(model,objective=objective,constraints=env_empty)

        print('These are the reactions that are a part of the energy producing cycle')

        for rxn,value in sol_pfba.values.items():
            if abs(value)>1e-6: 

                print("\t" + str(rxn)+": " + str(value))
        print("\n")

Removing rxn: R_ME1: M_mal__L_c + M_nad_c --> M_co2_c + M_nadh_c + M_pyr_c
There are NO energy producing cycles in the model
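
The same energy-cycle check is repeated after every removal and addition. A minimal sketch of how it could be wrapped in one function is given here. The name `find_energy_cycle` and its parameters are hypothetical; the solver functions are passed in as arguments and are assumed to behave like `reframed.FBA`, `pFBA` and `reframed.Environment.empty` as they are called in this notebook.

```python
def find_energy_cycle(model, fba, pfba, empty_env, atp_rxn="R_ATPM", tol=1e-6):
    """Return {} if no energy producing cycle is found, else the active fluxes.

    Hypothetical helper: `fba`, `pfba` and `empty_env` are supplied by the
    caller and are assumed to behave like reframed.FBA, reframed.pFBA and
    reframed.Environment.empty as used in this notebook.
    """
    env = empty_env(model)                       # close all exchange reactions
    objective = {r_id: 0 for r_id in model.reactions}
    objective[atp_rxn] = 1                       # maximise ATP hydrolysis only

    sol = fba(model, objective=objective, constraints=env)
    if abs(sol.fobj) < tol:
        return {}                                # no ATP without nutrients

    # pFBA gives a sparse flux distribution, which makes the cycle easier to read.
    sol_pfba = pfba(model, objective=objective, constraints=env)
    return {r_id: v for r_id, v in sol_pfba.values.items() if abs(v) > tol}
```

In this notebook the call would look like `find_energy_cycle(model, reframed.FBA, pFBA, reframed.Environment.empty)`; an empty dict means no cycle was detected.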




#### Add reactions 

In [77]:
for rxn in rxn_add:
    print("Adding rxn: " + str(rxn))
    model.add_reaction(rxn)
    
    env_empty = reframed.Environment.empty(model)
    objective= {rxn:0 for rxn in model.reactions}
    objective['R_ATPM']=1
    
    sol = reframed.FBA(model,objective=objective,constraints=env_empty)

    if abs(sol.fobj) <1e-6: #cplex tolerance
        print('There are NO energy producing cycles in the model')
        print("\n")
    else:
        print('There is at least one energy producing cycle in the model')
        sol_pfba = pFBA(model,objective=objective,constraints=env_empty)

        print('These are the reactions that are a part of the energy producing cycle')

        for rxn,value in sol_pfba.values.items():
            if abs(value)>1e-6: 

                print("\t" + str(rxn)+": " + str(value))
        print("\n")

Adding rxn: R_FNRR: M_fdxrd_c + M_h_c + M_nad_c <-> M_fdxo_2_2_c + M_nadh_c
There are NO energy producing cycles in the model


Adding rxn: R_FNRR2: M_fdxrd_c + M_h_c + M_nadp_c <-> M_fdxo_2_2_c + M_nadph_c
There are NO energy producing cycles in the model




#### Add GPR to reactions

In [78]:
for rxn_id,gpr in gprs.items():
    model.set_gpr_association(rxn_id,gpr)

## Cofactors in glycolysis

- HEX1: remove 'G_WP_015926569_1' (Ccel_3221) from the GPR (change)
- HEX1_gtp: GTP-dependent hexokinase (add)
- PFK: ATP-dependent; its protein (G_WP_015926027_1) is not functional (remove)
- PFK_ppi: PPi-dependent PFK (add)
- CD6P: same as PFK but uses the wrong cofactor (remove)
- r0191: same as PFK but uses the wrong cofactor (remove)
- ID6P: same as PFK but uses the wrong cofactor (remove)
- GALK2: ATP -> GTP (remove)
- GALKr: ATP -> GTP (remove)
- GALK2_gtp: ATP -> GTP (add)
- GALKr_gtp: ATP -> GTP (add)

**HEX1 - GPRs**

Most of the proteins are part of the *ROK protein family*: [ROK family proteins are bacterial proteins that comprise transcriptional repressors, sugar kinases, and uncharacterized ORFs.](https://www.ebi.ac.uk/interpro/entry/InterPro/IPR000600/)

One of them is a hexokinase. The evidence used for this part does not support changing GPR.  

**PFK - gprs**

Two different proteins are associated with this reaction. They differ in their preference for ATP versus PPi, and the ATP-dependent one is not active.

### Preparing for curation

In [79]:
rxn_add=[]
rxn_rm=[]
gprs={}

#### Reactions to add
- PFK_ppi
- HEX1_gtp
- GALK2_gtp
- GALKr_gtp

*PFK_ppi*

In [80]:
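# NOTE: the enzyme name and the R_PFK reaction it replaces both use
# fructose 6-phosphate (M_f6p_c); M_g6p_c below looks like a typo.
stoichiometry={'M_g6p_c':-1,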
              'M_ppi_c':-1,
              'M_fdp_c':1,
              'M_h_c':1,
              'M_pi_c':1}

PFK_ppi=reframed.CBReaction(reaction_id='R_PFK_ppi', 
                         name='diphosphate--fructose-6-phosphate 1-phosphotransferase',
                         reversible=True, 
                         stoichiometry=stoichiometry,
                         reaction_type=reframed.ReactionType.ENZYMATIC)

rxn_add.append(PFK_ppi)

prot = reframed.Protein()
prot.genes=['G_WP_015925658_1']
gpr = reframed.GPRAssociation()
gpr.proteins=[prot]
gprs['R_PFK_ppi']=gpr

*HEX1_gtp*

In [81]:
stoichiometry={'M_glc__D_c':-1,
              'M_gtp_c':-1,
              'M_g6p_c':1,
              'M_h_c':1,
              'M_gdp_c':1}

HEX1_gtp=reframed.CBReaction(reaction_id='R_HEX1_gtp', 
                         name='GTP dependent hexokinase',
                         reversible=True, 
                         stoichiometry=stoichiometry,
                         reaction_type=reframed.ReactionType.ENZYMATIC)

rxn_add.append(HEX1_gtp)

prot = reframed.Protein()
prot.genes=['G_WP_015926569_1']
gpr = reframed.GPRAssociation()
gpr.proteins=[prot]
gprs['R_HEX1_gtp']=gpr
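
Hand-written stoichiometries are easy to get wrong, so an element and charge balance check can be useful before adding them. The sketch below is pure Python and does not use reframed. The helper names are hypothetical, and the formulas and charges are typed in by hand as assumptions (they should be verified against the model's own metabolite annotations).

```python
import re
from collections import Counter

def parse_formula(formula):
    """Turn 'C6H12O6' into Counter({'C': 6, 'H': 12, 'O': 6})."""
    counts = Counter()
    for element, n in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] += int(n) if n else 1
    return counts

def imbalance(stoichiometry, formulas, charges):
    """Return (unbalanced elements, net charge); ({}, 0) means balanced."""
    net = Counter()
    net_charge = 0
    for met, coeff in stoichiometry.items():
        for element, n in parse_formula(formulas[met]).items():
            net[element] += coeff * n
        net_charge += coeff * charges[met]
    return {e: n for e, n in net.items() if n != 0}, net_charge

# Assumed formulas and charges (not taken from this notebook; verify them).
FORMULAS = {"M_glc__D_c": "C6H12O6", "M_gtp_c": "C10H12N5O14P3",
            "M_g6p_c": "C6H11O9P", "M_h_c": "H", "M_gdp_c": "C10H12N5O11P2"}
CHARGES = {"M_glc__D_c": 0, "M_gtp_c": -4, "M_g6p_c": -2, "M_h_c": 1, "M_gdp_c": -3}

hex1_gtp = {"M_glc__D_c": -1, "M_gtp_c": -1, "M_g6p_c": 1, "M_h_c": 1, "M_gdp_c": 1}
print(imbalance(hex1_gtp, FORMULAS, CHARGES))  # ({}, 0)
```

A balance check of this kind cannot distinguish isomers such as glucose 6-phosphate and fructose 6-phosphate, so it complements rather than replaces a manual review of the metabolite IDs.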

*GALK2_gtp*

In [82]:
stoichiometry={'M_a_gal__D_c':-1,
              'M_gtp_c':-1,
              'M_gal1p_c':1,
              'M_h_c':1,
              'M_gdp_c':1}

GALK2_gtp=reframed.CBReaction(reaction_id='R_GALK2_gtp', 
                         name='alpha Galactokinase GTP dependent',
                         reversible=False, 
                         stoichiometry=stoichiometry,
                         reaction_type=reframed.ReactionType.ENZYMATIC)

rxn_add.append(GALK2_gtp)

prot = reframed.Protein()
prot.genes=['G_WP_015926586_1']
gpr = reframed.GPRAssociation()
gpr.proteins=[prot]
gprs['R_GALK2_gtp']=gpr

*GALKr_gtp*

In [83]:
stoichiometry={'M_gal_c':-1,
              'M_gtp_c':-1,
              'M_gal1p_c':1,
              'M_h_c':1,
              'M_gdp_c':1}

GALKr_gtp=reframed.CBReaction(reaction_id='R_GALKr_gtp', 
                         name='Galactokinase GTP dependent',
                         reversible=True, 
                         stoichiometry=stoichiometry,
                         reaction_type=reframed.ReactionType.ENZYMATIC)

rxn_add.append(GALKr_gtp)

prot = reframed.Protein()
prot.genes=['G_WP_015926586_1']
gpr = reframed.GPRAssociation()
gpr.proteins=[prot]
gprs['R_GALKr_gtp']=gpr

In [84]:
rxn_add

[R_PFK_ppi: M_g6p_c + M_ppi_c <-> M_fdp_c + M_h_c + M_pi_c,
 R_HEX1_gtp: M_glc__D_c + M_gtp_c <-> M_g6p_c + M_h_c + M_gdp_c,
 R_GALK2_gtp: M_a_gal__D_c + M_gtp_c --> M_gal1p_c + M_h_c + M_gdp_c,
 R_GALKr_gtp: M_gal_c + M_gtp_c <-> M_gal1p_c + M_h_c + M_gdp_c]

In [85]:
gprs

{'R_PFK_ppi': G_WP_015925658_1,
 'R_HEX1_gtp': G_WP_015926569_1,
 'R_GALK2_gtp': G_WP_015926586_1,
 'R_GALKr_gtp': G_WP_015926586_1}

#### Reactions to remove
- CD6P
- r0191
- ID6P
- PFK
- GALK2
- GALKr

In [86]:
rxn_rm.append(model.reactions.R_CD6P)
rxn_rm.append(model.reactions.R_r0191)
rxn_rm.append(model.reactions.R_ID6P)
rxn_rm.append(model.reactions.R_PFK)
rxn_rm.append(model.reactions.R_GALK2)
rxn_rm.append(model.reactions.R_GALKr)

#### Reactions to change

##### Change GPR
- HEX1

In [87]:
HEX1 = copy.deepcopy(model.reactions.R_HEX1)

In [88]:
HEX1.gpr.proteins

[G_WP_015924247_1,
 G_WP_015925130_1,
 G_WP_015925627_1,
 G_WP_015926569_1,
 G_WP_015926770_1]

In [89]:
HEX1.gpr.proteins.pop(3)

G_WP_015926569_1
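
Removing the protein by list position (`pop(3)`) depends on the order of the entries in the GPR. A minimal sketch of a position-independent alternative follows. `drop_gene_from_proteins` is a hypothetical helper; it only assumes that each protein object exposes a `genes` list, as the `reframed.Protein` objects built in the cells above do.

```python
def drop_gene_from_proteins(proteins, gene_id):
    """Remove, in place, every protein whose gene list contains `gene_id`.

    Hypothetical helper; returns the list of removed protein objects.
    """
    removed = [p for p in proteins if gene_id in p.genes]
    proteins[:] = [p for p in proteins if gene_id not in p.genes]
    return removed
```

Under that assumption, `drop_gene_from_proteins(HEX1.gpr.proteins, 'G_WP_015926569_1')` would have the same effect as `pop(3)` here, without relying on the index.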

In [90]:
rxn_add.append(HEX1)

In [91]:
rxn_rm.append(model.reactions.R_HEX1)

### Curation and feasibility test

In [92]:
rxn_add

[R_PFK_ppi: M_g6p_c + M_ppi_c <-> M_fdp_c + M_h_c + M_pi_c,
 R_HEX1_gtp: M_glc__D_c + M_gtp_c <-> M_g6p_c + M_h_c + M_gdp_c,
 R_GALK2_gtp: M_a_gal__D_c + M_gtp_c --> M_gal1p_c + M_h_c + M_gdp_c,
 R_GALKr_gtp: M_gal_c + M_gtp_c <-> M_gal1p_c + M_h_c + M_gdp_c,
 R_HEX1: M_atp_c + M_glc__D_c --> M_adp_c + M_g6p_c + M_h_c]

In [93]:
rxn_rm

[R_CD6P: M_ctp_c + M_f6p_c --> M_cdp_c + M_fdp_c + M_h_c,
 R_r0191: M_f6p_c + M_utp_c --> M_fdp_c + M_h_c + M_udp_c,
 R_ID6P: M_f6p_c + M_itp_c --> M_fdp_c + M_h_c + M_idp_c,
 R_PFK: M_atp_c + M_f6p_c --> M_adp_c + M_fdp_c + M_h_c,
 R_GALK2: M_a_gal__D_c + M_atp_c --> M_adp_c + M_gal1p_c + M_h_c,
 R_GALKr: M_atp_c + M_gal_c <-> M_adp_c + M_gal1p_c + M_h_c,
 R_HEX1: M_atp_c + M_glc__D_c --> M_adp_c + M_g6p_c + M_h_c]
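
A cofactor swap should leave the non-cofactor metabolites of a reaction unchanged. The sketch below compares an old and a new stoichiometry after discarding cofactors. The function name and the `COFACTORS` set are assumptions made for illustration; the stoichiometries are copied from the `rxn_add` and `rxn_rm` listings above.

```python
# Metabolites treated as exchangeable cofactors (an assumption for this sketch).
COFACTORS = {"M_atp_c", "M_adp_c", "M_gtp_c", "M_gdp_c",
             "M_ppi_c", "M_pi_c", "M_h_c"}

def core_difference(old, new, cofactors=COFACTORS):
    """Return metabolites whose coefficient differs once cofactors are ignored."""
    def strip(stoich):
        return {m: c for m, c in stoich.items() if m not in cofactors}
    old_core, new_core = strip(old), strip(new)
    return sorted(m for m in set(old_core) | set(new_core)
                  if old_core.get(m) != new_core.get(m))

# Stoichiometries as printed in the listings above.
GALKr     = {"M_atp_c": -1, "M_gal_c": -1, "M_adp_c": 1, "M_gal1p_c": 1, "M_h_c": 1}
GALKr_gtp = {"M_gal_c": -1, "M_gtp_c": -1, "M_gal1p_c": 1, "M_h_c": 1, "M_gdp_c": 1}
PFK       = {"M_atp_c": -1, "M_f6p_c": -1, "M_adp_c": 1, "M_fdp_c": 1, "M_h_c": 1}
PFK_ppi   = {"M_g6p_c": -1, "M_ppi_c": -1, "M_fdp_c": 1, "M_h_c": 1, "M_pi_c": 1}

print(core_difference(GALKr, GALKr_gtp))  # []
print(core_difference(PFK, PFK_ppi))      # ['M_f6p_c', 'M_g6p_c']
```

The second result shows that `R_PFK_ppi` consumes `M_g6p_c` whereas `R_PFK` consumes `M_f6p_c`, which is worth double-checking against the enzyme name.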

#### Remove reactions 

In [94]:
for rxn in rxn_rm:
    print("Removing rxn: " + str(rxn))
    model.remove_reaction(rxn.id)
    env_empty = reframed.Environment.empty(model)
    objective= {rxn:0 for rxn in model.reactions}
    objective['R_ATPM']=1
    
    sol = reframed.FBA(model,objective=objective,constraints=env_empty)

    if abs(sol.fobj) <1e-6: #cplex tolerance
        print('There are NO energy producing cycles in the model')
        print("\n")
    else:
        print('There is at least one energy producing cycle in the model')
        print("\n")
        sol_pfba = pFBA(model,objective=objective,constraints=env_empty)

        print('These are the reactions that are a part of the energy producing cycle')

        for rxn,value in sol_pfba.values.items():
            if abs(value)>1e-6: 
                print("\t" + str(rxn)+": " + str(value))
        print("\n")

Removing rxn: R_CD6P: M_ctp_c + M_f6p_c --> M_cdp_c + M_fdp_c + M_h_c
There are NO energy producing cycles in the model


Removing rxn: R_r0191: M_f6p_c + M_utp_c --> M_fdp_c + M_h_c + M_udp_c
There are NO energy producing cycles in the model


Removing rxn: R_ID6P: M_f6p_c + M_itp_c --> M_fdp_c + M_h_c + M_idp_c
There are NO energy producing cycles in the model


Removing rxn: R_PFK: M_atp_c + M_f6p_c --> M_adp_c + M_fdp_c + M_h_c
There are NO energy producing cycles in the model


Removing rxn: R_GALK2: M_a_gal__D_c + M_atp_c --> M_adp_c + M_gal1p_c + M_h_c
There are NO energy producing cycles in the model


Removing rxn: R_GALKr: M_atp_c + M_gal_c <-> M_adp_c + M_gal1p_c + M_h_c
There are NO energy producing cycles in the model


Removing rxn: R_HEX1: M_atp_c + M_glc__D_c --> M_adp_c + M_g6p_c + M_h_c
There are NO energy producing cycles in the model




In [95]:
for rxn in rxn_add:
    print("Adding rxn: " + str(rxn))
    model.add_reaction(rxn)
    
    env_empty = reframed.Environment.empty(model)
    objective= {rxn:0 for rxn in model.reactions}
    objective['R_ATPM']=1
    
    sol = reframed.FBA(model,objective=objective,constraints=env_empty)

    if abs(sol.fobj) <1e-6: #cplex tolerance
        print('There are NO energy producing cycles in the model')
        print("\n")
    else:
        print('There is at least one energy producing cycle in the model')
        sol_pfba = pFBA(model,objective=objective,constraints=env_empty)

        print('These are the reactions that are a part of the energy producing cycle')

        for rxn,value in sol_pfba.values.items():
            if abs(value)>1e-6:

                print("\t" + str(rxn)+": " + str(value))
        print("\n")

Adding rxn: R_PFK_ppi: M_g6p_c + M_ppi_c <-> M_fdp_c + M_h_c + M_pi_c
There are NO energy producing cycles in the model


Adding rxn: R_HEX1_gtp: M_glc__D_c + M_gtp_c <-> M_g6p_c + M_h_c + M_gdp_c
There are NO energy producing cycles in the model


Adding rxn: R_GALK2_gtp: M_a_gal__D_c + M_gtp_c --> M_gal1p_c + M_h_c + M_gdp_c
There are NO energy producing cycles in the model


Adding rxn: R_GALKr_gtp: M_gal_c + M_gtp_c <-> M_gal1p_c + M_h_c + M_gdp_c
There are NO energy producing cycles in the model


Adding rxn: R_HEX1: M_atp_c + M_glc__D_c --> M_adp_c + M_g6p_c + M_h_c
There are NO energy producing cycles in the model




#### Add GPR to reactions

In [96]:
for rxn_id,gpr in gprs.items():
    model.set_gpr_association(rxn_id,gpr)

## Hydrogen production

One of the goals of this curation was to see whether it enables hydrogen production. Let's check whether this is now possible.

Hydrogen production is NOT feasible under the current conditions, even at zero growth rate.

In [97]:
reframed.FVA(model,reactions=['R_EX_h2_e'])

{'R_EX_h2_e': [0.0, 0.0]}

Looking at the reactions involving M_h2_e and M_h2_c we see that there is no transport reaction for hydrogen. 

In [98]:
for rxn in model.get_metabolite_reactions('M_h2_e'):
    print(model.reactions[rxn])

R_HYDA1: M_h2_e + 2.0 M_h_c + M_mqn6_c --> 2.0 M_h_e + M_mql6_c
R_EX_h2_e: M_h2_e --> 


In [99]:
for rxn in model.get_metabolite_reactions('M_h2_c'):
    print(model.reactions[rxn])

R_FNRR3: M_fdxo_2_2_c + M_h2_c --> M_fdxrd_c + 2.0 M_h_c
R_H2ASE_syn: M_h_c + M_nadph_c <-> M_h2_c + M_nadp_c
R_HYD1pp: M_h2_c + 2.0 M_h_c + M_q8_c --> 2.0 M_h_p + M_q8h2_c
R_HYD2: M_h2_c + 2.0 M_h_c + M_mqn8_c --> 2.0 M_h_e + M_mql8_c


The universal model contains a transport reaction for hydrogen by diffusion (`R_H2td`). We can ask the project partners about this.

In [100]:
model_universe.reactions.R_H2td

R_H2td: M_h2_c <-> M_h2_e

In [101]:
model.add_reaction(model_universe.reactions.R_H2td)

In [102]:
model.update()

In [103]:
reframed.FVA(model,reactions=['R_EX_h2_e'])

{'R_EX_h2_e': [0.0, 75.9689557855127]}
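
The two FVA results can also be compared programmatically. A small sketch, assuming the result is a plain dict of numeric `[min, max]` pairs as printed above (`carries_flux` is a hypothetical helper, not part of reframed):

```python
def carries_flux(fva_result, tol=1e-6):
    """Map each reaction id to True if its FVA range allows non-zero flux."""
    return {r_id: max(abs(lo), abs(hi)) > tol
            for r_id, (lo, hi) in fva_result.items()}

before = {"R_EX_h2_e": [0.0, 0.0]}               # before adding R_H2td
after = {"R_EX_h2_e": [0.0, 75.9689557855127]}   # after adding R_H2td

print(carries_flux(before))  # {'R_EX_h2_e': False}
print(carries_flux(after))   # {'R_EX_h2_e': True}
```

A lower bound of 0.0 in both cases means hydrogen secretion is possible after the change but not required.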

In [104]:
env_empty = reframed.Environment.empty(model)
objective= {rxn:0 for rxn in model.reactions}
objective['R_ATPM']=1

sol = reframed.FBA(model,objective=objective,constraints=env_empty)

if abs(sol.fobj) <1e-6: #cplex tolerance
    print('There are NO energy producing cycles in the model')
    print("\n")
else:
    print('There is at least one energy producing cycle in the model')
    sol_pfba = pFBA(model,objective=objective,constraints=env_empty)

    print('These are the reactions that are a part of the energy producing cycle')

    for rxn,value in sol_pfba.values.items():
        if value>50.0: # What value to choose here?

            print("\t" + str(rxn)+": " + str(value))
    print("\n")

There are NO energy producing cycles in the model




### pFBA results

In [105]:
reframed.pFBA(model,constraints={'R_EX_h2s_e':0}).show_values(pattern="R_EX",sort=True)

R_EX_glc__D_e -10
R_EX_h_e     -10
R_EX_no3_e   -7.1296
R_EX_pi_e    -0.97907
R_EX_nh4_e   -0.206068
R_EX_cys__L_e -0.171736
R_EX_k_e     -0.136108
R_EX_mg2_e   -0.00604905
R_EX_fe3_e   -0.0054445
R_EX_fe2_e   -0.00468235
R_EX_ca2_e   -0.00362943
R_EX_cl_e    -0.00362943
R_EX_so4_e   -0.00302488
R_EX_cu2_e   -0.000494384
R_EX_mn2_e   -0.000481832
R_EX_ribflv_e -0.000310995
R_EX_zn2_e   -0.000237778
R_EX_cobalt2_e -6.97297e-05
R_EX_4hba_e   0.000155497
R_EX_d23hb_e  0.384338
R_EX_ac_e     9.84678
R_EX_co2_e    11.0591
R_EX_h2o_e    29.609
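
To read a listing like the one above more easily, the exchange fluxes can be split into uptake and secretion. The sketch below assumes the fluxes are available as a plain dict of reaction id to value; `split_exchanges` is a hypothetical helper and only a few values from the listing are reproduced.

```python
def split_exchanges(fluxes, tol=1e-9):
    """Split exchange fluxes into uptake (negative) and secretion (positive)."""
    uptake = {r: -v for r, v in fluxes.items() if v < -tol}
    secretion = {r: v for r, v in fluxes.items() if v > tol}
    return uptake, secretion

# A few values copied from the listing above.
fluxes = {"R_EX_glc__D_e": -10, "R_EX_no3_e": -7.1296,
          "R_EX_ac_e": 9.84678, "R_EX_co2_e": 11.0591}

uptake, secretion = split_exchanges(fluxes)
print(sorted(secretion))         # ['R_EX_ac_e', 'R_EX_co2_e']
print("R_EX_h2_e" in secretion)  # False
```

`R_EX_h2_e` does not appear in the listing above, so this particular pFBA solution secretes no hydrogen even though FVA shows that secretion is feasible.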


### Summary

In [106]:
model.update()

**Remove genes/proteins**

In [107]:
[key for key, value in model.gene_to_reaction_lookup().items() if len(value)==0]

[]

In [108]:
model.remove_genes([key for key, value in model.gene_to_reaction_lookup().items() if len(value)==0])

In [109]:
model.update()

In [110]:
[key for key, value in model.gene_to_reaction_lookup().items() if len(value)==0]

[]

**Finish model**

In [111]:
model.id = "model_c_H10_part2_1_1"

In [112]:
reframed.save_cbmodel(model,filename="model_c_H10_part2_1_1.xml")

In [113]:
model_new = reframed.load_cbmodel('model_c_H10_part2_1_1.xml')

In [114]:
model_prev = reframed.load_cbmodel('model_cellulolyticum_H10.xml')

In [115]:
models_dict={m.id:{} for m in [model_new,model_prev]}
models_rxn_dict={m.id:{} for m in [model_new,model_prev]}
# The loop variable is named `m` so that `model` is not overwritten by the loop.
for m in [model,model_prev]:
    models_dict[m.id]['Reactions']=len(m.reactions)
    models_dict[m.id]['Metabolites']=len(m.metabolites)
    models_dict[m.id]['Genes']=len(m.genes)
    
    models_rxn_dict[m.id]['Enzymatic']=len(m.get_reactions_by_type(reframed.ReactionType.ENZYMATIC))
    models_rxn_dict[m.id]['Exchange']=len(m.get_reactions_by_type(reframed.ReactionType.EXCHANGE))
    models_rxn_dict[m.id]['Transport']=len(m.get_reactions_by_type(reframed.ReactionType.TRANSPORT))
    models_rxn_dict[m.id]['Sink']=len(m.get_reactions_by_type(reframed.ReactionType.SINK))
    models_rxn_dict[m.id]['Other']=len(m.get_reactions_by_type(reframed.ReactionType.OTHER))
    

**Overview models**

In [116]:
pd.DataFrame(models_dict)

| | model_c_H10_part2_1_1 | model_cellulolyticum_H10 |
|---|---|---|
| Reactions | 1811 | 1811 |
| Metabolites | 1250 | 1250 |
| Genes | 734 | 733 |


**Overview reactions in models**

In [117]:
pd.DataFrame(models_rxn_dict)

| | model_c_H10_part2_1_1 | model_cellulolyticum_H10 |
|---|---|---|
| Enzymatic | 880 | 883 |
| Exchange | 210 | 210 |
| Transport | 476 | 475 |
| Sink | 0 | 0 |
| Other | 245 | 243 |


**Reactions removed**

In [118]:
set(model_prev.reactions)-set(model_new.reactions)

{'R_CD6P', 'R_GALK2', 'R_GALKr', 'R_ID6P', 'R_ME1', 'R_PFK', 'R_r0191'}

**Reactions added**

In [119]:
set(model_new.reactions)-set(model_prev.reactions)

{'R_FNRR',
 'R_FNRR2',
 'R_GALK2_gtp',
 'R_GALKr_gtp',
 'R_H2td',
 'R_HEX1_gtp',
 'R_PFK_ppi'}