# Modeling human metabolism based on Rhea reactions

Rhea was not designed for modeling. However, because it is the reference reaction vocabulary for SwissProt and UniProt, draft models can be built by combining Rhea data processing with SPARQL queries to Rhea and UniProt.

1. Install pyrheadb package

In [1]:
#!pip install -e ../

2. Import necessary modules of pyrheadb

In [2]:
import os
import pandas as pd
from pyrheadb.RheaDB import RheaDB
from pyrheadb.RheaCobraModel import RheaCobraModel
from pyrheadb import RheaSPARQL

3. Request only the human reactions

In [3]:
os.makedirs('cache', exist_ok=True)

In [4]:
file_human_reactions = os.path.join('cache', 'df_human_reactions.tsv')
if not os.path.exists(file_human_reactions):
    df_human_reactions = RheaSPARQL.get_all_human_reactions()
    df_human_reactions.to_csv(file_human_reactions, sep='\t', index=False)
df_human_reactions = pd.read_csv(file_human_reactions, sep='\t')
df_human_reactions

Unnamed: 0,proteinName,protein,proteinId,reaction,MASTER_ID
0,(3R)-3-hydroxyacyl-CoA dehydrogenase,http://purl.uniprot.org/uniprot/Q92506,DHB8_HUMAN,http://rdf.rhea-db.org/14929,14929
1,(3R)-3-hydroxyacyl-CoA dehydrogenase,http://purl.uniprot.org/uniprot/Q92506,DHB8_HUMAN,http://rdf.rhea-db.org/24612,24612
2,(3R)-3-hydroxyacyl-CoA dehydrogenase,http://purl.uniprot.org/uniprot/Q92506,DHB8_HUMAN,http://rdf.rhea-db.org/32711,32711
3,(3R)-3-hydroxyacyl-CoA dehydrogenase,http://purl.uniprot.org/uniprot/Q92506,DHB8_HUMAN,http://rdf.rhea-db.org/41992,41992
4,(Lyso)-N-acylphosphatidylethanolamine lipase,http://purl.uniprot.org/uniprot/Q8TB40,ABHD4_HUMAN,http://rdf.rhea-db.org/45384,45384
...,...,...,...,...,...
33709,uridine/cytidine kinase,http://purl.uniprot.org/uniprot/G3V170,G3V170_HUMAN,http://rdf.rhea-db.org/24674,24674
33710,very-long-chain (3R)-3-hydroxyacyl-CoA dehydra...,http://purl.uniprot.org/uniprot/J3KT94,J3KT94_HUMAN,http://rdf.rhea-db.org/39159,39159
33711,very-long-chain (3R)-3-hydroxyacyl-CoA dehydra...,http://purl.uniprot.org/uniprot/J3KT94,J3KT94_HUMAN,http://rdf.rhea-db.org/45812,45812
33712,vesicle-fusing ATPase,http://purl.uniprot.org/uniprot/A0A6Q8PGU6,A0A6Q8PGU6_HUMAN,http://rdf.rhea-db.org/13065,13065
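
The query behind step 3 is not shown in this notebook. As a rough sketch of what such a request might look like, the snippet below builds a SPARQL query for human proteins (taxon 9606) annotated with Rhea reactions and turns it into a URL for the public UniProt endpoint. The query text, the `build_query_url` helper and the `UNIPROT_SPARQL` constant are illustrative assumptions, not the actual implementation of `RheaSPARQL.get_all_human_reactions()`, and nothing is sent over the network.

```python
from urllib.parse import urlencode

# Public UniProt SPARQL endpoint (assumed here; pyrheadb may use a different one)
UNIPROT_SPARQL = "https://sparql.uniprot.org/sparql"

# Hypothetical query: human proteins and the Rhea reactions they catalyze
HUMAN_REACTIONS_QUERY = """
PREFIX up: <http://purl.uniprot.org/core/>
PREFIX taxon: <http://purl.uniprot.org/taxonomy/>

SELECT DISTINCT ?protein ?reaction
WHERE {
  ?protein a up:Protein ;
           up:organism taxon:9606 ;
           up:annotation ?annotation .
  ?annotation a up:Catalytic_Activity_Annotation ;
              up:catalyticActivity ?activity .
  ?activity up:catalyzedReaction ?reaction .
}
"""


def build_query_url(query: str, endpoint: str = UNIPROT_SPARQL) -> str:
    """Return a GET URL for the query without sending it."""
    return f"{endpoint}?{urlencode({'query': query.strip()})}"


url = build_query_url(HUMAN_REACTIONS_QUERY)
print(url[:60])
```

The `?reaction` IRIs returned by a query of this shape would look like the `reaction` column above; in the rows shown, `MASTER_ID` is the trailing number of that IRI.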


4. Filter RheaDB (rdb) table used in this analysis to keep only the human reactions

In [5]:
# Keep one row per reaction, then restrict the RheaDB reaction table
# to the reactions annotated on human proteins
df_human_reactions.drop_duplicates(subset=['MASTER_ID'], inplace=True)
rdb_human = RheaDB('/scratch/')
rdb_human.df_smiles_master_id = rdb_human.df_smiles_master_id.merge(
    df_human_reactions['MASTER_ID'], how='inner', on='MASTER_ID'
)

Your Rhea DB version is 132
Using previously downloaded Rhea version
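
Step 4 uses an inner merge as a filter. The following minimal sketch, on toy data, shows why duplicates are dropped first and that the merge keeps the same rows as an `isin` filter. The `label` column and all values are made up for illustration; only the `MASTER_ID` column name comes from the notebook.

```python
import pandas as pd

# Toy stand-in for rdb_human.df_smiles_master_id ("label" is a hypothetical column)
df_all = pd.DataFrame({
    "MASTER_ID": [10000, 14929, 24612, 99999],
    "label": ["r1", "r2", "r3", "r4"],
})
# Toy stand-in for the SPARQL result: reaction 14929 is listed for two proteins
df_human = pd.DataFrame({"MASTER_ID": [14929, 14929, 24612]})

# Without deduplication, the inner merge repeats rows of df_all
with_duplicates = df_all.merge(df_human["MASTER_ID"], how="inner", on="MASTER_ID")

# With deduplication, the merge acts as a plain filter on MASTER_ID
human_ids = df_human.drop_duplicates(subset=["MASTER_ID"])["MASTER_ID"]
via_merge = df_all.merge(human_ids, how="inner", on="MASTER_ID")
via_isin = df_all[df_all["MASTER_ID"].isin(human_ids)].reset_index(drop=True)

print(len(with_duplicates), len(via_merge), len(via_isin))
```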


In [6]:
rhea_human_model = RheaCobraModel('rhea_human_model', rdb_human)

Set parameter Username
Academic license - for non-commercial use only - expires 2025-04-25
Adding reactions to model


100%|██████████████████████████████████████| 3748/3748 [00:21<00:00, 171.86it/s]


Total boundary metabolites: 197
Added 379 exchanges:


In [7]:
rhea_human_model.inspect_model()

3945 reactions
3250 metabolites
7890 variables


In [8]:
# # Write the model reactions to a file if you want to inspect them in more detail
# with open(os.path.join('cache','model_reactions.txt'), 'w') as w:
#     for r in rhea_human_model.model.reactions:
#         w.write(f"{r}\n")

# Use cobra to analyse consistency of fluxes
The explanatory text below is adapted from the cobra documentation (https://cobrapy.readthedocs.io/en/latest/consistency.html)

In [9]:
from cobra import flux_analysis

## Using FVA

The first approach is FVA (Flux Variability Analysis), which, among many other applications, is used to detect blocked reactions. The cobra.flux_analysis.find_blocked_reactions() function returns a list of all blocked reactions found using FVA.

In [10]:
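%%time
# `%%time` (cell magic) times the whole cell. The timing recorded below came from
# `%time` on a line by itself, which times an empty statement.
blocked_reactions = flux_analysis.find_blocked_reactions(rhea_human_model.model)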

CPU times: user 3 µs, sys: 0 ns, total: 3 µs
Wall time: 6.2 µs


In [11]:
print(len(blocked_reactions), 'blocked reactions')

1914 blocked reactions
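
To build intuition for what "blocked" means, here is a pure-Python sketch of a much simpler check than FVA: dead-end propagation. At steady state, a metabolite that can only be produced or only be consumed forces every reaction touching it to carry zero flux, and removing those reactions can expose further dead ends. This is not how `find_blocked_reactions` works; it finds only a subset of the reactions FVA would report, and the toy network and the `find_dead_end_blocked` helper are invented for illustration.

```python
def find_dead_end_blocked(reactions):
    """Return ids of reactions blocked because they touch a dead-end metabolite.

    `reactions` maps an id to (stoichiometry dict, reversible flag); negative
    coefficients are substrates, positive coefficients are products.
    """
    active = dict(reactions)
    blocked = set()
    changed = True
    while changed:
        changed = False
        producers, consumers = {}, {}
        for rid, (stoich, reversible) in active.items():
            for met, coeff in stoich.items():
                if coeff > 0 or reversible:
                    producers.setdefault(met, set()).add(rid)
                if coeff < 0 or reversible:
                    consumers.setdefault(met, set()).add(rid)
        newly_blocked = set()
        for met in set(producers) | set(consumers):
            if met not in producers or met not in consumers:
                # Only produced or only consumed: all touching reactions are blocked
                newly_blocked |= producers.get(met, set()) | consumers.get(met, set())
        for rid in newly_blocked:
            del active[rid]
            blocked.add(rid)
            changed = True
    return blocked


# Toy network: A is taken up, converted to C and secreted; D and E lead nowhere
toy = {
    "EX_A": ({"A": 1}, False),
    "R1": ({"A": -1, "B": 1}, False),
    "R2": ({"B": -1, "C": 1}, False),
    "EX_C": ({"C": -1}, False),
    "R3": ({"B": -1, "D": 1}, False),
    "R4": ({"D": -1, "E": 1}, False),
}
blocked_toy = sorted(find_dead_end_blocked(toy))
print(blocked_toy)
```

In the toy network, `R4` is removed first because `E` is never consumed, which in turn leaves `D` without a consumer and blocks `R3`.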


## Using FASTCC
The second approach to obtaining a consistent network in cobrapy is FASTCC. This method
efficiently identifies an accurate consistent network. For more details regarding the algorithm,
please see Vlassis N, Pacheco MP, Sauter T (2014).

In [12]:
consistent_human_model = flux_analysis.fastcc(rhea_human_model.model)
print('Number of consistent reactions according to FASTCC', len(consistent_human_model.reactions))

Read LP format model from file /tmp/tmpobktno8c.lp
Reading time = 0.01 seconds
: 3250 rows, 7890 columns, 32378 nonzeros
[<Reaction RHEA_10040 at 0x7f362ccabaf0>, <Reaction RHEA_10112 at 0x7f362ccabca0>, <Reaction RHEA_10140 at 0x7f362ccab250>, <Reaction RHEA_10164 at 0x7f362ccab910>, <Reaction RHEA_10180 at 0x7f362ccabc70>, <Reaction RHEA_10188 at 0x7f362ccabfd0>, <Reaction RHEA_10224 at 0x7f362ccab7f0>, <Reaction RHEA_10244 at 0x7f362c67a340>, <Reaction RHEA_10248 at 0x7f362c67a4c0>, <Reaction RHEA_10264 at 0x7f362c67a640>, <Reaction RHEA_10276 at 0x7f362c67a7c0>, <Reaction RHEA_10292 at 0x7f362c67a940>, <Reaction RHEA_10300 at 0x7f362c67aac0>, <Reaction RHEA_10336 at 0x7f362c67adc0>, <Reaction RHEA_10348 at 0x7f362c67af40>, <Reaction RHEA_10380 at 0x7f362c67a1c0>, <Reaction RHEA_10412 at 0x7f362cca2520>, <Reaction RHEA_10440 at 0x7f362cca2d60>, <Reaction RHEA_10492 at 0x7f362cca2d90>, <Reaction RHEA_10600 at 0x7f362cca2ac0>, <Reaction RHEA_10656 at 0x7f362cca2e20>, <Reaction RHEA_10
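
Since FVA and FASTCC both address flux consistency, it can be useful to compare their results. The sketch below does this with plain sets on made-up reaction ids; the `compare_consistency` helper is hypothetical, and small disagreements between the two methods might simply reflect solver tolerances.

```python
def compare_consistency(all_ids, blocked_ids, consistent_ids):
    """Report reactions on which the two consistency checks disagree."""
    all_ids = set(all_ids)
    blocked_ids = set(blocked_ids)
    consistent_ids = set(consistent_ids)
    return {
        # Flagged as blocked by FVA but kept by FASTCC
        "blocked_but_consistent": blocked_ids & consistent_ids,
        # Neither flagged as blocked nor kept as consistent
        "unclassified": all_ids - blocked_ids - consistent_ids,
    }


# Toy ids, invented for illustration
report = compare_consistency(
    all_ids=["R1", "R2", "R3", "R4", "R5"],
    blocked_ids=["R4", "R5"],
    consistent_ids=["R1", "R2", "R5"],
)
print(report)
```

With the notebook objects, the inputs would be `[r.id for r in rhea_human_model.model.reactions]`, `blocked_reactions`, and `[r.id for r in consistent_human_model.reactions]`. If the two methods agreed exactly, FASTCC would keep 3945 - 1914 = 2031 reactions.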

## Use cobra to find the value of the flux for the objective reaction

In [14]:
rhea_human_model.model.objective = 'RHEA_13065' # ATP maintenance

In [15]:
rhea_human_model.model.optimize().objective_value

1000.0

# Repeat analysis for the whole RheaDB

In [16]:
rdb_all = RheaDB('/scratch/')
rhea_all_model = RheaCobraModel('rhea_all_model', rdb_all)

Your Rhea DB version is 132
Using previously downloaded Rhea version
Adding reactions to model


100%|████████████████████████████████████| 12420/12420 [01:40<00:00, 123.70it/s]


Total boundary metabolites: 197
Added 477 exchanges:


In [17]:
rhea_all_model.inspect_model()

12617 reactions
10133 metabolites
25234 variables


In [18]:
%%time
blocked_reactions=flux_analysis.find_blocked_reactions(rhea_all_model.model)
print(len(blocked_reactions), 'blocked reactions')

5778 blocked reactions
CPU times: user 565 ms, sys: 280 ms, total: 845 ms
Wall time: 2min 8s
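
For a quick comparison of the two models, the blocked fractions can be computed from the counts reported above. This is plain arithmetic on the notebook outputs, not an additional analysis.

```python
# Counts copied from the inspect_model() and find_blocked_reactions outputs
counts = {
    "human": {"reactions": 3945, "blocked": 1914},
    "all Rhea": {"reactions": 12617, "blocked": 5778},
}

fractions = {name: c["blocked"] / c["reactions"] for name, c in counts.items()}
for name, fraction in fractions.items():
    print(f"{name}: {fraction:.1%} of reactions blocked")
```

Both fractions come out just under one half, so restricting the model to human reactions does not by itself change the share of blocked reactions much.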


In [None]:
%%time
consistent_all_model = flux_analysis.fastcc(rhea_all_model.model)
print('Number of consistent reactions according to FASTCC', len(consistent_all_model.reactions))

# Gapfill rhea_human_model with rhea_all_model

Remove the human model reactions from the whole-Rhea model so that gapfill only proposes reactions that are not already in the human model.

In [None]:
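# Collect the ids first so the human model is not iterated while the other model is modified
human_reaction_ids = [reaction.id for reaction in rhea_human_model.model.reactions]
for reaction_id in human_reaction_ids:
    try:
        reaction = rhea_all_model.model.reactions.get_by_id(reaction_id)
        rhea_all_model.model.remove_reactions([reaction])
    except Exception as e:
        # For example, a KeyError when the whole-Rhea model has no reaction with this id
        print(e)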
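
The exclusion step above can also be planned with plain set logic before touching either model. The sketch below splits the human reaction ids into those present in the whole-Rhea model and those missing from it; the `split_by_presence` helper and the example ids are made up for illustration.

```python
def split_by_presence(query_ids, universal_ids):
    """Split query ids into those found in the universal model and those not found."""
    universal = set(universal_ids)
    present = [rid for rid in query_ids if rid in universal]
    missing = [rid for rid in query_ids if rid not in universal]
    return present, missing


# Hypothetical ids: "EX_example" stands for a reaction that exists only in the human model
human_ids = ["RHEA_10040", "RHEA_13065", "EX_example"]
universal_ids = ["RHEA_10040", "RHEA_13065", "RHEA_99999"]

present, missing = split_by_presence(human_ids, universal_ids)
print(present)
print(missing)
```

Only the `present` ids need to be removed from the whole-Rhea model, and the `missing` ids are the ones for which the loop above prints an error.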

Gapfill

In [19]:
solution = flux_analysis.gapfill(rhea_human_model.model, rhea_all_model.model, demand_reactions=False)
for reaction in solution[0]:
    print(reaction.id)

Read LP format model from file /tmp/tmppl4dpbqe.lp
Reading time = 0.02 seconds
: 3250 rows, 7890 columns, 32378 nonzeros
Read LP format model from file /tmp/tmp0ui93pr3.lp
Reading time = 0.05 seconds
: 10133 rows, 25234 columns, 114914 nonzeros


In [20]:
solution[0]

[]

In [None]:
stop

Instead of keeping the original objective, we can also set each reaction of the human model as the objective in turn and ask gapfill which reactions from the whole-Rhea model would allow it to carry flux.

In [None]:
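# Set each human reaction in turn as the objective and record which reactions
# from the whole-Rhea model gapfill proposes to make it feasible.
results_path = os.path.join('cache', 'reactions_results.tsv')
with open(results_path, 'w') as w:
    for query_reaction in rhea_human_model.model.reactions:
        print(query_reaction)
        # Work on a copy so the original model keeps its objective
        temp_model = rhea_human_model.model.copy()
        try:
            # Set the objective by id: query_reaction belongs to the original model, not the copy
            temp_model.objective = query_reaction.id
            solution = flux_analysis.gapfill(temp_model, rhea_all_model.model)
            added_ids = [reaction.id for reaction in solution[0]]
            w.write(f"{query_reaction.id}\t{';'.join(added_ids)}\n")
            for reaction_id in added_ids:
                print(reaction_id)
        except Exception as e:
            # Record the failure (for example, an infeasible gapfill) and move on
            w.write(f"{query_reaction.id}\t{e}\n")
        print()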
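
Once the loop has finished, the results file can be summarized. The following sketch parses the two-column format written above (query reaction id, then either a `;`-separated list of added reaction ids or an error message). The example contents, including the error text, are invented, and the rule that every reaction id starts with `RHEA_` is an assumption based on the ids shown in the FASTCC output; boundary reactions with other id prefixes would be misfiled as failures.

```python
import io

# Hypothetical file contents in the format written by the loop above
example_tsv = (
    "RHEA_10040\t\n"
    "RHEA_10112\tRHEA_20000;RHEA_20004\n"
    "RHEA_10140\tgapfilling optimization failed\n"
)


def parse_results(lines):
    """Split result lines into gapfilled queries and failed queries."""
    filled, failed = {}, {}
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue
        query_id, _, payload = line.partition("\t")
        ids = [part for part in payload.split(";") if part]
        # Assumption: a payload made only of RHEA_ ids is a gapfill solution
        if all(part.startswith("RHEA_") for part in ids):
            filled[query_id] = ids
        else:
            failed[query_id] = payload
    return filled, failed


filled, failed = parse_results(io.StringIO(example_tsv))
print(len(filled), "queries solved,", len(failed), "failed")
```

For the real file, pass an open handle on `cache/reactions_results.tsv` instead of the `io.StringIO` object. An empty list in `filled` means the reaction could already carry flux without additions.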