# Porting genome scale metabolic models for metabolomics

- to make the model formats compatible with mummichog
- to link each model to a common compound table
- to generate predicted mass peaks from the compound table, based on formula
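
As a rough illustration of the last point, here is a minimal sketch of predicting mass peaks from a formula. It is not the mummichog implementation: the element table covers only a few elements, the two adduct labels are chosen for illustration, and the function names are hypothetical.

```python
import re

# Monoisotopic masses (Da) of the most abundant isotopes; deliberately incomplete.
MONOISOTOPIC_MASS = {
    "C": 12.0,
    "H": 1.00782503207,
    "N": 14.0030740048,
    "O": 15.99491461956,
    "P": 30.97376163,
    "S": 31.97207100,
}
PROTON_MASS = 1.007276466  # used for the [M+H]+ and [M-H]- ions

def parse_formula(formula):
    """Parse a simple formula such as 'C6H12O6' into {'C': 6, 'H': 12, 'O': 6}."""
    counts = {}
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] = counts.get(element, 0) + (int(number) if number else 1)
    return counts

def monoisotopic_mass(formula):
    """Sum element masses; raises KeyError for elements missing from the table."""
    return sum(MONOISOTOPIC_MASS[el] * n for el, n in parse_formula(formula).items())

def predicted_peaks(neutral_formula):
    """Return predicted m/z values for two common singly charged ions."""
    mass = monoisotopic_mass(neutral_formula)
    return {"M+H[1+]": mass + PROTON_MASS, "M-H[1-]": mass - PROTON_MASS}

# Glucose is used only as a familiar test case.
peaks = predicted_peaks("C6H12O6")
```

The function expects a neutral formula, which is why the charged formulae stored in these models need to be converted or looked up first.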

Because mummichog 3 is still under development, treat this notebook as part of that development work.

**Use cobra to parse SBML models wherever applicable**

Not all models comply with the formats that cobra expects. Models from the UCSD and Thiele labs should comply.

**Base our code on metDataModel**

Each model needs a list of Reactions, a list of Pathways, and a list of Compounds.
It's important to include in each Compound all linked identifiers to other databases (HMDB, PubChem, etc.) and the formula (usually the charged form in these models) when available.
We can always update the data later; e.g., the neutral formulae can be retrieved from HMDB if linked.
Save in Python pickle and in JSON.
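
A minimal sketch of the dual export, assuming the model has already been flattened into plain dictionaries and lists (metDataModel objects are not assumed to be JSON serializable). The helper name `export_model` and the dictionary keys are hypothetical, not the metDataModel schema.

```python
import json
import pickle
import tempfile
from pathlib import Path

def export_model(model_dict, out_dir, name):
    """Write the same dictionary to <name>.pickle and <name>.json in out_dir."""
    out_dir = Path(out_dir)
    pickle_path = out_dir / f"{name}.pickle"
    json_path = out_dir / f"{name}.json"
    with open(pickle_path, "wb") as handle:
        pickle.dump(model_dict, handle)
    with open(json_path, "w", encoding="utf-8") as handle:
        json.dump(model_dict, handle, indent=2)
    return pickle_path, json_path

# Toy record; the keys are illustrative only.
toy_model = {
    "id": "toy_model",
    "Compounds": [
        {
            "id": "1a25dhvitd2",
            "charged_formula": "C28H44O3",
            "charge": 0,
            "db_ids": {"hmdb": "HMDB06225", "pubchem.compound": "9547243"},
        }
    ],
    "Reactions": [],
    "Pathways": [],
}

# Write to a temporary directory and read both files back.
with tempfile.TemporaryDirectory() as tmp:
    pickle_path, json_path = export_model(toy_model, tmp, "toy_model")
    with open(pickle_path, "rb") as handle:
        from_pickle = pickle.load(handle)
    with open(json_path, encoding="utf-8") as handle:
        from_json = json.load(handle)
```

Keeping the exported structure to built-in types means the pickle and the JSON file carry identical content.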


Shuzhao Li, 2021-05-09, 05-10

In [2]:
# https://github.com/shuzhao-li/metDataModel/
!pip install metDataModel

Collecting metDataModel
  Downloading https://files.pythonhosted.org/packages/4a/e5/e4625fe421c74695a286bc1a289026ef1c1dff464e078c17fd9eed9167dd/metDataModel-0.3.0-py3-none-any.whl (2.1MB)
Installing collected packages: metDataModel
Successfully installed metDataModel-0.3.0


In [6]:
from metDataModel.core import Compound, Reaction, Pathway, metabolicModel

# tool to parse XML - not needed if using cobra
import xml.etree.ElementTree as ET
# tool to parse Excel xlsx
import xlrd


In [14]:
! pip install cobra

Collecting cobra
  Downloading https://files.pythonhosted.org/packages/c5/04/f785d34e11b42b21c101130d60324c90f58a612f49a6f95e1747f0f05ca1/cobra-0.22.0-py2.py3-none-any.whl (2.4MB)
Collecting optlang~=1.5 (from cobra)
  Downloading https://files.pythonhosted.org/packages/12/3e/9d0b72cf5a8ff660e5787a0797906e04942081f3ad4a95f860488affff2b/optlang-1.5.2-py2.py3-none-any.whl (147kB)
Collecting rich>=8.0 (from cobra)
  Downloading https://files.pythonhosted.org/packages/1a/da/2a1f064dc620ab47f3f826ae085384084b71ea05c8c21d67f1dfc29189ab/rich-10.1.0-py3-none-any.whl (201kB)
Collecting future (from cobra)
  Downloading https://files.pythonhosted.org/packages/45/0b/38b06fd9b92dc2b68d58b75f900e97884c45bedd2ff83203d933cf5851c9/future-0.18.2.tar.gz (829k

  Found existing installation: Pygments 2.4.2
    Uninstalling Pygments-2.4.2:
      Successfully uninstalled Pygments-2.4.2
  Found existing installation: pandas 0.25.1
    Uninstalling pandas-0.25.1:
      Successfully uninstalled pandas-0.25.1
Successfully installed appdirs-1.4.4 cobra-0.22.0 colorama-0.4.4 commonmark-0.9.1 depinfo-1.7.0 diskcache-5.2.1 future-0.18.2 h11-0.12.0 httpcore-0.13.3 httpx-0.18.1 importlib-metadata-4.0.1 importlib-resources-5.1.2 optlang-1.5.2 pandas-1.2.4 pydantic-1.8.1 pygments-2.9.0 python-libsbml-5.19.0 rfc3986-1.5.0 rich-10.1.0 ruamel.yaml-0.17.4 ruamel.yaml.clib-0.2.2 sniffio-1.2.0 swiglpk-5.0.3 typing-extensions-3.10.0.0 zipp-3.4.1


In [19]:
import cobra

# https://cobrapy.readthedocs.io/en/latest/io.html#SBML

In [20]:
# cloned from
# https://github.com/VirtualMetabolicHuman
# 2021-05-08
# this is the more inclusive model. The other, Recon3DModel_301, is flux constrained.
R3D = "thiele/Recon/Current_Version/Recon3D_301_Reconstruction/Recon3D_301.xml"

model = cobra.io.read_sbml_model(R3D)

model

0,1
Name,COBRAModel
Memory address,0x07fbf4543dc88
Number of metabolites,8399
Number of reactions,13543
Number of groups,111
Objective expression,1.0*biomass_reaction - 1.0*biomass_reaction_reverse_32a6c
Compartments,"Cytoplasm, Lysosome, Mitochondrion, Endoplasmic_reticulum, Extracellular, Peroxisome, Nucleus, Golgi, unknownCompartment4"


In [21]:
dir(model)

['__add__',
 '__class__',
 '__delattr__',
 '__dict__',
 '__dir__',
 '__doc__',
 '__enter__',
 '__eq__',
 '__exit__',
 '__format__',
 '__ge__',
 '__getattribute__',
 '__getstate__',
 '__gt__',
 '__hash__',
 '__iadd__',
 '__init__',
 '__init_subclass__',
 '__le__',
 '__lt__',
 '__module__',
 '__ne__',
 '__new__',
 '__reduce__',
 '__reduce_ex__',
 '__repr__',
 '__setattr__',
 '__setstate__',
 '__sizeof__',
 '__str__',
 '__subclasshook__',
 '__weakref__',
 '_annotation',
 '_compartments',
 '_contexts',
 '_id',
 '_populate_solver',
 '_repr_html_',
 '_sbml',
 '_set_id_with_model',
 '_solver',
 '_tolerance',
 '_trimmed',
 '_trimmed_genes',
 '_trimmed_reactions',
 'add_boundary',
 'add_cons_vars',
 'add_groups',
 'add_metabolites',
 'add_reaction',
 'add_reactions',
 'annotation',
 'boundary',
 'compartments',
 'constraints',
 'copy',
 'demands',
 'description',
 'exchanges',
 'genes',
 'get_associated_groups',
 'get_metabolite_compartments',
 'groups',
 'id',
 'medium',
 'merge',
 'metabolite

In [25]:
model.metabolites[33].annotation

{'hmdb': 'HMDB06225',
 'inchi': 'InChI=1S/C28H44O3/c1-18(9-10-19(2)27(4,5)31)24-13-14-25-21(8-7-15-28(24,25)6)11-12-22-16-23(29)17-26(30)20(22)3/h9-12,18-19,23-26,29-31H,3,7-8,13-17H2,1-2,4-6H3/b10-9+,21-11+,22-12-/t18-,19+,23-,24-,25+,26+,28-/m1/s1',
 'pubchem.compound': '9547243'}

In [24]:
dir(model.metabolites[33])

['__class__',
 '__delattr__',
 '__dict__',
 '__dir__',
 '__doc__',
 '__eq__',
 '__format__',
 '__ge__',
 '__getattribute__',
 '__getstate__',
 '__gt__',
 '__hash__',
 '__init__',
 '__init_subclass__',
 '__le__',
 '__lt__',
 '__module__',
 '__ne__',
 '__new__',
 '__reduce__',
 '__reduce_ex__',
 '__repr__',
 '__setattr__',
 '__sizeof__',
 '__str__',
 '__subclasshook__',
 '__weakref__',
 '_annotation',
 '_bound',
 '_id',
 '_model',
 '_reaction',
 '_repr_html_',
 '_set_id_with_model',
 'annotation',
 'charge',
 'compartment',
 'constraint',
 'copy',
 'elements',
 'formula',
 'formula_weight',
 'id',
 'model',
 'name',
 'notes',
 'reactions',
 'remove_from_model',
 'shadow_price',
 'summary',
 'y']

In [32]:
[model.metabolites[33].formula,
model.metabolites[33].charge,
 model.metabolites[33].name,
 model.metabolites[33].id,
 model.metabolites[33]._id,
 model.metabolites[33].annotation
]

['C28H44O3',
 0,
 '1-Alpha,25-Dihydroxyvitamin D2',
 '1a25dhvitd2[m]',
 '1a25dhvitd2[m]',
 {'hmdb': 'HMDB06225',
  'inchi': 'InChI=1S/C28H44O3/c1-18(9-10-19(2)27(4,5)31)24-13-14-25-21(8-7-15-28(24,25)6)11-12-22-16-23(29)17-26(30)20(22)3/h9-12,18-19,23-26,29-31H,3,7-8,13-17H2,1-2,4-6H3/b10-9+,21-11+,22-12-/t18-,19+,23-,24-,25+,26+,28-/m1/s1',
  'pubchem.compound': '9547243'}]

In [33]:
len( model.reactions )

13543

In [35]:
model.reactions[33]

0,1
Reaction identifier,25VITD2Hm
Name,1-Alpha-Vitamin D-25-Hydroxylase (D2)
Memory address,0x07fbf62abc4e0
Stoichiometry,"25hvitd2[m] + h[m] + nadph[m] + o2[m] --> 1a25dhvitd2[m] + h2o[m] + nadp[m]  25-Hydroxyvitamin D2 + Proton + Nicotinamide Adenine Dinucleotide Phosphate - Reduced + Oxygen --> 1-Alpha,25-Dihydroxyvitamin D2 + Water + Nicotinamide Adenine Dinucleotide Phosphate"
GPR,1594.1
Lower bound,0.0
Upper bound,1000.0


In [36]:
dir(model.reactions[33])

['__add__',
 '__class__',
 '__copy__',
 '__deepcopy__',
 '__delattr__',
 '__dict__',
 '__dir__',
 '__doc__',
 '__eq__',
 '__format__',
 '__ge__',
 '__getattribute__',
 '__getstate__',
 '__gt__',
 '__hash__',
 '__iadd__',
 '__imul__',
 '__init__',
 '__init_subclass__',
 '__isub__',
 '__le__',
 '__lt__',
 '__module__',
 '__mul__',
 '__ne__',
 '__new__',
 '__radd__',
 '__reduce__',
 '__reduce_ex__',
 '__repr__',
 '__setattr__',
 '__setstate__',
 '__sizeof__',
 '__str__',
 '__sub__',
 '__subclasshook__',
 '__weakref__',
 '_annotation',
 '_associate_gene',
 '_check_bounds',
 '_dissociate_gene',
 '_gene_reaction_rule',
 '_genes',
 '_id',
 '_lower_bound',
 '_metabolites',
 '_model',
 '_repr_html_',
 '_set_id_with_model',
 '_update_awareness',
 '_upper_bound',
 'add_metabolites',
 'annotation',
 'boundary',
 'bounds',
 'build_reaction_from_string',
 'build_reaction_string',
 'check_mass_balance',
 'compartments',
 'copy',
 'delete',
 'flux',
 'flux_expression',
 'forward_variable',
 'functiona

In [37]:
model.reactions[33].products

[<Metabolite h2o[m] at 0x7fbf454644e0>,
 <Metabolite nadp[m] at 0x7fbf45464940>,
 <Metabolite 1a25dhvitd2[m] at 0x7fbf45464c50>]

In [38]:
model.reactions[33].reactants

[<Metabolite o2[m] at 0x7fbf454646a0>,
 <Metabolite h[m] at 0x7fbf45464860>,
 <Metabolite nadph[m] at 0x7fbf45464748>,
 <Metabolite 25hvitd2[m] at 0x7fbf45467f98>]

In [39]:
model.reactions[33].reactants[3].annotation

{'hmdb': 'HMDB01438', 'pubchem.compound': '22833566'}

In [40]:
model.reactions[33].reactants[3].id

'25hvitd2[m]'
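
The attributes explored above are enough to build a reaction record. The following is a sketch only: it uses `SimpleNamespace` stand-ins instead of a loaded cobra model so that it runs without the SBML file, and the dictionary keys and the helper name `reaction_to_dict` are hypothetical.

```python
from types import SimpleNamespace

def reaction_to_dict(reaction):
    """Collect id, name, and metabolite ids from a cobra-like Reaction object."""
    return {
        "id": reaction.id,
        "name": reaction.name,
        "reactants": [m.id for m in reaction.reactants],
        "products": [m.id for m in reaction.products],
    }

def _met(met_id):
    # Stand-in for a cobra Metabolite; only the id attribute is needed here.
    return SimpleNamespace(id=met_id)

# Stand-in for model.reactions[33], following the stoichiometry printed above.
toy_reaction = SimpleNamespace(
    id="25VITD2Hm",
    name="1-Alpha-Vitamin D-25-Hydroxylase (D2)",
    reactants=[_met("25hvitd2[m]"), _met("h[m]"), _met("nadph[m]"), _met("o2[m]")],
    products=[_met("1a25dhvitd2[m]"), _met("h2o[m]"), _met("nadp[m]")],
)

record = reaction_to_dict(toy_reaction)
```

With a real model, `reaction_to_dict(model.reactions[33])` should work the same way, since it reads only `id`, `name`, `reactants`, and `products`.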

In [42]:
model.reactions[33].subsystem

''

In [43]:
model.groups[33]

<Group group34 at 0x7fbf42da0e10>

In [44]:
dir(model.groups[33])

['KIND_TYPES',
 '__class__',
 '__delattr__',
 '__dict__',
 '__dir__',
 '__doc__',
 '__eq__',
 '__format__',
 '__ge__',
 '__getattribute__',
 '__getstate__',
 '__gt__',
 '__hash__',
 '__init__',
 '__init_subclass__',
 '__le__',
 '__len__',
 '__lt__',
 '__module__',
 '__ne__',
 '__new__',
 '__reduce__',
 '__reduce_ex__',
 '__repr__',
 '__setattr__',
 '__sizeof__',
 '__str__',
 '__subclasshook__',
 '__weakref__',
 '_annotation',
 '_id',
 '_kind',
 '_members',
 '_model',
 '_set_id_with_model',
 'add_members',
 'annotation',
 'id',
 'kind',
 'members',
 'name',
 'notes',
 'remove_members']

In [45]:
[
    model.groups[33].name,
    model.groups[33].members,
    model.groups[33].annotation,
]

['Glycine, serine, alanine, and threonine metabolism',
 [<Reaction 2AMACHYD at 0x7fbf62abccf8>,
  <Reaction AACTOOR at 0x7fbf62985fd0>,
  <Reaction ALASm at 0x7fbf62862c18>,
  <Reaction AOBUTDsm at 0x7fbf627cae48>,
  <Reaction BETALDHxm at 0x7fbf626c74a8>,
  <Reaction BHMT at 0x7fbf6273c710>,
  <Reaction CHOLD2m at 0x7fbf4537e6a0>,
  <Reaction DMGDHm at 0x7fbf45227b70>,
  <Reaction GCC2am at 0x7fbf44d17208>,
  <Reaction GCC2bim at 0x7fbf44d17780>,
  <Reaction GCC2cm at 0x7fbf44d17b70>,
  <Reaction GCCam at 0x7fbf44cf7828>,
  <Reaction GCCbim at 0x7fbf44d223c8>,
  <Reaction GCCcm at 0x7fbf44d0def0>,
  <Reaction GHMT2rm at 0x7fbf44cda0f0>,
  <Reaction GLYATm at 0x7fbf44bf9d68>,
  <Reaction GLYOp at 0x7fbf44c07940>,
  <Reaction GNMT at 0x7fbf44c1c5f8>,
  <Reaction PGCD at 0x7fbf4468c828>,
  <Reaction SARCOXp at 0x7fbf4441ae10>,
  <Reaction SERHL at 0x7fbf443d0eb8>,
  <Reaction SPTix at 0x7fbf44332eb8>,
  <Reaction r0160 at 0x7fbf44167eb8>,
  <Reaction r0552 at 0x7fbf43ff1ef0>,
  <Reaction

## Summary

- metabolites and reactions are quite standard.
- subsystem may not be defined in these SBML files.
- use "group" as pathway
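
The last point can be sketched as a small converter from a cobra group to a pathway record. This is an assumption-laden sketch: the stand-in objects replace a loaded model, and the output keys and the helper name `group_to_pathway` are hypothetical rather than the metDataModel Pathway fields.

```python
from types import SimpleNamespace

def group_to_pathway(group):
    """Treat a cobra-like Group as a pathway: keep its id, name, and member reaction ids."""
    return {
        "id": group.id,
        "name": group.name,
        "list_of_reactions": [reaction.id for reaction in group.members],
    }

# Stand-in for model.groups[33]; only the first three members are listed.
toy_group = SimpleNamespace(
    id="group34",
    name="Glycine, serine, alanine, and threonine metabolism",
    members=[
        SimpleNamespace(id="2AMACHYD"),
        SimpleNamespace(id="AACTOOR"),
        SimpleNamespace(id="ALASm"),
    ],
)

pathway = group_to_pathway(toy_group)
```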

Target structure in Azimuth:

```
│   ├── RECON3D (document)
│   │   ├── Reactions (subcollection)
│   │   ├── Compounds (subcollection)
│   │   └── Pathways (subcollection)
```

We will export one Python pickle and one JSON file per model.
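
One way to mirror the tree above in memory is a nested dictionary with the three subcollections as lists. The sketch below also checks that every reaction referenced by a pathway exists in the reaction list. The function name `assemble_model` and all keys are assumptions for illustration.

```python
def assemble_model(model_id, reactions, compounds, pathways):
    """Bundle the three collections and report pathway reactions that are not defined."""
    known = {reaction["id"] for reaction in reactions}
    referenced = {rid for pathway in pathways for rid in pathway["list_of_reactions"]}
    document = {
        "id": model_id,
        "Reactions": reactions,
        "Compounds": compounds,
        "Pathways": pathways,
    }
    return document, sorted(referenced - known)

# Toy input; "MISSING_RXN" is a made-up id used to trigger the check.
document, missing = assemble_model(
    "RECON3D",
    reactions=[{"id": "25VITD2Hm"}],
    compounds=[{"id": "1a25dhvitd2"}],
    pathways=[{"id": "toy_pathway", "list_of_reactions": ["25VITD2Hm", "MISSING_RXN"]}],
)
```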

In [49]:
# from metDataModel.core import Compound, Reaction, Pathway, metabolicModel

def metabolite2compound(M):
    # convert cobra Metabolite to metDataModel Compound
    Cpd = Compound()
    Cpd.src_id = M.id
    Cpd.id = M.id.split("[")[0]
    Cpd.name = M.name
    Cpd.charge = M.charge
    Cpd.charged_formula = M.formula
    Cpd.db_ids = M.annotation
    return Cpd

metabolite2compound(model.metabolites[33]).id

'1a25dhvitd2'

In [56]:
# list of Compounds
myCpds = []
anno = {}
for M in model.metabolites:
    anno[M.id.split("[")[0]] = M.annotation
    myCpds.append(metabolite2compound(M))
    
print("total, ", len(myCpds), len(anno))

unique = set([C.id for C in myCpds])
formula = set([C.charged_formula for C in myCpds if C.charged_formula])
# a little trickier to test on dictionaries
annotation = [v for v in anno.values() if v]

print("real, ", len(unique), len(formula), len(annotation))


total,  8399 4140
real,  4140 2631 2943


### Numbers in RECON3D

len(unique), len(formula), len(annotation) = 
4140, 2631, 2943
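
One caveat about the count of annotated compounds: the loop above assigns `anno[...] = M.annotation` once per compartment copy, so only the last copy of each compound is kept. If a later copy ever carries an empty annotation, it would replace a populated one. A hedged alternative is to merge the annotations instead. This is a sketch with stand-in objects, and whether such overwrites actually occur in Recon3D has not been checked here.

```python
from types import SimpleNamespace

def merge_annotations(metabolites):
    """Merge annotations of all compartment copies of a compound, keeping the first value per key."""
    merged = {}
    for met in metabolites:
        base_id = met.id.split("[")[0]
        entry = merged.setdefault(base_id, {})
        for key, value in met.annotation.items():
            entry.setdefault(key, value)
    return merged

# Two hypothetical compartment copies; the second has no annotation.
toy_metabolites = [
    SimpleNamespace(id="1a25dhvitd2[m]", annotation={"hmdb": "HMDB06225"}),
    SimpleNamespace(id="1a25dhvitd2[c]", annotation={}),
]

merged = merge_annotations(toy_metabolites)
annotated = [v for v in merged.values() if v]
```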

## Now do ATLAS

In [57]:
xmlFile = 'Human-GEM/ModelFiles/xml/HumanGEM.xml'
excelFile = 'Human-GEM/ModelFiles/xlsx/HumanGEM.xlsx'

model2 = cobra.io.read_sbml_model(xmlFile)

model2

'' is not a valid SBML 'SId'.
Adding exchange reaction EX_m00001x with default bounds for boundary metabolite: m00001x.
Adding exchange reaction EX_m00002x with default bounds for boundary metabolite: m00002x.
Adding exchange reaction EX_m00032x with default bounds for boundary metabolite: m00032x.
Adding exchange reaction EX_m00035x with default bounds for boundary metabolite: m00035x.
Adding exchange reaction EX_m00097x with default bounds for boundary metabolite: m00097x.
Adding exchange reaction EX_m00157x with default bounds for boundary metabolite: m00157x.
Adding exchange reaction EX_m00179x with default bounds for boundary metabolite: m00179x.
Adding exchange reaction EX_m00204x with default bounds for boundary metabolite: m00204x.
Adding exchange reaction EX_m00228x with default bounds for boundary metabolite: m00228x.
Adding exchange reaction EX_m00242x with default bounds for boundary metabolite: m00242x.
[... many similar "Adding exchange reaction ... with default bounds for boundary metabolite" messages omitted; the captured output was truncated in several places ...]

0,1
Name,
Memory address,0x07fbf4184f8d0
Number of metabolites,10073
Number of reactions,14770
Number of groups,145
Objective expression,1.0*biomass_human - 1.0*biomass_human_reverse_fb2f2
Compartments,"Cytosol, Extracellular, Lysosome, Endoplasmic reticulum, Mitochondria, Peroxisome, Golgi apparatus, Nucleus, Boundary, Inner mitochondria"


In [61]:
model2.metabolites[99]

0,1
Metabolite identifier,m00043c
Name,(2E)-eicosenoyl-CoA
Memory address,0x07fbf4352fef0
Formula,C41H68N7O17P3S
Compartment,c
In 2 reaction(s),"HMR_2203, HMR_2204"


In [62]:
dir(model2.metabolites[99])

['__class__',
 '__delattr__',
 '__dict__',
 '__dir__',
 '__doc__',
 '__eq__',
 '__format__',
 '__ge__',
 '__getattribute__',
 '__getstate__',
 '__gt__',
 '__hash__',
 '__init__',
 '__init_subclass__',
 '__le__',
 '__lt__',
 '__module__',
 '__ne__',
 '__new__',
 '__reduce__',
 '__reduce_ex__',
 '__repr__',
 '__setattr__',
 '__sizeof__',
 '__str__',
 '__subclasshook__',
 '__weakref__',
 '_annotation',
 '_bound',
 '_id',
 '_model',
 '_reaction',
 '_repr_html_',
 '_set_id_with_model',
 'annotation',
 'charge',
 'compartment',
 'constraint',
 'copy',
 'elements',
 'formula',
 'formula_weight',
 'id',
 'model',
 'name',
 'notes',
 'reactions',
 'remove_from_model',
 'shadow_price',
 'summary',
 'y']

In [72]:
def at_metabolite2compound(M):
    # convert cobra Metabolite to metDataModel Compound
    Cpd = Compound()
    Cpd.src_id = M.id
    Cpd.id = M.id           #[2:8]
    Cpd.name = M.name
    Cpd.charge = M.charge
    Cpd.charged_formula = M.formula
    Cpd.db_ids = M.annotation
    return Cpd

[at_metabolite2compound(model2.metabolites[199]).id,
 at_metabolite2compound(model2.metabolites[99]).src_id,
 at_metabolite2compound(model2.metabolites[99]).charged_formula,
 at_metabolite2compound(model2.metabolites[99]).db_ids,
]

['m00103c', 'm00043c', 'C41H68N7O17P3S', {'sbo': 'SBO:0000247'}]

In [71]:
# list of Compounds
atCpds = []
anno2 = {}
for M in model2.metabolites:
    anno2[M.id] = M.annotation
    atCpds.append(at_metabolite2compound(M))
    
print("total, ", len(atCpds), len(anno2))

unique = set([C.id for C in atCpds])
formula = set([C.charged_formula for C in atCpds if C.charged_formula])
# a little trickier to test on dictionaries
annotation = [v for v in anno2.values() if v]

print("real, ", len(unique), len(formula), len(annotation))


total,  10073 10073
real,  10073 2680 10073
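
The annotation count of 10073 is likely inflated: the metabolite shown above carries only an `sbo` entry, and a plain truthiness test counts that as annotated. A minimal sketch of a stricter check, assuming that only keys other than `sbo` count as database links. `has_db_links` is a hypothetical helper, not part of cobra or metDataModel, and the two dicts are stand-ins copied from outputs in this notebook:

```python
def has_db_links(annotation, ignore=("sbo",)):
    """Return True if the annotation dict has any non-empty key outside `ignore`."""
    return any(value for key, value in annotation.items() if key not in ignore)

# Stand-in annotation dicts shaped like cobra's Metabolite.annotation
examples = {
    "m00043c": {"sbo": "SBO:0000247"},
    "2agpe181_c": {
        "sbo": "SBO:0000247",
        "bigg.metabolite": "2agpe181",
        "metanetx.chemical": "MNXM3449",
    },
}

linked = [mid for mid, anno in examples.items() if has_db_links(anno)]
print(len(examples), len(linked))
```

With real data, the same test could replace `if v` in the list comprehension above, e.g. `[v for v in anno2.values() if has_db_links(v)]`.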


In [73]:
list(unique)[:9]

['asnphecys_c',
 'm03155s',
 'pcholar_hs_c',
 'm01819x',
 'c12dc_p',
 'm00333m',
 'm02349s',
 'm00630m',
 '35dhpvs_s']

**The ATLAS format is not fully compatible with cobra**

As a workaround for now, check the annotation JSON shipped with Human-GEM:

In [74]:
#
# Will check the JSON data in the repo
# shuzhao@canyon:~/projects/Azimuth/model_porting$ ls Human-GEM/data/annotation/
# humanGEMMetAssoc.JSON  humanGEMRxnAssoc.JSON

import json
with open('Human-GEM/data/annotation/humanGEMMetAssoc.JSON') as f:
    jm = json.load(f)

print(jm.keys())

dict_keys(['mets', 'metsNoComp', 'metBiGGID', 'metKEGGID', 'metHMDBID', 'metChEBIID', 'metPubChemID', 'metLipidMapsID', 'metEHMNID', 'metHepatoNET1ID', 'metRecon3DID', 'metMetaNetXID'])


In [75]:
for k,v in jm.items(): 
    print( v[:5] )

['m00001c', 'm00001s', 'm00002c', 'm00002s', 'm00003c']
['m00001', 'm00001', 'm00002', 'm00002', 'm00003']
['carveol', 'carveol', 'appnn', 'appnn', '']
['C00964', 'C00964', 'C09880', 'C09880', '']
['', '', 'HMDB06525', 'HMDB06525', '']
['CHEBI:15389', 'CHEBI:15389', 'CHEBI:36740', 'CHEBI:36740', '']
['', '', '6654', '6654', '']
['', '', '', '', 'LMFA01030283']
['', '', '', '', '']
['', '', '', '', '']
['carveol', 'carveol', 'appnn', 'appnn', 'M00003']
['MNXM45735', 'MNXM45735', 'MNXM163755', 'MNXM163755', 'MNXM150165; MNXM27815']


In [76]:
# How many unique IDs in each?
for k,v in jm.items(): 
    print(k, len(set(v)))

mets 10138
metsNoComp 4168
metBiGGID 2347
metKEGGID 1647
metHMDBID 713
metChEBIID 1247
metPubChemID 1382
metLipidMapsID 487
metEHMNID 712
metHepatoNET1ID 773
metRecon3DID 5089
metMetaNetXID 2972
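
Two details make these counts approximate: `len(set(v))` counts the empty string as one ID, and some cells hold several IDs (e.g. `'MNXM150165; MNXM27815'`). A minimal sketch of a stricter count, assuming `;` is the only separator used in these columns. `unique_ids` is a hypothetical helper, and the sample values are the first five entries printed above:

```python
def unique_ids(values, sep=";"):
    """Collect distinct non-empty IDs, splitting multi-valued cells on `sep`."""
    found = set()
    for cell in values:
        for token in cell.split(sep):
            token = token.strip()
            if token:
                found.add(token)
    return found

# First five entries of two columns, copied from the output above
sample = {
    "metHMDBID": ["", "", "HMDB06525", "HMDB06525", ""],
    "metMetaNetXID": ["MNXM45735", "MNXM45735", "MNXM163755", "MNXM163755",
                      "MNXM150165; MNXM27815"],
}

for key, values in sample.items():
    # naive count vs. count of real, individually split IDs
    print(key, len(set(values)), len(unique_ids(values)))
```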


In [78]:
# There are 4168 unique metsNoComp (compartment-free IDs).

# Check how many of them link to HMDB or, failing that, PubChem.
HP = {}
for ii in range(len(jm['metsNoComp'])):
    HP[jm['metsNoComp'][ii]] = jm['metHMDBID'][ii] or jm['metPubChemID'][ii]
    
print(len(HP))
print(len(set([x for x in HP.values() if x])))

4168
1411
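
One caveat with the loop above: each compartment-specific row overwrites the entry for its compartment-free ID, so a later row with blank IDs would erase an ID recorded earlier. Whether this happens in the Human-GEM file has not been checked here. A minimal sketch that keeps the first non-empty value instead; `first_nonempty_by_key` is a hypothetical helper, and the last sample row is invented purely to show the difference:

```python
def first_nonempty_by_key(keys, *columns):
    """Map each key to the first non-empty value seen across `columns`."""
    result = {}
    for key, *candidates in zip(keys, *columns):
        value = next((c for c in candidates if c), "")
        # Only fill in when nothing non-empty has been recorded for this key yet
        if not result.get(key):
            result[key] = value
    return result

# First five rows from the output above, plus one hypothetical trailing row
# ('m00002' with blank IDs) added to show the difference.
no_comp = ['m00001', 'm00001', 'm00002', 'm00002', 'm00003', 'm00002']
hmdb = ['', '', 'HMDB06525', 'HMDB06525', '', '']
pubchem = ['', '', '6654', '6654', '', '']

overwriting = {}
for k, h, p in zip(no_comp, hmdb, pubchem):
    overwriting[k] = h or p          # same logic as the cell above

keeping = first_nonempty_by_key(no_comp, hmdb, pubchem)
print(repr(overwriting['m00002']), repr(keeping['m00002']))
```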


### Numbers in ATLAS

len(unique) = 4168 (compartment-free IDs, metsNoComp)

len(formula) = 2680

len(annotation) = 1411 (HMDB or PubChem)

## E. coli

http://bigg.ucsd.edu/models/

Downloads last updated Oct 31, 2019 

In [79]:
ecoli = 'EColi-iJO1366/iJO1366.xml'

model3 = cobra.io.read_sbml_model(ecoli)

In [80]:
model3

0,1
Name,iJO1366
Memory address,0x07fbf384ab128
Number of metabolites,1805
Number of reactions,2583
Number of groups,0
Objective expression,1.0*BIOMASS_Ec_iJO1366_core_53p95M - 1.0*BIOMASS_Ec_iJO1366_core_53p95M_reverse_5c8b1
Compartments,"cytosol, extracellular space, periplasm"


In [81]:
model3.metabolites[99]

0,1
Metabolite identifier,2agpe181_c
Name,2-Acyl-sn-glycero-3-phosphoethanolamine (n-C18:1)
Memory address,0x07fbf38565f28
Formula,C23H46NO7P1
Compartment,c
In 4 reaction(s),"2AGPE181tipp, LPLIPAL2E181, LPLIPAL2ATE181, 2AGPEAT181"


In [84]:
def metabolite3compound(M):
    # convert cobra Metabolite to metDataModel Compound
    Cpd = Compound()
    Cpd.src_id = M.id
    Cpd.id = M.id.split("_")[0]
    Cpd.name = M.name
    Cpd.charge = M.charge
    Cpd.charged_formula = M.formula
    Cpd.db_ids = M.annotation
    return Cpd

[metabolite3compound(model3.metabolites[99]).id,
 metabolite3compound(model3.metabolites[99]).src_id,
 metabolite3compound(model3.metabolites[99]).charged_formula,
 metabolite3compound(model3.metabolites[99]).db_ids,
]

['2agpe181',
 '2agpe181_c',
 'C23H46NO7P1',
 {'sbo': 'SBO:0000247',
  'bigg.metabolite': '2agpe181',
  'metanetx.chemical': 'MNXM3449'}]

In [86]:
# list of Compounds
myCpds3 = []
anno3 = {}
for M in model3.metabolites:
    anno3[M.id.split("_")[0]] = M.annotation
    myCpds3.append(metabolite3compound(M))
    
print("total, ", len(myCpds3), len(anno3))

unique = set([C.id for C in myCpds3])
formula = set([C.charged_formula for C in myCpds3 if C.charged_formula])
# a little trickier to test on dictionaries
annotation = [v for v in anno3.values() if 'metanetx' in str(v)]

print("real, ", len(unique), len(formula), len(annotation))



total,  1805 1100
real,  1100 911 1100


### Numbers in E. coli iJO1366

len(unique) = 1100

len(formula) = 911

len(annotation) = 1100 ?
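
A caveat on the count of unique IDs: BiGG identifiers can contain underscores inside the base ID (for example `glc__D_c`), so `M.id.split("_")[0]` may truncate them and merge distinct compounds. The 1100 above could therefore be an undercount. A minimal sketch of an alternative, assuming the compartment is always the last underscore-delimited token. `strip_compartment` is a hypothetical helper, and the ID list is illustrative rather than read from the model:

```python
def strip_compartment(met_id):
    """Drop the trailing compartment token, e.g. '2agpe181_c' -> '2agpe181'."""
    base, sep, _compartment = met_id.rpartition("_")
    # IDs without any underscore are returned unchanged
    return base if sep else met_id

# Illustrative BiGG-style IDs
ids = ["2agpe181_c", "glc__D_c", "glc__D_e", "ala__D_c", "ala__L_c"]

by_split = sorted({i.split("_")[0] for i in ids})
by_rpartition = sorted({strip_compartment(i) for i in ids})
print(len(by_split), by_split)
print(len(by_rpartition), by_rpartition)
```

Rerunning the counts with this helper would show whether the numbers for iJO1366 change.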

## WormJam model



In [87]:
worm = 'worm/WormJam-GEM-20190101_L3.xml'

model4 = cobra.io.read_sbml_model(worm)

'' is not a valid SBML 'SId'.
No objective in listOfObjectives
No objective coefficients in model. Unclear what should be optimized


In [88]:
model4

0,1
Name,
Memory address,0x07fbf317d02b0
Number of metabolites,2834
Number of reactions,3530
Number of groups,114
Objective expression,0
Compartments,"Cytosol, Mitochondrion, Nucleus, Extracellular, Mitochondrial Inner Membrane"


In [90]:
model4.metabolites[9]

0,1
Metabolite identifier,g3m8masn_c
Name,(alpha-D-Glucosyl)3-(alpha-D-mannosyl)8-beta-D-...
Memory address,0x07fbf38f20240
Formula,
Compartment,c
In 4 reaction(s),"2_4_1_119_RXN_c, RC05979, RC05976, 3_2_1_106_RXN_c"


In [91]:
metabolites4 = [M.id for M in model4.metabolites]
print(len(metabolites4))

metabolites4 = set([x.split('_')[0] for x in metabolites4])
print(len(metabolites4))

2834
1271


### Numbers in WormJam

len(unique) = 1271

Formulae and annotations are still unclear for this model (the Formula field in the example above is empty); they need to be pulled from elsewhere.

### Numbers in Drosophila


https://www.nature.com/articles/s41598-019-53532-4

Schönborn, J.W., Jehrke, L., Mettler-Altmann, T. et al. FlySilico: Flux balance modeling of Drosophila larval growth and resource allocation. Sci Rep 9, 17156 (2019). https://doi.org/10.1038/s41598-019-53532-4

This is a curated model in which all metabolites have public IDs.

Number of unique metabolites = 203


In [92]:
fly = 'Drosophia/41598_2019_53532_MOESM2_ESM.xls'
xlsxData = xlrd.open_workbook(fly)
sheets = xlsxData.sheets()
for x in sheets: print(x.name)

Title page
reactions
metabolites


In [93]:
for ii in range(5):
    print(sheets[2].row_values(ii))

['Abbreviation', 'officialname', 'formula(neutral)', 'formula', 'Charge', 'Compartment', 'KEGG ID', 'PubChem ID', 'ChEBI ID', 'InChI string', 'Smiles', 'HMDB', 'Notes']
['13dpg[c]', '3-Phospho-D-glyceroyl phosphate', 'C3H8O10P2', 'C3H4O10P2', -4.0, 'c', 'C00236', 3535.0, 16001.0, '1S/C3H8O10P2/c4-2(1-12-14(6,7)8)3(5)13-15(9,10)11/h2,4H,1H2,(H2,6,7,8)(H2,9,10,11)/t2-/m1/s1', 'O[C@H](COP(O)(O)=O)C(=O)OP(O)(O)=O', '', '']
['2pg[c]', 'D-Glycerate 2-phosphate', 'C3H7O7P', 'C3H4O7P', -3.0, 'c', 'C00631', 3904.0, 17835.0, '1S/C3H7O7P/c4-1-2(3(5)6)10-11(7,8)9/h2,4H,1H2,(H,5,6)(H2,7,8,9)/t2-/m1/s1', 'OC[C@@H](OP(O)(O)=O)C(O)=O', 'HMDB03391', '']
['3pg[c]', '3-Phospho-D-glycerate', 'C3H7O7P', 'C3H4O7P', -3.0, 'c', 'C00197', 3497.0, 17794.0, '1S/C3H7O7P/c4-2(3(5)6)1-10-11(7,8)9/h2,4H,1H2,(H,5,6)(H2,7,8,9)/p-3/t2-/m1/s1', 'C(OP(=O)([O-])[O-])C(O)C(=O)[O-]', '', '']
['6pgc[c]', '6-Phospho-D-gluconate', 'C6H13O10P', 'C6H10O10P', -3.0, 'c', 'C00345', 3638.0, 48928.0, '1S/C6H13O10P/c7-2(1-16-17(13,14)

In [95]:
fly_m = [sheets[2].row_values(ii)[0] for ii in range(1, sheets[2].nrows)]

fly_m_unique = set([x.split("[")[0] for x in fly_m])

print( len(fly_m), len(fly_m_unique ))

293 203
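
The rows of the `metabolites` sheet already carry what a Compound needs (names, formulae, charge, linked IDs). A minimal sketch of turning one row into a plain dict, assuming the column names printed above. `row_to_record` is a hypothetical helper, and the InChI, Smiles and Notes columns are left out of the sample for brevity:

```python
def row_to_record(header, row):
    """Convert one 'metabolites' sheet row into a plain dict keyed by column name."""
    record = dict(zip(header, row))
    # Numeric cells come back as floats in the output above (e.g. 3904.0),
    # so whole-number IDs are normalised to strings such as '3904'.
    for key in ("PubChem ID", "ChEBI ID"):
        value = record.get(key)
        if isinstance(value, float) and value.is_integer():
            record[key] = str(int(value))
    # 'Abbreviation' carries the compartment in brackets, e.g. '2pg[c]'
    record["id"] = record["Abbreviation"].split("[")[0]
    return record

header = ['Abbreviation', 'officialname', 'formula(neutral)', 'formula', 'Charge',
          'Compartment', 'KEGG ID', 'PubChem ID', 'ChEBI ID', 'HMDB']
row = ['2pg[c]', 'D-Glycerate 2-phosphate', 'C3H7O7P', 'C3H4O7P', -3.0,
       'c', 'C00631', 3904.0, 17835.0, 'HMDB03391']

rec = row_to_record(header, row)
print(rec["id"], rec["formula(neutral)"], rec["PubChem ID"], rec["HMDB"])
```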


**I needed the above information to make some quick figures.**

The next step is to export the models to our own format, using RECON3D as the example.
This is continued in a separate notebook - metabolicModel_RECON3D_20210510.json.
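
As a reminder of the target output (Python pickle and JSON), here is a minimal sketch of saving and reloading a compound list. It uses plain dicts as stand-ins for metDataModel objects; `save_compounds` and the file stem `example_compounds` are hypothetical names, and the actual export in the RECON3D notebook may differ:

```python
import json
import pickle
import tempfile
from pathlib import Path

# Stand-in compound record, with values copied from the E. coli example above
compounds = [
    {
        "id": "2agpe181",
        "src_id": "2agpe181_c",
        "charged_formula": "C23H46NO7P1",
        "db_ids": {"bigg.metabolite": "2agpe181", "metanetx.chemical": "MNXM3449"},
    },
]

def save_compounds(compounds, out_dir, stem):
    """Write the same compound list as both JSON and pickle; return both paths."""
    out_dir = Path(out_dir)
    json_path = out_dir / f"{stem}.json"
    pickle_path = out_dir / f"{stem}.pickle"
    json_path.write_text(json.dumps(compounds, indent=2))
    with open(pickle_path, "wb") as handle:
        pickle.dump(compounds, handle)
    return json_path, pickle_path

# Round trip through a temporary directory to confirm both files reload
with tempfile.TemporaryDirectory() as tmp:
    json_path, pickle_path = save_compounds(compounds, tmp, "example_compounds")
    reloaded_json = json.loads(json_path.read_text())
    with open(pickle_path, "rb") as handle:
        reloaded_pickle = pickle.load(handle)

print(reloaded_json == compounds, reloaded_pickle == compounds)
```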