# Introduction
Previously, we noticed that there is quite a lot of redundant information in the notes and annotation fields. Generally, it is more accepted to store this information in the annotations.
Therefore, in this model I will move the information that belongs in the annotations from the notes to the annotation field, and delete notes that duplicate an existing annotation. Overall, the notes section should then be empty (except for some comments).

In [1]:
import cameo
import pandas as pd
import cobra.io

In [121]:
model = cobra.io.read_sbml_model('../model/g-thermo.xml')

In [3]:
model_e_coli = cameo.load_model('iML1515')

# Metabolites
If a metabolite already has information for a field in the annotations, that value takes precedence. Otherwise, the annotation is filled in with the information from the notes.

Notes contain:
- KEGG: should be in annotation
- NAME: Leave for now. We've already gone through many of these by hand, because an issue with automatic conversion cut off the end of some names. Also, as this contains some alternate names for the metabolite, it can be useful to keep.
- ChEBI: should be in annotation


In [5]:
#check KEGG info
for met in model.metabolites:
    try:
        kegg_anno = met.annotation['kegg.compound'] #try to find if there is already info in the annotation
        if len(kegg_anno) > 0:
            try: 
                del met.notes['KEGG'] #delete the notes if the metabolite has it
            except KeyError: 
                continue #if it doesn't have a KEGG in the notes but has the annotation, just continue
        else: 
            continue
    except KeyError:
        try:
            kegg_note = met.notes['KEGG'] #lift KEGG ID from notes
            met.annotation['kegg.compound'] = kegg_note #move KEGG ID to annotation
            del met.notes['KEGG'] #delete the info in notes
        except KeyError:
            print (met.id)

focytB561_c
Biomass_c
Biomass_e
sbt__D_e
melib_e
tag1p__D_c
amylose_e
pyr_e
for_e


So there are 9 metabolites without KEGG in the annotation or notes. For the Biomass, this makes sense. For the others I will check and fix this. 
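The precedence rule used in the loop above (an existing annotation wins, otherwise the note is moved over) can be sketched as a small reusable helper. This is only a sketch: the helper name `move_note_to_annotation` is hypothetical, the objects are `SimpleNamespace` stand-ins rather than cobra metabolites, the IDs and values are placeholders, and unlike the loop above it treats an empty annotation as missing.

```python
from types import SimpleNamespace


def move_note_to_annotation(obj, note_key, annotation_key):
    """Move obj.notes[note_key] into obj.annotation[annotation_key].

    An existing, non-empty annotation wins and the note is simply dropped.
    Returns False only when neither field holds a value.
    """
    note_value = obj.notes.pop(note_key, None)  # the note is removed in every case
    if obj.annotation.get(annotation_key):
        return True
    if note_value is None:
        return False
    obj.annotation[annotation_key] = note_value
    return True


# Stand-ins for cobra metabolites; IDs and values are placeholders.
mets = [
    SimpleNamespace(id="a_c", notes={"KEGG": "C_NOTE", "NAME": "a"}, annotation={}),
    SimpleNamespace(id="b_c", notes={"KEGG": "C_NOTE"}, annotation={"kegg.compound": "C_ANNO"}),
    SimpleNamespace(id="c_c", notes={}, annotation={}),
]
missing = [m.id for m in mets if not move_note_to_annotation(m, "KEGG", "kegg.compound")]
print(missing)
```

The same call with `"ChEBI"` and `"chebi"` would cover the ChEBI case, so the list of IDs that need manual attention comes out of one function.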

In [6]:
model.metabolites.focytB561_c.annotation['kegg.compound'] = 'C05183'

In [7]:
model.metabolites.sbt__D_e.annotation = dict(model.metabolites.sbt__D_c.annotation) #copy, so both compartments do not share one dict

In [8]:
model.metabolites.melib_e.annotation = dict(model.metabolites.melib_c.annotation) #copy, so both compartments do not share one dict

In [9]:
#tagatose 1-phosphate doesn't have a KEGG number, but does have a MetaNetX ID, so I will add this instead.
model.metabolites.tag1p__D_c.annotation['metanetx.chemical'] = 'MNXM11293'

In [10]:
model.metabolites.amylose_e.annotation = dict(model.metabolites.amylose_c.annotation) #copy, so both compartments do not share one dict

In [11]:
model.metabolites.pyr_e.annotation = dict(model.metabolites.pyr_c.annotation) #copy, so both compartments do not share one dict

In [12]:
model.metabolites.for_e.annotation = dict(model.metabolites.for_c.annotation) #copy, so both compartments do not share one dict
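One thing to keep in mind when taking over the annotation of the cytosolic metabolite: assigning a dict to a second object does not copy it, so both objects can end up pointing at the same dictionary until the model is written and reloaded. The sketch below shows the effect with plain stand-in objects (not cobra metabolites) and placeholder values; as far as I know the cobrapy `annotation` setter stores the dict it is given, but that is an assumption worth checking for the installed version.

```python
import copy
from types import SimpleNamespace

# Stand-ins for a cytosolic metabolite and two extracellular counterparts.
cytosolic = SimpleNamespace(annotation={"kegg.compound": "ID_PLACEHOLDER"})
aliased = SimpleNamespace(annotation=cytosolic.annotation)  # same dict object
copied = SimpleNamespace(annotation=copy.deepcopy(cytosolic.annotation))  # independent copy

# A later edit to the cytosolic metabolite...
cytosolic.annotation["chebi"] = "CHEBI_PLACEHOLDER"

# ...leaks into the aliased object but not into the copied one.
alias_sees_change = "chebi" in aliased.annotation
copy_sees_change = "chebi" in copied.annotation
print(alias_sees_change, copy_sees_change)
```

For flat annotations `dict(...)` is enough; `copy.deepcopy` is the safer choice when values are lists.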

In [13]:
#save&commit
cobra.io.write_sbml_model(model,'../model/g-thermo.xml')

### ChEBI

In [4]:
#check Chebi info
for met in model.metabolites:
    try:
        chebi_anno = met.annotation['chebi'] #try to find if there is already info in the annotation
        if len(chebi_anno) > 0:
            try: 
                del met.notes['ChEBI'] #delete the notes if the metabolite has it
            except KeyError: 
                continue #if it doesn't have a ChEBI in the notes but has the annotation, just continue
        else: 
            continue
    except KeyError:
        try:
            chebi_note = met.notes['ChEBI'] #lift ChEBI ID from notes
            met.annotation['chebi'] = chebi_note #move ChEBI ID to annotation
            del met.notes['ChEBI'] #delete the info in notes
        except KeyError:
            print (met.id)

octdecacp_c
3hpalmACP_c
3hmrsACP_c
dodecacp_c
3hdecACP_c
toct2eACP_c
3hoctACP_c
but2eACP_c
6dg_c
cellb_c
stys_c
cellulose_c
starch_c
focytB561_c
decdp_c
dextrin_c
5mtr_c
cellb6p_c
3hbutACP_c
Biomass_c
Biomass_e
cmtdepp_c
cthzp_c
enzcys_c
enzscys_c
aglyc3p_c
cellb_e
gtbi_e
gtbi_c
tura_c
tura_e
kdg2_e
kdg2_c
dglcn5_e
dglcn5_c
mdgp_e
mdgp_c
tag__D_e
tag1p__D_c
tagdp__D_c


In [5]:
#save & commit
cobra.io.write_sbml_model(model,'../model/g-thermo.xml')

Quite a few metabolites do not have a ChEBI ID associated with them. However, this is fine, as they will contain at least a KEGG or MetaNetX ID in the annotation. Now I will check that every metabolite only has the name left in the notes field.

In [18]:
for met in model.metabolites:
    try:
        met.notes['PubChem']
        del met.notes['PubChem']
    except KeyError:
        continue 

In [19]:
for met in model.metabolites:
    try :
        met.notes['EXACT_MASS']
        del met.notes['EXACT_MASS']
    except KeyError:
        continue 

In [20]:
for met in model.metabolites:
    try:
        met.notes['MODULE']
        del met.notes['MODULE']
    except KeyError:
        continue 

In [21]:
for met in model.metabolites:
    try:
        met.notes['KNApSAcK']
        del met.notes['KNApSAcK']
    except KeyError:
        continue 

In [22]:
for met in model.metabolites:
    try:
        met.notes['3DMET']
        del met.notes['3DMET']
    except KeyError:
        continue 

In [23]:
for met in model.metabolites:
    try:
        met.notes['NIKKAJI']
        del met.notes['NIKKAJI']
    except KeyError:
        continue 


In [24]:
for met in model.metabolites:
    try:
        met.notes['PDB-CCD']
        del met.notes['PDB-CCD']
    except KeyError:
        continue 

In [25]:
for met in model.metabolites:
    try:
        met.notes['LIPIDMAPS']
        del met.notes['LIPIDMAPS']
    except KeyError:
        continue 

In [26]:
for met in model.metabolites:
    try:
        met.notes['CAS']
        del met.notes['CAS']
    except KeyError:
        continue 
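The nine loops above all follow the same pattern, so they could be collapsed into a single pass over a list of keys. A minimal sketch, assuming only that each object exposes `notes` as a plain dict (as the cells above rely on); the function name `strip_notes` is hypothetical and the demo objects are stand-ins with placeholder values.

```python
from types import SimpleNamespace

# The note keys deleted one by one in the cells above.
REDUNDANT_NOTE_KEYS = (
    "PubChem", "EXACT_MASS", "MODULE", "KNApSAcK", "3DMET",
    "NIKKAJI", "PDB-CCD", "LIPIDMAPS", "CAS",
)


def strip_notes(objects, keys):
    """Delete each key in `keys` from every object's notes; return the number of deletions."""
    removed = 0
    for obj in objects:
        for key in keys:
            if key in obj.notes:
                del obj.notes[key]
                removed += 1
    return removed


# Stand-ins for metabolites; IDs and values are placeholders.
demo = [
    SimpleNamespace(id="x_c", notes={"NAME": "x", "CAS": "placeholder", "3DMET": "placeholder"}),
    SimpleNamespace(id="y_c", notes={"NAME": "y"}),
]
removed = strip_notes(demo, REDUNDANT_NOTE_KEYS)
print(removed, demo[0].notes)
```

Returning the count gives a quick sanity check that a second run removes nothing.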

In [27]:
for met in model.metabolites:
    if len (met.notes) == 0: #metabolites with no notes
        continue
    elif len(met.notes) == 1: #metabolites with just one note
        try:
            name = met.notes['NAME']
        except KeyError:
            print (met.id)
    else:
        print (met.id)

atp_c
actn__R_c
actn_c
hacc_c
mdgp_e
mdgp_c


So some metabolites still have other notes. I will inspect them on a case-by-case basis and decide what to do with them.
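To see at a glance which note keys are left, rather than only which metabolites, the check can report the offending keys as well. This is a sketch with a hypothetical helper name (`leftover_note_keys`) and stand-in objects; `h2o_c` and `nad_c` are made-up examples and all note values are placeholders.

```python
from collections import Counter
from types import SimpleNamespace


def leftover_note_keys(objects, allowed=("NAME",)):
    """Return ({object id: unexpected note keys}, Counter of those keys)."""
    offenders = {}
    key_counts = Counter()
    for obj in objects:
        extra = sorted(set(obj.notes) - set(allowed))
        if extra:
            offenders[obj.id] = extra
            key_counts.update(extra)
    return offenders, key_counts


# Stand-ins for metabolites; note values are placeholders.
demo = [
    SimpleNamespace(id="actn_c", notes={"NAME": "placeholder", "KEGG_COMMENT": "placeholder"}),
    SimpleNamespace(id="h2o_c", notes={"NAME": "placeholder"}),
    SimpleNamespace(id="nad_c", notes={}),
]
offenders, key_counts = leftover_note_keys(demo)
print(offenders)
print(key_counts.most_common())
```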

In [28]:
model.metabolites.atp_c.notes = {}

In [29]:
del model.metabolites.actn__R_c.notes['KEGG_COMMENT']

In [30]:
del model.metabolites.actn_c.notes['KEGG_COMMENT']

In [31]:
del model.metabolites.hacc_c.notes['KEGG_COMMENT']

In [32]:
model.metabolites.mdgp_e.notes = {}

In [33]:
model.metabolites.mdgp_c.notes={}

In [35]:
#save&commit
cobra.io.write_sbml_model(model,'../model/g-thermo.xml')

Check that every metabolite has either a ChEBI or a KEGG annotation.

In [36]:
for met in model.metabolites: 
    if len(met.annotation) == 0:
        print(met.id)
    else:
        try:
            met.annotation['kegg.compound']
        except KeyError:
            try:
                met.annotation['chebi']
            except KeyError:
                print(met.id)

Biomass_c
Biomass_e
tag1p__D_c


Apart from the two biomass metabolites, only one metabolite lacks a KEGG or ChEBI annotation: tag1p__D_c. We gave it a MetaNetX annotation, so the metabolite can still be traced back.

In [37]:
model.metabolites.tag1p__D_c.annotation

{'metanetx.chemical': 'MNXM11293'}
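Since MetaNetX is accepted as a fallback, the traceability check can be phrased as "at least one of a set of namespaces is present". A minimal sketch with stand-in objects; the helper name `untraceable` is hypothetical, and the empty annotation for `Biomass_c` is an assumption made for the demo.

```python
from types import SimpleNamespace

# Namespaces that are considered enough to trace a metabolite back.
TRACEABLE_NAMESPACES = ("kegg.compound", "chebi", "metanetx.chemical")


def untraceable(objects, namespaces=TRACEABLE_NAMESPACES):
    """Return the IDs of objects that have none of the given annotation namespaces."""
    return [
        obj.id
        for obj in objects
        if not any(obj.annotation.get(ns) for ns in namespaces)
    ]


demo = [
    SimpleNamespace(id="tag1p__D_c", annotation={"metanetx.chemical": "MNXM11293"}),
    SimpleNamespace(id="Biomass_c", annotation={}),  # assumed empty for this demo
]
flagged = untraceable(demo)
print(flagged)
```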

# Reactions
The annotation field can contain: ec-code, kegg.reaction, metanetx.reaction, rhea and sbo. These fields may also be present in the notes, and so should be removed from the notes to prevent duplicate information.

So from the notes:
- Subsystem: I will remove this, as it has been moved to the group field and modified in notebook 34. 
- Gene_association: Remove, this is stored in the GPR field also.
- KEGG ID: move to kegg.reaction annotation and delete
- Enzyme: move to ec-code and remove
- Name: Contains some alternate names for reactions, can be left. 
- Definition: This describes the stoichiometry of the reaction. Remove: it can be found by looking at the reaction in the model itself, and it may become wrong if the reaction is changed.
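The decisions in the list above can be written down as a small policy table, which also makes it easy to spot note keys that the list does not cover. This is a sketch, not the code used in this notebook: `NOTE_POLICY` and `apply_policy` are hypothetical names, the key spelling `SUBSYSTEM` is an assumption (the other keys follow the spelling used in the cells of this notebook), and `FOO` is a made-up key for the demo.

```python
# action per notes key: ("remove" | "move" | "keep", target annotation key)
NOTE_POLICY = {
    "SUBSYSTEM": ("remove", None),  # assumed spelling of the subsystem key
    "GENE_ASSOCIATION": ("remove", None),
    "DEFINITION": ("remove", None),
    "KEGG ID": ("move", "kegg.reaction"),
    "ENZYME": ("move", "ec-code"),
    "NAME": ("keep", None),
}


def apply_policy(notes, annotation, policy=NOTE_POLICY):
    """Apply the policy in place; return the notes keys the policy does not cover."""
    unknown = []
    for key in list(notes):  # copy the keys, since notes is modified in the loop
        action, target = policy.get(key, ("unknown", None))
        if action == "remove":
            del notes[key]
        elif action == "move":
            value = notes.pop(key)
            annotation.setdefault(target, value)  # an existing annotation wins
        elif action == "unknown":
            unknown.append(key)
    return unknown


notes = {"KEGG ID": "R01739", "ENZYME": "1.1.1.215", "NAME": "placeholder",
         "DEFINITION": "placeholder", "FOO": "placeholder"}
annotation = {"ec-code": "EXISTING_PLACEHOLDER"}
unknown = apply_policy(notes, annotation)
print(unknown, notes, annotation)
```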

In [33]:
#check KEGG info
for rct in model.reactions:
    if rct.id.startswith('EX'):
        continue
    elif rct.id[-1:] in ['t', 's', 'c']: #get rid of transports, we will tackle those later
        continue
    else:
        try:
            kegg_anno = rct.annotation['kegg.reaction'] #try to find if there is already info in the annotation
            if len(kegg_anno) > 0:
                try: 
                    del rct.notes['KEGG ID'] #delete the note if the reaction has it
                except KeyError: 
                    continue #if it doesn't have a KEGG in the notes but has the annotation, just continue
            else: 
                continue
        except KeyError:
            try:
                kegg_note = rct.notes['KEGG ID'] #lift KEGG ID from notes
                rct.annotation['kegg.reaction'] = kegg_note #move KEGG ID to annotation
                del rct.notes['KEGG ID'] #delete the info in notes
            except KeyError:
                try:
                    kegg_note_wrong = rct.notes['ID'] #When Kegg ID is stored wrongly in the ID field instead
                    rct.annotation['kegg.reaction'] = kegg_note_wrong #move to annotation
                    del rct.notes['ID'] #delete note
                except KeyError:
                    print (rct.id)

LALDO2x
PGL
GTBIHY
TURAHY
KDG2R
DGLCN5R
BGAL
ALCD1
TAG1PK
TGBPA
SUCD5
ATPS4r
Kt2
CYTBO3


The reactions that did not have the KEGG ID in either notes or annotations will be inspected one by one now.
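The loop above decides which reactions to skip purely from the reaction ID, so it is worth being explicit about what those tests match. The sketch below uses hypothetical helper names (`is_exchange`, `looks_like_transport`) and plain strings; it assumes exchange reactions carry the `EX_` prefix, as the IDs in this model do.

```python
def is_exchange(reaction_id):
    """True for IDs that carry the exchange prefix."""
    return reaction_id.startswith("EX_")


def looks_like_transport(reaction_id, suffixes=("t", "s", "c")):
    """Suffix heuristic for transport IDs such as ...t, ...tpts and ...tabc."""
    return reaction_id.endswith(suffixes)


# `x in 'EX'` is a substring test, so short IDs can match by accident.
substring_pitfall = "E"[:2] in "EX"

# The suffix heuristic also catches non-transport IDs that happen to end in t, s or c.
checks = {rid: looks_like_transport(rid) for rid in ("NH4t", "GLC__Dtpts", "biomass", "PGL")}
print(substring_pitfall, checks)
```

Because `biomass` and IDs ending in `_c` pass the suffix test, the output of such a filter still needs a manual look.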

In [34]:
model.reactions.LALDO2x.annotation = dict(model_e_coli.reactions.LALDO2x.annotation) #copy instead of sharing the E. coli dict

In [35]:
model.reactions.PGL.annotation = dict(model_e_coli.reactions.PGL.annotation) #copy instead of sharing the E. coli dict

In [36]:
model.reactions.KDG2R.annotation['kegg.reaction'] = 'R01739'
model.reactions.KDG2R.annotation['ec-code'] = '1.1.1.215'
model.reactions.KDG2R.annotation['rhea']= '16656'
model.reactions.KDG2R.annotation['sbo'] ='SBO:0000176'
model.reactions.KDG2R.annotation['metanetx.reaction']='MNXR94784'

In [37]:
model.metabolites.glcn__D_c.name = 'D-Gluconate'

In [38]:
model.reactions.DGLCN5R.annotation['ec-code'] = '1.1.1.69'
model.reactions.DGLCN5R.annotation['kegg.reaction'] = 'R01740'
model.reactions.DGLCN5R.annotation['rhea'] = '23939'
model.reactions.DGLCN5R.annotation['sbo'] = 'SBO:0000176'
model.reactions.DGLCN5R.annotation['metanetx.reaction'] = 'MNXR107148'

In [39]:
model.reactions.BGAL.annotation['sbo'] = 'SBO:0000176'

In [40]:
model.reactions.ALCD1.annotation['kegg.reaction']= 'R00605'
model.reactions.ALCD1.annotation['ec-code'] = '1.1.1.244'
model.reactions.ALCD1.annotation['sbo'] = 'SBO:0000176'
model.reactions.ALCD1.notes = {}
model.reactions.ALCD1.annotation['metanetx.reaction'] = 'MNXR95708'

In [41]:
model.reactions.TAG1PK.annotation['sbo'] = 'SBO:0000176'
model.reactions.TAG1PK.annotation['ec-code'] = '2.7.1.B6'

In [42]:
model.reactions.TGBPA.name = 'Tagatose-bisphosphate aldolase'

In [43]:
model.reactions.TGBPA.annotation = dict(model_e_coli.reactions.TGBPA.annotation) #copy instead of sharing the E. coli dict

In [44]:
model.reactions.SUCD5.annotation['sbo'] = 'SBO:0000176'
model.reactions.SUCD5.annotation['kegg.reaction'] = 'R02164'
model.reactions.SUCD5.annotation['ec-code'] = '1.3.5.1'
model.reactions.SUCD5.annotation['metanetx.reaction'] = 'MNXR107340'

In [45]:
model.reactions.LTHRK.annotation['sbo'] = 'SBO:0000176'
model.reactions.LTHRK.annotation['kegg.reaction'] = 'R06531'
model.reactions.LTHRK.annotation['ec-code'] = '2.7.1.177'
model.reactions.LTHRK.annotation['metanetx.reaction'] = 'MNXR101251'

In [46]:
model.reactions.ATPS4r.annotation['sbo'] = 'SBO:0000185'
model.reactions.ATPS4r.annotation['ec-code'] = '7.1.2.2'
model.reactions.ATPS4r.annotation['metanetx.reaction'] = 'MNXR96136'
model.reactions.ATPS4r.annotation['seed.reaction'] = 'rxn10042'

In [47]:
model.reactions.Kt2.annotation['sbo'] = 'SBO:0000185'
model.reactions.Kt2.annotation['metanetx.reaction'] = 'MNXR100951'
model.reactions.Kt2.annotation['seed.reaction'] = 'rxn05595'

In [48]:
model.reactions.CYTBO3.annotation['sbo'] = 'SBO:0000185'
model.reactions.CYTBO3.annotation['metanetx.reaction'] = 'MNXR97035'
model.reactions.CYTBO3.annotation['seed.reaction'] = 'rxn10113'

In [49]:
#save & commit
cobra.io.write_sbml_model(model,'../model/g-thermo.xml')

### EC info

In [39]:
#check EC info
for rct in model.reactions:
    if rct.id.startswith('EX'):
        continue
    elif rct.id[-1:] in ['t', 's', 'c']: #get rid of transports, we will tackle those later
        continue
    else:
        try:
            ec_anno = rct.annotation['ec-code'] #try to find if there is already info in the annotation
            if len(ec_anno) > 0:
                try: 
                    del rct.notes['ENZYME'] #delete the note if the reaction has it
                except KeyError: 
                    continue #if it doesn't have an EC in the notes but has the annotation, just continue
            else: 
                continue
        except KeyError:
            try:
                ec_note = rct.notes['ENZYME'] #lift EC number from notes
                rct.annotation['ec-code'] = ec_note #move EC number to annotation
                del rct.notes['ENZYME'] #delete the info in notes
            except KeyError:
                print (rct.id)

MALHYDRO
BCFASYN
BCFASYN2
BCFASYN3
UPP1S
AMACT
ACEDIA
GTBIHY
TURAHY
BGAL
Kt2
CYTBO3


Most of these reactions are fine without an EC code because they contain enough other information to be traced back. If not, they are changed below.

In [40]:
del model.reactions.MALHYDRO.annotation['metanetx.reaction']

In [41]:
model.reactions.AMACT.annotation['ec-code'] = '2.3.1.180'

In [42]:
#remove all gene associations and definitions
for rct in model.reactions:
    #pop with a default ignores missing keys, so a missing DEFINITION no longer skips GENE_ASSOCIATION
    rct.notes.pop('DEFINITION', None)
    rct.notes.pop('GENE_ASSOCIATION', None)

In [43]:
#save&commit
cobra.io.write_sbml_model(model,'../model/g-thermo.xml')

## Transport reactions
The transport reactions in our model don't seem to have any annotations or notes. I see that there are MetaNetX IDs for some of these reactions, so these should be added. These reactions don't have E.C. numbers or KEGG IDs, so I will add the MetaNetX IDs and, where possible, the seed.reaction to the annotation field, and make sure their notes are empty.

__Export reactions__
The export reactions all contain the SBO annotation. As these are not biochemical reactions, but were only added from a modelling perspective, I will not add any additional annotations here.


In [46]:
#check which reactions miss the SBO term: should be none, as Ben did this already.
for rct in model.reactions:
    if rct.id[-1:] in ['t', 's', 'c']:
        try:
            sbo = rct.annotation['sbo']
        except KeyError:
            print (rct.id)
    else:
        continue

In [47]:
#check which reactions miss the MetaNetX term: probably almost all of them
for rct in model.reactions:
    if rct.id[-1:] in ['t', 's', 'c']:
        try:
            meta = rct.annotation['metanetx.reaction']
        except KeyError:
            print (rct.id)
    else:
        continue

NH4t
GLC__Dtabc
FEtabc
H2Ot
O2t
CO2t
THMt
COBALT2t
ETOHt
ACt
LAC__Dt
SUCCt
LAC__Lt
biomass
EX_succcoa_c
EX_dlglu_c
CLt
ASN__Lt
PYDX5Pt
QH2t
GTHRDt
BIOMASSt
THMTPt
NADPt
GLC__Dtpts
FRUtpts
ARAB__Ltabc
XYL__Dtabc
GALtabc
MNLtpts
CELLBtpts
SUCRtpts
GLYCt
RIB__Dtabc
MANtpts
SBTtpts
ACGAMtpts
ARBTtpts
SALCNtpts
MALTtabc
MALTtpts
TREtpts
RMNt
MELIBt
GTBIt
TURAt
KDG2t
DGLCN5t
MDGPt
TAGtpts
AMYt
PYRt
FORt
PIt
Kt
SO4t


Reactions with no Metanetx: biomass, QH2t (should be removed later), THMt, GTHRDt (should be removed later), BIOMASSt, THMTPt, NADPt(should be removed later), GTBIt, TURAt, KDG2t, MDGPt, TAGtpts.

In [48]:
#these two exchange reactions should be removed
model.remove_reactions([model.reactions.EX_succcoa_c]) #remove_reactions expects a list

In [49]:
model.remove_reactions([model.reactions.EX_dlglu_c])

In [50]:
model.reactions.NH4t.annotation['metanetx.reaction'] = 'MNXR101950'
model.reactions.NH4t.annotation['seed.reaction'] = 'rxn05466'

In [51]:
model.reactions.PIt.annotation['metanetx.reaction'] = 'MNXR102872'
model.reactions.PIt.annotation['seed.reaction'] = 'rxn05312'

In [52]:
model.reactions.FORt.annotation['metanetx.reaction'] = 'MNXR99621'
model.reactions.FORt.annotation['seed.reaction'] = 'rxn05559'

In [53]:
model.reactions.PYRt.annotation['metanetx.reaction'] = 'MNXR103385'
model.reactions.PYRt.annotation['seed.reaction'] = 'rxn05469'

In [54]:
model.reactions.AMYt.annotation['metanetx.reaction'] = 'MNXR142927'

In [55]:
model.reactions.Kt.annotation['metanetx.reaction'] = 'MNXR100951'
model.reactions.Kt.annotation['seed.reaction'] = 'rxn05595'

In [56]:
model.reactions.SO4t.annotation['metanetx.reaction'] = 'MNXR104466'
model.reactions.SO4t.annotation['seed.reaction'] = 'rxn05651'

In [57]:
model.reactions.GLC__Dtpts.annotation['metanetx.reaction'] = 'MNXR100237'
model.reactions.GLC__Dtpts.annotation['seed.reaction'] = 'rxn05226'

In [58]:
model.reactions.FRUtpts.annotation['metanetx.reaction'] = 'MNXR99662'

In [59]:
model.reactions.ARAB__Ltabc.annotation['metanetx.reaction'] = 'MNXR126447'
model.reactions.ARAB__Ltabc.annotation['seed.reaction'] = 'rxn05173'

In [60]:
model.reactions.XYL__Dtabc.annotation['metanetx.reaction'] = 'MNXR105268'
model.reactions.XYL__Dtabc.annotation['seed.reaction'] = 'rxn05167'

In [61]:
model.reactions.GALtabc.annotation['metanetx.reaction'] = 'MNXR100023'
model.reactions.GALtabc.annotation['seed.reaction'] = 'rxn05162'

In [62]:
model.reactions.SUCCt.annotation['metanetx.reaction'] = 'MNXR104619'
model.reactions.SUCCt.annotation['seed.reaction'] = 'rxn10952'

In [63]:
model.reactions.LAC__Lt.annotation['metanetx.reaction'] = 'MNXR100999'
model.reactions.LAC__Lt.annotation['seed.reaction'] = 'rxn11016'

In [64]:
model.reactions.ASN__Lt.annotation['metanetx.reaction'] = 'MNXR96066'
model.reactions.ASN__Lt.annotation['seed.reaction'] = ['rxn05508','rxn11321']

In [65]:
model.metabolites.pydx5p_e.name = 'Pyridoxal phosphate'

In [66]:
model.reactions.PYDX5Pt.annotation['metanetx.reaction'] = 'MNXR103359'

In [67]:
model.metabolites.qh2_e.name = 'Ubiquinol'

In [68]:
model.reactions.GLC__Dtabc.annotation['metanetx.reaction'] = 'MNXR100236'
model.reactions.GLC__Dtabc.annotation['seed.reaction'] = 'rxn05147'

In [69]:
model.reactions.FEtabc.annotation['metanetx.reaction'] = 'MNXR99504'

In [70]:
model.reactions.H2Ot.annotation['metanetx.reaction'] = 'MNXR98641'

In [71]:
model.reactions.O2t.annotation['metanetx.reaction'] = 'MNXR102090'

In [72]:
model.reactions.CO2t.annotation['metanetx.reaction'] = 'MNXR97980'

In [73]:
model.reactions.COBALT2t.id = 'COBALT2tabc'

In [74]:
model.reactions.ETOHt.annotation['metanetx.reaction'] = 'MNXR96810'

In [75]:
model.metabolites.ac_e.name = 'Acetate'

In [76]:
model.reactions.ACt.annotation['metanetx.reaction'] ='MNXR95431'
model.reactions.ACt.annotation['seed.reaction'] = 'rxn10904'

In [77]:
model.reactions.COBALT2tabc.annotation['metanetx.reaction'] = 'MNXR96819'
model.reactions.COBALT2tabc.annotation['seed.reaction'] = 'rxn08239'

In [78]:
model.reactions.LAC__Dt.annotation['metanetx.reaction'] = 'MNXR101277'
model.reactions.LAC__Dt.annotation['seed.reaction'] = 'rxn05602'

In [79]:
model.metabolites.cl_e.name = 'Chloride'

In [80]:
model.reactions.CLt.annotation['metanetx.reaction'] = 'MNXR96797'
model.reactions.CLt.annotation['seed.reaction'] = 'rxn10473'

In [81]:
model.reactions.MNLtpts.annotation['metanetx.reaction'] = 'MNXR101677'
model.reactions.MNLtpts.annotation['seed.reaction'] = 'rxn05617'

In [82]:
model.reactions.CELLBtpts.annotation['metanetx.reaction'] = 'MNXR96586'

In [83]:
model.reactions.SUCRtpts.annotation['metanetx.reaction'] = 'MNXR104643'
model.reactions.SUCRtpts.annotation['seed.reaction'] = 'rxn05655'

In [84]:
model.reactions.GLYCt.annotation['metanetx.reaction'] = 'MNXR100343'
model.reactions.GLYCt.annotation['seed.reaction'] = 'rxn05581'

In [85]:
model.reactions.RIB__Dtabc.annotation['metanetx.reaction'] = 'MNXR104034'
model.reactions.RIB__Dtabc.annotation['seed.reaction'] = 'rxn05160'

In [86]:
model.reactions.MANtpts.annotation['metanetx.reaction'] = 'MNXR101401'
model.reactions.MANtpts.annotation['seed.reaction'] = 'rxn05610'

In [87]:
model.reactions.SBTtpts.annotation['metanetx.reaction'] = 'MNXR104290'
model.reactions.SBTtpts.annotation['seed.reaction'] = 'rxn10184'

In [88]:
model.reactions.ACGAMtpts.annotation['metanetx.reaction'] = 'MNXR95253'
model.reactions.ACGAMtpts.annotation['seed.reaction'] = 'rxn05485'

In [89]:
model.reactions.ARBTtpts.annotation['metanetx.reaction'] = 'MNXR95932'
model.reactions.ARBTtpts.annotation['seed.reaction'] = 'rxn05501'

In [90]:
model.reactions.SALCNtpts.annotation['metanetx.reaction'] = 'MNXR104266'
model.reactions.SALCNtpts.annotation['seed.reaction'] = 'rxn05647'

In [91]:
model.reactions.MALTtabc.annotation['metanetx.reaction'] = 'MNXR101362'
model.reactions.MALTtabc.annotation['seed.reaction'] = 'rxn05170'

In [92]:
model.reactions.MALTtpts.annotation['metanetx.reaction'] = 'MNXR101363'
model.reactions.MALTtpts.annotation['seed.reaction'] = 'rxn05607'

In [93]:
model.reactions.TREtpts.annotation['metanetx.reaction'] = 'MNXR104931'
model.reactions.TREtpts.annotation['seed.reaction'] = 'rxn02005'

In [94]:
model.metabolites.tre6p_c.name='Trehalose 6-phosphate'

In [95]:
model.reactions.RMNt.annotation['metanetx.reaction'] = 'MNXR104041'
model.reactions.RMNt.annotation['seed.reaction'] = 'rxn05646'

In [96]:
model.reactions.MELIBt.annotation['metanetx.reaction'] = 'MNXR138587'

In [97]:
model.metabolites.tura_c.annotation['metanetx.chemical'] = 'MNXM161984'
model.metabolites.tura_e.annotation['metanetx.chemical'] = 'MNXM161984'

In [98]:
model.reactions.TURAt.name = 'Turanose transport'

In [99]:
model.metabolites.kdg2_c.annotation['metanetx.chemical'] = 'MNXM480329'
model.metabolites.kdg2_e.annotation['metanetx.chemical'] = 'MNXM480329'

In [100]:
model.metabolites.dglcn5_e.annotation['metanetx.chemical'] ='MNXM963'
model.metabolites.dglcn5_c.annotation['metanetx.chemical'] ='MNXM963'

In [101]:
model.reactions.DGLCN5t.annotation['metanetx.reaction'] = 'MNXR95067'

In [102]:
model.metabolites.mdgp_e.annotation['metanetx.chemical'] = 'MNXM61754'
model.metabolites.mdgp_c.annotation['metanetx.chemical'] = 'MNXM61754'

In [103]:
model.metabolites.tag__D_e.annotation['metanetx.chemical'] = 'MNXM83257'

In [104]:
#save&commit
cobra.io.write_sbml_model(model,'../model/g-thermo.xml')

Now I will check that none of the reactions have any notes field left except 'NAME' or 'NOTES'. I will also check that all reactions have some annotation (at least SBO).

In [123]:
#remove other fields in notes that should be gone
for rct in model.reactions:
    try: 
        del rct.notes['GENE_ASSOCIATION']
    except KeyError:
        continue

In [124]:
for rct in model.reactions:
    try: 
        del rct.notes['ORTHOLOGY']
    except KeyError:
        continue

In [125]:
for rct in model.reactions:
    try: 
        del rct.notes['PATHWAY']
    except KeyError:
        continue

In [126]:
for rct in model.reactions:
    try: 
        del rct.notes['EQUATION']
    except KeyError:
        continue

In [127]:
for rct in model.reactions:
    try: 
        del rct.notes['RPAIR']
    except KeyError:
        continue

In [128]:
for rct in model.reactions:
    try: 
        del rct.notes['original_bigg_ids']
    except KeyError:
        continue

In [129]:
for rct in model.reactions:
    try: 
        del rct.notes['KEGG ID']
    except KeyError:
        continue

In [130]:
for rct in model.reactions:
    try: 
        del rct.notes['ENZYME']
    except KeyError:
        continue

In [131]:
for rct in model.reactions:
    try: 
        del rct.notes['ENTRY']
    except KeyError:
        continue

In [132]:
for rct in model.reactions:
    try: 
        del rct.notes['EQUATION_USED_HEREIN']
    except KeyError:
        continue

In [133]:
for rct in model.reactions:
    try: 
        del rct.notes['KEGG_COMMENT']
    except KeyError:
        continue

In [134]:
for rct in model.reactions:
    try: 
        del rct.notes['MODULE']
    except KeyError:
        continue

In [135]:
#convert field comments to field 'NOTES'
for rct in model.reactions:
    try:
        comment = rct.notes['COMMENTS']
        rct.notes['NOTES'] = comment
        del rct.notes['COMMENTS']
    except KeyError:
        try:
            comment = rct.notes['COMMENT']
            rct.notes['NOTES'] = comment
            del rct.notes['COMMENT']
        except KeyError:
            continue
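Renaming a notes key, as done above for the comment fields, comes down to popping the old key and storing its value under the new one. A small sketch with a hypothetical helper name (`rename_note`) operating on plain dicts; like the cell above, it only moves the first of the old keys that it finds.

```python
def rename_note(notes, old_keys, new_key):
    """Move the value of the first key in `old_keys` found in `notes` to `new_key`.

    Returns True if a key was renamed, False if none of the old keys was present.
    """
    for old_key in old_keys:
        if old_key in notes:
            notes[new_key] = notes.pop(old_key)
            return True
    return False


# Plain dicts standing in for reaction notes; values are placeholders.
with_comments = {"NAME": "placeholder", "COMMENTS": "text A"}
with_comment = {"COMMENT": "text B"}
without = {"NAME": "placeholder"}

results = [rename_note(n, ("COMMENTS", "COMMENT"), "NOTES")
           for n in (with_comments, with_comment, without)]
print(results, with_comments, with_comment)
```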

In [136]:
for rct in model.reactions:
    if len (rct.notes) == 0: #rcts with no notes
        continue
    elif len(rct.notes) == 1: #rcts with just one note
        try:
            name = rct.notes['NAME']
        except KeyError:
            try:
                rct.notes['NOTES']
            except KeyError:
                print (rct.id)
    else:
        print (rct.id)

IG3PS
PGL
HOXPRm


In [139]:
del model.reactions.IG3PS.notes['ECNumbers']

In [143]:
del model.reactions.PGL.notes['KEGG']

In [146]:
#save&commit
cobra.io.write_sbml_model(model,'../model/g-thermo.xml')

In [147]:
#which reactions have no annotation
for rct in model.reactions:
    if len(rct.annotation) == 0:
        print(rct.id)
    else:
        continue

This list should be empty, as all reactions should have an SBO term.

So below I check which reactions have only the SBO annotation, excluding the exchange reactions.

In [148]:
for rct in model.reactions:
    if rct.id.startswith('EX'):
        continue
    else:
        if len(rct.annotation) <= 1:
            print(rct.id)
        else:
            continue

THMt
biomass
QH2t
GTHRDt
BIOMASSt
THMTPt
NADPt
GTBIt
GTBIHY
TURAt
TURAHY
KDG2t
MDGPt
BGAL
TAGtpts


This list matches what we've already checked above. So I can leave it like it is. 
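The check above counts annotation entries, so a reaction with a single annotation that is not SBO would be listed as well. If the distinction matters, the test can look at the keys instead. A minimal sketch on plain dicts; `only_sbo` is a hypothetical helper name and the annotation values are taken from earlier cells or are placeholders.

```python
def only_sbo(annotation):
    """True if the annotation is empty or holds nothing besides an 'sbo' entry."""
    return set(annotation) <= {"sbo"}


examples = {
    "sbo_only": {"sbo": "SBO:0000185"},
    "empty": {},
    "metanetx_only": {"metanetx.reaction": "MNXR_PLACEHOLDER"},
    "sbo_and_metanetx": {"sbo": "SBO:0000185", "metanetx.reaction": "MNXR_PLACEHOLDER"},
}
flags = {name: only_sbo(annotation) for name, annotation in examples.items()}
print(flags)
```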

# Missing names
Some reactions and metabolites don't have names; this was added automatically in commit 4fc25cc, and so it will be fixed here.

In [149]:
model.metabolites.lald__L_c.name = 'L-Lactaldehyde' 

In [150]:
model.metabolites.gtspmd_c.name = 'Glutathionylspermidine'

In [151]:
model.metabolites.cdpdag_c.name = 'CDP-1,2-diacylglycerol'

In [152]:
model.metabolites.actn__R_c.name = '(R)-Acetoin'

In [153]:
model.reactions.HOXPRm.name = '(R)-Glycerate:NAD+ oxidoreductase'

In [154]:
model.reactions.MOX.name = '(S)-Malate:oxygen oxidoreductase'

In [155]:
model.reactions.TRPS3.name = 'Tryptophan synthase (indoleglycerol phosphate)'

In [156]:
model.reactions.SUCOAACTr.name = 'succinyl-CoA:acetate CoA-transferase'

In [157]:
model.reactions.ACTD.name = '(R)-Acetoin:NAD+ oxidoreductase'

In [158]:
model.reactions.ACOAD1.name = 'butanoyl-CoA:NAD+ trans-2-oxidoreductase'

In [159]:
model.reactions.ECOAH1.name = '(S)-3-hydroxybutanoyl-CoA hydro-lyase'

In [160]:
model.reactions.MANtpts.name = 'Mannose transport via PTS'

In [161]:
model.reactions.SALCNtpts.name = 'Salicin transport via PTS'

In [162]:
model.reactions.TGBPA.name = 'Tagatose-bisphosphate aldolase'

In [163]:
#save&commit
cobra.io.write_sbml_model(model,'../model/g-thermo.xml')

__E. coli genes__
In a previous commit, Ben saw that three E. coli genes were accidentally present in the model. (See Issue #55: fix: 3 genes from E. coli)

Here I check that they are not associated to any reaction. Also the xml file seems to show they are not associated to any reaction. So they can be removed. 

In [164]:
e_coli_genes = ['b3903', 'b3904', 'b3902']

In [165]:
for rct in model.reactions:
    if any(gene.id in e_coli_genes for gene in rct.genes): #also catches genes inside compound rules
        print (rct.id)
    else:
        continue
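A gene reaction rule is a boolean expression, so comparing the whole rule string against a list of gene IDs only matches reactions whose rule is exactly one of those genes. The sketch below shows the difference on a plain string; `genes_in_rule` is a hypothetical helper and `GENE_X` is a made-up gene ID. With cobrapy itself, `reaction.genes` already provides the gene objects of a reaction, so this parsing is only for illustration.

```python
import re

e_coli_genes = ["b3903", "b3904", "b3902"]


def genes_in_rule(rule):
    """Return the gene IDs in a rule string, ignoring and/or and parentheses."""
    tokens = re.findall(r"[^\s()]+", rule)
    return {token for token in tokens if token.lower() not in ("and", "or")}


rule = "b3903 and (GENE_X or b3904)"  # hypothetical compound rule

whole_string_match = rule in e_coli_genes            # misses the compound rule
gene_level_match = genes_in_rule(rule) & set(e_coli_genes)
print(whole_string_match, sorted(gene_level_match))
```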

In [166]:
genes = [model.genes.get_by_id('b3903'), model.genes.get_by_id('b3904'), model.genes.get_by_id('b3902')]

In [167]:
genes

[<Gene b3903 at 0x22257ca0b08>,
 <Gene b3904 at 0x22257ca06c8>,
 <Gene b3902 at 0x22257ca0348>]

In [168]:
cobra.manipulation.delete.remove_genes(model, genes, remove_reactions=True)

In [169]:
#save&commit
cobra.io.write_sbml_model(model,'../model/g-thermo.xml')