# Remove duplicate reactions
Author: Famke Baeuerle

When models are drafted automatically from the BiGG database with CarveMe, some reactions are added twice under different identifiers. This notebook records which reactions were removed.

Note: This notebook will not work on the current model files because they no longer contain these duplicates.

In [None]:
from refinegems.io import load_model_libsbml, load_model_cobra
from libsbml import writeSBMLToFile
from cobra.io import write_sbml_model

## Remove duplicates found with MEMOTE

The lists below hold the reactions to remove. For each of them, a reaction with the same reactants, products and direction remains in the models.
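The criterion above (same reactants, products and direction) can be sketched as a grouping by reaction signature. This is a minimal illustration, not the check MEMOTE itself performs. The function name `find_duplicate_reactions`, the input layout and the toy stoichiometries are assumptions made for this example.

```python
from collections import defaultdict


def find_duplicate_reactions(reactions):
    """Group reaction IDs that share stoichiometry and reversibility.

    `reactions` (hypothetical layout) maps a reaction ID to a tuple
    (stoichiometry, reversible), where stoichiometry maps metabolite IDs
    to signed coefficients: negative for reactants, positive for products.
    """
    groups = defaultdict(list)
    for rid, (stoich, reversible) in reactions.items():
        # Two reactions are duplicates if this signature is identical
        signature = (frozenset(stoich.items()), reversible)
        groups[signature].append(rid)
    return [sorted(ids) for ids in groups.values() if len(ids) > 1]


# Toy data for illustration only, not taken from the models
toy = {
    'R_A': ({'M_a_c': -1, 'M_b_c': 1}, False),
    'R_A_1': ({'M_a_c': -1, 'M_b_c': 1}, False),
    'R_B': ({'M_b_c': -1, 'M_c_c': 1}, True),
}
duplicates = find_duplicate_reactions(toy)
print(duplicates)
```

Each returned group still needs a manual decision about which identifier to keep, which is what the lists in the next cell record.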

In [None]:
sids_all = ['R_GLCNt2ir', 'R_ARGDI', 'R_ECOAH5_1', 'R_HACD1_1']
sids_17 = ['R_ECOAH5_1', 'R_ORNTAC_1', 'R_PPRGL']
sids_14 = ['R_ECOAH5_1', 'R_GLYCS_I', 'R_METSOX1abc']
sids_15 = ['R_GLYCS_I', 'R_ABTt_1']
sids_16 = ['R_OCBT_1']  # libSBML reaction IDs carry the 'R_' prefix, as in the lists above

In [None]:
def remove_duplicates(reac, sids):
    """Remove every reaction in `sids` that is present in the libSBML ListOfReactions `reac`."""
    for sid in sids:
        reaction = reac.getElementBySId(sid)
        if reaction is not None:
            print(str(reaction) + ' found.')
            reac.remove(sid)
            print(sid + ' removed.')

In [None]:
mod_14 = load_model_libsbml('../../models/Cstr_TS.xml')
reac_14 = mod_14.getListOfReactions()
mod_15 = load_model_libsbml('../../models/Cstr_1197.xml')
reac_15 = mod_15.getListOfReactions()
mod_16 = load_model_libsbml('../../models/Cstr_1115.xml')
reac_16 = mod_16.getListOfReactions()
mod_17 = load_model_libsbml('../../models/Cstr_1116.xml')
reac_17 = mod_17.getListOfReactions()

In [None]:
remove_duplicates(reac_14, sids_all)
remove_duplicates(reac_14, sids_14)
remove_duplicates(reac_15, sids_all)
remove_duplicates(reac_15, sids_15)
remove_duplicates(reac_16, sids_all)
remove_duplicates(reac_16, sids_16)
remove_duplicates(reac_17, sids_all)
remove_duplicates(reac_17, sids_17)

In [None]:
writeSBMLToFile(mod_14.getSBMLDocument(),'../../models/Cstr_TS.xml')
writeSBMLToFile(mod_15.getSBMLDocument(),'../../models/Cstr_1197.xml')
writeSBMLToFile(mod_16.getSBMLDocument(),'../../models/Cstr_1115.xml')
writeSBMLToFile(mod_17.getSBMLDocument(),'../../models/Cstr_1116.xml')

## Remove duplicates found during manual curation

Some reactions occurred twice in the models, the duplicate being the same reaction but with a trailing `_1` in the ID.
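A quick way to list candidates of this kind is to look for IDs ending in `_1` whose base ID is also present. The helper `find_suffixed_duplicates` and the toy ID list are hypothetical and only meant as a sketch. A hit is a candidate, not proof of a duplicate, so the two reactions should still be compared before one is removed.

```python
def find_suffixed_duplicates(reaction_ids, suffix='_1'):
    """Return IDs that end in `suffix` and whose un-suffixed ID also exists."""
    ids = set(reaction_ids)
    return sorted(
        rid for rid in ids
        if rid.endswith(suffix) and rid[:-len(suffix)] in ids
    )


# Toy ID list for illustration only
toy_ids = ['PNCDC', 'PNCDC_1', 'NP1', 'HACD1_1']
candidates = find_suffixed_duplicates(toy_ids)
print(candidates)
```

In the toy list, `HACD1_1` is not reported because no `HACD1` is present, so a suffixed ID on its own does not count as a candidate.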

In [None]:
modelpaths_to_change = ['../../models/Cstr_TS.xml', '../../models/Cstr_1197.xml', '../../models/Cstr_1115.xml', '../../models/Cstr_1116.xml']
duplicate_ids = ['PNCDC_1', 'NP1_1', 'NADDPp_1']
for mod in modelpaths_to_change:
    model = load_model_cobra(mod)
    for rid in duplicate_ids:
        # Handle each ID separately so one missing reaction does not skip the rest
        try:
            model.reactions.get_by_id(rid).remove_from_model()
        except KeyError:
            print(rid + ' not in model')
    write_sbml_model(model, mod)

## Remove duplicates found with metabolic maps

After drawing Escher maps (see `./escher/`), overlapping reactions became visible. These are removed here.

In [None]:
modelpaths_to_change = ['../../models/Cstr_TS.xml', '../../models/Cstr_1197.xml', '../../models/Cstr_1115.xml', '../../models/Cstr_1116.xml']
for mod in modelpaths_to_change:
    model = load_model_cobra(mod)
    # Show the counterpart that stays in the model (raises KeyError if it is missing)
    print(model.reactions.get_by_id('GNKr'))
    try:
        model.reactions.get_by_id('GNK').remove_from_model()
    except KeyError:
        print('GNK not in model')
    write_sbml_model(model, mod)