# Remove duplicate metabolites
Author: Famke Baeuerle

When models are drafted automatically from the BiGG database with CarveMe, some metabolites are added twice under different identifiers. This notebook documents which metabolites were removed and, where needed, which identifiers replaced them in the affected reactions. Replacements were made by search-and-replace in the respective model XML files.

Note: This notebook will not run on the current model files because they no longer contain these duplicates.

In [None]:
import refinegems as rg
from cobra.io import write_sbml_model

We use the type strain model as the starting point:

In [None]:
model = rg.io.load_model_cobra('../../models/iCstr1054FB23.xml')

## Example: replace `gly_cys__L` by `cgly`

`cgly` was the better-annotated of the two metabolites, so it was kept and `gly_cys__L` was removed.

In [None]:
cgly = list(model.metabolites.get_by_id('cgly_c').reactions)
glycys = list(model.metabolites.get_by_id('gly_cys__L_c').reactions)

In [None]:
for rea in cgly:
    print(rea)
print('---')
for rea in glycys:
    print(rea)

In [None]:
cgly = list(model.metabolites.get_by_id('cgly_e').reactions)
glycys = list(model.metabolites.get_by_id('gly_cys__L_e').reactions)

In [None]:
for rea in cgly:
    print(rea)
print('---')
for rea in glycys:
    print(rea)
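The two lookups above differ only in the compartment suffix, so they can be folded into one helper. The following is a minimal sketch, not part of the original workflow: the function name `reactions_by_metabolite` and its default compartment list are assumptions made for illustration. It relies only on `model.metabolites.get_by_id(...)` and the `reactions` attribute already used above.

```python
def reactions_by_metabolite(model, base_ids, compartments=("c", "e")):
    """Map each '<base>_<compartment>' metabolite id to the sorted ids of its reactions.

    Hypothetical helper: `model` is expected to behave like a COBRApy model,
    i.e. offer `model.metabolites.get_by_id(...)` returning an object with a
    `reactions` collection whose items carry an `id`.
    """
    overview = {}
    for base in base_ids:
        for comp in compartments:
            met_id = f"{base}_{comp}"
            # Raises KeyError if the metabolite is missing, as in the cells above
            metabolite = model.metabolites.get_by_id(met_id)
            overview[met_id] = sorted(rea.id for rea in metabolite.reactions)
    return overview
```

With the loaded model, `reactions_by_metabolite(model, ['cgly', 'gly_cys__L'])` would return one entry per metabolite and compartment, which makes it easier to compare the kept and the duplicate metabolite side by side.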

Now we remove `gly_cys__L_c` and `gly_cys__L_e` from all models. We also remove the corresponding exchange reaction `EX_gly_cys__L_e` and two reactions that held the same metabolites, `GLYCYSAP` and `GLYCYSabc`.

In [None]:
# Apply the same removals to every strain model and overwrite each file in place
modelpaths_to_change = [
    '../models/iCstr1054FB23.xml',
    '../models/iCstr1197FB23.xml',
    '../models/iCstr1115FB23.xml',
    '../models/iCstr1116FB23.xml',
]
for mod in modelpaths_to_change:
    model = rg.io.load_model_cobra(mod)
    # Confirm that the metabolites we keep are present in this model
    print(model.metabolites.get_by_id('cgly_c'))
    print(model.metabolites.get_by_id('cgly_e'))
    # Remove the duplicate metabolites
    model.metabolites.get_by_id('gly_cys__L_c').remove_from_model()
    model.metabolites.get_by_id('gly_cys__L_e').remove_from_model()
    # Remove the reactions that belonged to the duplicates
    model.reactions.get_by_id('GLYCYSAP').remove_from_model()
    model.reactions.get_by_id('GLYCYSabc').remove_from_model()
    model.reactions.get_by_id('EX_gly_cys__L_e').remove_from_model()
    write_sbml_model(model, mod)
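One way to check which reactions of a duplicate metabolite become redundant is to rename the duplicate in every stoichiometry and look for reactions that then match an existing one. The sketch below is an illustration under assumptions, not the procedure used for these models: the function `find_redundant_reactions`, the reaction ids prefixed with `TOY_`, and all stoichiometries shown are made up for the example. With COBRApy, the input could be built as `{r.id: {m.id: c for m, c in r.metabolites.items()} for r in model.reactions}`.

```python
def find_redundant_reactions(stoichiometries, id_map):
    """Return (redundant_id, kept_id) pairs.

    stoichiometries: dict of reaction id -> {metabolite id: coefficient}
    id_map: dict of duplicate metabolite id -> metabolite id that is kept
    """
    def canonical(stoich):
        # Stoichiometry after renaming duplicates, as an order-independent key
        return frozenset((id_map.get(met, met), coeff) for met, coeff in stoich.items())

    def uses_duplicate(stoich):
        return any(met in id_map for met in stoich)

    # Index the reactions that do not involve any duplicate metabolite
    kept = {}
    for rea_id, stoich in stoichiometries.items():
        if not uses_duplicate(stoich):
            kept.setdefault(canonical(stoich), rea_id)

    # A reaction on a duplicate is redundant if renaming makes it match a kept one
    redundant = []
    for rea_id, stoich in stoichiometries.items():
        if uses_duplicate(stoich) and canonical(stoich) in kept:
            redundant.append((rea_id, kept[canonical(stoich)]))
    return redundant


# Toy data: the stoichiometries are illustrative, not read from the models
toy = {
    "GLYCYSAP": {"gly_cys__L_c": -1, "h2o_c": -1, "cys__L_c": 1, "gly_c": 1},
    "TOY_CGLY_HYDROLYSIS": {"cgly_c": -1, "h2o_c": -1, "cys__L_c": 1, "gly_c": 1},
    "TOY_UNRELATED": {"h2o_e": -1, "h2o_c": 1},
}
id_map = {"gly_cys__L_c": "cgly_c", "gly_cys__L_e": "cgly_e"}
print(find_redundant_reactions(toy, id_map))
```

Reactions reported this way are candidates for removal. Reactions on the duplicate that have no counterpart would instead need their metabolite renamed.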

## Changes to metabolite identifiers

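The following metabolite identifiers were changed to the better-annotated identifier. Changes were made by search-and-replace in the XML files, and duplicates were removed as shown above. The "removed duplicates" column lists what was removed for each entry: the duplicate metabolite in the given compartments (`_c`, `_e`) and, where marked `EX`, its exchange reaction.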

| **old id**  | **new id**  | **removed duplicates** | **strain** |
|------------------|------------------|-----------------------------|-----------------|
| trans\_dd2coa    | dd2coa           |                             | 14, 17          |
| ala\_gln         | ala\_L\_gln\_\_L |                             | all             |
| gly\_gln         | gly\_gln\_\_L    |                             | all             |
| alathr           | ala\_L\_Thr\_\_L |                             | all             |
| alaleu, ala\_leu | ala\_L\_leu\_\_L | \_c, \_e, EX                | all             |
| ala\_his         | ala\_L\_his\_\_L |                             | all             |
| cresol           | 4crsol           |                             | 15, 16, 17      |
| lgt\_s           | lgt\_\_S         |                             | all             |
| gly\_phe         | gly\_phe\_\_L    |                             | all             |
| gly\_tyr         | gly\_tyr\_\_L    |                             | all             |
| gly\_met         | gly\_met\_\_L    |                             | all             |
| gly\_leu         | gly\_leu\_\_L    |                             | all             |
| gly\_cys         | gly\_cys\_\_L    |                             | all             |
| tmam             | tma              |                             | 14              |
| orn              | orn\_\_L         | \_c                         | all             |
| 3hbycoa          | 3hbcoa           | \_c                         | all             |
| glcn             | glcn\_\_D        | \_c, \_e, EX                | all             |
| glyphe           | gly\_phe\_\_L    | \_c, \_e, EX                | all             |
| abt              | abt\_\_L         | \_c, \_e, EX                | 15              |
| metox            | metsox\_S\_\_L   | \_c, \_e, EX                | 14              |
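The search-and-replace used for the table above could be scripted along the following lines. This is a minimal sketch under assumptions, not the procedure that was actually applied: the function name `rename_metabolite` is hypothetical, species ids are assumed to be written as `M_<id>_<compartment>` in the SBML files, and the compartments are assumed to be `c`, `e` and `p`. The pattern requires the compartment suffix to follow the old id directly, so an identifier that already carries the new name (for example `gly_cys__L` when renaming `gly_cys`) is left untouched.

```python
import re


def rename_metabolite(sbml_text, old_id, new_id, compartments=("c", "e", "p")):
    """Rename species ids 'M_<old_id>_<compartment>' in SBML text.

    Assumption: ids follow the 'M_<id>_<compartment>' convention. Reaction ids
    (e.g. exchange reactions) and annotation entries are not touched.
    """
    comp_pattern = "|".join(re.escape(comp) for comp in compartments)
    pattern = re.compile(
        rf"(?<![A-Za-z0-9])M_{re.escape(old_id)}_({comp_pattern})(?![A-Za-z0-9_])"
    )
    # A function as replacement avoids backslash handling in the new id
    return pattern.sub(lambda match: f"M_{new_id}_{match.group(1)}", sbml_text)


snippet = '<species id="M_orn_c"/><speciesReference species="M_orn_c"/>'
print(rename_metabolite(snippet, "orn", "orn__L"))
```

To apply this to a file, the text could be read with `pathlib.Path(path).read_text()`, passed through the function once per row of the table, and written back with `write_text()`. Loading the result with `rg.io.load_model_cobra` afterwards is a reasonable check that the file is still valid.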