# Fix gene ids in iYali map

This notebook fixes the gene ids of the iYali model and map available at https://caffeine.dd-decaf.eu/

Benjamín J. Sánchez, 2020-03-23

## 1. Getting model and map

* Model: https://api.dd-decaf.eu/model-storage/models/44
* Map: https://api.dd-decaf.eu/map-storage/maps/37

In [1]:
import wget
import os

def deleteIfExists(filename):
    if os.path.exists(filename):
        os.remove(filename)

deleteIfExists("iYali-model.json")
deleteIfExists("iYali-map.json")
wget.download("https://api.dd-decaf.eu/model-storage/models/44", "iYali-model.json", bar=False)
wget.download("https://api.dd-decaf.eu/map-storage/maps/37", "iYali-map.json", bar=False)

'iYali-map.json'
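As a side note, the "delete if present" step can also be written with `pathlib`, which has offered a `missing_ok` flag on `Path.unlink` since Python 3.8. The sketch below is only an alternative spelling of `deleteIfExists`; the name `delete_if_exists` and the temporary file are assumptions for illustration, and no download is performed.

```python
import tempfile
from pathlib import Path

def delete_if_exists(filename):
    """Remove a file if present; do nothing otherwise (needs Python 3.8+)."""
    Path(filename).unlink(missing_ok=True)

# Demonstration in a temporary directory, so no real file is touched.
with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "example.json"
    target.write_text("{}")
    delete_if_exists(target)
    delete_if_exists(target)  # second call is a no-op
    still_there = target.exists()

print(still_there)
```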

Let's check if they are the same in staging:

* Model: https://api-staging.dd-decaf.eu/model-storage/models/44
* Map: https://api-staging.dd-decaf.eu/map-storage/maps/90

In [2]:
deleteIfExists("iYali-model.json")
deleteIfExists("iYali-map.json")
wget.download("https://api-staging.dd-decaf.eu/model-storage/models/44", "iYali-model.json", bar=False)
wget.download("https://api-staging.dd-decaf.eu/map-storage/maps/90", "iYali-map.json", bar=False)  # note the id is different

'iYali-map.json'

So the model is the same (no changes in git), but the map is different. To see exactly what differs, we first need to store the JSON files in a readable format (currently each one is stored as a single line):

In [3]:
# Go back to the main version:
deleteIfExists("iYali-model.json")
deleteIfExists("iYali-map.json")
wget.download("https://api.dd-decaf.eu/model-storage/models/44", "iYali-model.json", bar=False)
wget.download("https://api.dd-decaf.eu/map-storage/maps/37", "iYali-map.json", bar=False)

# Load object:
import json
def openJSON(filename):
    # The "with" block closes the file automatically:
    with open(filename, 'r') as handle:
        parsed = json.load(handle)
    return parsed

# Convert to readable format:
def storePrettyJSON(parsed, filename):
    # Generate pretty text from the parsed object:
    pretty_text = json.dumps(parsed, indent=4, sort_keys=True)
    # Store pretty text as file:
    with open(filename, "w") as handle:
        handle.write(pretty_text)

# Function for doing both:
def makeJSONpretty(filename):
    parsed = openJSON(filename)
    storePrettyJSON(parsed, filename)

makeJSONpretty("iYali-model.json")
makeJSONpretty("iYali-map.json")
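To illustrate why this step helps, here is a minimal sketch on a toy object (the `compact` string is made up for illustration and is not taken from the downloaded files). With `indent=4` each key lands on its own line, and `sort_keys=True` fixes the key order, so a line-based diff such as the one git shows only flags the keys that actually changed.

```python
import json

# Toy one-line JSON, standing in for a downloaded file (hypothetical content).
compact = '{"map": [{"map_name": "toy"}], "id": 37}'

parsed = json.loads(compact)
pretty = json.dumps(parsed, indent=4, sort_keys=True)
print(pretty)

# One line before, several lines after; the content itself is unchanged.
pretty_lines = pretty.splitlines()
```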

Let's repeat the analysis to see what has changed in staging:

In [4]:
deleteIfExists("iYali-model.json")
deleteIfExists("iYali-map.json")
wget.download("https://api-staging.dd-decaf.eu/model-storage/models/44", "iYali-model.json", bar=False)
wget.download("https://api-staging.dd-decaf.eu/map-storage/maps/90", "iYali-map.json", bar=False)
makeJSONpretty("iYali-model.json")
makeJSONpretty("iYali-map.json")
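The comparison itself is done here by looking at the git diff of the pretty files. As an alternative, a rough sketch of the same check in Python is to load both versions and list the top-level keys whose values differ. The helper name `differing_keys` and the toy dictionaries (including their key names) are assumptions for illustration; the real objects would be loaded with `openJSON`.

```python
def differing_keys(a, b):
    """Return the sorted top-level keys whose values differ between two dicts."""
    keys = set(a) | set(b)
    missing = object()  # sentinel, so a key absent on one side counts as different
    return sorted(k for k in keys if a.get(k, missing) != b.get(k, missing))

# Toy stand-ins for the main and staging map objects (hypothetical values).
main_map = {"id": 37, "name": "toy", "map": [{"map_name": "toy"}, {"reactions": {}}]}
staging_map = {"id": 90, "name": "toy", "map": [{"map_name": "toy"}, {"reactions": {}}]}

changed = differing_keys(main_map, staging_map)
print(changed)
```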

So the only thing that changed was the map id (see the git diff); the map itself is the same. Great! We can proceed by fixing only one model and one map. Let's extract the relevant parts from the caffeine-specific objects and save them again as pretty JSON:

In [5]:
# Load models
iYali_model = openJSON("iYali-model.json")
iYali_map = openJSON("iYali-map.json")

# Get relevant parts:
iYali_model = iYali_model["model_serialized"]
iYali_map = iYali_map["map"]

# Store the new elements:
storePrettyJSON(iYali_model, "iYali-model.json")
storePrettyJSON(iYali_map, "iYali-map.json")

In the case of the model, let's also load it and save it again as a model using cobrapy:

In [6]:
import cobra
iYali_model = cobra.io.load_json_model("iYali-model.json")
iYali_model

Using license file C:\Users\bejsab\gurobi.lic
Academic license - for non-commercial use only


0,1
Name,iYali
Memory address,0x02577a6408c8
Number of metabolites,1671
Number of reactions,1924
Number of groups,0
Objective expression,1.0*xBIOMASS - 1.0*xBIOMASS_reverse_52bad
Compartments,"cell envelope, extracellular, mitochondrion, cytoplasm, peroxisome, endoplasmic reticulum, nucleus, Golgi, lipid particle, vacuole, endoplasmic reticulum membrane, vacuolar membrane, mitochondrial membrane, Golgi membrane"


In [7]:
cobra.io.save_json_model(iYali_model, "iYali-model.json")

## 2. Fix model

First we will fix the model. The only change needed is to swap `id` and `name` for each gene; cobrapy takes care of the rest:

In [8]:
gene_dict = {}
for gene in iYali_model.genes:
    # Store dictionary (old, new):
    gene_dict[gene.id] = gene.name
    # Change gene name:
    gene.name = gene.id

# Apply dictionary to change all ids using cobra function:
cobra.manipulation.modify.rename_genes(iYali_model, gene_dict)

# Save new model:
cobra.io.save_json_model(iYali_model, "iYali-model.json")
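One thing worth checking before a rename like this is that the old-to-new mapping is one-to-one: if two genes shared the same name, both would be mapped to the same new id, which is probably not intended. The following sketch runs that check on a plain dictionary. `find_duplicate_targets` and the toy mapping are hypothetical (only the first entry mirrors ids seen in this notebook); in practice the input would be `gene_dict`.

```python
from collections import Counter

def find_duplicate_targets(mapping):
    """Return the new ids that more than one old id maps to."""
    counts = Counter(mapping.values())
    return sorted(new_id for new_id, n in counts.items() if n > 1)

# Toy mapping (old id -> new id); the last two entries collide on purpose.
toy_gene_dict = {
    "YALI1_F25587g": "YALI0F19184g",
    "YALI1_A00001g": "YALI0A00002g",  # made-up id
    "YALI1_A00003g": "YALI0A00002g",  # made-up id
}

duplicates = find_duplicate_targets(toy_gene_dict)
print(duplicates)
```

An empty list means every old id gets its own new id.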

## 3. Fix map

Now we fix the map. The map is a list with two elements. All the map content is in the second one; the first one only holds metadata:

In [9]:
print(len(iYali_map))
iYali_map[0]

2


{'homepage': 'https://escher.github.io',
 'map_description': '\nLast Modified Mon Jul 01 2019 15:05:09 GMT+0200 (Central European Summer Time)',
 'map_id': 'xXdlpXnnusXP',
 'map_name': 'YALI_combined',
 'schema': 'https://escher.github.io/escher/jsonschema/1-0-0#'}

The second element is a dictionary; let's see its keys:

In [10]:
for key in iYali_map[1]:
    print(key)

canvas
nodes
reactions
text_labels


All gene information is in the `reactions` key, stored for each reaction in the `genes` and `gene_reaction_rule` keys (genes don't have a node assigned to them; they just appear as extra text next to the reaction):

In [11]:
iYali_map[1]["reactions"]["1"]  # show as example

{'bigg_id': 'GALt2',
 'gene_reaction_rule': 'YALI1_F25587g',
 'genes': [{'annotation': {'Annotation': '  similar to uniprot|P32466 Saccharomyces cerevisiae YDR345c HXT3 low-affinity hexose transporter or uniprot|P23585 Saccharomyces cerevisiae YMR011w HXT2 high- affinity hexose transporter',
    'Chromosome': 'YALI1F',
    'Strand': '+',
    'YALI0_locus_tag': 'YALI0F19184g',
    'YALI1_locus_tag': 'YALI1_F25587g',
    'sbo': 'SBO:0000252'},
   'bigg_id': 'YALI1_F25587g',
   'name': 'YALI0F19184g'}],
 'label_x': -39744.908111572266,
 'label_y': -26567.39727783203,
 'metabolites': [{'bigg_id': 'gal_c', 'coefficient': 1},
  {'bigg_id': 'gal_e', 'coefficient': -1},
  {'bigg_id': 'h_c', 'coefficient': 1},
  {'bigg_id': 'h_e', 'coefficient': -1}],
 'name': 'D-galactose transport',
 'reversibility': False,
 'segments': {'3': {'b1': None,
   'b2': None,
   'from_node_id': '4',
   'to_node_id': '5'},
  '4': {'b1': None, 'b2': None, 'from_node_id': '6', 'to_node_id': '5'},
  '5': {'b1': {'x': -
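Given that layout, collecting every gene id the map refers to takes one nested loop. The sketch below uses a cut-down toy map with the same nesting as the output above (only the fields needed here are kept, and the second reaction is made up); `collect_gene_ids` is a hypothetical helper name.

```python
def collect_gene_ids(escher_map):
    """Return the set of gene 'bigg_id' values found in all reactions of a map."""
    body = escher_map[1]  # element 0 is the header, element 1 holds the content
    return {
        gene["bigg_id"]
        for reaction in body["reactions"].values()
        for gene in reaction["genes"]
    }

toy_map = [
    {"map_name": "toy"},
    {
        "reactions": {
            "1": {
                "bigg_id": "GALt2",
                "gene_reaction_rule": "YALI1_F25587g",
                "genes": [{"bigg_id": "YALI1_F25587g", "name": "YALI0F19184g"}],
            },
            # Made-up reaction without genes:
            "2": {"bigg_id": "R_toy", "gene_reaction_rule": "", "genes": []},
        }
    },
]

gene_ids = collect_gene_ids(toy_map)
print(gene_ids)
```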

As we already have the dictionary `gene_dict` (old id to new id), we can loop over the map reactions and replace the gene ids and names:

In [12]:
for key, reaction in iYali_map[1]["reactions"].items():
    # Iterate through genes to assign the proper ids/names:
    for gene in reaction["genes"]:
        gene["name"] = gene["bigg_id"]  # The name should be the id
        gene["bigg_id"] = gene_dict[gene["bigg_id"]]  # The id is stored in the dict
    # Replace gene_reaction_rule using info from model:
    reaction_model = iYali_model.reactions.get_by_id(reaction["bigg_id"])
    reaction["gene_reaction_rule"] = reaction_model.gene_reaction_rule
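The loop above assumes that every gene in the map has an entry in `gene_dict` and that every map reaction exists in the model; otherwise the dictionary lookup raises a `KeyError`, and the `get_by_id` call fails as well. A possible pre-check, sketched here on toy data with hypothetical names, is to list the unmatched ids first:

```python
def unmatched_ids(map_body, known_genes, known_reactions):
    """Return (genes, reactions) used in the map but absent from the given sets."""
    missing_genes, missing_reactions = set(), set()
    for reaction in map_body["reactions"].values():
        if reaction["bigg_id"] not in known_reactions:
            missing_reactions.add(reaction["bigg_id"])
        for gene in reaction["genes"]:
            if gene["bigg_id"] not in known_genes:
                missing_genes.add(gene["bigg_id"])
    return missing_genes, missing_reactions

# Toy map body; the second reaction and its gene are made up and unmatched.
toy_body = {
    "reactions": {
        "1": {"bigg_id": "GALt2", "genes": [{"bigg_id": "YALI1_F25587g"}]},
        "2": {"bigg_id": "R_missing", "genes": [{"bigg_id": "G_missing"}]},
    }
}
known_genes = {"YALI1_F25587g"}
known_reactions = {"GALt2"}

missing_genes, missing_reactions = unmatched_ids(toy_body, known_genes, known_reactions)
print(missing_genes, missing_reactions)
```

With the notebook objects, the call would presumably look like `unmatched_ids(iYali_map[1], set(gene_dict), {r.id for r in iYali_model.reactions})`, run before the replacement loop.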

Let's confirm that this worked by looking at the same reaction as before:

In [13]:
iYali_map[1]["reactions"]["1"]  # show as example

{'bigg_id': 'GALt2',
 'gene_reaction_rule': 'YALI0F19184g',
 'genes': [{'annotation': {'Annotation': '  similar to uniprot|P32466 Saccharomyces cerevisiae YDR345c HXT3 low-affinity hexose transporter or uniprot|P23585 Saccharomyces cerevisiae YMR011w HXT2 high- affinity hexose transporter',
    'Chromosome': 'YALI1F',
    'Strand': '+',
    'YALI0_locus_tag': 'YALI0F19184g',
    'YALI1_locus_tag': 'YALI1_F25587g',
    'sbo': 'SBO:0000252'},
   'bigg_id': 'YALI0F19184g',
   'name': 'YALI1_F25587g'}],
 'label_x': -39744.908111572266,
 'label_y': -26567.39727783203,
 'metabolites': [{'bigg_id': 'gal_c', 'coefficient': 1},
  {'bigg_id': 'gal_e', 'coefficient': -1},
  {'bigg_id': 'h_c', 'coefficient': 1},
  {'bigg_id': 'h_e', 'coefficient': -1}],
 'name': 'D-galactose transport',
 'reversibility': False,
 'segments': {'3': {'b1': None,
   'b2': None,
   'from_node_id': '4',
   'to_node_id': '5'},
  '4': {'b1': None, 'b2': None, 'from_node_id': '6', 'to_node_id': '5'},
  '5': {'b1': {'x': -3

Looks good! Let's save results:

In [14]:
storePrettyJSON(iYali_map, "iYali-map.json")
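Beyond inspecting a single reaction, a broader consistency check is that each `gene_reaction_rule` only mentions ids that appear in that reaction's `genes` list. The following is a rough sketch, assuming rules are built only from gene ids, parentheses, and the words `and`/`or`; the helper names and the toy data are hypothetical.

```python
import re

def rule_gene_ids(rule):
    """Split a gene reaction rule into the set of gene ids it mentions."""
    tokens = re.findall(r"[^\s()]+", rule)
    return {t for t in tokens if t.lower() not in ("and", "or")}

def inconsistent_reactions(map_body):
    """Return bigg_ids of reactions whose rule mentions a gene not in 'genes'."""
    bad = []
    for reaction in map_body["reactions"].values():
        listed = {gene["bigg_id"] for gene in reaction["genes"]}
        if not rule_gene_ids(reaction["gene_reaction_rule"]) <= listed:
            bad.append(reaction["bigg_id"])
    return sorted(bad)

# Toy map body; the second reaction is made up and deliberately inconsistent.
toy_body = {
    "reactions": {
        "1": {
            "bigg_id": "GALt2",
            "gene_reaction_rule": "YALI0F19184g",
            "genes": [{"bigg_id": "YALI0F19184g"}],
        },
        "2": {
            "bigg_id": "R_toy",
            "gene_reaction_rule": "(G_a and G_b) or G_c",
            "genes": [{"bigg_id": "G_a"}, {"bigg_id": "G_b"}],
        },
    }
}

bad = inconsistent_reactions(toy_body)
print(bad)
```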

## 4. Fix new id bug

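The new ids are missing the underscore after the `YALI0` prefix (e.g. `YALI0F19184g` should be `YALI0_F19184g`). Let's add it to all such ids, both in the model and in the map. Gene ids in the model that do not start with `YALI0` are printed and left unchanged: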

In [15]:
# model changes:
iYali_model = cobra.io.load_json_model("iYali-model.json")
gene_dict = {}
for gene in iYali_model.genes:
    if gene.id.startswith("YALI0"):
        if not gene.id.startswith("YALI0_"):
            gene_dict[gene.id] = f"YALI0_{gene.id.split('YALI0')[1]}"
    else:
        print(gene.id)
cobra.manipulation.modify.rename_genes(iYali_model, gene_dict)
cobra.io.save_json_model(iYali_model, "iYali-model.json")

# map changes:
iYali_map = openJSON("iYali-map.json")
for key, reaction in iYali_map[1]["reactions"].items():
    for gene in reaction["genes"]:
        if gene["bigg_id"].startswith("YALI0"):
            if not gene["bigg_id"].startswith("YALI0_"):
                gene["bigg_id"] = gene_dict[gene["bigg_id"]]
    reaction_model = iYali_model.reactions.get_by_id(reaction["bigg_id"])
    reaction["gene_reaction_rule"] = reaction_model.gene_reaction_rule
storePrettyJSON(iYali_map, "iYali-map.json")

NP_075432
NP_075433
NP_075437
NP_075434
NP_075438
NP_075443
YALIUNK1
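The renaming rule used in the cell above can be isolated into a small function, which makes the skipped cases explicit: ids that already have the underscore, and ids that do not start with `YALI0` (such as the ones printed above), are returned unchanged. `add_underscore` is a hypothetical helper, not part of the notebook, and it slices off the prefix instead of using `split`.

```python
PREFIX = "YALI0"

def add_underscore(gene_id):
    """Insert an underscore after the 'YALI0' prefix if it is missing."""
    if gene_id.startswith(PREFIX) and not gene_id.startswith(PREFIX + "_"):
        return PREFIX + "_" + gene_id[len(PREFIX):]
    return gene_id

for example in ["YALI0F19184g", "YALI0_F19184g", "NP_075432", "YALIUNK1"]:
    print(example, "->", add_underscore(example))
```

Because the function leaves already-fixed ids alone, applying it twice gives the same result as applying it once.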


In [1]:
# save SBML as well:
import cobra
iYali_model = cobra.io.load_json_model("iYali-model.json")
cobra.io.write_sbml_model(iYali_model, "iYali-model.xml")

Using license file C:\Users\bejsab\gurobi.lic
Academic license - for non-commercial use only
