# BridgeDb tutorial: Gene HGNC name to Ensembl identifier

This tutorial explains how to use the BridgeDb identifier mapping service to translate HGNC gene symbols to Ensembl identifiers. This step is part of the OpenRiskNet use case that links Adverse Outcome Pathways to [WikiPathways](https://wikipathways.org/).

First, we load the Python `requests` library, which lets us call the [BridgeDb REST webservice](http://bridgedb.prod.openrisknet.org/swagger/):

In [2]:
import requests

Let's assume we're interested in the gene with the HGNC symbol MECP2. The API call that retrieves its mappings is given below as `callUrl`. Here, the `H` indicates that the query (`MECP2`) is an HGNC symbol:

In [3]:
callUrl = 'https://bridgedb.cloud.vhp4safety.nl/Human/xrefs/H/MECP2'

By default, the call returns identifiers from all databases, not just Ensembl:

In [4]:
response = requests.get(callUrl)
response.text

'4027037\tAffy\n4027036\tAffy\n4027039\tAffy\nNP_001356323\tRefSeq\n4027038\tAffy\nNP_001356322\tRefSeq\nGO:0042551\tGeneOntology\nNP_001356321\tRefSeq\nuc065car.1\tUCSC Genome Browser\nNP_001356320\tRefSeq\n4204\tWikiGenes\nGO:0051707\tGeneOntology\nGO:0043524\tGeneOntology\nHMNXSV003039744\tAgilent\nILMN_1702715\tIllumina\nGO:0045944\tGeneOntology\nuc065caz.1\tUCSC Genome Browser\nA_33_P3339036\tAgilent\nGO:0006576\tGeneOntology\nuc065cbc.1\tUCSC Genome Browser\n4027031\tAffy\n4027030\tAffy\n4027033\tAffy\n4027032\tAffy\n4027035\tAffy\n4027034\tAffy\nGO:0060090\tGeneOntology\n4027048\tAffy\n4027047\tAffy\n4027049\tAffy\nGO:0090063\tGeneOntology\nGO:0002087\tGeneOntology\nGO:0007416\tGeneOntology\nXP_047298078\tRefSeq\nXP_047298073\tRefSeq\nXP_047298071\tRefSeq\nXP_047298072\tRefSeq\nGO:0007420\tGeneOntology\n4027040\tAffy\n4027042\tAffy\nGO:0046470\tGeneOntology\n4027041\tAffy\n4027044\tAffy\n4027043\tAffy\nGO:0010385\tGeneOntology\nILMN_3310740\tIllumina\n4027046\tAffy\n4027045\tAff

The results are returned as tab-separated values (TSV) with two columns: the identifier and the database it belongs to.

Next, we convert this reply into a Python dictionary, keeping only the Ensembl entries. The identifier serves as the key, because one database may have multiple identifiers:

In [5]:
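lines = response.text.split("\n")
mappings = {}
for line in lines:
    # Skip lines without a tab, such as the empty string after the final newline
    if '\t' in line:
        fields = line.split('\t')
        identifier = fields[0]
        database = fields[1]
        # Keep only the Ensembl mappings
        if database == "Ensembl":
            mappings[identifier] = database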

print(mappings)

{'ENSG00000169057': 'Ensembl'}
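
If mappings for several databases are of interest, the same parsing can be generalized. The following is a minimal sketch, not part of the original notebook: the helper name `group_xrefs_by_database` is hypothetical, and the sample string is a short excerpt in the same two-column format as the response shown above, so the example runs without a network call.

```python
from collections import defaultdict


def group_xrefs_by_database(tsv_text):
    """Group 'identifier<TAB>database' lines by database name.

    Hypothetical helper for illustration; it is not part of BridgeDb.
    """
    grouped = defaultdict(list)
    for line in tsv_text.splitlines():
        # Skip blank or malformed lines that lack a tab separator
        if '\t' not in line:
            continue
        identifier, database = line.split('\t', 1)
        grouped[database].append(identifier)
    return dict(grouped)


# Excerpt in the same format as the BridgeDb reply for MECP2
sample = (
    'ENSG00000169057\tEnsembl\n'
    '4204\tWikiGenes\n'
    'GO:0042551\tGeneOntology\n'
    'GO:0051707\tGeneOntology\n'
)
by_database = group_xrefs_by_database(sample)
print(by_database['Ensembl'])       # ['ENSG00000169057']
print(by_database['GeneOntology'])  # ['GO:0042551', 'GO:0051707']
```

With a live response, `group_xrefs_by_database(response.text)` would give one list of identifiers per database, which makes it easy to pick out any target database afterwards.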


Alternatively, we can restrict the values returned by the BridgeDb webservice to Ensembl identifiers only (system code `En`). To do this, we add the `dataSource=En` query parameter to the call:

In [6]:
callUrl = 'https://webservice.bridgedb.org/Human/xrefs/H/MECP2?dataSource=En'
response = requests.get(callUrl)
response.text

'ENSG00000169057\tEnsembl\n'
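
The call URLs used in this tutorial follow one pattern: base URL, organism, `xrefs`, the system code of the query, the query identifier, and an optional `dataSource` parameter. As a rough sketch, the URL could be assembled with a small helper instead of being typed by hand. The function name `build_xrefs_url` and its parameter names are assumptions made for this example; only the URL layout itself comes from the calls above.

```python
from urllib.parse import quote, urlencode


def build_xrefs_url(base_url, organism, system_code, identifier, data_source=None):
    """Assemble a BridgeDb xrefs URL (hypothetical helper for illustration)."""
    # Percent-encode each path segment so unusual identifiers stay valid
    segments = (organism, 'xrefs', system_code, identifier)
    path = '/'.join(quote(segment, safe='') for segment in segments)
    url = base_url.rstrip('/') + '/' + path
    # Only add the query string when a target database is requested
    if data_source is not None:
        url += '?' + urlencode({'dataSource': data_source})
    return url


url = build_xrefs_url('https://webservice.bridgedb.org', 'Human', 'H', 'MECP2',
                      data_source='En')
print(url)  # https://webservice.bridgedb.org/Human/xrefs/H/MECP2?dataSource=En
```

The resulting string can be passed to `requests.get(url)` exactly as `callUrl` is above. Leaving out `data_source` gives the unrestricted call that returns identifiers from all databases.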