Paolo Cozzi edited this page Nov 9, 2016 · 1 revision

Usage examples

Importing the correct class

The ensemblrest library provides two interfaces for interacting with the Ensembl and Ensembl Genomes REST APIs. The two work similarly; the only difference is that Ensembl Genomes lets you search genomes not included in the Ensembl core database, such as plants or bacteria. If you are interested in a species included in the Ensembl core database, such as human, use the EnsemblRest class:

from ensemblrest import EnsemblRest
ensRest = EnsemblRest()

If you need to work with a species not included in the Ensembl core database, use the EnsemblGenomeRest class:

from ensemblrest import EnsemblGenomeRest
ensGenomeRest = EnsemblGenomeRest()

The two classes are very similar. EnsemblGenomeRest implements a few additional methods specific to the Ensembl Genomes REST API.

Search for GO terms

Suppose you need to get the GO annotation for a known Ensembl gene. Annotations of this type are external to the Ensembl annotation pipeline, so they are served by the Cross References endpoints. Take a look at the GET xrefs/id/:id endpoint and its parameters: the Ensembl stable ID is the only required parameter; the other parameters are optional and can be specified as described by the endpoint.

The ensemblrest methods are named with get, the name of the endpoint, the word By and the required parameter, with the first letter of each word after get capitalized. So to get xrefs by id, I need the getXrefsById method. Methods with the word Multiple in their name work like the others but issue POST requests, and can search for several values at once. By inspecting the GET xrefs/id/:id endpoint, you can find this description of the all_levels parameter:

Set to find all genetic features linked to the stable ID, and fetch all external references for them. Specifying this on a gene will also return values from its transcripts and translations

This is useful here. To get all xref annotations for a gene, I could do this:

xrefs = ensGenomeRest.getXrefsById(id='OS11G0485200', all_levels=1)

In this example, I used an EnsemblGenomeRest object because OS11G0485200 is an O. sativa stable gene ID; to search for a human stable ID, use an EnsemblRest instance instead. Parameters are passed as key=value pairs, where the keys have the same names as the parameters listed by the Ensembl endpoint, and the values must match the types described there. In the example above, xrefs is a list of dictionaries holding all the external annotations defined for OS11G0485200. To restrict the results to GO terms only, I could filter using the external_db parameter:

xrefs = ensGenomeRest.getXrefsById(id='OS11G0485200', all_levels=1, external_db='GO')
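Both calls above follow the naming rule described earlier: get, the endpoint name, By, then the required parameter. As a rough illustration only, the rule can be sketched as a small function. Note that endpoint_to_method is a hypothetical helper and not part of ensemblrest; it has only been checked against the two GET endpoints mentioned on this page, so treat the library's own method list as authoritative.

```python
def endpoint_to_method(endpoint):
    """Guess the ensemblrest method name for a GET endpoint string.

    Hypothetical helper for illustration; not part of ensemblrest.
    """
    # "GET xrefs/id/:id" -> path "xrefs/id/:id" (the HTTP verb is ignored)
    _, path = endpoint.split()
    segments = path.split("/")
    # Segments starting with ":" are parameters, the rest are literal words
    literals = [s for s in segments if not s.startswith(":")]
    params = [s[1:] for s in segments if s.startswith(":")]
    required = params[-1]
    # Assumption: a literal that merely repeats the parameter name
    # (the "id" in "xrefs/id/:id") is not repeated in the method name
    if literals[-1] == required:
        literals = literals[:-1]
    name = "".join(word.capitalize() for word in literals)
    return "get" + name + "By" + required.capitalize()


print(endpoint_to_method("GET xrefs/id/:id"))
print(endpoint_to_method("GET ontology/id/:id"))
```

If the guessed name exists on your object, it could be looked up with getattr(ensGenomeRest, name); a missing attribute means the guess does not apply to that endpoint.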

The description of each GO annotation may be incomplete, so you may want to use the Ontologies and Taxonomy endpoints to enrich your annotation; for example, the GET ontology/id/:id endpoint does the job. The corresponding ensemblrest method is getOntologyById. If I am not interested in the children and parents of a term, I can filter them out by setting the simple=1 parameter. In the following example, I use the first result returned by getXrefsById:

xref = xrefs[0]
print(xref)
# {u'display_id': u'GO:0000166', u'description': u'nucleotide binding', u'db_display_name': u'GO', u'info_text': u'Generated via main', u'linkage_types': [u'IEA'], u'info_type': u'DEPENDENT', u'primary_id': u'GO:0000166', u'synonyms': [], u'version': u'0', u'dbname': u'GO'}
ontology = ensGenomeRest.getOntologyById(id=xref['display_id'], simple=1)
print(ontology)
#{u'definition': u'"Interacting selectively and non-covalently with a nucleotide, any compound consisting of a nucleoside that is esterified with (ortho)phosphate or an oligophosphate at any hydroxyl group on the ribose or deoxyribose." [GOC:mah, ISBN:0198547684]', u'name': u'nucleotide binding', u'subsets': [u'goslim_chembl', u'goslim_metagenomics', u'goslim_pir', u'goslim_plant', u'gosubset_prok'], u'namespace': u'molecular_function', u'accession': u'GO:0000166', u'synonyms': [], u'ontology': u'GO'}
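Once both records are available, they can be combined into a single annotation. The sketch below is one possible way to do it, using only keys that appear in the two printed dictionaries. merge_annotation is a hypothetical helper, not an ensemblrest function, and the sample dictionaries are trimmed copies of the output shown above so that the snippet runs without a network connection.

```python
def merge_annotation(xref, ontology):
    """Combine an xref record with its ontology record.

    Hypothetical helper; relies only on keys shown in the output above.
    """
    return {
        "accession": xref["display_id"],
        "linkage_types": xref["linkage_types"],
        "name": ontology["name"],
        "namespace": ontology["namespace"],
    }


# Trimmed copies of the records printed above (no REST call is made here)
xref = {
    "display_id": "GO:0000166",
    "description": "nucleotide binding",
    "linkage_types": ["IEA"],
    "dbname": "GO",
}
ontology = {
    "accession": "GO:0000166",
    "name": "nucleotide binding",
    "namespace": "molecular_function",
    "ontology": "GO",
}

annotation = merge_annotation(xref, ontology)
print(annotation["accession"], annotation["name"], annotation["namespace"])
```

In real use, xref and ontology would be the values returned by getXrefsById and getOntologyById.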