# Ontology mapping with the EBI ontology service APIs

This notebook describes a typical ontology mapping scenario in which we want to map some textual values to ontology terms. In this workflow we will use the EBI's Ontology Lookup Service (OLS) to look up ontology terms by label using the OLS REST API. Once we have a term, we will fetch additional metadata about it, including its synonyms and description. We will then get related terms, such as its parent or ancestral terms. Finally, we will look for mappings to other ontology terms using the EBI's ontology mapping service, OxO.

To run this tutorial you will need access to a running instance of the Ontology Lookup Service and OxO. By default it runs against the public services run by EBI, but you can change the base URLs to use a different instance of either service.

In [9]:
OLS_BASE_URI='https://www.ebi.ac.uk/ols'
OXO_BASE_URI='https://www.ebi.ac.uk/spot/oxo'

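# Press Enter at either prompt to keep the default public EBI service.
# Note: `or` must be applied to the value returned by input(), not to the prompt string.
OLS_BASE_URI = input("Your base OLS URL [{}]: ".format(OLS_BASE_URI)).strip() or OLS_BASE_URI
OXO_BASE_URI = input("Your base OXO URL [{}]: ".format(OXO_BASE_URI)).strip() or OXO_BASE_URI

print("Using OLS {}, OXO {}".format(OLS_BASE_URI, OXO_BASE_URI))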
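
If you point the notebook at your own instance, the base URL may be typed with or without a trailing slash. As a minimal sketch, a small helper could normalize this before the endpoint paths used later in this notebook (`/api/search`, `/api/terms`) are appended. The function name `build_endpoint` is hypothetical and is not part of OLS or OxO.

```python
def build_endpoint(base_url, path):
    """Join a service base URL and an API path with exactly one slash between them."""
    return base_url.strip().rstrip('/') + '/' + path.lstrip('/')


# Both spellings of the base URL give the same endpoint.
print(build_endpoint('https://www.ebi.ac.uk/ols', '/api/search'))
print(build_endpoint('https://www.ebi.ac.uk/ols/', 'api/search'))
```

The cells below concatenate strings directly, which works as long as the base URL has no trailing slash.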


In this tutorial we will use some disease data from the EBI's GWAS Catalog.

* Chronic obstructive pulmonary disease
* Crohn's disease
* Sarcoidosis
* Cystic fibrosis
* Idiopathic pulmonary fibrosis
* Lung adenocarcinoma
* Chronic bronchitis

In [2]:
input_terms = [
    'Chronic obstructive pulmonary disease',
    'Crohn\'s disease',
    'Sarcoidosis',
    'Cystic fibrosis',
    'Idiopathic pulmonary fibrosis',
    'Lung adenocarcinoma',
    'Chronic bronchitis']


## Querying the OLS API by label 

In this first step we will use the [OLS REST API](https://www.ebi.ac.uk/ols/docs/api) to query the [Experimental Factor Ontology](https://www.ebi.ac.uk/efo) for exact label matches to each of our input terms.

In [3]:
import requests
import json
import urllib.parse

OLS_SEARCH_API = OLS_BASE_URI + '/api/search'

term_id_map = {}

for lookup in input_terms:
        
    search_params = {
        'q' : lookup,
        'ontology' : 'efo',
        'exact' : True
    }
    response = requests.get(OLS_SEARCH_API, params=search_params)
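    if response.ok:

        jData = json.loads(response.content)
        docs = jData['response']['docs']

        if not docs:
            # an exact search can return no hits; skip the term rather than fail on docs[0]
            print('OLS search result for \'{}\': no exact match found'.format(lookup))
            continue

        # get the first hit
        first_hit = docs[0]

        # store the mapping for use later
        term_id_map[lookup] = first_hit['obo_id']

        print('OLS search result for \'{}\': mapped to \'{}\' with id {}'.format(lookup, first_hit['label'], first_hit['obo_id']))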
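
The search step above relies on the `response` / `docs` structure of the OLS search result and takes the first hit. As a sketch of how that selection could be isolated and checked without calling the service, the helper below works on an already parsed response. The function name `first_hit_or_none` and the sample payloads are hypothetical, and the identifier shown is a placeholder rather than a real EFO id.

```python
def first_hit_or_none(search_json):
    """Return (label, obo_id) for the first search hit, or None when there are no hits."""
    docs = search_json.get('response', {}).get('docs', [])
    if not docs:
        return None
    hit = docs[0]
    return hit.get('label'), hit.get('obo_id')


# Made-up payloads for illustration only; 'EX:0000001' is not a real ontology id.
with_hit = {'response': {'docs': [{'label': 'example disease', 'obo_id': 'EX:0000001'}]}}
no_hit = {'response': {'docs': []}}

print(first_hit_or_none(with_hit))
print(first_hit_or_none(no_hit))
```

Returning `None` for an empty result lets the calling loop decide whether to skip the term or report it as unmapped.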
        
        
        

## Get term information including synonyms and description

We can use the OLS API to fetch more information about each term, including the term's description and synonyms.

In [4]:
OLS_TERMS_API = OLS_BASE_URI + '/api/terms'

for term_id in term_id_map.values():
    

    lookup_params = {
        'id' : term_id,
        'ontology' : 'efo',
    }
    response = requests.get(OLS_TERMS_API, params=lookup_params)

    if response.ok:
        jData = json.loads(response.content)
        term = jData['_embedded']['terms'][0]

        # not every term has a description, so fall back to an empty string
        description = (term.get('description') or [''])[0]

        print("Term: {} ({}),\nDescription:{}\nSynonyms:{}\n\n"
               .format(
                    term_id,
                    term['label'],
                    description,
                    term.get('synonyms')))
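
The cell above reads `label`, `description` and `synonyms` from the first entry under `_embedded` / `terms`. A minimal sketch of the same extraction as a reusable function follows, assuming that `description` and `synonyms` may be missing or empty for some terms. The name `summarise_term` and the sample payload are hypothetical.

```python
def summarise_term(terms_json):
    """Return label, first description and synonyms for the first term in a terms response."""
    term = terms_json['_embedded']['terms'][0]
    descriptions = term.get('description') or []
    return {
        'label': term.get('label'),
        'description': descriptions[0] if descriptions else None,
        'synonyms': term.get('synonyms') or [],
    }


# Made-up payload for illustration only.
sample = {'_embedded': {'terms': [{
    'label': 'example disease',
    'description': ['An invented description.'],
    'synonyms': None,
}]}}

print(summarise_term(sample))
```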


## Getting related parent terms

You can use the OLS API to get all direct parent/child terms, or fetch all descendants/ancestors for a given term. In this scenario we will use the `_links` exposed via the API to guide us to the correct REST URL. Also note that in these examples the results are paged. We will use the `_links` again to help navigate the results.

In [5]:
OLS_TERMS_API = OLS_BASE_URI + '/api/terms'

for term_id in term_id_map.values():
    
    lookup_params = {
        'id' : term_id,
        'ontology' : 'efo',
    }
    response = requests.get(OLS_TERMS_API, params=lookup_params)

    if response.ok:
        jData = json.loads(response.content)

        # get the URL for direct parents and ancestors
        label = jData['_embedded']['terms'][0]['label']
        
        parents_url = jData['_embedded']['terms'][0]['_links']['parents']['href']
        ancestors_url = jData['_embedded']['terms'][0]['_links']['ancestors']['href']
   
        parents_response = requests.get(parents_url)
        
        print("{}({})".format(term_id, label))
        if parents_response.ok:
            
            jData = json.loads(parents_response.content)
            for parent in jData['_embedded']['terms']:
                print ("\t child of --> {} ({})".format(
                    parent['obo_id'],
                    parent['label']
                ))
                
            print("\n\n")
                            

        more_paged_results = True
        print("{}({})".format(term_id, label))
        while more_paged_results:
            ancestor_response = requests.get(ancestors_url)
            if not ancestor_response.ok:
                # stop paging on a failed request instead of retrying the same URL forever
                break

            jData = json.loads(ancestor_response.content)
            for ancestor in jData['_embedded']['terms']:
                print("\t descendant of --> {} ({})".format(
                    ancestor['obo_id'],
                    ancestor['label']
                ))

            if 'next' in jData['_links']:
                ancestors_url = jData['_links']['next']['href']
            else:
                more_paged_results = False
                print("\n\n")
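
The paging pattern used above (follow `_links` / `next` / `href` until it is absent) can be factored into a generator. The sketch below takes the page-fetching function as an argument so it can be tried against in-memory data. The names `iter_paged_terms` and `fetch_json`, and the fake pages, are hypothetical.

```python
def iter_paged_terms(first_url, fetch_json):
    """Yield every term from a paged result, following the 'next' link on each page.

    fetch_json is any callable that takes a URL and returns the parsed JSON page.
    """
    url = first_url
    while url is not None:
        page = fetch_json(url)
        for term in page.get('_embedded', {}).get('terms', []):
            yield term
        # A missing 'next' link means this was the last page.
        url = page.get('_links', {}).get('next', {}).get('href')


# Two made-up pages keyed by made-up URLs, for illustration only.
fake_pages = {
    'page-1': {'_embedded': {'terms': [{'obo_id': 'EX:1', 'label': 'first'}]},
               '_links': {'next': {'href': 'page-2'}}},
    'page-2': {'_embedded': {'terms': [{'obo_id': 'EX:2', 'label': 'second'}]},
               '_links': {}},
}

for term in iter_paged_terms('page-1', fake_pages.__getitem__):
    print(term['obo_id'], term['label'])
```

In the notebook, `fetch_json` could be something like `lambda url: requests.get(url).json()`, ideally with a status check before parsing.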

## Checking for subsumptions

You can use the OLS API to test whether a term is a child of a particular higher-level category. We want to know if any of the terms above are a type of cancer. Cancer in EFO has the full URI `http://www.ebi.ac.uk/efo/EFO_0000311`.

In [6]:
EFO_CANCER_TERM = 'http://www.ebi.ac.uk/efo/EFO_0000311'

for term_name in term_id_map.keys():
    
    search_params = {
        'q' : term_name,
        'exact' : True,
        'childrenOf' : EFO_CANCER_TERM,
        'ontology' : 'efo',
    }
    response = requests.get(OLS_SEARCH_API, params=search_params)

    if response.ok:
        jData = json.loads(response.content)
        
        # if we get a result then it must be a child
        
        if len(jData['response']['docs']) > 0:
            print("{} is a type of cancer".format(jData['response']['docs'][0]['label']))
        else:
            print("{} is not a type of cancer".format(term_name))
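
The check above treats "at least one hit when the search is restricted with `childrenOf`" as meaning the term falls under the category. A sketch of that logic as a function returning a name-to-boolean map is shown below. The search call is passed in, so the example runs against a stub. The names `subsumption_params`, `classify_terms` and `stub_search`, and the term names, are hypothetical.

```python
def subsumption_params(term_name, parent_iri, ontology='efo'):
    """Build the search parameters used in the cell above for one term."""
    return {'q': term_name, 'exact': True, 'childrenOf': parent_iri, 'ontology': ontology}


def classify_terms(term_names, parent_iri, search):
    """Return {term_name: True/False}; True when the restricted search has any hit.

    search is any callable that takes a parameter dict and returns parsed search JSON.
    """
    result = {}
    for name in term_names:
        docs = search(subsumption_params(name, parent_iri))['response']['docs']
        result[name] = len(docs) > 0
    return result


def stub_search(params):
    """Stand-in for the real service: only 'term A' is reported under the parent."""
    docs = [{'label': params['q']}] if params['q'] == 'term A' else []
    return {'response': {'docs': docs}}


print(classify_terms(['term A', 'term B'], 'http://www.ebi.ac.uk/efo/EFO_0000311', stub_search))
```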



## Finding mappings to other ontologies

You will sometimes need to map between ontologies, especially the disease ontologies. You can use the EBI's OxO service to look up mappings from existing ontologies. Here we will use the [OxO REST API](https://www.ebi.ac.uk/spot/oxo/docs/api) to find mappings for the terms used above.

You can submit multiple ids at once to OxO. We will also restrict the mappings to a set of disease ontologies: SNOMEDCT and ICD10CM. We only want mappings from a trusted source, so we will set MONDO and EFO as the allowed sources for mappings. We set the distance to 1 to get only direct mappings. In the next section we will look at distance 2 and how that changes the result. Also note that the results from OxO are paged.


In [7]:
    
input_data = {
    'ids' : term_id_map.values(),
    "mappingTarget": ["SNOMEDCT", "ICD10CM"],
    "mappingSource": ["EFO", "MONDO"],
    'distance' : 1
}

def get_mappings_from_oxo (input_data):
    oxo_search_url = OXO_BASE_URI + '/api/search'
    more_paged_results = True
    while more_paged_results:

        oxo_response = requests.post(oxo_search_url, data=input_data)

        if not oxo_response.ok:
            # stop on a failed request; otherwise the loop would retry the same URL forever
            print("OxO request failed with status {}".format(oxo_response.status_code))
            break

        jData = json.loads(oxo_response.content)

        for oxo_result in jData["_embedded"]["searchResults"]:
            print("{} ({})".format(oxo_result['queryId'], oxo_result['label']))

            for mapping in oxo_result['mappingResponseList']:
                print("\tmaps to {} ({})".format(
                    mapping["curie"],
                    mapping["label"]
                ))

        if 'next' in jData['_links']:
            oxo_search_url = jData['_links']['next']['href']
        else:
            more_paged_results = False
                
get_mappings_from_oxo(input_data)

Let's increase the distance and try again. Setting the distance to 2 will find indirect mappings and should increase the coverage for some terms. 

In [8]:
input_data = {
    'ids' : term_id_map.values(),
    "mappingTarget": ["SNOMEDCT", "ICD10CM"],
    "mappingSource": ["EFO", "MONDO"],
    'distance' : 2
}
get_mappings_from_oxo(input_data)
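
To see how the larger distance changes coverage, it can help to collect the mappings into a dictionary instead of printing them, then compare the two runs. The sketch below assumes the `searchResults` entries have the `queryId`, `mappingResponseList` and `curie` fields used above. The names `collect_mappings` and `gained_mappings`, and all identifiers in the sample data, are hypothetical.

```python
def collect_mappings(search_results):
    """Build {queryId: set of mapped CURIEs} from a list of OxO searchResults entries."""
    mappings = {}
    for result in search_results:
        curies = {m['curie'] for m in result.get('mappingResponseList', [])}
        mappings.setdefault(result['queryId'], set()).update(curies)
    return mappings


def gained_mappings(direct, indirect):
    """Return, per query id, the CURIEs that appear only in the larger-distance run."""
    gained = {}
    for query_id, curies in indirect.items():
        extra = curies - direct.get(query_id, set())
        if extra:
            gained[query_id] = extra
    return gained


# Made-up results for illustration only; none of these are real identifiers.
distance_1 = [{'queryId': 'EX:1', 'mappingResponseList': [{'curie': 'TGT:100'}]},
              {'queryId': 'EX:2', 'mappingResponseList': []}]
distance_2 = [{'queryId': 'EX:1', 'mappingResponseList': [{'curie': 'TGT:100'}]},
              {'queryId': 'EX:2', 'mappingResponseList': [{'curie': 'TGT:200'}]}]

print(gained_mappings(collect_mappings(distance_1), collect_mappings(distance_2)))
```

Using sets means a CURIE reached through more than one route is counted once per query id.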

# Summary

We've shown some of the functionality of the EBI's Ontology Lookup Service and the OxO mapping API. For more information about the EBI's ontology services see http://www.ebi.ac.uk/spot/ontology. You can also contact the team at ols-support@ebi.ac.uk.