In [1]:
# Imports
import yaml
import urllib.parse    # quote() for encoding IRIs
import urllib.request  # build_opener() for the API calls
import json
import pandas as pd

# Using the Bioportal API to retrieve the mappings of terms in the eNM ontology

# Motivation
The eNanoMapper ontology imports terms from several ontologies. For example, CHEBI and NPO both contain the class `chemical substance`, but eNanoMapper imports only the CHEBI one, [CHEBI_59999](http://purl.obolibrary.org/obo/CHEBI_59999), and not [NPO_1973](http://purl.bioontology.org/ontology/npo#NPO_1973). This notebook checks whether the same 'mapping' is also present in Bioportal.

# Approach
Bioportal offers programmatic access through its REST API, which makes it possible to retrieve the Bioportal mapping data (_LOOM_ and IRI matching) for a given class.
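
As a minimal sketch of what such a request looks like, the snippet below builds the mappings URL for CHEBI_59999 using only the standard library. The helper name `build_mappings_url` is hypothetical; the URL pattern is the one used by the lookup function in this notebook. An actual request additionally needs an API key in the `Authorization` header.

```python
import urllib.parse

BASE = "https://data.bioontology.org"

def build_mappings_url(acronym: str, iri: str) -> str:
    """Hypothetical helper: build the mappings URL for one class."""
    # safe="" also percent-encodes ':' and '/', so the whole IRI
    # fits into a single path segment of the request URL.
    encoded = urllib.parse.quote(iri, safe="")
    return f"{BASE}/ontologies/{acronym}/classes/{encoded}/mappings"

url = build_mappings_url("CHEBI", "http://purl.obolibrary.org/obo/CHEBI_59999")
print(url)
```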

In [2]:
with open("../config.yaml", "r") as f:
    config = yaml.safe_load(f)
    dependencies = [dep.upper() for dep in config["dependencies"]]
    API_KEY = config["bioportal_api_key"]
    classes = config["classes"]
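
The cell above expects three keys in `config.yaml`. The sketch below shows the parsed structure it assumes, written as the Python dictionary that `yaml.safe_load` would return. The term label `example_term`, the dependency list, and the placeholder key are illustrative assumptions, not the contents of the real file; the two IRIs are taken from the results table of this notebook.

```python
# Assumed shape of the parsed config (illustrative values only)
config = {
    "bioportal_api_key": "YOUR-API-KEY",  # placeholder, not a real key
    "dependencies": ["npo", "chebi"],     # ontology acronyms, any case
    "classes": {
        "example_term": {                 # hypothetical label
            "kept": "http://purl.bioontology.org/ontology/npo#NPO_401",
            "removed": "http://purl.obolibrary.org/obo/CHEBI_50825",
        },
    },
}

# Same post-processing as in the cell above
dependencies = [dep.upper() for dep in config["dependencies"]]
classes = config["classes"]

for label, pair in classes.items():
    print(label, pair["kept"], pair["removed"])
```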

Setting up a function to use the API ([documentation](https://data.bioontology.org/documentation#Mapping), [example response](https://data.bioontology.org/ontologies/CHEBI/classes/http%3A%2F%2Fpurl.obolibrary.org%2Fobo%2FCHEBI_59999/mappings)):

In [3]:
def mappings_lookup(dependencies, classes):
    """Check whether Bioportal maps the kept term to the removed term.

    Returns True or False, or None if the lookup could not be performed.
    """
    iri = classes['kept']
    removed = classes['removed']
    # Guess the source ontology from the acronym embedded in the IRI
    source = ""
    for d in dependencies:
        if d.casefold() in iri.casefold():
            source = d
    if source == "":
        print(f'No mappings found for {iri}')
        return None  # exit() would try to shut down the notebook kernel
    f_iri = urllib.parse.quote(iri, safe='')
    rq = f"https://data.bioontology.org/ontologies/{source}/classes/{f_iri}/mappings"
    opener = urllib.request.build_opener()
    opener.addheaders = [('Authorization', 'apikey token=' + API_KEY)]
    try:
        r = json.loads(opener.open(rq).read())
    except Exception as e:
        print(e)
        return None
    # Each mapping lists the classes it links; collect all of their IRIs
    matches = [c["@id"] for mapping in r for c in mapping["classes"]]
    return removed in matches
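
To make the matching step concrete, here is a sketch that runs the same extraction on a hand-written stand-in for the API response. Only the `classes` and `@id` keys that `mappings_lookup` relies on are modelled; the real response carries more fields, and the helper name `mapped_iris` is hypothetical.

```python
# Stand-in response: a list of mappings, each linking two classes.
# Hand-written for illustration, not an actual API reply.
mock_response = [
    {"classes": [
        {"@id": "http://purl.bioontology.org/ontology/npo#NPO_401"},
        {"@id": "http://purl.obolibrary.org/obo/CHEBI_50825"},
    ]},
]

def mapped_iris(response):
    """Hypothetical helper: flatten all class IRIs in a mappings response."""
    return [cls["@id"] for mapping in response for cls in mapping["classes"]]

iris = mapped_iris(mock_response)
print("http://purl.obolibrary.org/obo/CHEBI_50825" in iris)  # True
print("http://purl.obolibrary.org/obo/CHEBI_59999" in iris)  # False
```

Note that the flattened list also contains the queried class itself, which is harmless here because only the removed IRI is tested for membership.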

Using the function to look up the mappings for the terms specified in the [config file](config.yaml), collecting the results in a DataFrame, and saving them to a CSV file:

In [4]:
result = []
for pair in classes.values():
    in_bioportal = mappings_lookup(dependencies=dependencies, classes=pair)
    result.append([pair['kept'], pair['removed'], in_bioportal])
result = pd.DataFrame(result, columns=['kept', 'removed', 'in_bioportal'])

No mappings found for http://purl.enanomapper.org/onto/ENM_0000089
HTTP Error 404: Not Found
HTTP Error 404: Not Found


In [5]:
result.to_csv('../results/in_bioportal.csv', index=False)
result

Unnamed: 0,kept,removed,in_bioportal
0,http://purl.bioontology.org/ontology/npo#NPO_401,http://purl.obolibrary.org/obo/CHEBI_50825,True
1,http://purl.bioontology.org/ontology/npo#NPO_1542,http://purl.obolibrary.org/obo/CHEBI_50836,True
2,http://purl.bioontology.org/ontology/npo#NPO_729,http://purl.obolibrary.org/obo/CHEBI_50822,True
3,http://purl.bioontology.org/ontology/npo#NPO_1892,http://purl.obolibrary.org/obo/CHEBI_50826,True
4,http://purl.bioontology.org/ontology/npo#NPO_1893,http://purl.obolibrary.org/obo/CHEBI_52522,True
5,http://purl.enanomapper.org/onto/ENM_0000089,http://www.bioassayontology.org/bao#BAO_0002135,
6,http://purl.obolibrary.org/obo/OBI_0000891,http://www.bioassayontology.org/bao#BAO_0002805,True
7,http://purl.obolibrary.org/obo/BTO_0000018,http://www.ebi.ac.uk/efo/EFO_0001086,False
8,http://purl.bioontology.org/ontology/npo#NPO_639,http://purl.obolibrary.org/obo/ENVO_00010506,True
9,http://www.bioassayontology.org/bao#BAO_0002145,http://purl.bioontology.org/ontology/npo#NPO_1196,True
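
A quick way to summarise the table is to tally the three possible outcomes. The sketch below uses only the standard library and re-types the `in_bioportal` column from the output above by hand, with `None` standing for the row that has no value.

```python
from collections import Counter

# in_bioportal values for rows 0-9, copied from the table above
in_bioportal = [True, True, True, True, True, None, True, False, True, True]

counts = Counter(in_bioportal)
print(f"confirmed: {counts[True]}, "
      f"not found: {counts[False]}, "
      f"undetermined: {counts[None]}")
# confirmed: 8, not found: 1, undetermined: 1
```

On the DataFrame itself, `result['in_bioportal'].value_counts(dropna=False)` should give the same tally.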
