In [1]:
# Imports
import yaml
from IPython.display import Markdown, display
import urllib.parse
import urllib.request
import json
import re
import pandas as pd
import numpy as np

# Using the Bioportal API to retrieve the mappings of terms in the eNM ontology

# Motivation
The eNanoMapper ontology imports terms from several ontologies. For example, CHEBI and NPO both contain the class `chemical substance`, but eNanoMapper imports only the CHEBI class, [CHEBI_59999](http://purl.obolibrary.org/obo/CHEBI_59999), and not [NPO_1973](http://purl.bioontology.org/ontology/npo#NPO_1973). How do these homonymous classes map to each other?

# Approach
Bioportal's programmatic access makes it possible to retrieve the mapping data generated by LOOM and IRI matching.
Mappings are retrieved for a selection of relevant, upper(ish)-level classes and checked against all eNanoMapper ontology dependencies.
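
As a minimal sketch of what that programmatic access looks like, the snippet below builds the request URL for one class without sending it. The URL pattern is the one used later in this notebook; the helper name `mappings_url` and the constant `BASE` are introduced here for illustration only.

```python
from urllib.parse import quote

BASE = "https://data.bioontology.org"

def mappings_url(source, iri):
    """Build the mappings URL for a class IRI in a source ontology (hypothetical helper)."""
    # safe='' also percent-encodes ":" and "/", so the IRI fits in one path segment
    return f"{BASE}/ontologies/{source}/classes/{quote(iri, safe='')}/mappings"

url = mappings_url("CHEBI", "http://purl.obolibrary.org/obo/CHEBI_59999")
print(url)
```

Encoding the full IRI matters because an unencoded `/` or `#` would otherwise be read as part of the URL structure rather than as part of the class identifier.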

In [2]:
with open("config.yaml", "r") as f:
    config = yaml.safe_load(f)
    dependencies = [dep.upper() for dep in config["dependencies"]]
    API_KEY = config["api_key"]
    classes = config["classes"]
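
The cell above relies on `config.yaml` having three keys. The sketch below shows the shape that the rest of the notebook appears to assume, inferred from how `classes[term][0]` (source ontology) and `classes[term][1]` (IRI) are used later. The dictionary contents, the placeholder API key, and the helper `check_config` are illustrative assumptions, not the actual configuration.

```python
def check_config(config):
    """Raise ValueError if the config lacks the shape the notebook relies on (hypothetical helper)."""
    for key in ("dependencies", "api_key", "classes"):
        if key not in config:
            raise ValueError(f"missing key: {key}")
    for term, entry in config["classes"].items():
        # Each entry is read as entry[0] = source ontology, entry[1] = class IRI
        if len(entry) != 2:
            raise ValueError(f"{term}: expected [source, iri]")
    return True

# Illustrative values only; this is what yaml.safe_load would return for such a file
example_config = {
    "dependencies": ["chebi", "npo", "bfo"],
    "api_key": "YOUR-API-KEY",
    "classes": {
        "chemical substance": ["CHEBI", "http://purl.obolibrary.org/obo/CHEBI_59999"],
    },
}
ok = check_config(example_config)
print(ok)
```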

Setting up a function to use the API ([documentation](https://data.bioontology.org/documentation#Mapping), [example response](https://data.bioontology.org/ontologies/CHEBI/classes/http%3A%2F%2Fpurl.obolibrary.org%2Fobo%2FCHEBI_59999/mappings)):

In [3]:
def mappings_lookup(dependencies, iri, source):
    """
    This function returns the bioportal mapping of a given IRI in a given source ontology.

    Parameters:
    dependencies (list): List of dependencies to match the mappings against.
    iri (str): IRI of the term for which to retrieve mappings.
    source (str): Source ontology for the term.

    Returns:
    matches: List of matching mappings.
    """
    print(f"IRI: {iri}")
    # Percent-encode the whole IRI so it can be used as a single URL path segment
    f_iri = urllib.parse.quote(iri, safe='')
    # The request does not depend on the dependency list, so it is sent only once
    rq = f"https://data.bioontology.org/ontologies/{source}/classes/{f_iri}/mappings"
    opener = urllib.request.build_opener()
    opener.addheaders = [('Authorization', 'apikey token=' + API_KEY)]
    try:
        r = json.loads(opener.open(rq).read())
    except Exception as e:
        # Report the failure and return an empty list so callers can still iterate
        print(f"Lookup failed for {iri}: {e}")
        return []
    # Each mapping lists the classes it connects; collect their IRIs
    matches = [cls["@id"] for mapping in r for cls in mapping["classes"]]
    # Keep IRIs other than the query term that mention one of the dependencies
    return [match for match in matches if match != iri and any(re.search(dep, match) for dep in dependencies)]
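
The filtering step of the function can be tried without an API key. The sketch below applies the same logic to a hand-written stand-in for a response; the sample data is made up for illustration and is not real Bioportal output, and `extract_matches` is a hypothetical helper name.

```python
import re

def extract_matches(response, iri, dependencies):
    """Collect mapped class IRIs that mention one of the dependencies (hypothetical helper)."""
    # Flatten the "@id" of every class in every mapping
    ids = [cls["@id"] for mapping in response for cls in mapping["classes"]]
    # Drop the query term itself and anything outside the dependency list
    return [i for i in ids if i != iri and any(re.search(dep, i) for dep in dependencies)]

# Hand-written stand-in for an API response (assumed shape, invented content)
sample_response = [
    {"classes": [
        {"@id": "http://purl.obolibrary.org/obo/CHEBI_59999"},
        {"@id": "http://purl.bioontology.org/ontology/npo#NPO_1973"},
    ]},
    {"classes": [
        {"@id": "http://purl.obolibrary.org/obo/CHEBI_59999"},
        {"@id": "http://example.org/other#Term_1"},
    ]},
]

query = "http://purl.obolibrary.org/obo/CHEBI_59999"
sample_matches = extract_matches(sample_response, query, ["CHEBI", "NPO"])
print(sample_matches)
```

Note that the dependency names are matched as plain substrings of the IRI, so a short name could in principle match an unrelated IRI that happens to contain it.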

Using the function to retrieve the mappings for the terms specified in the [config file](config.yaml) and saving them to a JSON file:

In [4]:
mappings = {term:mappings_lookup(dependencies=dependencies, iri=classes[term][1], source = classes[term][0]) for term in classes}
for term in mappings:
    mappings[term] = list(np.unique(mappings[term]))
with open("mappings.json", "w") as outfile:
    json.dump(mappings, outfile)

IRI: http://purl.obolibrary.org/obo/CHEBI_59999
IRI: http://purl.obolibrary.org/obo/CHEBI_23367
IRI: http://purl.obolibrary.org/obo/BFO_0000040
IRI: http://purl.bioontology.org/ontology/npo#NPO_199
IRI: http://purl.bioontology.org/ontology/npo#NPO_707


Saving the result to a table for the repo [README.md](README.md):

In [5]:
pd.DataFrame.from_dict(mappings, orient='index').transpose().to_markdown("README.md")
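
The `from_dict(..., orient='index').transpose()` step is there because the terms generally have different numbers of mappings. A small sketch with toy data (the keys and values below are invented) shows how shorter lists are padded with missing values so that each term becomes one column:

```python
import pandas as pd

# Toy stand-in for the mappings dictionary: lists of unequal length
toy = {"a": ["x", "y"], "b": ["z"]}

# orient='index' turns each key into a row and pads short rows;
# transposing then gives one column per key
table = pd.DataFrame.from_dict(toy, orient="index").transpose()
print(table)
```

Passing the same dictionary directly to `pd.DataFrame(toy)` would raise a `ValueError`, since that constructor expects all columns to have the same length.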