# Edge Normalization

## Introduction

The [Biolink Model](https://biolink.github.io/biolink-model/) defines allowed predicates in the Translator ecosystem.  Ingesting data from arbitrary sources requires mapping predicates in those sources to Translator predicates. 
 
The [Biolink Lookup Service](https://bl-lookup-sri.renci.org/apidocs/) can find a predicate if it has an exact mapping in the model. The [EdgeNormalization Service](https://edgenormalization-sri.renci.org/apidocs) takes this a step further and attempts to find the best-matching Biolink predicate even when there is no explicit mapping.

## Direct Lookups

If the Biolink Model defines a direct mapping to a predicate from another vocabulary, then the EdgeNormalization service will find it. In this example, we start with the RO property `RO:0002450` (directly positively regulates activity of). We use the [Biolink Lookup Service](https://bl-lookup-sri.renci.org/apidocs/) to find the predicate that has a direct mapping, and it returns `positively_regulates__entity_to_entity` (positively regulates, entity to entity). Looking up that predicate directly confirms the mapping.

In [26]:
import json
import requests

# Ask the Biolink Lookup Service which predicate maps directly to the RO term.
response = requests.get('https://bl-lookup-sri.renci.org/uri_lookup/RO:0002450')
print(json.dumps(response.json(), indent=2))

# Fetch the predicate's own record and list the external terms mapped to it.
response = requests.get('https://bl-lookup-sri.renci.org/bl/positively_regulates__entity_to_entity')
props = response.json()
print('According to the biolink model, our property has the following mappings:', props['mappings'])

[
  "positively_regulates__entity_to_entity"
]
According to the biolink model, our property has the following mappings: ['RO:0002450', 'SEMMEDDB:STIMULATES']
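The two lookups above can be chained into a small check for whether a term has a direct mapping. The following is a minimal sketch, not part of the Biolink Lookup Service itself: the helper names `uri_lookup_url`, `predicate_url`, and `has_direct_mapping` are hypothetical, and the only assumption about the response is the `mappings` key seen in the output above.

```python
# Base URL taken from the cell above.
LOOKUP_BASE = 'https://bl-lookup-sri.renci.org'


def uri_lookup_url(curie):
    """URL that asks which Biolink predicate a CURIE maps to (hypothetical helper)."""
    return f'{LOOKUP_BASE}/uri_lookup/{curie}'


def predicate_url(name):
    """URL of the record for a Biolink predicate name (hypothetical helper)."""
    return f'{LOOKUP_BASE}/bl/{name}'


def has_direct_mapping(curie, props):
    """Return True if `curie` is listed in the predicate record's `mappings`."""
    return curie in props.get('mappings', [])


# Record shape copied from the output above.
props = {'mappings': ['RO:0002450', 'SEMMEDDB:STIMULATES']}
print(has_direct_mapping('RO:0002450', props))
print(has_direct_mapping('RO:0002354', props))
```

Keeping the URL construction separate from the check makes the check usable on a record that has already been fetched.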


Having seen that the Biolink Model defines a direct mapping for this term (and for `SEMMEDDB:STIMULATES`), we can check what EdgeNormalization does with both: it returns the relevant Biolink predicate. For directly mapped terms, EdgeNormalization simply reproduces the result of `uri_lookup`.

Notice that EdgeNormalization allows batched calls, as seen here.

In [27]:
# Passing a list sends one `predicate` parameter per term in a single request.
response = requests.get('https://edgenormalization-sri.renci.org/resolve_predicate',
                        params={'predicate': ['RO:0002450', 'SEMMEDDB:STIMULATES']})
print('\nNow use these mappings, and see if we can get back to the predicate:')
print(json.dumps(response.json(), indent=2))


Now use these mappings, and see if we can get back to the predicate:
{
  "RO:0002450": {
    "identifier": "RO:0002450",
    "label": "positively_regulates__entity_to_entity"
  },
  "SEMMEDDB:STIMULATES": {
    "identifier": "RO:0002450",
    "label": "positively_regulates__entity_to_entity"
  }
}
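The batched response is keyed by the input term, so it can be reduced to a simple input-to-label table. The sketch below uses only the standard library and assumes the response shape shown above (`identifier` and `label` per input); `build_resolve_url`, `labels_by_input`, and `resolve_predicates` are hypothetical helper names, not part of the service.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Endpoint taken from the cell above.
RESOLVE_URL = 'https://edgenormalization-sri.renci.org/resolve_predicate'


def build_resolve_url(predicates):
    """Build the GET URL, repeating the `predicate` parameter once per input."""
    return RESOLVE_URL + '?' + urlencode({'predicate': list(predicates)}, doseq=True)


def labels_by_input(payload):
    """Map each input term to the label of the predicate it resolved to."""
    return {source: hit['label'] for source, hit in payload.items()}


def resolve_predicates(predicates):
    """Resolve a batch of terms in one request (needs network access)."""
    with urlopen(build_resolve_url(predicates)) as response:
        return labels_by_input(json.load(response))


# Payload copied from the output above, so this part runs offline.
payload = {
    'RO:0002450': {'identifier': 'RO:0002450',
                   'label': 'positively_regulates__entity_to_entity'},
    'SEMMEDDB:STIMULATES': {'identifier': 'RO:0002450',
                            'label': 'positively_regulates__entity_to_entity'},
}
print(labels_by_input(payload))
```

`doseq=True` is what turns the list into repeated `predicate=...` query parameters, which is the same encoding `requests` applies to a list value in `params`.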


## Finding a Biolink Predicate for an Unmapped Term

EdgeNormalization can also return the best Biolink predicate for an unmapped term. At the moment, this functionality is limited to terms from the Relation Ontology (RO). Here, we begin with `RO:0002354` (formed as a result of). Checking the [Biolink Lookup Service](https://bl-lookup-sri.renci.org/apidocs/), we can see that there is no direct mapping:

In [None]:
response = requests.get('https://bl-lookup-sri.renci.org/uri_lookup/RO:0002354')
print(response.status_code)
props = response.json()
print(props)

However, if we call EdgeNormalization, it returns a suitable Biolink Model predicate:

In [28]:
response = requests.get('https://edgenormalization-sri.renci.org/resolve_predicate',
                        params={'predicate': ['RO:0002354']})
print('\nResolve the unmapped term with EdgeNormalization:')
print(json.dumps(response.json(), indent=2))


Resolve the unmapped term with EdgeNormalization:
{
  "RO:0002354": {
    "identifier": "RO:0000056",
    "label": "participates_in"
  }
}
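During ingest, a response like the one above can be applied to source edges to replace their predicates. This is a minimal sketch under stated assumptions: the edge dictionaries and their `subject`, `predicate`, `object`, and `original_predicate` fields are hypothetical, as are the identifiers `EX:1` and `EX:2`, and terms missing from the response are left untouched because this document does not show how the service reports unresolved terms.

```python
def normalize_edges(edges, resolved):
    """Return copies of `edges` with predicates replaced by resolved labels.

    `resolved` has the shape of a resolve_predicate response:
    {input_term: {'identifier': ..., 'label': ...}}.
    """
    normalized = []
    for edge in edges:
        updated = dict(edge)  # copy so the caller's edge is not mutated
        hit = resolved.get(edge['predicate'])
        if hit is not None:
            # Keep the source term so the rewrite can be audited later.
            updated['original_predicate'] = edge['predicate']
            updated['predicate'] = hit['label']
        normalized.append(updated)
    return normalized


# Response copied from the output above; the edges are made-up examples.
resolved = {'RO:0002354': {'identifier': 'RO:0000056', 'label': 'participates_in'}}
edges = [
    {'subject': 'EX:1', 'predicate': 'RO:0002354', 'object': 'EX:2'},
    {'subject': 'EX:2', 'predicate': 'EX:unknown', 'object': 'EX:1'},
]
for edge in normalize_edges(edges, resolved):
    print(edge)
```

Resolving the distinct predicates once and then applying the table keeps the number of service calls independent of the number of edges.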
