# Name Resolver (NameRes)

The [Babel pipeline](https://github.com/NCATSTranslator/Babel) generates sets of equivalent identifiers that the [Node Normalizer](https://github.com/NCATSTranslator/NodeNormalization) uses to harmonize identifiers from different sources. It also collects all known synonyms for these identifiers, and the Name Resolver can be used to search through those synonyms.

## Instances of NodeNorm

For the examples in this document, we will use the _development_ version of the Node Normalization Service, hosted at https://nodenormalization-sri.renci.org/ by the [Renaissance Computing Institute](https://renci.org/) (RENCI) at the University of North Carolina. This version is updated more frequently than the production instance listed below.

The production instance of NodeNorm is hosted by the NCATS Translator project at https://nodenorm.transltr.io/docs, and may be older than the development version described above. As with other NCATS Translator tools, a [CI instance](https://nodenorm.ci.transltr.io/docs) and a [Test instance](https://nodenorm.test.transltr.io/docs) have also been deployed, but these are unlikely to be useful for non-Translator users.

## Searching by using a text string

Suppose that a user has the text string "diabetes" and wants to turn it into an identifier. Many concepts might be appropriate. Perhaps they are looking for the identifier for the disease "Diabetes Mellitus", for a subtype such as "Type 2 Diabetes Mellitus", or for "diabetes drugs". This service searches the lexical synonyms of concepts and returns the matching identifiers, along with the full set of synonyms for each returned identifier.

In [8]:
import requests
import json
nameres_lookup_url = 'https://name-resolution-sri.renci.org/lookup'

In [9]:
response = requests.post(nameres_lookup_url, params={'string':'diabetes', 'limit': 5})
print(json.dumps(response.json(),indent=2))

[
  {
    "curie": "UMLS:C0011847",
    "label": "Diabetes",
    "highlighting": {},
    "synonyms": [
      "Diabetes",
      "diabetes NOS"
    ],
    "taxa": [],
    "types": [
      "biolink:Disease",
      "biolink:DiseaseOrPhenotypicFeature",
      "biolink:BiologicalEntity",
      "biolink:ThingWithTaxon",
      "biolink:NamedThing",
      "biolink:Entity"
    ],
    "score": 741.0644,
    "clique_identifier_count": 1
  },
  {
    "curie": "MONDO:0005015",
    "label": "diabetes mellitus",
    "highlighting": {},
    "synonyms": [
      "dm",
      "DM",
      "Diabetes",
      "diabetes",
      "Diabetes NOS",
      "diabetes (DM)",
      "diabete mellitus",
      "DIABETES MELLITUS",
      "Diabetes mellitus",
      "diabetes mellitus",
      "Diabetes Mellitus",
      "Diabetes mellitus NOS",
      "DIABETES MELLITUS NOS",
      "DM - Diabetes mellitus",
      "diabetes mellitus (DM)",
      "Diabetes mellitus, NOS",
      "Diabetes mellitus (DM)",
      "disorder diabetes me

Note that this query returned 5 results. The `limit` parameter, which defaults to 10, sets the maximum number of results returned.
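
Because each result carries its full synonym list, the raw JSON can be long. The following is a minimal sketch of one way to reduce the results to a compact summary. The helper names `lookup` and `summarize_results` are hypothetical (they are not part of NameRes); the sketch assumes only the `curie`, `label` and `score` fields shown in the output above.

```python
NAMERES_LOOKUP_URL = "https://name-resolution-sri.renci.org/lookup"


def summarize_results(results):
    """Reduce each lookup result to a (curie, label, score) tuple."""
    return [(r["curie"], r["label"], r["score"]) for r in results]


def lookup(string, limit=10):
    """Query the lookup endpoint and return the parsed JSON list (hypothetical helper)."""
    import requests  # imported lazily so summarize_results works without network access

    response = requests.post(NAMERES_LOOKUP_URL, params={"string": string, "limit": limit})
    response.raise_for_status()  # fail loudly on HTTP errors instead of parsing an error body
    return response.json()
```

For example, `summarize_results(lookup("diabetes", limit=5))` would give one short tuple per result, which is easier to scan than the full synonym lists.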

Results can be paged using the `offset` parameter. Here we'll get the first two results by setting `limit=2`, and then show that we can get only the second result by setting `limit=1` and `offset=1`.
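
More generally, `limit` and `offset` can drive a loop that walks through the results one page at a time. The sketch below is an illustration under stated assumptions, not part of NameRes: `iter_results` and `nameres_page_fetcher` are hypothetical names, and the loop assumes that a page shorter than the requested `limit` means there are no further results.

```python
def iter_results(fetch_page, page_size=10, max_results=50):
    """Yield results page by page until a short page or max_results is reached.

    fetch_page(limit, offset) must return a list of results.
    """
    offset = 0
    while offset < max_results:
        limit = min(page_size, max_results - offset)
        page = fetch_page(limit, offset)
        yield from page
        if len(page) < limit:
            break  # assumption: a short page means the service has no more results
        offset += len(page)


def nameres_page_fetcher(string, url="https://name-resolution-sri.renci.org/lookup"):
    """Build a fetch_page callable backed by the lookup endpoint (hypothetical helper)."""
    import requests  # imported lazily so iter_results can be exercised offline

    def fetch_page(limit, offset):
        params = {"string": string, "limit": limit, "offset": offset}
        response = requests.post(url, params=params)
        response.raise_for_status()
        return response.json()

    return fetch_page
```

Passing the page fetcher in as an argument keeps the paging logic independent of the HTTP call, so `iter_results(nameres_page_fetcher("diabetes"), page_size=5, max_results=20)` and a test double can share the same loop.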

In [11]:
print("These are the first two results:")
response = requests.post(nameres_lookup_url, params={'string':'diabetes','limit':2})
print(json.dumps(response.json(),indent=2))

print('\nAnd this is just the second result:')
response = requests.post(nameres_lookup_url, params={'string':'diabetes','limit':1, 'offset':1})
print(json.dumps(response.json(),indent=2))

These are the first two results:
[
  {
    "curie": "UMLS:C0011847",
    "label": "Diabetes",
    "highlighting": {},
    "synonyms": [
      "Diabetes",
      "diabetes NOS"
    ],
    "taxa": [],
    "types": [
      "biolink:Disease",
      "biolink:DiseaseOrPhenotypicFeature",
      "biolink:BiologicalEntity",
      "biolink:ThingWithTaxon",
      "biolink:NamedThing",
      "biolink:Entity"
    ],
    "score": 741.0644,
    "clique_identifier_count": 1
  },
  {
    "curie": "MONDO:0005015",
    "label": "diabetes mellitus",
    "highlighting": {},
    "synonyms": [
      "dm",
      "DM",
      "Diabetes",
      "diabetes",
      "Diabetes NOS",
      "diabetes (DM)",
      "diabete mellitus",
      "DIABETES MELLITUS",
      "Diabetes mellitus",
      "diabetes mellitus",
      "Diabetes Mellitus",
      "Diabetes mellitus NOS",
      "DIABETES MELLITUS NOS",
      "DM - Diabetes mellitus",
      "diabetes mellitus (DM)",
      "Diabetes mellitus, NOS",
      "Diabetes mellitus (

## Filtering options
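
The fields returned for each result (`curie`, `types` and `taxa`) are enough to filter a result list on the client side. The sketch below does only that: it is not the service's own filtering mechanism, and `curie_prefix` and `filter_results` are hypothetical helper names. It also assumes that a taxon filter should drop results whose `taxa` list is empty, which may not match how the service itself behaves.

```python
def curie_prefix(curie):
    """Return the part of a CURIE before the first colon."""
    return curie.split(":", 1)[0]


def filter_results(results, biolink_type=None, only_prefixes=None,
                   exclude_prefixes=None, only_taxa=None):
    """Filter lookup results client-side; a filter left as None is skipped."""
    kept = []
    for result in results:
        prefix = curie_prefix(result["curie"])
        if biolink_type and biolink_type not in result.get("types", []):
            continue
        if only_prefixes and prefix not in only_prefixes:
            continue
        if exclude_prefixes and prefix in exclude_prefixes:
            continue
        # Assumption: results with no taxa are dropped when a taxon filter is given.
        if only_taxa and not set(only_taxa) & set(result.get("taxa", [])):
            continue
        kept.append(result)
    return kept
```

For example, `filter_results(results, biolink_type="biolink:Disease", exclude_prefixes={"UMLS"})` would keep only disease results whose CURIE does not start with `UMLS`.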

### Biolink type

### Prefixes: only and exclude

### Taxa: only

## Autocomplete mode

An important use of NameRes in Translator is to support autocomplete functionality on websites. To support this, NameRes includes an `autocomplete` flag, which treats the final word of the search string as possibly incomplete. For example, if you search for "diab" without autocomplete, you get results such as "DiaB Klenz wound & skin cleanser", "Diab Gel" and Wolfram syndrome (which includes "Diab ins,diab mell,opt at,deaf" as a synonym). With autocomplete turned on, results that include diabetes rank higher in the search.
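
One way to see what the flag changes is to compare the result lists returned with and without it. The helper below is a small sketch for that comparison; `new_labels` is a hypothetical name, and it assumes only the `curie` and `label` fields present in each lookup result.

```python
def new_labels(without_autocomplete, with_autocomplete):
    """Return labels of results that appear only in the autocomplete list, in rank order."""
    seen = {result["curie"] for result in without_autocomplete}
    return [result["label"] for result in with_autocomplete if result["curie"] not in seen]
```

Given the parsed JSON from two queries for "diab", one with `autocomplete` set to `False` and one with it set to `True`, this lists the concepts that only the autocomplete search surfaced.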

In [18]:
response = requests.post(nameres_lookup_url, params={'string':'diab','limit':5,'autocomplete':False})
print(json.dumps(response.json(),indent=2))

[
  {
    "curie": "UMLS:C0719857",
    "label": "Diab",
    "highlighting": {},
    "synonyms": [
      "Diab"
    ],
    "taxa": [],
    "types": [
      "biolink:ChemicalEntity",
      "biolink:PhysicalEssence",
      "biolink:ChemicalOrDrugOrTreatment",
      "biolink:ChemicalEntityOrGeneOrGeneProduct",
      "biolink:ChemicalEntityOrProteinOrPolypeptide",
      "biolink:NamedThing",
      "biolink:Entity",
      "biolink:PhysicalEssenceOrOccurrent"
    ],
    "score": 1331.6138,
    "clique_identifier_count": 1
  },
  {
    "curie": "UMLS:C1330194",
    "label": "Diab Gel",
    "highlighting": {},
    "synonyms": [
      "Diab Gel"
    ],
    "taxa": [],
    "types": [
      "biolink:Drug",
      "biolink:ChemicalOrDrugOrTreatment",
      "biolink:OntologyClass",
      "biolink:MolecularMixture",
      "biolink:ChemicalMixture",
      "biolink:ChemicalEntity",
      "biolink:PhysicalEssence",
      "biolink:ChemicalEntityOrGeneOrGeneProduct",
      "biolink:ChemicalEntityOrProtein

In [19]:
response = requests.post(nameres_lookup_url, params={'string':'diab','limit':5,'autocomplete':True})
print(json.dumps(response.json(),indent=2))

[
  {
    "curie": "UMLS:C0719857",
    "label": "Diab",
    "highlighting": {},
    "synonyms": [
      "Diab"
    ],
    "taxa": [],
    "types": [
      "biolink:ChemicalEntity",
      "biolink:PhysicalEssence",
      "biolink:ChemicalOrDrugOrTreatment",
      "biolink:ChemicalEntityOrGeneOrGeneProduct",
      "biolink:ChemicalEntityOrProteinOrPolypeptide",
      "biolink:NamedThing",
      "biolink:Entity",
      "biolink:PhysicalEssenceOrOccurrent"
    ],
    "score": 741.0644,
    "clique_identifier_count": 1
  },
  {
    "curie": "MONDO:0012819",
    "label": "diabetic ketoacidosis",
    "highlighting": {},
    "synonyms": [
      "DKA",
      "KPD",
      "Diabetic Ketoses",
      "diabetic ketosis",
      "Diabetic ketosis",
      "Diabetic Ketosis",
      "diabetic ketoses",
      "Diabetic Acidosis",
      "Diabetic acidosis",
      "DIABETIC ACIDOSIS",
      "Diabetic Acidoses",
      "Ketosis, Diabetic",
      "diabetic acidosis",
      "ACIDOSIS DIABETIC",
      "Acidosis

## Synonyms

The synonyms endpoint can be used to look up synonyms (and other information) for a particular identifier. Note that **the synonyms endpoint will not normalize identifiers**: only the preferred identifier from a NodeNorm result with both GeneProtein and DrugChemical conflation turned on can be looked up in the corresponding NameRes instance. The response includes all the information stored about that clique in Apache Solr, some of which is used to order search results.
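
The two-step flow described here (normalize with NodeNorm, then query the synonyms endpoint with the preferred identifier) can be wrapped in a small helper. This is a minimal sketch rather than an official client: the function names are hypothetical, the URLs and query parameters are the ones used in the cells of this section, and it assumes that a CURIE NodeNorm cannot normalize comes back as a missing or null entry.

```python
NODENORM_URL = "https://nodenormalization-sri.renci.org/get_normalized_nodes"
NAMERES_SYNONYMS_URL = "https://name-resolution-sri.renci.org/synonyms"


def preferred_curie(nodenorm_result, curie):
    """Return the preferred identifier for curie, or None if it was not normalized."""
    entry = nodenorm_result.get(curie)
    if not entry:  # assumption: unknown CURIEs come back missing or null
        return None
    return entry["id"]["identifier"]


def names_for(synonyms_result, curie):
    """Return the list of names for curie, or [] when the entry is empty."""
    entry = synonyms_result.get(curie) or {}
    return entry.get("names", [])


def lookup_names(curie):
    """Normalize curie with NodeNorm, then fetch its names from NameRes."""
    import requests  # imported lazily so the helpers above work offline

    params = {"curie": curie, "conflate": True, "drug_chemical": True}
    response = requests.get(NODENORM_URL, params=params)
    response.raise_for_status()
    normalized = preferred_curie(response.json(), curie)
    if normalized is None:
        return []
    response = requests.get(NAMERES_SYNONYMS_URL, params={"preferred_curies": normalized})
    response.raise_for_status()
    return names_for(response.json(), normalized)
```

Keeping the two parsing steps as separate functions makes the failure case explicit: a non-preferred CURIE yields an empty entry from the synonyms endpoint, which `names_for` turns into an empty list instead of a `KeyError`.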

In [26]:
nodenorm_url = "https://nodenormalization-sri.renci.org/get_normalized_nodes"
curie = "UniProtKB:A0A0S2Z3B5"
# Normalize the CURIE, reusing the `curie` variable instead of repeating the literal.
response = requests.get(nodenorm_url, params={"curie": curie, "conflate": True, "drug_chemical": True})
result = response.json()
print(json.dumps(result, indent=2))
# The preferred identifier is what the NameRes synonyms endpoint expects.
normalized_curie = result[curie]["id"]["identifier"]

{
  "UniProtKB:A0A0S2Z3B5": {
    "id": {
      "identifier": "NCBIGene:1756",
      "label": "DMD"
    },
    "equivalent_identifiers": [
      {
        "identifier": "NCBIGene:1756",
        "label": "DMD",
        "taxa": [
          "NCBITaxon:9606"
        ]
      },
      {
        "identifier": "ENSEMBL:ENSG00000198947"
      },
      {
        "identifier": "HGNC:2928",
        "label": "DMD"
      },
      {
        "identifier": "OMIM:300377"
      },
      {
        "identifier": "UMLS:C1414083",
        "label": "DMD gene"
      },
      {
        "identifier": "UniProtKB:A0A087WV90",
        "label": "A0A087WV90_HUMAN Dystrophin (trembl)",
        "taxa": [
          "NCBITaxon:9606"
        ]
      },
      {
        "identifier": "UniProtKB:A0A0S2Z3B5",
        "label": "A0A0S2Z3B5_HUMAN Dystrophin isoform 2 (trembl)",
        "taxa": [
          "NCBITaxon:9606"
        ]
      },
      {
        "identifier": "UniProtKB:A0A0S2Z3J7",
        "label": "A0A0S2Z3J7_HUMAN 

In [23]:
# A query without normalization won't work.
nameres_synonyms_url = "https://name-resolution-sri.renci.org/synonyms"
response = requests.get(nameres_synonyms_url, params={"preferred_curies": curie})
result = response.json()
print(json.dumps(result, indent=2))

{
  "UniProtKB:A0A0S2Z3B5": {}
}


In [27]:
# A query with normalization will work.
nameres_synonyms_url = "https://name-resolution-sri.renci.org/synonyms"
response = requests.get(nameres_synonyms_url, params={"preferred_curies": normalized_curie})
result = response.json()
print(json.dumps(result, indent=2))

{
  "NCBIGene:1756": {
    "curie": "NCBIGene:1756",
    "preferred_name": "DMD",
    "names": [
      "BMD",
      "DMD",
      "MRX85",
      "CMD3B",
      "DXS164",
      "DXS270",
      "DXS142",
      "DXS268",
      "DXS206",
      "DXS272",
      "DXS269",
      "DXS239",
      "DXS230",
      "DMD Gene",
      "DMD gene",
      "DYSTROPHIN",
      "dystrophin",
      "APO-DYSTROPHIN 1",
      "mutant dystrophin",
      "mental retardation, X-linked 85",
      "muscular dystrophy, Duchenne and Becker types",
      "Dystrophin (Muscular Dystrophy, Duchenne And Becker Types) Gene",
      "dystrophin (muscular dystrophy, Duchenne and Becker types), includes DXS142, DXS164, DXS206, DXS230, DXS239, DXS268, DXS269, DXS270, DXS272",
      "A0A087WV90_HUMAN Dystrophin (trembl)",
      "A0A0S2Z3B5_HUMAN Dystrophin isoform 2 (trembl)",
      "A0A0S2Z3J7_HUMAN Dystrophin isoform 1 (Fragment) (trembl)",
      "A0A5H1ZRP9_HUMAN Dystrophin (trembl)",
      "A0A5H1ZRQ1_HUMAN Dystrophin (tremb