In [12]:
# Parameter inputs
trapi_submit_url = "http://automat-u24.apps.renci.org/robokopkg/1.3/query"
automat_cypher_submit_url = 'https://automat.renci.org/robokopkg/cypher'
CURIE_buprenorphine_PubChem = "PUBCHEM.COMPOUND:644073"
CURIE_tremor_HP = "HP:0001337"

In [2]:
import requests
import json

This document provides a high-level overview of the different ways to access ROBOKOP information programmatically. The first two examples rely on the Translator Reasoner API (TRAPI) format for both query submission and the return of results. The first TRAPI example is submitted directly to the Automat system, which hosts the ROBOKOP knowledge graph. This query is submitted without preprocessing, and the results are returned without postprocessing. An alternative is the Aragorn interface, which accepts a TRAPI query and returns a TRAPI response. Aragorn expands the query to include synonymous concepts and postprocesses the results to score them for potential relevance. The final two examples illustrate how to query the ROBOKOP knowledge graph directly using Cypher, the Neo4j query language. This includes both a direct query against the standalone instance of the ROBOKOP knowledge graph and a query against the version hosted via the Automat system. The latter instance is the knowledge source for all TRAPI interfaces.
More information about accessing the ROBOKOP KG via TRAPI is available in `HelloRobokop_TRAPI`, and the Cypher options are covered in much more detail in `HelloRobokop_Cypher`.

## TRAPI

The first example uses the TRAPI format to query the ROBOKOP instance hosted on the Automat system.

The TRAPI Documentation is available here: https://github.com/NCATSTranslator/ReasonerAPI

Most TRAPI documents contain a `message` key.  Within that `message` are a `query_graph` denoting the user query,
a `knowledge_graph` consisting of the union of all nodes and edges that match the `query_graph` pattern, and a list of `results` that bind `query_graph` elements to `knowledge_graph` elements.

When a user submits a query, the message contains only the `query_graph`.  The query graph below consists of 3 nodes connected in a line.  Two of the nodes (`n00` and `n02`) have specified identifiers, while the middle node (`n01`) does not.  Instead, the middle node has a list of acceptable `categories`.  The node and edge keys are stored in ordered lists so that the output strings at the end of this Notebook are assembled in the correct order. This is not required for running the queries or retrieving results.

This query asks: "Find me a Biological Process or Activity, a Gene, or a Pathway that is related to both `PUBCHEM.COMPOUND:644073` (Buprenorphine) and `HP:0001337` (Tremor)."

In [9]:
# Ordered key lists; reused later to print each result in path order.
edges = ["e00", "e01"]
nodes = ["n00", "n01", "n02"]
query = {
    "message": {
        "query_graph": {
            "edges": {
                edges[0]: {
                    "subject": nodes[0],
                    "object": nodes[1],
                    "predicates": ["biolink:related_to"]
                },
                edges[1]: {
                    "subject": nodes[1],
                    "object": nodes[2],
                    "predicates": ["biolink:related_to"]
                }
            },
            "nodes": {
                nodes[0]: {
                    "ids": [CURIE_buprenorphine_PubChem],
                    "categories": ["biolink:ChemicalEntity"]
                },
                nodes[1]: {
                    "categories": ["biolink:BiologicalProcessOrActivity", "biolink:Gene", "biolink:Pathway"]
                },
                nodes[2]: {
                    "ids": [CURIE_tremor_HP],
                    "categories": ["biolink:DiseaseOrPhenotypicFeature"]
                }
            }
        }
    }
}
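
Because every edge in a query graph refers to its endpoints by node key, a mistyped key is an easy mistake to make. The sketch below shows one way to check this before submitting. `check_query_graph` and `example_graph` are hypothetical names introduced only for illustration; they are not part of TRAPI or of any ROBOKOP tooling, and the check covers only edge endpoints, not full TRAPI validation.

```python
def check_query_graph(query_graph):
    """Return a list of problems: edges whose subject/object is not a defined node key."""
    problems = []
    node_keys = set(query_graph["nodes"])
    for edge_key, edge in query_graph["edges"].items():
        for role in ("subject", "object"):
            if edge.get(role) not in node_keys:
                problems.append(
                    f"{edge_key}: {role} {edge.get(role)!r} is not a defined node"
                )
    return problems


# Compact copy of the three-node line described above (illustration only).
example_graph = {
    "nodes": {
        "n00": {"ids": ["PUBCHEM.COMPOUND:644073"], "categories": ["biolink:ChemicalEntity"]},
        "n01": {"categories": ["biolink:Gene"]},
        "n02": {"ids": ["HP:0001337"], "categories": ["biolink:DiseaseOrPhenotypicFeature"]},
    },
    "edges": {
        "e00": {"subject": "n00", "object": "n01", "predicates": ["biolink:related_to"]},
        "e01": {"subject": "n01", "object": "n02", "predicates": ["biolink:related_to"]},
    },
}

print(check_query_graph(example_graph))  # an empty list means no dangling endpoints
```

With the notebook's own variables, the same check would be called as `check_query_graph(query["message"]["query_graph"])`.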


This query can be sent to various components of Translator as needed.  It can be sent directly to the ROBOKOP knowledge graph hosted in the Automat system like this:

In [13]:
response = requests.post(trapi_submit_url,json=query)

In [14]:
print(response.status_code)

200


In [15]:
print(len(response.json()['message']['results']))

7


The response in JSON form is a Python dictionary with three main keys: `message`, `log_level`, and `workflow`.  The `message` component contains the `query_graph` from the input query, plus the `knowledge_graph` and `results` that ROBOKOP has added, which in combination contain the answer to the query graph. Although the next few sections keep querying the full response to reinforce its structure, storing the three components in separate variables makes later queries easier to read.
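
As a minimal sketch of that structure, the block below pulls the three `message` components into separate variables and follows one binding from a result into the knowledge graph. It runs on a small mock dictionary rather than the live response: `mock_response_json` and every identifier and name inside it are placeholders, and the binding layout simply mirrors the access pattern used elsewhere in this Notebook.

```python
# Mock response with the same top-level keys as the real one; all values are placeholders.
mock_response_json = {
    "message": {
        "query_graph": {"nodes": {"n00": {}, "n01": {}}, "edges": {"e00": {}}},
        "knowledge_graph": {
            "nodes": {
                "EXAMPLE:1": {"name": "example chemical"},
                "EXAMPLE:2": {"name": "example gene"},
            },
            "edges": {"kg_edge_1": {"predicate": "biolink:related_to"}},
        },
        "results": [
            {
                "node_bindings": {"n00": [{"id": "EXAMPLE:1"}], "n01": [{"id": "EXAMPLE:2"}]},
                "edge_bindings": {"e00": [{"id": "kg_edge_1"}]},
            }
        ],
    },
    "log_level": None,
    "workflow": None,
}

# One variable per component of the message.
message = mock_response_json["message"]
query_graph = message["query_graph"]
knowledge_graph = message["knowledge_graph"]
results = message["results"]

# A result binds a query-graph key (n01) to a knowledge-graph identifier,
# which is then looked up in the knowledge graph for its details.
bound_id = results[0]["node_bindings"]["n01"][0]["id"]
print(knowledge_graph["nodes"][bound_id]["name"])
```

With the live response, `mock_response_json` would be replaced by `response.json()`.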

Additional details about each of these components can be found in `HelloRobokop_TRAPI.ipynb`.

In [16]:
import pprint
pp = pprint.PrettyPrinter(indent=5)

In [17]:
print(response.json().keys())
print(response.json()['message'].keys())

dict_keys(['message', 'log_level', 'workflow'])
dict_keys(['query_graph', 'knowledge_graph', 'results'])


Next we will summarize all results to provide an overview of the different result graphs matching our query. Each node and edge has all the additional info shown above available for further inspection. Additional info can be found in `HelloRobokop_TRAPI.ipynb`.  Results are printed below.

In [18]:
# Parse the response once instead of re-parsing it for every lookup.
message = response.json()['message']
kg = message['knowledge_graph']
result_summaries = []
for r in message['results']:
    rs = ""
    for j, node_key in enumerate(nodes):
        # Name and identifier of the knowledge-graph node bound to this query node.
        node_id = r['node_bindings'][node_key][0]['id']
        node_name = kg['nodes'][node_id]['name']
        rs += f"{node_name} ({node_id})"
        if j < len(edges):
            # Predicate of the edge leading to the next node in the path.
            edge_id = r['edge_bindings'][edges[j]][0]['id']
            edge_name = kg['edges'][edge_id]['predicate']
            rs += f"--{edge_name}-->"
    result_summaries.append(rs)

In [19]:
for rs in result_summaries:
    print(rs)

Buprenorphine (PUBCHEM.COMPOUND:644073)--biolink:affects-->OPRM1 (NCBIGene:4988)--biolink:genetic_association-->Asterixis (HP:0012164)
Buprenorphine (PUBCHEM.COMPOUND:644073)--biolink:directly_physically_interacts_with-->CYP2D6 (NCBIGene:1565)--biolink:genetic_association-->Postural tremor (HP:0002174)
Buprenorphine (PUBCHEM.COMPOUND:644073)--biolink:directly_physically_interacts_with-->CYP2D6 (NCBIGene:1565)--biolink:genetic_association-->Resting tremor (HP:0002322)
Buprenorphine (PUBCHEM.COMPOUND:644073)--biolink:directly_physically_interacts_with-->CYP2D6 (NCBIGene:1565)--biolink:genetic_association-->Tremor (HP:0001337)
Buprenorphine (PUBCHEM.COMPOUND:644073)--biolink:directly_physically_interacts_with-->CYP2D6 (NCBIGene:1565)--biolink:genetic_association-->Action tremor (HP:0002345)
Buprenorphine (PUBCHEM.COMPOUND:644073)--biolink:directly_physically_interacts_with-->CYP2D6 (NCBIGene:1565)--biolink:genetic_association-->Pill-rolling tremor (HP:0025387)
Buprenorphine (PUBCHEM.COMPO

## Cypher

You can also bypass TRAPI entirely and use Cypher to talk to the graph directly.  There are two instances.  One is at http://robokopkg.renci.org, which provides a Cypher browser; you can also write Cypher and post it there. Posting Cypher there requires the `neo4j` package, which is likely not installed if you haven't accessed a Neo4j database before. The code below should work, but if you encounter errors, look into how best to install this package for your local setup.

Cypher queries can be posted either to the Neo4j browser at robokopkg.renci.org or through Automat at automat.renci.org (recommended).  Depending on how the Cypher query is structured, results may be returned differently between the two access points.  The query below asks for slightly different information than the TRAPI message above.  The TRAPI query asks for results related to `Buprenorphine` and `Tremor` that are of the type `Gene`, `Pathway`, or `BiologicalProcessOrActivity`.  Because no results were present for `Pathway` or `BiologicalProcessOrActivity`, a Cypher query including these would return 0 results, so the query below has been modified to ask for results related to `Buprenorphine` and `Tremor` that are of the type `Gene`.

In [21]:
# Plain triple-quoted string: the query has no placeholders, so no f-string is needed.
cypher = """MATCH (n0_0:`biolink:ChemicalEntity`)-[r0_0]-(n1_0:`biolink:Gene`)-[r1_0]-(n2_0:`biolink:DiseaseOrPhenotypicFeature`)
WHERE n0_0.name IN ['Buprenorphine'] AND n2_0.name IN ['Tremor']
RETURN [startNode(r0_0),[type(r0_0),properties(r0_0)],endNode(r0_0)] as edge_1,
[startNode(r1_0),[type(r1_0),properties(r1_0)],endNode(r1_0)] as edge_2,
[n0_0.name, n1_0.name, n2_0.name] as node_names LIMIT 100"""
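
The query above hard-codes the two names and the middle category. If you want to vary them, one possible approach is a small builder function, sketched below. `build_two_hop_cypher` is a hypothetical helper, not something provided by ROBOKOP or Automat; it returns a simplified summary column rather than the full edge properties, and it assembles the query by string formatting with basic escaping, which is an assumption about what is adequate for exploratory use.

```python
def build_two_hop_cypher(start_name, end_name, middle_category="biolink:Gene", limit=100):
    """Build a chemical -> middle node -> phenotype Cypher query as a string."""
    if "`" in middle_category:
        # A backtick would end the quoted label early, so refuse it outright.
        raise ValueError("middle_category must not contain a backtick")

    def quote(value):
        # Escape backslashes and single quotes for a Cypher string literal.
        return "'" + value.replace("\\", "\\\\").replace("'", "\\'") + "'"

    return (
        f"MATCH (n0:`biolink:ChemicalEntity`)-[r0]-(n1:`{middle_category}`)"
        f"-[r1]-(n2:`biolink:DiseaseOrPhenotypicFeature`) "
        f"WHERE n0.name = {quote(start_name)} AND n2.name = {quote(end_name)} "
        f"RETURN [n0.name, type(r0), n1.name, type(r1), n2.name] AS path_summary "
        f"LIMIT {int(limit)}"
    )


print(build_two_hop_cypher("Buprenorphine", "Tremor"))
```

The resulting string can be posted in the same way as the hand-written query, as the `query` value of the JSON payload.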

In [24]:
j = {'query': cypher}
results = requests.post(automat_cypher_submit_url,json=j)
print(results.status_code)
results_json = results.json()

200


In [25]:
print(results)

<Response [200]>


In [26]:
print(results.json())

{'results': [{'columns': ['edge_1', 'edge_2', 'node_names'], 'data': [{'row': [[{'CHEBI_ROLE_delta_opioid_agent': True, 'smiles': 'CO[C@]12CC[C@@]3(C[C@@H]1[C@](C)(O)C(C)(C)C)[C@H]1CC4=CC=C(O)C5=C4[C@@]3(CCN1CC1CC1)[C@H]2O5', 'description': 'A morphinane alkaloid that is 7,8-dihydromorphine 6-O-methyl ether in which positions 6 and 14 are joined by a -CH2CH2- bridge, one of the hydrogens of the N-methyl group is substituted by cyclopropyl, and a hydrogen at position 7 is substituted by a 2-hydroxy-3,3-dimethylbutan-2-yl group. It is highly effective for the treatment of opioid use disorder and is also increasingly being used in the treatment of chronic pain.', 'fda_labels': 74, 'rgb': 28, 'CHEBI_ROLE_analgesic': True, 'CHEBI_ROLE_opioid_agent': True, 'sp2_c': 0, 'sp3_c': 23, 'CHEBI_ROLE_antagonist': True, 'CHEBI_ROLE_agonist': True, 'CHEBI_ROLE_opioid_analgesic': True, 'cd_formula': 'C29H41NO4', 'alogs': -4.44, 'CHEBI_ROLE_opioid_receptor_agonist': True, 'id': 'PUBCHEM.COMPOUND:644073'
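
The raw output is hard to read because each row is a positional list whose order matches `columns`. A minimal sketch of one way to make it easier to work with is to pair each row with its column names. `rows_as_dicts` is a hypothetical helper, and `mock_cypher_json` is placeholder data that only imitates the shape visible in the output above (`results` → `columns` and `data` → `row`); its values are not real query results.

```python
def rows_as_dicts(cypher_json):
    """Pair each row's values with the column names of its result block."""
    records = []
    for block in cypher_json["results"]:
        columns = block["columns"]
        for entry in block["data"]:
            records.append(dict(zip(columns, entry["row"])))
    return records


# Placeholder data in the same shape as the printed response (not real results).
mock_cypher_json = {
    "results": [
        {
            "columns": ["edge_1", "edge_2", "node_names"],
            "data": [
                {
                    "row": [
                        [{"id": "EXAMPLE:1"}, ["biolink:example_predicate", {}], {"id": "EXAMPLE:2"}],
                        [{"id": "EXAMPLE:2"}, ["biolink:example_predicate", {}], {"id": "EXAMPLE:3"}],
                        ["example chemical", "example gene", "example phenotype"],
                    ]
                }
            ],
        }
    ]
}

for record in rows_as_dicts(mock_cypher_json):
    print(" - ".join(record["node_names"]))
```

With the live response, the call would be `rows_as_dicts(results_json)`.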

## Question Builder and ExEmPLAR

Two GUI-based tools are available in the form of the ROBOKOP Question Builder `https://robokop.renci.org/question-builder` and ExEmPLAR `https://www.exemplar.mml.unc.edu/`.  The ROBOKOP Question Builder allows users to build queries by specifying nodes and predicates with the aid of a graph showing what the full query looks like.  This returns a list of pathways with related terms, where each pathway can be expanded to show all individual edges.  The ExEmPLAR tool also allows users to build queries by specifying nodes and predicates, this time returning pathways with all individual edges and only the specific terms used in the query.

More details on these GUIs can be found in `HelloRobokop_Question_Builder.ipynb` for the ROBOKOP Question Builder and in `HelloRobokop_ExEmPLAR.ipynb` for ExEmPLAR.