# Introduction


This notebook demonstrates how BioThings Explorer can be used to execute queries with more than one intermediate node (here, two intermediate nodes, giving a "three-hop query").  The TRAPI query as originally posed was:

```
{
 "message": {
  "query_graph": {
    "edges": {
      "e0": {
        "predicate": "biolink:gene_associated_with_condition",
        "subject": "n0",
        "object": "n1"
      },
      "e1": {
        "predicate":     "biolink:gene_has_variant_that_contributes_to_disease_association",
        "subject": "n0",
        "object": "n2"
      },
      "e2": {
        "subject": "n2",
        "object": "n3",
        "predicate": "biolink:correlated_with"
      }
    },
    "nodes": {
      "n0": {
        "category": "biolink:Gene"
      },
      "n1": {
        "category": "biolink:Disease",
        "id": "MONDO:0007254"
      },
      "n2": {
        "category": "biolink:Disease"
      },
      "n3": {
        "category": "biolink:ChemicalSubstance"
      }
    }
  }
 }
}
```

The original query starts with **"breast cancer" (MONDO:0007254)**.  To limit the combinatorial explosion, I've replaced that with **"NGLY1-deficiency" (MONDO:0014109)** in the notebook below. The rest of the query path connects that initial disease to a `ChemicalSubstance` through two intermediate nodes, of type `Gene` and `Disease` respectively.
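
To make the substitution concrete, the sketch below rebuilds the query graph above as a Python dictionary, swaps the `n1` identifier, and walks the edges from `n1` to recover the order of node types along the path. The helper `type_path` is hypothetical (it is not part of BioThings Explorer or TRAPI), and it treats edges as undirected purely for the purpose of the walk.

```python
import copy

# Query graph copied from the TRAPI message above.
query_graph = {
    "edges": {
        "e0": {"predicate": "biolink:gene_associated_with_condition",
               "subject": "n0", "object": "n1"},
        "e1": {"predicate": "biolink:gene_has_variant_that_contributes_to_disease_association",
               "subject": "n0", "object": "n2"},
        "e2": {"predicate": "biolink:correlated_with",
               "subject": "n2", "object": "n3"},
    },
    "nodes": {
        "n0": {"category": "biolink:Gene"},
        "n1": {"category": "biolink:Disease", "id": "MONDO:0007254"},
        "n2": {"category": "biolink:Disease"},
        "n3": {"category": "biolink:ChemicalSubstance"},
    },
}


def type_path(graph, start):
    """Walk a linear query graph from `start` and return node categories in order.

    Hypothetical helper: follows each edge once, ignoring its direction.
    """
    path = [graph["nodes"][start]["category"]]
    current, used = start, set()
    while True:
        step = None
        for edge_id, edge in graph["edges"].items():
            if edge_id in used:
                continue
            if edge["subject"] == current:
                step = (edge_id, edge["object"])
            elif edge["object"] == current:
                step = (edge_id, edge["subject"])
            if step:
                break
        if step is None:
            return path
        used.add(step[0])
        current = step[1]
        path.append(graph["nodes"][current]["category"])


# Replace breast cancer with NGLY1-deficiency without touching the original.
narrowed = copy.deepcopy(query_graph)
narrowed["nodes"]["n1"]["id"] = "MONDO:0014109"

print(type_path(narrowed, "n1"))
```

The two middle entries of the printed list, `biolink:Gene` and `biolink:Disease`, correspond to the intermediate node types passed to `FindConnection` in Step 2.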


**Background**: BioThings Explorer can answer two classes of queries -- "EXPLAIN" and "PREDICT".  EXPLAIN queries are described in [EXPLAIN_demo.ipynb](https://github.com/biothings/biothings_explorer/blob/master/jupyter%20notebooks/EXPLAIN_demo.ipynb), and PREDICT queries are described in [PREDICT_demo.ipynb](https://github.com/biothings/biothings_explorer/blob/master/jupyter%20notebooks/PREDICT_demo.ipynb). Here, we describe PREDICT queries and how to use BioThings Explorer to execute them.  A more detailed overview of the BioThings Explorer systems is provided in [these slides](https://docs.google.com/presentation/d/1QWQqqQhPD_pzKryh6Wijm4YQswv8pAjleVORCPyJyDE/edit?usp=sharing).

## Step 0: Load BioThings Explorer modules

Install the `biothings_explorer` and `biothings_schema` packages, as described in this [README](https://github.com/biothings/biothings_explorer/blob/master/jupyter%20notebooks/README.md#prerequisite).  This only needs to be done once (it is included here for compatibility with [colab](https://colab.research.google.com/)).

In [1]:
!pip install git+https://github.com/biothings/biothings_explorer#egg=biothings_explorer





Next, import the relevant modules:

* **Hint**: Find the corresponding bio-entity representation used in BioThings Explorer based on user input (which can be any database ID, symbol, or name)
* **FindConnection**: Find intermediate bio-entities that connect the user-specified input and output

In [2]:
from biothings_explorer.hint import Hint
from biothings_explorer.user_query_dispatcher import FindConnection
import nest_asyncio
nest_asyncio.apply()

## Step 1: Find representation of the input disease in BTE

In this step, BioThings Explorer translates our query string into a BioThings object, which contains mappings to many common identifiers.  Generally, the top result returned by the `Hint` module will be the correct item, but you should confirm that using the identifiers shown.

Search terms can correspond to any child of [BiologicalEntity](https://biolink.github.io/biolink-model/docs/BiologicalEntity.html) from the [Biolink Model](https://biolink.github.io/biolink-model/docs/), including `DiseaseOrPhenotypicFeature` (e.g., "lupus"), `ChemicalSubstance` (e.g., "acetaminophen"), `Gene` (e.g., "CDK2"), `BiologicalProcess` (e.g., "T cell differentiation"), and `Pathway` (e.g., "Citric acid cycle").

In [3]:
ht = Hint()
input_disease = ht.query("MONDO:0014109")["Disease"][0]

input_disease

{'MONDO': 'MONDO:0014109',
 'DOID': 'DOID:0060728',
 'UMLS': 'C3808991',
 'name': 'NGLY1-deficiency',
 'OMIM': '615273',
 'ORPHANET': '404454',
 'primary': {'identifier': 'MONDO',
  'cls': 'Disease',
  'value': 'MONDO:0014109'},
 'display': 'MONDO(MONDO:0014109) DOID(DOID:0060728) OMIM(615273) ORPHANET(404454) UMLS(C3808991) name(NGLY1-deficiency)',
 'type': 'Disease'}
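
The advice above to confirm the top hit can be sketched as a comparison against identifiers you already trust. The example below works on a copy of the record shown above, so it runs without calling `Hint`; `confirm_hint` is a hypothetical helper, not part of `biothings_explorer`.

```python
# Subset of the record returned by Hint above.
input_disease = {
    "MONDO": "MONDO:0014109",
    "DOID": "DOID:0060728",
    "UMLS": "C3808991",
    "name": "NGLY1-deficiency",
    "OMIM": "615273",
    "ORPHANET": "404454",
    "primary": {"identifier": "MONDO", "cls": "Disease", "value": "MONDO:0014109"},
    "type": "Disease",
}


def confirm_hint(record, expected_type, **expected_ids):
    """Return a list of mismatches between a Hint record and expected values.

    Hypothetical helper: an empty list means every check passed.
    """
    problems = []
    if record.get("type") != expected_type:
        problems.append(f"type is {record.get('type')!r}, expected {expected_type!r}")
    for key, value in expected_ids.items():
        if record.get(key) != value:
            problems.append(f"{key} is {record.get(key)!r}, expected {value!r}")
    return problems


# Cross-check against identifiers known independently of the MONDO ID.
print(confirm_hint(input_disease, "Disease", OMIM="615273", UMLS="C3808991"))
```

An empty list indicates that the record matches; any mismatch is reported as a short message rather than raised as an error, which keeps the check easy to use interactively.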

## Step 2: Find `ChemicalSubstance`s that are associated with the input disease through `Gene` and `Disease` as intermediate nodes

In this section, we find all paths in the knowledge graph that connect the input disease to any `ChemicalSubstance`, passing first through a `Gene` and then through a `Disease`.  To do that, we will use `FindConnection`.  This class is a convenient wrapper around two advanced functions for **query path planning** and **query path execution**. More advanced features for both query path planning and query path execution are in development and will be documented in the coming months.

In [4]:
fc = FindConnection(input_obj=input_disease, 
                    output_obj='ChemicalSubstance', 
                    intermediate_nodes=['Gene', 'Disease'])

In [5]:
fc.connect(verbose=True)


BTE will find paths that join 'NGLY1-deficiency' and 'ChemicalSubstance'. Paths will have 2 intermediate node.

Intermediate node #1 will have these type constraints: Gene

Intermediate node #2 will have these type constraints: Disease




==== Step #1: Query path planning ====

Because NGLY1-deficiency is of type 'Disease', BTE will query our meta-KG for APIs that can take 'Disease' as input and 'Gene' as output

BTE found 10 apis:

API 1. hetio(1 API call)
API 2. biolink(1 API call)
API 3. mydisease(1 API call)
API 4. semmed_disease(15 API calls)
API 5. pharos(1 API call)
API 6. cord_disease(1 API call)
API 7. mgi_gene2phenotype(1 API call)
API 8. scigraph(1 API call)
API 9. scibite(1 API call)
API 10. DISEASES(1 API call)


==== Step #2: Query path execution ====
NOTE: API requests are dispatched in parallel, so the list of APIs below is ordered by query time.

API 3.2: https://biothings.ncats.io/semmed/query?fields=coexists_with (POST -d q=C3808991&scopes=umls)
API 3.9: https://bi

API 1.4: https://automat.renci.org/hetio/gene/disease/NCBIGene:11124
API 7.1: https://mydisease.info/v1/query?fields=disgenet.xrefs.umls&size=250 (POST -d q=26232,4779,84301,11124,64772,26270,5236,55768,29926,199857,5373&scopes=disgenet.genes_related_to_disease.gene_id)


==== Step #3: Output normalization ====

API 3.1 semmed_gene: No hits
API 3.2 semmed_gene: 2 hits
API 2.1 biolink: No hits
API 2.2 biolink: No hits
API 2.3 biolink: 1 hits
API 2.4 biolink: 8 hits
API 2.5 biolink: No hits
API 2.6 biolink: No hits
API 2.7 biolink: 4 hits
API 2.8 biolink: 1 hits
API 2.9 biolink: 2 hits
API 2.10 biolink: 6 hits
API 2.11 biolink: 4 hits
API 6.1 scibite: No hits
API 6.2 scibite: No hits
API 6.3 scibite: No hits
API 6.4 scibite: No hits
API 6.5 scibite: No hits
API 6.6 scibite: No hits
API 6.7 scibite: No hits
API 6.8 scibite: No hits
API 6.9 scibite: No hits
API 6.10 scibite: No hits
API 6.11 scibite: No hits
API 3.3 semmed_gene: No hits
API 3.4 semmed_gene: No hits
API 5.1 DISEASES: 159 hi

API 2.94: https://automat.renci.org/hmdb/disease/chemical_substance/MONDO:0001149
API 2.93: https://automat.renci.org/hmdb/disease/chemical_substance/MONDO:0016785
API 2.95: https://automat.renci.org/hmdb/disease/chemical_substance/MONDO:0045010
API 2.92: https://automat.renci.org/hmdb/disease/chemical_substance/MONDO:0007719
API 2.99: https://automat.renci.org/hmdb/disease/chemical_substance/MONDO:0019297
API 2.100: https://automat.renci.org/hmdb/disease/chemical_substance/MONDO:0004976
API 2.98: https://automat.renci.org/hmdb/disease/chemical_substance/MONDO:0012123
API 2.22: https://automat.renci.org/hmdb/disease/chemical_substance/MONDO:0005395
API 2.107: https://automat.renci.org/hmdb/disease/chemical_substance/MONDO:0002017
API 2.13: https://automat.renci.org/hmdb/disease/chemical_substance/MONDO:0008100
API 2.109: https://automat.renci.org/hmdb/disease/chemical_substance/MONDO:0005488
API 2.44: https://automat.renci.org/hmdb/disease/chemical_substance/MONDO:0014109
API 2.36: htt

API 2.196: https://automat.renci.org/hmdb/disease/chemical_substance/MONDO:0002908
API 2.197: https://automat.renci.org/hmdb/disease/chemical_substance/MONDO:0014789
API 2.204: https://automat.renci.org/hmdb/disease/chemical_substance/MONDO:0017896
API 2.198: https://automat.renci.org/hmdb/disease/chemical_substance/MONDO:0044113
API 2.141: https://automat.renci.org/hmdb/disease/chemical_substance/MONDO:0005336
API 2.212: https://automat.renci.org/hmdb/disease/chemical_substance/MONDO:0011257
API 2.143: https://automat.renci.org/hmdb/disease/chemical_substance/MONDO:0014912
API 2.165: https://automat.renci.org/hmdb/disease/chemical_substance/MONDO:0000840
API 2.163: https://automat.renci.org/hmdb/disease/chemical_substance/MONDO:0007079
API 2.166: https://automat.renci.org/hmdb/disease/chemical_substance/MONDO:0004577
API 2.164: https://automat.renci.org/hmdb/disease/chemical_substance/MONDO:0000468
API 2.145: https://automat.renci.org/hmdb/disease/chemical_substance/MONDO:0006955
API 

API 2.232: https://automat.renci.org/hmdb/disease/chemical_substance/MONDO:0005044
API 2.231: https://automat.renci.org/hmdb/disease/chemical_substance/MONDO:0005438
API 2.235: https://automat.renci.org/hmdb/disease/chemical_substance/MONDO:0021199
API 2.237: https://automat.renci.org/hmdb/disease/chemical_substance/MONDO:0005371
API 2.239: https://automat.renci.org/hmdb/disease/chemical_substance/MONDO:0004995
API 2.238: https://automat.renci.org/hmdb/disease/chemical_substance/MONDO:0001673
API 2.241: https://automat.renci.org/hmdb/disease/chemical_substance/MONDO:0004975
API 2.242: https://automat.renci.org/hmdb/disease/chemical_substance/MONDO:0009561
API 2.240: https://automat.renci.org/hmdb/disease/chemical_substance/MONDO:0004989
API 8.16: https://automat.renci.org/cord19-scigraph/disease/chemical_substance/MONDO:0005156
API 8.11: https://automat.renci.org/cord19-scigraph/disease/chemical_substance/MONDO:0015742
API 8.14: https://automat.renci.org/cord19-scigraph/disease/chemica

API 8.156: https://automat.renci.org/cord19-scigraph/disease/chemical_substance/MONDO:0004907
API 8.158: https://automat.renci.org/cord19-scigraph/disease/chemical_substance/MONDO:0006504
API 8.159: https://automat.renci.org/cord19-scigraph/disease/chemical_substance/MONDO:0004781
API 8.160: https://automat.renci.org/cord19-scigraph/disease/chemical_substance/MONDO:0005299
API 8.157: https://automat.renci.org/cord19-scigraph/disease/chemical_substance/MONDO:0007402
API 8.155: https://automat.renci.org/cord19-scigraph/disease/chemical_substance/MONDO:0008907
API 8.162: https://automat.renci.org/cord19-scigraph/disease/chemical_substance/MONDO:0005490
API 8.161: https://automat.renci.org/cord19-scigraph/disease/chemical_substance/MONDO:0005105
API 8.92: https://automat.renci.org/cord19-scigraph/disease/chemical_substance/MONDO:0007719
API 8.89: https://automat.renci.org/cord19-scigraph/disease/chemical_substance/MONDO:0016064
API 8.95: https://automat.renci.org/cord19-scigraph/disease/ch

API 8.237: https://automat.renci.org/cord19-scigraph/disease/chemical_substance/MONDO:0005371
API 8.261: https://automat.renci.org/cord19-scigraph/disease/chemical_substance/MONDO:0002367
API 8.241: https://automat.renci.org/cord19-scigraph/disease/chemical_substance/MONDO:0004975
API 8.255: https://automat.renci.org/cord19-scigraph/disease/chemical_substance/MONDO:0009310
API 8.273: https://automat.renci.org/cord19-scigraph/disease/chemical_substance/MONDO:0020745
API 8.263: https://automat.renci.org/cord19-scigraph/disease/chemical_substance/MONDO:0009938
API 8.271: https://automat.renci.org/cord19-scigraph/disease/chemical_substance/MONDO:0015943
API 8.272: https://automat.renci.org/cord19-scigraph/disease/chemical_substance/MONDO:0009279
API 8.264: https://automat.renci.org/cord19-scigraph/disease/chemical_substance/MONDO:0005335
API 8.276: https://automat.renci.org/cord19-scigraph/disease/chemical_substance/MONDO:0010998
API 6.11: https://automat.renci.org/cord19-scibite/disease/c

API 6.146: https://automat.renci.org/cord19-scibite/disease/chemical_substance/MONDO:0033619
API 6.161: https://automat.renci.org/cord19-scibite/disease/chemical_substance/MONDO:0005105
API 6.159: https://automat.renci.org/cord19-scibite/disease/chemical_substance/MONDO:0004781
API 6.154: https://automat.renci.org/cord19-scibite/disease/chemical_substance/MONDO:0043765
API 6.141: https://automat.renci.org/cord19-scibite/disease/chemical_substance/MONDO:0005336
API 6.134: https://automat.renci.org/cord19-scibite/disease/chemical_substance/MONDO:0005560
API 6.140: https://automat.renci.org/cord19-scibite/disease/chemical_substance/MONDO:0020671
API 6.138: https://automat.renci.org/cord19-scibite/disease/chemical_substance/MONDO:0021569
API 6.133: https://automat.renci.org/cord19-scibite/disease/chemical_substance/MONDO:0001286
API 6.17: https://automat.renci.org/cord19-scibite/disease/chemical_substance/MONDO:0000429
API 6.16: https://automat.renci.org/cord19-scibite/disease/chemical_sub

API 6.193: https://automat.renci.org/cord19-scibite/disease/chemical_substance/MONDO:0005406
API 6.200: https://automat.renci.org/cord19-scibite/disease/chemical_substance/MONDO:0006527
API 6.261: https://automat.renci.org/cord19-scibite/disease/chemical_substance/MONDO:0002367
API 6.263: https://automat.renci.org/cord19-scibite/disease/chemical_substance/MONDO:0009938
API 6.262: https://automat.renci.org/cord19-scibite/disease/chemical_substance/MONDO:0000001
API 3.1: https://automat.renci.org/pharos/disease/chemical_substance/MONDO:0002427
API 6.276: https://automat.renci.org/cord19-scibite/disease/chemical_substance/MONDO:0010998
API 3.3: https://automat.renci.org/pharos/disease/chemical_substance/MONDO:0020358
API 3.2: https://automat.renci.org/pharos/disease/chemical_substance/MONDO:0011801
API 6.189: https://automat.renci.org/cord19-scibite/disease/chemical_substance/MONDO:0015286
API 6.190: https://automat.renci.org/cord19-scibite/disease/chemical_substance/MONDO:0009182
API 3.4

API 3.35: https://automat.renci.org/pharos/disease/chemical_substance/MONDO:0005293
API 3.36: https://automat.renci.org/pharos/disease/chemical_substance/MONDO:0017117
API 3.47: https://automat.renci.org/pharos/disease/chemical_substance/MONDO:0005392
API 3.42: https://automat.renci.org/pharos/disease/chemical_substance/MONDO:0004946
API 3.46: https://automat.renci.org/pharos/disease/chemical_substance/MONDO:0006497
API 3.44: https://automat.renci.org/pharos/disease/chemical_substance/MONDO:0014109
API 3.45: https://automat.renci.org/pharos/disease/chemical_substance/MONDO:0001280
API 3.40: https://automat.renci.org/pharos/disease/chemical_substance/MONDO:0009309
API 3.74: https://automat.renci.org/pharos/disease/chemical_substance/MONDO:0005618
API 3.79: https://automat.renci.org/pharos/disease/chemical_substance/MONDO:0005132
API 3.69: https://automat.renci.org/pharos/disease/chemical_substance/MONDO:0008939
API 3.77: https://automat.renci.org/pharos/disease/chemical_substance/MONDO:

API 3.200: https://automat.renci.org/pharos/disease/chemical_substance/MONDO:0006527
API 3.198: https://automat.renci.org/pharos/disease/chemical_substance/MONDO:0044113
API 3.189: https://automat.renci.org/pharos/disease/chemical_substance/MONDO:0015286
API 3.192: https://automat.renci.org/pharos/disease/chemical_substance/MONDO:0001384
API 3.188: https://automat.renci.org/pharos/disease/chemical_substance/MONDO:0018874
API 3.190: https://automat.renci.org/pharos/disease/chemical_substance/MONDO:0009182
API 3.193: https://automat.renci.org/pharos/disease/chemical_substance/MONDO:0005406
API 3.191: https://automat.renci.org/pharos/disease/chemical_substance/MONDO:0003847
API 3.179: https://automat.renci.org/pharos/disease/chemical_substance/MONDO:0005904
API 3.242: https://automat.renci.org/pharos/disease/chemical_substance/MONDO:0009561
API 3.245: https://automat.renci.org/pharos/disease/chemical_substance/MONDO:0008383
API 3.207: https://automat.renci.org/pharos/disease/chemical_subs

API 1.4: https://biothings.ncats.io/semmed/query?fields=prevented_by (POST -d q=C0015376,C0857836,C4317224,C0752324,C4021085,C0699748,C0033377,C0033687,C0917816,C0424230,C0242354,C0234174,C0025362,C0020534,C0332482,C0023493,C3150191,C0011847,C0151603,C3808991,C0014070,C0086439,C0005940,CN227216,C0586553,C0240671,C0024620,C0857379,C0282313,C0029489,C0151699,C0344505,C0239234,C0017919,C0015544,C0007222,C0026848,C0155502,C0010823,C0282577,C0427065,C0017154,C0221356,C0014848,C0006111,C0346153,C4280709,C1518922,C0007786,C1840077,C1865145,C0022658,C0007682,C0038325,C0238463,CN881103,C4551715,C1281901,C0009946,C0234966,C4022605,C0011168,C0239067,C1956391,C0010964,C0751884,C1836542,C0025363,C0151786,C0010038,C4073157,C0020456,C0239981,C0027726,C0338484,C0553681,C0027819,C3714756,C1257958,C1839603,C1854113,C0027627,C4551563,C0018802,C0021655,C0024236,C0007134,C0151526,C4551714,C1837098,CN236628,C1837260,C4476792,C0015967,C0085740,C0003873,C0037763,CN205405,C2752015,C0035335,C0026857,C0242656,C0

API 1.10: https://biothings.ncats.io/semmed/query?fields=affects (POST -d q=C0015376,C0857836,C4317224,C0752324,C4021085,C0699748,C0033377,C0033687,C0917816,C0424230,C0242354,C0234174,C0025362,C0020534,C0332482,C0023493,C3150191,C0011847,C0151603,C3808991,C0014070,C0086439,C0005940,CN227216,C0586553,C0240671,C0024620,C0857379,C0282313,C0029489,C0151699,C0344505,C0239234,C0017919,C0015544,C0007222,C0026848,C0155502,C0010823,C0282577,C0427065,C0017154,C0221356,C0014848,C0006111,C0346153,C4280709,C1518922,C0007786,C1840077,C1865145,C0022658,C0007682,C0038325,C0238463,CN881103,C4551715,C1281901,C0009946,C0234966,C4022605,C0011168,C0239067,C1956391,C0010964,C0751884,C1836542,C0025363,C0151786,C0010038,C4073157,C0020456,C0239981,C0027726,C0338484,C0553681,C0027819,C3714756,C1257958,C1839603,C1854113,C0027627,C4551563,C0018802,C0021655,C0024236,C0007134,C0151526,C4551714,C1837098,CN236628,C1837260,C4476792,C0015967,C0085740,C0003873,C0037763,CN205405,C2752015,C0035335,C0026857,C0242656,C07518

API 1.12: https://biothings.ncats.io/semmed/query?fields=affected_by (POST -d q=C0015376,C0857836,C4317224,C0752324,C4021085,C0699748,C0033377,C0033687,C0917816,C0424230,C0242354,C0234174,C0025362,C0020534,C0332482,C0023493,C3150191,C0011847,C0151603,C3808991,C0014070,C0086439,C0005940,CN227216,C0586553,C0240671,C0024620,C0857379,C0282313,C0029489,C0151699,C0344505,C0239234,C0017919,C0015544,C0007222,C0026848,C0155502,C0010823,C0282577,C0427065,C0017154,C0221356,C0014848,C0006111,C0346153,C4280709,C1518922,C0007786,C1840077,C1865145,C0022658,C0007682,C0038325,C0238463,CN881103,C4551715,C1281901,C0009946,C0234966,C4022605,C0011168,C0239067,C1956391,C0010964,C0751884,C1836542,C0025363,C0151786,C0010038,C4073157,C0020456,C0239981,C0027726,C0338484,C0553681,C0027819,C3714756,C1257958,C1839603,C1854113,C0027627,C4551563,C0018802,C0021655,C0024236,C0007134,C0151526,C4551714,C1837098,CN236628,C1837260,C4476792,C0015967,C0085740,C0003873,C0037763,CN205405,C2752015,C0035335,C0026857,C0242656,C0

API 1.9: https://biothings.ncats.io/semmed/query?fields=related_to (POST -d q=C0015376,C0857836,C4317224,C0752324,C4021085,C0699748,C0033377,C0033687,C0917816,C0424230,C0242354,C0234174,C0025362,C0020534,C0332482,C0023493,C3150191,C0011847,C0151603,C3808991,C0014070,C0086439,C0005940,CN227216,C0586553,C0240671,C0024620,C0857379,C0282313,C0029489,C0151699,C0344505,C0239234,C0017919,C0015544,C0007222,C0026848,C0155502,C0010823,C0282577,C0427065,C0017154,C0221356,C0014848,C0006111,C0346153,C4280709,C1518922,C0007786,C1840077,C1865145,C0022658,C0007682,C0038325,C0238463,CN881103,C4551715,C1281901,C0009946,C0234966,C4022605,C0011168,C0239067,C1956391,C0010964,C0751884,C1836542,C0025363,C0151786,C0010038,C4073157,C0020456,C0239981,C0027726,C0338484,C0553681,C0027819,C3714756,C1257958,C1839603,C1854113,C0027627,C4551563,C0018802,C0021655,C0024236,C0007134,C0151526,C4551714,C1837098,CN236628,C1837260,C4476792,C0015967,C0085740,C0003873,C0037763,CN205405,C2752015,C0035335,C0026857,C0242656,C075



==== Step #3: Output normalization ====

API 4.1 mychem: 3906 hits
API 2.1 hmdb: No hits
API 2.2 hmdb: No hits
API 2.3 hmdb: No hits
API 2.4 hmdb: No hits
API 2.5 hmdb: No hits
API 2.6 hmdb: No hits
API 2.7 hmdb: No hits
API 2.8 hmdb: No hits
API 2.9 hmdb: No hits
API 2.10 hmdb: No hits
API 2.11 hmdb: No hits
API 2.12 hmdb: No hits
API 2.13 hmdb: No hits
API 2.14 hmdb: No hits
API 2.15 hmdb: No hits
API 2.16 hmdb: No hits
API 2.17 hmdb: No hits
API 2.18 hmdb: No hits
API 2.19 hmdb: No hits
API 2.20 hmdb: No hits
API 2.21 hmdb: No hits
API 2.22 hmdb: No hits
API 2.23 hmdb: No hits
API 2.24 hmdb: No hits
API 2.25 hmdb: No hits
API 2.26 hmdb: No hits
API 2.27 hmdb: No hits
API 2.28 hmdb: No hits
API 2.29 hmdb: No hits
API 2.30 hmdb: No hits
API 2.31 hmdb: No hits
API 2.32 hmdb: No hits
API 2.33 hmdb: No hits
API 2.34 hmdb: No hits
API 2.35 hmdb: No hits
API 2.36 hmdb: No hits
API 2.37 hmdb: No hits
API 2.38 hmdb: No hits
API 2.39 hmdb: No hits
API 2.40 hmdb: No hits
API 2.41 hmdb: No hi

API 1.4 semmed_disease: 12581 hits
API 1.5 semmed_disease: No hits
API 6.1 scibite: No hits
API 6.2 scibite: No hits
API 6.3 scibite: No hits
API 6.4 scibite: No hits
API 6.5 scibite: No hits
API 6.6 scibite: No hits
API 6.7 scibite: No hits
API 6.8 scibite: No hits
API 6.9 scibite: No hits
API 6.10 scibite: No hits
API 6.11 scibite: No hits
API 6.12 scibite: No hits
API 6.13 scibite: No hits
API 6.14 scibite: No hits
API 6.15 scibite: No hits
API 6.16 scibite: No hits
API 6.17 scibite: No hits
API 6.18 scibite: No hits
API 6.19 scibite: No hits
API 6.20 scibite: No hits
API 6.21 scibite: No hits
API 6.22 scibite: No hits
API 6.23 scibite: No hits
API 6.24 scibite: No hits
API 6.25 scibite: No hits
API 6.26 scibite: No hits
API 6.27 scibite: No hits
API 6.28 scibite: No hits
API 6.29 scibite: No hits
API 6.30 scibite: No hits
API 6.31 scibite: No hits
API 6.32 scibite: No hits
API 6.33 scibite: No hits
API 6.34 scibite: No hits
API 6.35 scibite: No hits
API 6.36 scibite: No hits
API 6.

API 1.8 semmed_disease: 52255 hits
API 1.9 semmed_disease: 23763 hits
API 1.10 semmed_disease: 9 hits
API 1.11 semmed_disease: No hits
API 1.12 semmed_disease: 26020 hits
API 1.13 semmed_disease: No hits
API 1.14 semmed_disease: 1042 hits
API 7.1 mydisease: 4591 hits
API 1.15 semmed_disease: 1 hits

After id-to-object translation, BTE retrieved 18049 unique objects.



In the #1 query, BTE found 11 unique Gene nodes
In the #2 query, BTE found 565 unique Disease nodes
In the #3 query, BTE found 18049 unique ChemicalSubstance nodes


## Step 3: Assemble and filter the results

Let's create a `df` object that contains the full output from BioThings Explorer. Each row will show one path that joins the input node to an intermediate node (a `Gene`) to another intermediate node (a `Disease`) to an ending node (a `ChemicalSubstance`). The data frame includes a set of columns with additional details on each node and edge (including human-readable labels, identifiers, and sources). 

In [6]:
df = fc.display_table_view()

In [7]:
df

Unnamed: 0,input,input_type,pred1,pred1_source,pred1_api,pred1_pubmed,node1_type,node1_name,node1_id,pred2,...,node2_type,node2_name,node2_id,pred3,pred3_source,pred3_api,pred3_pubmed,output_type,output_name,output_id
0,ALACRIMIA - CHOREOATHETOSIS - LIVER DYSFUNCTIO...,Disease,related_to,disgenet,mydisease.info API,,Gene,PGM1,NCBIGene:5236,treats,...,Disease,C0302148,UMLS:C0302148,disrupted_by,SEMMED,SEMMED Disease API,28007597,ChemicalSubstance,C1328819,UMLS:C1328819
1,ALACRIMIA - CHOREOATHETOSIS - LIVER DYSFUNCTIO...,Disease,related_to,DISEASE,DISEASES API,,Gene,NGLY1,NCBIGene:55768,related_to,...,Disease,C0699748,UMLS:C0699748,disrupted_by,SEMMED,SEMMED Disease API,239044848744542,ChemicalSubstance,C1328819,UMLS:C1328819
3,ALACRIMIA - CHOREOATHETOSIS - LIVER DYSFUNCTIO...,Disease,related_to,,BioLink API,,Gene,NGLY1,NCBIGene:55768,related_to,...,Disease,C0699748,UMLS:C0699748,disrupted_by,SEMMED,SEMMED Disease API,239044848744542,ChemicalSubstance,C1328819,UMLS:C1328819
4,ALACRIMIA - CHOREOATHETOSIS - LIVER DYSFUNCTIO...,Disease,related_to,disgenet,mydisease.info API,,Gene,NGLY1,NCBIGene:55768,related_to,...,Disease,C0699748,UMLS:C0699748,disrupted_by,SEMMED,SEMMED Disease API,239044848744542,ChemicalSubstance,C1328819,UMLS:C1328819
10,ALACRIMIA - CHOREOATHETOSIS - LIVER DYSFUNCTIO...,Disease,related_to,disgenet,mydisease.info API,,Gene,PGM1,NCBIGene:5236,related_to,...,Disease,C0699748,UMLS:C0699748,disrupted_by,SEMMED,SEMMED Disease API,239044848744542,ChemicalSubstance,C1328819,UMLS:C1328819
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
1438837,ALACRIMIA - CHOREOATHETOSIS - LIVER DYSFUNCTIO...,Disease,related_to,disgenet,mydisease.info API,,Gene,PGM1,NCBIGene:5236,related_to,...,Disease,BENIGN NEOPLASM,MONDO:MONDO:0005165,treated_by,SEMMED,SEMMED Disease API,6261923,ChemicalSubstance,AR 100,name:AR 100
1438838,ALACRIMIA - CHOREOATHETOSIS - LIVER DYSFUNCTIO...,Disease,related_to,disgenet,mydisease.info API,,Gene,PGM1,NCBIGene:5236,related_to,...,Disease,BENIGN NEOPLASM,MONDO:MONDO:0005165,related_to,SEMMED,SEMMED Disease API,28598117,ChemicalSubstance,"NUMB PROTEIN, DROSOPHILA","name:NUMB PROTEIN, DROSOPHILA"
1438839,ALACRIMIA - CHOREOATHETOSIS - LIVER DYSFUNCTIO...,Disease,related_to,disgenet,mydisease.info API,,Gene,PGM1,NCBIGene:5236,related_to,...,Disease,CUTANEOUS DISORDER,MONDO:MONDO:0005093,related_to,Translator Text Mining Provider,CORD Disease API,,ChemicalSubstance,CHEBI:63562,CHEBI:CHEBI:63562
1438840,ALACRIMIA - CHOREOATHETOSIS - LIVER DYSFUNCTIO...,Disease,related_to,disgenet,mydisease.info API,,Gene,PGM1,NCBIGene:5236,related_to,...,Disease,ANESTHESIA RELATED HYPERTHERMIA,MONDO:MONDO:0018493,prevented_by,SEMMED,SEMMED Disease API,9417254,ChemicalSubstance,1-CHLORO-2-METHYL-4-HYDROXYBENZENE,name:1-CHLORO-2-METHYL-4-HYDROXYBENZENE
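
Because each row of `df` is one complete path, the row count grows multiplicatively with the number of edges found at each hop. The toy example below illustrates this with placeholder nodes and edges that are invented for illustration; `enumerate_paths` is a hypothetical helper and not how BTE assembles its results internally.

```python
def enumerate_paths(start, hops):
    """Expand `start` through a list of hop dictionaries.

    Each hop maps a node to a list of (predicate, next_node) pairs.
    Returns one tuple per complete path, alternating nodes and predicates.
    """
    paths = [(start,)]
    for hop in hops:
        extended = []
        for path in paths:
            # Extend every partial path by each edge leaving its last node.
            for predicate, target in hop.get(path[-1], []):
                extended.append(path + (predicate, target))
        paths = extended
    return paths


# Placeholder nodes and edges, purely illustrative.
disease_to_gene = {
    "input_disease": [("related_to", "gene_A"), ("related_to", "gene_B")],
}
gene_to_disease = {
    "gene_A": [("related_to", "disease_X"), ("causes", "disease_Y")],
    "gene_B": [("related_to", "disease_X")],
}
disease_to_chemical = {
    "disease_X": [("treated_by", "chem_1"), ("related_to", "chem_2")],
    "disease_Y": [("treated_by", "chem_1")],
}

rows = enumerate_paths(
    "input_disease",
    [disease_to_gene, gene_to_disease, disease_to_chemical],
)
for row in rows:
    print(row)
```

Two genes, two diseases, and two chemicals already yield five rows here, which gives a feel for why a broad starting disease such as breast cancer produces an unmanageably large table.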


Let's do some profiling of the predicates, and then let's filter `pred3` to only include `treated_by` edges.

In [8]:
df.pred1.value_counts()

related_to    791464
Name: pred1, dtype: int64

In [9]:
df.pred2.value_counts()

related_to    682484
causes         67867
treats         15993
affects        15002
disrupts        9858
prevents         260
Name: pred2, dtype: int64

In [10]:
df.pred3.value_counts()

treated_by                   252482
related_to                   159053
affected_by                  133456
caused_by                    124002
prevented_by                  57189
disrupted_by                  54730
contraindicated_by             7046
negatively_regulated_by        3297
coexists_with                   111
affects                          60
physically_interacts_with        35
positively_regulated_by           3
Name: pred3, dtype: int64

In [11]:
dfFilt = df.loc[df['output_name'].notnull()].query('pred3 == "treated_by"')

In [12]:
dfFilt

Unnamed: 0,input,input_type,pred1,pred1_source,pred1_api,pred1_pubmed,node1_type,node1_name,node1_id,pred2,...,node2_type,node2_name,node2_id,pred3,pred3_source,pred3_api,pred3_pubmed,output_type,output_name,output_id
11,ALACRIMIA - CHOREOATHETOSIS - LIVER DYSFUNCTIO...,Disease,related_to,DISEASE,DISEASES API,,Gene,NGLY1,NCBIGene:55768,related_to,...,Disease,C0699748,UMLS:C0699748,treated_by,SEMMED,SEMMED Disease API,27821778,ChemicalSubstance,C1328819,UMLS:C1328819
13,ALACRIMIA - CHOREOATHETOSIS - LIVER DYSFUNCTIO...,Disease,related_to,,BioLink API,,Gene,NGLY1,NCBIGene:55768,related_to,...,Disease,C0699748,UMLS:C0699748,treated_by,SEMMED,SEMMED Disease API,27821778,ChemicalSubstance,C1328819,UMLS:C1328819
14,ALACRIMIA - CHOREOATHETOSIS - LIVER DYSFUNCTIO...,Disease,related_to,disgenet,mydisease.info API,,Gene,NGLY1,NCBIGene:55768,related_to,...,Disease,C0699748,UMLS:C0699748,treated_by,SEMMED,SEMMED Disease API,27821778,ChemicalSubstance,C1328819,UMLS:C1328819
20,ALACRIMIA - CHOREOATHETOSIS - LIVER DYSFUNCTIO...,Disease,related_to,disgenet,mydisease.info API,,Gene,PGM1,NCBIGene:5236,related_to,...,Disease,C0699748,UMLS:C0699748,treated_by,SEMMED,SEMMED Disease API,27821778,ChemicalSubstance,C1328819,UMLS:C1328819
32,ALACRIMIA - CHOREOATHETOSIS - LIVER DYSFUNCTIO...,Disease,related_to,disgenet,mydisease.info API,,Gene,PGM1,NCBIGene:5236,affects,...,Disease,C0242656,UMLS:C0242656,treated_by,SEMMED,SEMMED Disease API,242346492631125627529062,ChemicalSubstance,C1328819,UMLS:C1328819
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
1438799,ALACRIMIA - CHOREOATHETOSIS - LIVER DYSFUNCTIO...,Disease,related_to,disgenet,mydisease.info API,,Gene,PGM1,NCBIGene:5236,related_to,...,Disease,CAD,MONDO:MONDO:0005010,treated_by,SEMMED,SEMMED Disease API,347136,ChemicalSubstance,XAVIN,name:XAVIN
1438800,ALACRIMIA - CHOREOATHETOSIS - LIVER DYSFUNCTIO...,Disease,related_to,disgenet,mydisease.info API,,Gene,PGM1,NCBIGene:5236,related_to,...,Disease,CAD,MONDO:MONDO:0005010,treated_by,SEMMED,SEMMED Disease API,1527926,ChemicalSubstance,PREBET,name:PREBET
1438835,ALACRIMIA - CHOREOATHETOSIS - LIVER DYSFUNCTIO...,Disease,related_to,disgenet,mydisease.info API,,Gene,PGM1,NCBIGene:5236,related_to,...,Disease,BENIGN NEOPLASM,MONDO:MONDO:0005165,treated_by,SEMMED,SEMMED Disease API,9374344,ChemicalSubstance,TECHNETIUM TC 99M DIMERCAPTOSUCCINIC ACID,name:TECHNETIUM TC 99M DIMERCAPTOSUCCINIC ACID
1438836,ALACRIMIA - CHOREOATHETOSIS - LIVER DYSFUNCTIO...,Disease,related_to,disgenet,mydisease.info API,,Gene,PGM1,NCBIGene:5236,related_to,...,Disease,BENIGN NEOPLASM,MONDO:MONDO:0005165,treated_by,SEMMED,SEMMED Disease API,961054,ChemicalSubstance,KARCHAULI LIQUID,name:KARCHAULI LIQUID
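
The profiling and filtering steps above amount to counting predicate values and then keeping rows that have an output name and a `treated_by` final edge. The sketch below mirrors that logic in plain Python on a few placeholder rows, so it runs without the BTE output; the row values are invented for illustration.

```python
from collections import Counter

# Placeholder rows using the same column names as `df`; values are illustrative.
rows = [
    {"pred3": "treated_by", "output_name": "chem_1"},
    {"pred3": "related_to", "output_name": "chem_2"},
    {"pred3": "treated_by", "output_name": None},
    {"pred3": "treated_by", "output_name": "chem_3"},
    {"pred3": "caused_by", "output_name": "chem_1"},
]

# Analogue of df.pred3.value_counts(): most frequent predicate first.
pred3_counts = Counter(row["pred3"] for row in rows).most_common()

# Analogue of the dfFilt expression: drop missing names, keep treated_by edges.
filtered = [
    row for row in rows
    if row["output_name"] is not None and row["pred3"] == "treated_by"
]

print(pred3_counts)
print([row["output_name"] for row in filtered])
```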


## Step 4: Summarize output `ChemicalSubstance`s

In [13]:
dfFilt.output_name.value_counts().head(50)

PHARMACEUTICAL PREPARATIONS                                                                                                                557
C0450442                                                                                                                                   419
C0243077                                                                                                                                   381
C1611640                                                                                                                                   327
ANTIOXIDANTS                                                                                                                               317
ADRENAL CORTEX HORMONES                                                                                                                    310
AGONISTS                                                                                                                                   305
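
These counts are per row, and the tables above show the same gene, disease, and chemical combination repeated once per `pred1` source (for example `DISEASES API`, `BioLink API`, and `mydisease.info API` for NGLY1). If you would rather count distinct paths, one option is to collapse rows on the node identifiers before counting. The sketch below shows the idea in plain Python on placeholder rows that are invented for illustration; on the real data frame, pandas' `drop_duplicates(subset=[...])` would be the natural equivalent.

```python
from collections import Counter

# Placeholder rows; only the columns needed for de-duplication are shown.
rows = [
    {"pred1_api": "api_1", "node1_id": "gene_A", "node2_id": "disease_X", "output_name": "chem_1"},
    {"pred1_api": "api_2", "node1_id": "gene_A", "node2_id": "disease_X", "output_name": "chem_1"},
    {"pred1_api": "api_1", "node1_id": "gene_B", "node2_id": "disease_X", "output_name": "chem_1"},
    {"pred1_api": "api_1", "node1_id": "gene_A", "node2_id": "disease_Y", "output_name": "chem_2"},
]

# Count every row, as value_counts() does.
per_row = Counter(row["output_name"] for row in rows)

# Count each (gene, disease, chemical) combination once, whatever its source.
distinct_paths = {
    (row["node1_id"], row["node2_id"], row["output_name"]) for row in rows
}
per_path = Counter(output for _, _, output in distinct_paths)

print(per_row)
print(per_path)
```

Whether per-row or per-path counts are more useful depends on the question: per-row counts reward chemicals supported by many sources, while per-path counts reflect how many distinct routes lead to each chemical.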

Suppose we are interested in the links behind `5-METHOXY-N-ACETYLTRYPTAMINE`. Let's examine that subset of paths, specifically printing the frequency of the intermediate `Gene`s and `Disease`s.

In [14]:
dfFilt2 = dfFilt.query('output_name == "5-METHOXY-N-ACETYLTRYPTAMINE"')
dfFilt2

Unnamed: 0,input,input_type,pred1,pred1_source,pred1_api,pred1_pubmed,node1_type,node1_name,node1_id,pred2,...,node2_type,node2_name,node2_id,pred3,pred3_source,pred3_api,pred3_pubmed,output_type,output_name,output_id
140463,ALACRIMIA - CHOREOATHETOSIS - LIVER DYSFUNCTIO...,Disease,related_to,DISEASE,DISEASES API,,Gene,NGLY1,NCBIGene:55768,related_to,...,Disease,C0699748,UMLS:C0699748,treated_by,SEMMED,SEMMED Disease API,1828917423312686,ChemicalSubstance,5-METHOXY-N-ACETYLTRYPTAMINE,name:5-METHOXY-N-ACETYLTRYPTAMINE
140465,ALACRIMIA - CHOREOATHETOSIS - LIVER DYSFUNCTIO...,Disease,related_to,,BioLink API,,Gene,NGLY1,NCBIGene:55768,related_to,...,Disease,C0699748,UMLS:C0699748,treated_by,SEMMED,SEMMED Disease API,1828917423312686,ChemicalSubstance,5-METHOXY-N-ACETYLTRYPTAMINE,name:5-METHOXY-N-ACETYLTRYPTAMINE
140466,ALACRIMIA - CHOREOATHETOSIS - LIVER DYSFUNCTIO...,Disease,related_to,disgenet,mydisease.info API,,Gene,NGLY1,NCBIGene:55768,related_to,...,Disease,C0699748,UMLS:C0699748,treated_by,SEMMED,SEMMED Disease API,1828917423312686,ChemicalSubstance,5-METHOXY-N-ACETYLTRYPTAMINE,name:5-METHOXY-N-ACETYLTRYPTAMINE
140472,ALACRIMIA - CHOREOATHETOSIS - LIVER DYSFUNCTIO...,Disease,related_to,disgenet,mydisease.info API,,Gene,PGM1,NCBIGene:5236,related_to,...,Disease,C0699748,UMLS:C0699748,treated_by,SEMMED,SEMMED Disease API,1828917423312686,ChemicalSubstance,5-METHOXY-N-ACETYLTRYPTAMINE,name:5-METHOXY-N-ACETYLTRYPTAMINE
140495,ALACRIMIA - CHOREOATHETOSIS - LIVER DYSFUNCTIO...,Disease,related_to,DISEASE,DISEASES API,,Gene,PMM2,NCBIGene:5373,causes,...,Disease,C0025362,UMLS:C0025362,treated_by,SEMMED,SEMMED Disease API,15275698,ChemicalSubstance,5-METHOXY-N-ACETYLTRYPTAMINE,name:5-METHOXY-N-ACETYLTRYPTAMINE
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
142491,ALACRIMIA - CHOREOATHETOSIS - LIVER DYSFUNCTIO...,Disease,related_to,disgenet,mydisease.info API,,Gene,PGM1,NCBIGene:5236,related_to,...,Disease,DIFFUSE SCLERODERMA,MONDO:MONDO:0005100,treated_by,SEMMED,SEMMED Disease API,1617326216428927,ChemicalSubstance,5-METHOXY-N-ACETYLTRYPTAMINE,name:5-METHOXY-N-ACETYLTRYPTAMINE
142492,ALACRIMIA - CHOREOATHETOSIS - LIVER DYSFUNCTIO...,Disease,related_to,disgenet,mydisease.info API,,Gene,PGM1,NCBIGene:5236,related_to,...,Disease,ACTIVE TUBERCULOSIS,MONDO:MONDO:0018076,treated_by,SEMMED,SEMMED Disease API,22430231,ChemicalSubstance,5-METHOXY-N-ACETYLTRYPTAMINE,name:5-METHOXY-N-ACETYLTRYPTAMINE
142493,ALACRIMIA - CHOREOATHETOSIS - LIVER DYSFUNCTIO...,Disease,related_to,disgenet,mydisease.info API,,Gene,PGM1,NCBIGene:5236,related_to,...,Disease,CAD,MONDO:MONDO:0005010,treated_by,SEMMED,SEMMED Disease API,10235654117770841570288777606129441519,ChemicalSubstance,5-METHOXY-N-ACETYLTRYPTAMINE,name:5-METHOXY-N-ACETYLTRYPTAMINE
142496,ALACRIMIA - CHOREOATHETOSIS - LIVER DYSFUNCTIO...,Disease,related_to,disgenet,mydisease.info API,,Gene,PGM1,NCBIGene:5236,related_to,...,Disease,CAD,MONDO:MONDO:0005010,treated_by,SEMMED,SEMMED Disease API,157548511637727420210856,ChemicalSubstance,5-METHOXY-N-ACETYLTRYPTAMINE,name:5-METHOXY-N-ACETYLTRYPTAMINE


In [15]:
dfFilt2.node1_name.value_counts().head(50)

NGLY1     87
PMM2      45
PGM1      31
FAF1      26
NFE2L1    19
FBXO2     12
ENGASE    10
FBXO6      7
ALG14      6
GMPPA      6
DDI2       4
Name: node1_name, dtype: int64

In [16]:
dfFilt2.node2_name.value_counts().head(50)

CONDITION                                   20
CELL PROCESS DISEASE                        20
CA                                          19
MALIGNANT NEOPLASM OF STOMACH                8
SCOLIOSIS                                    7
DIABETES                                     6
DISEASE OF METABOLISM                        6
AD                                           6
EPILEPSY, VISUAL                             5
DIABETES MELLITUS, NON-INSULIN-DEPENDENT     5
ARTHRITIS OR POLYARTHRITIS, RHEUMATIC        4
PERICENTRAL PIGMENTARY RETINOPATHY           4
CARCINOGENESIS                               4
DISEASE OF LIVER                             4
BREAST CANCER, FAMILIAL                      4
FAILURE TO THRIVE                            4
FEVER                                        4
OSTEOPENIA                                   4
C0699748                                     4
BONE DISEASE                                 3
C0277785                                     3
MALIGNANT MEL