## This notebook gives a short demonstration of how to connect drug or gene lists found through outlier analysis with the Drug Gene Interaction Database (DGIdb).

In [1]:
import pandas as pd
import binarization_functions as bf
import requests
import json

First we will generate a list of genes or drugs. In practice this list would be generated per patient or per clinical attribute with other functions in cptac.utils, but to demonstrate how to use DGIdb we will use an arbitrary list of genes as an example.

In [2]:
gene_list = ['AVL9', 'COG1', 'GOLPH3L', 'RAB3GAP1', 'RAB3GAP2']

In [3]:
r = bf.dgidb_get_request(genes_or_drugs_list=gene_list, 
                         genes=True, drugs=False, 
                         interaction_sources=['TTD','DrugBank','TALC'], 
                         interaction_types=['inhibitor','activator'], 
                         fda_approved_drug=True, immunotherapy=True,
                         anti_neoplastic=True, clinically_actionable=True, 
                         druggable_genome=True, drug_resistance=True, 
                         gene_categories=['tumor suppressor','dna repair', 'kinase'], 
                         source_trust_levels=['Expert curated'])
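
To make the call above more concrete, here is a minimal sketch of how such a GET URL could be assembled from a term list and filters. It is not the actual implementation of bf.dgidb_get_request(). The endpoint URL, the comma-separated encoding of list values, and the helper name build_dgidb_url are assumptions for illustration and may not match the live DGIdb service.

```python
from urllib.parse import urlencode

# Assumed base URL of a DGIdb v2-style interactions endpoint (not taken from this notebook).
DGIDB_URL = "https://dgidb.org/api/v2/interactions.json"


def build_dgidb_url(terms, genes=True, **filters):
    """Hypothetical helper: build a GET URL for a list of genes (or drugs) plus filters."""
    # The search terms go under "genes" or "drugs", joined with commas.
    params = {"genes" if genes else "drugs": ",".join(terms)}
    for name, value in filters.items():
        if value is None:
            continue  # unset filters are simply left out of the query string
        if isinstance(value, (list, tuple)):
            params[name] = ",".join(value)
        elif isinstance(value, bool):
            params[name] = str(value).lower()  # True -> "true"
        else:
            params[name] = str(value)
    return DGIDB_URL + "?" + urlencode(params)


url = build_dgidb_url(
    ["AVL9", "COG1"],
    interaction_sources=["TTD", "DrugBank"],
    fda_approved_drug=True,
)
print(url)
```

If the endpoint assumption holds, the resulting URL could be fetched with requests.get(url).json() to obtain a dictionary like the one shown in the next cell.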

An HTTP GET request to DGIdb returns a large JSON object containing a lot of information, some of which may not be necessary for our analysis. If you would like the whole JSON object, it is saved as a Python dictionary in whatever variable you assign the result of bf.dgidb_get_request() to, in this case "r".

In [4]:
r

{'ambiguousTerms': [],
 'matchedTerms': [{'entrezId': 23080,
   'geneCategories': [],
   'geneLongName': 'AVL9 CELL MIGRATION ASSOCIATED',
   'geneName': 'AVL9',
   'interactions': [],
   'searchTerm': 'AVL9'},
  {'entrezId': 9382,
   'geneCategories': [],
   'geneLongName': 'COMPONENT OF OLIGOMERIC GOLGI COMPLEX 1',
   'geneName': 'COG1',
   'interactions': [],
   'searchTerm': 'COG1'},
  {'entrezId': 55204,
   'geneCategories': [],
   'geneLongName': 'GOLGI PHOSPHOPROTEIN 3 LIKE',
   'geneName': 'GOLPH3L',
   'interactions': [],
   'searchTerm': 'GOLPH3L'},
  {'entrezId': 22930,
   'geneCategories': [],
   'geneLongName': 'RAB3 GTPASE ACTIVATING PROTEIN CATALYTIC SUBUNIT 1',
   'geneName': 'RAB3GAP1',
   'interactions': [],
   'searchTerm': 'RAB3GAP1'},
  {'entrezId': 25782,
   'geneCategories': [],
   'geneLongName': 'RAB3 GTPASE ACTIVATING NON-CATALYTIC PROTEIN SUBUNIT 2',
   'geneName': 'RAB3GAP2',
   'interactions': [],
   'searchTerm': 'RAB3GAP2'}],
 'unmatchedTerms': []}
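
A quick way to check a raw response like the one above is to count the interactions returned for each matched term. The sketch below uses a trimmed copy of that output. The helper name count_interactions is hypothetical and is not part of binarization_functions.

```python
# Trimmed-down response with the same shape as the output above.
response = {
    "ambiguousTerms": [],
    "matchedTerms": [
        {"geneName": "AVL9", "interactions": [], "searchTerm": "AVL9"},
        {"geneName": "COG1", "interactions": [], "searchTerm": "COG1"},
    ],
    "unmatchedTerms": [],
}


def count_interactions(resp, name_key="geneName"):
    """Map each matched term to the number of interactions returned for it."""
    return {term[name_key]: len(term["interactions"]) for term in resp["matchedTerms"]}


counts = count_interactions(response)
print(counts)  # {'AVL9': 0, 'COG1': 0}
```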

You will notice that calling dgidb_json_parse() on "r" prints "No Gene/Drug interactions". dgidb_json_parse() parses the rather complicated JSON object and returns a dictionary of genes and their interactions with drugs (or the inverse if a list of drugs, rather than a list of genes, was given to dgidb_get_request()). Because no interactions were returned, DGIdb has no drugs matching our query filters that interact with the genes in our list.

In [5]:
drug_dict = bf.dgidb_json_parse(r, genes=True)

No Gene/Drug interactions
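
To illustrate the kind of reduction described above, here is a minimal sketch of a parser for gene queries. It is not the source of bf.dgidb_json_parse(). The function name parse_gene_response and the choice to drop genes without interactions are assumptions, and the sample records are taken from a later query in this notebook.

```python
def parse_gene_response(resp):
    """Reduce a gene-query response to {gene: {drug: [interaction types]}}."""
    parsed = {}
    for term in resp.get("matchedTerms", []):
        drugs = {
            hit["drugName"]: list(hit.get("interactionTypes", []))
            for hit in term.get("interactions", [])
        }
        if drugs:  # assumption: genes without any interaction are left out
            parsed[term["geneName"]] = drugs
    if not parsed:
        print("No Gene/Drug interactions")
    return parsed


sample = {
    "matchedTerms": [
        {
            "geneName": "ATP1B2",
            "interactions": [
                {"drugName": "DESLANOSIDE", "interactionTypes": ["inhibitor"]},
                {"drugName": "DIGITOXIN", "interactionTypes": ["inhibitor"]},
            ],
        },
        {"geneName": "AVL9", "interactions": []},
    ],
}
parsed = parse_gene_response(sample)
print(parsed)
```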


Below is an example of genes with many interactions, along with a lot of extra information we likely do not need for our analysis. Comparing the JSON object of our query before and after dgidb_json_parse() illustrates this, and shows how the function extracts the more meaningful information from the query.

In [6]:
gene_list2 = ["ATP1B2", "SHBG", "CD38", "P2RX4"]
r2 = bf.dgidb_get_request(gene_list2, genes=True)

In [7]:
r2

{'ambiguousTerms': [],
 'matchedTerms': [{'entrezId': 482,
   'geneCategories': [{'id': '9bcd3de20cbc41da8d8ba4c693c93c8b',
     'name': 'TRANSPORTER'},
    {'id': '00f6836cd18c4d1ab6fa29d672375ae0', 'name': 'ABC TRANSPORTER'},
    {'id': 'd3ec2631e0b2434b9dcc008e793d3fa5', 'name': 'DRUGGABLE GENOME'}],
   'geneLongName': 'ATPASE NA+/K+ TRANSPORTING SUBUNIT BETA 2',
   'geneName': 'ATP1B2',
   'interactions': [{'drugChemblId': 'CHEMBL1614',
     'drugName': 'DESLANOSIDE',
     'interactionId': '51136830-a97d-4b77-9d92-5edebeb782d8',
     'interactionTypes': ['inhibitor'],
     'pmids': [],
     'score': 1,
     'sources': ['ChemblInteractions']},
    {'drugChemblId': 'CHEMBL254219',
     'drugName': 'DIGITOXIN',
     'interactionId': 'f5a8b9ba-b3f3-40b5-b394-4e2432c6e4c1',
     'interactionTypes': ['inhibitor'],
     'pmids': [],
     'score': 1,
     'sources': ['ChemblInteractions']},
    {'drugChemblId': 'CHEMBL1751',
     'drugName': 'DIGOXIN',
     'interactionId': 'a6e61319-e04a-

In [8]:
drug_dict2 = bf.dgidb_json_parse(r2, genes=True)

In [9]:
#Note: Key = Gene, Value = {Key = Drug, Value = [Interaction Type(s)]}
print(json.dumps(drug_dict2, indent=4))

{
    "ATP1B2": {
        "DESLANOSIDE": [
            "inhibitor"
        ],
        "DIGITOXIN": [
            "inhibitor"
        ],
        "DIGOXIN": [
            "inhibitor"
        ],
        "ACETYLDIGITOXIN": [
            "inhibitor"
        ]
    },
    "SHBG": {
        "SPIRONOLACTONE": [
            "binder"
        ],
        "BUSULFAN": [],
        "LISINOPRIL": [],
        "NORGESTREL": [],
        "DROLOXIFENE": []
    },
    "CD38": {
        "DARATUMUMAB": [
            "antibody",
            "inhibitor"
        ],
        "ISATUXIMAB": [
            "antibody"
        ],
        "SAR-650984": [],
        "HuMax-CD38": [],
        "THROMBIN": []
    },
    "P2RX4": {
        "ESLICARBAZEPINE ACETATE": [
            "antagonist"
        ]
    }
}
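
For downstream analysis, a nested dictionary like the one above can be flattened into one row per gene, drug, and interaction type. The following is a sketch under that assumption. The helper name flatten_interactions and its keep_untyped option are hypothetical, and the input is a small subset of the output above.

```python
# Small subset of the parsed output above.
subset = {
    "SHBG": {"SPIRONOLACTONE": ["binder"], "BUSULFAN": []},
    "CD38": {"DARATUMUMAB": ["antibody", "inhibitor"]},
}


def flatten_interactions(nested, keep_untyped=False):
    """Turn {gene: {drug: [types]}} into a list of (gene, drug, type) rows."""
    rows = []
    for gene, drugs in nested.items():
        for drug, types in drugs.items():
            if types:
                rows.extend((gene, drug, kind) for kind in types)
            elif keep_untyped:
                # Drugs listed without an interaction type get a None placeholder.
                rows.append((gene, drug, None))
    return rows


rows = flatten_interactions(subset)
for row in rows:
    print(row)
```

The rows can then be passed to pd.DataFrame(rows, columns=["gene", "drug", "interaction_type"]) if a table is more convenient than a nested dictionary.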


Below is an example of searching with a list of drugs rather than genes.

In [10]:
drug_list = ["DESLANOSIDE", "SPIRONOLACTONE", "DARATUMUMAB"]

In [13]:
r3 = bf.dgidb_get_request(drug_list, genes=False, drugs=True)

In [14]:
gene_dict = bf.dgidb_json_parse(r3, drugs=True)

In [15]:
#Note: Key = Drug, Value = {Key = Gene, Value = [Interaction type(s)]}
print(json.dumps(gene_dict, indent=4))

{
    "DESLANOSIDE": {
        "ATP1A1": [
            "inhibitor",
            "binder"
        ],
        "ATP1B2": [
            "inhibitor"
        ],
        "ATP1A3": [
            "inhibitor"
        ],
        "ATP1A4": [
            "inhibitor"
        ],
        "ATP1B3": [
            "inhibitor"
        ],
        "ATP1A2": [
            "inhibitor"
        ],
        "ATP1B1": [
            "inhibitor"
        ],
        "FXYD2": [
            "inhibitor"
        ]
    },
    "SPIRONOLACTONE": {
        "NR3C2": [
            "antagonist"
        ],
        "AR": [
            "antagonist"
        ],
        "PGR": [
            "agonist"
        ],
        "NR3C1": [
            "antagonist"
        ],
        "CYP11B2": [
            "antagonist"
        ],
        "SHBG": [
            "binder"
        ],
        "SSTR2": [
            "agonist"
        ],
        "CACNG1": [],
        "CYP7B1": [],
        "ACE": [],
        "CYP1A2": [],
        "ADIPOQ": [],
        