# pyBravo - Drugbank use case

---
In this use case, we demonstrate how pyBravo can support *in silico* drug screening.

We show **how a list of potential drug targets can be extended using a reconstructed gene regulation network**.

Evaluating the biological or clinical relevance of the retrieved drugs is outside the scope of this demonstration.

In [1]:
# Imports
import csv
from SPARQLWrapper import SPARQLWrapper, JSON
from IPython.display import display, Markdown, Latex

sparql_endpoint = SPARQLWrapper("https://bio2rdf.org/sparql")

def get_bio2rdf_drug_targets(gene_set):
    """
    From a list of genes (HGNC symbols), this function queries the Bio2RDF SPARQL endpoint 
    and checks whether each symbol is a known drug target in DrugBank. It returns the set of drug targets. 
    """
    drug_targets = set() 
    for g in gene_set:
        ask_query = 'ASK { ?drug <http://bio2rdf.org/drugbank_vocabulary:target> ?target . ?target <http://bio2rdf.org/drugbank_vocabulary:gene-name> "'+g+'"^^xsd:string}'

        sparql_endpoint.setQuery(ask_query)
        sparql_endpoint.setReturnFormat(JSON)
        results = sparql_endpoint.query().convert()
        if results["boolean"]:
            print(f'found drug targeting {g} in Drugbank (live Bio2RDF endpoint)')
            drug_targets.add(g)
    
    return drug_targets

def get_bio2rdf_drugs(gene_set):
    """
    From a list of genes (HGNC symbols) known as drug targets, returns the DrugBank URLs of the drugs that target them. 
    """
    drugs = set() 
    # Only drugs belonging to the selected DrugBank group are retrieved (here 'Approved')
    drug_groups = ['Approved', 'Nutraceutical', 'Experimental']
    group = drug_groups[0]
    for g in gene_set:
        query = 'SELECT ?drug WHERE { \
            ?drug <http://bio2rdf.org/drugbank_vocabulary:target> ?target .\
            ?target <http://bio2rdf.org/drugbank_vocabulary:gene-name> "'+g+'"^^xsd:string .\
            ?drug rdf:type <http://bio2rdf.org/drugbank_vocabulary:Drug> .\
            ?drug <http://bio2rdf.org/drugbank_vocabulary:group> <http://bio2rdf.org/drugbank_vocabulary:'+group+'> .\
            }'

        sparql_endpoint.setQuery(query)
        sparql_endpoint.setReturnFormat(JSON)
        results = sparql_endpoint.query().convert()
        
        for r in results["results"]["bindings"]:
            d = r["drug"]["value"]
            d_url = d.replace("http://bio2rdf.org/drugbank:", "https://www.drugbank.ca/drugs/")
            print(f'{g} --> {d_url}')
            drugs.add(d_url)
    
    return drugs
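
The function above sends one SELECT query per gene. A possible alternative, sketched below, is to cover all genes in a single query with a SPARQL 1.1 `VALUES` clause. The helper name `build_batched_drug_query` is hypothetical, and whether the Bio2RDF endpoint accepts a query of this size has not been checked here; long gene lists may need to be split into chunks. The sketch only builds the query string, which could then be passed to `sparql_endpoint.setQuery` as in the cell above.

```python
DRUGBANK_VOCAB = "http://bio2rdf.org/drugbank_vocabulary:"

def build_batched_drug_query(gene_set, group="Approved"):
    """Build one SELECT query for all genes (hypothetical helper, not part of pyBravo)."""
    # Typed literals mirror the "..."^^xsd:string form used in the per-gene queries.
    # Sorting makes the generated query deterministic.
    values = " ".join(f'"{g}"^^xsd:string' for g in sorted(gene_set))
    return f"""PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
SELECT DISTINCT ?gene ?drug WHERE {{
  VALUES ?gene {{ {values} }}
  ?drug <{DRUGBANK_VOCAB}target> ?target .
  ?target <{DRUGBANK_VOCAB}gene-name> ?gene .
  ?drug rdf:type <{DRUGBANK_VOCAB}Drug> .
  ?drug <{DRUGBANK_VOCAB}group> <{DRUGBANK_VOCAB}{group}> .
}}"""

query = build_batched_drug_query({"TYMS", "GOT1"})
print(query)
```

Each result binding would then carry both `?gene` and `?drug`, so the gene-to-drug association is kept without looping on the client side.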

## 1. Loading input gene list to be considered as drug targets

In [2]:
# Read the input gene list from a file (each row is treated as one gene symbol)
def read_input_genes(filename):
    res = set()
    with open(filename, newline='') as csvfile:
        reader = csv.reader(csvfile, delimiter=' ', quotechar='|')
        for row in reader:
            res.add(''.join(row))
    return res

# Genes input list
gene_set = read_input_genes('data/target_genes.csv')
print('{} target genes'.format(len(gene_set)))

192 target genes


## 2. Retrieving potential drugs 

In [3]:
%%time
#drug_targets = get_bio2rdf_drug_targets(gene_set)
#potential_drugs = get_bio2rdf_drugs(drug_targets)

potential_drugs = get_bio2rdf_drugs(gene_set)

FABP2 --> https://www.drugbank.ca/drugs/DB01050
SLC16A3 --> https://www.drugbank.ca/drugs/DB00119
PCYT1B --> https://www.drugbank.ca/drugs/DB00122
NPC1L1 --> https://www.drugbank.ca/drugs/DB00973
PISD --> https://www.drugbank.ca/drugs/DB00144
TYMS --> https://www.drugbank.ca/drugs/DB00293
TYMS --> https://www.drugbank.ca/drugs/DB00432
TYMS --> https://www.drugbank.ca/drugs/DB00322
TYMS --> https://www.drugbank.ca/drugs/DB00441
TYMS --> https://www.drugbank.ca/drugs/DB00642
TYMS --> https://www.drugbank.ca/drugs/DB01101
TYMS --> https://www.drugbank.ca/drugs/DB06813
TYMS --> https://www.drugbank.ca/drugs/DB00440
TYMS --> https://www.drugbank.ca/drugs/DB00544
TYMS --> https://www.drugbank.ca/drugs/DB00650
SLC16A1 --> https://www.drugbank.ca/drugs/DB00119
SPTLC2 --> https://www.drugbank.ca/drugs/DB00133
SLC25A29 --> https://www.drugbank.ca/drugs/DB00583
GOT1 --> https://www.drugbank.ca/drugs/DB00128
GOT1 --> https://www.drugbank.ca/drugs/DB00151
FABP6 --> https://www.drugbank.ca/drugs/DB0

In [4]:
display(Markdown(f"**{len(potential_drugs)}** **potential drugs** found in Bio2RDF"))

**32** **potential drugs** found in Bio2RDF

## 3. Reconstructing the gene regulation network with pyBravo (max depth = 1)

In [18]:
!python ../src/pyBravo.py --regulation --input_file data/target_genes.csv -md 1 -excl mirtarbase msigdb -co -su -sy

Explored 1 regulators
Explored 4 regulators
Explored 6 regulators
Explored 7 regulators
Explored 8 regulators
Explored 9 regulators
Explored 11 regulators
Explored 11 regulators
Explored 11 regulators
Explored 11 regulators
Explored 12 regulators
Explored 18 regulators
Explored 18 regulators
Explored 18 regulators
Explored 18 regulators
Explored 18 regulators
Explored 23 regulators
Explored 24 regulators
Explored 27 regulators
Explored 28 regulators
Explored 28 regulators
Explored 30 regulators
Explored 34 regulators
Explored 34 regulators
Explored 36 regulators
Explored 36 regulators
Explored 38 regulators
Explored 38 regulators
Explored 38 regulators
Explored 43 regulators
Explored 44 regulators
Explored 44 regulators
Explored 44 regulators
Explored 45 regulators
Explored 45 regulators
Explored 45 regulators
Explored 46 regulators
Explored 47 regulators
Explored 47 regulators
Explored 49 regulators
Explored 49 regulators
Explored 50 regulators
Explored 50 regulators
Explored 50 regul

## 4. Extracting the gene regulators to be considered as potential drug targets

In [19]:
def read_regulatory_genes(filename):
    res = set()
    with open(filename, newline='') as csvfile:
        reader = csv.reader(csvfile, delimiter='\t', quotechar='|')
        for row in reader:
            for item in row:
                res.add(item)
    return res

reg_gene_set = read_regulatory_genes('out-unified.sif')
#reg_gene_set = read_regulatory_genes('out.sif')

to_be_explored_genes = reg_gene_set - gene_set

print('{} new potential target genes'.format(len(to_be_explored_genes)))

94 new potential target genes
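
The cell above adds every tab-separated field of the file to the set. If `out-unified.sif` follows the usual three-column SIF layout (source, relation, target), the relation labels would also end up in the set and be counted as genes. A minimal sketch of a reader that keeps only node names, assuming that layout, is given below; the function name `read_sif_genes` and the sample rows (placeholder gene names and relation labels) are illustrative and not taken from the pyBravo output.

```python
import csv
import io

def read_sif_genes(lines):
    """Collect node names from SIF rows: source <TAB> relation <TAB> target(s)."""
    genes = set()
    for row in csv.reader(lines, delimiter='\t'):
        if not row:
            continue               # skip empty lines
        genes.add(row[0])          # source node
        genes.update(row[2:])      # target node(s); the relation in row[1] is skipped
    return genes

# Placeholder rows for illustration only (assumed layout, made-up labels)
sample = io.StringIO("GENE_A\tACTIVATION\tGENE_B\nGENE_C\tINHIBITION\tGENE_A\n")
regulators = read_sif_genes(sample)
print(sorted(regulators))
```

With an open file handle in place of `sample`, the result could be used exactly like `reg_gene_set` in the set difference above.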


## 5. Retrieving new potential drugs

In [20]:
%%time
#drug_targets = get_bio2rdf_drug_targets(gene_set)
#potential_drugs = get_bio2rdf_drugs(drug_targets)

new_potential_drugs = get_bio2rdf_drugs(to_be_explored_genes)

PIK3CA --> https://www.drugbank.ca/drugs/DB00201
PPARD --> https://www.drugbank.ca/drugs/DB00159
PPARD --> https://www.drugbank.ca/drugs/DB01393
PPARD --> https://www.drugbank.ca/drugs/DB00605
PPARD --> https://www.drugbank.ca/drugs/DB00374
VEGFA --> https://www.drugbank.ca/drugs/DB08885
VEGFA --> https://www.drugbank.ca/drugs/DB05294
VEGFA --> https://www.drugbank.ca/drugs/DB00112
VEGFA --> https://www.drugbank.ca/drugs/DB01120
VEGFA --> https://www.drugbank.ca/drugs/DB01270
VEGFA --> https://www.drugbank.ca/drugs/DB01017
VEGFA --> https://www.drugbank.ca/drugs/DB06779
VEGFA --> https://www.drugbank.ca/drugs/DB01136
NOS3 --> https://www.drugbank.ca/drugs/DB00155
NOS3 --> https://www.drugbank.ca/drugs/DB00360
NOS3 --> https://www.drugbank.ca/drugs/DB01110
NOS3 --> https://www.drugbank.ca/drugs/DB00125
HIF1A --> https://www.drugbank.ca/drugs/DB01136
AMN --> https://www.drugbank.ca/drugs/DB00200
NR1I2 --> https://www.drugbank.ca/drugs/DB00163
NR1I2 --> https://www.drugbank.ca/drugs/DB088

In [21]:
display(Markdown(f"**{len(new_potential_drugs)}** **potential drugs** found in Bio2RDF"))

**92** **potential drugs** found in Bio2RDF
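
Because the results are stored in sets, a drug reached through several genes is counted once (for example, DB01136 is listed for both VEGFA and HIF1A in the output above). To see which drugs are reachable only through the regulators, and which genes lead to each drug, something like the following could be used. It is a sketch that assumes a variant of `get_bio2rdf_drugs` returning `(gene, drug)` pairs, which the function in this notebook does not do; the pairs below are a small excerpt copied from the outputs above.

```python
from collections import defaultdict

def group_targets_by_drug(pairs):
    """Map each drug identifier to the set of genes it was found for."""
    by_drug = defaultdict(set)
    for gene, drug in pairs:
        by_drug[drug].add(gene)
    return dict(by_drug)

# Excerpt of the (gene, drug) associations printed in steps 2 and 5
seed_pairs = [("SLC16A3", "DB00119"), ("SLC16A1", "DB00119"), ("GOT1", "DB00128")]
regulator_pairs = [("VEGFA", "DB01136"), ("HIF1A", "DB01136"), ("PIK3CA", "DB00201")]

seed_drugs = {drug for _, drug in seed_pairs}
regulator_drugs = {drug for _, drug in regulator_pairs}

# Drugs that only appear once the regulators are taken into account
only_via_regulators = regulator_drugs - seed_drugs
by_drug = group_targets_by_drug(regulator_pairs)

for drug in sorted(only_via_regulators):
    print(drug, sorted(by_drug[drug]))
```

With the notebook's own variables, the same set difference would be `new_potential_drugs - potential_drugs`.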