# Query

What genes and pathways are uniquely targeted by HSCT conditioning drugs that are well tolerated vs. poorly tolerated by FA patients?

# Workflow

**Input**: [Two HSCT conditioning drug sets: (1) well-tolerated by FA patients (Set1d); (2) poorly-tolerated by FA patients (Set2d)](#input)

**Step 1.** [Retrieve proteins targeted by set of well-tolerated HSCT conditioning drugs --> Set1p](#step1)

**Step 2.** [Retrieve proteins targeted by set of poorly-tolerated HSCT conditioning drugs --> Set2p](#step2)

**Step 3.** [Retrieve genes encoding proteins in Set1p vs Set2p --> Set1g, Set2g](#step3)

**Step 4.** [Retrieve pathways associated with genes in Set1g vs Set2g --> Set1pw, Set2pw](#step4)

**Step 5.** [Retrieve other genes involved in pathways in Set1pw vs Set2pw --> Set1g', Set2g'](#step5)

**Step 6.** [Execute set comparison analysis to return the set of genes that is uniquely targeted by poorly tolerated drugs (i.e. affected directly or indirectly by poorly tolerated drugs, but not affected by well-tolerated drugs)](#step6)

**Output**: Set of genes that may be uniquely targeted by pre-conditioning drugs that are poorly tolerated by FA patients.

# Sources & Routes



**All Steps**: [Biothings IdListHandler](https://github.com/kevinxin90/jsonld_demo)

**The BioThings Python library is built on JSON-LD and can be used to connect information across different biological entities, e.g. drug-protein-gene-pathway. It currently integrates API resources from MyGene.info, MyVariant.info, the Drug and Compound API, etc.**

**Step 1 & 2**: **Drug to Protein** [Drug and Compound API](http://c.biothings.io)

**Step 3**: **Protein to Gene** [MyGene.info](http://mygene.info)

**Step 4**: **Gene to Pathway** [MyGene.info](http://mygene.info)

**Step 5**: **Pathway to Gene** [MyGene.info](http://mygene.info)
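
The routes above chain into one pipeline in which each step's output IDs become the next step's input. A minimal sketch of that chaining, assuming each route can be modeled as a plain lookup table. The `hop` and `run_pipeline` helpers and all IDs below are hypothetical placeholders, not the BioThings API or real identifiers:

```python
def hop(ids, table):
    """Follow one route: map each input ID to its linked IDs, keeping first-seen order."""
    out = []
    for i in ids:
        for linked in table.get(i, []):
            if linked not in out:
                out.append(linked)
    return out

# Hypothetical toy lookup tables standing in for the four API routes.
drug_to_protein = {"drugA": ["prot1"], "drugB": ["prot1", "prot2"]}
protein_to_gene = {"prot1": ["gene1"], "prot2": ["gene2"]}
gene_to_pathway = {"gene1": ["pw1"], "gene2": ["pw1", "pw2"]}
pathway_to_gene = {"pw1": ["gene1", "gene2", "gene3"], "pw2": ["gene2", "gene4"]}

def run_pipeline(drugs):
    """Drug -> protein -> gene -> pathway -> all genes in those pathways."""
    proteins = hop(drugs, drug_to_protein)
    genes = hop(proteins, protein_to_gene)
    pathways = hop(genes, gene_to_pathway)
    return hop(pathways, pathway_to_gene)

expanded = run_pipeline(["drugA", "drugB"])
print(expanded)
```

The last hop widens the gene set beyond the direct targets, which is why the final comparison covers genes affected indirectly as well as directly.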

# Demo


<a id='input'></a>
**Input**: Two HSCT conditioning drug sets: (1) well-tolerated by FA patients (Set1d); (2) poorly-tolerated by FA patients (Set2d)

In [2]:
'''
Assume well-tolerated drugs: Fludarabine, Carmustine
Assume poorly-tolerated drugs: Etoposide, Tacrolimus
'''
drug_set1 = ['Fludarabine', 'Carmustine']
drug_set2 = ['Etoposide', 'Tacrolimus']

In [3]:
'''
Import BioThingsExplorer Python Package
'''
from BioThingsExplorer import IdListHandler
ih = IdListHandler()

In [4]:
'''
Get drugbank ID from drug symbol
'''
set1d = ih.list_handler(input_id_list=drug_set1, input_type='drug_symbol', output_type='drugbank_id')
set2d = ih.list_handler(input_id_list=drug_set2, input_type='drug_symbol', output_type='drugbank_id')

['aeolus.drug_name', 'chembl.pref_name']
aeolus.drug_name:Fludarabine OR chembl.pref_name
Fetching 1 drug(s) . . .
No results to return
Number of IDs from mydrug.info related to this query aeolus.drug_name:Fludarabine OR chembl.pref_name is : 1
['aeolus.drug_name', 'chembl.pref_name']
aeolus.drug_name:Carmustine OR chembl.pref_name
Fetching 1 drug(s) . . .
No results to return
Number of IDs from mydrug.info related to this query aeolus.drug_name:Carmustine OR chembl.pref_name is : 1
['aeolus.drug_name', 'chembl.pref_name']
aeolus.drug_name:Etoposide OR chembl.pref_name
Fetching 1 drug(s) . . .
No results to return
Number of IDs from mydrug.info related to this query aeolus.drug_name:Etoposide OR chembl.pref_name is : 1
['aeolus.drug_name', 'chembl.pref_name']
aeolus.drug_name:Tacrolimus OR chembl.pref_name
Fetching 0 drug(s) . . .
No results to return
Number of IDs from mydrug.info related to this query aeolus.drug_name:Tacrolimus OR chembl.pref_name is : 0


In [5]:
print(set1d)
print(set2d)

['DLGOEMSEDOSKAD-UHFFFAOYSA-N', 'HBUBKKRHXORPQB-FJFJXFQQSA-N']
['VJJPUSNTGOMMGY-MRVIYFEKSA-N']
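
The log above reports 0 hits for Tacrolimus, so `set2d` holds a single ID and that drug drops out of every later step without any error. One way to surface this is to resolve names one at a time and collect the ones with no match. The sketch below assumes a per-name lookup function; `resolve_each` and the stand-in `fake_lookup` are hypothetical, and in the notebook the lookup would presumably be a one-element `ih.list_handler(...)` call:

```python
def resolve_each(names, lookup):
    """Resolve names one by one; return (resolved mapping, list of unmatched names)."""
    resolved, unmatched = {}, []
    for name in names:
        ids = lookup(name)
        if ids:
            resolved[name] = ids
        else:
            unmatched.append(name)
    return resolved, unmatched

# Stand-in lookup mirroring the result shown above (Tacrolimus returned nothing).
known = {"Etoposide": ["VJJPUSNTGOMMGY-MRVIYFEKSA-N"]}

def fake_lookup(name):
    return known.get(name, [])

resolved, unmatched = resolve_each(["Etoposide", "Tacrolimus"], fake_lookup)
print("unmatched:", unmatched)
```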


<a id='step1'></a>
**Step 1.** Retrieve proteins (*uniprot_id*) targeted by set of well-tolerated HSCT conditioning drugs --> Set1p

In [8]:
'''
Use IdListHandler to retrieve a list of Uniprot_IDs corresponding to Drugbank_IDs for Drug Set 1
'''
set1p = ih.list_handler(input_id_list=set1d, input_type='drugbank_id', output_type='uniprot_id', relation="oban:is_Target_of")
print('Protein Uniprot IDs related to Drugs in Drug Set 1 is: {}'.format(set1p))

Protein Uniprot IDs related to Drugs in Drug Set 1 is: ['P00390']


<a id='step2'></a>
**Step 2.** Retrieve proteins targeted by set of poorly-tolerated HSCT conditioning drugs --> Set2p

In [9]:
'''
Use IdListHandler to retrieve a list of Uniprot_IDs corresponding to Drugbank_IDs for Drug Set 2
'''
set2p = ih.list_handler(input_id_list=set2d, input_type='drugbank_id', output_type='uniprot_id', relation="oban:is_Target_of")
print('Protein Uniprot IDs related to Drugs in Drug Set 2 is: {}'.format(set2p))

Protein Uniprot IDs related to Drugs in Drug Set 2 is: ['Q02880', 'P11388']


<a id='step3'></a>
**Step 3.** Retrieve genes encoding proteins in Set1p vs Set2p --> Set1g, Set2g

In [10]:
'''
Use IdListHandler to retrieve a list of Entrez_Gene_IDs corresponding to Uniprot_IDs for Drug Set 1
'''
set1g = ih.list_handler(input_id_list=set1p, input_type='uniprot_id', output_type='entrez_gene_id')
print('Entrez Gene IDs related to Drugs in Drug Set 1 is: {}'.format(set1g))

['uniprot.Swiss-Prot']
uniprot.Swiss-Prot:P00390
Fetching 1 gene(s) . . .
Number of IDs from mygene.info related to this query uniprot.Swiss-Prot:P00390 is : 1
Entrez Gene IDs related to Drugs in Drug Set 1 is: ['2936']


In [11]:
'''
Use IdListHandler to retrieve a list of Entrez_Gene_IDs corresponding to Uniprot_IDs for Drug Set 2
'''
set2g = ih.list_handler(input_id_list=set2p, input_type='uniprot_id', output_type='entrez_gene_id')
print('Entrez Gene IDs related to Drugs in Drug Set 2 is: {}'.format(set2g))

['uniprot.Swiss-Prot']
uniprot.Swiss-Prot:Q02880
Fetching 1 gene(s) . . .
Number of IDs from mygene.info related to this query uniprot.Swiss-Prot:Q02880 is : 1
['uniprot.Swiss-Prot']
uniprot.Swiss-Prot:P11388
Fetching 1 gene(s) . . .
Number of IDs from mygene.info related to this query uniprot.Swiss-Prot:P11388 is : 1
Entrez Gene IDs related to Drugs in Drug Set 2 is: ['7153', '7155']


<a id='step4'></a>
**Step 4.** Retrieve pathways associated with genes in Set1g vs Set2g --> Set1pw, Set2pw

In [12]:
'''
Use IdListHandler to retrieve a list of Wikipathway_IDs corresponding to Entrez_Gene_IDs for Drug Set 1
'''
set1pw = ih.list_handler(input_id_list=set1g, input_type='entrez_gene_id', output_type='wikipathway_id')
print('Wikipathway IDs related to Drugs in Drug Set 1 is: {}'.format(set1pw))

Wikipathway IDs related to Drugs in Drug Set 1 is: ['WP2882', 'WP692', 'WP3925', 'WP100', 'WP15', 'WP3940', 'WP702', 'WP2884', 'WP408']


In [13]:
'''
Use IdListHandler to retrieve a list of Wikipathway_IDs corresponding to Entrez_Gene_IDs for Drug Set 2
'''
set2pw = ih.list_handler(input_id_list=set2g, input_type='entrez_gene_id', output_type='wikipathway_id')
print('Wikipathway IDs related to Drugs in Drug Set 2 is: {}'.format(set2pw))

Wikipathway IDs related to Drugs in Drug Set 2 is: ['WP2363', 'WP2361', 'WP2377', 'WP2446']
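
The query asks about pathways as well as genes, so the two pathway lists printed above can be compared directly before expanding back to genes. A short sketch using exactly those lists (the variable names `shared_pw` and `set2pw_only` are introduced here for illustration):

```python
# Pathway IDs copied from the two outputs above.
set1pw = ['WP2882', 'WP692', 'WP3925', 'WP100', 'WP15', 'WP3940', 'WP702', 'WP2884', 'WP408']
set2pw = ['WP2363', 'WP2361', 'WP2377', 'WP2446']

# Pathways hit by both drug sets, and pathways hit only by the poorly tolerated set.
shared_pw = set(set1pw) & set(set2pw)
set2pw_only = sorted(set(set2pw) - set(set1pw))

print('Shared pathways: {}'.format(len(shared_pw)))
print('Pathways only in Drug Set 2: {}'.format(set2pw_only))
```

For these two lists the intersection is empty, so any overlap found later at the gene level comes from genes that belong to pathways on both sides, not from a shared pathway.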


<a id='step5'></a>
**Step 5.** Retrieve other genes involved in pathways in Set1pw vs Set2pw --> Set1g', Set2g'

In [14]:
'''
Use IdListHandler to retrieve a list of Entrez_Gene_IDs corresponding to Wikipathway_IDs for Drug Set 1
'''
set1g_other = ih.list_handler(input_id_list=set1pw, input_type='wikipathway_id', output_type='entrez_gene_id')
print('Other Entrez Gene IDs related to Drugs in Drug Set 1 is: {}'.format(set1g_other))

['pathway.wikipathways.id']
pathway.wikipathways.id:WP2882
Fetching 316 gene(s) . . .
Number of IDs from mygene.info related to this query pathway.wikipathways.id:WP2882 is : 316
['pathway.wikipathways.id']
pathway.wikipathways.id:WP692
Fetching 17 gene(s) . . .
Number of IDs from mygene.info related to this query pathway.wikipathways.id:WP692 is : 17
['pathway.wikipathways.id']
pathway.wikipathways.id:WP3925
Fetching 91 gene(s) . . .
Number of IDs from mygene.info related to this query pathway.wikipathways.id:WP3925 is : 91
['pathway.wikipathways.id']
pathway.wikipathways.id:WP100
Fetching 20 gene(s) . . .
Number of IDs from mygene.info related to this query pathway.wikipathways.id:WP100 is : 20
['pathway.wikipathways.id']
pathway.wikipathways.id:WP15
Fetching 83 gene(s) . . .
Number of IDs from mygene.info related to this query pathway.wikipathways.id:WP15 is : 83
['pathway.wikipathways.id']
pathway.wikipathways.id:WP3940
Fetching 49 gene(s) . . .
Number of IDs from mygene.info relat

In [15]:
'''
Use IdListHandler to retrieve a list of Entrez_Gene_IDs corresponding to Wikipathway_IDs for Drug Set 2
'''
set2g_other = ih.list_handler(input_id_list=set2pw, input_type='wikipathway_id', output_type='entrez_gene_id')
print('Other Entrez Gene IDs related to Drugs in Drug Set 2 is: {}'.format(set2g_other))

['pathway.wikipathways.id']
pathway.wikipathways.id:WP2363
Fetching 32 gene(s) . . .
Number of IDs from mygene.info related to this query pathway.wikipathways.id:WP2363 is : 32
['pathway.wikipathways.id']
pathway.wikipathways.id:WP2361
Fetching 29 gene(s) . . .
Number of IDs from mygene.info related to this query pathway.wikipathways.id:WP2361 is : 29
['pathway.wikipathways.id']
pathway.wikipathways.id:WP2377
Fetching 170 gene(s) . . .
Number of IDs from mygene.info related to this query pathway.wikipathways.id:WP2377 is : 170
['pathway.wikipathways.id']
pathway.wikipathways.id:WP2446
Fetching 89 gene(s) . . .
Number of IDs from mygene.info related to this query pathway.wikipathways.id:WP2446 is : 89
Other Entrez Gene IDs related to Drugs in Drug Set 2 is: ['84823', '1019', '5879', '5426', '7298', '5601', '5604', '352954', '5156', '317', '56992', '5308', '55299', '641', '5290', '5931', '7490', '7186', '2931', '81620', '6117', '8819', '675', '7272', '3015', '4176', '5983', '962', '54891
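
The four per-pathway fetches for Drug Set 2 add up to 32 + 29 + 170 + 89 = 320 genes, while the set comparison in Step 6 counts 292 unique IDs, which suggests that some genes sit in more than one pathway. A minimal sketch of how such repeats could be identified, assuming the genes are kept grouped by pathway. The helper `genes_in_multiple_pathways` and the toy IDs are hypothetical:

```python
from collections import Counter

def genes_in_multiple_pathways(pathway_genes):
    """Return {gene: number of pathways} for genes found in more than one pathway."""
    counts = Counter(g for genes in pathway_genes.values() for g in set(genes))
    return {g: n for g, n in counts.items() if n > 1}

# Toy data: pathway ID -> member gene IDs (placeholders, not real annotations).
toy = {"pwA": ["1", "2", "3"], "pwB": ["2", "3", "4"], "pwC": ["3"]}

repeats = genes_in_multiple_pathways(toy)
total = sum(len(genes) for genes in toy.values())   # entries across all pathways
unique = len(set().union(*toy.values()))            # distinct genes

print(repeats, total, unique)
```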

<a id='step6'></a>
**Step 6.** Execute set comparison analysis to return the set of genes that is uniquely targeted by poorly tolerated drugs (i.e. affected directly or indirectly by poorly tolerated drugs, but not affected by well-tolerated drugs)

In [16]:
'''
Get Unique Entrez Gene IDs for both sets
'''
set1g_other_unique = set(set1g_other)
set2g_other_unique = set(set2g_other)
print('Total number of unique genes in gene set 1: {}'.format(len(set1g_other_unique)))
print('Total number of unique genes in gene set 2: {}'.format(len(set2g_other_unique)))

Total number of unique genes in gene set 1: 641
Total number of unique genes in gene set 2: 292


In [17]:
'''
Find the set of genes that is uniquely targeted by poorly tolerated drugs (i.e. only present in set2g_other_unique)
'''
set2g_only = set2g_other_unique - set1g_other_unique
print('Total number of genes uniquely targeted by poorly tolerated drugs: {}'.format(len(set2g_only)))

Total number of genes uniquely targeted by poorly tolerated drugs: 265
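
The counts above also fix the other two regions of the comparison: 292 - 265 = 27 genes are shared by both sets, and 641 - 27 = 614 genes are reached only through the well-tolerated drugs. A small sketch of computing all three regions at once; the `partition` helper and the toy IDs are illustrative assumptions, and in the notebook the inputs would be `set1g_other` and `set2g_other`:

```python
def partition(ids_a, ids_b):
    """Split two ID collections into (only in A, in both, only in B)."""
    a, b = set(ids_a), set(ids_b)
    return a - b, a & b, b - a

# Toy gene ID lists (placeholders); duplicates are collapsed by set().
only1, shared, only2 = partition(["1", "2", "3", "3"], ["3", "4"])

print('Only in set 1: {}'.format(sorted(only1)))
print('Shared: {}'.format(sorted(shared)))
print('Only in set 2: {}'.format(sorted(only2)))
```

Reporting the shared region alongside the set-2-only genes makes it easier to check that the subtraction removed what was expected.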
