### **Breast Cancer Drug Discovery Project Using the ChEMBL Database**

Breast cancer is a heterogeneous and often aggressive malignancy that originates in breast tissue. It is characterized by the uncontrolled growth of abnormal cells, which can invade surrounding tissue and spread to other parts of the body. Computational drug discovery for breast cancer pays particular attention to the Epidermal Growth Factor Receptor (EGFR) protein family. This family regulates cell proliferation, survival, and differentiation, and aberrant activation of EGFR signaling is frequently implicated in breast cancer progression, which makes it a promising target for novel therapeutics.

The [ChEMBL Database](https://www.ebi.ac.uk/chembl/g/) is a large, manually curated collection of bioactivity data compiled from the scientific literature and covering thousands of biological targets.

**Install the ChEMBL web resource client library:** `pip install chembl_webresource_client` ([GitHub](https://github.com/chembl/chembl_webresource_client), [NIH/PMC article](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4489243/))

The client accesses the ChEMBL web services over HTTPS, caches results on the local file system for fast retrieval, and exposes a query interface modeled on the Django QuerySet.
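To make the QuerySet-style behavior concrete, here is a toy sketch of a chainable, lazily evaluated query with a result cache. It is not the client's implementation: `LazyQuery`, `_FAKE_REMOTE`, and `remote_calls` are hypothetical names invented for this illustration, and the three records are made-up placeholders rather than ChEMBL data. The point is only that chaining `filter` calls builds up conditions without fetching anything, and that a repeated query is answered from the cache.

```python
from typing import Dict, List, Tuple

# Hypothetical stand-in for a remote service (placeholder records, not ChEMBL data).
_FAKE_REMOTE = [
    {"id": 1, "standard_type": "IC50", "units": "nM"},
    {"id": 2, "standard_type": "Ki", "units": "nM"},
    {"id": 3, "standard_type": "IC50", "units": "uM"},
]


class LazyQuery:
    """Toy chainable query: filters accumulate, data is fetched on first use."""

    _cache: Dict[Tuple, List[dict]] = {}  # shared cache keyed by the filter conditions
    remote_calls = 0                      # counts simulated round-trips to the service

    def __init__(self, conditions=()):
        self._conditions = tuple(conditions)

    def filter(self, **kwargs):
        # Return a new query with the extra conditions; nothing is fetched here.
        return LazyQuery(self._conditions + tuple(kwargs.items()))

    def _fetch(self):
        key = tuple(sorted(self._conditions))
        if key not in LazyQuery._cache:
            LazyQuery.remote_calls += 1  # only a cache miss reaches the "service"
            LazyQuery._cache[key] = [
                record for record in _FAKE_REMOTE
                if all(record.get(field) == wanted for field, wanted in key)
            ]
        return LazyQuery._cache[key]

    def __iter__(self):
        return iter(self._fetch())

    def __len__(self):
        return len(self._fetch())


query = LazyQuery().filter(standard_type="IC50").filter(units="nM")
calls_before_use = LazyQuery.remote_calls  # still 0: building the query fetched nothing
first = list(query)                        # first use triggers the simulated fetch
second = list(query)                       # second use is served from the cache
print(calls_before_use, LazyQuery.remote_calls, first)
```

In the toy, chained filters are combined with a logical AND, which matches how the chained `filter` calls later in this notebook are used.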

#### **Importing Libraries**

In [2]:
import pandas as pd
from chembl_webresource_client.new_client import new_client 

#### **Query Targets - EGFR Protein Family**

In [62]:
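target = new_client.target
# Search the target resource; hits come back ranked by the 'score' column.
target_query = target.search('CHEMBL2363049')
# Convert the query results into a DataFrame for inspection.
targets = pd.DataFrame.from_dict(target_query)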
targets

Unnamed: 0,cross_references,organism,pref_name,score,species_group_flag,target_chembl_id,target_components,target_type,tax_id
0,[],Homo sapiens,Epidermal growth factor receptor,12.0,False,CHEMBL2363049,"[{'accession': 'P04626', 'component_descriptio...",PROTEIN FAMILY,9606
1,"[{'xref_id': 'Q15303', 'xref_name': None, 'xre...",Homo sapiens,Receptor protein-tyrosine kinase erbB-4,11.0,False,CHEMBL3009,"[{'accession': 'Q15303', 'component_descriptio...",SINGLE PROTEIN,9606
2,"[{'xref_id': 'P21860', 'xref_name': None, 'xre...",Homo sapiens,Receptor tyrosine-protein kinase erbB-3,10.0,False,CHEMBL5838,"[{'accession': 'P21860', 'component_descriptio...",SINGLE PROTEIN,9606
3,[],Homo sapiens,ErbB-2/ErbB-3 heterodimer,10.0,False,CHEMBL4630723,"[{'accession': 'P04626', 'component_descriptio...",PROTEIN COMPLEX,9606
4,"[{'xref_id': 'P04626', 'xref_name': None, 'xre...",Homo sapiens,Receptor protein-tyrosine kinase erbB-2,9.0,False,CHEMBL1824,"[{'accession': 'P04626', 'component_descriptio...",SINGLE PROTEIN,9606
5,[],Homo sapiens,Epidermal growth factor receptor and ErbB2 (HE...,8.0,False,CHEMBL2111431,"[{'accession': 'P04626', 'component_descriptio...",PROTEIN FAMILY,9606
6,"[{'xref_id': 'P00533', 'xref_name': None, 'xre...",Homo sapiens,Epidermal growth factor receptor erbB1,7.0,False,CHEMBL203,"[{'accession': 'P00533', 'component_descriptio...",SINGLE PROTEIN,9606
7,[],Homo sapiens,FASN/HER2,7.0,False,CHEMBL4106134,"[{'accession': 'P04626', 'component_descriptio...",PROTEIN COMPLEX,9606
8,[],Homo sapiens,MER intracellular domain/EGFR extracellular do...,6.0,False,CHEMBL3137284,"[{'accession': 'P00533', 'component_descriptio...",CHIMERIC PROTEIN,9606
9,[],Homo sapiens,EGFR/PPP1CA,6.0,False,CHEMBL4523747,"[{'accession': 'P00533', 'component_descriptio...",PROTEIN-PROTEIN INTERACTION,9606


In [63]:
# Take the first (highest-scoring) hit, which is the EGFR protein family entry.
selected_target = targets.target_chembl_id[0]
selected_target

'CHEMBL2363049'
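Selecting by position works here because the EGFR family entry is the top-scoring hit, but the ranking could change between database releases. A more defensive option is to select by identifier and fail loudly if it is missing. The sketch below uses plain Python on a few rows copied from the search output above; `select_target` is a hypothetical helper written for this example, not part of the client library.

```python
# A subset of the rows and columns shown in the search output above.
rows = [
    {"pref_name": "Epidermal growth factor receptor",
     "target_chembl_id": "CHEMBL2363049", "target_type": "PROTEIN FAMILY"},
    {"pref_name": "Receptor protein-tyrosine kinase erbB-4",
     "target_chembl_id": "CHEMBL3009", "target_type": "SINGLE PROTEIN"},
    {"pref_name": "Receptor protein-tyrosine kinase erbB-2",
     "target_chembl_id": "CHEMBL1824", "target_type": "SINGLE PROTEIN"},
    {"pref_name": "Epidermal growth factor receptor erbB1",
     "target_chembl_id": "CHEMBL203", "target_type": "SINGLE PROTEIN"},
]


def select_target(rows, chembl_id):
    """Return the single row with the given ChEMBL ID, or raise if it is absent or ambiguous."""
    matches = [row for row in rows if row["target_chembl_id"] == chembl_id]
    if len(matches) != 1:
        raise LookupError(f"expected exactly one hit for {chembl_id}, found {len(matches)}")
    return matches[0]


chosen = select_target(rows, "CHEMBL2363049")
print(chosen["target_chembl_id"], chosen["target_type"])
```

With the DataFrame from the cell above, the equivalent check would be a boolean mask such as `targets[targets.target_chembl_id == 'CHEMBL2363049']`.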

#### **Retrieve bioactivity data for EGFR (CHEMBL2363049) reported as IC50 values in nanomolar (nM)**

In [65]:
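activity = new_client.activity
# Keep activities for the selected target that are IC50 measurements reported in nM.
res = activity.filter(target_chembl_id=selected_target).filter(standard_type="IC50", units='nM')
# Convert the query results into a DataFrame.
df = pd.DataFrame.from_dict(res)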
df

Unnamed: 0,action_type,activity_comment,activity_id,activity_properties,assay_chembl_id,assay_description,assay_type,assay_variant_accession,assay_variant_mutation,bao_endpoint,...,target_organism,target_pref_name,target_tax_id,text_value,toid,type,units,uo_units,upper_value,value
0,,,3261308,[],CHEMBL1105218,Inhibition of EGFR Leu858Arg and Thr790Met mut...,B,P00533,"L858R,T790M",BAO_0000190,...,Homo sapiens,Epidermal growth factor receptor,9606,,,IC50,nM,UO_0000065,,140.0
1,,,3261309,[],CHEMBL1105218,Inhibition of EGFR Leu858Arg and Thr790Met mut...,B,P00533,"L858R,T790M",BAO_0000190,...,Homo sapiens,Epidermal growth factor receptor,9606,,,IC50,nM,UO_0000065,,1500.0
2,,,3261310,[],CHEMBL1105218,Inhibition of EGFR Leu858Arg and Thr790Met mut...,B,P00533,"L858R,T790M",BAO_0000190,...,Homo sapiens,Epidermal growth factor receptor,9606,,,IC50,nM,UO_0000065,,1000.0
3,,,3261311,[],CHEMBL1105218,Inhibition of EGFR Leu858Arg and Thr790Met mut...,B,P00533,"L858R,T790M",BAO_0000190,...,Homo sapiens,Epidermal growth factor receptor,9606,,,IC50,nM,UO_0000065,,1500.0
4,,,3261312,[],CHEMBL1105218,Inhibition of EGFR Leu858Arg and Thr790Met mut...,B,P00533,"L858R,T790M",BAO_0000190,...,Homo sapiens,Epidermal growth factor receptor,9606,,,IC50,nM,UO_0000065,,190.0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
158,,415532,17779169,[],CHEMBL3888771,"Omnia Assay: Briefly, 10× stocks of EGFR-WT (P...",B,,,BAO_0000190,...,Homo sapiens,Epidermal growth factor receptor,9606,,,IC50,nM,UO_0000065,,5.0
159,,415533,17779170,[],CHEMBL3888771,"Omnia Assay: Briefly, 10× stocks of EGFR-WT (P...",B,,,BAO_0000190,...,Homo sapiens,Epidermal growth factor receptor,9606,,,IC50,nM,UO_0000065,,5.0
160,,415534,17779171,[],CHEMBL3888771,"Omnia Assay: Briefly, 10× stocks of EGFR-WT (P...",B,,,BAO_0000190,...,Homo sapiens,Epidermal growth factor receptor,9606,,,IC50,nM,UO_0000065,,200.0
161,,415535,17779172,[],CHEMBL3888771,"Omnia Assay: Briefly, 10× stocks of EGFR-WT (P...",B,,,BAO_0000190,...,Homo sapiens,Epidermal growth factor receptor,9606,,,IC50,nM,UO_0000065,,65.0
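The visible rows come from two different assays: one against the L858R,T790M double mutant and one whose description mentions EGFR-WT. Before pooling values it may be worth summarizing them per assay. The sketch below does this in plain Python for the nine rows shown above only; the elided rows are not included, so the numbers describe the preview and not the full result set.

```python
from collections import defaultdict
from statistics import median

# (assay_chembl_id, assay_variant_mutation, type, units, value), copied from the
# nine rows visible in the output above; the elided rows are not represented.
visible_rows = [
    ("CHEMBL1105218", "L858R,T790M", "IC50", "nM", 140.0),
    ("CHEMBL1105218", "L858R,T790M", "IC50", "nM", 1500.0),
    ("CHEMBL1105218", "L858R,T790M", "IC50", "nM", 1000.0),
    ("CHEMBL1105218", "L858R,T790M", "IC50", "nM", 1500.0),
    ("CHEMBL1105218", "L858R,T790M", "IC50", "nM", 190.0),
    ("CHEMBL3888771", None, "IC50", "nM", 5.0),
    ("CHEMBL3888771", None, "IC50", "nM", 5.0),
    ("CHEMBL3888771", None, "IC50", "nM", 200.0),
    ("CHEMBL3888771", None, "IC50", "nM", 65.0),
]

# Sanity check: every previewed record is an IC50 value in nM, as the filter requested.
all_ic50_nm = all(kind == "IC50" and units == "nM" for _, _, kind, units, _ in visible_rows)

# Group the values by assay and recorded variant.
values_by_assay = defaultdict(list)
for assay_id, mutation, _kind, _units, value in visible_rows:
    values_by_assay[(assay_id, mutation)].append(value)

summary = {
    key: {"n": len(vals), "min": min(vals), "median": median(vals), "max": max(vals)}
    for key, vals in values_by_assay.items()
}

for (assay_id, mutation), stats in summary.items():
    print(assay_id, mutation or "no variant recorded", stats)
```

On the full DataFrame, the same grouping could be done with `df.groupby('assay_chembl_id')`, after converting the `value` column to a numeric type if it is returned as text.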
