<a href="https://colab.research.google.com/github/cmap/lincs-workshop-2020/blob/main/Clue_API_Demo.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# **Introduction**

# **API endpoints and SQL Tables**
See https://docs.google.com/presentation/d/1oqPhvKB1La5y7OBvILIDFqPam6UJkSNIU3RpTywfUR0/edit#slide=id.g887b3f139c_0_50

# **Code setup**

In [None]:
import requests
import pandas as pd
import os
# display the value of every expression in a cell, not only the last one
from IPython.core.interactiveshell import InteractiveShell
InteractiveShell.ast_node_interactivity = "all"

# set the API key: replace the value below with your own key, copied from CLUE
os.environ['API_KEY'] = '4276e9d4fd7b366ab95aef720416610c1'

def query_clue_api(url, data=None, params=None):
  '''Query a CLUE API endpoint and return the response as a Pandas dataframe.

  Sends a POST request when data is provided, otherwise a GET request.
  '''
  api_key = os.environ.get('API_KEY')
  assert api_key is not None, 'Environment variable API_KEY not set'
  headers = {
    'Accept': 'application/json',
    'user_key': api_key,
  }
  if data is None:
    # GET request, with optional query parameters
    response = requests.get(url, headers=headers, params=params)
  else:
    # POST request when data (e.g. a SQL query) is provided
    response = requests.post(url, headers=headers, data=data)

  # fail early on HTTP errors instead of trying to parse an error body
  response.raise_for_status()
  df = pd.DataFrame.from_dict(response.json())
  return df
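
The setup cell stores the API key directly in the notebook, which makes it easy to leak when the notebook is shared. As a minimal sketch of an alternative, the hypothetical helper `ensure_api_key` below (not part of the original notebook) prompts for the key only when `API_KEY` is not already set in the environment.

```python
import os
from getpass import getpass


def ensure_api_key(env=os.environ, prompt=getpass):
    """Return the API key, asking for it once if API_KEY is not set.

    Hypothetical helper: `env` and `prompt` are parameters only so the
    function can be exercised without an interactive session.
    """
    if not env.get('API_KEY'):
        # getpass hides the typed value, so the key is not echoed in the output
        env['API_KEY'] = prompt('CLUE API key: ')
    return env['API_KEY']
```

Calling `ensure_api_key()` once before the first `query_clue_api` call would leave the rest of the notebook unchanged, since the key still ends up in `os.environ['API_KEY']`.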


# **Column names for each table**

To list the columns of the siginfo table, see the example below. Replace siginfo with any other table name from the section above to get the columns of that table.

In [None]:
url = 'https://dev-api.clue.io/api/cmap_table_metadata'
params = (
    ('tableId', 'siginfo'),
)
df = query_clue_api(url, params=params)
df

Unnamed: 0,name,type
0,bead_batch,STRING
1,nearest_dose,FLOAT
2,pert_dose,FLOAT
3,pert_dose_unit,STRING
4,pert_idose,STRING
5,pert_itime,STRING
6,pert_time,FLOAT
7,pert_time_unit,STRING
8,cell_mfc_name,STRING
9,pert_mfc_id,STRING
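
To look up the columns of several tables in one go, the request parameters can be generated per table. The sketch below only builds the `(url, params)` pairs; the helper name `metadata_requests` is hypothetical, and it assumes that `geneinfo` and `compoundinfo` (the table names used in the SQL queries of this notebook) are also accepted as `tableId` values.

```python
METADATA_URL = 'https://dev-api.clue.io/api/cmap_table_metadata'


def metadata_requests(table_ids, url=METADATA_URL):
    """Return one (url, params) pair per table, in the form query_clue_api expects."""
    return [(url, (('tableId', table_id),)) for table_id in table_ids]


# Table names taken from the queries in this notebook (assumed valid tableId values).
requests_to_make = metadata_requests(['siginfo', 'geneinfo', 'compoundinfo'])
for url, params in requests_to_make:
    print(url, dict(params))
    # In the notebook, each pair would be passed on as:
    # columns = query_clue_api(url, params=params)
```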


# **Use cases and examples**

**Select all gene ids**

In [None]:
url = 'https://dev-api.clue.io/api/cmap_genes'
data = {
  'data': 'select gene_id from geneinfo'
}
df = query_clue_api(url, data=data)
df.head()

Unnamed: 0,gene_id
0,8420
1,10301
2,11257
3,23642
4,55384


**Find the names, targets and structures of compounds with an MoA of 'CDK inhibitor' (first 10 rows)**

In [None]:
url = 'https://dev-api.clue.io/api/cmap_compounds'
data = {
  'data': "select pert_id,cmap_name,target,canonical_smiles from compoundinfo where moa='CDK inhibitor' LIMIT 10"
}
df = query_clue_api(url, data=data)
df

Unnamed: 0,pert_id,cmap_name,target,canonical_smiles
0,BRD-K45293975,7-hydroxystaurosporine,CDK4,CN[C@@H]1C[C@H]2O[C@@](C)([C@@H]1OC)n1c3ccccc3...
1,BRD-K45293975,7-hydroxystaurosporine,CHEK1,CN[C@@H]1C[C@H]2O[C@@](C)([C@@H]1OC)n1c3ccccc3...
2,BRD-K45293975,7-hydroxystaurosporine,CHEK2,CN[C@@H]1C[C@H]2O[C@@](C)([C@@H]1OC)n1c3ccccc3...
3,BRD-K45293975,7-hydroxystaurosporine,PDPK1,CN[C@@H]1C[C@H]2O[C@@](C)([C@@H]1OC)n1c3ccccc3...
4,BRD-K45293975,7-hydroxystaurosporine,PRKCA,CN[C@@H]1C[C@H]2O[C@@](C)([C@@H]1OC)n1c3ccccc3...
5,BRD-K45293975,7-hydroxystaurosporine,CDK1,CN[C@@H]1C[C@H]2O[C@@](C)([C@@H]1OC)n1c3ccccc3...
6,BRD-K78373679,Ro-3306,CDK1,O=C1N=C(NCc2cccs2)SC1=C/c1ccc2ncccc2c1
7,BRD-K87932577,NSC-693868,CDK5,Nc1n[nH]c2nc3ccccc3nc12
8,BRD-K87932577,NSC-693868,GSK3A,Nc1n[nH]c2nc3ccccc3nc12
9,BRD-K87932577,NSC-693868,CDK1,Nc1n[nH]c2nc3ccccc3nc12


**Count the number of compounds in each MoA group**


In [None]:
url = 'https://dev-api.clue.io/api/cmap_compounds'
data = {'data': 'select moa,count(*) as num_cp from compoundinfo where moa is not null group by moa order by num_cp desc'}
df = query_clue_api(url, data=data)
df


Unnamed: 0,moa,num_cp
0,Serotonin receptor antagonist,276
1,Dopamine receptor antagonist,218
2,Dopamine receptor agonist,216
3,VEGFR inhibitor,199
4,Adrenergic receptor antagonist,195
...,...,...
652,Neuropeptide receptor ligand,1
653,Oxytocin receptor antagonist,1
654,Cytokine production inhibitor,1
655,PAR agonist,1
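
One caveat about these counts: the CDK inhibitor example earlier shows several rows per compound in compoundinfo (one per target), so `count(*)` may count compound and target pairs rather than distinct compounds. The sketch below illustrates the difference on those ten sample rows, using an in-memory SQLite table as a stand-in for the real one. Whether the full table is structured this way, and whether the CLUE endpoint accepts `count(distinct ...)`, are assumptions that have not been checked here.

```python
import sqlite3

# (pert_id, target) pairs copied from the CDK inhibitor output earlier in the notebook
rows = [
    ('BRD-K45293975', 'CDK4'), ('BRD-K45293975', 'CHEK1'),
    ('BRD-K45293975', 'CHEK2'), ('BRD-K45293975', 'PDPK1'),
    ('BRD-K45293975', 'PRKCA'), ('BRD-K45293975', 'CDK1'),
    ('BRD-K78373679', 'CDK1'),
    ('BRD-K87932577', 'CDK5'), ('BRD-K87932577', 'GSK3A'),
    ('BRD-K87932577', 'CDK1'),
]

# local stand-in table holding only the columns needed for the comparison
conn = sqlite3.connect(':memory:')
conn.execute('create table compoundinfo (pert_id text, moa text, target text)')
conn.executemany(
    "insert into compoundinfo (pert_id, moa, target) values (?, 'CDK inhibitor', ?)",
    rows,
)

moa, num_rows, num_cp = conn.execute(
    'select moa, count(*) as num_rows, count(distinct pert_id) as num_cp '
    'from compoundinfo where moa is not null group by moa'
).fetchone()
print(moa, num_rows, num_cp)  # CDK inhibitor 10 3
```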


**Find all the signatures for a compound**

In [None]:
url = 'https://dev-api.clue.io/api/cmap_sigs'
data = {
  'data': "select sig_id,pert_iname,cell_iname,pert_idose,pert_itime from siginfo where pert_id='BRD-K45293975'"
}
df = query_clue_api(url, data=data)
df.head()

Unnamed: 0,sig_id,pert_iname,cell_iname,pert_idose,pert_itime
0,MOAR018_HAP1.TP53_24H:A21,7-hydroxystaurosporine,HAP1,,24 h
1,LCP001_MCF10A.PIK3CA.HH_24H:B23,7-hydroxystaurosporine,MCF10A,,24 h
2,MOAR018_HAP1.CDK4_24H:A04,7-hydroxystaurosporine,HAP1,,24 h
3,MOAR018_HAP1.PRKACA_24H:A13,7-hydroxystaurosporine,HAP1,,24 h
4,MOAR018_HAP1.PIK3CA_24H:A12,7-hydroxystaurosporine,HAP1,,24 h
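
To run the same lookup for other compounds, the query string can be generated from the compound ID. This is a minimal sketch: `signature_query` is a hypothetical helper, and it applies no escaping, so it should only be given trusted identifiers such as the BRD IDs returned by earlier queries.

```python
def signature_query(pert_id,
                    columns=('sig_id', 'pert_iname', 'cell_iname',
                             'pert_idose', 'pert_itime')):
    """Build the siginfo query used above for an arbitrary compound ID."""
    # No escaping is applied: pass only trusted identifiers.
    return f"select {','.join(columns)} from siginfo where pert_id='{pert_id}'"


# BRD-K78373679 is Ro-3306 in the CDK inhibitor output earlier in the notebook
data = {'data': signature_query('BRD-K78373679')}
print(data['data'])
```

The resulting `data` dictionary has the same shape as the one in the cell above, so it could be passed to `query_clue_api(url, data=data)` unchanged.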
