#### Using LINCS data resources to discover which drugs can affect the expression level of tumor-specific TFs
##### LINCS description (http://www.bioconductor.org/packages/devel/data/experiment/vignettes/signatureSearchData/inst/doc/signatureSearchData.html#4_LINCS_Signature_Database)
Reference: signatureSearch: environment for gene expression signature searching and functional interpretation. Nucleic Acids Research, 2020. <br>
In 2017, the LINCS Consortium generated a similar but much larger data set in which the total number of gene expression signatures was scaled up to over one million. This was achieved by switching to a much more cost-effective gene expression profiling technology, the L1000 assay (Peck et al. 2006; Edgar, Domrachev, and Lash 2002). <font color="red">The current set of perturbations covered by the LINCS data set includes 19,811 drug-like small molecules applied at variable concentrations and treatment times to ~70 human non-cancer (normal) and cancer cell lines. </font>Additionally, it includes several thousand genetic perturbagens composed of gene knockdown and over-expression experiments.

The L1000 assay, used for generating the LINCS data, measures the expression of 978 landmark genes and 80 control genes by loading amplified mRNA populations onto beads and then detecting their abundance with a fluorescence-based method (Peck et al. 2006). The expression of 11,350 additional genes is imputed from the landmark genes, using a large collection of Affymetrix gene chips as training data (Edgar, Domrachev, and Lash 2002).

The LINCS data have been pre-processed by the Broad Institute to <font color="green">5 different levels</font> and are available for download from GEO. <br>Level 1 data are the raw mean fluorescent intensity values that come directly from the Luminex scanner. Level 2 data are the expression intensities of the 978 landmark genes. These have been normalized and used to impute the expression of the additional 11,350 genes, forming Level 3 data. A robust z-scoring procedure was used to generate differential expression values from the normalized profiles (Level 4). Finally, a moderated z-scoring procedure was applied to the replicated samples of each experiment (mostly 3 replicates) to compute a weighted average signature (Level 5). For a more detailed description of the preprocessing methods used by the LINCS project, readers may refer to the LINCS user guide.
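
As a rough illustration of the Level 4 step, the sketch below computes a robust z-score for one gene across a set of wells, using the median and the scaled median absolute deviation in place of the mean and standard deviation. The helper name `robust_z` and the toy intensities are hypothetical; the reference population and scaling actually used by LINCS are described in its user guide and may differ from this simplification.

```r
# Hypothetical helper: robust z-score of a numeric vector.
# stats::mad() applies the 1.4826 consistency constant by default.
robust_z <- function(x) {
  s <- mad(x)
  if (s == 0) return(rep(0, length(x)))  # guard against a constant profile
  (x - median(x)) / s
}

# Toy normalized intensities of one gene across six wells (made-up numbers)
expr <- c(8.1, 8.3, 7.9, 8.0, 8.2, 11.5)
z <- robust_z(expr)
round(z, 2)
```

Because the median and MAD are barely affected by the single outlying well, that well receives a large positive score while the remaining wells stay close to zero.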

<font color="red">Disregarding replicates, the LINCS data set contains 473,647 signatures with unique cell type and treatment combinations. This includes 19,811 drug-like small molecules tested on different cell lines at multiple concentrations and treatment times.</font> In addition to compounds, several thousand genetic perturbations (gene knock-downs and over-expressions) have been tested. Currently, the data described in this vignette are restricted to signatures of small molecule treatments across different cell lines. However, users have the option to assemble any custom collection of the LINCS data. <b>For consistency, only signatures at one specific concentration (10μM) and one time point (24h) have been selected for each small molecule in the default collection.</b> These choices are similar to the conditions used in primary high-throughput compound screens of cell lines. Since the selected compound concentration and treatment duration have not yet been tested by LINCS across all cell types, a subset of compounds had to be selected that best met the chosen treatment requirements. <b>This left us with 8,140 compounds that were uniformly tested at the chosen concentration and treatment time, but across variable numbers of cell lines. The total number of expression signatures meeting this requirement is 45,956, while the total number of cell lines included in this data set is 30.</b>

#### Two levels of LINCS data are stored in signatureSearchData
##### lincs(EH3226), lincs_expr(EH3227)
(1) lincs contains moderated z-scores from differential expression (DE) analysis of 12,328 genes from 8,140 compound treatments of 30 cell lines, corresponding to a total of 45,956 signatures. Users can define DEGs from this LINCS Level 5 z-score database by setting z-score cutoffs (e.g. +2 and -2) for up- and down-regulated genes.

(2) lincs_expr contains gene expression intensity values from 5,925 compound treatments of 30 cell lines, corresponding to a total of 38,824 signatures.
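
The cutoff-based DEG definition mentioned in (1) can be sketched as follows. The helper `define_degs` and the toy signature are hypothetical and only illustrate the idea of thresholding a named z-score vector at +2 and -2.

```r
# Hypothetical helper: split a named z-score vector into up/down gene sets.
define_degs <- function(z, cutoff = 2) {
  list(up   = names(z)[which(z >=  cutoff)],
       down = names(z)[which(z <= -cutoff)])
}

# Toy signature; names are Entrez IDs as in the lincs row names,
# values are made up for illustration
sig <- c("5720" = -2.4, "466" = 0.3, "6009" = 2.1, "2309" = -0.8)
degs <- define_degs(sig)
degs
```

Wrapping the comparison in `which()` keeps missing z-scores from turning into `NA` entries in the gene sets.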

In [17]:
library(ExperimentHub)
library(HDF5Array)
library(SummarizedExperiment)
eh <- ExperimentHub() 
setwd("/data/ExtraDisk/sdd/longzhilin/Data/drugData/CMAP_LINCS_2020")

snapshotDate(): 2020-10-27



In [18]:
query(eh, c("signatureSearchData", "lincs"))
lincs_path <- eh[["EH3226"]]
h5ls(lincs_path)
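# wrap the on-disk HDF5 dataset, then load the z-score matrix into memory
lincs <- SummarizedExperiment(HDF5Array(lincs_path, name="assay"))
lincs <- as.matrix(assay(lincs))
# rows are Entrez gene IDs, columns are <drug>__<cell line>__<pert type> signatures
rownames(lincs) <- HDF5Array(lincs_path, name="rownames")[,1]
colnames(lincs) <- HDF5Array(lincs_path, name="colnames")[,1] #12328 x 45956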
lincs

ExperimentHub with 4 records
# snapshotDate(): 2020-10-27
# $dataprovider: Broad Institute, DrugBank, Broad Institute, STITCH
# $species: Homo sapiens
# $rdataclass: character, list
# additional mcols(): taxonomyid, genome, description,
#   coordinate_1_based, maintainer, rdatadateadded, preparerclass, tags,
#   rdatapath, sourceurl, sourcetype 
# retrieve records with, e.g., 'object[["EH3226"]]' 

           title             
  EH3226 | lincs             
  EH3227 | lincs_expr        
  EH3228 | dtlink_db_clue_sti
  EH3233 | taurefList        

see ?signatureSearchData and browseVignettes('signatureSearchData') for documentation

loading from cache



Unnamed: 0_level_0,group,name,otype,dclass,dim
Unnamed: 0_level_1,<chr>,<chr>,<chr>,<chr>,<chr>
0,/,assay,H5I_DATASET,FLOAT,12328 x 45956
1,/,colnames,H5I_DATASET,STRING,45956 x 1
2,/,rownames,H5I_DATASET,STRING,12328 x 1


Unnamed: 0,BRD-A03772856__CD34__trt_cp,trichostatin-a__CD34__trt_cp,geldanamycin__CD34__trt_cp,KUC107191N__CD34__trt_cp,BRD-A35207501__CD34__trt_cp,iloprost__CD34__trt_cp,BRD-A72498772__CD34__trt_cp,BRD-A72796238__CD34__trt_cp,BRD-A73513677__CD34__trt_cp,BRD-A73558769__CD34__trt_cp,⋯,MW-MITF24__MCF7__trt_cp,l-buthionine-sulfoximine__PC3__trt_cp,alpha-tocopherol__PC3__trt_cp,cisplatin__PC3__trt_cp,iniparib__PC3__trt_cp,JMW-MITF1__PC3__trt_cp,bortezomib__PC3__trt_cp,selumetinib__PC3__trt_cp,glutamine__PC3__trt_cp,MW-MITF24__PC3__trt_cp
5720,0.36540002,-1.9333,-1.78014994,-0.61925000,-0.729250073,-2.01719809,0.18455000,-1.1793000,0.34680000,0.10225001,⋯,-2.1606,-0.2913,-1.2731,-0.5659,0.4571,0.0308,-0.2913,-0.9489,-1.0550,-1.9929
466,0.69385004,-4.0653,-2.45910001,0.05739999,1.069949985,-1.52120280,-0.19575000,1.3569000,0.32670000,0.36719999,⋯,-1.9576,-0.6859,0.5907,-0.2468,0.7582,0.4957,-1.0830,0.3153,1.3292,-0.5015
6009,0.55435002,0.1850,0.22250000,0.05124998,0.336199999,-1.32433784,0.78560001,-0.2471000,1.50240004,-0.47415000,⋯,-1.9373,-0.7577,-1.8149,0.0000,0.3941,-1.6190,0.1462,-2.1747,-0.2952,-2.1053
2309,-0.31579998,2.2660,0.67035002,-0.39039999,1.141200066,0.91430640,0.45165002,0.1099000,-0.58609998,-1.34329998,⋯,0.9808,-0.0848,-1.5376,0.6194,-0.2978,0.5139,2.9324,-0.9933,-0.0140,-1.1789
387,0.37180001,-0.2558,-2.59875011,-2.12490010,1.495700002,-3.34633827,-0.03835000,-0.9017001,1.18655002,0.68274999,⋯,-1.5168,0.0997,-0.3862,0.1973,-0.4246,-0.5682,0.6157,-0.3696,0.3436,-3.2787
3553,0.70824999,-0.5149,-0.73619998,-0.41515002,1.960999966,0.22129090,-0.20070001,0.8321000,2.93820000,-0.08945002,⋯,3.2767,0.0000,-0.1367,-0.6745,-1.0342,-1.3800,-0.9713,1.2338,-1.3174,-0.6564
427,0.18295000,-1.6045,0.42860001,0.30954999,1.228850007,0.06338993,0.44700000,0.3969500,0.57410002,-0.11010003,⋯,0.5427,-0.5669,-0.7753,-0.8029,1.6375,1.7251,1.2290,0.0243,-0.7690,-0.6058
5898,-0.40434998,3.5042,2.97979999,-0.84810001,1.905500054,0.31654096,-0.01730000,-0.1905500,-0.82900000,-0.52789998,⋯,-0.3057,0.1609,0.2776,0.0750,-2.8649,0.5809,1.7948,-0.0093,1.2902,0.2903
23365,0.49500000,0.4630,0.48894998,-0.25070000,0.521200001,-0.62915254,-0.67110002,-0.7869500,0.49564999,1.38380003,⋯,-0.7719,-0.3665,0.7769,0.3405,-0.6687,0.4276,-2.5250,0.0963,0.0482,0.0963
6657,-0.58120000,2.3181,1.18390000,0.32394999,0.963199973,1.12940836,-0.09670001,1.5269001,-0.28994998,-0.26674998,⋯,2.3400,0.8720,1.4368,0.5604,-1.3073,-0.8898,-0.4415,0.7747,-0.3321,2.5004


In [31]:
#### load tumor-specific TFs
tumor.specific.TFs <- readRDS("/data/active_data/lzl/RenalTumor-20200713/DataAnalysis-20210803/scATAC/5.Motif/Analysis/tumor.specific.TFs.rds")
idx <- which(tumor.specific.TFs$Name %in% c("HOXC5", "VENTX", "ISL1", "OTP"))
tumor.specific.TFs <- tumor.specific.TFs[idx,]
#convert to entrez id
source(file = "/home/longzhilin/Analysis_Code/IDConvert.R")
gene.info <- IDConvert(tumor.specific.TFs$Name, method = "clusterProfiler", fromType = "SYMBOL", toType = "ENTREZID")
gene.info

'select()' returned 1:1 mapping between keys and columns



Unnamed: 0_level_0,SYMBOL,ENTREZID
Unnamed: 0_level_1,<chr>,<chr>
1,VENTX,27287
2,OTP,23440
3,ISL1,3670
4,HOXC5,3222


In [32]:
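# look up each TF's Entrez ID among the matrix rows; NA means the gene is not in lincs
index <- match(gene.info$ENTREZID, rownames(lincs))
# drop = FALSE keeps a matrix even if only one TF is found
lincs.TF.matrix <- lincs[na.omit(index), , drop = FALSE]
rownames(lincs.TF.matrix) <- gene.info$SYMBOL[which(!is.na(index))]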
lincs.TF.matrix
dim(lincs.TF.matrix)

Unnamed: 0,BRD-A03772856__CD34__trt_cp,trichostatin-a__CD34__trt_cp,geldanamycin__CD34__trt_cp,KUC107191N__CD34__trt_cp,BRD-A35207501__CD34__trt_cp,iloprost__CD34__trt_cp,BRD-A72498772__CD34__trt_cp,BRD-A72796238__CD34__trt_cp,BRD-A73513677__CD34__trt_cp,BRD-A73558769__CD34__trt_cp,⋯,MW-MITF24__MCF7__trt_cp,l-buthionine-sulfoximine__PC3__trt_cp,alpha-tocopherol__PC3__trt_cp,cisplatin__PC3__trt_cp,iniparib__PC3__trt_cp,JMW-MITF1__PC3__trt_cp,bortezomib__PC3__trt_cp,selumetinib__PC3__trt_cp,glutamine__PC3__trt_cp,MW-MITF24__PC3__trt_cp
VENTX,-0.86745,2.6854,2.9314,0.31655,0.2672,1.3062376,-0.37835,0.5864,-0.19235,-0.36255,⋯,1.6858,0.4772,0.6672,0.4269,-0.6599,-0.7321,-0.5295,0.8246,-0.9867,1.9501
ISL1,-0.1834,0.029,2.1084,0.86345,0.12125,0.4669212,0.6024,1.9325,0.6498,-0.45065,⋯,2.4253,2.1372,-2.3109,0.4266,-0.1666,0.6976,-3.855,1.1637,0.0608,-0.5077
HOXC5,-0.59025,3.2782,2.69445,1.32265,0.61415,2.0156755,-0.212,1.0597,-0.4698,0.28235,⋯,1.1215,0.7762,0.2086,0.1258,-0.5124,0.6798,-1.9251,0.646,-1.3305,0.4147
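
Only three of the four TFs appear in the result because OTP (Entrez ID 23440) has no row in the matrix. The toy example below, with made-up values and hypothetical object names, shows how `match()` flags the absent ID and how reusing one logical mask for both row selection and labelling keeps symbols aligned with rows.

```r
# Toy stand-in for the z-score matrix: rows are Entrez IDs (values made up)
toy <- matrix(c(0.1, -2.3, 1.5, 0.4, -0.7, 2.2), nrow = 3,
              dimnames = list(c("27287", "3670", "3222"), c("sigA", "sigB")))
genes <- data.frame(SYMBOL   = c("VENTX", "OTP", "ISL1", "HOXC5"),
                    ENTREZID = c("27287", "23440", "3670", "3222"),
                    stringsAsFactors = FALSE)

idx   <- match(genes$ENTREZID, rownames(toy))  # NA where the ID is absent
found <- !is.na(idx)
sub   <- toy[idx[found], , drop = FALSE]       # drop = FALSE keeps a matrix
rownames(sub) <- genes$SYMBOL[found]
missing_genes <- genes$SYMBOL[!found]          # report what was dropped
sub
missing_genes
```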


In [33]:
#### screen for drugs that decrease each TF's expression (z-score < -2)
drug.lists <- list()
for(i in seq_len(nrow(lincs.TF.matrix))){
    zscore <- lincs.TF.matrix[i,]
    info <- data.frame(drug = colnames(lincs.TF.matrix), zscore = zscore)
    sig.drug <- which(info$zscore < -2) 
    if(length(sig.drug)>0){
        sig.info <- info[sig.drug,]
    }else{
        sig.info <- NULL
    }
    drug.lists <- c(drug.lists, list(sig.info))
}
names(drug.lists) <- rownames(lincs.TF.matrix)
drug.lists

Unnamed: 0_level_0,drug,zscore
Unnamed: 0_level_1,<chr>,<dbl>
butoconazole__HCC515__trt_cp,butoconazole__HCC515__trt_cp,-2.043097
amcinonide__VCAP__trt_cp,amcinonide__VCAP__trt_cp,-2.013700
verteporfin__VCAP__trt_cp,verteporfin__VCAP__trt_cp,-2.372383
napelline__HT29__trt_cp,napelline__HT29__trt_cp,-2.024536
clioquinol__HT29__trt_cp,clioquinol__HT29__trt_cp,-2.086267
NS-3694__VCAP__trt_cp,NS-3694__VCAP__trt_cp,-2.178900
AS-605240__VCAP__trt_cp,AS-605240__VCAP__trt_cp,-2.451528
SJ-172550__VCAP__trt_cp,SJ-172550__VCAP__trt_cp,-2.893954
JAS07-005__VCAP__trt_cp,JAS07-005__VCAP__trt_cp,-2.086582
BRD-A38425832__MCF7__trt_cp,BRD-A38425832__MCF7__trt_cp,-2.438600

Unnamed: 0_level_0,drug,zscore
Unnamed: 0_level_1,<chr>,<dbl>
BRD-K26684619__CD34__trt_cp,BRD-K26684619__CD34__trt_cp,-2.040950
BRD-A51929314__HA1E__trt_cp,BRD-A51929314__HA1E__trt_cp,-2.227154
MRS-1845__HCC515__trt_cp,MRS-1845__HCC515__trt_cp,-2.417191
melperone__HCC515__trt_cp,melperone__HCC515__trt_cp,-2.385237
FCCP__HCC515__trt_cp,FCCP__HCC515__trt_cp,-2.176191
BCL2-inhibitor__VCAP__trt_cp,BCL2-inhibitor__VCAP__trt_cp,-2.520154
ubenimex__VCAP__trt_cp,ubenimex__VCAP__trt_cp,-2.167702
epitestosterone__VCAP__trt_cp,epitestosterone__VCAP__trt_cp,-2.142600
procainamide__VCAP__trt_cp,procainamide__VCAP__trt_cp,-2.136900
anagrelide__HA1E__trt_cp,anagrelide__HA1E__trt_cp,-2.888882

Unnamed: 0_level_0,drug,zscore
Unnamed: 0_level_1,<chr>,<dbl>
aminomethyltransferase__HA1E__trt_cp,aminomethyltransferase__HA1E__trt_cp,-2.349426
ARP-101__HA1E__trt_cp,ARP-101__HA1E__trt_cp,-2.42085
homoharringtonine__HA1E__trt_cp,homoharringtonine__HA1E__trt_cp,-2.545411
CFM-1571__HCC515__trt_cp,CFM-1571__HCC515__trt_cp,-2.027797
apoptosis-activator-II__HCC515__trt_cp,apoptosis-activator-II__HCC515__trt_cp,-2.027338
puromycin__VCAP__trt_cp,puromycin__VCAP__trt_cp,-3.208967
NSC-632839__VCAP__trt_cp,NSC-632839__VCAP__trt_cp,-2.4379
NNC-711__VCAP__trt_cp,NNC-711__VCAP__trt_cp,-2.587158
MG-132__VCAP__trt_cp,MG-132__VCAP__trt_cp,-2.497801
Ro-15-4513__VCAP__trt_cp,Ro-15-4513__VCAP__trt_cp,-2.462467
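
The same screen can also be written without growing a list inside a loop. This is a minimal sketch on a made-up matrix; `screen_down` is a hypothetical helper, not part of any package, and it returns a zero-row data frame rather than `NULL` when a TF has no hits.

```r
# Toy TF-by-signature z-score matrix (values made up)
tf.mat <- matrix(c(-2.5,  0.3, -1.0,
                   -2.1, -3.0,  0.8),
                 nrow = 2, byrow = TRUE,
                 dimnames = list(c("TF_A", "TF_B"),
                                 c("drug1__X__trt_cp", "drug2__X__trt_cp",
                                   "drug3__X__trt_cp")))

# Hypothetical helper: one data frame of hits per TF
screen_down <- function(mat, cutoff = -2) {
  hits <- lapply(rownames(mat), function(tf) {
    z    <- mat[tf, ]
    keep <- which(z < cutoff)
    data.frame(drug = colnames(mat)[keep], zscore = unname(z[keep]),
               stringsAsFactors = FALSE)
  })
  setNames(hits, rownames(mat))
}

hits <- screen_down(tf.mat)
hits
```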


In [7]:
#### which drugs down-regulate multiple TFs in the same experiment (signature)
multi.TFs <- sapply(drug.lists, function(x){
    return(rownames(x))
})
drugs <- unlist(multi.TFs)
sort(table(drugs),decreasing = T)

drugs
                  BRD-A30083233__VCAP__trt_cp 
                                            6 
parthenolide-(alternate-stereo)__VCAP__trt_cp 
                                            6 
                         BML-257__NEU__trt_cp 
                                            5 
                        tubacin__VCAP__trt_cp 
                                            5 
               1-phenylbiguanide__PHH__trt_cp 
                                            4 
                  BRD-A68274214__VCAP__trt_cp 
                                            4 
                  BRD-K25731886__A549__trt_cp 
                                            4 
                      genistein__VCAP__trt_cp 
                                            4 
              homoharringtonine__HA1E__trt_cp 
                                            4 
                    kitasamycin__VCAP__trt_cp 
                                            4 
                         LFM-A13__NEU__trt_cp 
       

In [34]:
#### output
library(plyr)
drug.result <- ldply (drug.lists, data.frame)
drug.result <- drug.result[,c(2,1,3)]
colnames(drug.result) <- c("Drug", "TF", "ZSCORE")
library(writexl)
drug.result$Drug.name <- gsub("__.*", "", drug.result$Drug)
drug.result$cellLine <- gsub("__trt_cp", "", drug.result$Drug)
drug.result$cellLine <- gsub(".*__", "", drug.result$cellLine)
drug.idx <- sort(table(drug.result$Drug.name),decreasing = T)
drug.result$Drug.name <- factor(drug.result$Drug.name, levels = names(drug.idx))
drug.result <- arrange(drug.result, Drug.name, TF, Drug)
write_xlsx(drug.result, "/data/active_data/lzl/RenalTumor-20200713/DataAnalysis-20210803/Drug/LINCS/lincs.TF.negative.drug.result.xlsx")
drug.result

Drug,TF,ZSCORE,Drug.name,cellLine
<chr>,<chr>,<dbl>,<fct>,<chr>
alpha-tocopherol__MCF7__trt_cp,ISL1,-2.502400,alpha-tocopherol,MCF7
alpha-tocopherol__PC3__trt_cp,ISL1,-2.310900,alpha-tocopherol,PC3
anisomycin__NEU__trt_cp,HOXC5,-2.530992,anisomycin,NEU
anisomycin__ASC__trt_cp,ISL1,-2.111786,anisomycin,ASC
apicidin__HEPG2__trt_cp,HOXC5,-2.335700,apicidin,HEPG2
apicidin__A375__trt_cp,ISL1,-2.681550,apicidin,A375
BG-1003__NEU__trt_cp,ISL1,-2.279669,BG-1003,NEU
BG-1003__NPC__trt_cp,ISL1,-2.050464,BG-1003,NPC
BG-1025__NEU__trt_cp,ISL1,-2.330200,BG-1025,NEU
BG-1025__NPC__trt_cp,ISL1,-2.107575,BG-1025,NPC
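
The signature IDs follow the pattern `<drug>__<cell line>__<perturbation type>`, which is what the `gsub()` calls above rely on. An alternative sketch splits the ID once and checks that every ID has exactly three fields; `parse_signature` is a hypothetical helper name.

```r
# Hypothetical helper: split "<drug>__<cell>__<type>" IDs into columns.
parse_signature <- function(ids) {
  parts <- strsplit(ids, "__", fixed = TRUE)
  stopifnot(all(lengths(parts) == 3))  # assumes exactly three fields per ID
  data.frame(drug      = vapply(parts, `[`, character(1), 1),
             cell      = vapply(parts, `[`, character(1), 2),
             pert_type = vapply(parts, `[`, character(1), 3),
             stringsAsFactors = FALSE)
}

parsed <- parse_signature(c("alpha-tocopherol__MCF7__trt_cp",
                            "apicidin__A375__trt_cp"))
parsed
```

The explicit field-count check fails loudly if a compound name ever contains `__` itself, instead of silently producing a wrong cell line.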


##### Inspect the drug and cell line information in LINCS

In [35]:
####drug information
siginfo_beta <- read.table(file = "siginfo_beta.txt", header = T, stringsAsFactors = F, sep = "\t")
head(siginfo_beta)

Unnamed: 0_level_0,bead_batch,nearest_dose,pert_dose,pert_dose_unit,pert_idose,pert_itime,pert_time,pert_time_unit,cell_mfc_name,pert_mfc_id,⋯,sig_id,pert_type,cell_iname,det_wells,det_plates,distil_ids,build_name,project_code,cmap_name,is_ncs_exemplar
Unnamed: 0_level_1,<chr>,<dbl>,<dbl>,<chr>,<chr>,<chr>,<dbl>,<chr>,<chr>,<chr>,⋯,<chr>,<chr>,<chr>,<chr>,<chr>,<chr>,<lgl>,<chr>,<chr>,<int>
1,b17,,100.0,ug/ml,100 ug/ml,336 h,336,h,N8,BRD-U44432129,⋯,MET001_N8_XH:BRD-U44432129:100:336,trt_cp,NAMEC8,H05|H06|H07|H08,MET001_N8_XH_X1_B17,MET001_N8_XH_X1_B17:H05|MET001_N8_XH_X1_B17:H06|MET001_N8_XH_X1_B17:H07|MET001_N8_XH_X1_B17:H08,,MET,BRD-U44432129,0
2,b15,10.0,10.0,uM,10 uM,3 h,3,h,A549,BRD-K81418486,⋯,ABY001_A549_XH:BRD-K81418486:10:3,trt_cp,A549,L04|L08|L12,ABY001_A549_XH_X1_B15,ABY001_A549_XH_X1_B15:L04|ABY001_A549_XH_X1_B15:L08|ABY001_A549_XH_X1_B15:L12,,ABY,vorinostat,0
3,b15,2.5,2.5,uM,2.5 uM,24 h,24,h,HT29,BRD-K70511574,⋯,ABY001_HT29_XH:BRD-K70511574:2.5:24,trt_cp,HT29,E18|E22,ABY001_HT29_XH_X1_B15,ABY001_HT29_XH_X1_B15:E18|ABY001_HT29_XH_X1_B15:E22,,ABY,HMN-214,0
4,b18,10.0,10.0,uM,10 uM,3 h,3,h,HME1,BRD-K81418486,⋯,LTC002_HME1_3H:BRD-K81418486:10,trt_cp,HME1,F19,LTC002_HME1_3H_X1_B18,LTC002_HME1_3H_X1_B18:F19,,LTC,vorinostat,0
5,b15,10.0,10.0,uM,10 uM,3 h,3,h,H1975,BRD-A61304759,⋯,ABY001_H1975_XH:BRD-A61304759:10:3,trt_cp,H1975,P01|P05|P09,ABY001_H1975_XH_X1_B15,ABY001_H1975_XH_X1_B15:P01|ABY001_H1975_XH_X1_B15:P05|ABY001_H1975_XH_X1_B15:P09,,ABY,tanespimycin,0
6,b15,10.0,10.0,uM,10 uM,24 h,24,h,H1975,BRD-K85606544,⋯,ABY001_H1975_XH:BRD-K85606544:10:24,trt_cp,H1975,A15|A19|A23,ABY001_H1975_XH_X1_B15,ABY001_H1975_XH_X1_B15:A15|ABY001_H1975_XH_X1_B15:A19|ABY001_H1975_XH_X1_B15:A23,,ABY,neratinib,0


In [36]:
cellinfo_beta <- read.table(file = "cellinfo_beta.modify.txt", header = T, stringsAsFactors = F, sep = "\t", comment.char = "!")
cellinfo_beta$cell_iname

In [37]:
#kidney cell line
index <- grep("kidney", cellinfo_beta$primary_disease)
cellinfo_beta[index,]

Unnamed: 0_level_0,cell_iname,cellosaurus_id,donor_age,donor_age_death,donor_disease_age_onset,doubling_time,growth_medium,provider_catalog_id,feature_id,cell_type,donor_ethnicity,donor_sex,donor_tumor_phase,cell_lineage,primary_disease,subtype,provider_name,growth_pattern,ccle_name,cell_alias
Unnamed: 0_level_1,<chr>,<chr>,<chr>,<lgl>,<lgl>,<chr>,<chr>,<chr>,<chr>,<chr>,<chr>,<chr>,<chr>,<chr>,<chr>,<chr>,<chr>,<chr>,<chr>,<chr>
98,HA1E,,,,,60.0,MEM-ALPHA (Invitrogen A1049001) supplemented with 10% v/v fetal bovine serum (Sigma F4135) and 1% 100X penicillin-streptomycin-glutamine (Invitrogen 10378-016),,,normal,Unknown,Unknown,Unknown,kidney,normal kidney sample,normal kidney sample,,unknown,HA1E_KIDNEY,
99,HEK293,CVCL_0045,,,,24.0,"ATCC-formulated Eagles Minimum Essential Medium, Catalog No. 30-2003. To make the complete growth medium, add the following components to the base medium: fetal bovine serum to a final concentration of 10%.",,,normal,Unknown,F,Unknown,kidney,normal kidney sample,normal embryonal kidney sample,,adherent,,293
100,HEKTE,,,,,,,,,normal,Unknown,Unknown,Unknown,kidney,normal kidney sample,normal kidney sample,,unknown,HEKTE_KIDNEY,HEK TE
158,G401,CVCL_0270,3 months,,,48.0,McCoys5A,CRL-1441,c-108,tumor,Caucasian,M,Primary,soft_tissue,kidney cancer,carcinoma,ATCC,adherent,G401_SOFT_TISSUE,G 401|G-401
236,RCC10RGB,CVCL_1647,,,,108.0,DMEM+ 1%FBS,,c-526,tumor,Unknown,M,Unknown,kidney,kidney cancer,carcinoma,RIKEN,adherent,RCC10RGB_KIDNEY,10RGB


In [38]:
# Engineered kidney: HA1E, HK2
# Kidney, embryo: HEK293T
# Note: grep() below matches each name as a substring anywhere in the signature ID
RCC.cellLines <- c("769P", "786O", "A704", "A498", "ACHN", "CaKi", "HEK293", "RCC", "HEKTE", "HA1E", "G401", "RCC10RGB")
RCC.drug.results <- sapply(RCC.cellLines, function(x){
    index <- grep(x, drug.result$Drug)
    if(length(index)>0){
        return(drug.result[index,])
    }else{
        return(NULL)
    }
})
RCC.drug.results

Unnamed: 0_level_0,Drug,TF,ZSCORE,Drug.name,cellLine
Unnamed: 0_level_1,<chr>,<chr>,<dbl>,<fct>,<chr>
67,aminomethyltransferase__HA1E__trt_cp,HOXC5,-2.349426,aminomethyltransferase,HA1E
72,anagrelide__HA1E__trt_cp,ISL1,-2.888882,anagrelide,HA1E
74,ARP-101__HA1E__trt_cp,HOXC5,-2.42085,ARP-101,HA1E
76,AVA__HA1E__trt_cp,VENTX,-2.028199,AVA,HA1E
101,BRD-A51929314__HA1E__trt_cp,ISL1,-2.227154,BRD-A51929314,HA1E
117,BRD-K05402890__HA1E__trt_cp,ISL1,-2.309873,BRD-K05402890,HA1E
231,homoharringtonine__HA1E__trt_cp,HOXC5,-2.545411,homoharringtonine,HA1E
282,R-59022__HA1E__trt_cp,HOXC5,-2.440392,R-59022,HA1E
303,ST-638__HA1E__trt_cp,ISL1,-2.139165,ST-638,HA1E
314,tributyltin__HA1E__trt_cp,ISL1,-3.09455,tributyltin,HA1E
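
One caveat of selecting cell lines with `grep()` on the full signature ID is that a short pattern such as "RCC" matches as a substring, including inside drug names or longer cell line names. The toy comparison below (made-up IDs; "RCC-drug" is a hypothetical compound name) contrasts substring matching with exact matching on the cell line field.

```r
# Toy signature IDs for illustration only
ids  <- c("RCC-drug__MCF7__trt_cp", "anagrelide__HA1E__trt_cp",
          "genistein__RCC10RGB__trt_cp")
cell <- vapply(strsplit(ids, "__", fixed = TRUE), `[`, character(1), 2)

by_grep  <- ids[grep("RCC", ids)]                 # substring match on the whole ID
by_exact <- ids[cell == "RCC"]                    # exact match on the cell line field
by_in    <- ids[cell %in% c("HA1E", "RCC10RGB")]  # several cell lines at once
by_grep
by_exact
by_in
```

Matching on the parsed cell line column, for example with `%in%`, avoids picking up signatures from unrelated cell lines.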


In [39]:
#### output
# select the kidney cell lines by name rather than by position in the list
RCC.drug.result <- rbind(RCC.drug.results[["HEK293"]], RCC.drug.results[["HA1E"]])
library (plyr)
library(writexl)
drug.idx <- sort(table(RCC.drug.result$Drug),decreasing = T)
RCC.drug.result$Drug <- factor(RCC.drug.result$Drug, levels = names(drug.idx))
RCC.drug.result <- arrange(RCC.drug.result, Drug)
write_xlsx(RCC.drug.result, "/data/active_data/lzl/RenalTumor-20200713/DataAnalysis-20210803/Drug/LINCS/lincs.TF.RCC.negative.drug.result.xlsx")
RCC.drug.result

Drug,TF,ZSCORE,Drug.name,cellLine
<fct>,<chr>,<dbl>,<fct>,<chr>
aminomethyltransferase__HA1E__trt_cp,HOXC5,-2.349426,aminomethyltransferase,HA1E
anagrelide__HA1E__trt_cp,ISL1,-2.888882,anagrelide,HA1E
ARP-101__HA1E__trt_cp,HOXC5,-2.42085,ARP-101,HA1E
AVA__HA1E__trt_cp,VENTX,-2.028199,AVA,HA1E
BRD-A51929314__HA1E__trt_cp,ISL1,-2.227154,BRD-A51929314,HA1E
BRD-K05402890__HA1E__trt_cp,ISL1,-2.309873,BRD-K05402890,HA1E
homoharringtonine__HA1E__trt_cp,HOXC5,-2.545411,homoharringtonine,HA1E
R-59022__HA1E__trt_cp,HOXC5,-2.440392,R-59022,HA1E
ST-638__HA1E__trt_cp,ISL1,-2.139165,ST-638,HA1E
tributyltin__HA1E__trt_cp,ISL1,-3.09455,tributyltin,HA1E


In [40]:
####extract drug information
drug.RCC <- as.character(RCC.drug.result$Drug)
drug.RCC <- sapply(drug.RCC, function(x){
    a <- unlist(strsplit(x, "__"))
    return(a[1:2])
})
drug.RCC <- unique(t(drug.RCC))

idx <- apply(drug.RCC, 1, function(x){
    index <- which(siginfo_beta$cmap_name == x[1] & siginfo_beta$cell_iname == x[2])
    return(siginfo_beta[index, ])
})
drug.RCC.info <- ldply(idx, data.frame)
write_xlsx(drug.RCC.info, "/data/active_data/lzl/RenalTumor-20200713/DataAnalysis-20210803/Drug/LINCS/lincs.TF.RCC.drug.info.xlsx")
drug.RCC.info

.id,bead_batch,nearest_dose,pert_dose,pert_dose_unit,pert_idose,pert_itime,pert_time,pert_time_unit,cell_mfc_name,⋯,sig_id,pert_type,cell_iname,det_wells,det_plates,distil_ids,build_name,project_code,cmap_name,is_ncs_exemplar
<chr>,<chr>,<dbl>,<dbl>,<chr>,<chr>,<chr>,<dbl>,<chr>,<chr>,⋯,<chr>,<chr>,<chr>,<chr>,<chr>,<chr>,<lgl>,<chr>,<chr>,<int>
aminomethyltransferase__HA1E__trt_cp,b3,10.0,10.0,uM,10 uM,24 h,24,h,HA1E,⋯,CPC001_HA1E_24H:BRD-A28318179-003-03-6:10,trt_cp,HA1E,J07,CPC001_HA1E_24H_X1_B3_DUO52HI53LO|CPC001_HA1E_24H_X2_B3_DUO52HI53LO|CPC001_HA1E_24H_X3_B3_DUO52HI53LO,CPC001_HA1E_24H_X1_B3_DUO52HI53LO:J07|CPC001_HA1E_24H_X2_B3_DUO52HI53LO:J07|CPC001_HA1E_24H_X3_B3_DUO52HI53LO:J07,,CPC,aminomethyltransferase,0
aminomethyltransferase__HA1E__trt_cp,b3,10.0,10.0,uM,10 uM,6 h,6,h,HA1E,⋯,CPC001_HA1E_6H:BRD-A28318179-003-03-6:10,trt_cp,HA1E,J07,CPC001_HA1E_6H_X1_B3_DUO52HI53LO|CPC001_HA1E_6H_X2_B3_DUO52HI53LO|CPC001_HA1E_6H_X3_B3_DUO52HI53LO,CPC001_HA1E_6H_X1_B3_DUO52HI53LO:J07|CPC001_HA1E_6H_X2_B3_DUO52HI53LO:J07|CPC001_HA1E_6H_X3_B3_DUO52HI53LO:J07,,CPC,aminomethyltransferase,1
anagrelide__HA1E__trt_cp,b3,10.0,10.0,uM,10 uM,6 h,6,h,HA1E,⋯,CPC004_HA1E_6H:BRD-K62200014-003-05-5:10,trt_cp,HA1E,L03,CPC004_HA1E_6H_X1_B3_DUO52HI53LO|CPC004_HA1E_6H_X2_B3_DUO52HI53LO|CPC004_HA1E_6H_X3_B3_DUO52HI53LO,CPC004_HA1E_6H_X1_B3_DUO52HI53LO:L03|CPC004_HA1E_6H_X2_B3_DUO52HI53LO:L03|CPC004_HA1E_6H_X3_B3_DUO52HI53LO:L03,,CPC,anagrelide,0
anagrelide__HA1E__trt_cp,f1b3,10.0,10.0,uM,10 uM,6 h,6,h,HA1E,⋯,CPC011_HA1E_6H:BRD-K62200014-003-07-1:10,trt_cp,HA1E,H10,CPC011_HA1E_6H_X1_F1B3_DUO52HI53LO|CPC011_HA1E_6H_X3_F1B3_DUO52HI53LO,CPC011_HA1E_6H_X1_F1B3_DUO52HI53LO:H10|CPC011_HA1E_6H_X3_F1B3_DUO52HI53LO:H10,,CPC,anagrelide,0
anagrelide__HA1E__trt_cp,b23,0.74,0.769231,uM,0.74 uM,24 h,24,h,HA1E,⋯,REP.B018_HA1E_24H:B14,trt_cp,HA1E,B14,REP.B018_HA1E_24H_X1_B23|REP.B018_HA1E_24H_X2_B23,REP.B018_HA1E_24H_X1_B23:B14|REP.B018_HA1E_24H_X2_B23:B14,,REP,anagrelide,1
anagrelide__HA1E__trt_cp,b24,1.11,1.11111,uM,1.11 uM,24 h,24,h,HA1E,⋯,REP.A018_HA1E_24H:B15,trt_cp,HA1E,B15,REP.A018_HA1E_24H_X1_B24|REP.A018_HA1E_24H_X2_B23|REP.A018_HA1E_24H_X3_B23,REP.A018_HA1E_24H_X1_B24:B15|REP.A018_HA1E_24H_X2_B23:B15|REP.A018_HA1E_24H_X3_B23:B15,,REP,anagrelide,0
anagrelide__HA1E__trt_cp,b24,0.04,0.0411523,uM,0.04 uM,24 h,24,h,HA1E,⋯,REP.A018_HA1E_24H:B18,trt_cp,HA1E,B18,REP.A018_HA1E_24H_X1_B24|REP.A018_HA1E_24H_X2_B23|REP.A018_HA1E_24H_X3_B23,REP.A018_HA1E_24H_X1_B24:B18|REP.A018_HA1E_24H_X2_B23:B18|REP.A018_HA1E_24H_X3_B23:B18,,REP,anagrelide,0
anagrelide__HA1E__trt_cp,b23,0.08,0.0854701,uM,0.08 uM,24 h,24,h,HA1E,⋯,REP.B018_HA1E_24H:B16,trt_cp,HA1E,B16,REP.B018_HA1E_24H_X1_B23|REP.B018_HA1E_24H_X2_B23,REP.B018_HA1E_24H_X1_B23:B16|REP.B018_HA1E_24H_X2_B23:B16,,REP,anagrelide,0
anagrelide__HA1E__trt_cp,b24,10.0,10.0,uM,10 uM,24 h,24,h,HA1E,⋯,REP.A018_HA1E_24H:B13,trt_cp,HA1E,B13,REP.A018_HA1E_24H_X1_B24|REP.A018_HA1E_24H_X2_B23|REP.A018_HA1E_24H_X3_B23,REP.A018_HA1E_24H_X1_B24:B13|REP.A018_HA1E_24H_X2_B23:B13|REP.A018_HA1E_24H_X3_B23:B13,,REP,anagrelide,0
anagrelide__HA1E__trt_cp,b23,0.25,0.25641,uM,0.25 uM,24 h,24,h,HA1E,⋯,REP.B018_HA1E_24H:B15,trt_cp,HA1E,B15,REP.B018_HA1E_24H_X1_B23|REP.B018_HA1E_24H_X2_B23,REP.B018_HA1E_24H_X1_B23:B15|REP.B018_HA1E_24H_X2_B23:B15,,REP,anagrelide,0
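
The lookup above returns every dose and time point recorded for a drug and cell line pair, whereas the signatures in the lincs collection were selected at 10 µM and 24 h. A minimal sketch of narrowing the metadata to that condition is shown below; the column names follow `siginfo_beta` as displayed above, while the rows and the helper `filter_condition` are made up for illustration.

```r
# Toy subset of signature metadata (rows are made up)
sig.toy <- data.frame(
  cmap_name      = c("anagrelide", "anagrelide", "anagrelide", "vorinostat"),
  cell_iname     = c("HA1E", "HA1E", "HA1E", "A549"),
  pert_dose      = c(10, 10, 0.25, 10),
  pert_dose_unit = c("uM", "uM", "uM", "uM"),
  pert_time      = c(24, 6, 24, 3),
  pert_time_unit = c("h", "h", "h", "h"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: keep rows for one drug/cell pair at a given dose and time
filter_condition <- function(info, drug, cell, dose = 10, time = 24) {
  keep <- info$cmap_name == drug & info$cell_iname == cell &
    info$pert_dose_unit == "uM" & abs(info$pert_dose - dose) < 1e-6 &
    info$pert_time_unit == "h" & info$pert_time == time
  info[which(keep), , drop = FALSE]   # which() drops rows with missing values
}

res <- filter_condition(sig.toy, "anagrelide", "HA1E")
res
```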


In [41]:
#### rank the candidate drugs by their mean z-score across TFs
mean.drug <- tapply(RCC.drug.result$ZSCORE, as.character(RCC.drug.result$Drug), mean)
mean.drug <- data.frame(Drug = names(mean.drug), mean.Zscore = mean.drug)
mean.drug <- mean.drug[order(mean.drug$mean.Zscore, decreasing = T),]
mean.drug

rankMean <- RCC.drug.result
rankMean$Drug <- factor(rankMean$Drug, levels = mean.drug$Drug)
rankMean <- arrange(rankMean, Drug)
write_xlsx(rankMean, "/data/active_data/lzl/RenalTumor-20200713/DataAnalysis-20210803/Drug/LINCS/lincs.TF.RCC.drug.meanRank.xlsx")

Unnamed: 0_level_0,Drug,mean.Zscore
Unnamed: 0_level_1,<chr>,<dbl>
triphenyl-tin__HA1E__trt_cp,triphenyl-tin__HA1E__trt_cp,-2.005953
AVA__HA1E__trt_cp,AVA__HA1E__trt_cp,-2.028199
ST-638__HA1E__trt_cp,ST-638__HA1E__trt_cp,-2.139165
BRD-A51929314__HA1E__trt_cp,BRD-A51929314__HA1E__trt_cp,-2.227154
BRD-K05402890__HA1E__trt_cp,BRD-K05402890__HA1E__trt_cp,-2.309873
aminomethyltransferase__HA1E__trt_cp,aminomethyltransferase__HA1E__trt_cp,-2.349426
ARP-101__HA1E__trt_cp,ARP-101__HA1E__trt_cp,-2.42085
R-59022__HA1E__trt_cp,R-59022__HA1E__trt_cp,-2.440392
homoharringtonine__HA1E__trt_cp,homoharringtonine__HA1E__trt_cp,-2.545411
anagrelide__HA1E__trt_cp,anagrelide__HA1E__trt_cp,-2.888882
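
When ranking candidates it can help to keep the number of TFs each signature affects next to its mean z-score. The sketch below does this on a made-up table with the same `Drug`, `TF` and `ZSCORE` columns; the object names `mean.z`, `n.tf` and `ranked` are hypothetical.

```r
# Toy results table (values made up)
toy <- data.frame(Drug   = c("d1__HA1E__trt_cp", "d1__HA1E__trt_cp",
                             "d2__HA1E__trt_cp"),
                  TF     = c("ISL1", "HOXC5", "ISL1"),
                  ZSCORE = c(-2.2, -3.0, -2.4),
                  stringsAsFactors = FALSE)

mean.z <- tapply(toy$ZSCORE, toy$Drug, mean)                      # mean z-score per signature
n.tf   <- tapply(toy$TF, toy$Drug, function(x) length(unique(x))) # TFs hit per signature

ranked <- data.frame(Drug        = names(mean.z),
                     mean.Zscore = as.numeric(mean.z),
                     n.TF        = as.integer(n.tf[names(mean.z)]),
                     stringsAsFactors = FALSE)
ranked <- ranked[order(ranked$mean.Zscore), ]  # most negative first
ranked
```

Sorting in ascending order puts the strongest down-regulators at the top, which is the reverse of the `decreasing = T` ordering used in the cell above.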


In [42]:
#### drugs that down-regulate the TFs' expression
negative.zscore <- RCC.drug.result[which(RCC.drug.result$ZSCORE<0),]
negative.zscore.mean <- tapply(negative.zscore$ZSCORE, as.character(negative.zscore$Drug), mean)
negative.zscore.mean <- data.frame(Drug = names(negative.zscore.mean), mean.Zscore = negative.zscore.mean)
negative.zscore.mean <- negative.zscore.mean[order(negative.zscore.mean$mean.Zscore),]
#negative.zscore.mean

negative.rankMean <- negative.zscore
negative.rankMean$Drug.name <- gsub("__.*", "", negative.rankMean$Drug)
idx <- names(sort(table(negative.rankMean$Drug.name), decreasing = T))
negative.rankMean$Drug.name <- factor(negative.rankMean$Drug.name, levels = idx)
negative.rankMean <- arrange(negative.rankMean, Drug.name)
write_xlsx(negative.rankMean, "/data/active_data/lzl/RenalTumor-20200713/DataAnalysis-20210803/Drug/LINCS/lincs.TF.RCC.drug.negative.rankMean.xlsx")
negative.rankMean

Drug,TF,ZSCORE,Drug.name,cellLine
<fct>,<chr>,<dbl>,<fct>,<chr>
aminomethyltransferase__HA1E__trt_cp,HOXC5,-2.349426,aminomethyltransferase,HA1E
anagrelide__HA1E__trt_cp,ISL1,-2.888882,anagrelide,HA1E
ARP-101__HA1E__trt_cp,HOXC5,-2.42085,ARP-101,HA1E
AVA__HA1E__trt_cp,VENTX,-2.028199,AVA,HA1E
BRD-A51929314__HA1E__trt_cp,ISL1,-2.227154,BRD-A51929314,HA1E
BRD-K05402890__HA1E__trt_cp,ISL1,-2.309873,BRD-K05402890,HA1E
homoharringtonine__HA1E__trt_cp,HOXC5,-2.545411,homoharringtonine,HA1E
R-59022__HA1E__trt_cp,HOXC5,-2.440392,R-59022,HA1E
ST-638__HA1E__trt_cp,ISL1,-2.139165,ST-638,HA1E
tributyltin__HA1E__trt_cp,ISL1,-3.09455,tributyltin,HA1E
