Precision Anti-Cancer Drug Selection via Neural Ranking

Authors: Vishal Dey, Xia Ning

Workshop paper: Accepted at BioKDD '23

Full version: In Progress

This repository provides the source code for the methods proposed in our paper: $Pair-PushC$, $List-One$, and $List-All$.

Environments

Operating system: Red Hat Enterprise Linux (RHEL) 7.7

Install the packages in a conda environment:

conda create -n drugrank python=3.9
conda activate drugrank
conda install -y -c rdkit rdkit=2023.03.3
conda install -y numpy=1.26.0 scipy=1.9.1
conda install pytorch torchvision torchaudio pytorch-cuda=11.8 -c pytorch -c nvidia
pip3 install torch==1.13.1 torchvision torchaudio --index-url https://download.pytorch.org/whl/cu116

Datasets

CCLE: Download CCLE expression data from here and save it as data/CCLE/CCLE_expression.csv.
Combined: Download the combined_rnaseq_data file, which contains gene expression data for cell lines renamed for CTRPv2, from here and save it in data/Combined/.
Please use the provided CTRPv2 (with adjusted AUCs) and PRISM datasets, which are already processed. The processed datasets can be downloaded from here. Unzip ctrpv2.zip and prism.zip inside the data directory.
Detailed instructions for downloading and processing the data are in data/README.md.
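
The unzip step can also be scripted. The following is a minimal sketch, not part of the repository: the helper name `unzip_processed` is hypothetical, and it assumes both archives have already been downloaded into the data directory.

```python
import zipfile
from pathlib import Path


def unzip_processed(data_dir="data", archives=("ctrpv2.zip", "prism.zip")):
    """Extract each archive found in data_dir into data_dir.

    Hypothetical helper: assumes the zip files were downloaded into data_dir.
    Returns the names of the archives that were extracted.
    """
    data_dir = Path(data_dir)
    extracted = []
    for name in archives:
        archive = data_dir / name
        if not archive.is_file():
            # Nothing to extract yet; report and move on.
            print(f"missing: {archive} (download it first)")
            continue
        with zipfile.ZipFile(archive) as zf:
            zf.extractall(data_dir)
        extracted.append(name)
    return extracted


if __name__ == "__main__":
    print(unzip_processed("data"))
```

Extracting into the data directory itself keeps the paths used by the commands in this README (for example data/ctrpv2/) valid, provided each archive unpacks into a folder of the same name, which is an assumption here.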

Experiments

For pre-training $GeneAE$

See scripts/run_ae.sh for how to run the pre-training code in src/train_ae.py.
Each pre-training run takes only a few minutes on a single V100 GPU.

For ranking

Run the following command to train $List-One$ with default hyper-parameters:

export DATA_FOLDER="data/ctrpv2/"
python src/cross_validate.py --model listone --data_path $DATA_FOLDER/LCO/aucs.txt --smiles_path $DATA_FOLDER/cmpd_smiles.txt --splits_path $DATA_FOLDER/LCO/pletorg/ --pretrained_ae -ae_path ${ae_path} -fgen morgan_count --setup LCO

Run the following command to train $List-All$ with default hyper-parameters:

export DATA_FOLDER="data/ctrpv2/"
python src/cross_validate.py --model listall --data_path $DATA_FOLDER/LCO/aucs.txt --smiles_path $DATA_FOLDER/cmpd_smiles.txt --splits_path $DATA_FOLDER/LCO/pletorg/ --pretrained_ae -ae_path ${ae_path} -fgen morgan_count -M 0.5 --setup LCO

Run the following command to train $Pair-PushC$ with default hyper-parameters:

export DATA_FOLDER="data/ctrpv2/"
python src/cross_validate.py --model pairpushc --data_path $DATA_FOLDER/LCO/aucs.txt --smiles_path $DATA_FOLDER/cmpd_smiles.txt --splits_path $DATA_FOLDER/LCO/pletorg/ --pretrained_ae -ae_path ${ae_path} -classc -fgen morgan_count --setup LCO

In the commands above, ${ae_path} is the path to the directory containing the saved models.
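
The three ranking commands above differ only in the --model value and one or two extra flags (-M 0.5 for listall, -classc for pairpushc). The sketch below assembles the same argument lists programmatically. It is an illustration rather than repository code: `build_command` and `MODEL_FLAGS` are hypothetical names, `path/to/saved_ae` is a placeholder, and the LRO path layout follows the note later in this section.

```python
import shlex

# Extra flags per model, copied from the commands in this README.
MODEL_FLAGS = {
    "listone": [],
    "listall": ["-M", "0.5"],
    "pairpushc": ["-classc"],
}


def build_command(model, data_folder, ae_path, setup="LCO"):
    """Return the argument list for src/cross_validate.py (hypothetical helper)."""
    if model not in MODEL_FLAGS:
        raise ValueError(f"unknown model: {model}")
    if setup not in ("LCO", "LRO"):
        raise ValueError(f"unknown setup: {setup}")
    root = data_folder.rstrip("/")
    # Per this README: LCO folds are under LCO/pletorg/, LRO folds under LRO/.
    splits = f"{root}/LCO/pletorg/" if setup == "LCO" else f"{root}/LRO/"
    return [
        "python", "src/cross_validate.py",
        "--model", model,
        "--data_path", f"{root}/{setup}/aucs.txt",
        "--smiles_path", f"{root}/cmpd_smiles.txt",
        "--splits_path", splits,
        "--pretrained_ae", "-ae_path", ae_path,
        "-fgen", "morgan_count",
        *MODEL_FLAGS[model],
        "--setup", setup,
    ]


if __name__ == "__main__":
    # Print one shell-ready command per model; ae_path is a placeholder.
    for model in MODEL_FLAGS:
        print(shlex.join(build_command(model, "data/ctrpv2/", "path/to/saved_ae")))
```

The returned list can be passed directly to `subprocess.run`, which avoids shell quoting issues when paths contain spaces.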

model specifies the type of model to train.
data_path specifies the path to the file containing the final processed list of cell ID, drug ID, and AUC values (comma-separated).
smiles_path specifies the path to the file containing tab-separated drug IDs and their SMILES strings; the SMILES string must be the last column in this file.
splits_path specifies the path to the directory containing the folds, where each fold is saved as a directory.
ae_path specifies the path to the directory containing the pretrained $GeneAE$ model.
Check utils/args.py for other hyper-parameters.
Use export DATA_FOLDER="data/ctrpv2/" and change ae_ind to 17743 and ae_path to data/Combined/combined_rnaseq_data for all experiments on the CTRP dataset.
Use export DATA_FOLDER="data/prism/" and change ae_ind to 19177 and ae_path to data/CCLE/CCLE_expression.csv for all experiments on the PRISM dataset.
Change splits_path to $DATA_FOLDER/LRO/, data_path to $DATA_FOLDER/LRO/aucs.txt, and setup to LRO for the LRO experiments.
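
Before launching a run, it can help to check that the two input files match the formats described above. The following is a minimal sketch using only the standard library. The function names are hypothetical, and it assumes that the AUC file columns appear in the order cell ID, drug ID, AUC, that the drug ID is the first column of the SMILES file, and that any row with a non-numeric AUC is a header.

```python
import csv


def read_aucs(path):
    """Read comma-separated rows of (cell ID, drug ID, AUC)."""
    rows = []
    with open(path, newline="") as fh:
        for rec in csv.reader(fh):
            if len(rec) < 3:
                continue  # skip blank or malformed lines
            try:
                auc = float(rec[2])
            except ValueError:
                continue  # assumption: a non-numeric AUC marks a header row
            rows.append((rec[0], rec[1], auc))
    return rows


def read_smiles(path):
    """Map drug ID to SMILES from a tab-separated file (SMILES in the last column)."""
    smiles = {}
    with open(path, newline="") as fh:
        for rec in csv.reader(fh, delimiter="\t", quoting=csv.QUOTE_NONE):
            if len(rec) >= 2:
                # Assumption: the drug ID is the first column.
                smiles[rec[0]] = rec[-1]
    return smiles


def drugs_without_smiles(aucs, smiles):
    """Return sorted drug IDs that appear in the AUC rows but have no SMILES."""
    return sorted({drug for _, drug, _ in aucs} - set(smiles))
```

An empty result from `drugs_without_smiles(read_aucs(...), read_smiles(...))` suggests that every drug in the AUC file has a SMILES entry.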

See the following scripts for hyper-parameter grid search and cross-validation:

scripts/run_listone.sh for $List-One$.
scripts/run_listall.sh for $List-All$.
scripts/run_pairpushc.sh for $Pair-PushC$.
