TAIJI: Approaching Experimental Replicates-level Accuracy for Drug Synergy Prediction

This package implements Yuanfang's winning algorithm in the AstraZeneca-Sanger Drug Combination Prediction DREAM Challenge.

Background: Drug Combination Prediction

See also: Yuanfang Guan's 1st Place Solution and Original Code

Please contact (gyuanfan@umich.edu or hyangl@umich.edu) if you have any questions or suggestions.

Installation

Clone the TAIJI repository:

git clone https://github.com/GuanLab/TAIJI.git

Required dependencies

python (3.6.5)
numpy (1.13.3), which comes pre-packaged with Anaconda.
scikit-learn (0.19.0), a popular machine learning package. It can be installed with:

pip install -U scikit-learn

Examples

1. Preparing monotherapy data

TAIJI requires monotherapy data to extract features. Two types of files are needed:

(1) summary table (e.g. data/test_monotherapy.csv)

CELL_LINE	COMPOUND_A	COMPOUND_B	MAX_CONC_A	MAX_CONC_B	IC50_A	H_A	Einf_A
MFM-223	ADAM17	AKT	3	1	0.226	1.9	17.47
UM-UC-3	AKT	ATR_4	3	3	0.977	2.7	46.19

IC50_B	H_B	Einf_B	SYNERGY_SCORE	QA	COMBINATION_ID	FILENAME
0.562	1.2	70.23	NA	1	ADAM17.AKT	ADAM17.AKT.MFM-223.Rep1.csv
1.12	1.9	81.79	NA	1	AKT.ATR_4	AKT.ATR_4.UM-UC-3.Rep1.csv

CELL_LINE: Normalized cell line name.
COMPOUND_A: Name of drug/compound A described by its target.
COMPOUND_B: Name of drug/compound B described by its target.
MAX_CONC_A: Maximum concentration (uM) of drug A.
MAX_CONC_B: Maximum concentration (uM) of drug B.
IC50_A: Concentration where half of the maximum kill is obtained with drug A.
H_A: Slope of the dose-response curve for drug A.
Einf_A: Maximum cells killed (percentage) with drug A.
IC50_B: Concentration where half of the maximum kill is obtained with drug B.
H_B: Slope of the dose-response curve for drug B.
Einf_B: Maximum cells killed (percentage) with drug B.
SYNERGY_SCORE: To be predicted.
QA: Quality assessment flag of combination assays (1: good, -1: bad).
COMBINATION_ID: Name of the drug A and drug B combination.
FILENAME: Filename of the corresponding monotherapy spreadsheet.
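
The column list above can be checked before running TAIJI. The following is a minimal sketch, not part of TAIJI: the helper name `load_summary`, the comma delimiter, and the filename pattern `COMBINATION_ID.CELL_LINE.` (inferred only from the two example rows) are assumptions.

```python
import csv
import io

# Columns described above, in the order shown in the example table.
REQUIRED_COLUMNS = [
    "CELL_LINE", "COMPOUND_A", "COMPOUND_B", "MAX_CONC_A", "MAX_CONC_B",
    "IC50_A", "H_A", "Einf_A", "IC50_B", "H_B", "Einf_B",
    "SYNERGY_SCORE", "QA", "COMBINATION_ID", "FILENAME",
]

# The two example rows from above, written with commas (assumed delimiter).
SAMPLE = "\n".join([
    ",".join(REQUIRED_COLUMNS),
    "MFM-223,ADAM17,AKT,3,1,0.226,1.9,17.47,0.562,1.2,70.23,NA,1,"
    "ADAM17.AKT,ADAM17.AKT.MFM-223.Rep1.csv",
    "UM-UC-3,AKT,ATR_4,3,3,0.977,2.7,46.19,1.12,1.9,81.79,NA,1,"
    "AKT.ATR_4,AKT.ATR_4.UM-UC-3.Rep1.csv",
]) + "\n"


def load_summary(text, delimiter=","):
    """Hypothetical helper: parse a summary table and collect format issues."""
    reader = csv.DictReader(io.StringIO(text), delimiter=delimiter)
    fieldnames = reader.fieldnames or []
    missing = [name for name in REQUIRED_COLUMNS if name not in fieldnames]
    if missing:
        raise ValueError("missing columns: " + ", ".join(missing))

    rows = list(reader)
    issues = []
    for number, row in enumerate(rows, start=1):
        # QA is documented as 1 (good) or -1 (bad).
        if row["QA"] not in ("1", "-1"):
            issues.append(f"row {number}: unexpected QA flag {row['QA']!r}")
        # Assumed pattern, inferred from the example rows only.
        prefix = f"{row['COMBINATION_ID']}.{row['CELL_LINE']}."
        if not row["FILENAME"].startswith(prefix):
            issues.append(f"row {number}: FILENAME does not start with {prefix!r}")
    return rows, issues


rows, issues = load_summary(SAMPLE)
print(len(rows), "rows;", len(issues), "issues")
```

If your file is tab-separated, pass `delimiter="\t"` instead.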

(2) monotherapy spreadsheets (e.g. data/monotherapy_spread_csv/drug1.drug2.cell_line.Rep1.csv)


	0	0.03	0.1	0.3	1	3	(=Agent 2)
0	100	113.8	106.4	98.1	100.2	91.7
0.01	104.5	NA	NA	NA	NA	NA
0.03	108.5	NA	NA	NA	NA	NA
0.1	73	NA	NA	NA	NA	NA
0.3	34.5	NA	NA	NA	NA	NA
1	18.8	NA	NA	NA	NA	NA
(=Agent1)
Agent1	DRUG1
Agent2	DRUG2
Unit1	\muM
Unit2	\muM
Title	CELL_LINE

The 1st row holds the 5 concentrations (0.03, 0.1, 0.3, 1, 3) of drug A (Agent1).
The 2nd row holds the corresponding 5 responses (113.8, 106.4, 98.1, 100.2, 91.7) to drug A.
The 1st column holds the 5 concentrations (0.01, 0.03, 0.1, 0.3, 1) of drug B (Agent2).
The 2nd column holds the corresponding 5 responses (104.5, 108.5, 73, 34.5, 18.8) to drug B.
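
To make the layout concrete, here is a minimal sketch that pulls the two dose-response series out of one spreadsheet. It is not TAIJI's own reader: the function name `parse_monotherapy` and the comma delimiter are assumptions, and the result is keyed by position (`first_row`, `first_column`) rather than by drug.

```python
import csv
import io

# The example spreadsheet from above, written with commas (assumed delimiter).
SAMPLE = """\
,0,0.03,0.1,0.3,1,3,(=Agent 2)
0,100,113.8,106.4,98.1,100.2,91.7
0.01,104.5,NA,NA,NA,NA,NA
0.03,108.5,NA,NA,NA,NA,NA
0.1,73,NA,NA,NA,NA,NA
0.3,34.5,NA,NA,NA,NA,NA
1,18.8,NA,NA,NA,NA,NA
(=Agent1)
Agent1,DRUG1
Agent2,DRUG2
Unit1,\\muM
Unit2,\\muM
Title,CELL_LINE
"""


def is_number(cell):
    """Return True if the cell parses as a float ("NA" and labels do not)."""
    try:
        float(cell)
    except ValueError:
        return False
    return True


def parse_monotherapy(text, delimiter=","):
    """Hypothetical helper: return (concentration, response) pairs."""
    rows = list(csv.reader(io.StringIO(text), delimiter=delimiter))

    # 1st row: concentrations; 2nd row: responses at those concentrations.
    row_conc = [float(c) for c in rows[0][1:] if is_number(c)]
    row_resp = [float(c) for c in rows[1][1:1 + len(row_conc)]]

    # 1st column: concentrations; 2nd column: responses. Stop at the
    # first non-numeric label, which starts the metadata block.
    col_conc, col_resp = [], []
    for row in rows[1:]:
        if not row or not is_number(row[0]):
            break
        col_conc.append(float(row[0]))
        col_resp.append(float(row[1]))

    # Drop the shared zero-dose entry so 5 pairs remain in each series.
    return {
        "first_row": list(zip(row_conc[1:], row_resp[1:])),
        "first_column": list(zip(col_conc[1:], col_resp[1:])),
    }


parsed = parse_monotherapy(SAMPLE)
print(parsed["first_row"])
print(parsed["first_column"])
```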

Example monotherapy data for 100 query drug combinations are provided. Note that these data are simulated; they are NOT real data.

2. Making predictions (using default parameters)

Once you have prepared and formatted the input monotherapy data, put them in the corresponding data folder (and remove the example data). Then run TAIJI with:

python TAIJI.py

TAIJI automatically generates a prediction file, prediction.csv. On the simulated example data of 100 drug combinations, this takes about 5 minutes.

You can also specify the input file, the input directory, and the output directory:

python TAIJI.py -i1 data/test_monotherapy.csv -i2 data/monotherapy_spread_csv/ -o ../

The -i1 argument is the monotherapy summary table, -i2 is the directory of monotherapy spreadsheets, and -o is the output directory.
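
Because every FILENAME in the summary table must point to a spreadsheet in the -i2 directory, a quick pre-flight check can save a failed run. The sketch below is an illustration only: `missing_spreadsheets` is a hypothetical helper, the comma delimiter is an assumption, and the demo builds a throwaway directory instead of touching the real data folder.

```python
import csv
import os
import tempfile


def missing_spreadsheets(summary_path, spread_dir, delimiter=","):
    """Hypothetical helper: list FILENAME entries with no file in spread_dir."""
    with open(summary_path, newline="") as handle:
        names = [row["FILENAME"] for row in csv.DictReader(handle, delimiter=delimiter)]
    return [name for name in names
            if not os.path.isfile(os.path.join(spread_dir, name))]


# Demo on a temporary layout that mimics data/test_monotherapy.csv and
# data/monotherapy_spread_csv/; only one of the two spreadsheets exists.
with tempfile.TemporaryDirectory() as tmp:
    spread_dir = os.path.join(tmp, "monotherapy_spread_csv")
    os.mkdir(spread_dir)

    summary_path = os.path.join(tmp, "test_monotherapy.csv")
    with open(summary_path, "w", newline="") as handle:
        handle.write("COMBINATION_ID,FILENAME\n")
        handle.write("ADAM17.AKT,ADAM17.AKT.MFM-223.Rep1.csv\n")
        handle.write("AKT.ATR_4,AKT.ATR_4.UM-UC-3.Rep1.csv\n")

    # Create an empty placeholder for the first spreadsheet only.
    open(os.path.join(spread_dir, "ADAM17.AKT.MFM-223.Rep1.csv"), "w").close()

    missing = missing_spreadsheets(summary_path, spread_dir)

print(missing)
```

On real data, call `missing_spreadsheets("data/test_monotherapy.csv", "data/monotherapy_spread_csv/")` and fix any reported names before running TAIJI.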

3. Making predictions (using customized parameters)

To run TAIJI with advanced options, first check their descriptions by running:

python TAIJI.py -h

It returns:

  -i1 INPUT1, --input1 INPUT1
                        Input summary table (example: data/test_monotherapy.csv)
  -i2 INPUT2, --input2 INPUT2
                        Directory of the input monotherapy spread sheets (example: data/monotherapy_spread_csv/)
  -w1 WEIGHT1, --weight1 WEIGHT1
                        Weight for the monotherapy component
  -w2 WEIGHT2, --weight2 WEIGHT2
                        Weight for the cell line-drug component
  -w3 WEIGHT3, --weight3 WEIGHT3
                        Weight for the molecular profile component
  -o OUTPUTDIR, --outputdir OUTPUTDIR
                        Output directory
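
For readers who want to script around these options, the following argparse sketch reproduces the same flags. It is a reconstruction from the help text, not TAIJI.py's actual source: the default input paths and the default output directory are assumptions, and the weight defaults follow the 1:1:2 ratio stated in this README.

```python
import argparse


def build_parser():
    """Hypothetical parser mirroring the options listed in the help text."""
    parser = argparse.ArgumentParser(prog="TAIJI.py")
    parser.add_argument("-i1", "--input1", default="data/test_monotherapy.csv",
                        help="Input summary table (assumed default)")
    parser.add_argument("-i2", "--input2", default="data/monotherapy_spread_csv/",
                        help="Directory of the input monotherapy spreadsheets "
                             "(assumed default)")
    parser.add_argument("-w1", "--weight1", type=float, default=1.0,
                        help="Weight for the monotherapy component")
    parser.add_argument("-w2", "--weight2", type=float, default=1.0,
                        help="Weight for the cell line-drug component")
    parser.add_argument("-w3", "--weight3", type=float, default=2.0,
                        help="Weight for the molecular profile component")
    parser.add_argument("-o", "--outputdir", default="./",
                        help="Output directory (assumed default)")
    return parser


# Parse the example command line from section 2 without running TAIJI.
args = build_parser().parse_args([
    "-i1", "data/test_monotherapy.csv",
    "-i2", "data/monotherapy_spread_csv/",
    "-o", "../",
])
print(args.input1, args.input2, args.outputdir)
print(args.weight1, args.weight2, args.weight3)
```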

TAIJI has three prediction components based on different inputs: (1) monotherapy treatment effect, (2) cell line-drug frequency, and (3) molecular profiles. The default weights of these three components are 1:1:2, following Yuanfang's winning solution.

These weights can be adjusted to suit the input data. For example, if you believe your monotherapy input data are less reliable (e.g. because of high noise), you can lower the relative weight of the monotherapy component (e.g. from 1:1:2 to 0.8:2:4) by running TAIJI like this:

python TAIJI.py -w1 0.8 -w2 2 -w3 4
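
To see what such a change does to the relative influence of each component, the sketch below normalizes a weight triple into shares. It assumes the final score is a weighted average of the three component predictions, which this README does not confirm; `normalize_weights` and `combine` are hypothetical names, and the component scores are made-up numbers for illustration.

```python
def normalize_weights(w1, w2, w3):
    """Convert a weight ratio into shares that sum to 1."""
    total = w1 + w2 + w3
    if total <= 0:
        raise ValueError("weights must sum to a positive number")
    return tuple(weight / total for weight in (w1, w2, w3))


def combine(predictions, weights):
    """Weighted average of component predictions (assumed combination rule)."""
    shares = normalize_weights(*weights)
    return sum(pred * share for pred, share in zip(predictions, shares))


default_shares = normalize_weights(1, 1, 2)    # monotherapy share: 1/4
custom_shares = normalize_weights(0.8, 2, 4)   # monotherapy share: 0.8/6.8

# Illustrative component scores: monotherapy, cell line-drug, molecular.
example_scores = [10.0, 20.0, 30.0]
default_score = combine(example_scores, (1, 1, 2))
custom_score = combine(example_scores, (0.8, 2, 4))

print(default_shares, custom_shares)
print(default_score, custom_score)
```

Under this assumption, moving from 1:1:2 to 0.8:2:4 cuts the monotherapy share from 25% to about 12%, while the other two components keep their 1:2 ratio to each other.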
