Reusability Report: Few-shot learning creates predictive models of drug response that translate from high-throughput screens to individual patients (Version 2.6)
This is the code to replicate all figures from "Reusability Report: Few-shot learning creates predictive models of drug response that translate from high-throughput screens to individual patients". The goal of this paper is to reproduce the results from the Nature Cancer paper "Few-shot learning creates predictive models of drug response that translate from high-throughput screens to individual patients" [1].
There are multiple ways you can interact with this capsule:
- Use the App Panel dropdown on the left sidebar to select the reference-validation combination on which to run the few-shot TCRP analysis and the baseline comparison against established machine learning methods
- Click "Reproducible Run" at the top right of the capsule
- The results you will receive are the baseline performance results, the TCRP performance results, and the baseline/TCRP performance comparison. (N.B. Because the compute hours on your Code Ocean account may be limited, good examples to run are gCSI_PDTC, CTRP_PDTC, or GDSC1_GSE41998, as they have very short compute runtimes.)
- Refer to the files in /data/references and /data/transfers/ to understand which input files are needed.
- Run the correct preprocessing notebook, substituting dataset names where appropriate. Choose the notebook as follows:
  a. /code/preprocessing/extract_features/new_everything_preprocessing.ipynb if you are substituting both the reference and validation cell line datasets
  b. /code/preprocessing/extract_features/new_PDTX_preprocessing.ipynb if you are substituting the validation cell line dataset
  c. /code/preprocessing/extract_features/new_clinical_processing.ipynb if you are substituting validation clinical datasets with R/NR response
- Save drug features under /data/drug_features with the corresponding nomenclature of {reference}_{validation}
- Navigate to /code/tcrp_model/pipelines/prepare_complete_run.py and replace lines 33 and 34 with the correct drug feature/tissue name
- A folder will be created under /code/tcrp_model/created_models with the name created_models_{reference}_{validation}
- Navigate to /code/tcrp_model/created_models/created_models_{reference}_{validation}/baseline_cmd/subcommands and execute ALL bash scripts in the directory (the order does not matter)
- Example command:
python3 /root/capsule/code/tcrp_model/model/MAML_DRUG.py --dataset CTRP_PDTC --tissue PDTC --drug Axitinib --K 10 --num_trials 20 --tissue_num 12 --meta_batch_size 10 --meta_lr 0.01 --inner_lr 0.01 --layer 1 --run_name 210803_drug-baseline-models
- Navigate to /code/tcrp_model/created_models/created_models_{reference}_{validation}/MAML_cmd/subcommands and execute ALL bash scripts in the directory (the order does not matter)
- N.B. Performance can be visualized only for drugs that completed both baseline and TCRP execution
- Use the notebook /code/tcrp_model/model/find_max_run.ipynb to find the best run (substituting the tissue/dataset/drug name appropriately). This notebook will tell you which output file has the best TCRP correlation; search for that file name in the corresponding subcommands script to find the full command.
- Alternatively, if you would like to calculate additional performance metrics for comparing TCRP against the baseline results (the default is Pearson's correlation), see /code/tcrp_model/model/new_score.ipynb for how that information can be generated. The arrays containing the new performance metric should then be saved to replace the original TCRP_performance.npz for each dataset-drug-tissue.
- For each drug, rerun the optimal TCRP command found in step 5 to replace the current result with the optimized result.
- Run /code/tcrp_model/model_comparisons/plot_results.py with the flags --reference {reference} --validation {validation}
- If you would like to plot the reproducibility results without rerunning the inference code, use /data/original_data/original_TCRP and /data/original_data/spear_TCRP in combination with the notebook /code/tcrp_model/model_comparisons/1-gather-baselines-and-fewshot.ipynb.
- Your baseline comparison vs. TCRP PNG file will be saved in /results/
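The two "execute ALL bash scripts" steps above can also be scripted. The following is a minimal Python sketch and is not part of the capsule: the helper names (subcommand_dirs, collect_scripts, run_all), the .sh file extension, and the default /code/tcrp_model/created_models root are assumptions made for illustration. Because the order does not matter, the scripts are simply run one after another.

```python
import subprocess
from pathlib import Path


def subcommand_dirs(reference, validation,
                    root="/code/tcrp_model/created_models"):
    """Return the baseline and MAML subcommands directories for one run.

    The folder layout follows the steps above; `root` is an assumed default.
    """
    run_dir = Path(root) / f"created_models_{reference}_{validation}"
    return [run_dir / stage / "subcommands"
            for stage in ("baseline_cmd", "MAML_cmd")]


def collect_scripts(subcommands_dir):
    """List the bash scripts in a directory (assumes a .sh extension)."""
    return sorted(Path(subcommands_dir).glob("*.sh"))


def run_all(subcommands_dir, dry_run=False):
    """Run every script with bash and return (script name, exit code) pairs.

    With dry_run=True nothing is executed and the exit code is None,
    which is useful for checking what would be run.
    """
    results = []
    for script in collect_scripts(subcommands_dir):
        if dry_run:
            results.append((script.name, None))
            continue
        # check=False so one failing drug does not stop the remaining scripts
        completed = subprocess.run(["bash", str(script)], check=False)
        results.append((script.name, completed.returncode))
    return results
```

For example, `for d in subcommand_dirs("CTRP", "PDTC"): print(run_all(d, dry_run=True))` lists the scripts that would be executed for the CTRP_PDTC combination. A non-zero exit code in the results points to a drug that did not complete and therefore cannot be visualized.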
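As an illustration of an additional performance metric, the sketch below computes Spearman's rank correlation alongside the default Pearson's correlation using only the Python standard library. The choice of Spearman's correlation, the function names, and the sample values are assumptions for illustration; the sketch does not reproduce new_score.ipynb and does not read or write TCRP_performance.npz.

```python
import math


def pearson(x, y):
    """Pearson's correlation between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    denom = math.sqrt(var_x * var_y)
    # A constant input has no defined correlation
    return cov / denom if denom > 0 else float("nan")


def average_ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman's rank correlation: Pearson's correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Illustrative values only (not capsule data): observed vs. predicted response
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.0, 4.0, 9.0, 16.0]
print("Pearson:", pearson(observed, predicted))
print("Spearman:", spearman(observed, predicted))
```

A rank-based metric rewards predictions that order samples correctly even when the relationship is not linear, which is why the two values differ for the monotonic but non-linear sample above.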
Should you have any questions, please contact emily.so@mail.utoronto.ca
[1] Ma, J. et al. Few-shot learning creates predictive models of drug response that translate from high-throughput screens to individual patients. Nat Cancer 2, 233–244 (2021).