
VAE for chemotherapy drug response prediction

Code repository accompanying the paper "Predicting chemotherapy response using a variational autoencoder approach" by Qi Wei and Stephen A. Ramsey, BMC Bioinformatics (2021) 22:453.

The full text of the paper is available at: https://doi.org/10.1186/s12859-021-04339-6

You can cite the paper using the following BibTeX entry:

@article{wei2021predicting,
  author  = {Wei, Qi and Ramsey, Stephen A.},
  title   = {Predicting chemotherapy response using a variational autoencoder approach},
  journal = {BMC Bioinformatics},
  year    = {2021},
  month   = sep,
  volume  = {22},
  pages   = {453},
  doi     = {10.1186/s12859-021-04339-6}
}

Introduction & Installation

The idea is to use a variational autoencoder (VAE) to extract a low-dimensional latent representation from gene expression data, and then to use that representation to predict chemotherapy response across various cancer types with an XGBoost classifier.
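The two-stage idea can be sketched as follows. This is a minimal illustration with toy data, not the repository's actual code: the latent matrix `z` stands in for the trained VAE encoder's output, the response labels are synthetic, and scikit-learn's `GradientBoostingClassifier` stands in for XGBoost so the sketch stays self-contained.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Stage 1 (placeholder): suppose the trained VAE encoder has already mapped
# m-dimensional expression profiles down to h latent features per tumor.
n_samples, h = 200, 50
z = rng.normal(size=(n_samples, h))             # latent encodings
y = (z[:, 0] + 0.5 * z[:, 1] > 0).astype(int)   # toy response labels

# Stage 2: fit a gradient-boosted classifier on the latent features
# (the paper uses XGBoost; this is a scikit-learn stand-in).
z_tr, z_te, y_tr, y_te = train_test_split(z, y, random_state=0)
clf = GradientBoostingClassifier(random_state=0).fit(z_tr, y_tr)
acc = clf.score(z_te, y_te)
```

In the actual pipeline, `z` would come from encoding each tumor's expression profile with the trained VAE, and the labels would be clinical response annotations.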

You can create a virtual environment containing all necessary packages from the provided environment file:

conda env create -f tensorflow-gpu-environment-stable.yml

Note: you need to install CUDA and cuDNN on your workstation first. You can find installation instructions here: https://docs.nvidia.com/deeplearning/cudnn/install-guide/index.html

Training Dataset

Code Structure

Notebook version of codes: https://github.com/ATHED/VAE_for_chemotherapy_drug_response_prediction/tree/master/Notebooks

Diagrammatic representation of the VAE-XGBoost method that we used for predicting tumor response.

[Figure: our VAE model]

The input layer takes the top 20 most variably expressed genes (size m). A configurable stack of fully connected dense layers (e.g., three or six layers) serves as the encoding neural network (encoder). The encoder outputs two vectors of configurable latent size h (h is a manually selected latent vector size, e.g., 50, 500, or 1000, with h << m for dimensionality reduction): a vector of means μ and a vector of standard deviations σ. Together these parameterize a vector of h random variables, where the ith elements μ_i and σ_i are the mean and standard deviation of the ith random variable z_i. The sampled encoding z is then passed to the decoding neural network (decoder), which is configured with the same number of fully connected dense layers as the encoder; the decoder then reconstructs the training data.
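The encode-sample-decode path described above can be sketched in plain NumPy. This is an illustrative forward pass only (random untrained weights, a single dense layer standing in for the configurable multi-layer stacks, and arbitrary sizes m and h), not the repository's Keras implementation:

```python
import numpy as np

rng = np.random.default_rng(1)
m, h = 1000, 50                      # input size m, latent size h (h << m)

x = rng.normal(size=(1, m))          # one expression profile

# Encoder: a single dense layer standing in for the configurable stack,
# emitting a mean vector and a log-variance vector, each of length h.
W_enc = rng.normal(scale=0.01, size=(m, 2 * h))
stats = x @ W_enc
mu, log_var = stats[:, :h], stats[:, h:]

# Reparameterization trick: sample z = mu + sigma * eps with eps ~ N(0, I),
# so the sampling step stays differentiable with respect to mu and sigma.
eps = rng.normal(size=mu.shape)
z = mu + np.exp(0.5 * log_var) * eps

# Decoder: mirrors the encoder and maps the sampled z back to a
# reconstruction of the m-dimensional input.
W_dec = rng.normal(scale=0.01, size=(h, m))
x_hat = z @ W_dec
```

During training, the loss would combine the reconstruction error between `x` and `x_hat` with the KL divergence between N(μ, σ²) and the standard normal prior.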

Results

[Image of Table 1]

