
VAE for chemotherapy drug response prediction

Code repository accompanying the paper "Predicting chemotherapy response using a variational autoencoder approach" by Qi Wei and Stephen A. Ramsey, BMC Bioinformatics (2021) 22:453.

The full text of the paper is available at: https://doi.org/10.1186/s12859-021-04339-6

You can cite the paper using the following BibTeX entry:

@article{wei2021predicting,
  author  = {Wei, Qi and Ramsey, Stephen A.},
  title   = {Predicting chemotherapy response using a variational autoencoder approach},
  journal = {BMC Bioinformatics},
  year    = {2021},
  month   = sep,
  volume  = {22},
  pages   = {453},
  doi     = {10.1186/s12859-021-04339-6}
}

Introduction & Installation

The idea is to use a variational autoencoder (VAE) to extract a low-dimensional latent representation from gene expression data, and then to use that representation to predict chemotherapy response across various cancer types with an XGBoost classifier.
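The two-stage idea can be sketched as follows. This is a minimal illustration with toy data, not the repository's actual code: the latent matrix `z` stands in for the trained VAE encoder's output, the response labels are synthetic, and scikit-learn's `GradientBoostingClassifier` stands in for XGBoost so the sketch stays self-contained.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Stage 1 (placeholder): suppose the trained VAE encoder has already mapped
# m-dimensional expression profiles down to h latent features per tumor.
n_samples, h = 200, 50
z = rng.normal(size=(n_samples, h))             # latent encodings
y = (z[:, 0] + 0.5 * z[:, 1] > 0).astype(int)   # toy response labels

# Stage 2: fit a gradient-boosted classifier on the latent features
# (the paper uses XGBoost; this is a scikit-learn stand-in).
z_tr, z_te, y_tr, y_te = train_test_split(z, y, random_state=0)
clf = GradientBoostingClassifier(random_state=0).fit(z_tr, y_tr)
acc = clf.score(z_te, y_te)
```

In the actual pipeline, `z` would come from encoding each tumor's expression profile with the trained VAE, and the labels would be clinical response annotations.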

You can create a virtual environment containing all necessary packages from the provided environment file:

conda env create -f tensorflow-gpu-environment-stable.yml

Note: you need to install CUDA and cuDNN on your workstation first. You can find installation instructions here: https://docs.nvidia.com/deeplearning/cudnn/install-guide/index.html

Training Dataset

Code Structure

Notebook version of codes: https://github.com/ATHED/VAE_for_chemotherapy_drug_response_prediction/tree/master/Notebooks

Diagrammatic representation of the VAE-XGBoost method that we used for predicting tumor response.

[Figure: our VAE model]

The input layer takes the top 20 most variably expressed genes (size m). A configurable stack of fully connected dense layers (e.g., three or six layers) serves as the encoding neural network (encoder). The encoder outputs two vectors of configurable latent size h (h is a manually selected latent vector size, e.g., 50, 500, or 1000, with h << m for dimensionality reduction): a vector of means μ and a vector of standard deviations σ. Together these parameterize a vector of h random variables, where the ith elements μ_i and σ_i are the mean and standard deviation of the ith random variable z_i. The sampled encoding z is then passed to the decoding neural network (decoder), which is configured with the same number of fully connected dense layers as the encoder; the decoder then reconstructs the training data.
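The encode-sample-decode path described above can be sketched in plain NumPy. This is an illustrative forward pass only (random untrained weights, a single dense layer standing in for the configurable multi-layer stacks, and arbitrary sizes m and h), not the repository's Keras implementation:

```python
import numpy as np

rng = np.random.default_rng(1)
m, h = 1000, 50                      # input size m, latent size h (h << m)

x = rng.normal(size=(1, m))          # one expression profile

# Encoder: a single dense layer standing in for the configurable stack,
# emitting a mean vector and a log-variance vector, each of length h.
W_enc = rng.normal(scale=0.01, size=(m, 2 * h))
stats = x @ W_enc
mu, log_var = stats[:, :h], stats[:, h:]

# Reparameterization trick: sample z = mu + sigma * eps with eps ~ N(0, I),
# so the sampling step stays differentiable with respect to mu and sigma.
eps = rng.normal(size=mu.shape)
z = mu + np.exp(0.5 * log_var) * eps

# Decoder: mirrors the encoder and maps the sampled z back to a
# reconstruction of the m-dimensional input.
W_dec = rng.normal(scale=0.01, size=(h, m))
x_hat = z @ W_dec
```

During training, the loss would combine the reconstruction error between `x` and `x_hat` with the KL divergence between N(μ, σ²) and the standard normal prior.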

Results

[Image of Table 1]

