Oversampling methods such as SMOTE are very popular for addressing imbalanced-data classification. However, feature-space, distance-based algorithms such as SMOTE fail to capture the true joint distribution of features and labels, whereas generative models such as the Variational Autoencoder (VAE) can learn the underlying joint distribution. Inspired by this line of work, we proposed a new generative method, VoS: a Method for Variational OverSampling of Imbalanced Data. Note that we have made material changes since the arXiv version; the updated paper is currently under review.
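To make the contrast concrete, here is a minimal sketch of the feature-space interpolation that SMOTE-style oversampling performs. This is an illustrative toy implementation (the function name `smote_like_sample` and the parameters are ours, not from this repo or the `imbalanced-learn` library): each synthetic point is a random convex combination of a minority point and one of its nearest minority neighbours, so new points always lie on segments between existing ones, regardless of the true joint distribution.

```python
import numpy as np

def smote_like_sample(X_min, k=3, n_new=4, rng=None):
    """Toy SMOTE-style oversampling: interpolate between each chosen
    minority point and one of its k nearest minority neighbours."""
    rng = np.random.default_rng(rng)
    out = []
    for _ in range(n_new):
        i = rng.integers(len(X_min))
        # Euclidean distances from point i to all minority points
        d = np.linalg.norm(X_min - X_min[i], axis=1)
        nn = np.argsort(d)[1:k + 1]          # skip the point itself
        j = rng.choice(nn)
        lam = rng.random()                   # interpolation factor in [0, 1)
        out.append(X_min[i] + lam * (X_min[j] - X_min[i]))
    return np.array(out)

# Four minority points at the corners of the unit square
X_min = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
X_new = smote_like_sample(X_min, k=2, n_new=5, rng=0)
print(X_new.shape)  # (5, 2)
```

Every synthetic point stays inside the convex hull of the minority class, which is exactly the limitation a generative model can avoid.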
In a Conditional VAE (CVAE), we inject the label information in both the encoding and decoding phases. Once the model is trained, we can sample from the latent space z ~ Q(z|X, y), usually a Gaussian distribution, to generate synthetic data. In our paper we showed empirically that the CVAE outperformed other oversampling approaches such as SMOTE, CGAN, and ACGAN when generating minority-class examples for imbalanced data.
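The sampling step above can be sketched in a few lines of numpy. This is not the repo's implementation; the encoder outputs (`mu`, `log_var`) are stand-in arrays and the two-class one-hot labelling is a hypothetical setup, used only to show the reparameterization trick and how the label conditions the decoder input.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-ins for encoder outputs Q(z|X, y): in the real model these come
# from a neural network; here they are fixed arrays for illustration.
mu = np.zeros((5, 2))          # latent means, latent_dim = 2
log_var = np.zeros((5, 2))     # latent log-variances

# Reparameterization trick: z = mu + sigma * eps, with eps ~ N(0, I)
eps = rng.standard_normal(mu.shape)
z = mu + np.exp(0.5 * log_var) * eps

# Condition the decoder on the minority label by concatenating a
# one-hot label vector to z (hypothetical 2-class problem).
y_minority = np.tile([0.0, 1.0], (5, 1))
decoder_input = np.concatenate([z, y_minority], axis=1)
print(decoder_input.shape)  # (5, 4)
```

At generation time, feeding the decoder z drawn from the prior together with the minority one-hot label yields new minority-class samples.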
First, clone this project to your local machine:
$ git clone git@github.com:HongleiXie/demo-CVAE.git
$ cd demo-CVAE
Next, run the following commands to set up the environment:
$ conda env create -f environment.yml # or environment-windows.yml on Windows
$ conda activate demo
Next, train the model locally (make sure the demo environment is activated):
$ python train.py --batch_size 32 --EPOCHS 10 --latent_dim 2 --inter_dim 128
Make sure you can see the checkpoint saved in the ~/saved_model/ folder.
You may also check out the log files under ~/logs/.
Now you can launch the app:
$ streamlit run app.py
Follow the URL shown in your terminal. Done!