Denoising Autoencoder

By Oscar Bennett, 2019

Overview

This is a TensorFlow (1.x) implementation of a simple denoising autoencoder applied to the MNIST dataset. A denoising autoencoder is an encoder-decoder neural network that compresses data down to a lower-dimensional representation in an unsupervised manner. Because it is trained to reconstruct clean inputs from corrupted ones, it learns to remove noise in the process. A nice explanation of the theory can be found here.

The Dataset

The MNIST dataset is a large collection of images of handwritten digits. It's a nice, simple dataset for demonstrating data science and machine learning methods and benchmarking their performance.

Example Results

Some example results after applying Gaussian noise to MNIST digit images of 0s and 5s are shown here:

[Images: Original, Corrupted, Reconstruction]

As you can see, the model can get pretty good at finding a signal in a lot of noise!
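To make the corruption step concrete, here is a minimal NumPy sketch of adding Gaussian noise to a batch of flattened images. It is not taken from run.py: the function name `corrupt`, the random stand-in data, and the choice of zero-mean noise scaled by a single magnitude are all assumptions for illustration.

```python
import numpy as np


def corrupt(images, noise_mag, rng):
    """Return images plus zero-mean Gaussian noise with std noise_mag (hypothetical helper)."""
    noise = rng.normal(loc=0.0, scale=noise_mag, size=images.shape)
    return images + noise


rng = np.random.default_rng(0)

# Stand-in for a batch of flattened 28x28 MNIST images with pixel values in [0, 1].
clean = rng.random((4, 28 * 28))
noisy = corrupt(clean, noise_mag=0.5, rng=rng)

# A denoising autoencoder receives `noisy` as input but is scored against `clean`.
# This is the error a model would get by simply copying its input to its output.
baseline_mse = np.mean((noisy - clean) ** 2)
print(noisy.shape, baseline_mse)
```

A trained model should achieve a reconstruction error well below this copy-the-input baseline.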

How To Run

To run, clone the repo and then execute the following commands:

> conda create -n Denoise_AE python=3.7 pip
> source activate Denoise_AE
> pip install -r requirements.txt
> python run.py

This will set up the conda environment, train the model, save it, and then generate and plot some example reconstructions chosen randomly from the validation set (like those shown above).

The final trained TensorFlow model checkpoints are saved in a model/ directory.

Method Details

To improve the performance of the model I implemented a few basic model and training features such as batch normalisation, early stopping, L2 regularisation, and encoding-decoding weight tying. If you're curious about these techniques just follow the links to discover more.
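As a rough illustration of weight tying and L2 regularisation, the NumPy sketch below builds a forward pass in which the decoder reuses the transposed encoder weights. The layer sizes and the L2 coefficient are the defaults listed in run.py, but everything else (the ReLU activations, the linear output layer, untied biases, the weight initialisation, and the exact form of the penalty) is an assumption and may differ from the actual implementation. Batch normalisation is left out for brevity.

```python
import numpy as np

# Encoder half of the network: 784 -> 300 -> 100 -> 50 (sizes from run.py's defaults).
layer_sizes = [28 * 28, 300, 100, 50]
l2_loss_lambda = 0.00001

rng = np.random.default_rng(0)

# Only the encoder owns weight matrices; the decoder reuses their transposes.
encoder_weights = [
    rng.normal(scale=0.01, size=(n_in, n_out))
    for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:])
]
# Assumption: biases are not tied, so each layer keeps its own.
encoder_biases = [np.zeros(n) for n in layer_sizes[1:]]
decoder_biases = [np.zeros(n) for n in reversed(layer_sizes[:-1])]


def relu(x):
    return np.maximum(x, 0.0)


def forward(x):
    """Encode then decode x, using W.T of each encoder layer in reverse order."""
    h = x
    for w, b in zip(encoder_weights, encoder_biases):
        h = relu(h @ w + b)
    n_decoder_layers = len(decoder_biases)
    for i, (w, b) in enumerate(zip(reversed(encoder_weights), decoder_biases)):
        h = h @ w.T + b
        if i < n_decoder_layers - 1:  # assumption: the output layer is linear
            h = relu(h)
    return h


# Assumed form of the L2 penalty: lambda times the sum of squared weights.
l2_penalty = l2_loss_lambda * sum(np.sum(w ** 2) for w in encoder_weights)

n_tied_weights = sum(w.size for w in encoder_weights)
reconstruction = forward(np.zeros((2, 28 * 28)))
print(reconstruction.shape, n_tied_weights, l2_penalty)
```

With tying, the network stores 270,200 weight values instead of the 540,400 an untied network of the same shape would need, which acts as an additional constraint on the model.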

Variable Parameters

The effect of these techniques as well as changes to the structure of the network can be explored by altering the hyperparameter variables near the top of the run.py file:

##### Variable Hyperparameters #####

max_n_epochs = 150
patience = 5
batch_size = 50

l1_loss_lambda = 0
l2_loss_lambda = 0.00001
TIE_WEIGHTS = True
BATCH_NORM = True

n_inputs = 28*28
n_hidden1 = 300
n_hidden2 = 100
n_hidden3 = 50
n_hidden4 = 100
n_hidden5 = 300
n_outputs = 28*28
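
The sketch below shows how `max_n_epochs` and `patience` typically interact in early stopping. It is a plain-Python simulation over a made-up list of validation losses, not the training loop from run.py. The function name and the exact stopping rule (stop after `patience` consecutive epochs without a new best validation loss) are assumptions.

```python
def simulate_early_stopping(val_losses, max_n_epochs=150, patience=5):
    """Return (best_epoch, best_loss, epochs_run) for per-epoch validation losses."""
    best_loss = float("inf")
    best_epoch = -1
    epochs_without_improvement = 0
    epochs_run = 0

    for epoch, loss in enumerate(val_losses):
        if epoch >= max_n_epochs:
            break  # hard cap on training length
        epochs_run += 1
        if loss < best_loss:
            # New best: a real training loop would save a checkpoint here.
            best_loss, best_epoch = loss, epoch
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                break  # no improvement for `patience` epochs in a row

    return best_epoch, best_loss, epochs_run


# Made-up validation losses: improvement stalls after epoch 2.
simulated_losses = [0.9, 0.7, 0.6, 0.61, 0.62, 0.63, 0.64, 0.65, 0.5]
result = simulate_early_stopping(simulated_losses, max_n_epochs=150, patience=5)
print(result)
```

In this simulation training stops after eight epochs, so the final loss of 0.5 is never reached. A larger `patience` trades longer training for a lower chance of stopping on a temporary plateau.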

Have fun playing with it. I've also included a noise_mag_ph TensorFlow placeholder which allows you to experiment with injecting different amounts of noise at different stages of the training and inference process. Just alter the value passed for it in the feed_dicts.
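
One possible experiment is to vary the noise magnitude over the course of training. The helper below is a hypothetical sketch, not part of run.py: it computes a linearly decaying value per epoch that you could pass as the value for `noise_mag_ph` in the training feed_dict. The function name and the start and end magnitudes are illustrative assumptions.

```python
def noise_for_epoch(epoch, n_epochs, start_mag=1.0, end_mag=0.2):
    """Linearly interpolate the noise magnitude from start_mag to end_mag (hypothetical)."""
    if n_epochs <= 1:
        return end_mag
    frac = min(epoch, n_epochs - 1) / (n_epochs - 1)
    return start_mag + frac * (end_mag - start_mag)


# Magnitude to feed at each of five epochs, e.g. as the value for noise_mag_ph.
schedule = [noise_for_epoch(epoch, n_epochs=5) for epoch in range(5)]
print(schedule)
```

At inference time you could feed a fixed magnitude instead, for example the value you want to evaluate the reconstructions against.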

Please feel free to let me know about any suggestions or issues!
