# Autoencoders: Dimensionality Reduction, Generation, and Clustering

Alex Eftimiades

# Why Use Autoencoders?
* Denoising
* Dimensionality reduction
* Generative models

# Autoencoders Tested
* Vanilla
* Double
* Variational
* Variational with batch-sample-estimated KL loss

# Building Blocks

* Preprocessing: rescale pixel values from [0, max] to [0, 1]
* Convolutional: ReLU
* Dense: ReLU, linear bottleneck layer
* Inverse convolutional: ReLU
* Loss: mean squared error (guarantees finite penalties)
* Binary cross-entropy gave similar results
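A minimal NumPy sketch of the preprocessing step and the two losses mentioned above. The function names and the `eps` clipping constant are illustrative assumptions, not taken from the original code. It shows why MSE is bounded for inputs in [0, 1], while binary cross-entropy relies on clipping to stay finite.

```python
import numpy as np

def rescale(images):
    """Map pixel values from [0, max] to [0, 1]."""
    images = np.asarray(images, dtype=np.float64)
    peak = images.max()
    # Leave an all-zero image unchanged to avoid dividing by zero.
    return images / peak if peak > 0 else images

def mse(target, output):
    """Mean squared error; at most 1 when both arrays lie in [0, 1]."""
    return np.mean((target - output) ** 2)

def binary_cross_entropy(target, output, eps=1e-7):
    """Binary cross-entropy; clipping (eps is an assumed value) keeps the logs finite."""
    output = np.clip(output, eps, 1.0 - eps)
    return -np.mean(target * np.log(output) + (1.0 - target) * np.log(1.0 - output))

pixels = rescale(np.array([0, 64, 128, 255]))

# Worst case: every pixel is reconstructed as its opposite.
target = np.array([1.0, 0.0])
worst = np.array([0.0, 1.0])
worst_mse = mse(target, worst)                   # bounded by 1
worst_bce = binary_cross_entropy(target, worst)  # large, finite only because of clipping
```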

# Hyperparameters

* Batch size: 128
* Epochs: 40
* Kernel size: 4
* Bottleneck dimensions: 2 and 32
* Strides: 2
* Layer filters: (32, 64)
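To see what these settings imply for the encoder, here is a small sketch that tracks the feature-map size through the strided convolutions. It assumes 28x28 inputs and "same" padding, neither of which is stated on the slides. `HPARAMS` and `encoder_feature_shape` are hypothetical names.

```python
import math

# Values copied from the slide; the dictionary keys are illustrative.
HPARAMS = {
    "batch_size": 128,
    "num_epochs": 40,
    "kernel_size": 4,
    "strides": 2,
    "layer_filters": (32, 64),
    "bottleneck_dims": (2, 32),
}

def encoder_feature_shape(input_size, hparams):
    """Spatial size after each strided convolution, assuming 'same' padding."""
    size = input_size
    for _ in hparams["layer_filters"]:
        # With 'same' padding the output size is ceil(input / stride).
        size = math.ceil(size / hparams["strides"])
    return (size, size, hparams["layer_filters"][-1])

# Assumed 28x28 input: 28 -> 14 -> 7, with 64 channels at the end.
shape = encoder_feature_shape(28, HPARAMS)
flattened = shape[0] * shape[1] * shape[2]  # units feeding the dense bottleneck
```

Under these assumptions the dense layers compress 3136 features down to a 2- or 32-dimensional bottleneck.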

# Base Model

<img src="http://localhost:8888/files/deepsig/vanilla/encoder_2.png">

# Vanilla

<img src="http://localhost:8888/files/deepsig/vanilla/digits_over_latent.png">

# Double

* Vanilla autoencoder plus a second objective: reconstruct latent variables
* Latent variables are sampled from independent unit Gaussians
* 10 sets of epochs, alternating between training encoder(decoder) and decoder(encoder)
* Image-reconstruction MSE on the validation set remains similar to vanilla
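The two objectives can be sketched with a toy linear encoder and decoder in NumPy. This is only an illustration of the loss definitions, not the convolutional model from the slides. The dimensions, weight initialization, and function names are all assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
input_dim, latent_dim, batch = 16, 2, 128  # toy sizes, chosen for illustration

# Toy linear maps standing in for the real encoder and decoder networks.
W_enc = rng.normal(scale=0.1, size=(input_dim, latent_dim))
W_dec = rng.normal(scale=0.1, size=(latent_dim, input_dim))

def image_loss(x, W_enc, W_dec):
    """Vanilla objective: MSE between x and decoder(encoder(x))."""
    return np.mean(((x @ W_enc) @ W_dec - x) ** 2)

def latent_loss(z, W_enc, W_dec):
    """Second objective: MSE between z and encoder(decoder(z))."""
    return np.mean(((z @ W_dec) @ W_enc - z) ** 2)

x = rng.random((batch, input_dim))            # stand-in for rescaled images
z = rng.standard_normal((batch, latent_dim))  # independent unit Gaussian latents

loss_images = image_loss(x, W_enc, W_dec)
loss_latents = latent_loss(z, W_enc, W_dec)
```

In training, the optimizer would alternate between minimizing `image_loss` on data batches and `latent_loss` on freshly sampled latents.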

<img src="http://localhost:8888/files/deepsig/double/digits_over_latent.png">

# Variational

* Extra KL loss acts as a regularizer on the latent distributions
* Keeps latent distributions close to independent unit Gaussians
* A 2-dimensional bottleneck is sufficient for generation but not for reconstruction
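For reference, the standard variational autoencoder regularizer has a closed form when the encoder outputs a mean and log-variance per latent dimension. The sketch below uses that textbook formulation. It assumes the encoder is parameterized this way, which the slides do not spell out, and the function names are illustrative.

```python
import numpy as np

def kl_to_unit_gaussian(mu, log_var):
    """KL( N(mu, diag(exp(log_var))) || N(0, I) ), one value per example."""
    return -0.5 * np.sum(1.0 + log_var - mu ** 2 - np.exp(log_var), axis=-1)

def sample_latent(mu, log_var, rng):
    """Reparameterization trick: z = mu + sigma * eps with eps ~ N(0, I)."""
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * log_var) * eps

rng = np.random.default_rng(0)
mu = np.array([[0.0, 0.0], [1.0, 0.0]])
log_var = np.zeros_like(mu)

kl = kl_to_unit_gaussian(mu, log_var)  # zero when the posterior equals the prior
z = sample_latent(mu, log_var, rng)
```

The penalty grows as the encoded means drift from zero or the variances drift from one, which is what pulls the latent codes toward the unit Gaussian prior.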

<img src="http://localhost:8888/files/deepsig/variational/digits_over_latent.png">

# Variational With Sample Estimated Loss

* Latent KL loss derived from estimators computed over each batch
* Regularizer no longer obviously detracts from reconstruction fidelity
* Ultimately produced results similar to the usual variational implementation
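One way such a batch-level estimate could look: compute the per-dimension mean and variance of the latent codes across the batch, then evaluate the KL divergence of that fitted diagonal Gaussian from the unit Gaussian. This is an assumed reading of "estimators over each batch", not the author's confirmed implementation. The function name and `eps` are hypothetical.

```python
import numpy as np

def batch_kl_to_unit_gaussian(z, eps=1e-8):
    """KL of a diagonal Gaussian fitted to the batch of codes z from N(0, I)."""
    mean = z.mean(axis=0)
    var = z.var(axis=0) + eps  # eps (assumed) guards against log(0)
    return 0.5 * np.sum(var + mean ** 2 - 1.0 - np.log(var))

rng = np.random.default_rng(0)
codes = rng.standard_normal((10000, 2))

kl_matched = batch_kl_to_unit_gaussian(codes)        # near 0: batch already looks unit Gaussian
kl_shifted = batch_kl_to_unit_gaussian(codes + 3.0)  # near 9: both means are off by 3
```

Because the penalty applies to the batch as a whole, individual codes are free to sit far from the origin as long as the aggregate statistics match the prior.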

<img src="http://localhost:8888/files/deepsig/variational_sample/digits_over_latent.png">

# Reconstruction Comparison 2D Bottleneck

<img src="http://localhost:8888/files/deepsig/reconstructed_2.png">

Note: the models' worst reconstructions are realistic-looking but wrong digits.

# Reconstruction Comparison 32D Bottleneck

<img src="http://localhost:8888/files/deepsig/reconstructed_32.png">

# K-means Clustering Accuracy 2D Bottleneck

* Vanilla: 61.71%
* Double: 55.26%
* Variational: 17.13%
* Variational Sample Loss: 49.78%
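The slides do not say how cluster IDs were matched to digit labels when scoring accuracy. One common convention, sketched here as an assumption, assigns each cluster its most frequent true label and counts the matches. `clustering_accuracy` is a hypothetical helper.

```python
import numpy as np

def clustering_accuracy(true_labels, cluster_ids):
    """Accuracy after mapping each cluster to its majority true label."""
    true_labels = np.asarray(true_labels)
    cluster_ids = np.asarray(cluster_ids)
    correct = 0
    for cluster in np.unique(cluster_ids):
        members = true_labels[cluster_ids == cluster]
        # Count how many members carry the cluster's most common label.
        correct += np.bincount(members).max()
    return correct / true_labels.size

true_labels = [0, 0, 0, 1, 1, 1, 2, 2]
cluster_ids = [5, 5, 7, 7, 7, 7, 9, 9]
accuracy = clustering_accuracy(true_labels, cluster_ids)  # 7 of 8 points match
```

A stricter alternative is a one-to-one matching between clusters and labels, which can give lower scores when several clusters share the same majority label.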

# K-means Clustering Accuracy 32D Bottleneck

* Vanilla: 85.0%
* Double: 85.12%
* Variational: 36.52%
* Variational Sample Loss: 76.17%

# GMM Clustering Accuracy 2D Bottleneck

* Vanilla: 61.67%
* Double: 54.67%
* Variational: 11.36%
* Variational Sample Loss: 39.47%

# GMM Clustering Accuracy 32D Bottleneck

* Vanilla: 83.64%
* Double: 82.87%
* Variational: 13.67%
* Variational Sample Loss: 49.65%

# Clustering

* Vanilla and Double do comparably well for clustering
* GMM-type analysis may be useful when categories are not mutually exclusive
* May be useful for detecting new objects given few examples
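The point about non-exclusive categories comes from the soft assignments a Gaussian mixture produces. The sketch below computes those membership probabilities for fixed, hand-picked isotropic components. All parameter values and the function name are illustrative assumptions, not fitted results.

```python
import numpy as np

def gmm_responsibilities(points, means, variances, weights):
    """Posterior probability of each isotropic component for each point."""
    points = np.asarray(points, dtype=float)
    means = np.asarray(means, dtype=float)
    variances = np.asarray(variances, dtype=float)
    weights = np.asarray(weights, dtype=float)
    dim = points.shape[1]
    # Squared distance from every point to every component mean.
    sq_dist = ((points[:, None, :] - means[None, :, :]) ** 2).sum(axis=-1)
    log_prob = (np.log(weights)
                - 0.5 * dim * np.log(2.0 * np.pi * variances)
                - 0.5 * sq_dist / variances)
    # Subtract the row maximum before exponentiating for numerical stability.
    log_prob -= log_prob.max(axis=1, keepdims=True)
    resp = np.exp(log_prob)
    return resp / resp.sum(axis=1, keepdims=True)

means = [[0.0, 0.0], [4.0, 0.0]]
resp = gmm_responsibilities(
    points=[[2.0, 0.0], [0.0, 0.0]],
    means=means, variances=[1.0, 1.0], weights=[0.5, 0.5],
)
```

A latent point halfway between two components is split evenly between them, whereas K-means would force it into a single cluster.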

# Future Directions

* ResNet blocks
* Mainstream image processing networks (ResNet, VGG, etc.)
* Multiclass identification
* Further analysis on diverse color images like CIFAR-10
* Softmax output with normalized input
* Combine double and variational with sample estimated loss

# Questions/Demo time!