# Table of Contents
* [Intro](#Intro)
	* [Autoencoder](#Autoencoder)
* [Numbers Encoding (Keras)](#Numbers-Encoding-%28Keras%29)
* [MNIST (Keras)](#MNIST-%28Keras%29)
* [Variational Autoencoders](#Variational-Autoencoders)
	* [Encoder](#Encoder)
	* [Decoder](#Decoder)
	* [V-Autoencoder Model](#V-Autoencoder-Model)


# Intro

Exploratory notebook on autoencoders. It includes toy implementations and tests of related techniques.

## Autoencoder

The goal of an autoencoder is to learn a compressed, distributed representation of a dataset. In the most general case the autoencoder is then required to reconstruct the original input as accurately as possible. The technique implicitly performs feature extraction and learning, and the learned features often outperform handcrafted ones.

For a single-layer feedforward net this can be achieved by using a hidden size smaller than the input size and training on a loss function that measures how well the net reconstructs the input data. If the hidden size is equal to or larger than the input size, the net can simply learn the identity function.
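
The bottleneck idea can be sketched without Keras. The snippet below is a minimal illustration, not the notebook's model: it assumes a purely linear autoencoder, synthetic data that lies on a 3-dimensional subspace, and hand-written gradient descent. All names (`W_enc`, `W_dec`, `reconstruction_loss`) are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

# toy data: 200 samples in 8 dimensions that lie on a 3-dimensional subspace
latent = rng.normal(size=(200, 3))
mixing = rng.normal(size=(3, 8))
X = latent @ mixing

# bottleneck: hidden size smaller than input size
input_size, hidden_size = X.shape[1], 3
W_enc = rng.normal(scale=0.1, size=(input_size, hidden_size))
W_dec = rng.normal(scale=0.1, size=(hidden_size, input_size))


def reconstruction_loss(X, W_enc, W_dec):
    # mean squared error between the input and its reconstruction
    return np.mean((X @ W_enc @ W_dec - X) ** 2)


initial_loss = reconstruction_loss(X, W_enc, W_dec)

learning_rate = 0.005
for step in range(3000):
    hidden = X @ W_enc                 # encode
    error = hidden @ W_dec - X         # decode and compare with the input
    grad_dec = hidden.T @ error / len(X)
    grad_enc = X.T @ (error @ W_dec.T) / len(X)
    W_dec -= learning_rate * grad_dec
    W_enc -= learning_rate * grad_enc

final_loss = reconstruction_loss(X, W_enc, W_dec)
print(initial_loss, final_loss)
```

The training target is the input itself, which is the same pattern as the `model.fit(nums, nums, ...)` call used later in this notebook.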

Additional concepts:
* sparsity and regularization
* Denoising Autoencoders (DAE): the net is trained to map a corrupted version of the input to the original, uncorrupted input
* Variational Autoencoder
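
For the denoising variant, the only change to the training data is the corruption step. The sketch below assumes masking noise (randomly zeroing entries), which is one common choice among several. The helper `corrupt` and its `drop_prob` parameter are hypothetical names, not part of this notebook.

```python
import numpy as np


def corrupt(inputs, drop_prob=0.3, seed=0):
    """Return a copy of `inputs` with roughly `drop_prob` of the entries set to zero."""
    rng = np.random.default_rng(seed)
    keep_mask = rng.random(inputs.shape) >= drop_prob
    return inputs * keep_mask


clean = np.ones((4, 6), dtype="float32")
noisy = corrupt(clean)
print(noisy)

# a denoising autoencoder would then be trained on (noisy, clean) pairs,
# e.g. something like model.fit(noisy, clean, ...) with a Keras model
```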

In [None]:
import time
import numpy as np
import pdb
import sys
import os
import seaborn as sns

import matplotlib
import matplotlib.pyplot as plt
from matplotlib import animation

from keras.models import Sequential
from keras.models import Model
from keras.layers import Activation, Dense
from keras import backend as K
from keras import optimizers

sns.set_style("dark")
sns.set_context("paper")

%matplotlib notebook

sys.path.append(os.path.join(os.getcwd(), os.pardir))
from utils.plot_utils import plot_sample_imgs

# Numbers Encoding (Keras)

An autoencoder that tries to learn a compressed (possibly binary) representation of one-hot encoded numbers, for example:
    
1 = 00001  
2 = 00010  
3 = 00100  
4 = 01000  
5 = 10000
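
As a rough check on how much compression is possible, the number of bits a purely binary code would need can be computed directly. This is only an illustration of the target representation; nothing forces the trained net to find exactly this code. `num_values` mirrors the `input_dim = 10` used in the cells that follow.

```python
import numpy as np

num_values = 10

# a binary code needs ceil(log2(n)) bits to distinguish n values
bits_needed = int(np.ceil(np.log2(num_values)))

# binary code of each index, most significant bit first
bit_positions = np.arange(bits_needed)[::-1]
codes = (np.arange(num_values)[:, None] >> bit_positions) & 1

print(bits_needed)
print(codes)
```

Since 4 bits are enough for 10 values, a hidden layer of size `input_dim // 2 = 5` leaves room for a lossless code.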

In [None]:
# create one-hot encoded numbers
input_dim = 10
nums = np.eye(input_dim)[np.arange(input_dim)]
nums

In [None]:
# model parameters
hidden_size = input_dim//2

# Keras model
model = Sequential()
model.add(Dense(hidden_size, input_dim=input_dim, activation=K.sigmoid))
model.add(Dense(input_dim, activation=K.sigmoid))
          
# compile model
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])

In [None]:
# fit model
model.fit(nums, nums, epochs=100)

In [None]:
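model.summary()
# the hidden (bottleneck) layer is the first Dense layer; look its name up
# instead of hardcoding it, since Keras numbers layer names per session
layer_name = model.layers[0].name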

In [None]:
# hidden layer weights
sns.heatmap(model.get_layer(layer_name).get_weights()[0])
plt.show()

In [None]:
# get hidden layer output building "intermediate model"
intermediate_layer_model = Model(inputs=model.input,
                                 outputs=model.get_layer(layer_name).output)
intermediate_output = intermediate_layer_model.predict(nums)

In [None]:
intermediate_output

In [None]:
# predictions
sns.heatmap(model.predict(nums[np.array([1,2,3,5,6])]))
plt.show()

# MNIST (Keras)

Train autoencoder on the MNIST dataset.

In [None]:
from keras.datasets import mnist

In [None]:
(X_train, y_train), (X_test, y_test) = mnist.load_data()

In [None]:
# flatten 28*28 images to a 784 vector for each image
num_pixels = X_train.shape[1] * X_train.shape[2]
# get only subset of images
num_images = 1000
X_train = X_train[:num_images].reshape(num_images, num_pixels).astype('float32')
X_test = X_test[:num_images].reshape(num_images, num_pixels).astype('float32')

In [None]:
# normalize inputs from 0-255 to 0-1
X_train = X_train / 255
X_test = X_test / 255

In [None]:
# Keras model
model = Sequential()
model.add(Dense(512, input_dim=num_pixels, activation=K.relu))
model.add(Dense(256, activation=K.relu))
model.add(Dense(512, activation=K.relu))
model.add(Dense(num_pixels, activation=K.relu))
          
# compile model
model.compile(loss='mean_squared_error', optimizer='adam', metrics=['accuracy'])

In [None]:
model.summary()

In [None]:
model.fit(X_train, X_train, batch_size=100, epochs=10)

In [None]:
# show original test example
plt.imshow(X_test[5].reshape(28, 28), cmap='gray')

In [None]:
# show predicted results
pred = model.predict(X_test[5].reshape(1, num_pixels))
plt.imshow(pred.reshape(28, 28), cmap='gray')

In [None]:
# show several original test examples
# (np.random.choice only accepts 1-D arrays, so sample row indices instead)
plot_sample_imgs(lambda size: X_test[np.random.choice(len(X_test), size)], (28, 28),
                 plot_side=5)

In [None]:
# show predicted results (reconstructions of randomly chosen test examples)
plot_sample_imgs(lambda size: model.predict(X_test[np.random.choice(len(X_test), size)]), (28, 28),
                 plot_side=5)

# Variational Autoencoders

Just one constraint separates a standard autoencoder from a variational one: forcing "it to generate latent vectors that roughly follow a unit Gaussian distribution". Generation then consists of sampling a latent vector from that distribution and feeding it to the decoder.

[Source](http://kvfrans.com/variational-autoencoders-explained/)
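
The two ingredients implied by that constraint can be sketched in plain NumPy: sampling a latent vector from a predicted mean and standard deviation, and measuring how far that distribution is from a unit Gaussian with the closed-form KL divergence for diagonal Gaussians. The function names are illustrative assumptions and this is not the loss used in the cells below.

```python
import numpy as np


def sample_latent(mean, std, rng):
    # reparameterization: z = mean + std * eps, with eps ~ N(0, I)
    eps = rng.standard_normal(mean.shape)
    return mean + std * eps


def kl_to_unit_gaussian(mean, std):
    # KL( N(mean, std^2) || N(0, I) ), summed over the latent dimensions
    var = np.square(std)
    return 0.5 * np.sum(var + np.square(mean) - 1.0 - np.log(var), axis=-1)


rng = np.random.default_rng(0)
latent_dim = 10

mean = np.zeros((2, latent_dim))
std = np.ones((2, latent_dim))

z = sample_latent(mean, std, rng)
kl_matching = kl_to_unit_gaussian(mean, std)       # distribution already is a unit Gaussian
kl_shifted = kl_to_unit_gaussian(mean + 1.0, std)  # mean moved away from zero
print(z.shape, kl_matching, kl_shifted)
```

The penalty is zero when the encoder already outputs a unit Gaussian and grows as the mean or standard deviation drifts away from it.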

In [None]:
from keras.layers import *
from keras.models import *
from keras.optimizers import *
from keras.initializers import *
from keras.callbacks import *
from keras.utils.generic_utils import Progbar

In [None]:
img_shape = (28, 28, 1)
latent_dim = 10

## Encoder

In [None]:
# utility for the standard convolution block used in the encoder
def encoder_conv_block(filters, block_input, kernel_size=(3, 3), strides=(1, 1)):
    block = Convolution2D(filters, kernel_size, strides=strides, padding='same')(block_input)
    block = LeakyReLU()(block)
    return block

In [None]:
# takes an image and generates two vectors: means and standard deviations
def encoder_model(input_shape, latent_dim, init_filters=64, num_conv_blocks=2):
    input_image = Input(shape=input_shape)
    
    x = input_image
    for i in range(num_conv_blocks):
        x = encoder_conv_block(init_filters*(2**i), block_input=x)

    features = Flatten()(x)
    
    mean_vector = Dense(latent_dim, activation='linear')(features)
    std_vector = Dense(latent_dim, activation='linear')(features)
    
    return Model(inputs=[input_image], outputs=[mean_vector, std_vector])

In [None]:
# instantiate encoder model
encoder = encoder_model(input_shape=img_shape, latent_dim=latent_dim, init_filters=128)
encoder.summary()

In [None]:
# sanity check: run the untrained encoder on a random image
encoder.predict(np.random.randint(0, 256, (1, 28, 28, 1)))

## Decoder

In [None]:
# utility for the standard upsampling + convolution block used in the decoder
def decoder_deconv_block(filters, block_input, kernel_size=(3, 3), strides=(1, 1)):
    block = UpSampling2D()(block_input)
    block = Convolution2D(filters, kernel_size, strides=strides, padding='same')(block)
    block = LeakyReLU()(block)
    block = BatchNormalization()(block)
    return block

In [None]:
# takes a latent vector (sampled from the prior at generation time) and generates an image
def decoder_model(latent_dim, init_filters=128, init_side=7, num_deconv_blocks=2):
    latent_vector = Input([latent_dim])
    
    # CNN part
    x = Dense(1024)(latent_vector)
    x = LeakyReLU()(x)
    
    x = Dense(init_side*init_side*init_filters)(x)
    x = LeakyReLU()(x)
    x = BatchNormalization()(x)
    x = Reshape((init_side, init_side, init_filters))(x)

    for i in range(num_deconv_blocks):
        # halve the number of filters at each block (128 -> 64 -> 32 by default)
        x = decoder_deconv_block(init_filters//(2**(i+1)), block_input=x)

    x = Convolution2D(1, (2, 2), padding='same', activation='tanh')(x)
    
    return Model(inputs=latent_vector, outputs=x)

In [None]:
# instantiate decoder model
decoder = decoder_model(latent_dim=latent_dim, init_filters=128)
decoder.summary()

In [None]:
# plot random generated image
plt.imshow(decoder.predict([np.random.randn(1, latent_dim)])[0]
           .reshape(28, 28))
plt.show()

## V-Autoencoder Model

In [None]:
# init model components
encoder = encoder_model(input_shape=img_shape, latent_dim=latent_dim, init_filters=128)
decoder = decoder_model(latent_dim=latent_dim, init_filters=128)

In [None]:
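# Build model
input_img = Input(shape=img_shape)
mean_vector, std_vector = encoder(input_img)

# reparameterization trick: z = mean + std * eps, with eps ~ N(0, I)
def sample_latent(args):
    mean, std = args
    return mean + std * K.random_normal(shape=K.shape(mean))

latent_vector = Lambda(sample_latent)([mean_vector, std_vector])
output_img = decoder(latent_vector)

vaut = Model(inputs=input_img, outputs=output_img)
# NOTE (assumption): the original loss was left undefined; plain MSE is used
# as a placeholder reconstruction term. A full VAE loss would also add the
# KL divergence between the encoded distribution and a unit Gaussian.
vaut.compile(loss='mean_squared_error', optimizer=RMSprop(lr=5e-5))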