# LB01a.0 Simple Autoencoder (40%)


The term autoencoder (AE) refers to a neural network that is trained on a very specific task: to reconstruct its input $\vec{x}$ as its output, $\vec{y} \approx \vec{x}$. A machine learning (ML) model of this type always consists of two major building blocks: an encoder and a decoder. The algorithms used in typical AEs are exactly the same as those in other artificial neural networks (ANNs). AEs are therefore still function estimators of their inputs, and they are trained by gradient descent: the gradient of the loss function with respect to the weights is computed and used to adjust them. Since the copying task does not require labels, AEs belong to the unsupervised learning branch of ML, and they can be seen as some of the first deep learning (DL) models that were trained successfully.

<img src="resources/LB01a_simple_autoencoder.png"/>

The architecture of an AE, shown in the figure above, can be split into the encoder, which is
used to generate a code $\vec{h}$ through some encoding function:

<center>$\large \vec{h} = \mathrm{encode}(\vec{x})$</center>

In the related literature, $\vec{h}$ is also called the representation, the latent space or the context vector. Using this code, the decoder tries to find a reconstruction $\vec{y}$ that resembles $\vec{x}$ by decoding with:

<center>$\large \vec{y} = \mathrm{decode}(\vec{h})$</center>
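To make the two mappings concrete, the following is a minimal NumPy sketch of one possible encoder/decoder pair. It assumes a single dense layer on each side with a ReLU encoder and a sigmoid decoder (the activations hinted at later in this notebook). The toy sizes and the names `W_enc`, `b_enc`, `W_dec` and `b_dec` are hypothetical, and the weights are random rather than trained.

```python
import numpy as np

rng = np.random.default_rng(0)

input_dim, code_dim = 8, 3  # toy sizes chosen for illustration only

# Randomly initialised (untrained) parameters of the two layers
W_enc = rng.normal(scale=0.1, size=(code_dim, input_dim))
b_enc = np.zeros(code_dim)
W_dec = rng.normal(scale=0.1, size=(input_dim, code_dim))
b_dec = np.zeros(input_dim)

def encode(x):
    # h = ReLU(W_enc x + b_enc)
    return np.maximum(0.0, W_enc @ x + b_enc)

def decode(h):
    # y = sigmoid(W_dec h + b_dec)
    return 1.0 / (1.0 + np.exp(-(W_dec @ h + b_dec)))

x = rng.random(input_dim)  # input with values in [0, 1)
h = encode(x)              # code with fewer entries than x
y = decode(h)              # reconstruction (poor, since nothing is trained)

print(h.shape, y.shape)
```

Training would adjust the four parameter arrays so that `y` moves closer to `x`. The sketch only shows the data flow and the shapes involved.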

Typically, AEs are constrained in some way, for example by giving $\vec{h}$ fewer dimensions than $\vec{x}$. This makes the copying task harder and prevents an exact reconstruction of $\vec{x}$. The behavior is intended, since it compels the neural network to prioritize the parts of $\vec{x}$ that are most relevant for the reconstruction task. Thus, the information needed to reconstruct a particular $\vec{x}$ is stored in $\vec{h}$, while the redundancy shared across the input data is captured by the trainable parameters of the AE.

In [None]:
# Importing the packages needed for this lecture
import sys
import os
import os.path
import numpy as np
import tensorflow as tf

import keras as K
from keras.datasets import mnist
from keras.layers import Input, Dense
from keras.models import Model, Sequential, load_model
from keras.callbacks import Callback, EarlyStopping, TensorBoard

import matplotlib.pyplot as plt
from matplotlib.lines import Line2D
import time

from skimage.metrics import mean_squared_error
from datetime import datetime

In [None]:
# Define the log folder for TensorBoard (used to visualize training curves) and the model folder
logdir = "logs/"
modeldir = "models/"

# Create both folders if they do not exist yet
os.makedirs(logdir, exist_ok=True)
os.makedirs(modeldir, exist_ok=True)

In [None]:
# Function for plotting a selection of images: original vs. encoded vs. decoded
def plot_encoded_img(imgs, encoded_img, rnd_idx, aspect_ratio=0.1, decoded_img=None, title=None):
    # number of columns in the plot grid; derived here so the function does not depend on a global variable
    num_images = len(rnd_idx)
    plt.figure(figsize=(18, 8))
    if title is not None:
        plt.suptitle(title, fontsize=16)

    for i, image_idx in enumerate(rnd_idx):
        # plot original image (input, x)
        ax = plt.subplot(3, num_images, i + 1)
        plt.imshow(imgs[image_idx].reshape(28, 28))
        plt.gray()
        ax.get_xaxis().set_visible(False)
        ax.get_yaxis().set_visible(False)

        # plot encoded image (latent space, h)
        ax = plt.subplot(3, num_images, num_images + i + 1)
        plt.imshow(encoded_img[image_idx].reshape(-1, 1), aspect=aspect_ratio)
        plt.gray()
        ax.get_xaxis().set_visible(False)
        ax.get_yaxis().set_visible(False)

        if decoded_img is not None:
            # plot reconstructed image (output, y)
            ax = plt.subplot(3, num_images, 2 * num_images + i + 1)
            plt.imshow(decoded_img[image_idx].reshape(28, 28))
            plt.gray()
            ax.get_xaxis().set_visible(False)
            ax.get_yaxis().set_visible(False)

## LB01a.1 Data preparation

* Load the [MNIST](http://yann.lecun.com/exdb/mnist/) data using `mnist.load_data()`

* Prepare the input images:
    * Convert the images to the `float32` data type (`.astype()`)
    * Scale the pixel values to the interval [0, 1]
    * The images have a resolution of 28x28 pixels; flatten each one into a 1-dimensional vector using `.reshape()`
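As an illustration of these preparation steps, here is a minimal sketch on random stand-in data rather than on MNIST itself. The array `dummy_imgs` and the derived names are hypothetical, and the sketch assumes 8-bit pixel values in the range 0 to 255.

```python
import numpy as np

# Stand-in for image data: 5 random 28x28 images with integer pixel values in 0..255
rng = np.random.default_rng(0)
dummy_imgs = rng.integers(0, 256, size=(5, 28, 28), dtype=np.uint8)

# Convert to float32 and scale to the interval [0, 1]
dummy_scaled = dummy_imgs.astype('float32') / 255.0

# Flatten each image into a vector with 28 * 28 = 784 entries
dummy_flat = dummy_scaled.reshape(len(dummy_scaled), -1)

print(dummy_flat.shape, dummy_flat.dtype)
```

Passing `-1` to `.reshape()` lets NumPy infer the length of the flattened axis from the remaining dimensions.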


In [None]:
# TODO: load the mnist image data (28x28px)
(x_train, _), (x_test, _) = ...

# TODO: convert the images to float32 and scale them to the interval [0, 1]
x_train = ...
x_test = ...

# TODO: flatten each 28x28 image into a 1d vector of shape (784,)
x_train = ...
x_test = ...

# print the new flattened shapes of x_train and x_test
print(x_train.shape)
print(x_test.shape)

## LB01a.2 Autoencoder definition

* Define an autoencoder with a latent space of dimensionality $d = 128$
* Two dense layers (encoder, decoder) are needed
* Hint: use `ReLU` as the activation function of the encoding layer and `sigmoid` for the decoding layer.
* Choose an appropriate optimizer and loss function for your task
* Compute and print the compression factor, i.e. the ratio of the input dimensionality to the latent space dimensionality
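The compression factor can be illustrated with a short piece of arithmetic. The sketch below assumes it is defined as the input dimensionality divided by the latent dimensionality, which is one common convention; the `example_*` names are hypothetical and are not the variables used in the exercise cells.

```python
# Assumed definition: input dimensionality divided by latent dimensionality
example_input_dim = 28 * 28   # a flattened 28x28 image has 784 entries
example_encoding_dim = 128    # latent space size d from the task description

example_factor = example_input_dim / example_encoding_dim
print('Compression factor: %.1f' % example_factor)
```

Under this definition, a factor greater than 1 means that the code is smaller than the input.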

In [None]:
# TODO: get dimensions for the AE's input layer
input_dim = ...

# TODO: define the size of the encoded representation, i.e. the latent space
encoding_dim = ...

# TODO: compute compression factor, i.e. dimensionality reduction
compression_factor = ...
print('Compression factor: %.1f' % compression_factor)

In [None]:
# TODO: define the AE's architecture using the sequential model, give your model the name "Simple_Autoencoder"
autoencoder = ...

# TODO: add needed layers to your model
...

print(autoencoder.summary())

In [None]:
# TODO: compile the autoencoder with a suitable optimizer and loss function
autoencoder.compile(...)

In [None]:
# TODO: set the number of epochs to 50
max_epochs = ...

In [None]:
# the timestamp avoids ':' because colons are not allowed in folder names on some file systems
tensorboard_callback = TensorBoard(log_dir=logdir + "AE_Simple_" + datetime.now().strftime("%Y.%m.%d-%H.%M.%S"))

# TODO: train the autoencoder. Think about what an AE is supposed to do and
# TODO: choose input and target accordingly (input = target).
# TODO: Shuffle the training data and provide a validation split.
# TODO: Pass the tensorboard callback to the fit command.
# TODO: Set the number of epochs to max_epochs.
autoencoder.fit(...)

# TODO: save the entire model graph with weights to the following model path
model_path = modeldir + 'autoencoder.h5'
...

del autoencoder

To see the training curves, start TensorBoard in your Docker container by running the following command from the working directory (e.g. `/notebooks/<your-working-directory>/`):

`tensorboard --logdir logs --host 0.0.0.0`

Please note that the `--logdir` parameter has to point to the same folder as your `logdir` variable.

Afterwards, open [http://localhost:6006](http://localhost:6006) in your web browser.


In [None]:
if not os.path.isfile(modeldir + 'autoencoder.h5'):
    print('No model file for autoencoder found.')
    sys.exit(-1)
    
# load the entire model (no compilation necessary)
autoencoder = load_model(modeldir + 'autoencoder.h5')

## LB01a.3 Evaluation

* Extract the encoder layer of the autoencoder for the visualization
* Use the autoencoder model to predict the test images
* Compute the average mean squared error of all images in the test set
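One way to compute such an average is sketched below with plain NumPy on random stand-in arrays. The names `originals` and `reconstructions` are hypothetical, and the sketch assumes that both arrays hold one flattened image per row. The `mean_squared_error` function imported from `skimage.metrics` at the top of this notebook could presumably be applied to each image pair instead.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-ins for flattened test images and their reconstructions (4 images, 784 pixels each)
originals = rng.random((4, 784)).astype('float32')
noise = rng.normal(scale=0.05, size=originals.shape)
reconstructions = np.clip(originals + noise, 0.0, 1.0)

# MSE per image: mean of the squared pixel differences along the pixel axis
mse_per_image = np.mean((originals - reconstructions) ** 2, axis=1)

# Average over all images
example_avg_mse = mse_per_image.mean()
print('Average MSE: %.4f' % example_avg_mse)
```

Because every image has the same number of pixels, averaging the per-image values gives the same result as taking the mean over all pixels at once.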

In [None]:
# TODO: extract just the encoder part of the autoencoder to visualize the encoded representation (latent space)
encoder_layer = ...

# TODO: create a new sequential model containing only the encoder part
encoder = ...
encoder.add(...)

print(encoder.summary())

In [None]:
# TODO: encode the test images using just the encoder model
encoded_imgs = ...
# TODO: encode/decode the test images with the full autoencoder model
decoded_imgs = ...

# TODO: compute the average MSE over all test images (original/decoded)
...

avg_mse = ...
print('Average MSE for all original/decoded images: %.4f' % avg_mse)

In [None]:
# Plot a random selection of images: original vs. encoded vs. decoded.
# The random indices below are one example of how to select matching
# original/encoded/decoded images from the test set.
num_images = 20
np.random.seed(42)
random_images = np.random.randint(x_test.shape[0], size=num_images)
plot_encoded_img(x_test, encoded_imgs, random_images, aspect_ratio=0.1, decoded_img=decoded_imgs, title='Simple Autoencoder - Original vs. Encoded vs. Decoded')
plt.show(block=True)