# Autoencoders

**Exercise objectives**
- Discover autoencoders
- Get a deeper understanding of CNNs

<hr>

In this notebook, we look at a particular architecture used in deep learning: autoencoders. Autoencoders are neural network architectures trained to **output something as close as possible to the very input they were given**. It may seem strange but it's useful, we promise. 

Their value comes from a bottleneck in the network architecture, i.e. a layer with few neurons. If the autoencoder can still reproduce its input, the information that flows through that layer is sufficient to recreate the input data.

In particular, the **information contained at the bottleneck**, meaning the representation of the data at the low-dimensional layer, **captures the data well enough to recreate it**. This has many applications (compression, denoising, etc.).

<img src='https://github.com/lewagon/data-images/blob/master/DL/autoencoder.png?raw=true'>
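
To make the bottleneck idea concrete without a neural network, here is a minimal NumPy sketch. It is not an autoencoder: it uses truncated SVD (i.e. PCA) as a linear stand-in for the encoder/decoder pair, on synthetic data that is assumed to live on a 3-dimensional subspace. All names (`reconstruction_mse`, `errors`, ...) are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data: 200 samples in 20 dimensions that really live on a
# 3-dimensional subspace (a hypothetical stand-in for image data).
latent_true = rng.normal(size=(200, 3))
mixing = rng.normal(size=(3, 20))
X = latent_true @ mixing
X = X - X.mean(axis=0)  # centre the data

# Truncated SVD gives the best *linear* encoder/decoder pair for a given
# bottleneck size k (this is PCA, not a neural autoencoder).
U, S, Vt = np.linalg.svd(X, full_matrices=False)

def reconstruction_mse(k):
    """Encode X into k values per sample, decode, and measure the error."""
    encode = Vt[:k].T      # (20, k): project onto the top k directions
    decode = Vt[:k]        # (k, 20): map the codes back to input space
    codes = X @ encode     # bottleneck representation, shape (200, k)
    X_hat = codes @ decode
    return float(np.mean((X - X_hat) ** 2))

errors = {k: reconstruction_mse(k) for k in (1, 2, 3, 5)}
for k, err in errors.items():
    print(f"bottleneck size {k}: reconstruction MSE = {err:.6f}")
```

In this toy setting the error should drop to (numerically) zero once the bottleneck is as wide as the true dimensionality of the data. The convolutional autoencoder built in this notebook plays the same game with non-linear layers.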

## 1. The data

In this notebook, we will train an autoencoder on 28x28 greyscale images from the MNIST dataset, available in Keras. Run the cells below.

In [3]:
from tensorflow.keras.datasets import mnist

(images_train, labels_train), (images_test, labels_test) = mnist.load_data()
print(images_train.shape)
print(images_test.shape)

In [4]:
# Add a channel dimension and scale pixel values to [0, 1]
X_train = images_train.reshape((60000, 28, 28, 1)) / 255.
X_test = images_test.reshape((10000, 28, 28, 1)) / 255.

In [6]:
# Plot some images
import matplotlib.pyplot as plt

f, axs = plt.subplots(1, 10, figsize=(20, 4))
for i, ax in enumerate(axs):
    ax.axis('off')
    ax.imshow(X_train[i].reshape(28, 28), cmap='Greys')
    
plt.show()

## 2. The encoder

First, we build the "encoder" part for you (in blue in the network picture above).

💡 Notice how it looks like a convolutional classifier with `latent_dimension` classes, except for the `tanh` activation of the final dense layer.

In [7]:
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense

def build_encoder(latent_dimension):
    '''Return an encoder model whose output has `latent_dimension` values per image.'''
    encoder = Sequential()
    
    encoder.add(Conv2D(8, (2,2), input_shape=(28, 28, 1), activation='relu'))
    encoder.add(MaxPooling2D(2))

    encoder.add(Conv2D(16, (2, 2), activation='relu'))
    encoder.add(MaxPooling2D(2))

    encoder.add(Conv2D(32, (2, 2), activation='relu'))
    encoder.add(MaxPooling2D(2))     

    encoder.add(Flatten())
    encoder.add(Dense(latent_dimension, activation='tanh'))
    
    return encoder

❓ **Question** ❓ Build your encoder with  `latent_dimension=2` and look at the number of parameters.
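
As a cross-check for the number reported by the model summary, the parameter count can be sketched by hand in plain Python. This assumes the layer settings of `build_encoder` above with Keras defaults (`'valid'` padding and stride 1 for `Conv2D`, stride equal to the pool size for `MaxPooling2D`); `encoder_parameter_count` is a hypothetical helper, not part of Keras.

```python
def encoder_parameter_count(latent_dimension, input_size=28, input_channels=1):
    """Count the trainable parameters of the encoder above by hand."""
    size, channels, total = input_size, input_channels, 0
    for filters in (8, 16, 32):
        # Conv2D with a (2, 2) kernel: one weight per kernel cell and input
        # channel, plus one bias, for each filter.
        total += (2 * 2 * channels + 1) * filters
        size = size - 2 + 1   # 'valid' convolution with stride 1
        size = size // 2      # MaxPooling2D(2) halves the size, rounding down
        channels = filters
    flattened = size * size * channels
    # Dense layer: one weight per flattened feature plus one bias, per output unit.
    total += (flattened + 1) * latent_dimension
    return flattened, total

flattened, total = encoder_parameter_count(latent_dimension=2)
print(f"flattened features: {flattened}, trainable parameters: {total}")
```

If the figure printed here differs from what your built encoder reports, one of the assumptions above (padding, strides, kernel size) probably does not match your model.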

In [8]:
# YOUR CODE HERE

## 3. The decoder

It's your turn to build the decoder this time!

We need to build a reversed CNN that takes a dense vector as input and outputs an image of shape `(28, 28, 1)`, like our MNIST images.

For that, we will use a new layer called [`Conv2DTranspose`](https://www.tensorflow.org/api_docs/python/tf/keras/layers/Conv2DTranspose), which does what its name says: it applies the transpose of a convolution, mapping a small feature map to a larger one.

💡 We could start by reshaping the output of the Dense input layer into images of shape `(7, 7, ...)`, then apply a `Conv2DTranspose` with `strides=2` to double the image size to `(14, 14, ...)`, then another one to reach `(28, 28, 1)`.

❓ **Question** ❓ Define the decoder architecture in the function below as follows:
- a `Dense` layer with `7*7*8` neurons, input shape `(latent_dimension,)` and the `tanh` activation function
- a `Reshape` layer that reshapes to `(7, 7, 8)` tensors
- a `Conv2DTranspose` layer with `8` filters, `(2, 2)` kernels, strides of `2`, padding `same` and the `relu` activation function
- a second `Conv2DTranspose` layer with `1` filter, `(2, 2)` kernels, strides of `2`, padding `same` and the `relu` activation function
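
To see why two such layers take a `7x7` map to `28x28`, here is a minimal NumPy sketch of a transposed convolution in the special case used above: a single channel, a `(2, 2)` kernel, strides of `2` and no bias. In that case each input pixel "stamps" a scaled copy of the kernel onto its own `2x2` output patch with no overlap, which is exactly a Kronecker product. The kernel values and the helper name are made up for illustration; the real layer learns its kernels and sums over input channels.

```python
import numpy as np

def conv_transpose_2x2_stride2(x, kernel):
    """Single-channel transposed convolution, 2x2 kernel, stride 2, no bias.

    Output pixel [2*i + a, 2*j + b] equals x[i, j] * kernel[a, b].
    """
    return np.kron(x, kernel)

x = np.arange(49, dtype=float).reshape(7, 7)    # toy 7x7 feature map
kernel = np.array([[1.0, 0.5],
                   [0.5, 0.25]])                # hypothetical kernel weights

up_once = conv_transpose_2x2_stride2(x, kernel)
up_twice = conv_transpose_2x2_stride2(up_once, kernel)

print(x.shape, "->", up_once.shape, "->", up_twice.shape)
```

Each application doubles the height and width, which is what the two `Conv2DTranspose` layers in the decoder are expected to do.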

In [16]:
from tensorflow.keras.layers import Reshape, Conv2DTranspose

def build_decoder(latent_dimension):
    pass  # YOUR CODE HERE

❓ **Question** ❓ Build your decoder with `latent_dimension=2` and check that it outputs images of the same shape as the encoder input.

In [17]:
# YOUR CODE HERE

## 4. The autoencoder

We will now chain the encoder and the decoder into a single model, using the `Model` class from the Keras functional API.

In [26]:
from tensorflow.keras import Model
from tensorflow.keras.layers import Input

def build_autoencoder(encoder, decoder):
    inp = Input((28, 28,1))
    encoded = encoder(inp)
    decoded = decoder(encoded)
    autoencoder = Model(inp, decoded)
    return autoencoder

❓ **Question** ❓ Try to understand the syntax above, then build your autoencoder and look at the number of parameters.

In [27]:
# YOUR CODE HERE

❓ **Question** ❓ Define a function that compiles your model. Pick an appropriate loss.

Think carefully: what kind of mathematical objects are the predictions and the ground truth that the loss and the metric will compare?


<details>
    <summary>🆘 Answer</summary>

It should compare two images (greyscale in our case), pixel by pixel!
    
The MSE loss seems appropriate for pixel-by-pixel error minimization.
</details>
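
A pixel-by-pixel MSE can be sketched in NumPy as follows. The arrays are random stand-ins with the same layout as a batch of MNIST images (`originals` and `reconstructions` are hypothetical names), so the numbers themselves carry no meaning.

```python
import numpy as np

rng = np.random.default_rng(0)

# Random stand-ins for a batch of 4 images and their reconstructions.
originals = rng.random((4, 28, 28, 1))
reconstructions = np.clip(
    originals + 0.1 * rng.normal(size=originals.shape), 0.0, 1.0
)

# Squared error for every pixel, averaged over height, width and channel
# to get one score per image.
per_image_mse = np.mean((originals - reconstructions) ** 2, axis=(1, 2, 3))

# Averaging the per-image scores gives the batch-level loss.
batch_mse = per_image_mse.mean()

print("MSE per image:", per_image_mse)
print("MSE over the batch:", batch_mse)
```

Because every image has the same number of pixels, the mean of the per-image scores equals the mean over all pixels of the batch.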

In [28]:
# YOUR CODE HERE

❓ **Question** ❓ Compile your model and fit it with `batch_size=32` and `epochs=20`. What is the label `y` in this case?

**Note:** In this notebook, always set. The goal of this exercise is not to carefully deal with overfitting or to perfectly train the models but to understand autoencoders.

In [29]:
# YOUR CODE HERE

❓ **Question** ❓ Look at images predicted by the autoencoder. Are they close to the originals?

In [62]:
# YOUR CODE HERE

❓ **Question** ❓ Using only the encoder part of the network, encode your dataset and save it under `X_encoded`.

Each image is now represented by two values, one per dimension of the latent space at the bottleneck (the `latent_dimension`).

In [44]:
# YOUR CODE HERE

❓ **Question** ❓ Each encoded 2D data point corresponds to a label between 0 and 9 (the digit that was originally written).

Plot the encoded data in 2D (only a small subset of it, for readability):
- Each point of the scatter plot will correspond to an encoded image
- Color the dot according to the label (digit representation) it corresponds to.
- For instance, all the "4" should be represented by a color on this scatter plot, while the "5" should be represented by another color.

What do you notice on this plot?

In [56]:
# YOUR CODE HERE

## 5. Application: Image denoising


❓ **Question** ❓ We will now add some noise to the input data. Run the following code, then plot a few pairs of original and noisy images.

In [58]:
import numpy as np

noise_factor = 0.5
X_train_noisy = X_train + noise_factor * np.random.normal(0., 1., size=X_train.shape)
X_test_noisy = X_test + noise_factor * np.random.normal(0., 1., size=X_test.shape)

In [65]:
# YOUR CODE HERE

❓ **Question** ❓ Now, reinitialize your autoencoder (with a latent dimension of 2) and train it to predict the denoised image from the noisy one.

(use `batch_size=32` and `epochs=5`)
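
Before training, it can help to check what the Gaussian noise does to the pixel range: the clean targets stay in `[0, 1]`, but the noisy inputs do not. The sketch below uses a small random array as a stand-in for `X_train` (an assumption, to keep the example self-contained). The clipping step is optional and is not something this notebook asks for.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for a few normalized MNIST images in [0, 1].
X_clean = rng.random((8, 28, 28, 1))

noise_factor = 0.5
X_noisy = X_clean + noise_factor * rng.normal(0.0, 1.0, size=X_clean.shape)

# Gaussian noise is unbounded, so some pixels leave the [0, 1] range.
print("noisy range:", X_noisy.min(), X_noisy.max())

# Optional: clip back to the valid pixel range before plotting or training.
X_noisy_clipped = np.clip(X_noisy, 0.0, 1.0)
print("clipped range:", X_noisy_clipped.min(), X_noisy_clipped.max())
```

Whether to clip is a design choice: clipping keeps inputs in the same range as the targets, while leaving the noise untouched keeps it exactly Gaussian.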

In [67]:
# YOUR CODE HERE

❓ **Question** ❓ For some noisy test data, predict the denoised data and plot the result

In [69]:
# YOUR CODE HERE

❓ **Question** ❓ Now, try to find which `latent_dimension` gives the best image reconstruction (i.e. denoises the data as much as possible).

In [None]:
# YOUR ANSWER HERE