# Autoencoders

🎯 **Exercise objectives**
- Discover ***autoencoders***
- Get a deeper understanding of CNNs

<hr>

👉 There exists a very particular architecture in Deep Learning called **`Autoencoders`**. Autoencoders are Neural Network architectures trained to return **outputs that are as similar as possible to the original inputs fed to them**. Why would we do that?  

Before answering the question _"why"_, let's answer the question _"how"_.

👩🏻‍🏫 <u>***How does an autoencoder work?***</u>

There are two parts in an autoencoder: the **`encoder`** and the **`decoder`**.

1. In the encoder, the information flows through successive layers (dense or convolutional) whose outputs get smaller and smaller. This creates a **`bottleneck`** where the information is compressed.

2. In the decoder, we will try to recreate the original data based on the compressed data.
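
To make the two steps above concrete, here is a minimal NumPy sketch of the shapes involved. It is only an illustration: the weights are random and untrained, the layers are plain matrix products rather than the convolutional layers used later in this notebook, and the names `W_enc`, `W_dec`, `encode` and `decode` are made up for this example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed sizes: 784 input features (a flattened 28x28 image), 2 latent values
n_features, latent_dimension = 784, 2

# Untrained, randomly initialised weights, only used to illustrate the shapes
W_enc = rng.normal(size=(n_features, latent_dimension))
W_dec = rng.normal(size=(latent_dimension, n_features))

def encode(x):
    """Compress inputs into the bottleneck; tanh keeps each value in [-1, 1]."""
    return np.tanh(x @ W_enc)

def decode(z):
    """Expand bottleneck values back to the input size."""
    return z @ W_dec

batch = rng.random(size=(5, n_features))  # 5 fake "images" with values in [0, 1)
codes = encode(batch)
reconstruction = decode(codes)

print(batch.shape, codes.shape, reconstruction.shape)
```

Training an autoencoder amounts to adjusting the encoder and decoder weights so that `reconstruction` gets as close as possible to `batch`.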

🔥 <u>***Why is it powerful or useful?***</u>

If it works well, it means two important things:

* ✅ We can afford to **compress our dataset** and use a compressed version of it when fitting another Neural Network! 

* ✅ The **information contained in the bottleneck** - i.e. the data compressed in a low-dimensional layer - **accurately captures the patterns of our dataset** and the autoencoder is able to decode the compressed information!

🌠 <u>**Applications:**</u>
- Image compression
- Denoising (cf. Google Pixel phones...)
- Image generation!


<img src='https://wagon-public-datasets.s3.amazonaws.com/data-science-images/DL/autoencoder.png'>

## Google Colab Setup

Repeat the same process from the last challenge to upload your challenge folder and open your notebook:

1. access your [Google Drive](https://drive.google.com/)
2. go into the Colab Notebooks folder
3. drag and drop this challenge's folder into it
4. right-click the notebook file and select `Open with` $\rightarrow$ `Google Colaboratory`

Don't forget to enable GPU acceleration!

`Runtime` $\rightarrow$ `Change runtime type` $\rightarrow$ `Hardware accelerator` $\rightarrow$ `GPU`

When this is done, run the cells below and get to work!

In [None]:
# Mount GDrive
from google.colab import drive
drive.mount('/content/drive')

In [None]:
# Put Colab in the context of this challenge
import os

# os.chdir allows you to change directories, like cd in the Terminal
os.chdir('/content/drive/MyDrive/Colab Notebooks/data-autoencoders')

You are now good to go, proceed with the challenge! Don't forget to copy everything back to your PC to upload to Kitt 🚀

## (0) The MNIST Dataset

In this notebook, we will train an autoencoder on 28x28 grayscale images from the MNIST dataset, available in Keras. Run the cells below.

In [None]:
from tensorflow.keras.datasets import mnist

(images_train, labels_train), (images_test, labels_test) = mnist.load_data()
print(images_train.shape)
print(images_test.shape)

In [None]:
# Add a channel dimension (grayscale) and scale pixel values to [0, 1]
X_train = images_train.reshape((60000, 28, 28, 1)) / 255.
X_test = images_test.reshape((10000, 28, 28, 1)) / 255.

In [None]:
# Plot some images
import matplotlib.pyplot as plt

f, axs = plt.subplots(1, 10, figsize=(20, 4))
for i, ax in enumerate(axs):
    ax.axis('off')
    ax.imshow(X_train[i].reshape(28, 28), cmap='Greys')
    
plt.show()

## (1) The encoder

🎁 First, we built the "Encoder" part for you.

👉  Notice how similar it looks to a Convolutional Classifier with **latent_dimension** neurons at the end. However, we are using the "tanh" activation function in the final dense layer instead of "relu", so each latent value lies between -1 and 1.

In [None]:
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense

def build_encoder(latent_dimension):
    '''Returns an encoder model whose output has latent_dimension values per image.'''
    encoder = Sequential()
    
    encoder.add(Conv2D(8, (2,2), input_shape=(28, 28, 1), activation='relu'))
    encoder.add(MaxPooling2D(2))

    encoder.add(Conv2D(16, (2, 2), activation='relu'))
    encoder.add(MaxPooling2D(2))

    encoder.add(Conv2D(32, (2, 2), activation='relu'))
    encoder.add(MaxPooling2D(2))     

    encoder.add(Flatten())
    encoder.add(Dense(latent_dimension, activation='tanh'))
    
    return encoder

❓ **Question: building an encoder** ❓ 

Build your encoder with **`latent_dimension = 2`** and look at the number of parameters.
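
As a sanity check for what `summary()` should report, the shapes and parameter count of the encoder above can be worked out by hand. The sketch below assumes the Keras defaults used in `build_encoder` (convolutions with `valid` padding and stride 1, pooling windows of 2 with stride 2); the helper names are made up for this example.

```python
def conv_output_size(size, kernel, stride=1):
    """Spatial size after a 'valid' convolution or pooling window."""
    return (size - kernel) // stride + 1

def conv_params(kernel, in_channels, filters):
    """Kernel weights plus one bias per filter."""
    return kernel * kernel * in_channels * filters + filters

latent_dimension = 2
size, channels, total = 28, 1, 0

for filters in (8, 16, 32):
    total += conv_params(2, channels, filters)
    size = conv_output_size(size, kernel=2)            # Conv2D, 2x2 kernel, stride 1
    size = conv_output_size(size, kernel=2, stride=2)  # MaxPooling2D(2)
    channels = filters
    print(f"after block with {filters} filters: ({size}, {size}, {channels})")

flattened = size * size * channels
total += flattened * latent_dimension + latent_dimension  # final Dense layer
print("flattened size:", flattened)        # 128
print("trainable parameters:", total)      # 2906
```

Only the final Dense layer depends on `latent_dimension`, so changing it later in the notebook barely changes the size of the encoder.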

In [None]:
# YOUR CODE HERE

## (2) Decoder

It's your turn to build the decoder this time!

We need to build a 🔥 **`reversed CNN` 🔥** that 
* takes a dense layer as input,
* and outputs an image of shape $ (28,28,1) $ similar to our MNIST images. 

📚 For this purpose, we will use a new layer called <a href="https://www.tensorflow.org/api_docs/python/tf/keras/layers/Conv2DTranspose">**`Conv2DTranspose`**</a> 📚
    
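Despite what the name may suggest, this layer does not exactly invert a convolution: it is a learnable upsampling layer that maps a small feature map to a larger one, going in the opposite direction of a convolution in terms of shapes.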

💡 We will follow this strategy:
* Start by reshaping the output of the Dense input layer into an image of shape $(7,7,..)$
* Then apply a `Conv2DTranspose` layer with ***strides = 2*** to double the height and width to $(14,14,..)$
* Then add another `Conv2DTranspose` layer on top of the first one to reach $(28,28,1)$.
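
To see why a stride of 2 doubles the height and width, here is a minimal single-channel NumPy sketch of a transposed convolution. It only covers the special case used in this challenge, where the kernel size equals the stride; the kernel values are arbitrary, and the function `conv2d_transpose_single` is written for this illustration and is not the Keras implementation.

```python
import numpy as np

def conv2d_transpose_single(x, kernel, stride=2):
    """Single-channel transposed convolution with kernel size equal to the stride.

    Each input pixel is multiplied by the kernel and written into its own
    stride x stride patch of the output, so height and width are multiplied
    by the stride.
    """
    assert kernel.shape == (stride, stride), "sketch only covers kernel size == stride"
    h, w = x.shape
    out = np.zeros((h * stride, w * stride))
    for i in range(h):
        for j in range(w):
            rows = slice(i * stride, (i + 1) * stride)
            cols = slice(j * stride, (j + 1) * stride)
            out[rows, cols] += x[i, j] * kernel
    return out

x = np.arange(49, dtype=float).reshape(7, 7)  # a toy 7x7 feature map
kernel = np.ones((2, 2))                      # illustrative weights, not learned

up_once = conv2d_transpose_single(x, kernel)
up_twice = conv2d_transpose_single(up_once, kernel)
print(x.shape, "->", up_once.shape, "->", up_twice.shape)
```

In the real layer the kernel weights are learned, and each output channel sums the contributions of all input channels.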

<hr>

❓ **Question: the architecture of a decoder** ❓ 


Define a **`decoding architecture`** in the function below as follows:
- a *Dense* layer with:
    - $7 \times 7 \times 8$ neurons, 
    - *input_shape* = (latent_dimension, )
    - *tanh* activation function. 
- a *Reshape* layer that reshapes to $(7, 7, 8)$ tensors
- a *Conv2DTranspose* with:
    - $8$ filters, 
    - $(2,2)$ kernels, 
    - strides of $2$, 
    - padding *same* 
    - _relu_ activation function
- a second Conv2DTranspose layer with:
    - $1$ filter,
    - $(2,2)$ kernels,
    - strides of $2$,
    - padding _same_,
    - _relu_ activation function
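
Before writing the Keras code, it can help to trace the shapes and parameter counts that this specification implies. The sketch below is plain Python arithmetic, not the decoder itself; it assumes that with padding *same* each `Conv2DTranspose` multiplies height and width by its stride, and the helper names are made up for this example.

```python
def dense_params(inputs, units):
    """One weight per (input, unit) pair plus one bias per unit."""
    return inputs * units + units

def conv_transpose_params(kernel, in_channels, filters):
    """One kernel per (input channel, filter) pair plus one bias per filter."""
    return kernel * kernel * in_channels * filters + filters

latent_dimension = 2
stride = 2

# Dense layer followed by a Reshape to (7, 7, 8)
size, channels = 7, 8
total = dense_params(latent_dimension, size * size * channels)
print(f"after Reshape: ({size}, {size}, {channels})")

# Two Conv2DTranspose layers; with padding 'same', the size is multiplied by the stride
for filters in (8, 1):
    total += conv_transpose_params(2, channels, filters)
    size, channels = size * stride, filters
    print(f"after Conv2DTranspose with {filters} filter(s): ({size}, {size}, {channels})")

print("trainable parameters:", total)  # 1176 + 264 + 33 = 1473
```

If the `summary()` of your decoder reports different output shapes, the most likely culprits are the `strides` and `padding` arguments.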

In [None]:
from tensorflow.keras.layers import Reshape, Conv2DTranspose

def build_decoder(latent_dimension):
    pass  # YOUR CODE HERE

❓ **Question: building a decoder** ❓ 

Build your decoder with **`latent_dimension = 2`** and check that it outputs images of the same shape as the encoder input.

In [None]:
# YOUR CODE HERE

## (3) Auto-Encoder

🎉 We can now **chain** **`the encoder and the decoder`** into a single model thanks to the **`Model`** class in Keras, using the **`Functional API`**.

In [None]:
from tensorflow.keras import Model
from tensorflow.keras.layers import Input

def build_autoencoder(encoder, decoder):
    inp = Input((28, 28,1))
    encoded = encoder(inp)
    decoded = decoder(encoded)
    autoencoder = Model(inp, decoded)
    return autoencoder

❓ **Questions** ❓ 

* Try to understand the syntax above 👆 
* Build your autoencoder
* Have a look at the number of parameters

In [None]:
# YOUR CODE HERE

❓ **Question: Compiling an autoencoder** ❓ 

Define a function that compiles your model. Pick an appropriate loss.

<u><i>Think carefully:</i></u> 🤔 Which mathematical objects are the *predictions* and the *ground truth* that we compare when computing the loss function and the metrics?


<details>
    <summary><i>Answer</i></summary>

It should compare two images (grayscale in our case), pixel by pixel!
    
The MSE loss seems to be an appropriate loss function for pixel-by-pixel error minimization.
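
As an illustration of what "pixel by pixel" means, here is a minimal NumPy sketch of the mean squared error between an image and its reconstruction. The two tiny images are toy data and `pixelwise_mse` is a helper written for this example; in Keras, passing `loss="mse"` to `compile` gives you this kind of loss.

```python
import numpy as np

def pixelwise_mse(original, reconstruction):
    """Mean of the squared differences over every pixel of every image."""
    original = np.asarray(original, dtype=float)
    reconstruction = np.asarray(reconstruction, dtype=float)
    return np.mean((original - reconstruction) ** 2)

# Two toy 2x2 "images" with pixel values in [0, 1]
original = np.array([[0.0, 1.0],
                     [1.0, 0.0]])
reconstruction = np.array([[0.0, 0.5],
                           [1.0, 0.5]])

print(pixelwise_mse(original, reconstruction))  # (0 + 0.25 + 0 + 0.25) / 4 = 0.125
```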
</details>

In [None]:
# YOUR CODE HERE

❓ **Question: Training an autoencoder** ❓  

* Compile your model and fit it with `batch_size = 32` and `epochs = 20`. 
* What is the label `y_train` in this case?

<i>Note:</i> Don't waste your time fighting overfitting in this challenge, you will have time to care about this during the project weeks :)

In [None]:
# YOUR CODE HERE

❓ **Question: Encoding the dataset** ❓

* Using only the encoder part of the network, encode your dataset and save it under `X_encoded`.
    * Each image is now represented by two values, one per dimension of the latent space (the bottleneck), i.e. the `latent_dimension`.

In [None]:
# YOUR CODE HERE

🤔 Where are we after running the encoder?

* Each image was compressed into a 2D space. 
* Each of these handwritten digits has a label between 0 and 9, but the goal here is not to classify the pictures like in the first challenge: it is to **reconstruct the original image from its compressed representation**.

❓ **Question: Visualizing handwritten digits in the latent space** ❓ 

Scatterplot the encoded data (only a small fraction of the encoded dataset for visibility purposes...)
- Each point of the scatter plot corresponds to an encoded image
- Color the dots according to their respective labels (digit representation):
    - for instance, all the "4"s should be represented by one color on this scatter plot...
    - ...while the "5"s should be represented by another color
    - choose one of the [`qualitative colormaps`](https://matplotlib.org/stable/gallery/color/colormap_reference.html)

What do you notice about this plot? 

In [None]:
labels_train[:300]

In [None]:
# YOUR CODE HERE

## (4) Application: Image generation

❓ **Questions: Generate new digits** ❓ 

* Let's create some new digits!
* Run the following code, editing the latent coordinates.
* Play with the coordinates, starting with ones read from the scatter plot above.
* For example, [0.75, 0.75] may fall in the "zero" area of the latent space; the exact location of each digit depends on your own training run, so check your scatter plot (don't forget to experiment outside the boundaries of our original dataset)

In [None]:
import numpy as np

In [None]:
latent_coords = np.array([[0.75, -0.75]]) 
generated_img = decoder.predict(latent_coords) 
plt.imshow(generated_img.reshape(28,28), cmap='Greys')

❗️ We can reuse the decoder of an autoencoder as one of the simplest forms of generative deep learning. When we enter coordinates that were not in the training dataset, **we are creating never-seen-before digits!**

## (5) Application: Image denoising

❓ **Questions: Creating some noise in the dataset** ❓ 

* Let's add some noise to the input data. 
* Run the following code
* Plot some handwritten digits and their noisy versions

In [None]:
import numpy as np

noise_factor = 0.5

X_train_noisy = X_train + noise_factor * np.random.normal(0., 1., size=X_train.shape)
X_test_noisy = X_test + noise_factor * np.random.normal(0., 1., size=X_test.shape)
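
Note that adding Gaussian noise pushes some pixels outside the original $[0, 1]$ range. The sketch below shows this on a small stand-in array (random data, not MNIST) and one optional way to handle it with `np.clip`; clipping is a common choice rather than a requirement of this challenge, and the `X_demo*` names are made up for this example.

```python
import numpy as np

rng = np.random.default_rng(42)  # seeded generator so the noise is reproducible
noise_factor = 0.5

# Stand-in for a batch of normalized images (hypothetical data, values in [0, 1))
X_demo = rng.random(size=(4, 28, 28, 1))

X_demo_noisy = X_demo + noise_factor * rng.normal(0., 1., size=X_demo.shape)
print(X_demo_noisy.min(), X_demo_noisy.max())  # some values fall below 0 or above 1

# Optional: bring the noisy pixels back into the original [0, 1] range
X_demo_clipped = np.clip(X_demo_noisy, 0., 1.)
print(X_demo_clipped.min(), X_demo_clipped.max())
```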

In [None]:
# YOUR CODE HERE

❓ **Question: decoding the noisy pictures** ❓ 

* Reinitialize your autoencoder (with a latent space of 2) 
* Train it again, this time feeding the noisy train dataset as input instead of the normal train dataset, while keeping the original (clean) images as the target
    * *Keep `batch_size = 32` and use `epochs = 5`*
* What do you expect in terms of performance if you run the autoencoder on the noisy data instead of the original data?

In [None]:
# YOUR CODE HERE

❓ **Question: comparing the noisy test images with the denoised images** ❓ 

For some noisy test images, predict the denoised images and plot the results side by side...

In [None]:
# YOUR CODE HERE

❓ **Question: choosing the "correct" latent_dimension** ❓ 

Now, try to evaluate which **`latent_dimension`** is the most suitable to get **`the best image reconstruction`**. In other words: which latent dimension removes as much noise as possible from the noisy dataset?

In [None]:
# YOUR ANSWER HERE

🥡 <b><u>Conclusion</u></b>


* Two extreme cases are easy to reason about:
    * if you compress your pictures of size $ 28 \times 28 $ into a 1D space, you will lose a ton of information. 
    * if you compress them into a space of dimension $ 28 \times 28 = 784 $, you are actually not compressing them at all.
    
* We can still use the graph of **Loss vs. Latent dimension**, reading it from right to left, to decide into which latent space it would be advisable to compress the pictures without losing too much information: `latent_dimension = 8` seems to be a sweet spot here using the Elbow Method.
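
The shape of such a loss curve can be reproduced without training a network. The sketch below swaps the convolutional autoencoder for PCA (computed with an SVD), which plays the role of a purely linear encoder/decoder, and uses synthetic data that mostly lives in an 8-dimensional subspace. Both the data and the list of dimensions are assumptions made for illustration, so the numbers say nothing about MNIST.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data: 500 samples of 64 features that mostly live in an 8-dimensional subspace
n_samples, n_features, true_rank = 500, 64, 8
X = rng.normal(size=(n_samples, true_rank)) @ rng.normal(size=(true_rank, n_features))
X += 0.05 * rng.normal(size=X.shape)  # small amount of noise
X_centered = X - X.mean(axis=0)

# Principal directions from the SVD; rows of Vt are sorted by decreasing importance
_, _, Vt = np.linalg.svd(X_centered, full_matrices=False)

def reconstruction_mse(latent_dimension):
    """Project onto the first `latent_dimension` directions, then map back."""
    components = Vt[:latent_dimension]
    codes = X_centered @ components.T    # "encoder"
    reconstruction = codes @ components  # "decoder"
    return np.mean((X_centered - reconstruction) ** 2)

latent_dimensions = [1, 2, 4, 8, 16, 32, 64]
losses = [reconstruction_mse(d) for d in latent_dimensions]
for d, loss in zip(latent_dimensions, losses):
    print(f"latent_dimension={d:>2}  mse={loss:.5f}")
```

The loss drops sharply until the latent dimension reaches the true dimension of the data, then flattens: that bend is the "elbow". For the challenge itself, the same loop would rebuild and retrain the autoencoder for each `latent_dimension` and record its loss.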

---

🏁 **Congratulations** 🏁 

1. Download this notebook from your `Google Drive` or directly from `Google Colab` 
2. Drag-and-drop it from your `Downloads` folder to your local challenge folder


💾 Don't forget to push your code

3. Follow the usual procedure on your terminal inside the challenge folder:
      * *git add autoencoders.ipynb*
      * *git commit -m "I am the god of Transfer Learning"*
      * *git push origin master*

*Hint*: To find where this Colab notebook has been saved, click on `File` $\rightarrow$ `Locate in Drive`.

😉 That was the last challenge of this module!