# Exercise: Generative Models


## Goal of this exercise

This exercise is all about autoencoders. You will learn how to train them, visualize their latent space, and use them for image completion.
The exercise is an adaptation of https://www.tensorflow.org/tutorials/generative/autoencoder


You can execute the code cells one after another by pressing SHIFT+Enter in each cell.

You can trigger auto-completion with the TAB key.

## Import required packages

In [None]:
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import tensorflow as tf

from sklearn.metrics import accuracy_score, precision_score, recall_score
from sklearn.model_selection import train_test_split
from tensorflow.keras import layers, losses
from tensorflow.keras.datasets import fashion_mnist
from tensorflow.keras.models import Model
from matplotlib.offsetbox import OffsetImage, AnnotationBbox, TextArea
import umap

## Task 1: Load the dataset
We want to train the autoencoder on the Fashion MNIST dataset, so we need to load it first.
x_train and x_test contain the images of the dataset; y_train and y_test contain the labels as integers.
The mapping from these integers to the actual class names is provided by the variable class_names.

**tasks:** 
- print the shapes of x_train, y_train, x_test and y_test and explain what each dimension represents.
- use matplotlib to display the first image of the training set and print its class name in the title of the figure (plt.title()).

**your answer here:**

In [None]:
(x_train, y_train), (x_test, y_test) = fashion_mnist.load_data()

x_train = x_train.astype('float32') / 255.
x_test = x_test.astype('float32') / 255.

class_names = ["T-shirt/top",
    "Trouser",
    "Pullover",
    "Dress",
    "Coat",
    "Sandal",
    "Shirt",
    "Sneaker",
    "Bag",
    "Ankle boot"
]
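
The scaling and the label lookup used in the cell above can be tried out on a small stand-in array first. This is only a sketch: raw_images and raw_labels are made-up random data (an assumption, not the real dataset), chosen to have the same layout of one 28 x 28 grayscale image per entry and one integer label per image.

```python
import numpy as np

# Hypothetical stand-in for the raw dataset: 4 grayscale images of 28 x 28 pixels
# stored as unsigned 8-bit integers, plus one integer label per image.
rng = np.random.default_rng(0)
raw_images = rng.integers(0, 256, size=(4, 28, 28), dtype=np.uint8)
raw_labels = np.array([9, 0, 0, 3], dtype=np.uint8)

class_names = ["T-shirt/top", "Trouser", "Pullover", "Dress", "Coat",
               "Sandal", "Shirt", "Sneaker", "Bag", "Ankle boot"]

# Same scaling as in the cell above: uint8 values in [0, 255] become float32 in [0, 1].
images = raw_images.astype("float32") / 255.

print(images.shape)                     # (number of images, height, width)
print(images.dtype, images.min(), images.max())

# An integer label is simply an index into class_names.
first_label_name = class_names[raw_labels[0]]
print(first_label_name)
```

Scaling to the range [0, 1] matters later on: the decoder ends in a sigmoid activation, whose outputs also lie between 0 and 1, so inputs and reconstructions live on the same scale.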



## Task 2: Basic Autoencoder

In the cells below you will find everything needed to define the autoencoder, train it on the dataset and visualize the training results.
We train two variants of the autoencoder: one with a latent dimension of 2 and one with a latent dimension of 64.
**TODO:** execute all the cells below to train both models and visualize the results, then answer the following questions in the markdown cell below:

**Questions:**

1. What is the purpose of the encoder?
2. What is the purpose of the decoder?
3. What type of loss are we using and what does it do?
4. What is the role of the latent space and why does it lead to different results in the visualizations?


**Your answers here:**

In [None]:
class Autoencoder(Model):
  def __init__(self, latent_dim, shape):
    super(Autoencoder, self).__init__()
    self.latent_dim = latent_dim
    self.shape = shape
    self.encoder = tf.keras.Sequential([
      layers.Flatten(),
      layers.Dense(latent_dim, activation='relu'),
    ])
    self.decoder = tf.keras.Sequential([
      # number of output units = number of pixels, passed as a plain Python int
      layers.Dense(int(np.prod(shape)), activation='sigmoid'),
      layers.Reshape(shape)
    ])

  def call(self, x):
    encoded = self.encoder(x)
    decoded = self.decoder(encoded)
    return decoded


shape = x_test.shape[1:]

latent_dim64 = 64
autoencoder64 = Autoencoder(latent_dim64, shape)

latent_dim2 = 2
autoencoder2 = Autoencoder(latent_dim2, shape)

latent_dim_miss = 64
autoencoder_miss = Autoencoder(latent_dim_miss, shape)
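
To see how the layers of the class above change the shape of the data, the forward pass can be imitated in plain NumPy. This is a minimal sketch with random, untrained weights; the names w_enc, b_enc, w_dec, b_dec, encode and decode are hypothetical and only mirror the Flatten, Dense and Reshape layers, they are not part of the Keras model.

```python
import numpy as np

rng = np.random.default_rng(0)
shape = (28, 28)
latent_dim = 2
n_pixels = int(np.prod(shape))  # 28 * 28 = 784

# Random, untrained weights (hypothetical values, for shape illustration only).
w_enc = rng.normal(scale=0.05, size=(n_pixels, latent_dim))
b_enc = np.zeros(latent_dim)
w_dec = rng.normal(scale=0.05, size=(latent_dim, n_pixels))
b_dec = np.zeros(n_pixels)

def encode(x):
    flat = x.reshape(len(x), -1)                   # Flatten: (n, 28, 28) -> (n, 784)
    return np.maximum(flat @ w_enc + b_enc, 0.0)   # Dense + ReLU: (n, 784) -> (n, latent_dim)

def decode(z):
    logits = z @ w_dec + b_dec                     # Dense: (n, latent_dim) -> (n, 784)
    probs = 1.0 / (1.0 + np.exp(-logits))          # sigmoid keeps every output in (0, 1)
    return probs.reshape((len(z),) + shape)        # Reshape: (n, 784) -> (n, 28, 28)

batch = rng.random((5, 28, 28))                    # 5 made-up images with values in [0, 1)
latent = encode(batch)
reconstruction = decode(latent)
print(latent.shape, reconstruction.shape)
```

Every image has to pass through the latent_dim numbers in the middle, so that value controls how much information the decoder has available for the reconstruction.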


In [None]:
autoencoder64.compile(optimizer='adam', loss=losses.MeanSquaredError())
autoencoder2.compile(optimizer='adam', loss=losses.MeanSquaredError())
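
The loss passed to compile above compares each reconstruction with its input pixel by pixel. The following NumPy sketch shows the idea for equally sized images, where it amounts to the mean of the squared differences over all pixels. The function name mean_squared_error is a hypothetical helper and not the Keras implementation.

```python
import numpy as np

def mean_squared_error(original, reconstruction):
    """Average of the squared pixel-wise differences over all pixels and images."""
    original = np.asarray(original, dtype=np.float64)
    reconstruction = np.asarray(reconstruction, dtype=np.float64)
    return np.mean((original - reconstruction) ** 2)

original = np.zeros((2, 28, 28))
perfect = original.copy()        # identical reconstruction
off_by_half = original + 0.5     # every pixel is wrong by 0.5

loss_perfect = mean_squared_error(original, perfect)
loss_half = mean_squared_error(original, off_by_half)
print(loss_perfect, loss_half)   # 0.0 0.25
```

Because the differences are squared, a few large pixel errors are penalized more strongly than many small ones.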

In [None]:
autoencoder64.fit(x_train, x_train,
                epochs=5,
                shuffle=True,
                validation_data=(x_test, x_test))

In [None]:
autoencoder2.fit(x_train, x_train,
                epochs=5,
                shuffle=True,
                validation_data=(x_test, x_test))

Now that both models are trained, let's test them by encoding and decoding images from the test set.

In [None]:
encoded_imgs64 = autoencoder64.encoder(x_test).numpy()
decoded_imgs64 = autoencoder64.decoder(encoded_imgs64).numpy()

encoded_imgs2 = autoencoder2.encoder(x_test).numpy()
decoded_imgs2 = autoencoder2.decoder(encoded_imgs2).numpy()

In [None]:
# visualization of the results

n = 10
plt.figure(figsize=(20, 4))
for i in range(n):
  # display original
  ax = plt.subplot(2, n, i + 1)
  plt.imshow(x_test[i])
  plt.title("original")
  plt.gray()
  ax.get_xaxis().set_visible(False)
  ax.get_yaxis().set_visible(False)

  # display reconstruction
  ax = plt.subplot(2, n, i + 1 + n)
  plt.imshow(decoded_imgs2[i])
  plt.title("reconstructed")
  plt.gray()
  ax.get_xaxis().set_visible(False)
  ax.get_yaxis().set_visible(False)
plt.suptitle('Latent Space = 2')
plt.show()


plt.figure(figsize=(20, 4))
for i in range(n):
  # display original
  ax = plt.subplot(2, n, i + 1)
  plt.imshow(x_test[i])
  plt.title("original")
  plt.gray()
  ax.get_xaxis().set_visible(False)
  ax.get_yaxis().set_visible(False)

  # display reconstruction
  ax = plt.subplot(2, n, i + 1 + n)
  plt.imshow(decoded_imgs64[i])
  plt.title("reconstructed")
  plt.gray()
  ax.get_xaxis().set_visible(False)
  ax.get_yaxis().set_visible(False)
plt.suptitle('Latent Space = 64')
plt.show()

## Task 3: Latent Space Visualization

Now we would like to visualize the latent space of the two trained autoencoders.
For the autoencoder with the 2-dimensional latent space we can plot the encodings directly in a scatter plot.
For the latent space of 64 dimensions, we first need to apply dimensionality reduction to bring the number of features down to 2.

**TODO:**
1. execute the cell below and visualize the dim=2 latent space
2. use umap to reduce the embedding space from 64 to 2 dimensions and visualize the latent space in the same way
3. answer the following questions

**Questions:**

1. What are your observations when comparing both visualizations of the latent space?
2. Which latent space shows better clusters and why is that the case?

**Your answers here:**

In [None]:
# visualization of the latent space with dim=2

# select a random subset of 1000 test points (without repeats) to avoid plotting all points
idx = np.random.choice(len(x_test), 1000, replace=False)

images = x_test[idx]
encodings = encoded_imgs2[idx]
labels = y_test[idx]


print(encodings.shape)
fig, ax = plt.subplots(figsize=(10, 7))
ax.set_title("latent space autoencoder dim 2")
plt.scatter(encodings[:, 0], encodings[:, 1], c=labels, cmap="viridis")
plt.colorbar()
for i in range(10):
    class_center = np.mean(encodings[labels == i], axis=0)
    text = TextArea('{} ({})'.format(class_names[i], i))
    ab = AnnotationBbox(text, class_center, xycoords='data', frameon=True)
    ax.add_artist(ab)
plt.show()

## Task 4: Learning to fill missing patches

So far we have used the autoencoder to reconstruct the same image that it receives as input.
Now we would like to use the exact same architecture for filling in missing data (image inpainting).
Specifically, we still use the Fashion MNIST images as the input, but we manipulate the image data so that in each image
a 10 x 10 pixel block in the middle is missing (values set to zero).

**TODO:**
1. create copies of x_train and x_test in which a 10 x 10 pixel block in the middle of each image is set to 0.
2. name these two copies x_train_m and x_test_m
3. follow the structure from Task 2 and train the autoencoder to fill in the missing pixel blocks
4. visualize the results and discuss them in the markdown cell below
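
One possible way to blank out the central block is NumPy slicing on a copy of the array. The sketch below is only an illustration on made-up data: the helper name mask_center and the demo array are assumptions, and the block position is derived from the image size (rows and columns 9 to 18 for a 28 x 28 image).

```python
import numpy as np

def mask_center(images, block=10):
    """Return a copy of `images` with a central block x block patch set to 0."""
    masked = images.copy()                 # leave the original array untouched
    height, width = images.shape[1:3]
    top = (height - block) // 2            # 9 for 28 x 28 images
    left = (width - block) // 2
    masked[:, top:top + block, left:left + block] = 0.0
    return masked

# Made-up batch of 3 all-white images to check the result.
demo = np.ones((3, 28, 28), dtype="float32")
demo_m = mask_center(demo)
print(demo_m[0, 8:20, 8:20])               # a frame of ones around a 10 x 10 block of zeros
```

Applied to the real data, this would presumably give x_train_m = mask_center(x_train) and x_test_m = mask_center(x_test). For the network to learn to fill in the block, a natural choice is to use the masked images as input and the unmodified images as the training target.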

**Your answer here:**