# COMP 760 Generative Modeling Assignment

**Please place your full name here:**

## Purpose

This assignment is intended to assess your ability to:

- Implement a Fully Connected Autoencoder
- Implement a Convolutional Autoencoder
- Understand the Variational Autoencoder intuition

## Task:

In this assignment, you will design and implement a Fully-Connected and a CNN Autoencoder. With a simple change in your Fully-Connected Autoencoder, you will become more familiar with Variational Autoencoders.

You will complete the following tasks in a Jupyter Notebook with Python and Keras. Please use the [functional API](https://keras.io/guides/functional_api/) to build your models; do not use the Sequential model. The [Keras API reference](https://keras.io/api/) is a useful resource.

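As a reminder of what the functional API style looks like, here is a minimal sketch. The layer sizes, activations, and names (`flat_image`, `hidden`, `functional_example`) are hypothetical and chosen only for illustration; they are not a recommended architecture for the tasks below.

```python
import keras
from keras import layers

# Functional API: each layer is called on the tensor produced by the previous one.
# All sizes and names here are illustrative assumptions.
inputs = keras.Input(shape=(1024,), name="flat_image")
hidden = layers.Dense(64, activation="relu", name="hidden")(inputs)
outputs = layers.Dense(1024, activation="sigmoid", name="output")(hidden)

# keras.Model groups the layers between the given inputs and outputs.
model = keras.Model(inputs=inputs, outputs=outputs, name="functional_example")
model.summary()
```
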
Studying the program examples in Ch. 3 of the textbook can be very helpful.

**Dataset**: You will work with the MNIST handwritten digit dataset. It has a training set of 60,000 28x28 grayscale images of the 10 digits, along with a test set of 10,000 images.

There are three tasks and one optional (bonus) task.
Please complete the tasks by writing Python code in the designated cells. Please add comments to help the instructor understand your code.
Some code and utility functions have been provided.

Put your responses to the questions in a separate Word document.

### Preparation

#### Imports

In [None]:
import numpy as np
import keras
from keras import layers, models, optimizers, utils, datasets

#### Utility function

In [None]:
# Display images
import matplotlib.pyplot as plt
def display2(
    images, n=10, size=(20, 4), cmap="gray_r", as_type="float32", shape=(28,28)
):
    """
    Displays the first n images from the supplied array.
    """
    # n: how many digits to display
    plt.figure(figsize=size)
    for i in range(n):
        ax = plt.subplot(2, n, i + 1)
        # Reshape the image to the right size based on the shape argument
        plt.imshow(images[i].astype(as_type).reshape(shape), cmap=cmap)
        ax.get_xaxis().set_visible(False)
        ax.get_yaxis().set_visible(False)
    # Render the figure once all subplots have been drawn
    plt.show()

#### Load training data


In [None]:
# Load training data in Google Colab
from keras.datasets import mnist
(x_train, y_train), (x_test, y_test) = mnist.load_data()

assert x_train.shape == (60000, 28, 28)

If the images below are displayed correctly, the dataset has been loaded successfully.

In [None]:
# display the first 10 images
display2(x_train)
# check the image dataset shape
print(x_train.shape)

#### Preprocess data
In the following `preprocess` function, we convert the `uint8` data to `float32` and normalize the pixel values to [0, 1], as required to train the model. Pay attention to how each image is padded to 32 x 32 for easier manipulation of the tensor shape as it passes through the network. Note that the function finally flattens each padded image into a vector of 1024 values.

In [None]:
# Preprocess the data
def preprocess(imgs):
    """
    Normalize and reshape the images
    """
    # normalize all values between 0 and 1 and pad image to 32x32
    imgs = imgs.astype("float32") / 255.0
    imgs = np.pad(imgs, ((0, 0), (2, 2), (2, 2)), constant_values=0.0)
    imgs = np.expand_dims(imgs, -1)

    imgs = imgs.reshape((len(imgs), np.prod(imgs.shape[1:]))) # flatten the image
    return imgs

x_train = preprocess(x_train)
x_test = preprocess(x_test)

# check the image dataset shape
print(x_train.shape)
# display the first 10 images
display2(x_train, shape=(32,32))

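Because `preprocess` returns flattened vectors, a convolutional model (Task 2) would need the data reshaped back into image form. The following is a minimal sketch of the shape arithmetic on a dummy batch; the array `dummy` and the other variable names are illustrative and not part of the provided code.

```python
import numpy as np

# Dummy batch standing in for MNIST: 3 images of 28x28 uint8 pixels
dummy = np.random.randint(0, 256, size=(3, 28, 28), dtype=np.uint8)

# Same steps as preprocess(): scale to [0, 1], pad 2 pixels on each side, flatten
scaled = dummy.astype("float32") / 255.0
padded = np.pad(scaled, ((0, 0), (2, 2), (2, 2)), constant_values=0.0)
flat = padded.reshape(len(padded), -1)

# A convolutional model expects (batch, height, width, channels)
as_images = flat.reshape(-1, 32, 32, 1)

print(padded.shape, flat.shape, as_images.shape)
```

The flattened form suits the Dense layers of Task 1, while the four-dimensional form is what convolution layers typically expect.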

### Task 1 (15 points):

Implement a Fully-Connected Autoencoder in Keras with these layers: Input, Dense. The encoder and decoder should include one or more layers, with the size chosen by you. Your Autoencoder should have a bottleneck with two neurons and `mean_squared_error` (MSE) as the loss function to begin with.

The model can be created by using `keras.Model` to group the layers. Next, set your optimizer and loss function, and compile your model (`Model.compile`).
Train your model with the training set. Randomly select 10 images from the test set, encode them and visualize the decoded images.
Try to change your autoencoder, e.g. increase the bottleneck size, add more layers, use the `binary_crossentropy` loss function. Compare the loss values and reconstructed images.

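When comparing loss values, keep in mind that MSE and binary cross-entropy are measured on different scales. The sketch below computes both from their textbook formulas on a toy four-pixel example; the pixel values and the clipping constant `eps` are arbitrary assumptions, and the Keras implementations may differ in numerical details.

```python
import numpy as np

# Toy "image" of four pixels and a hypothetical reconstruction
x = np.array([0.0, 1.0, 1.0, 0.0])
x_hat = np.array([0.1, 0.8, 0.6, 0.3])

# Mean squared error: average squared pixel difference
mse = np.mean((x - x_hat) ** 2)

# Binary cross-entropy: treats each reconstructed pixel as a Bernoulli probability.
# Clipping avoids log(0); the epsilon value is an arbitrary choice.
eps = 1e-7
p = np.clip(x_hat, eps, 1.0 - eps)
bce = -np.mean(x * np.log(p) + (1.0 - x) * np.log(1.0 - p))

print(f"MSE = {mse:.4f}, BCE = {bce:.4f}")
```

For the same reconstruction the two numbers differ, so it is worth comparing the reconstructed images visually as well as the raw loss values.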
#### 1.1 Build the encoder

In [None]:
# Encoder

#### 1.2 Build the decoder

In [None]:
# Decoder

#### 1.3 Build the autoencoder

In [None]:
# Autoencoder

#### 1.4 Train the autoencoder

In [None]:
# Compile and train the autoencoder


#### 1.5 Reconstruct some images using the autoencoder

In [None]:
# Reconstruct some images

### Task 2 (15 points):
Implement a Convolutional Autoencoder (CAE) that uses only the following types of layers: convolution, pooling, upsampling, and transposed convolution. Use `mean_squared_error` as the loss function. The encoder and decoder should include one or more layers, with the size and number of filters chosen by you. Start with a bottleneck of size 2 and train your model on MNIST. Randomly select 10 images from the test set, encode them and visualize the decoded images. Are the reconstructed images readable for humans? If not, try some different architectures for your CAE, including a larger bottleneck, until the model is powerful enough to generate readable images. The bottleneck should be as small as possible while still giving readable images. Also try more layers and the `binary_crossentropy` loss function.

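The padding to 32 x 32 matters for a CAE because repeated pooling by 2 followed by upsampling by 2 only returns to the original size when the size divides evenly at every stage. A small sketch of that arithmetic follows, assuming pooling that floors the output size (as 2x2 max pooling with "valid" padding does) and upsampling that doubles it; the helper `pool_then_upsample` is hypothetical.

```python
def pool_then_upsample(size, n_stages, factor=2):
    """Return (bottleneck size, restored size) along one spatial dimension."""
    down = size
    for _ in range(n_stages):
        down //= factor  # pooling floors the output size
    up = down
    for _ in range(n_stages):
        up *= factor     # upsampling multiplies the size
    return down, up

for size in (28, 32):
    bottleneck, restored = pool_then_upsample(size, n_stages=3)
    print(f"{size} -> {bottleneck} -> {restored}")
```

With three stages, a 28-pixel side comes back as 24 pixels, while a 32-pixel side is restored exactly.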
#### 2.1 Build the encoder

In [None]:
# Encoder

#### 2.2 Build the decoder

In [None]:
# Decoder

#### 2.3 Build the autoencoder

In [None]:
# Autoencoder

#### 2.4 Train the autoencoder

In [None]:
# Compile and train the autoencoder

#### 2.5 Reconstruct some images using the autoencoder

In [None]:
# Reconstruct some images

### Task 3 (20 points):
This question is about using an Autoencoder to generate similar but not identical hand-written digits. We use a naive approach: Try to see if a trained decoder can map randomly generated inputs (random numbers) to a recognizable hand-written digit.

Start with your trained Fully-Connected Autoencoder from Task 1. Try to generate new images by feeding random numbers to the decoder (i.e. in place of the bottleneck layer's output) and report your results. Hint: This is not easy. You probably want to input at least 10 random numbers.

Now restrict the Autoencoder hidden bottleneck layer(s) to have a standard multivariate normal distribution (https://numpy.org/doc/stable/reference/random/generated/numpy.random.multivariate_normal.html), i.e. a zero mean vector and the identity matrix as the covariance matrix (unit variances and no correlations). An identity matrix is a square matrix having 1s on the main diagonal and 0s everywhere else.

For example:
$
\begin{bmatrix}
    1 & 0 & 0 & 0 \\
    0 & 1 & 0 & 0 \\
    0 & 0 & 1 & 0 \\
    0 & 0 & 0 & 1
 \end{bmatrix}
$

Retrain the Fully-Connected Autoencoder with the normalized bottleneck. Now randomly generate inputs to the bottleneck layer that are drawn from the multi-variate standard normal distribution, and use the random inputs to generate new images. Report your result.

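Drawing the random inputs can be sketched with NumPy as follows. The values of `latent_dim` and `n_samples` and the fixed seed are assumptions for illustration, and `decoder` in the final comment stands for your own trained decoder model.

```python
import numpy as np

latent_dim = 10   # hypothetical bottleneck size
n_samples = 10    # number of latent vectors (and images) to generate

rng = np.random.default_rng(seed=0)
z = rng.multivariate_normal(
    mean=np.zeros(latent_dim),  # zero mean vector
    cov=np.eye(latent_dim),     # identity covariance matrix
    size=n_samples,
)
print(z.shape)  # one latent vector per row

# z could then be passed to a trained decoder, e.g. decoder.predict(z)
```
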
Are the images generated in the two parts of this task (before and after restricting the bottleneck) different? If so, why do you think this difference occurs?

#### 3.1 The encoder
Restrict the hidden bottleneck layer(s) to have a standard multivariate normal distribution.

In [None]:
# Your code for this task

#### 3.2 The decoder

In [None]:
# Your code for this task

#### 3.3 The autoencoder

In [None]:
# Your code for this task

#### 3.4 Train your new autoencoder

In [None]:
# Your code for this task

#### 3.5 Generate some new images

In [None]:
# Your code for this task

### Submission

Your notebook should include the final code and results. Your explanations and answers to the questions should be in a separate Word document.

Download your notebook (rename it to `firstname_lastname_comp670_gm.ipynb`) and submit it along with the Word document to the Canvas assignment "Generative Modeling" by the due date.

### Optional Task 4 (15 points)
Change the Autoencoder that you developed in the last part of Task 3 so that it becomes a Variational Autoencoder (refer to our lecture and the textbook Ch. 3). Does the VAE produce output images of a different quality?

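Two ingredients commonly used in a VAE are the reparameterized sampling step and the KL divergence between the encoder's diagonal Gaussian and the standard normal. The NumPy sketch below shows the standard formulas only; the function names `sample_latent` and `kl_to_standard_normal` are hypothetical, and the lecture or textbook may organize these steps differently (for example as a custom Keras layer).

```python
import numpy as np

def sample_latent(z_mean, z_log_var, rng):
    """Reparameterization: z = mean + std * epsilon, with epsilon ~ N(0, I)."""
    epsilon = rng.standard_normal(z_mean.shape)
    return z_mean + np.exp(0.5 * z_log_var) * epsilon

def kl_to_standard_normal(z_mean, z_log_var):
    """KL( N(mean, diag(exp(log_var))) || N(0, I) ), one value per sample."""
    return -0.5 * np.sum(
        1.0 + z_log_var - z_mean**2 - np.exp(z_log_var), axis=-1
    )

rng = np.random.default_rng(0)
z_mean = np.zeros((4, 2))     # toy batch: 4 samples, 2 latent dimensions
z_log_var = np.zeros((4, 2))  # log-variance 0 means unit variance
z = sample_latent(z_mean, z_log_var, rng)
kl = kl_to_standard_normal(z_mean, z_log_var)
print(z.shape, kl)
```

The KL term is zero when the encoder output already matches the standard normal and grows as the mean or variance moves away from it.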

In [None]:
# Your code for Task 4: