## Deep Learning Course (980)
## Assignment Four 

__Assignment Goals__:

- Implementing Fully Connected AutoEncoders
- Implementing Convolutional AutoEncoders
- Understanding the intuition behind Variational AutoEncoders


In this assignment, you will be asked to design a Fully Connected and a CNN AutoEncoder. By making a simple change to your Fully Connected AutoEncoder, you will also become more familiar with the Variational AutoEncoder.

__DataSet:__ In this assignment, you will use the MNIST handwritten digit database. You can use `(x_train, _), (x_test, _) = tensorflow.keras.datasets.mnist.load_data()` to load the dataset.

1. (30 points) Implement a Fully Connected AutoEncoder in TensorFlow (cf. Chapter 7). Your AutoEncoder should have a bottleneck with two neurons and Mean Squared Error (MSE) as the objective function. In an AutoEncoder, the layer with the fewest neurons is referred to as the bottleneck. Train your model on MNIST. Plot the train and test loss. Randomly select 10 images from the test set, encode them, and visualize the decoded images.
     
2. (35 points) Implement a convolutional AutoEncoder (CAE) that uses only the following types of layers: convolution, pooling, upsampling, and transposed convolution. You must use MSE as the loss. The encoder and decoder should each include one or more layers, with the size and number of filters chosen by you. Start with a bottleneck of size 2, train your model on MNIST, and plot the train and test loss. Randomly select 10 images from the test set, encode them, and visualize the decoded images. Are the reconstructed images readable for humans? If not, try to find a CAE architecture, including a larger bottleneck, that is powerful enough to generate readable images. The bottleneck should be as small as possible while keeping the images readable; this is part of the grading criteria.

3. (35 points) This question is about using an AutoEncoder to generate similar but not identical handwritten digits. We use a naive approach: test whether a trained decoder can map randomly generated inputs (random numbers) to a recognizable handwritten digit.
    1. Start with your trained Fully Connected AutoEncoder from part 1. Try to generate new images by feeding random numbers to the decoder (i.e. at the bottleneck layer) and report your results. Hint: This is not easy. You probably want to input at least 10 random numbers.
    2. Now restrict the AutoEncoder hidden bottleneck layer(s) to follow a standard multivariate normal distribution with zero mean and the identity matrix as covariance (i.e. no correlations). Retrain the Fully Connected AutoEncoder with the normalized bottleneck. Then randomly generate inputs to the bottleneck layer, drawn from the standard multivariate normal distribution, and use them to generate new images. Report your results.
    3. Are the output images different between 1) and 2)? If so, why do you think this difference occurs?

4. (20 points) Optional: change the AutoEncoder that you developed in the last part of question 3 so that it becomes a Variational AutoEncoder (introduced by Kingma and Welling, 2014; see Chapter 7.1). Does the VAE produce a different quality of output image?
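
The standard-normal bottleneck in question 3 and the VAE in question 4 both rely on comparing a latent distribution with N(0, I). The sketch below, written in plain NumPy rather than TensorFlow, illustrates the two ingredients usually associated with a VAE: the reparameterization trick and the closed-form KL divergence between a diagonal Gaussian and the standard normal. The function names `sample_latent` and `kl_to_standard_normal`, and the convention that the encoder outputs a mean and a log-variance, are assumptions made for illustration and are not part of the assignment.

```python
import numpy as np

def sample_latent(mu, log_var, rng):
    """Reparameterization trick: z = mu + sigma * eps, with eps ~ N(0, I)."""
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * log_var) * eps

def kl_to_standard_normal(mu, log_var):
    """KL( N(mu, diag(exp(log_var))) || N(0, I) ), one value per example."""
    return 0.5 * np.sum(np.exp(log_var) + mu ** 2 - 1.0 - log_var, axis=1)

rng = np.random.default_rng(0)

# Hypothetical encoder outputs for a batch of 4 images and a 2-neuron bottleneck.
mu = np.zeros((4, 2))
log_var = np.zeros((4, 2))

z = sample_latent(mu, log_var, rng)       # latent codes fed to the decoder
kl = kl_to_standard_normal(mu, log_var)   # zero when the latent is exactly N(0, I)
```

In a VAE, this KL term is typically added to the reconstruction loss, which pushes the encoded distribution toward the same N(0, I) that is later used to draw random decoder inputs.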



__Submission Notes__:

Please use Jupyter Notebook. The notebook should include the final code, results, and answers. Submit your notebook in both .pdf and .ipynb formats (10-point penalty otherwise).
Your AutoEncoders should have only one bottleneck.
 



__Instructions__:

The university policy on academic dishonesty and plagiarism (cheating) will be taken very seriously in this course. Everything submitted should be your own writing and code. You must not let other students copy your work. Spelling and grammar count.



In [65]:
import tensorflow as tf
from tensorflow import keras
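# import from tensorflow.keras so all layers and models come from the same Keras implementation
from tensorflow.keras.models import Model, Sequential, load_model
from tensorflow.keras.layers import Dense, Input, Conv2D, MaxPooling2D, UpSampling2D, Flatten, Reshape
from tensorflow.keras.callbacks import ModelCheckpoint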
import numpy as np
from tensorflow.python.client import device_lib
from matplotlib import pyplot as plt

(x_train, _), (x_test, _) = keras.datasets.mnist.load_data()

# normalize the data
x_train = x_train.astype('float32') / 255.
x_test = x_test.astype('float32') / 255.

In [58]:
''' Part 1: Implement a Fully Connected AutoEncoder in TensorFlow
'''

# flatten the input image data
input_size = 784
x_train_flatten = x_train.reshape(-1, input_size)
x_test_flatten = x_test.reshape(-1, input_size)

# build the network with a bottleneck of two neurons
autoencoder_fc = Sequential()
autoencoder_fc.add(Dense(256, input_shape=(input_size,), activation='relu'))
autoencoder_fc.add(Dense(2, activation='relu'))
autoencoder_fc.add(Dense(256, activation='relu'))  # input_shape is only needed on the first layer
autoencoder_fc.add(Dense(input_size, activation='sigmoid'))

autoencoder_fc.summary()

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
dense_9 (Dense)              (None, 256)               200960    
_________________________________________________________________
dense_10 (Dense)             (None, 2)                 514       
_________________________________________________________________
dense_11 (Dense)             (None, 256)               768       
_________________________________________________________________
dense_12 (Dense)             (None, 784)               201488    
Total params: 403,730
Trainable params: 403,730
Non-trainable params: 0
_________________________________________________________________
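
As a quick sanity check on the summary above, the parameter count of each `Dense` layer can be reproduced by hand: one weight per input-output pair plus one bias per output neuron. The helper name `dense_params` is hypothetical and used only for this illustration.

```python
def dense_params(n_in, n_out):
    """Number of trainable parameters in a fully connected layer (weights + biases)."""
    return n_in * n_out + n_out

# Layer widths of the fully connected AutoEncoder: input, hidden, bottleneck, hidden, output.
layer_sizes = [784, 256, 2, 256, 784]

counts = [dense_params(n_in, n_out) for n_in, n_out in zip(layer_sizes, layer_sizes[1:])]
total = sum(counts)

print(counts)  # [200960, 514, 768, 201488]
print(total)   # 403730
```

The numbers agree with the `Param #` column and with the reported total of 403,730 trainable parameters.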


In [63]:
# train model

print(device_lib.list_local_devices())

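# MSE is the training objective; classification accuracy is not meaningful for reconstruction
autoencoder_fc.compile(optimizer='adam', loss='mean_squared_error')
best_model_checkpoint = ModelCheckpoint(
    './best_model_fc.pth', 
    monitor="val_loss",  # keep the model with the lowest validation MSE
    save_best_only=True, 
    save_weights_only=False
)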
autoencoder_fc_history = autoencoder_fc.fit(
    x_train_flatten, 
    x_train_flatten, 
    epochs=50, 
    batch_size=256, 
    shuffle=True, 
    validation_data=(x_test_flatten, x_test_flatten),
    callbacks=[best_model_checkpoint]
)

[name: "/device:CPU:0"
device_type: "CPU"
memory_limit: 268435456
locality {
}
incarnation: 3517610447304484559
, name: "/device:GPU:0"
device_type: "GPU"
memory_limit: 6700198133
locality {
  bus_id: 1
  links {
  }
}
incarnation: 2315278478698892369
physical_device_desc: "device: 0, name: GeForce GTX 1070, pci bus id: 0000:01:00.0, compute capability: 6.1"
]
Train on 60000 samples, validate on 10000 samples
Epoch 1/50
...
Epoch 50/50
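
Part 1 also asks for a plot of the train and test loss. A minimal sketch of such a plot follows, assuming the object returned by `fit` exposes a `history` dictionary with `loss` and `val_loss` entries, as Keras history objects do. The helper name `plot_loss` is hypothetical, and the numbers in `example_history` are placeholders for illustration only, not results from this notebook.

```python
from matplotlib import pyplot as plt

def plot_loss(history_dict, title="Fully Connected AutoEncoder"):
    """Plot training and validation (test) MSE against the epoch number."""
    epochs = range(1, len(history_dict["loss"]) + 1)
    fig, ax = plt.subplots(figsize=(6, 4))
    ax.plot(epochs, history_dict["loss"], label="train loss")
    ax.plot(epochs, history_dict["val_loss"], label="test loss")
    ax.set_xlabel("epoch")
    ax.set_ylabel("MSE")
    ax.set_title(title)
    ax.legend()
    return fig, ax

# Placeholder values, NOT measured results; they only make the sketch runnable.
example_history = {"loss": [0.07, 0.06, 0.055], "val_loss": [0.065, 0.058, 0.054]}
fig, ax = plot_loss(example_history)
```

In the notebook, the call would presumably be `plot_loss(autoencoder_fc_history.history)` followed by `plt.show()`.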

In [None]:
img_num = 10
autoencoder_fc_best = load_model('./best_model_fc.pth')
plt.figure(figsize=(18, 4))
for i in range(img_num):
    # randomly pick an image from test dataset
    chosen_test_img = x_test_flatten[np.random.randint(x_test_flatten.shape[0])]
    # encode and decode the image with the best saved model
    img_decoded = autoencoder_fc_best.predict(chosen_test_img.reshape(1, 784))

    # top row: original image
    ax = plt.subplot(2, img_num, i + 1)
    plt.imshow(chosen_test_img.reshape(28, 28))
    ax.axis('off')

    # bottom row: reconstruction
    ax = plt.subplot(2, img_num, img_num + i + 1)
    plt.imshow(img_decoded.reshape(28, 28))
    ax.axis('off')
plt.show()
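
Calling `np.random.randint` inside the loop can occasionally select the same test image twice. One possible alternative, sketched below, draws all indices at once without replacement, so the 10 images are guaranteed to be distinct and can be decoded in a single `predict` call. The seed and the variable name `chosen_idx` are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(seed=0)  # fixed seed only so the selection is reproducible
n_test = 10000                       # number of images in the MNIST test set
img_num = 10

# Sample without replacement: every index appears at most once.
chosen_idx = rng.choice(n_test, size=img_num, replace=False)

# In the notebook, the selected images could then be decoded in one batch, e.g.:
#   originals = x_test_flatten[chosen_idx]
#   decoded = autoencoder_fc_best.predict(originals)
```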