# DL Assignment 4 - Conditional Generative Adversarial Networks

Welcome to the **fourth graded assignment** of the DL course. In the last lab, you implemented a generative adversarial network (GAN) for generating flower imagery. However, you had no control over the type of flower that was generated.
In this assignment, you will implement a **conditional generative adversarial network** (cGAN) to incorporate class information into the training and generation process in order to control what the model will generate.

***

**Instructions**
- You'll be using Python 3 in the IPython-based Google Colaboratory.
- Lines encapsulated in "<font color='green'>`### START YOUR CODE HERE ###`</font>" and "<font color='green'>`### END YOUR CODE HERE ###`</font>", or marked by "<font color='green'>`# TODO`</font>", denote the code fragments to be completed by you.
- There's no need to write any other code.
- After writing your code, you can run the cell by either pressing `SHIFT`+`ENTER` or by clicking on the play symbol on the left side of the cell.
- We may specify "<font color='green'>`(≈ X LOC)`</font>" to tell you about how many lines of code you need to write. This is just a rough estimate, so don't feel bad if your code is longer or shorter.

**Much success!**

***

<font color='darkblue'>
  
**Remember**  
- Run your cells using `SHIFT`+`ENTER` (or "Run cell")
- Write code in the designated areas using Python 3 only
- Do not modify the code outside of the designated areas
- Do not import/use any other packages. Code relying on packages other than the provided ones won't be graded.
- Activate GPU acceleration by clicking `Runtime` -> `Change runtime type` and selecting `GPU` from the dropdown menu entitled `Hardware accelerator`
</font>

***

<font color='red'>
  
**Note**  
You have to develop and submit your own solution. If we have reason to believe that you shared your work or did not submit your own work, we have to treat this as attempted fraud. In this case, your submission will be graded with zero points and we reserve the right to take additional measures.
</font>

# 0 - Preparation

## Imports and Test for GPU

Execute the cells below to import the required modules and test for GPU availability:

In [None]:
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers
from tensorflow.keras.models import Model

!wget -nv -t 0 --show-progress https://cloud.tu-ilmenau.de/s/K5gomnsHSKGcFFR/download/utils.py

import utils

In [None]:
#@title Print TF version and GPU stats
print('TensorFlow version:', tf.__version__)

device_name = tf.test.gpu_device_name()
if device_name != '/device:GPU:0':
   raise SystemError('GPU device not found')
print('Found GPU at: {}'.format(device_name), '', sep='\n')
!nvidia-smi

# 1 - Data Preparation

First, you need to load and prepare the dataset. In this assignment, you will be using the [**MNIST** database of handwritten digits](https://www.tensorflow.org/datasets/catalog/mnist). We will use [TensorFlow Datasets](https://www.tensorflow.org/datasets/overview) (`tfds`) to load the data.

For image generation, it's best practice to **normalize the images to `[-1, +1]`**. In addition, you need to convert the labels from sparse encoding to **one-hot encoding**.

**Task**: Complete the `load_mnist` function below:
- Implement `normalize_img` to *normalize* the images to `[-1, +1]`.
- Implement `encode_one_hot` to convert the labels to *one-hot encoding*.
- Map both data splits to your functions.


*Hint*: After normalization and one-hot encoding, the data shall look like this:
```
  Min-max and label of preprocessed sample:
  Example input [min/max]: [-1.0, 1.0]
  Example output: [0. 0. 0. 0. 1. 0. 0. 0. 0. 0.]
```
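To make the two preprocessing steps concrete, the following NumPy sketch shows the arithmetic on a toy example. It assumes pixel values in `[0, 255]`. The helper names `normalize_to_unit_range` and `one_hot` are hypothetical and not part of the assignment; in `load_mnist` you would express the same operations with TensorFlow ops (for example `tf.cast` and `tf.one_hot`) inside `normalize_img` and `encode_one_hot`.

```python
import numpy as np

NUM_CLASSES = 10  # same value as in the assignment


def normalize_to_unit_range(image_uint8):
    """Map pixel values from [0, 255] to [-1, +1]."""
    return image_uint8.astype(np.float32) / 127.5 - 1.0


def one_hot(label, num_classes=NUM_CLASSES):
    """Turn an integer class index into a one-hot vector."""
    encoded = np.zeros(num_classes, dtype=np.float32)
    encoded[label] = 1.0
    return encoded


# Toy "image" that contains the darkest and the brightest possible pixel.
toy_image = np.array([[0, 255], [128, 64]], dtype=np.uint8)
normalized = normalize_to_unit_range(toy_image)
encoded_label = one_hot(4)

print(f'Example input [min/max]: [{normalized.min()}, {normalized.max()}]')
print('Example output:', encoded_label)
```

Dividing by `127.5` and subtracting `1` maps `0` to `-1` and `255` to `+1`, which matches the value range a `tanh`-style generator output would produce.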

In [None]:
# GRADED FUNCTION: normalize_img (1 point)
# GRADED FUNCTION: encode_one_hot (1 point)
# GRADED FUNCTION: load_mnist (1 point)

import tensorflow_datasets as tfds

# Constants and hyperparameters
NUM_CLASSES =   10  # number of classes
BATCH_SIZE =    64  # batch size
IMG_SIZE =      28  # image size
NUM_CHANNELS =  1   # image channels
LATENT_SIZE =   128 # seed size

def load_mnist():

  def normalize_img(image, label):
    ### START YOUR CODE HERE ### ( ≈ 1 LOC)
    
    ### END YOUR CODE HERE ###
    return image, label

  def encode_one_hot(image, label):
    ### START YOUR CODE HERE ### ( ≈ 1 LOC)
    
    ### END YOUR CODE HERE ###
    return image, label

  def print_sample(dataset):
    x_sample, y_sample = list(dataset.take(1).as_numpy_iterator())[0]
    print(f'Example input [min/max]: [{np.min(x_sample)}, {np.max(x_sample)}]')
    print('Example output:', y_sample)

  def prepare_dataset(dataset):
    dataset = dataset.cache()
    dataset = dataset.batch(BATCH_SIZE)
    dataset = dataset.prefetch(tf.data.experimental.AUTOTUNE)
    return dataset

  (train_data, test_data), mnist_info = tfds.load(
      'mnist',
      split=['train', 'test'],
      as_supervised=True,
      shuffle_files=True,
      with_info=True
  )

  fig = tfds.show_examples(train_data, mnist_info)

  print('\nMin-max and label of original sample:')
  print_sample(train_data)

  # map splits to preprocessing functions
  ### START YOUR CODE HERE ### ( ≈ 2 LOC)  
  
  ### END YOUR CODE HERE ###

  print('\nMin-max and label of preprocessed sample:')
  print_sample(train_data)

  # create batches
  train_data = prepare_dataset(train_data)
  test_data = prepare_dataset(test_data)

  return train_data, test_data

train_data, test_data = load_mnist()

# 2 - Build Generator and Discriminator

You will implement the conditioning mechanism described in [Conditional Generative Adversarial Nets](https://arxiv.org/abs/1411.1784) by Mehdi Mirza and Simon Osindero. In detail, you will append the one-hot encoded class labels to the inputs for both the discriminator and the generator:

<div>
<img src="https://machinelearningmastery.com/wp-content/uploads/2019/05/Example-of-a-Conditional-Generator-and-a-Conditional-Discriminator-in-a-Conditional-Generative-Adversarial-Network.png" width="640"/>
</div>

Hence, `generator_in_channels` and `discriminator_in_channels` amount to:

In [None]:
generator_in_channels = LATENT_SIZE + NUM_CLASSES
print(f'Generator input channels: {generator_in_channels}')

discriminator_in_channels = NUM_CHANNELS + NUM_CLASSES
print(f'Discriminator input channels: {discriminator_in_channels}')
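
The two channel counts follow directly from how the labels are appended. The NumPy sketch below illustrates one common way to build both inputs; how `utils.train_gan` actually assembles them is not shown in this notebook, so treat the broadcasting step as an assumption. The batch size of 2 and the labels 3 and 7 are arbitrary choices for illustration.

```python
import numpy as np

BATCH, IMG_SIZE, NUM_CHANNELS = 2, 28, 1
NUM_CLASSES, LATENT_SIZE = 10, 128

rng = np.random.default_rng(0)
images = rng.uniform(-1.0, 1.0, size=(BATCH, IMG_SIZE, IMG_SIZE, NUM_CHANNELS))
labels = np.eye(NUM_CLASSES)[[3, 7]]           # one-hot labels, shape (2, 10)
noise = rng.normal(size=(BATCH, LATENT_SIZE))  # latent vectors, shape (2, 128)

# Generator input: append the label vector to the latent vector.
generator_input = np.concatenate([noise, labels], axis=1)

# Discriminator input: turn every label entry into a constant 28x28 map
# and append these maps to the image as additional channels.
label_maps = np.broadcast_to(
    labels[:, None, None, :], (BATCH, IMG_SIZE, IMG_SIZE, NUM_CLASSES)
)
discriminator_input = np.concatenate([images, label_maps], axis=-1)

print(generator_input.shape)      # (2, 138)
print(discriminator_input.shape)  # (2, 28, 28, 11)
```

Every pixel position of the discriminator input therefore carries the full one-hot label in its last 10 channels.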

## 2.1 - Discriminator

Next you need to build the models. Start with the discriminator, which is a simple convolutional neural network.

**Task**: Complete the function `build_discriminator` to create the discriminator model as follows:

```
Model: "discriminator"
________________________________________________________________________________________________________________________________
 Layer (type)                                            Output Shape                                       Param #             
================================================================================================================================
 input_1 (InputLayer)                                    [(None, 28, 28, 11)]                               0                   
                                                                                                                                
 conv2d (Conv2D)                                         (None, 14, 14, 64)                                 6400                
                                                                                                                                
 leaky_re_lu (LeakyReLU)                                 (None, 14, 14, 64)                                 0                   
                                                                                                                                
 batch_normalization (BatchNormalization)                (None, 14, 14, 64)                                 256                 
                                                                                                                                
 conv2d_1 (Conv2D)                                       (None, 7, 7, 128)                                  73856               
                                                                                                                                
 leaky_re_lu_1 (LeakyReLU)                               (None, 7, 7, 128)                                  0                   
                                                                                                                                
 batch_normalization_1 (BatchNormalization)              (None, 7, 7, 128)                                  512                 
                                                                                                                                
 global_max_pooling2d (GlobalMaxPooling2D)               (None, 128)                                        0                   
                                                                                                                                
 dropout (Dropout)                                       (None, 128)                                        0                   
                                                                                                                                
 dense (Dense)                                           (None, 1)                                          129                 
                                                                                                                                
================================================================================================================================
Total params: 81,153
Trainable params: 80,769
Non-trainable params: 384
________________________________________________________________________________________________________________________________
```

The **dropout rate** shall be `0.2`. Think about the correct **activation function** for the discriminator's output.
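You can use the `Param #` column to check your layer settings before building the model. The sketch below recomputes the counts in plain Python. The `3x3` kernel size is an assumption inferred from the numbers in the summary, and the helper functions are hypothetical names used only for this illustration.

```python
def conv2d_params(kernel_size, in_channels, filters):
    """Kernel weights plus one bias per filter."""
    return kernel_size * kernel_size * in_channels * filters + filters


def batch_norm_params(channels):
    """gamma and beta (trainable) plus moving mean and variance (non-trainable)."""
    return 4 * channels


def dense_params(in_features, units):
    """Weight matrix plus one bias per unit."""
    return in_features * units + units


KERNEL_SIZE = 3  # assumption: inferred from the parameter counts in the summary

layer_params = {
    'conv2d': conv2d_params(KERNEL_SIZE, 11, 64),
    'batch_normalization': batch_norm_params(64),
    'conv2d_1': conv2d_params(KERNEL_SIZE, 64, 128),
    'batch_normalization_1': batch_norm_params(128),
    'dense': dense_params(128, 1),
}
total = sum(layer_params.values())
non_trainable = 2 * 64 + 2 * 128  # moving statistics of both batch norm layers

for name, count in layer_params.items():
    print(f'{name}: {count}')
print('Total params:', total)
print('Non-trainable params:', non_trainable)
```

The halving of the spatial size from 28 to 14 to 7 in the summary is consistent with a stride of 2 and `'same'` padding in both convolutions.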

In [None]:
# GRADED FUNCTION: build_discriminator (3 points)

def build_discriminator(img_size, input_channels, summary=True):

  ### START YOUR CODE HERE ### ( ≈ 11 LOC)
  
  ### END YOUR CODE HERE ###
  
  if summary:
    discriminator.summary(line_length=128)  # summary() prints itself and returns None

  return discriminator

Initialize an instance of your discriminator to test the build:

In [None]:
build_discriminator(IMG_SIZE, discriminator_in_channels, summary=True)

## 2.2 - Generator

Two different versions of the generator shall be implemented:
- one with **transposed convolutions**, and
- one using a combination of **2D upsampling** and **convolutions**.

### 2.2.1 - Transposed Convolutions based Generator

The first version uses [transposed 2D convolutions](https://www.tensorflow.org/api_docs/python/tf/keras/layers/Conv2DTranspose).

**Task**: Complete the function `build_generator_transposed_conv` to create the following model:

```
Model: "transposed_conv_generator"
________________________________________________________________________________________________________________________________
 Layer (type)                                            Output Shape                                       Param #             
================================================================================================================================
 input_2 (InputLayer)                                    [(None, 138)]                                      0                   
                                                                                                                                
 dense_1 (Dense)                                         (None, 6762)                                       939918              
                                                                                                                                
 leaky_re_lu_2 (LeakyReLU)                               (None, 6762)                                       0                   
                                                                                                                                
 batch_normalization_2 (BatchNormalization)              (None, 6762)                                       27048               
                                                                                                                                
 reshape (Reshape)                                       (None, 7, 7, 138)                                  0                   
                                                                                                                                
 conv2d_transpose (Conv2DTranspose)                      (None, 14, 14, 128)                                282752              
                                                                                                                                
 leaky_re_lu_3 (LeakyReLU)                               (None, 14, 14, 128)                                0                   
                                                                                                                                
 batch_normalization_3 (BatchNormalization)              (None, 14, 14, 128)                                512                 
                                                                                                                                
 conv2d_transpose_1 (Conv2DTranspose)                    (None, 28, 28, 64)                                 131136              
                                                                                                                                
 leaky_re_lu_4 (LeakyReLU)                               (None, 28, 28, 64)                                 0                   
                                                                                                                                
 batch_normalization_4 (BatchNormalization)              (None, 28, 28, 64)                                 256                 
                                                                                                                                
 conv2d_2 (Conv2D)                                       (None, 28, 28, 1)                                  1601                
                                                                                                                                
================================================================================================================================
Total params: 1,383,223
Trainable params: 1,369,315
Non-trainable params: 13,908
```
Think about the **correct activation function** for the output layer.
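As for the discriminator, the summary lets you derive the layer settings. The following plain-Python sketch recomputes the parameter counts and the spatial sizes. The kernel sizes (`4x4` for the transposed convolutions, `5x5` for the final convolution), the stride of 2, and `'same'` padding are assumptions inferred from the numbers above; the helper names are hypothetical.

```python
def dense_params(in_features, units):
    """Weight matrix plus one bias per unit."""
    return in_features * units + units


def conv_params(kernel_size, in_channels, filters):
    """Same formula for Conv2D and Conv2DTranspose: weights plus biases."""
    return kernel_size * kernel_size * in_channels * filters + filters


def upsampled_size(input_size, stride):
    """Output size of a transposed convolution with 'same' padding."""
    return input_size * stride


LATENT_SIZE, NUM_CLASSES = 128, 10
in_features = LATENT_SIZE + NUM_CLASSES  # 138
dense_units = 7 * 7 * in_features        # 6762, reshaped to (7, 7, 138)

params = {
    'dense_1': dense_params(in_features, dense_units),
    'conv2d_transpose': conv_params(4, in_features, 128),
    'conv2d_transpose_1': conv_params(4, 128, 64),
    'conv2d_2': conv_params(5, 64, 1),
}

# Two stride-2 transposed convolutions double the spatial size twice.
sizes = [7]
for _ in range(2):
    sizes.append(upsampled_size(sizes[-1], stride=2))

for name, count in params.items():
    print(f'{name}: {count}')
print('Spatial sizes:', sizes)
```

The dense layer is sized so that its output can be reshaped into a `7x7` feature map, which the two upsampling stages then bring to the MNIST resolution of `28x28`.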

In [None]:
# GRADED FUNCTION: build_generator_transposed_conv (5 points)
def build_generator_transposed_conv(input_channels, summary=True):

  ### START YOUR CODE HERE ### ( ≈ 13 LOC)
  
  ### END YOUR CODE HERE ###

  if summary:
    generator.summary(line_length=128)  # summary() prints itself and returns None

  return generator

Initialize an instance of your generator to test the build:

In [None]:
build_generator_transposed_conv(generator_in_channels)

### 2.2.2 - Resize Convolutions based Generator

The second version uses the [resize-convolution approach](https://distill.pub/2016/deconv-checkerboard/) discussed in Lab 4.1. This approach is based on [`UpSampling2D`](https://www.tensorflow.org/api_docs/python/tf/keras/layers/UpSampling2D) layers followed by conventional [`Conv2D`](https://www.tensorflow.org/api_docs/python/tf/keras/layers/Conv2D) layers. Ultimately, the model shall look as follows:
```
Model: "resize_conv_generator"
________________________________________________________________________________________________________________________________
 Layer (type)                                            Output Shape                                       Param #             
================================================================================================================================
 input_3 (InputLayer)                                    [(None, 138)]                                      0                   
                                                                                                                                
 dense_2 (Dense)                                         (None, 6762)                                       939918              
                                                                                                                                
 leaky_re_lu_5 (LeakyReLU)                               (None, 6762)                                       0                   
                                                                                                                                
 batch_normalization_5 (BatchNormalization)              (None, 6762)                                       27048               
                                                                                                                                
 reshape_1 (Reshape)                                     (None, 7, 7, 138)                                  0                   
                                                                                                                                
 up_sampling2d (UpSampling2D)                            (None, 14, 14, 138)                                0                   
                                                                                                                                
 conv2d_3 (Conv2D)                                       (None, 14, 14, 128)                                159104              
                                                                                                                                
 leaky_re_lu_6 (LeakyReLU)                               (None, 14, 14, 128)                                0                   
                                                                                                                                
 batch_normalization_6 (BatchNormalization)              (None, 14, 14, 128)                                512                 
                                                                                                                                
 up_sampling2d_1 (UpSampling2D)                          (None, 28, 28, 128)                                0                   
                                                                                                                                
 conv2d_4 (Conv2D)                                       (None, 28, 28, 64)                                 73792               
                                                                                                                                
 leaky_re_lu_7 (LeakyReLU)                               (None, 28, 28, 64)                                 0                   
                                                                                                                                
 batch_normalization_7 (BatchNormalization)              (None, 28, 28, 64)                                 256                 
                                                                                                                                
 conv2d_5 (Conv2D)                                       (None, 28, 28, 1)                                  577                 
                                                                                                                                
================================================================================================================================
Total params: 1,201,207
Trainable params: 1,187,299
Non-trainable params: 13,908
```
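The upsampling layers in this summary have no parameters because they only resize the feature map; all learning happens in the convolution that follows. The NumPy sketch below mimics nearest-neighbor upsampling by a factor of 2, which is the default behavior of `UpSampling2D`. The function name `upsample_nearest` is a hypothetical helper for illustration only.

```python
import numpy as np


def upsample_nearest(feature_map, factor=2):
    """Repeat every row and column `factor` times (NHWC layout)."""
    upsampled = np.repeat(feature_map, factor, axis=1)  # height axis
    return np.repeat(upsampled, factor, axis=2)         # width axis


# A single 2x2 feature map with one channel.
x = np.arange(4, dtype=np.float32).reshape(1, 2, 2, 1)
y = upsample_nearest(x)

print(y.shape)        # (1, 4, 4, 1)
print(y[0, :, :, 0])
```

Each input value becomes a `2x2` block in the output. Because every output pixel is covered uniformly, a subsequent stride-1 convolution avoids the uneven kernel overlap that can cause checkerboard artifacts in transposed convolutions.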

**Task**: Complete the function `build_generator_resize_conv`:

In [None]:
# GRADED FUNCTION: build_generator_resize_conv (5 points)
def build_generator_resize_conv(input_channels, summary=True):

  ### START YOUR CODE HERE ### ( ≈ 15 LOC)
  
  ### END YOUR CODE HERE ###

  if summary:
    generator.summary(line_length=128)  # summary() prints itself and returns None

  return generator

Initialize an instance of your resize-conv generator to test the build:

In [None]:
build_generator_resize_conv(generator_in_channels)

# 3 - Train!

Now it's time to train the two versions of your conditional GAN.

## 3.1 - Transposed Convolution based Generator

Execute the next cell to run the training:

In [None]:
cgan_transposed_conv, cgan_transposed_conv_history = utils.train_gan(
    build_discriminator(IMG_SIZE, discriminator_in_channels, summary=False),
    build_generator_transposed_conv(generator_in_channels, summary=False),
    train_data,
    epochs=20
)

utils.plot_history(cgan_transposed_conv_history)

## 3.2 - Resize Convolution based Generator

**Task**: Train a GAN using the resize-convolution-based generator and plot the history.

In [None]:
# GRADED: train resize-convolution GAN (1 point)

### START YOUR CODE HERE ### ( ≈ 2 LOC)

cgan_resize_conv, cgan_resize_conv_history = 

### END YOUR CODE HERE ###

# 4 - Interpolate Classes

Instead of using one-hot encoded vectors as class labels, you can also use the generator to create images for **interpolated classes**.

The idea is as follows: the generator shall create a defined number of images, given by `steps`. The first image shall show the class `class_one_idx` and the last image shall show the class `class_two_idx`. The images in between shall display interpolations between these two classes, i.e., per step, the class labels transition from `class_one_idx` to `class_two_idx`.
Example: with `steps = 5`, the first interpolated image shall represent 75% of `class_one_idx` and 25% of `class_two_idx`. The next interpolated image shall represent 50% of `class_one_idx` and 50% of `class_two_idx`, and so on.

Therefore, the task is twofold:

1) The generator requires a random seed. To observe only the impact of the class interpolation **the same seed shall be reused for all generation steps**. Hence, you need to generate a random seed vector and then repeat it for the number of images to be generated (=`steps`).

Example: for 3 steps and a seed length (=`LATENT_SIZE`) of 4, the `gen_seed` array could be:
```
[[-0.62280726 -0.25039887  0.02668798 -1.9732285 ]
 [-0.62280726 -0.25039887  0.02668798 -1.9732285 ]
 [-0.62280726 -0.25039887  0.02668798 -1.9732285 ]]
```

2) In addition to the seed, the generator is conditioned by class probability vectors concatenated to the random seed. During training, these were one-hot encoded vectors because each image displays exactly one class. To interpolate between classes, you need to generate class probability vectors that transition from `class_one_idx` to `class_two_idx` in a given number of steps.

Example: for 10 classes, 3 steps, `class_one_idx = 2`, and `class_two_idx = 5`, the `class_interp` array should be:
```
[[0.   0.   1.   0.   0.   0.   0.   0.   0.   0.  ]
 [0.   0.   0.5  0.   0.   0.5  0.   0.   0.   0.  ]
 [0.   0.   0.   0.   0.   1.   0.   0.   0.   0.  ]]
```

In order to do so, you just need to implement the following steps:
- Sample a noise vector of shape `(1, LATENT_SIZE)` from a normal distribution.
- Repeat the noise vector along the first axis for `steps` to obtain a vector of shape `(steps, LATENT_SIZE)`.
- Interpolate the one-hot encoded labels, i.e.,
  - at index `class_one_idx`: decrease the values linearly from `1` to `0` over `steps` evenly spaced values (step size = `1./(steps - 1)`);
  - at index `class_two_idx`: increase the values linearly from `0` to `1` over `steps` evenly spaced values (step size = `1./(steps - 1)`).
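
The steps above can be sketched in NumPy for the small example with 3 steps, classes 2 and 5, and a seed length of 4. This is only an illustration under these assumptions: the fixed random generator seed `42` is arbitrary, and inside `interpolate_classes` you may prefer the equivalent TensorFlow operations.

```python
import numpy as np

NUM_CLASSES, LATENT_SIZE = 10, 4  # small latent size for readability
steps, class_one_idx, class_two_idx = 3, 2, 5

# 1) Sample one noise vector and repeat it for every step.
rng = np.random.default_rng(42)
seed = rng.normal(size=(1, LATENT_SIZE))
gen_seed = np.repeat(seed, steps, axis=0)  # shape (steps, LATENT_SIZE)

# 2) Fade class one out and class two in with evenly spaced values.
class_interp = np.zeros((steps, NUM_CLASSES))
class_interp[:, class_one_idx] = np.linspace(1.0, 0.0, steps)
class_interp[:, class_two_idx] = np.linspace(0.0, 1.0, steps)

# The generator input is the seed with the class vector appended.
generator_input = np.concatenate([gen_seed, class_interp], axis=1)

print(class_interp)
print(generator_input.shape)  # (3, 14)
```

`np.linspace` includes both endpoints, so the first row is the pure one-hot vector of `class_one_idx`, the last row is that of `class_two_idx`, and every row sums to 1.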


**Task**: Complete the function `interpolate_classes` following the described steps.

In [None]:
# GRADED FUNCTION: interpolate_classes (3 points)

def interpolate_classes(cgan, class_one_idx, class_two_idx, steps=32):

  for x in (class_one_idx, class_two_idx):
    assert x in range(NUM_CLASSES), f'input must be in {range(NUM_CLASSES)}'
  assert class_one_idx != class_two_idx, 'use dissimilar classes'

  ### START YOUR CODE HERE ### ( ≈ 4 LOC)
  
  # Get random seed for generator
  gen_seed = 

  # Repeat random seed `steps` times
  gen_seed = 

  # Interpolate one-hot encoded labels
  class_interp = np.zeros( (steps, NUM_CLASSES) )
  class_interp[:, class_one_idx] = 
  class_interp[:, class_two_idx] = 

  ### END YOUR CODE HERE ###

  gen_seed = tf.concat([gen_seed, class_interp], axis=1)

  # Generate images based on interpolated labels
  generated_images = cgan.generator(gen_seed)
  generated_images = (generated_images + 1)/2.  # rescale from [-1, +1] to [0, 1]
  generated_images = generated_images.numpy()   # keep the converted NumPy array

  return generated_images

Now you can interpolate between two classes, e.g., 0 and 9, and plot the resulting images:

In [None]:
utils.plot_interpolated_images( interpolate_classes(cgan_transposed_conv, 0, 9) )

In [None]:
utils.plot_interpolated_images( interpolate_classes(cgan_resize_conv, 0, 9) )

---

# Congratulations on completing Assignment 4!

Complete the steps below for submission.

# Submission Instructions

You may now submit your notebook to Moodle:
- Save the notebook (`CTRL`+`S` or '*File*' -> '*Save*')
- Click on '*File*' -> '*Download .ipynb*' for downloading the notebook as IPython Notebook file.
- Upload the downloaded IPython Notebook file to **Moodle**.