<h1 style="font-size:40px;"><center>Exercise V:<br> GANs
</center></h1>

## Short summary
In this exercise, we will design a generative network that generates the last RGB image given the first image. This folder has **three files**: 
- **configGAN.py:** this involves definitions of all parameters and data paths
- **utilsGAN.py:** includes utility functions required to grab and visualize data 
- **runGAN.ipynb:** contains the script to design, train and test the network 

Before running this script, make sure that you have created an environment and **installed all required libraries**, such 
as Keras.

## The data
There is also a subfolder called **data**, which contains the training, validation, and testing data. Each split contains RGB input images together with the corresponding ground truth images.


## The exercises
As in the previous lab, all exercises are found below.


## The different 'Cells'
This notebook contains several cells with python code, together with the markdown cells (like this one) with only text. Each of the cells with python code has a "header" markdown cell with information about the code. The table below provides a short overview of the code cells. 

| #  |  CellName | CellType | Comment |
| :--- | :-------- | :-------- | :------- |
| 1 | Init | Needed | Sets up the environment|
| 2 | Ex | Exercise 1| A class definition of a network model  |
| 3 | Loading | Needed | Loading parameters and initializing the model |
| 4 | Stats | Needed | Show data distribution | 
| 5 | Data | Needed | Generating the data batches |
| 6 | Debug | Needed | Debugging the data |
| 7 | Device | Needed | Selecting CPU/GPU |
| 8 | Init | Needed | Sets up the timer and other necessary components |
| 9 | Training | Exercise 1-2 | Training the model   |
| 10 | Testing | Exercise 1-2| Testing the  method   |  


In order for you to start with the exercise you need to run all cells. It is important that you do this in the correct order, starting from the top and continuing with the next cells. Later when you have started to work with the notebook it may be easier to use the command "Run All" found in the "Cell" dropdown menu.

## Writing the report

There is no need to provide a report. However, the implemented network architecture and the observed experimental results must be presented in a short presentation at the last lecture, May 28.

1) We start by importing all required modules.

In [1]:
import os
from configGAN import *
cfg = flying_objects_config()
if cfg.GPU >=0:
    print("creating network model using gpu " + str(cfg.GPU))
    os.environ['CUDA_VISIBLE_DEVICES'] = str(cfg.GPU)
elif cfg.GPU >=-1:
    print("creating network model using cpu ")  
    os.environ["CUDA_DEVICE_ORDER"] = "PCI_BUS_ID"   # see issue #152
    os.environ["CUDA_VISIBLE_DEVICES"] = ""

import tensorflow as tf
from tensorflow import keras
from utilsGAN import *
from sklearn.metrics import confusion_matrix
# import seaborn as sns
from datetime import datetime
import imageio
from skimage import img_as_ubyte

import pprint
# import the necessary packages
from keras.models import Sequential
from keras.layers.normalization import BatchNormalization
from keras.layers.convolutional import Conv3D, Conv2D, Conv1D, Convolution2D, Deconvolution2D, Cropping2D, UpSampling2D
from keras.layers import Input, Conv2DTranspose, ConvLSTM2D, TimeDistributed
from keras.layers.convolutional import MaxPooling2D
from keras.layers.core import Activation
from keras.layers import Concatenate, concatenate, Reshape
from keras.layers.core import Flatten
from keras.layers.core import Dropout
from keras.layers.core import Dense
from keras.preprocessing.image import ImageDataGenerator
from keras.optimizers import Adam
from keras.models import Model
from keras.callbacks import TensorBoard
from keras.applications.vgg16 import VGG16, preprocess_input, decode_predictions
from keras.layers import Input, merge
from keras.regularizers import l2
from keras.layers import Input, merge, Convolution2D, MaxPooling2D, UpSampling2D, Reshape, core, Dropout, LeakyReLU
import keras.backend as kb


creating network model using gpu 0
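
The cell above selects the device through the `CUDA_VISIBLE_DEVICES` environment variable, which is why it is set before `import tensorflow`. The mapping from the config value to the variable can be sketched as below. This is only an illustration: `visible_devices_for` is a hypothetical helper, not part of `configGAN.py`, and it assumes that any negative index means "CPU only".

```python
import os

def visible_devices_for(gpu_index):
    """Return the CUDA_VISIBLE_DEVICES value for a config-style GPU index.

    Assumption: a non-negative index selects that GPU, and any negative
    index means CPU only (an empty string hides all GPUs from CUDA).
    """
    return str(gpu_index) if gpu_index >= 0 else ""

# Set the variable before importing tensorflow so the choice takes effect.
os.environ["CUDA_VISIBLE_DEVICES"] = visible_devices_for(-1)
print(repr(os.environ["CUDA_VISIBLE_DEVICES"]))
```

Changing `cfg.GPU` after TensorFlow has already been imported in the same kernel will generally not move the model to another device; restarting the kernel is the safer route.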


2) Here, we have the network model class definition. In this class, the most important functions are **build_generator()** and **build_discriminator()**. As defined in the exercises section, your task is to update both network architectures defined in these functions.

In [2]:
class GANModel():
    def __init__(self, batch_size=32, inputShape=(64, 64, 3), dropout_prob=0.25): 
        self.batch_size = batch_size
        self.inputShape = inputShape
        self.dropout_prob = dropout_prob

        # Calculate the shape of patches
        patch = int(self.inputShape[0] / 2**4)
        self.disc_patch = (patch, patch, 1)
  
        # Build and compile the discriminator
        self.discriminator = self.build_discriminator()
        self.discriminator.compile(loss='mse', optimizer=Adam(0.0002, 0.5),metrics=['accuracy'])
 
        # Build the generator
        self.generator = self.build_generator()

        # Input images and their conditioning images
        first_frame = Input(shape=self.inputShape)
        last_frame = Input(shape=self.inputShape)

        # By conditioning on the first frame generate a fake version of the last frame
        fake_last_frame = self.generator(first_frame)

        # For the combined model we will only train the generator
        self.discriminator.trainable = False
        
        # The discriminator determines the validity of (fake last frame, conditioning first frame) pairs
        valid = self.discriminator([fake_last_frame, first_frame])

        self.combined = Model(inputs=[last_frame, first_frame], outputs=[valid, fake_last_frame])
        self.combined.compile(loss=['mse', 'mae'], # mean squared and mean absolute errors
                              loss_weights=[1, 100],
                              optimizer=Adam(0.0002, 0.5))

    def build_generator(self):
        inputs = Input(shape=self.inputShape)
        
        conv1= Conv2D(filters=32,kernel_size=3,activation='relu',padding='same',kernel_initializer='he_normal')(inputs)
        conv1 = BatchNormalization(momentum=0.8)(conv1)
        conv1= Conv2D(filters=32,kernel_size=3,activation='relu',padding='same',kernel_initializer='he_normal')(conv1)
        conv1 = BatchNormalization(momentum=0.8)(conv1)
        pool1 = MaxPooling2D(pool_size=(2, 2))(conv1)
        
        conv2= Conv2D(filters=64,kernel_size=3,activation='relu',padding='same',kernel_initializer='he_normal')(pool1)
        conv2 = BatchNormalization(momentum=0.8)(conv2)
        conv2= Conv2D(filters=64,kernel_size=3,activation='relu',padding='same',kernel_initializer='he_normal')(conv2)
        conv2 = BatchNormalization(momentum=0.8)(conv2)
        pool2 = MaxPooling2D(pool_size=(2, 2))(conv2)
        
        conv3= Conv2D(filters=128,kernel_size=3,activation='relu',padding='same',kernel_initializer='he_normal')(pool2)
        conv3 = BatchNormalization(momentum=0.8)(conv3)
        conv3= Conv2D(filters=128,kernel_size=3,activation='relu',padding='same',kernel_initializer='he_normal')(conv3)
        conv3 = BatchNormalization(momentum=0.8)(conv3)
        pool3 = MaxPooling2D(pool_size=(2, 2))(conv3)
        
        conv4= Conv2D(filters=256,kernel_size=3,activation='relu',padding='same',kernel_initializer='he_normal')(pool3)
        conv4 = BatchNormalization(momentum=0.8)(conv4)
        conv4= Conv2D(filters=256,kernel_size=3,activation='relu',padding='same',kernel_initializer='he_normal')(conv4)
        conv4 = BatchNormalization(momentum=0.8)(conv4)
        drop4 = Dropout(0.5)(conv4)
        '''pool4 = MaxPooling2D(pool_size=(2, 2))(drop4)
        
        conv5= Conv2D(filters=256,kernel_size=3,activation='relu',padding='same',kernel_initializer='he_normal')(pool4)
        conv5= Conv2D(filters=256,kernel_size=3,activation='relu',padding='same',kernel_initializer='he_normal')(conv5)
        #pool2 = MaxPooling2D(pool_size=(2, 2))(conv2)
        drop5 = Dropout(0.5)(conv5)
        
        
        up6 = UpSampling2D(size=(2, 2))(drop5)
        up6 = Conv2D(filters=256,kernel_size=3,activation='relu',padding='same',kernel_initializer='he_normal')(up6)
        up6 = Concatenate(axis=3)([drop4, up6])
        conv6 = Conv2D(filters=256,kernel_size=3,activation='relu',padding='same',kernel_initializer='he_normal')(up6)
        conv6 = Conv2D(filters=256,kernel_size=3,activation='relu',padding='same',kernel_initializer='he_normal')(conv6)'''

        up7 = UpSampling2D(size=(2, 2))(conv4)
        up7 = Conv2D(filters=256,kernel_size=2,activation='relu',padding='same',kernel_initializer='he_normal')(up7)
        up7 = BatchNormalization(momentum=0.8)(up7)
        up7 = Concatenate(axis=3)([conv3, up7])
        conv7 = Conv2D(filters=256,kernel_size=3,activation='relu',padding='same',kernel_initializer='he_normal')(up7)
        conv7 = BatchNormalization(momentum=0.8)(conv7)
        conv7 = Conv2D(filters=256,kernel_size=3,activation='relu',padding='same',kernel_initializer='he_normal')(conv7)
        conv7 = BatchNormalization(momentum=0.8)(conv7)

        up8 = UpSampling2D(size=(2, 2))(conv7)
        up8 = Conv2D(filters=128,kernel_size=2,activation='relu',padding='same',kernel_initializer='he_normal')(up8)
        up8 = BatchNormalization(momentum=0.8)(up8)
        up8 = Concatenate(axis=3)([conv2, up8])
        conv8 = Conv2D(filters=128,kernel_size=3,activation='relu',padding='same',kernel_initializer='he_normal')(up8)
        conv8 = BatchNormalization(momentum=0.8)(conv8)
        conv8 = Conv2D(filters=128,kernel_size=3,activation='relu',padding='same',kernel_initializer='he_normal')(conv8)

        up9 = UpSampling2D(size=(2, 2))(conv8)
        up9 = Conv2D(filters=64,kernel_size=2,activation='relu',padding='same',kernel_initializer='he_normal')(up9)
        up9 = BatchNormalization(momentum=0.8)(up9)
        up9 = Concatenate(axis=3)([conv1, up9])
        conv9 = Conv2D(filters=64,kernel_size=3,activation='relu',padding='same',kernel_initializer='he_normal')(up9)
        conv9 = BatchNormalization(momentum=0.8)(conv9)
        conv9 = Conv2D(filters=64,kernel_size=3,activation='relu',padding='same',kernel_initializer='he_normal')(conv9)
        conv9 = BatchNormalization(momentum=0.8)(conv9)
        conv9 = Conv2D(filters=32,kernel_size=3,activation='relu',padding='same',kernel_initializer='he_normal')(conv9)
        conv9 = BatchNormalization(momentum=0.8)(conv9)
        
        nbr_img_channels = self.inputShape[2]

        outputs = Conv2D(nbr_img_channels, (1, 1), activation='sigmoid')(conv9)

        model = Model(inputs=inputs, outputs=outputs, name='Generator')
        model.summary()
 
        '''inputs = Input(shape=self.inputShape)
        print(inputs.shape)
 
        down1 = Conv2D(32, (3, 3),padding='same')(inputs)
        down1 = Activation('relu')(down1) 
        down1_pool = MaxPooling2D((2, 2), strides=(2, 2))(down1)
         
        down2 = Conv2D(64, (3, 3), padding='same')(down1_pool)
        down2 = Activation('relu')(down2) 
         

        up1 = UpSampling2D((2, 2))(down2)
        up1 = concatenate([down1, up1], axis=3)
        up1 = Conv2D(256, (3, 3), padding='same')(up1) 
        up1 = Activation('relu')(up1) 
        
        
        up2 = Conv2D(256, (3, 3), padding='same')(up1) 
        up2 = Activation('relu')(up2) 
        
        nbr_img_channels = self.inputShape[2]
        outputs = Conv2D(nbr_img_channels, (1, 1), activation='sigmoid')(up2)

        model = Model(inputs=inputs, outputs=outputs, name='Generator')
        model.summary()'''
        
        return model

    def build_discriminator(self):
  
        last_img = Input(shape=self.inputShape)
        first_img = Input(shape=self.inputShape)

        # Concatenate image and conditioning image by channels to produce input
        combined_imgs = Concatenate(axis=-1)([last_img, first_img])
  
        d1 = Conv2D(32, (3, 3), strides=2, padding='same')(combined_imgs) 
        d1 = Activation('relu')(d1) 
        d2 = Conv2D(64, (3, 3), strides=2, padding='same')(d1)
        d2 = Activation('relu')(d2) 
        d3 = Conv2D(128, (3, 3), strides=2, padding='same')(d2)
        d3 = Activation('relu')(d3) 
         
        validity = Conv2D(1, (3, 3), strides=2, padding='same')(d3)

        model = Model([last_img, first_img], validity)
        model.summary()

        return model
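
The `disc_patch` shape computed in `__init__` has to agree with the spatial size that `build_discriminator()` actually outputs. With four stride-2 convolutions and `'same'` padding, each layer halves the resolution (rounding up), which is where the `2**4` comes from. The following sketch checks this arithmetic for the three resolutions mentioned in this exercise; `patch_size` is a hypothetical helper written only for illustration.

```python
import math

def patch_size(image_size, num_stride2_layers=4):
    """Spatial size after repeated stride-2 convolutions with 'same' padding."""
    size = image_size
    for _ in range(num_stride2_layers):
        size = math.ceil(size / 2)  # 'same' padding: out = ceil(in / stride)
    return size

for res in (32, 64, 128):
    # int(res / 2**4) is the formula used for disc_patch in __init__
    print(res, patch_size(res), int(res / 2**4))
```

If you add or remove a strided layer in the discriminator, the exponent in the `patch` formula needs the same change; otherwise the adversarial target arrays will no longer match the discriminator output shape.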

3) We import the network **hyperparameters** and build a simple network by calling the class introduced in the previous step. Please note that to change the hyperparameters, you just need to change the values in the file called **configGAN.py.**

In [3]:
image_shape = (cfg.IMAGE_HEIGHT, cfg.IMAGE_WIDTH, cfg.IMAGE_CHANNEL)
modelObj = GANModel(batch_size=cfg.BATCH_SIZE, inputShape=image_shape,
                                 dropout_prob=cfg.DROPOUT_PROB)

Model: "model"
__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to                     
input_1 (InputLayer)            [(None, 64, 64, 3)]  0                                            
__________________________________________________________________________________________________
input_2 (InputLayer)            [(None, 64, 64, 3)]  0                                            
__________________________________________________________________________________________________
concatenate (Concatenate)       (None, 64, 64, 6)    0           input_1[0][0]                    
                                                                 input_2[0][0]                    
__________________________________________________________________________________________________
conv2d (Conv2D)                 (None, 32, 32, 32)   1760        concatenate[0][0]            
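
The parameter counts in the summary can be verified by hand, which is a useful sanity check after changing the architecture. A 2D convolution has one weight per kernel position, input channel, and filter, plus one bias per filter. The sketch below reproduces the 1760 parameters of the first discriminator layer (3x3 kernel, 6 input channels from the two concatenated RGB images, 32 filters); `conv2d_params` is a hypothetical helper, not a Keras function.

```python
def conv2d_params(kernel_h, kernel_w, in_channels, filters, use_bias=True):
    """Number of trainable parameters in a standard 2D convolution."""
    weights = kernel_h * kernel_w * in_channels * filters
    biases = filters if use_bias else 0
    return weights + biases

# First discriminator layer: Conv2D(32, (3, 3)) on a 6-channel input
print(conv2d_params(3, 3, 6, 32))  # 1760
```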

4) We call the utility function **show_statistics** to display the data distribution. This is just for debugging purposes.

In [4]:
#### show what the data looks like
show_statistics(cfg.training_data_dir, fineGrained=False, title=" Training Data Statistics ")
show_statistics(cfg.validation_data_dir, fineGrained=False, title=" Validation Data Statistics ")
show_statistics(cfg.testing_data_dir, fineGrained=False, title=" Testing Data Statistics ")


######################################################################
##################### Training Data Statistics #####################
######################################################################
total image number 	 10817
total class number 	 3
class square 	 3488 images
class circular 	 3626 images
class triangle 	 3703 images
######################################################################

######################################################################
##################### Validation Data Statistics #####################
######################################################################
total image number 	 2241
total class number 	 3
class triangle 	 745 images
class square 	 783 images
class circular 	 713 images
######################################################################

######################################################################
##################### Testing Data Statistics #####################
##########################
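
To judge whether the classes are balanced, the counts printed above can be turned into fractions. The sketch below uses the training and validation numbers from this output; the testing numbers are left out because they are cut off in this copy. `class_fractions` is a hypothetical helper, not part of `utilsGAN.py`.

```python
# Counts copied from the show_statistics output above.
train_counts = {"square": 3488, "circular": 3626, "triangle": 3703}
valid_counts = {"triangle": 745, "square": 783, "circular": 713}

def class_fractions(counts):
    """Share of each class in the total number of images."""
    total = sum(counts.values())
    return {name: n / total for name, n in counts.items()}

for split, counts in (("training", train_counts), ("validation", valid_counts)):
    fractions = class_fractions(counts)
    print(split, sum(counts.values()),
          {name: round(f, 3) for name, f in fractions.items()})
```

Each class makes up roughly a third of both splits, so no class-specific reweighting appears to be needed here.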

5) We now create batch generators to get small batches from the entire dataset. There is no need to change these functions as they already return **normalized inputs as batches**.

In [5]:
nbr_train_data = get_dataset_size(cfg.training_data_dir)
nbr_valid_data = get_dataset_size(cfg.validation_data_dir)
nbr_test_data = get_dataset_size(cfg.testing_data_dir)
train_batch_generator = generate_lastframepredictor_batches(cfg.training_data_dir, image_shape, cfg.BATCH_SIZE)
valid_batch_generator = generate_lastframepredictor_batches(cfg.validation_data_dir, image_shape, cfg.BATCH_SIZE)
test_batch_generator = generate_lastframepredictor_batches(cfg.testing_data_dir, image_shape, cfg.BATCH_SIZE)
print("Data batch generators are created!")

Data batch generators are created!
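
The dataset sizes computed in this cell determine how many batches the training loop draws per epoch. As a small sketch of that arithmetic, assuming `get_dataset_size` returns the 10817 training images reported by the statistics cell and a batch size of 30 (the real values come from the data folder and `cfg.BATCH_SIZE`):

```python
def steps_per_epoch(num_samples, batch_size):
    """Number of full batches per epoch and the samples left over."""
    return num_samples // batch_size, num_samples % batch_size

# Assumed values, see the note above.
steps, leftover = steps_per_epoch(10817, 30)
print(steps, leftover)  # 360 17
```

Integer division drops the incomplete final batch, so with these assumed numbers 17 samples are not part of a full batch in a given pass.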


6) We can visualize what the data looks like for debugging purposes.

In [6]:
if cfg.DEBUG_MODE:
    t_x, t_y = next(train_batch_generator)
    print('train_x', t_x.shape, t_x.dtype, t_x.min(), t_x.max())
    print('train_y', t_y.shape, t_y.dtype, t_y.min(), t_y.max()) 
    #plot_sample_lastframepredictor_data_with_groundtruth(t_x, t_y, t_y)
    pprint.pprint (cfg)

train_x (30, 64, 64, 3) float32 0.0 1.0
train_y (30, 64, 64, 3) float32 0.0 1.0
{'BATCH_SIZE': 30,
 'DATA_AUGMENTATION': True,
 'DEBUG_MODE': True,
 'DROPOUT_PROB': 0.5,
 'GPU': 0,
 'IMAGE_CHANNEL': 3,
 'IMAGE_HEIGHT': 64,
 'IMAGE_WIDTH': 64,
 'LEARNING_RATE': 0.01,
 'LR_DECAY_FACTOR': 0.1,
 'NUM_EPOCHS': 5,
 'PRINT_EVERY': 20,
 'SAVE_EVERY': 1,
 'SEQUENCE_LENGTH': 10,
 'testing_data_dir': '../data/FlyingObjectDataset_10K/testing',
 'training_data_dir': '../data/FlyingObjectDataset_10K/training',
 'validation_data_dir': '../data/FlyingObjectDataset_10K/validation'}


7) Start timer and init matrices

In [7]:
start_time = datetime.now()
# Adversarial loss ground truths
valid = np.ones((cfg.BATCH_SIZE,) + modelObj.disc_patch)
fake = np.zeros((cfg.BATCH_SIZE,) + modelObj.disc_patch)
# log file
output_log_dir = "./logs/{}".format(datetime.now().strftime("%Y%m%d-%H%M%S"))
if not os.path.exists(output_log_dir):
    os.makedirs(output_log_dir)

8) We can now feed the training data to the network. This will train the network for the configured number of epochs (`cfg.NUM_EPOCHS`). Note that the number of epochs is also predefined in the file called **configGAN.py.**

In [None]:
import imageio
import matplotlib.pyplot as plt
from skimage import img_as_ubyte
import numpy as np 

%matplotlib inline


test_first_imgs, test_last_imgs = next(test_batch_generator)

for epoch in range(cfg.NUM_EPOCHS):
    steps_per_epoch = (nbr_train_data // cfg.BATCH_SIZE) 
    for batch_i in range(steps_per_epoch):
        first_frames, last_frames= next(train_batch_generator)
        if first_frames.shape[0] == cfg.BATCH_SIZE: 
             
            # Condition on the first frame and generate the last frame
            fake_last_frames = modelObj.generator.predict(first_frames)
            #plt.imshow(fake_last_frames[1])
            print(fake_last_frames.shape)
            #print(tf.keras.backend.mean(fake_last_frames[0]))
            print(np.mean(fake_last_frames[0]))

            # Train the discriminator with combined loss  
            d_loss_real = modelObj.discriminator.train_on_batch([last_frames, first_frames], valid)
            d_loss_fake = modelObj.discriminator.train_on_batch([fake_last_frames, first_frames], fake)
            d_loss = 0.5 * np.add(d_loss_real, d_loss_fake)
 
            # Train the generator
            g_loss = modelObj.combined.train_on_batch([last_frames, first_frames], [valid, last_frames])

            elapsed_time = datetime.now() - start_time 
            print ("[Epoch %d/%d] [Batch %d/%d] [D loss: %f] [G loss: %f] time: %s" % (epoch, cfg.NUM_EPOCHS,
                                                                                               batch_i,
                                                                                               steps_per_epoch,
                                                                                               d_loss[0], 
                                                                                               g_loss[0],
                                                                                               elapsed_time))
            # run some tests to check how the generated images evolve during training
            test_fake_last_imgs = modelObj.generator.predict(test_first_imgs)
            test_img_name = output_log_dir + "/gen_img_epoc_" + str(epoch) + ".png"
            merged_img = np.vstack((first_frames[0],last_frames[0],fake_last_frames[0]))
            imageio.imwrite(test_img_name, img_as_ubyte(merged_img)) #scipy.misc.imsave(test_img_name, merged_img)
  

(30, 64, 64, 3)
0.5123859
[Epoch 0/5] [Batch 0/360] [D loss: 0.521882] [G loss: 51.729744] time: 0:01:55.399744
(30, 64, 64, 3)
0.46975246
[Epoch 0/5] [Batch 1/360] [D loss: 0.382826] [G loss: 49.014347] time: 0:01:55.661834
(30, 64, 64, 3)
0.5017248
[Epoch 0/5] [Batch 2/360] [D loss: 0.308842] [G loss: 47.192730] time: 0:01:55.842164
(30, 64, 64, 3)
0.5099958
[Epoch 0/5] [Batch 3/360] [D loss: 0.263464] [G loss: 45.672604] time: 0:01:56.036658
(30, 64, 64, 3)
0.51462656
[Epoch 0/5] [Batch 4/360] [D loss: 0.232008] [G loss: 44.569897] time: 0:01:56.244516
(30, 64, 64, 3)
0.52320975
[Epoch 0/5] [Batch 5/360] [D loss: 0.214573] [G loss: 42.836452] time: 0:01:56.460845
(30, 64, 64, 3)
0.5407674
[Epoch 0/5] [Batch 6/360] [D loss: 0.204692] [G loss: 41.815075] time: 0:01:56.650737
(30, 64, 64, 3)
0.56274575
[Epoch 0/5] [Batch 7/360] [D loss: 0.199852] [G loss: 40.513672] time: 0:01:56.869258
(30, 64, 64, 3)
0.558219
[Epoch 0/5] [Batch 8/360] [D loss: 0.197937] [G loss: 39.192722] time: 0:01
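
The reported G loss is the first entry returned by `combined.train_on_batch`, which for a multi-output Keras model is the weighted sum of the individual losses. With `loss=['mse', 'mae']` and `loss_weights=[1, 100]`, the reconstruction (MAE) term is multiplied by 100, which likely explains why the G loss is so much larger than the D loss. The sketch below reproduces that weighting in plain Python on made-up numbers; the helper names are hypothetical and the values are not taken from a real training run.

```python
def mse(pred, target):
    """Mean squared error over two equally long sequences."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)

def mae(pred, target):
    """Mean absolute error over two equally long sequences."""
    return sum(abs(p - t) for p, t in zip(pred, target)) / len(pred)

def generator_loss(disc_out, valid, fake_img, real_img, weights=(1, 100)):
    """Weighted sum matching loss=['mse', 'mae'], loss_weights=[1, 100]."""
    adv = mse(disc_out, valid)     # adversarial term: fool the discriminator
    rec = mae(fake_img, real_img)  # reconstruction term: match the last frame
    return weights[0] * adv + weights[1] * rec

# Illustrative numbers only.
total = generator_loss(disc_out=[0.5, 0.5], valid=[1.0, 1.0],
                       fake_img=[0.5, 0.5, 0.5], real_img=[0.0, 1.0, 0.5])
print(total)
```

Lowering the second weight shifts the balance toward the adversarial term, at the cost of less direct supervision from the ground truth frame.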

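9) We can test the model on 100 test batches. For each batch, the first sample is saved as an image showing the first frame, the true last frame, and the generated last frame.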

In [None]:
for batch_i in range(100):
    test_first_imgs, test_last_imgs = next(test_batch_generator)
    test_fake_last_imgs = modelObj.generator.predict(test_first_imgs) 

    test_img_name = output_log_dir + "/gen_img_test_" + str(batch_i) + ".png"
    merged_img = np.vstack((test_first_imgs[0],test_last_imgs[0],test_fake_last_imgs[0]))
    imageio.imwrite(test_img_name, img_as_ubyte(merged_img))

## EXERCISES

#### Exercise 1)
Update the network architecture given in  **build_generator**  and  **build_discriminator**  of the class GANModel. Please note that the current image resolution is set to 32x32 (i.e. IMAGE_WIDTH and IMAGE_HEIGHT values) in the file configGAN.py. 
This way, initial experiments can run faster. Once you have implemented the initial version of the network, please set the resolution values back to 128x128. Experimental results should be provided for these high-resolution images.  

**Hint:** As a generator model, you can use the segmentation model implemented in lab03. Do not forget to adapt the input and output shapes of the generator model in this case.

#### Exercise 2) 
Use different **optimization** (e.g. Adam, SGD, etc.) and **regularization** (e.g. data augmentation, dropout) methods to increase the network accuracy. 
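
When comparing optimizers for Exercise 2, it helps to keep in mind how differently they scale their updates. The sketch below implements one step of plain SGD and one step of the textbook Adam update rule for a single scalar parameter, using the learning rate 0.0002 and `beta_1` of 0.5 passed as `Adam(0.0002, 0.5)` in `GANModel`. It is a simplified illustration, not the Keras implementation; the remaining defaults (`beta2`, `eps`) are assumptions.

```python
import math

def sgd_step(param, grad, lr):
    """Plain gradient descent: the step is proportional to the gradient."""
    return param - lr * grad

def adam_step(param, grad, m, v, t, lr=0.0002, beta1=0.5, beta2=0.999, eps=1e-7):
    """One textbook Adam update; returns the new parameter and moments."""
    m = beta1 * m + (1 - beta1) * grad
    v = beta2 * v + (1 - beta2) * grad ** 2
    m_hat = m / (1 - beta1 ** t)  # bias-corrected first moment
    v_hat = v / (1 - beta2 ** t)  # bias-corrected second moment
    return param - lr * m_hat / (math.sqrt(v_hat) + eps), m, v

for grad in (0.01, 10.0):
    p_sgd = sgd_step(1.0, grad, lr=0.0002)
    p_adam, _, _ = adam_step(1.0, grad, m=0.0, v=0.0, t=1)
    print(grad, 1.0 - p_sgd, 1.0 - p_adam)
```

On the first step, Adam moves the parameter by about the learning rate regardless of the gradient magnitude, while the SGD step scales with the gradient. A learning rate that works for one optimizer therefore usually needs retuning for the other.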