# Training and Applying Generative Adversarial Nets

In this project we develop a generative adversarial network (GAN). Our approach follows the article *Generative Adversarial Nets* by Goodfellow et al., which proposes a framework in which two models are trained simultaneously: a generative model G and a discriminative model D.

Hence two main roles are played in this framework:

* **Generative role**: captures the distribution of the real data, which we will call R.

* **Discriminator role**: estimates the probability that a sample came from the real data R rather than from the generative model G.

Within this framework, G is trained to maximize the probability of D being mistaken, i.e. of D judging that the data generated by G came from R.
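
In the notation of Goodfellow et al., this adversarial game can be written as a two-player minimax problem over a value function $V(D, G)$. The symbols $p_{\text{data}}$ (the distribution of the real data R) and $p_z$ (a prior over the noise $z$ that G takes as input) come from that article and are not used elsewhere in this notebook.

```latex
\min_G \max_D V(D, G) =
  \mathbb{E}_{x \sim p_{\text{data}}(x)}\big[\log D(x)\big]
  + \mathbb{E}_{z \sim p_z(z)}\big[\log\big(1 - D(G(z))\big)\big]
```

D is trained to increase both terms, while G is trained to decrease the second one, which is the only term that depends on G.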

A helpful metaphor for this framework is the facial composite: the witness (here D) evaluates the portrait drawn by the sketch artist (here G), and the goal of G is to make the portrait as close as possible to the suspect's face (which would be R).


To do so we will use Python and Keras, a high-level neural network API written in Python and capable of running on top of TensorFlow, Theano, or CNTK.

First of all, we load all the required libraries:

In [2]:
from __future__ import print_function
import keras
from keras.datasets import cifar10
from keras.preprocessing.image import ImageDataGenerator
from keras.models import Sequential
from keras.layers import Dense, Dropout, Activation, Flatten
from keras.layers import Conv2D, MaxPooling2D
from keras.optimizers import SGD
from keras.constraints import maxnorm



With the modules loaded, we define a few values that will be used as parameters for our model.

We set the batch size, i.e. the number of samples propagated through the network in each update, to 32. We also specify 25 epochs, i.e. 25 full passes over the training set.

Furthermore, we disable data augmentation and declare that our data has 10 classes; this value is used later to encode the labels as categorical.

All of this is defined in the code below:

In [3]:
batch_size = 32
num_classes = 10
epochs = 25
data_augmentation = False
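
To make the relationship between batch size and epochs concrete, the following sketch counts how many weight updates these settings imply. The name `num_train_samples` is introduced here for illustration, and its value assumes the full 50,000-image CIFAR-10 training set is used.

```python
import math

# Values taken from the parameter cell above
batch_size = 32
epochs = 25

# Assumption: the full CIFAR-10 training set of 50,000 images is used
num_train_samples = 50000

# The last batch of an epoch may be smaller than batch_size, hence the ceiling
steps_per_epoch = math.ceil(num_train_samples / batch_size)
total_updates = steps_per_epoch * epochs

print(steps_per_epoch, "weight updates per epoch")
print(total_updates, "weight updates over all epochs")
```

A larger batch size would reduce the number of updates per epoch, at the cost of more memory per update.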

With the parameters defined, we load our data as training and test sets.

`cifar10.load_data()` returns the CIFAR-10 images already split into 50,000 training samples and 10,000 test samples, as shown in the code below:

In [None]:
(x_train, y_train), (x_test, y_test) = cifar10.load_data()
print('x_train shape:', x_train.shape)
print(x_train.shape[0], 'train samples')
print(x_test.shape[0], 'test samples')

Next we encode the labels as categorical (one-hot) vectors, using the 10 classes we previously defined.

In [None]:
y_train = keras.utils.to_categorical(y_train, num_classes)
y_test = keras.utils.to_categorical(y_test, num_classes)
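
As an illustration of what this encoding produces, here is a plain-Python sketch of one-hot encoding. The helper `one_hot` and the labels in `example_labels` are hypothetical and only mimic the effect of `to_categorical` on integer class labels; they are not the Keras implementation.

```python
def one_hot(labels, num_classes):
    """Return one row per label, with 1.0 at the label's index and 0.0 elsewhere."""
    rows = []
    for label in labels:
        row = [0.0] * num_classes
        row[label] = 1.0
        rows.append(row)
    return rows

# Hypothetical labels; CIFAR-10 stores its classes as integers from 0 to 9
example_labels = [3, 0, 9]
encoded = one_hot(example_labels, num_classes=10)

for label, row in zip(example_labels, encoded):
    print(label, row)
```

Each row sums to 1, which lets it be compared directly with the softmax output of the network.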

Now that our data is split into training and test sets, we build our model.

The model has two 2D convolutional layers, each with an output dimensionality of 32 (the number of filters) and a 3 x 3 kernel. Both use a ReLU activation, since it produces sparse activations and reduces the likelihood of vanishing gradients.

Between those two layers we regularize with **Dropout** at a rate of 0.2, to prevent complex co-adaptations on the training data.

After the second layer we apply max pooling to reduce the dimensionality of its output, then flatten the result into a one-dimensional vector and pass it to a Dense layer with 512 units, i.e. a fully connected layer in which each input node is connected to each output node.

Finally we apply regularization again via Dropout, this time with a rate of 0.5, followed by a last dense layer with 10 units, one per class.

All of this is done in the code below:

In [None]:
model = Sequential()
# Take the input shape from the loaded data so it matches the configured image data format
model.add(Conv2D(32, (3, 3), input_shape=x_train.shape[1:], padding='same', activation='relu', kernel_constraint=maxnorm(3)))
model.add(Dropout(0.2))
model.add(Conv2D(32, (3, 3), activation='relu', padding='same', kernel_constraint=maxnorm(3)))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Flatten())
model.add(Dense(512, activation='relu', kernel_constraint=maxnorm(3)))
model.add(Dropout(0.5))
model.add(Dense(num_classes, activation='softmax'))
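
The following sketch traces the tensor shape and the number of trainable parameters through the layers above. It assumes channels-last inputs of shape (32, 32, 3) and stride-1 convolutions; the helper functions are hypothetical and written only for this walkthrough.

```python
def conv2d_same(shape, filters, kernel=(3, 3)):
    """Shape and parameter count of a stride-1 Conv2D layer with 'same' padding."""
    height, width, channels = shape
    params = kernel[0] * kernel[1] * channels * filters + filters  # weights + biases
    return (height, width, filters), params

def max_pool(shape, pool=(2, 2)):
    """Shape after non-overlapping max pooling (no trainable parameters)."""
    height, width, channels = shape
    return (height // pool[0], width // pool[1], channels)

def dense(inputs, units):
    """Output size and parameter count of a fully connected layer."""
    return units, inputs * units + units  # weights + biases

shape = (32, 32, 3)                              # assumed channels-last input
shape, conv1_params = conv2d_same(shape, 32)     # Dropout after it keeps the shape
shape, conv2_params = conv2d_same(shape, 32)
shape = max_pool(shape)
flat = shape[0] * shape[1] * shape[2]            # Flatten
hidden, dense1_params = dense(flat, 512)         # Dropout after it keeps the shape
outputs, dense2_params = dense(hidden, 10)

total_params = conv1_params + conv2_params + dense1_params + dense2_params
print("shape after pooling:", shape)
print("flattened length:", flat)
print("trainable parameters:", total_params)
```

Under these assumptions almost all parameters sit in the first dense layer, which is why the flattened length matters so much for model size.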

After specifying the layers of our model, we select a gradient descent optimization algorithm to find an appropriate local optimum.

We define two optimizers: RMSprop with a learning rate of 0.0001 and a decay of 1e-6, and SGD with a learning rate of 0.01, a momentum of 0.9, and a decay equal to its learning rate divided by the number of epochs defined previously.

In [None]:
rmsprop = keras.optimizers.RMSprop(lr=0.0001, decay=1e-6)
lrate = 0.01
decay = lrate/epochs
sgd = SGD(lr=lrate, momentum=0.9, decay=decay, nesterov=False)
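
To see what the `decay` argument does to the SGD learning rate, here is a sketch that assumes the time-based schedule of the legacy Keras optimizers, in which the rate is divided by `1 + decay * iterations` and `iterations` counts batch updates rather than epochs. The function `decayed_learning_rate` and the value of `steps_per_epoch` are illustrative assumptions, not part of the notebook.

```python
def decayed_learning_rate(initial_lr, decay, iteration):
    """Time-based decay: the rate shrinks as the number of updates grows."""
    return initial_lr / (1.0 + decay * iteration)

lrate = 0.01
epochs = 25
decay = lrate / epochs

# Assumption: 50,000 training images in batches of 32 (rounded up)
steps_per_epoch = 1563

for epoch in (0, 1, 5, 25):
    iteration = epoch * steps_per_epoch
    rate = decayed_learning_rate(lrate, decay, iteration)
    print("after epoch %2d: learning rate = %.6f" % (epoch, rate))
```

If the schedule is applied per update as assumed here, the learning rate falls well below its initial value within the first few epochs.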

We have now loaded our training and test sets, specified the layers of our model, and defined two optimization algorithms, RMSprop and SGD. Before training, we compile the model, using categorical cross-entropy as the loss function and the SGD optimizer defined previously, with accuracy as our main metric. We also convert the pixel values to floats and scale them to the range [0, 1].


In [None]:
model.compile(loss='categorical_crossentropy',
              optimizer=sgd,
              metrics=['accuracy'])

x_train = x_train.astype('float32')
x_test = x_test.astype('float32')
x_train /= 255
x_test /= 255
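
The loss named above can be illustrated in plain Python. This is a minimal sketch of softmax followed by categorical cross-entropy for a single sample; the function names are chosen here for illustration, and the sketch omits the numerical clipping that a library implementation would typically apply.

```python
import math

def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(logits)                       # subtract the max for numerical stability
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def categorical_crossentropy(one_hot_target, probabilities):
    """Negative log-probability assigned to the true class."""
    return -sum(t * math.log(p)
                for t, p in zip(one_hot_target, probabilities) if t > 0)

target = [0.0] * 10
target[3] = 1.0                              # hypothetical true class: 3

uniform_loss = categorical_crossentropy(target, softmax([0.0] * 10))

confident_logits = [0.0] * 10
confident_logits[3] = 5.0                    # high score for the true class
confident_loss = categorical_crossentropy(target, softmax(confident_logits))

print("loss for a uniform guess:", uniform_loss)
print("loss for a confident correct guess:", confident_loss)
```

A model that guesses uniformly over 10 classes has a loss of ln(10), so training losses should drop below that value once the model starts to learn.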

In [None]:
# The inputs were already scaled to [0, 1] in the previous cell; scaling them again would shrink them further.
# model.summary() prints the layer-by-layer summary itself.
model.summary()

In [None]:
if not data_augmentation:
    print('Not using data augmentation.')
    model.fit(x_train, y_train,
              batch_size=batch_size,
              epochs=epochs,
              validation_data=(x_test, y_test),
              shuffle=True)
    # Final evaluation of the model
    scores = model.evaluate(x_test, y_test, verbose=0)
    print("Accuracy: %.2f%%" % (scores[1] * 100))

else:
    print('Using real-time data augmentation.')
    # This will do preprocessing and realtime data augmentation:
    datagen = ImageDataGenerator(
        featurewise_center=False,  # set input mean to 0 over the dataset
        samplewise_center=False,  # set each sample mean to 0
        featurewise_std_normalization=False,  # divide inputs by std of the dataset
        samplewise_std_normalization=False,  # divide each input by its std
        zca_whitening=False,  # apply ZCA whitening
        rotation_range=0,  # randomly rotate images in the range (degrees, 0 to 180)
        width_shift_range=0.1,  # randomly shift images horizontally (fraction of total width)
        height_shift_range=0.1,  # randomly shift images vertically (fraction of total height)
        horizontal_flip=True,  # randomly flip images horizontally
        vertical_flip=False)  # randomly flip images vertically (disabled here)

    # Compute quantities required for feature-wise normalization
    # (std, mean, and principal components if ZCA whitening is applied).
    datagen.fit(x_train)

    # Fit the model on the batches generated by datagen.flow().
    model.fit_generator(datagen.flow(x_train, y_train,
                                     batch_size=batch_size),
                        steps_per_epoch=x_train.shape[0] // batch_size,
                        epochs=epochs,
                        validation_data=(x_test, y_test))
    # Final evaluation of the model
    scores = model.evaluate(x_test, y_test, verbose=0)
    print("Accuracy: %.2f%%" % (scores[1] * 100))
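
As a rough illustration of the `horizontal_flip` option in the augmentation branch, the sketch below mirrors a tiny image left to right at random. The function `random_horizontal_flip`, the nested-list image, and the default probability of 0.5 are assumptions made for this example; they are not taken from the `ImageDataGenerator` source.

```python
import random

def random_horizontal_flip(image, rng, probability=0.5):
    """Mirror each row of a 2-D image left to right with the given probability."""
    if rng.random() < probability:
        return [row[::-1] for row in image]
    return [list(row) for row in image]

# Hypothetical 2 x 3 single-channel image
image = [[1, 2, 3],
         [4, 5, 6]]

rng = random.Random(0)  # seeded generator so repeated runs behave the same

always_flipped = random_horizontal_flip(image, rng, probability=1.0)
never_flipped = random_horizontal_flip(image, rng, probability=0.0)

print("original:", image)
print("flipped: ", always_flipped)
```

Because the label of an image does not change when it is mirrored, this kind of transformation gives the model additional training examples without new data.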