# iLykei Lecture Series

# Advanced Machine Learning and Artificial Intelligence (MScA 32017)

# Project:  Generative Models

# Topic: Conditional Generative Adversarial Networks (CGANs)

## Notebook 1: Training Conditional GAN (CGAN)


## Yuri Balasanov, &copy; iLykei 2019

##### Main texts: 

**[Generative Adversarial Networks, Ian J. Goodfellow, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde-Farley, Sherjil Ozair, Aaron Courville, Yoshua Bengio, arXiv:1406.2661 [stat.ML]](https://arxiv.org/abs/1406.2661)**

**[Conditional Generative Adversarial Nets, Mehdi Mirza, Simon Osindero, arXiv:1411.1784 [cs.LG]](https://arxiv.org/abs/1411.1784)**

**[How to Develop a Conditional GAN (cGAN) From Scratch, Jason Brownlee, 2019](https://machinelearningmastery.com/how-to-develop-a-conditional-generative-adversarial-network-from-scratch/)**    

This notebook shows how to construct a conditional GAN (CGAN) and train it on the Fashion MNIST data.

The code in this notebook is based on [this blog post by Jason Brownlee](https://machinelearningmastery.com/how-to-develop-a-conditional-generative-adversarial-network-from-scratch/).

In [1]:
import numpy as np
from numpy.random import randn
from numpy.random import randint
from keras.datasets.fashion_mnist import load_data
from keras.optimizers import Adam
from keras.models import Model
from keras.layers import Input
from keras.layers import Dense
from keras.layers import Reshape
from keras.layers import Flatten
from keras.layers import Conv2D
from keras.layers import Conv2DTranspose
from keras.layers import LeakyReLU
from keras.layers import Dropout
from keras.layers import Embedding
from keras.layers import Concatenate

Using TensorFlow backend.


In [2]:
# Remove when run on GPU
from matplotlib import pyplot as plt
from keras.utils import plot_model

Load the data. Like an unconditional GAN, a conditional GAN requires a training set of images; in addition, it requires the training labels.

In [3]:
#from keras.datasets import mnist, fashion_mnist
# load MNIST or FASHION_MNIST dataset
#(x_train, y_train), (_, _) = mnist.load_data()
#(x_train, y_train), (_, _) = fashion_mnist.load_data()

#np.save('x_train.npy',x_train)
#np.save('y_train.npy',y_train)
#np.save('x_train_fashion.npy',x_train)
#np.save('y_train_fashion.npy',y_train)
#x_train=np.load('x_train.npy')
#y_train=np.load('y_train.npy')
x_train=np.load('x_train_fashion.npy')
y_train=np.load('y_train_fashion.npy')

## Discriminator

Create a function that builds the discriminator model:

- The first input is the label of the class used as the condition, a single number from 0 to 9. For example, 0 is the T-shirt class, so conditioning on 0 asks the generator for variations of T-shirts. The label is passed through an embedding layer with output dimension 50 and then through a dense layer with 784 units, and the result is reshaped into (28,28,1), the same shape as the input image
- The second input is an image that comes either from the sample ("real" image) or from the generator ("fake" image)
- The transformed first input and the second input are concatenated into a shape equivalent to a 2-channel image: (28,28,2)
- Two consecutive 2D convolutions are applied, each with 128 filters, stride 2 and LeakyReLU activation, reducing the spatial size from (28,28) to (14,14) to (7,7)
- The resulting (7,7,128) tensor is flattened and regularized with dropout
- The output layer is 1 unit with sigmoid activation for binary classification
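
The way the label becomes an extra image channel can be traced with plain NumPy. This is only a sketch under stated assumptions: the weight arrays are random stand-ins for the trained `Embedding` and `Dense` weights, the names `embedding_table`, `dense_w` and `dense_b` are illustrative, and the length-1 axis that Keras keeps after the embedding is dropped for brevity.

```python
import numpy as np

rng = np.random.default_rng(0)
n_classes, embed_dim, height, width = 10, 50, 28, 28

# Random stand-ins for trainable weights (learned in the real model)
embedding_table = rng.normal(size=(n_classes, embed_dim))
dense_w = rng.normal(size=(embed_dim, height * width))
dense_b = np.zeros(height * width)

labels = np.array([0, 3, 9])                              # batch of three class labels
images = rng.uniform(-1, 1, size=(3, height, width, 1))   # stand-in images in [-1, 1]

# Embedding: look up one 50-number row per label
embedded = embedding_table[labels]                        # shape (3, 50)
# Dense layer with linear activation: 50 -> 784
label_map = embedded @ dense_w + dense_b                  # shape (3, 784)
# Reshape to an image-like channel
label_channel = label_map.reshape(-1, height, width, 1)   # shape (3, 28, 28, 1)

# Concatenate along the channel axis: image + label channel
merged = np.concatenate([images, label_channel], axis=-1)
print(merged.shape)  # (3, 28, 28, 2)
```

The convolutions that follow therefore see the class information at every pixel position, as a second channel next to the image itself.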

In [4]:
def define_discriminator(in_shape=(28,28,1), n_classes=10):
    # label input
    in_label = Input(shape=(1,))
    # embedding for categorical input
    li = Embedding(n_classes, 50)(in_label)
    # scale up to image dimensions with linear activation
    n_nodes = in_shape[0] * in_shape[1]
    li = Dense(n_nodes)(li)
    # reshape to additional channel
    li = Reshape((in_shape[0], in_shape[1], 1))(li)
    # image input
    in_image = Input(shape=in_shape)
    # concat label as a channel
    merge = Concatenate()([in_image, li])
    # downsample
    fe = Conv2D(128, (3,3), strides=(2,2), padding='same')(merge)
    fe = LeakyReLU(alpha=0.2)(fe)
    # downsample
    fe = Conv2D(128, (3,3), strides=(2,2), padding='same')(fe)
    fe = LeakyReLU(alpha=0.2)(fe)
    # flatten feature maps
    fe = Flatten()(fe)
    # dropout
    fe = Dropout(0.4)(fe)
    # output
    out_layer = Dense(1, activation='sigmoid')(fe)
    # define model
    model = Model([in_image, in_label], out_layer)
    # compile model
    opt = Adam(lr=0.0002, beta_1=0.5)
    model.compile(loss='binary_crossentropy', optimizer=opt, metrics=['accuracy'])
    return model

In [5]:
# Remove when run on GPU
discriminator_structure = define_discriminator()
plot_model(discriminator_structure, to_file='cgan_discriminator_structure.png',show_shapes=True,show_layer_names=True)
discriminator_structure.summary()

__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to                     
input_1 (InputLayer)            (None, 1)            0                                            
__________________________________________________________________________________________________
embedding_1 (Embedding)         (None, 1, 50)        500         input_1[0][0]                    
__________________________________________________________________________________________________
dense_1 (Dense)                 (None, 1, 784)       39984       embedding_1[0][0]                
__________________________________________________________________________________________________
input_2 (InputLayer)            (None, 28, 28, 1)    0                                            
__________________________________________________________________________________________________
reshape_1 

The discriminator has 196,773 trainable parameters.  

Plot the architecture of the discriminator.

![CGAN Discriminator](cgan_discriminator_structure.png)

## Generator

Create a function that builds the generator of the conditional GAN:

- First input. The conditioning class information is prepared in a similar way to the discriminator: a single class number between 0 and 9 is embedded into a 50-number vector, passed through a dense layer with 49 units and reshaped into a (7,7,1) image-like input
- Second input. A `latent_dim`-dimensional random seed vector is passed through a dense layer with 6272 units and reshaped into (7,7,128), i.e. 128 feature maps of size (7,7) each
- Both inputs are concatenated into a set of (7,7,129) feature maps
- A sequence of 2 `Conv2DTranspose` layers with `LeakyReLU` activation increases the image size from (7,7) to (14,14) to (28,28)
- The output is a (28,28,1) image produced by a final convolution with `tanh` activation
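
The shape bookkeeping in the list above can be checked with a short NumPy sketch. It assumes the stride-2, `padding='same'` setting used in the code below, for which a transposed convolution multiplies each spatial size by the stride; the helper name `upsampled_size` is illustrative and not part of Keras.

```python
import numpy as np

def upsampled_size(size, stride):
    # With padding='same', a transposed convolution outputs size * stride
    return size * stride

batch = 4
noise_maps = np.zeros((batch, 7, 7, 128))   # reshaped output of the 6272-unit dense layer
label_map = np.zeros((batch, 7, 7, 1))      # reshaped output of the 49-unit dense layer

# Concatenate along the channel axis
merged = np.concatenate([noise_maps, label_map], axis=-1)
print(merged.shape)  # (4, 7, 7, 129)

# Two stride-2 upsampling steps
sizes = [7]
for _ in range(2):
    sizes.append(upsampled_size(sizes[-1], stride=2))
print(sizes)  # [7, 14, 28]

# Number of units needed to fill 128 feature maps of 7x7
print(128 * 7 * 7)  # 6272
```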

In [6]:
def define_generator(latent_dim, n_classes=10):
    # label input
    in_label = Input(shape=(1,))
    # embedding for categorical input
    li = Embedding(n_classes, 50)(in_label)
    # linear multiplication
    n_nodes = 7 * 7
    li = Dense(n_nodes)(li)
    # reshape to additional channel
    li = Reshape((7, 7, 1))(li)
    # image generator input
    in_lat = Input(shape=(latent_dim,))
    # foundation for 7x7 image
    n_nodes = 128 * 7 * 7
    gen = Dense(n_nodes)(in_lat)
    gen = LeakyReLU(alpha=0.2)(gen)
    gen = Reshape((7, 7, 128))(gen)
    # merge image gen and label input
    merge = Concatenate()([gen, li])
    # upsample to 14x14
    gen = Conv2DTranspose(128, (4,4), strides=(2,2), padding='same')(merge)
    gen = LeakyReLU(alpha=0.2)(gen)
    # upsample to 28x28
    gen = Conv2DTranspose(128, (4,4), strides=(2,2), padding='same')(gen)
    gen = LeakyReLU(alpha=0.2)(gen)
    # output
    out_layer = Conv2D(1, (7,7), activation='tanh', padding='same')(gen)
    # define model
    model = Model([in_lat, in_label], out_layer)
    return model

In [7]:
# Remove when running on GPU
generator_structure = define_generator(latent_dim=100)
plot_model(generator_structure, to_file='cgan_generator_structure.png',show_shapes=True,show_layer_names=True)
generator_structure.summary()

__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to                     
input_4 (InputLayer)            (None, 100)          0                                            
__________________________________________________________________________________________________
input_3 (InputLayer)            (None, 1)            0                                            
__________________________________________________________________________________________________
dense_4 (Dense)                 (None, 6272)         633472      input_4[0][0]                    
__________________________________________________________________________________________________
embedding_2 (Embedding)         (None, 1, 50)        500         input_3[0][0]                    
__________________________________________________________________________________________________
leaky_re_l

The generator has 1,169,336 parameters, all trainable.

![CGAN Model](cgan_generator_structure.png)

## Adversarial model

Combine the generator and discriminator models into a GAN with the following steps.

- Make the discriminator weights in the adversarial model untrainable
- Use the generator's inputs as the inputs of the CGAN, i.e. the random noise tensor and the conditioning label tensor

In [8]:
# Remove when running on GPU
generator_structure.input

[<tf.Tensor 'input_4:0' shape=(?, 100) dtype=float32>,
 <tf.Tensor 'input_3:0' shape=(?, 1) dtype=float32>]

- Feed the output tensor of the generator to the discriminator as its image input

In [9]:
# Remove when running on GPU
print('Output tensor of generator:\n',generator_structure.output)


Output tensor of generator:
 Tensor("conv2d_3/Tanh:0", shape=(?, ?, ?, 1), dtype=float32)


- Use the discriminator's output tensor as the output tensor of the adversarial model

In [10]:
# Remove when running on GPU
print('\nOutput of discriminator as output of CGAN tensor:\n',
     discriminator_structure([generator_structure.output, generator_structure.input[1]]))



Output of discriminator as output of CGAN tensor:
 Tensor("model_1/dense_2/Sigmoid:0", shape=(?, 1), dtype=float32)


- Create the adversarial model from the generator's input tensors and discriminator's output tensor

In [11]:
# define the combined generator and discriminator model, for updating the generator
def define_gan(g_model, d_model):
    # make weights in the discriminator not trainable
    d_model.trainable = False
    # get noise and label inputs from generator model
    gen_noise, gen_label = g_model.input
    # get image output from the generator model
    gen_output = g_model.output
    # connect image output and label input from generator as inputs to discriminator
    gan_output = d_model([gen_output, gen_label])
    # define gan model as taking noise and label and outputting a classification
    model = Model([gen_noise, gen_label], gan_output)
    # compile model
    opt = Adam(lr=0.0002, beta_1=0.5)
    model.compile(loss='binary_crossentropy', optimizer=opt)
    return model

In [12]:
# Remove when running on GPU
gan_structure = define_gan(generator_structure,discriminator_structure)
plot_model(gan_structure, to_file='cgan_gan_structure.png',show_shapes=True,show_layer_names=True)
gan_structure.summary()

__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to                     
input_4 (InputLayer)            (None, 100)          0                                            
__________________________________________________________________________________________________
input_3 (InputLayer)            (None, 1)            0                                            
__________________________________________________________________________________________________
dense_4 (Dense)                 (None, 6272)         633472      input_4[0][0]                    
__________________________________________________________________________________________________
embedding_2 (Embedding)         (None, 1, 50)        500         input_3[0][0]                    
__________________________________________________________________________________________________
leaky_re_l

The model has 1,366,109 parameters in total: 1,169,336 trainable (generator) and 196,773 non-trainable (discriminator).
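
The parameter counts quoted in this notebook can be reproduced by hand from the layer definitions. The sketch below assumes `latent_dim=100` and uses two illustrative helpers, `dense_params` and `conv_params`, which are not Keras functions; a transposed convolution has the same number of weights as a regular one with the same kernel, input channels and filters.

```python
def dense_params(n_in, n_out):
    # weight matrix plus one bias per output unit
    return n_in * n_out + n_out

def conv_params(kernel, channels_in, filters):
    # one kernel x kernel x channels_in weight block plus one bias per filter
    return kernel * kernel * channels_in * filters + filters

discriminator = [
    10 * 50,                         # Embedding(10, 50)
    dense_params(50, 28 * 28),       # label -> 784
    conv_params(3, 2, 128),          # first 3x3 convolution on 2 channels
    conv_params(3, 128, 128),        # second 3x3 convolution
    dense_params(7 * 7 * 128, 1),    # sigmoid output unit
]

generator = [
    10 * 50,                         # Embedding(10, 50)
    dense_params(50, 7 * 7),         # label -> 49
    dense_params(100, 128 * 7 * 7),  # latent vector -> 6272
    conv_params(4, 129, 128),        # first 4x4 transposed convolution
    conv_params(4, 128, 128),        # second 4x4 transposed convolution
    conv_params(7, 128, 1),          # 7x7 output convolution
]

d_total, g_total = sum(discriminator), sum(generator)
print(d_total, g_total, d_total + g_total)  # 196773 1169336 1366109
```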

Plot the architecture of the CGAN.

![CGAN_Structure](cgan_gan_structure.png)

## Train the CGAN model

Create a function that prepares the data for the model: it expands the image dimensions from (28,28) to (28,28,1) and scales the pixel values to real numbers in [-1,1].

In [13]:
def prepare_samples():
    # expand
    X = np.expand_dims(x_train, axis=-1)
    X = X.astype('float32')
    # scale
    X = (X - 127.5) / 127.5
    return [X, y_train]
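
The [-1,1] range matches the `tanh` output of the generator, so real and generated images live on the same scale. A small round trip, sketched here on a handful of assumed pixel values, shows the scaling and how to invert it when images need to be displayed:

```python
import numpy as np

# A few 8-bit pixel values, including both ends of the range
pixels = np.array([0, 64, 127, 128, 255], dtype=np.uint8)

# Same scaling as in prepare_samples
scaled = (pixels.astype('float32') - 127.5) / 127.5
print(scaled.min(), scaled.max())  # -1.0 1.0

# Inverse transform back to 8-bit pixel values
restored = np.rint(scaled * 127.5 + 127.5).astype(np.uint8)
print(restored)  # [  0  64 127 128 255]
```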

Create a function that selects random real images and their labels from the data.

In [14]:
def generate_real_samples(dataset, n_samples):
    # split into images and labels
    images, labels = dataset
    # create random index of selected images
    ix = randint(0, images.shape[0], n_samples)
    # select images and labels
    X, labels = images[ix], labels[ix]
    # label images as 1 (real)
    y = np.ones((n_samples, 1))
    return [X, labels], y

Create a function that generates `n_samples` random latent vectors of dimension `latent_dim` from the standard normal distribution, together with `n_samples` random class labels. These vectors and labels are the inputs to the generator.

In [15]:
def generate_latent_points(latent_dim, n_samples, n_classes=10):
    # generate Gaussian latent vectors
    x_input = randn(latent_dim * n_samples)
    # reshape into a batch of inputs for the network
    z_input = x_input.reshape(n_samples, latent_dim)
    # generate random class labels from 0 to n_classes-1
    labels = randint(0, n_classes, n_samples)
    return [z_input, labels]
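
A quick shape check of the generator inputs, written as a self-contained sketch. It restates the function with `randn(n_samples, latent_dim)`, which draws the same distribution as the flat draw followed by `reshape` above; note that the upper bound of `randint` is exclusive, so labels run from 0 to 9.

```python
from numpy.random import randn, randint

def generate_latent_points(latent_dim, n_samples, n_classes=10):
    # Gaussian latent vectors, one row per sample
    z_input = randn(n_samples, latent_dim)
    # random class labels in 0 .. n_classes-1 (upper bound is exclusive)
    labels = randint(0, n_classes, n_samples)
    return [z_input, labels]

z_input, labels = generate_latent_points(latent_dim=100, n_samples=64)
print(z_input.shape, labels.shape)           # (64, 100) (64,)
print(labels.min() >= 0, labels.max() <= 9)  # True True
```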

Create a function that generates `n_samples` fake images from `latent_dim`-dimensional random input vectors.

In [16]:
def generate_fake_samples(generator, latent_dim, n_samples):
    # generate random latent inputs
    z_input, labels_input = generate_latent_points(latent_dim, n_samples)
    # predict outputs with generator
    images = generator.predict([z_input, labels_input])
    # label fake images as 0
    y = np.zeros((n_samples, 1))
    return [images, labels_input], y

Create a function that trains the adversarial model by alternating weight updates of the discriminator and the generator.

In [17]:
def train(g_model, d_model, gan_model, dataset, latent_dim, n_epochs=100, n_batch=128):
    # number of batches in one epoch
    bat_per_epo = int(dataset[0].shape[0] / n_batch)
    half_batch = int(n_batch / 2)
    # manually enumerate epochs
    for i in range(n_epochs):
        # enumerate batches in each epoch
        for j in range(bat_per_epo):
            # select half-batch of random real images
            [X_real, labels_real], y_real = generate_real_samples(dataset, half_batch)
            # update discriminator model weights on real samples
            d_loss1, _ = d_model.train_on_batch([X_real, labels_real], y_real)
            # generate half-batch of fake images
            [X_fake, labels], y_fake = generate_fake_samples(g_model, latent_dim, half_batch)
            # update discriminator model weights on fake images
            d_loss2, _ = d_model.train_on_batch([X_fake, labels], y_fake)
            # prepare full batch of random latent vectors as input for the generator
            [z_input, labels_input] = generate_latent_points(latent_dim, n_batch)
            # create 'misleading' labels 1 (real image) for the fake samples
            y_gan = np.ones((n_batch, 1))
            # train weights of the generator on the discriminator's errors
            g_loss = gan_model.train_on_batch([z_input, labels_input], y_gan)
            # summarize loss on this batch
            print('>%d, %d/%d, d1= %.3f, d2= %.3f g= %.3f' %
                (i+1, j+1, bat_per_epo, d_loss1, d_loss2, g_loss))
    # save the generator model
    g_model.save('try_cgan_generator.h5')
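
The alternating updates in `train` can be read against the conditional minimax objective from the Mirza and Osindero paper listed in the main texts, where $x$ is an image, $y$ the conditioning label and $z$ a latent vector:

$$
\min_G \max_D V(D, G) = \mathbb{E}_{x \sim p_{data}(x)}\left[\log D(x \mid y)\right] + \mathbb{E}_{z \sim p_z(z)}\left[\log\left(1 - D(G(z \mid y))\right)\right]
$$

The two `d_model.train_on_batch` calls correspond to the two expectations: binary cross-entropy with target 1 on real images and with target 0 on generated images. For the generator step, the code passes target 1 for generated images, so with binary cross-entropy the generator minimizes

$$
-\,\mathbb{E}_{z \sim p_z(z)}\left[\log D(G(z \mid y))\right]
$$

rather than the second term of $V(D, G)$ directly. This appears to be the alternative generator objective that Goodfellow et al. suggest for stronger gradients early in training.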

Run a quick training process for only 1 epoch with `n_batch=1000` to check that everything works.

In [18]:
# size of the latent space
latent_dim = 100
# create the discriminator
d_model = define_discriminator()
# create the generator
g_model = define_generator(latent_dim)
# create the cgan
gan_model = define_gan(g_model, d_model)
# prepare image data
dataset = prepare_samples()
# train model
train(g_model, d_model, gan_model, dataset, latent_dim, n_epochs=1, n_batch=1000)

  'Discrepancy between trainable weights and collected trainable'
  'Discrepancy between trainable weights and collected trainable'


>1, 1/60, d1= 0.695, d2= 0.695 g= 0.692


  'Discrepancy between trainable weights and collected trainable'


>1, 2/60, d1= 0.621, d2= 0.699 g= 0.688
>1, 3/60, d1= 0.563, d2= 0.706 g= 0.682
>1, 4/60, d1= 0.498, d2= 0.718 g= 0.670
>1, 5/60, d1= 0.445, d2= 0.740 g= 0.653
>1, 6/60, d1= 0.394, d2= 0.770 g= 0.629
>1, 7/60, d1= 0.346, d2= 0.809 g= 0.607
>1, 8/60, d1= 0.321, d2= 0.846 g= 0.594
>1, 9/60, d1= 0.297, d2= 0.853 g= 0.601
>1, 10/60, d1= 0.289, d2= 0.824 g= 0.646
>1, 11/60, d1= 0.287, d2= 0.747 g= 0.728
>1, 12/60, d1= 0.297, d2= 0.652 g= 0.829
>1, 13/60, d1= 0.309, d2= 0.569 g= 0.942
>1, 14/60, d1= 0.303, d2= 0.488 g= 1.056
>1, 15/60, d1= 0.306, d2= 0.445 g= 1.119
>1, 16/60, d1= 0.319, d2= 0.447 g= 1.084
>1, 17/60, d1= 0.310, d2= 0.495 g= 0.975
>1, 18/60, d1= 0.305, d2= 0.564 g= 0.866
>1, 19/60, d1= 0.253, d2= 0.629 g= 0.779
>1, 20/60, d1= 0.238, d2= 0.699 g= 0.708
>1, 21/60, d1= 0.221, d2= 0.815 g= 0.624
>1, 22/60, d1= 0.200, d2= 0.981 g= 0.523
>1, 23/60, d1= 0.184, d2= 1.140 g= 0.460
>1, 24/60, d1= 0.162, d2= 1.239 g= 0.445
>1, 25/60, d1= 0.147, d2= 1.209 g= 0.499
>1, 26/60, d1= 0.140, d2
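
Two numbers in the log above can be sanity-checked by hand. In the first logged line all three losses are close to 0.69, which is roughly what binary cross-entropy gives when an untrained discriminator outputs about 0.5 for every input. The sketch below assumes the Fashion MNIST training set of 60,000 images; the helper `binary_crossentropy` is a plain NumPy stand-in, not the Keras function.

```python
import numpy as np

def binary_crossentropy(y_true, p):
    # mean of -[y*log(p) + (1-y)*log(1-p)] over the batch
    return float(np.mean(-(y_true * np.log(p) + (1 - y_true) * np.log(1 - p))))

half_batch = 500
undecided = np.full((half_batch, 1), 0.5)   # discriminator output of 0.5 everywhere

loss_real = binary_crossentropy(np.ones((half_batch, 1)), undecided)
loss_fake = binary_crossentropy(np.zeros((half_batch, 1)), undecided)
print(round(loss_real, 3), round(loss_fake, 3))  # 0.693 0.693

# Batches per epoch, as computed by int(dataset[0].shape[0] / n_batch) in train
n_images = 60000
print(n_images // 1000, n_images // 128)  # 60 468
```

Losses drifting away from ln 2 ≈ 0.693 in the later lines indicate that the discriminator and generator have started to react to each other.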

Now run the training for longer to get more reasonable results.

Recommendation: run on GPU with, for example, `n_epochs=100, n_batch=128`