# Creating New Faces - Generative Adversarial Network Tutorial


[Edward Toth, PhD, University of Sydney]

- e-mail: eddie_toth@hotmail.com
- Add me on: https://www.linkedin.com/in/edward-toth/ 
- Join the community: https://www.meetup.com/Get-Singapore-Meetup-Group/
- Data Avenger: https://data-avenger.mailchimpsites.com/


### In today's tutorial, you'll learn about:
- Discriminative and generative models
- Deep convolutional generative adversarial networks (DCGANs)
- Generating faces

Original paper for deep convolutional generative adversarial networks (DCGANs):
https://arxiv.org/pdf/1511.06434.pdf

<a id = "0"></a><br>
1. [Load Libraries](#1)
1. [Data Preparation](#2)
     * [View Images](#2a)
1. [Discriminative vs. Generative Models](#3)
1. [Discriminative Model](#4)
1. [Generative Model](#5)
1. [Generative Adversarial Networks (GANs)](#6)
1. [Steps for Generating Zombie Faces](#7)
    

<a id = "1"></a><br>
## Load Libraries

In [None]:
import os
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from PIL import Image
import tensorflow as tf

<a id = "2"></a><br>
## Data Preparation
- Check image sizes 
- Resize images to $120 \times 120$ pixels
- Normalize images (divide by 255)
- Convert to tensorflow compatible format



In [None]:
image_dir = "/kaggle/input/faces-data-new/images/"

def load_faces():
    """Load every .jpg in image_dir, resized to 120x120 and scaled to [0, 1]."""
    data = []
    sizes = []
    for name in os.listdir(image_dir):
        # Skip anything that is not a JPEG image
        if '.jpg' not in name:
            continue
        img = Image.open(os.path.join(image_dir, name))
        # Record the original array shape before resizing
        sizes.append(np.array(img).shape)
        img = img.resize((120, 120))
        # Convert to a float32 array and normalize pixel values to [0, 1]
        pixels = np.array(img, dtype="float32") / 255
        data.append(pixels)
    return sizes, np.stack(data)


In [None]:
# Check image sizes and load data
sizes, dataset = load_faces()
print("Number of images:", len(sizes))
print("Unique Shapes of images:", pd.Series(sizes).unique() )
dataset.shape

<a id = "2a"></a><br>
### Image Selection
- View Images
- Pick Images with green background

In [None]:
plt.figure(figsize = (15,15))
for i in range(10):
    plt.subplot(5,5,i+1)
    plt.axis("off")
    plt.imshow(dataset[i])
plt.show()

In [None]:
# Channel order is RGB: index 0 = red, 1 = green, 2 = blue
# np.unique(dataset[2][:,:,1])

green_back = []
for d in dataset:
    green = d[:, :, 1] * 255  # green channel rescaled to 0-255
    # Keep images whose green channel is fairly bright (mean > 70)
    # and fairly uniform (sample standard deviation < 40)
    if green.mean() > 70 and green.std(ddof=1) < 40:
        green_back.append(d)

len(green_back)

In [None]:
dataset = np.stack(green_back)
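
The same selection can be sketched without an explicit loop. The snippet below is an illustration on synthetic data, assuming a float array of shape `(n, height, width, 3)` scaled to $[0, 1]$. The helper name `select_green` and the toy images are hypothetical, and it uses NumPy's default (population) standard deviation, which differs slightly from the pandas sample standard deviation.

```python
import numpy as np

def select_green(images, min_mean=70, max_std=40):
    """Keep images whose green channel is bright and fairly uniform."""
    green = images[:, :, :, 1] * 255          # green channel on a 0-255 scale
    means = green.mean(axis=(1, 2))           # one mean per image
    stds = green.std(axis=(1, 2))             # one standard deviation per image
    mask = (means > min_mean) & (stds < max_std)
    return images[mask], mask

# Toy data: one uniformly green image and one black image (both 4x4 pixels)
green_img = np.zeros((4, 4, 3), dtype="float32")
green_img[:, :, 1] = 0.8
black_img = np.zeros((4, 4, 3), dtype="float32")

selected, mask = select_green(np.stack([green_img, black_img]))
print(mask)            # which images passed the filter
print(selected.shape)  # shape of the array of kept images
```

Only the first toy image passes, because the black image has a green-channel mean of 0.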

In [None]:

plt.figure(figsize = (15,15))
for i in range(50):
    plt.subplot(10,7,i+1)
    plt.axis("off")
    plt.imshow(dataset[i])
plt.show()

<a id = "3"></a><br>
## Discriminative vs. Generative Models

Discriminative machine learning:
- Typically, model parameters are learnt by maximizing the conditional probability $P(Y|X)$ of a label $Y$ given the data $X$.
- Classifies data into a certain category/label.
- E.g. neural networks whose output layer uses a "sigmoid" or "softmax" activation function.

Generative machine learning:
- Trains the model by maximizing the joint probability $P(X,Y)$ (or $P(X)$ when there are no labels).
- Learns how the data itself is distributed, so the model can generate new samples that resemble the training data.
- E.g. Naive Bayes classifiers and the generator network of a GAN.
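
To make the difference concrete, the sketch below uses a made-up joint probability table $P(X,Y)$ for a binary feature and a binary label. The numbers are purely illustrative. A discriminative model only needs the conditional probability of the label given the data, while a model of the joint distribution can also be sampled to produce new $(X, Y)$ pairs.

```python
import numpy as np

# Hypothetical joint distribution P(X, Y): rows index X in {0, 1}, columns index Y in {0, 1}
joint = np.array([[0.30, 0.10],
                  [0.15, 0.45]])

# Marginal P(X): sum over Y
p_x = joint.sum(axis=1)

# Conditional P(Y | X) = P(X, Y) / P(X): what a discriminative model estimates
p_y_given_x = joint / p_x[:, None]
print(p_y_given_x)

# A generative view: draw new (x, y) pairs from the joint distribution
rng = np.random.default_rng(0)
flat_index = rng.choice(joint.size, size=5, p=joint.ravel())
samples = np.column_stack(np.unravel_index(flat_index, joint.shape))
print(samples)  # each row is a sampled (x, y) pair
```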


<a id = "4"></a><br>
## Discriminative Model

### Convolutional Neural Network

Recommendations from the original DCGAN paper https://arxiv.org/pdf/1511.06434.pdf:
- Replace max-pooling layers with strided convolutions in the CNN model.
- Kernel initializer: initial weights are drawn from a Normal distribution with mean 0 and standard deviation 0.02.
- `Adam(0.0002, beta_1=0.5)`: follows the DCGAN paper, where the learning rate is 0.0002 and the momentum term `beta_1` is 0.5 (`beta_1` controls the exponentially decaying average of past gradients that determines the update direction).
- Use LeakyReLU activation in the discriminator for all layers.
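
The difference between ReLU and Leaky ReLU can be sketched in plain NumPy. This is an illustration rather than the Keras implementation; the slope 0.2 is the value passed to `LeakyReLU` in this notebook.

```python
import numpy as np

def relu(x):
    # Negative inputs are set to zero
    return np.maximum(x, 0.0)

def leaky_relu(x, alpha=0.2):
    # Negative inputs are scaled by alpha instead of being zeroed
    return np.where(x > 0, x, alpha * x)

x = np.array([-2.0, -0.5, 0.0, 1.0])
print(relu(x))        # negative entries become zero
print(leaky_relu(x))  # negative entries are shrunk but keep their sign
```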

Further explanations:
- `LeakyReLU`: Leaky ReLU passes a small gradient signal for negative values, whereas ReLU outputs zero for them. This lets more gradient flow from the discriminator into the generator.
- Number of convolutional filters in the layers: $(128,128,64,64)$
- Kernel size: e.g. 3 or (3,3) is the size of the convolutional filters that are moved across the image. Without padding, information is lost at the image borders, because a (3,3) filter fits into an (N,N) image at only N-2 positions along each dimension.
- Padding = "same": the input is zero-padded so that, with a stride of 1, the output features have the same height and width as the input. Without this, the output size is reduced depending on the kernel size.
- Strides: e.g. (2,2) or 2 moves the convolutional filters across the image 2 pixels at a time. Combined with padding = "same", this halves the height and width of the input tensor (rounding up for odd sizes).
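
The effect of kernel size, padding, and stride on the output size can be checked with a little arithmetic. The helper `conv_output_size` below is a hypothetical name; it follows the usual output-size formulas for "same" and "valid" padding and is a sketch, not a call into Keras.

```python
import math

def conv_output_size(n, kernel, stride, padding):
    """Output length along one dimension of a convolution."""
    if padding == "same":
        # Zero-padding keeps every input position, so only the stride matters
        return math.ceil(n / stride)
    if padding == "valid":
        # Without padding the filter fits at n - kernel + 1 positions
        return math.ceil((n - kernel + 1) / stride)
    raise ValueError("padding must be 'same' or 'valid'")

# Four stride-2 'same' convolutions, as in the discriminator of this notebook
size = 120
sizes = [size]
for _ in range(4):
    size = conv_output_size(size, kernel=3, stride=2, padding="same")
    sizes.append(size)
print(sizes)

# A 3x3 filter without padding and with stride 1 loses two pixels per dimension
print(conv_output_size(120, kernel=3, stride=1, padding="valid"))
```

Under these assumptions the $120 \times 120$ input shrinks to $8 \times 8$ after the fourth convolution, so the `Flatten` layer receives $8 \times 8 \times 64$ values.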

Another suggestion:
- Batch normalization (batchnorm): applying batchnorm directly to all layers, however, results in model instability (even when excluding the generator output layer and the discriminator input layer). We achieve better results for our image generation without batchnorm.


In [None]:
# CNN model: 3*[Conv2D -> LeakyReLU] -> Conv2D -> Flatten -> Dense + Sigmoid

def discriminator(inp_shape = (120,120,3)):
    model = tf.keras.models.Sequential([
        tf.keras.layers.Conv2D(128, kernel_size = 3, strides = 2,  padding="same", 
               input_shape = inp_shape, kernel_initializer = tf.random_normal_initializer(mean=0, stddev=0.02) ),
        tf.keras.layers.LeakyReLU(0.2),
        
        tf.keras.layers.Conv2D(128, kernel_size = 3, strides = 2, padding="same"),
        tf.keras.layers.LeakyReLU(0.2),
        
        tf.keras.layers.Conv2D(64, kernel_size = 3, strides = 2, padding="same"),
        tf.keras.layers.LeakyReLU(0.2),
        
        tf.keras.layers.Conv2D(64, kernel_size = 3, strides = 2, padding = "same"),
        tf.keras.layers.Flatten(),      
        tf.keras.layers.Dense(1, activation = "sigmoid")
    ],
        name="discriminator")
    model.compile(optimizer = tf.keras.optimizers.Adam(0.0002, beta_1=0.5), loss = "binary_crossentropy", metrics = ['acc'])
    return model

In [None]:
# View model layers 
d_model = discriminator()
tf.keras.utils.plot_model(d_model, show_shapes = True)
# d_model.summary()

<a id = "5"></a><br>
## Generative Model  


Architecture guidelines for stable Deep Convolutional GANs from https://arxiv.org/pdf/1511.06434.pdf:
- `Conv2DTranspose`: a fractional-strided (transposed) convolutional layer applies a transformation that goes in the opposite direction of a normal convolution, so it upsamples its input instead of downsampling it. In a way, we want to unfold what happened in the discriminator.

- `latent_dim`: the dimension of the unseen (hidden) space from which the generator's random input vectors are drawn. The size of this latent space affects how well images can be generated. It is selected manually, so it is worth checking how changing this parameter changes the results.

- (15,15,128): [or a flat vector of length 128 * 15 * 15] represents 128 feature maps of $15 \times 15$ pixels, which the transposed convolutions upsample to $120 \times 120$.
- Use ReLU activation in the generator for all layers except the output, which uses Tanh.
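
The shapes flowing through the generator can be traced by hand. The sketch below assumes that a transposed convolution with "same" padding multiplies the height and width by its stride, which is how `Conv2DTranspose` behaves in Keras. The function name is hypothetical.

```python
def transpose_conv_output_size(n, stride):
    # With padding="same", a transposed convolution scales the size by the stride
    return n * stride

# The Dense layer produces 128 feature maps of 15 x 15 values
dense_units = 128 * 15 * 15
print(dense_units)

# Three stride-2 transposed convolutions upsample the 15 x 15 maps step by step
size = 15
sizes = [size]
for _ in range(3):
    size = transpose_conv_output_size(size, stride=2)
    sizes.append(size)
print(sizes)
```

The final size matches the $120 \times 120$ training images, which is why the generator starts from $15 \times 15$ feature maps.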


<!-- - DCGANs Directly applying batchnorm to all layers however, resulted in sample oscillation and model instability. This was avoided by not applying
batchnorm to the generator output layer and the discriminator input layer. -->

Different models:
- Change the `relu` activation function to `LeakyReLU`
- Add Batch Normalization after each convolutional layer except the output layer. 


In [None]:
# Generative model with LeakyReLU (the BatchNormalization layers are commented out)
def generator(latent_dim = 100):
    model = tf.keras.models.Sequential([
        tf.keras.layers.Dense(128 * 15 * 15, input_dim = latent_dim,
                              kernel_initializer = tf.random_normal_initializer(mean=0, stddev=0.02) ),
        tf.keras.layers.LeakyReLU(0.2),
        tf.keras.layers.Reshape((15,15,128)),

        # First transpose convolutional filter 
        tf.keras.layers.Conv2DTranspose(128, kernel_size = 3, strides = 2,  padding = "same"),
#         tf.keras.layers.BatchNormalization(),
        tf.keras.layers.LeakyReLU(0.2),

        # Second transpose convolutional filter 
        tf.keras.layers.Conv2DTranspose(128, kernel_size = 3, strides = 2,  padding = "same"),
#         tf.keras.layers.BatchNormalization(),
        tf.keras.layers.LeakyReLU(0.2),

         # Third transpose convolutional filter 
        tf.keras.layers.Conv2DTranspose(64, kernel_size = 3, strides = 2,  padding = "same"),
#         tf.keras.layers.BatchNormalization(),
        tf.keras.layers.LeakyReLU(0.2),
        
        tf.keras.layers.Conv2D(3,  kernel_size = 3, padding = "same", activation = "tanh") # alternative: "sigmoid", whose (0, 1) output range matches images scaled to [0, 1]
        
    ])
    return model

In [None]:
g_model = generator()
tf.keras.utils.plot_model(g_model, show_shapes = True)
# g_model.summary()

<a id = "6"></a><br>
## Deep Convolutional Generative Adversarial Networks (DCGANs) 

DCGANs are one of the most popular and successful network designs for GANs. They are composed mainly of convolution layers, without max pooling or fully connected layers. They use strided convolutions for downsampling and transposed convolutions for upsampling. The figure below is the network design for the generator.

Other notebooks that use DCGANs: 
https://www.kaggle.com/akshat0007/generating-new-simpsons-character-using-dcgan

[Back to Top](#0)

In [None]:
def gan(g_model, d_model):
    d_model.trainable = False
    model = tf.keras.models.Sequential([
        g_model,
        d_model
    ],
        name="DCGANs")
    model.compile(optimizer = tf.keras.optimizers.Adam(0.0002, beta_1=0.5), loss = "binary_crossentropy")
    return model

In [None]:
gan_model = gan(g_model, d_model)
tf.keras.utils.plot_model(gan_model, show_shapes = True)
# gan_model.summary()

New functions:
- `generate_real_samples`: selects a random batch of real images.
- `generate_latent_space`: draws random points from the latent space, which the generator takes as input to produce new images.
- `generate_fake_examples`: generates a batch of fake images.
- `plot_samples`: shows 49 generated images.
- `summarize_performance`: reports the accuracy of the discriminator at classifying images as real or fake, and also shows generated images.

Function inputs:
- `n_size`: the batch size, i.e. the number of images that are selected and generated (the default is 128 in `summarize_performance`, while `train` uses `batch_size = 200`).

Labels: real images are labelled 1, e.g. `np.ones((batch_size, 1))`, and fake images are labelled 0, e.g. `np.zeros((batch_size, 1))`.
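
The helper functions listed above are not defined in this notebook. A minimal sketch of how they could look is given below. The signatures, the default values, and the stand-in generator used for the demonstration are assumptions, not the author's code.

```python
import numpy as np

def generate_real_samples(dataset, n_size=128):
    """Pick n_size random real images and label them 1."""
    ind = np.random.randint(0, dataset.shape[0], n_size)
    return dataset[ind], np.ones((n_size, 1))

def generate_latent_space(latent_dim=100, n_size=128):
    """Draw n_size points from a standard normal latent space."""
    return np.random.normal(0, 1, (n_size, latent_dim))

def generate_fake_examples(g_model, latent_dim=100, n_size=128):
    """Generate n_size fake images with the generator and label them 0."""
    z = generate_latent_space(latent_dim, n_size)
    return g_model.predict(z), np.zeros((n_size, 1))

class DummyGenerator:
    """Hypothetical stand-in for the Keras generator: returns random images."""
    def predict(self, z):
        return np.random.uniform(0, 1, (z.shape[0], 120, 120, 3))

toy_dataset = np.random.uniform(0, 1, (10, 120, 120, 3))  # hypothetical data
X_real, y_real = generate_real_samples(toy_dataset, n_size=4)
X_fake, y_fake = generate_fake_examples(DummyGenerator(), n_size=4)
print(X_real.shape, y_real.ravel())
print(X_fake.shape, y_fake.ravel())
```

With the real generator, `g_model` from this notebook would take the place of `DummyGenerator()`.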

In [None]:
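def summarize_performance(g_model, dataset, n_size = 128):
    # Relies on the notebook-level d_model and on the helper functions described above
    # Discriminator accuracy on a batch of real images
    X_real, y_real = generate_real_samples(dataset)
    _, accr = d_model.evaluate(X_real, y_real)

    # Discriminator accuracy on a batch of generated (fake) images
    X_fake, y_fake = generate_fake_examples(g_model)
    _, accf = d_model.evaluate(X_fake, y_fake)

    print("Real samples Acc: {}".format(accr*100))
    print("Fake samples Acc: {}".format(accf*100))

    # Show 49 generated images in a 7x7 grid
    # plot_samples(X_fake)
    plt.figure(figsize = (15,15))
    for i in range(7*7):
        plt.subplot(7,7,i+1)
        plt.axis("off")
        plt.imshow(X_fake[i])
    plt.show()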
    

<a id = "7"></a><br>
# Steps for Generating Zombie Faces:
1. Select a random batch of real images
1. Randomly select points from the latent space and generate fake images
1. Track the discriminator loss (combined loss from the real and fake batches)
1. Track the loss from the DCGAN over the training iterations
1. Generate fake images using the generator model
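
Because the discriminator is compiled with an accuracy metric, `train_on_batch` returns a `[loss, accuracy]` pair for each batch. The combined discriminator loss tracked in the steps above is the element-wise average of the pair from the real batch and the pair from the fake batch. The numbers below are made up for illustration.

```python
import numpy as np

# Hypothetical [loss, accuracy] pairs for one real and one fake batch
d_loss_real = [0.60, 0.75]
d_loss_fake = [0.40, 0.85]

# Element-wise average of the two pairs
d_loss = 0.5 * np.add(d_loss_real, d_loss_fake)
print("D loss: %.2f, Acc.: %.0f%%" % (d_loss[0], 100 * d_loss[1]))
```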

In [None]:


def train(g_model, d_model, gan_model, dataset, iterations=2000, batch_size=200, latent_dim=100, sample_interval=200):
    losses = []
    accuracies = []
    # Labels for real (1) and fake (0) examples
    real = np.ones((batch_size, 1))
    fake = np.zeros((batch_size, 1))

    for iteration in range(iterations):

        # -------------------------
        #  Train the Discriminator
        # -------------------------
        # Select a random batch of real images
        ind = np.random.randint(0, dataset.shape[0], batch_size)
        imgs = dataset[ind]

        # Generate points from the latent space
        z = np.random.normal(0, 1, (batch_size, latent_dim))
        # Generate a batch of fake images
        gen_imgs = g_model.predict(z)

        # Discriminator [loss, accuracy] on each batch, then their average
        d_loss_real = d_model.train_on_batch(imgs, real)      # real images, label 1
        d_loss_fake = d_model.train_on_batch(gen_imgs, fake)  # fake images, label 0
        d_loss = 0.5 * np.add(d_loss_real, d_loss_fake)

        # ---------------------
        #  Train the Generator
        # ---------------------
        # Generate points from the latent space
        z = np.random.normal(0, 1, (batch_size, latent_dim))

        # GAN loss: the generator is rewarded when the discriminator labels its images as real
        g_loss = gan_model.train_on_batch(z, real)

        if iteration % sample_interval == 0:
            # Output training progress
            print("%d [D loss: %f, Acc.: %.2f%%] [G loss: %f]" %
                  (iteration, d_loss[0], 100*d_loss[1], g_loss))

            # Save losses and accuracies to be plotted after training
            losses.append((d_loss[0], g_loss))
            accuracies.append(100*d_loss[1])

            # Generate a batch of fake images and show 49 of them
            # summarize_performance(g_model, dataset)
            gen_imgs = g_model.predict(z)
            plt.figure(figsize = (15,15))
            for i in range(7*7):
                plt.subplot(7,7,i+1)
                plt.axis("off")
                plt.imshow(gen_imgs[i])
            plt.show()

    return losses, accuracies


In [None]:
train(g_model, d_model, gan_model, dataset)

### Our results are consistent with the findings from https://machinelearningmastery.com/practical-guide-to-gan-failure-modes/:
- A stable GAN will have a discriminator loss (D loss) of around 0.5 (maybe as high as 0.8).
- The accuracy (acc) of the discriminator for classifying real and generated images will not be 50%, but should hover around 70% to 80%.
- The generator loss (G loss) is typically larger than the D loss.

How to improve results?
- Tune parameters or use different architectures
- Use images that are centered with white backgrounds
- Possibly train for additional iterations or epochs


### THE END
[Back to Top](#0)