# Deep Convolutional Generative Adversarial Network (DCGAN)
This tutorial introduces the basics of building and training a GAN model. A GAN learns the distribution of the real data in order to generate new data that resembles it. TensorFlow version 2.1.0 is used to build our DCGAN model. <br>
<br>
This tutorial uses a manually created dataset of real casting defects. Although casting defects such as pores and shrinkages are 3D in nature, the dataset consists of 2D images so that the tutorial is simpler and the model trains faster. The dataset was built from selected slices of an image stack of real casting defects found in a Ni-based superalloy. <br>
<br>
The dataset can be found in the folder `dataset`. For this tutorial, we will train our model for `40 epochs` and generate images every `2 epochs` to see how the generator evolves during training. Since `40 epochs` is insufficient to train the model fully, we will then load the already trained model `trained.h5` using TensorFlow and generate a few images from random input noise.

As a side note, with much deeper networks (outside the scope of this tutorial since they need GPU hardware to train), it is *relatively easy* to achieve >90% (and even 95%) precision on this data set.

In [None]:
import numpy as np
import statistics as stat
import matplotlib.pyplot as plt
from matplotlib.colors import NoNorm
from skimage import io
from skimage.util import montage as montage2d
from sklearn.utils import shuffle
import tensorflow as tf
import scipy.ndimage  # explicit submodule import so that scipy.ndimage.zoom is available

module `matplotlib` is used for plotting graphs and displaying images <br>
module `skimage` helps us read, write and tile (montage) our images easily <br>
module `sklearn` provides `shuffle`, which randomly reorders the images (and their labels) <br>
module `scipy` provides `scipy.ndimage`, which we use to resize our images <br>
module `tensorflow` provides the ready-made layers, optimizers and loss functions used to build our machine learning models <br>

The `tensorflow` module is used to build and compile the generator, discriminator and GAN model.<br>
We will create a DCGAN class within which we will define functions to build generator and discriminator.<br>
<br>
`Sequential` is a Keras class (here with the TensorFlow backend) that builds a model by adding layers one after another.<br>
<br>

In [None]:
## objects below are needed to compile generator and discriminator
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.layers import Input
from tensorflow.keras.models import Model

## required objects to build the Generator and discriminator
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import BatchNormalization
from tensorflow.keras.layers import Dropout
from tensorflow.keras.layers import Conv2DTranspose
from tensorflow.keras.layers import Conv2D
from tensorflow.keras.layers import LeakyReLU
from tensorflow.keras.layers import ReLU
from tensorflow.keras.layers import Activation
from tensorflow.keras.layers import Flatten
from tensorflow.keras.layers import Dense
from tensorflow.keras.layers import Reshape

## Let's build our GAN model<br> 
**For the generator:**<br>
The generator consists of a dense layer that takes the input noise vector and creates the first layer, followed by three blocks made up of `Conv2DTranspose` (2D transposed convolution), `BatchNormalization` and `ReLU` activation layers, and finally a convolution layer with sigmoid activation.<br> 
The size of the first Dense layer is chosen such that its output can be reshaped into matrix form and serve as the first image fed into the `Conv2DTranspose` layer.<br>
Also note that the number of filters varies across layers and finally converges to a value equal to the number of channels. This is because a greyscale image has one channel (i.e., one image layer), while a coloured image has three channels and therefore requires three image layers (Red, Green, and Blue).

**For the Discriminator:**<br>
The architecture of the discriminator is almost the inverse of the generator. It takes an input image whose size matches the output of the generator (and the real images, of course). It contains three blocks of `Conv2D` and `LeakyReLU` activation layers. <br> 
Setting the padding to `same` ensures that the convolution itself does not shrink the image; however, applying a stride of 2 halves the image size after each block. <br>
Finally, the feature maps of all filters are flattened into a single vector of neurons, which is reduced to one output in the last layer. This output is a value between 0 and 1 (0 for fake and 1 for real).
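
The size bookkeeping described above can be checked with a few lines of plain Python. This is only a sketch of the shape arithmetic, assuming Keras' usual `same` padding behaviour (a transposed convolution multiplies the size by the stride, a strided convolution divides it and rounds up). The helper name `same_padding_out` is made up for this illustration.

```python
def same_padding_out(size, stride, transpose=False):
    """Spatial size after a 'same'-padded convolution (or transposed convolution)."""
    if transpose:
        return size * stride      # Conv2DTranspose: upsample by the stride
    return -(-size // stride)     # Conv2D: ceil(size / stride)

# Generator: 8x8 after the Reshape layer, then three transposed convolutions with stride 2
gen_sizes = [8]
for _ in range(3):
    gen_sizes.append(same_padding_out(gen_sizes[-1], 2, transpose=True))

# Discriminator: 64x64 input image, then three convolutions with stride 2
disc_sizes = [64]
for _ in range(3):
    disc_sizes.append(same_padding_out(disc_sizes[-1], 2))

# The last discriminator block uses 128 filters, so Flatten yields size * size * 128 neurons
flattened_units = disc_sizes[-1] ** 2 * 128

print(gen_sizes)        # [8, 16, 32, 64]
print(disc_sizes)       # [64, 32, 16, 8]
print(flattened_units)  # 8192
```

The generator output (64x64) therefore matches the input size expected by the discriminator.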


In [None]:
class DCGAN:
    @staticmethod
    def build_generator(channels=1, inputDim=256):
        model = Sequential()
        
        # No. of output units must match with size of first Trans-Conv layer
        model.add(Dense(input_dim=inputDim, units=8*8*64)) 
        model.add(BatchNormalization(momentum=0.9))
        # Reshape the series neurons into matrix form to serve as input for our first transposed convolution layer
        model.add(Reshape((8, 8, 64)))
        
        # So, image size here, input = 8*8, output = 16*16 with 64 filters
        model.add(Conv2DTranspose(64,(4,4),strides=(2,2),padding="same"))
        model.add(BatchNormalization(momentum=0.9))
        model.add(ReLU())

        # Image size here, input = 16*16, output = 32*32 with 128 filters
        model.add(Conv2DTranspose(128,(4,4),strides=(2,2),padding="same"))
        model.add(BatchNormalization(momentum=0.9))
        model.add(ReLU())
        
        # Image size here, input = 32*32, output = 64*64 with 256 filters
        model.add(Conv2DTranspose(256,(4,4),strides=(2,2),padding="same"))
        model.add(BatchNormalization(momentum=0.9))
        model.add(ReLU())
        
        # Image size here, input = 64*64, output = 64*64 with 1 filter (greyscale image)
        model.add(Conv2D(channels,(4,4),padding="same"))
        model.add(Activation("sigmoid"))

        model.summary() # prints the model architecture (summary() prints by itself and returns None)

        return model


    @staticmethod
    def build_discriminator(dim,alpha=0.2):
        model = Sequential()
        inputShape = (dim, dim, 1) #input shape = image size with the third dimension corresponding to channels

        model.add(Conv2D(64,(4,4),padding="same",strides=(2,2),input_shape=inputShape))
        model.add(LeakyReLU(alpha=alpha))

        model.add(Conv2D(128,(4,4),padding="same",strides=(2,2)))
        model.add(LeakyReLU(alpha=alpha))

        model.add(Conv2D(128,(4,4),padding="same",strides=(2,2)))
        model.add(LeakyReLU(alpha=alpha))

        model.add(Flatten()) # Flatten operation converts our matrix form neurons into series, a dense layer
        model.add(Dropout(0.2)) 

        model.add(Dense(1))
        model.add(Activation("sigmoid"))

        model.summary() # prints the model architecture

        return model

Let's now build our generator and discriminator models with the aid of the class `DCGAN` defined above and its functions `build_generator` and `build_discriminator`.<br>

The Adam optimizer adapts the step size of each parameter during training. `lr` is our initial learning rate and the `decay` parameter ensures that the learning rate decreases as training progresses.<br>
We finish constructing our discriminator by compiling it with the `binary_crossentropy` loss function and the `discOpt` optimizer. The methods `compile`, `Adam` etc. are already defined in the tensorflow module, making it easier to code :)
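
To get a feeling for how strongly `decay` acts, the sketch below evaluates the inverse-time schedule that, to my understanding, the legacy `decay` argument of the Keras optimizers applies at every update step (not every epoch): `lr / (1 + decay * step)`. The number of batches per epoch is a made-up value, since it depends on the dataset size, and 40 epochs is taken from the introduction.

```python
def decayed_lr(initial_lr, decay, step):
    """Inverse-time decay: the learning rate after `step` parameter updates."""
    return initial_lr / (1.0 + decay * step)

num_epochs = 40            # as stated in the introduction
steps_per_epoch = 100      # hypothetical number of batches per epoch
initial_lr = 0.0001
decay = 0.0002 / num_epochs

lr_start = decayed_lr(initial_lr, decay, step=0)
lr_end = decayed_lr(initial_lr, decay, step=num_epochs * steps_per_epoch)

print("learning rate at the start: {:.3e}".format(lr_start))
print("learning rate at the end:   {:.3e}".format(lr_end))
```

Under these assumptions the learning rate shrinks only by a few percent over the whole run, so the decay acts as a gentle damping rather than a steep schedule.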

In [None]:
NUM_EPOCHS = 40 # Number of epochs (40, as stated in the introduction)
BATCH_SIZE = 32 # Number of images in each batch

# Build generator and discriminator from the DCGAN class created above
print("[INFO] building generator....")
gen = DCGAN.build_generator(channels=1)

print("[INFO] building discriminator....")
disc = DCGAN.build_discriminator(64,alpha=0.2)
discOpt = Adam(lr=0.0001, beta_1=0.5, decay = 0.0002/NUM_EPOCHS)
disc.compile(loss="binary_crossentropy", optimizer=discOpt)

**Note that the discriminator is set to non-trainable before it is compiled into the GAN. This ensures that only the generator is trained while we train the GAN.** <br>
The input of the GAN is the random input vector, while its output is the discriminator's prediction on the images generated by the generator. The loss obtained from this discriminator's prediction is propagated back to tune and train the generator network.<br>
*So on the whole, good training of the generator depends on the discriminator.*
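
The loss that drives the generator can be illustrated without any network. The sketch below is a plain-Python version of binary cross-entropy for a single prediction (the helper name and the clipping constant `eps` are choices made for this illustration, not the Keras implementation). The generated images are given the target label 1, so the loss is small when the discriminator is fooled and large when it recognises the fake.

```python
import math

def binary_crossentropy(target, prediction, eps=1e-7):
    """Binary cross-entropy of one prediction in (0, 1) against a 0/1 target."""
    p = min(max(prediction, eps), 1.0 - eps)   # avoid log(0)
    return -(target * math.log(p) + (1 - target) * math.log(1.0 - p))

# The generator is trained with target label 1 ("real") for its fake images
fooled = binary_crossentropy(1, 0.9)   # discriminator believes the fake is real
caught = binary_crossentropy(1, 0.1)   # discriminator recognises the fake

print("loss when the discriminator is fooled: {:.3f}".format(fooled))
print("loss when the fake is detected:        {:.3f}".format(caught))
```

Minimising this loss therefore pushes the generator towards images that the discriminator scores close to 1.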

In [None]:
print("[INFO] building GAN....")
disc.trainable = False
## discriminator is set non-trainable before compiling it into GAN.

# Set the size of input for GAN model as that of random noise vector.
ganInput = Input(shape=(256,))
# The output of GAN will be the discriminator's prediction of the generated images by generator.
ganOutput = disc(gen(ganInput))
# Create the GAN model by compiling input and output
gan = Model(ganInput, ganOutput)

#Compile the loss function and optimizer to finalise our GAN model
ganOpt = Adam(lr=0.0002, beta_1=0.5)
gan.compile(loss="binary_crossentropy", optimizer=ganOpt)

## Data pre-processing
Create a class `ImageReader` with a function `load` to load our images as a numpy array after performing resizing operations.

In [None]:
class ImageReader:
    def load(self, file):
        data = []
        images = io.imread(file) # image stack of shape (number of images, height, width)

        # Preview four sample slices in a single 2x2 figure
        fig, axes = plt.subplots(2, 2, figsize=(10, 7))
        for ax, idx in zip(axes.ravel(), (5, 18, 313, 495)):
            ax.imshow(images[idx,:,:], cmap='gray', vmin=0, vmax=255)
            ax.set_title("slice {}".format(idx))
        plt.show()

        for i in images:
            # Resize the image to half its size in both directions (nearest-neighbour interpolation)
            image = scipy.ndimage.zoom(i,(0.5,0.5),order=0,mode='nearest')
            data.append(image)
        return np.array(data) # output the images as a numpy array

We now create an instance of our class `ImageReader` and name it `ir`. <br>
The images are loaded as the numpy array `data` with the function `ir.load`. This is a 3D array where `axis 0` corresponds to the number of images, while `axis 1` and `axis 2` correspond to the size of the images. <br>

In [None]:
## we call our class to use its functions here
ir = ImageReader()

file = "dataset01.tif"

# Use the function load from class ImageReader, which will load the image, resize it and output the images
# as numpy array altogether
data = ir.load(file)
data.shape

We now add a fourth dimension to the image data, which defines our number of channels (1 for greyscale and 3 for colour). <br>
Since the generator ends in a sigmoid activation whose output lies between `0 and 1`, we normalize the pixel values of our images to the same range.
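
As a quick check of what these two steps do to the array, here is a minimal sketch on a small random stack of 8-bit images. The stack is a hypothetical stand-in for the real dataset; only the shapes and the value range matter.

```python
import numpy as np

# Hypothetical stand-in for the loaded data: 5 greyscale images of 64x64 pixels
rng = np.random.default_rng(0)
stack = rng.integers(0, 256, size=(5, 64, 64), dtype=np.uint8)

# Add the channel axis, then scale the 8-bit pixel values to the range [0, 1]
batch = np.expand_dims(stack, axis=3).astype("float32") / 255.0

print(stack.shape)               # (5, 64, 64)
print(batch.shape)               # (5, 64, 64, 1)
print(batch.min(), batch.max())  # both lie within [0, 1]
```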

In [None]:
# Our data array is 3 Dimensional, we need to add one more dimension for channels
data = np.expand_dims(data, axis=3)

# Normalize the pixel values
data = (data.astype("float")) / 255.0
data = shuffle(data)
data.shape

In [None]:
print("[INFO] starting training....")

benchmarkNoise = tf.random.normal(shape=(2,256),mean=0.0,stddev=1.0)

adversarial_loss = []
discriminator_loss = []
imlist=[]

The `benchmarkNoise` created above is used to generate images for visualization after every 2 epochs.<br>
<br>
## Now we will define the main loop that trains our models <br>
The model is trained for `NUM_EPOCHS` epochs. The image dataset is divided into mini-batches of size 32, as defined by `BATCH_SIZE`, and the parameters are updated once for every batch. Hence, two for loops are used, one over the `epochs` and the other over the `batches per epoch`.
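
The batch indexing used in the loop can be sketched in isolation. The helper below is written only for this illustration and uses a made-up dataset size of 100 images. It shows which slice of the data each batch covers and that images beyond the last full batch are skipped in every epoch.

```python
def batch_slices(num_images, batch_size):
    """Start and end index of every full mini-batch in one epoch."""
    batches_per_epoch = num_images // batch_size
    return [(i * batch_size, (i + 1) * batch_size) for i in range(batches_per_epoch)]

slices = batch_slices(num_images=100, batch_size=32)   # hypothetical dataset size
leftover = 100 - slices[-1][1]                         # images not used in this epoch

print(slices)    # [(0, 32), (32, 64), (64, 96)]
print(leftover)  # 4
```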

In [None]:
for epoch in range(NUM_EPOCHS):
    print("[INFO] starting epoch {} of {}...".format(epoch + 1, NUM_EPOCHS))
    batchesPerEpoch = int(data.shape[0]/BATCH_SIZE)

    # Collect the per-batch losses of this epoch (reset once per epoch, not once per batch)
    dis_loss = []
    adv_loss = []

    for i in range(0, batchesPerEpoch):
        
        ## We need to train our discriminator to differentiate between REAL and FAKE images hence,
        ## we generate fake images equal to batch size from generator and label it as zero (meaning fake)
        
        ## Both, the real and fake images are passed together to train the discriminator 
        
        noise = tf.random.normal(shape=(BATCH_SIZE,256),mean=0.0,stddev=1.0)
        genImages = gen.predict(noise,verbose=0)
        imageBatch = data[i*BATCH_SIZE:(i +1)*BATCH_SIZE]

        Train = np.concatenate((imageBatch, genImages))
        label = ([1] * BATCH_SIZE) + ([0] * BATCH_SIZE)
        (Train, label) = shuffle(Train, label)
        label=np.array(label)

        # train_on_batch trains our discriminator on this batch, updates its parameters and outputs the loss
        discLoss = disc.train_on_batch(Train, label)
        dis_loss.append(discLoss)
        
        ## After updating discriminator, we generate more fake images and train the GAN model
        ## Note that only generator is trained here as discriminator inside gan is set non-trainable
        
        noise = tf.random.normal(shape=(BATCH_SIZE,256),mean=0.0,stddev=1.0)
        ganLoss = gan.train_on_batch(noise, np.array([1] * BATCH_SIZE)) 
        ## Note that the labels of fake images are set "1" here. This is done to achieve the 
        ## objective of generator to minimize the discriminator value function.. and 
        ## simultaneously train the generator to produce images that can be classified as "1"

        adv_loss.append(ganLoss)
        
        print("[INFO] Step {}_{}: disc_loss = {:.6f}, adversarial_loss = {:.6f}".format(epoch + 1, i, discLoss, ganLoss))

        ## We save the images for every two epochs and finally visualize it as a montage
        
        if epoch % 2 == 0 and i == 0:
            images = gen.predict(benchmarkNoise)
            images = ((images * 255.0)).astype("uint8")
            image1 = images[0,:,:,0]
            image1 = scipy.ndimage.zoom(image1,(2,2),order=0,mode='nearest')
            image2 = images[1,:,:,0]
            image2 = scipy.ndimage.zoom(image2,(2,2),order=0,mode='nearest')
            plt.imshow(image1, cmap='gray', vmin=0, vmax=255)
            plt.show()
            imlist.append(image1)

    # Store the mean of the batch losses collected during this epoch
    adversarial_loss.append(float(np.mean(adv_loss)))
    discriminator_loss.append(float(np.mean(dis_loss)))

In [None]:
im = montage2d(imlist,fill=(0,0),padding_width=10,grid_shape=(5,4))
io.imshow(im)
io.show()

## The command to save our model is commented out below 
# gen.save("name.h5")

### Let's plot our adversarial and discriminator loss

In [None]:
x=list(range(NUM_EPOCHS))
plt.plot(x,adversarial_loss, label="adversarial loss")
plt.plot(x,discriminator_loss, label="discriminator loss")
plt.xlabel("epoch")
plt.ylabel("loss")
plt.legend()

## Load a trained model to generate a few images<br>
The ML models are stored in the h5 format along with their parameters and architecture but not the history! <br>
<br>
Load the trained model with the below code and print the summary of the model architecture.

In [None]:
gen = tf.keras.models.load_model("trained.h5")
gen.summary()

Create a noise of shape `(9,256)` to generate 9 images from trained model.

In [None]:
benchmarkNoise = tf.random.normal(shape=(9,256),mean=0.0,stddev=1.0, seed=42)
images = gen.predict(benchmarkNoise)
images = ((images * 255.0)).astype("uint8")
imlist = []

In [None]:
for i in range(images.shape[0]):
    image1 = images[i,:,:,0]
    image1 = scipy.ndimage.zoom(image1,(2,2),order=0,mode='nearest')
    plt.imshow(image1, cmap='gray', vmin=0, vmax=255) # same display settings as in the training loop
    plt.show()
    imlist.append(image1)

imlist = np.array(imlist)

Create a montage of all the generated images with the function `montage2d` and display it.
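
To make clear what a montage is, the sketch below tiles a stack of equally sized images into one grid image using plain numpy. It is a simplified illustration written for this tutorial and not the implementation behind `montage2d`; the function name `tile_images` and its padding layout are assumptions made here.

```python
import numpy as np

def tile_images(images, grid_shape, pad=0, fill=0):
    """Tile a stack of equally sized 2D images into one grid image (illustration only)."""
    rows, cols = grid_shape
    n, h, w = images.shape
    # Canvas filled with the background value, with `pad` pixels around every tile
    out = np.full((rows * (h + pad) + pad, cols * (w + pad) + pad), fill, dtype=images.dtype)
    for idx in range(min(n, rows * cols)):
        r, c = divmod(idx, cols)        # row and column of this tile in the grid
        top = pad + r * (h + pad)
        left = pad + c * (w + pad)
        out[top:top + h, left:left + w] = images[idx]
    return out

# Nine tiny 4x4 "images" with constant values 1..9
stack = np.stack([np.full((4, 4), i + 1, dtype=np.uint8) for i in range(9)])
grid = tile_images(stack, grid_shape=(3, 3), pad=1)

print(grid.shape)  # (16, 16)
```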

In [None]:
im = montage2d(imlist,fill=(0,0),padding_width=10,grid_shape=(3,3))
io.imshow(im)
io.show()