**A Brief Introduction to Generative Adversarial Networks (GANs)**

![](https://images.app.goo.gl/Vt9xsqgFvafruVxU6)

You have probably heard of deepfakes, where AI generates fake speech or images that are hard to distinguish from the real thing. Many of these systems are built on Generative Adversarial Networks (GANs), currently one of the most versatile neural network architectures.

**History of GAN**


In the early 1960s, Herbert Simon, one of the pioneers of AI, observed that machines might come to match human cognitive abilities and perform routine tasks. To make machines behave more like humans, researchers have designed increasingly advanced neural architectures that replicate aspects of cognitive intelligence. The GAN architecture was introduced in 2014 by Ian Goodfellow et al. at the University of Montreal. The model became so popular that Yann LeCun (Facebook AI research director) called it the most interesting idea of the last ten years in machine learning.

[Original GAN paper](https://arxiv.org/abs/1406.2661)

**How GAN actually works**


A generative adversarial network (GAN) is a class of machine learning framework that trains a generative model using two sub-networks: a generative network and a discriminative network.

- Generative network: it works like a transposed-convolution ("deconvolution") network and tries to create images or speech from random noise.
- Discriminative network: it works like a convolutional network, assessing the data and trying to tell real samples from fake ones.

So, the generative network creates candidates while the discriminative network judges them, classifying each sample as real or generated. Although no labels are needed for the data itself, GAN training is framed as a supervised learning problem: the generator produces new counterfeit data, and the discriminator learns to tell real data from generated data.

First, the generator receives a fixed-length random vector as input, drawn from a predefined latent space. The generator is trained to deceive the discriminator: its weights are updated so that the discriminator labels its outputs as real. The discriminator, a simple binary classifier, is trained on samples from a known dataset together with generated samples, and evaluates which are real and which are fake. Each network is updated by its own backpropagation step, so the generator produces better samples as the discriminator becomes a better judge. Once training is finished, the discriminator is usually discarded and only the generator is kept.
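
The two-player game described above is usually written as the minimax objective from the original GAN paper. Here $D(x)$ is the discriminator's estimated probability that $x$ is real, $G(z)$ is the generator's output for latent noise $z$, $p_{\text{data}}$ is the distribution of real data, and $p_z$ is the latent noise distribution:

```latex
\min_G \max_D V(D, G) =
  \mathbb{E}_{x \sim p_{\text{data}}(x)}\left[\log D(x)\right]
  + \mathbb{E}_{z \sim p_z(z)}\left[\log\left(1 - D(G(z))\right)\right]
```

The discriminator tries to maximize $V$ while the generator tries to minimize it. In practice the generator is often trained to maximize $\log D(G(z))$ instead, which is what training it against targets of 1 with binary cross-entropy amounts to.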

![Architecture](https://images.app.goo.gl/WWDmFvCm5XQCgDRXA)

**Implementation**

In this code I implement a GAN on a [Pokemon dataset](https://www.kaggle.com/kvpratama/pokemon-images-dataset). These real images, along with the fake ones, will be fed in batches to the discriminator. Let's take a look at the steps our GAN will follow.

Here, two neural networks compete with each other. The discriminator judges whether each image is real or fake and returns a probability between 0 and 1, where 0 represents fake and 1 represents real.
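
To make the 0-to-1 score concrete, here is a minimal sketch of binary cross-entropy, the loss used later in this tutorial, written by hand in plain Python. The function name `bce` and the sample probabilities are illustrative and not part of the notebook.

```python
import math

def bce(pred, target):
    """Binary cross-entropy for a single prediction in (0, 1)."""
    return -(target * math.log(pred) + (1 - target) * math.log(1 - pred))

# The discriminator says "90% real" about an image that really is real:
# the loss is small.
loss_confident_right = bce(0.9, 1.0)

# The same verdict about a fake image is penalized much more heavily.
loss_confident_wrong = bce(0.9, 0.0)

print(round(loss_confident_right, 4), round(loss_confident_wrong, 4))  # 0.1054 2.3026
```

The closer the prediction is to the correct label, the smaller the loss, which is what pushes the discriminator toward 1 for real images and 0 for fake ones.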

**Importing Libraries**
> This section imports all the necessary libraries to build the model as well as to plot the necessary results.

In [2]:
import os 
import torch
import torchvision
import torch.nn as nn
import torch.nn.functional as F
from torchvision.datasets import ImageFolder
import torchvision.transforms as tt
from torch.utils.data import DataLoader
from torchvision.utils import make_grid,save_image
import cv2
from tqdm.notebook import tqdm
import matplotlib.pyplot as plt
%matplotlib inline

In [4]:
Path = "../input/pokemon-images-dataset/pokemon_jpg"



Here, the `os` module from Python's standard library is used to list the contents of the dataset directory.

In [None]:
os.listdir(Path)

This block handles the preprocessing and augmentation of the dataset. First, the images are resized to 64 pixels on their shorter side and then center-cropped to 64 x 64. Since we are working in PyTorch, the loaded PIL images must be converted into tensors, which `tt.ToTensor` does. Next comes the classic preprocessing step, normalization: with a mean and standard deviation of 0.5 per channel, pixel values are rescaled from [0, 1] to [-1, 1], a range that is easier for the network to train on and that matches the generator's tanh output. Finally, a random horizontal flip is applied to the normalized images as augmentation. To feed the images to the network, a dataloader is created with a suitable batch size and shuffling enabled.

In [None]:
image_size = 64
batch_size = 64
image_channels = 3
stats = (0.5, 0.5, 0.5), (0.5, 0.5, 0.5)

train_ds1 = ImageFolder(Path,transform=tt.Compose([
    tt.Resize(image_size),
    tt.CenterCrop(image_size),
    tt.ToTensor(),
    tt.Normalize(*stats),
    tt.RandomHorizontalFlip(p=0.5)
]))

train_dl = DataLoader(train_ds1, batch_size, shuffle=True, num_workers=3, pin_memory=True)

In [None]:
def denorm(img_tensors):
    return img_tensors * stats[1][0] + stats[0][0]
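
As a quick sanity check on these statistics, the sketch below uses plain tensor arithmetic as a stand-in for `tt.Normalize`, which computes `(x - mean) / std` per channel. It shows that a mean and standard deviation of 0.5 map pixel values from [0, 1] to [-1, 1], and that `denorm` inverts the mapping. The random tensor is only a placeholder for a real image batch.

```python
import torch

stats = (0.5, 0.5, 0.5), (0.5, 0.5, 0.5)  # per-channel (mean, std), as above

def denorm(img_tensors):
    # Undo normalization: x * std + mean
    return img_tensors * stats[1][0] + stats[0][0]

# Placeholder for a batch of images already scaled to [0, 1] by ToTensor
pixels = torch.rand(4, 3, 64, 64)

# What Normalize(mean=0.5, std=0.5) does to every channel
normalized = (pixels - stats[0][0]) / stats[1][0]

print(normalized.min().item() >= -1.0, normalized.max().item() <= 1.0)
print(torch.allclose(denorm(normalized), pixels, atol=1e-6))
```

Denormalizing matters mostly for display and saving, since plotting and image-writing utilities expect values in [0, 1].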

The helper functions below plot a grid of images from a batch.

In [None]:
def show_images(images, nmax=64):
    fig, ax = plt.subplots(figsize=(8, 8))
    ax.set_xticks([]); ax.set_yticks([])
    ax.imshow(make_grid(denorm(images.detach()[:nmax]), nrow=8).permute(1, 2, 0))

In [None]:
def show_batch(dl, nmax=64):
    for images, _ in dl:
        show_images(images, nmax)
        break

In [None]:
show_batch(train_dl)

This block checks whether CUDA is available so that the GPU can be used; otherwise the code runs on the CPU. It also defines a dataloader wrapper that moves each batch to the selected device.

In [None]:
def get_default_device():
    if torch.cuda.is_available():
        return torch.device('cuda')
    else:
        return torch.device('cpu')

def to_device(data,device):
    if isinstance(data,(list,tuple)):
        return [to_device(x,device) for x in data]
    return data.to(device, non_blocking=True)

class DeviceDataLoader():
    def __init__(self,dl,device):
        self.dl = dl
        self.device = device
    def __iter__(self):
        # Yield each batch after moving it to this loader's own device
        for b in self.dl:
            yield to_device(b, self.device)
    def __len__(self):
        return len(self.dl)

In [None]:
device = get_default_device()
device
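
A small usage sketch of the `to_device` helper, repeated here so the snippet runs on its own. The zero-filled tensors are placeholders for one `(images, labels)` batch from a dataloader.

```python
import torch

def to_device(data, device):
    # Recursively move tensors (or lists/tuples of tensors) to the device
    if isinstance(data, (list, tuple)):
        return [to_device(x, device) for x in data]
    return data.to(device, non_blocking=True)

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

# Placeholder for one (images, labels) batch
batch = (torch.zeros(2, 3, 64, 64), torch.zeros(2, dtype=torch.long))
images, labels = to_device(batch, device)

print(images.device.type, labels.device.type)
```

Because the helper recurses into lists and tuples, the whole batch can be moved with a single call, which is what the wrapper class relies on.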

This code wraps the training dataloader so that each batch is moved to the device (CUDA or CPU).

In [None]:
train_dl = DeviceDataLoader(train_dl, device)

This is the core of the tutorial: the block where the model is built. A GAN has two networks, the discriminator and the generator, and this block builds the discriminator. As the code shows, the network repeats a block of three layers. Each block starts with a convolution layer that extracts features from the image (low-level details in the early layers) while halving its spatial size, followed by batch normalization and a LeakyReLU activation. The last convolution reduces the feature map to a single value per image, which is flattened and passed through a sigmoid activation to give a probability.

In [None]:
discriminator = nn.Sequential(
    nn.Conv2d(3,64,kernel_size=4,stride=2,padding=1,bias=False),
    nn.BatchNorm2d(64),
    nn.LeakyReLU(0.2,inplace=True),
    
    
    nn.Conv2d(64,128,kernel_size=4,stride=2,padding=1,bias=False),
    nn.BatchNorm2d(128),
    nn.LeakyReLU(0.2,inplace=True),
    
    nn.Conv2d(128,256,kernel_size=4,stride=2,padding=1,bias=False),
    nn.BatchNorm2d(256),
    nn.LeakyReLU(0.2,inplace=True),
    
    nn.Conv2d(256,512,kernel_size=4,stride=2,padding=1,bias=False),
    nn.BatchNorm2d(512),
    nn.LeakyReLU(0.2,inplace=True),
    
    nn.Conv2d(512, 1, kernel_size=4, stride=1, padding=0, bias=False),
    
    nn.Flatten(),
    nn.Sigmoid()
)

In [None]:
discriminator = to_device(discriminator,device)
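
To see why the final convolution yields a single value per image, the sketch below applies the standard convolution output-size formula, `floor((n + 2p - k) / s) + 1`, to the layer settings above. The helper name `conv_out` is illustrative.

```python
def conv_out(n, kernel, stride, padding):
    """Spatial output size of a square convolution (dilation 1)."""
    return (n + 2 * padding - kernel) // stride + 1

# (kernel_size, stride, padding) of the five Conv2d layers in the discriminator
layers = [(4, 2, 1), (4, 2, 1), (4, 2, 1), (4, 2, 1), (4, 1, 0)]

size = 64  # input images are 64 x 64
sizes = []
for k, s, p in layers:
    size = conv_out(size, k, s, p)
    sizes.append(size)

print(sizes)  # [32, 16, 8, 4, 1]
```

Each stride-2 layer halves the image, and the last 4 x 4 convolution collapses the remaining 4 x 4 map into one number, which the sigmoid turns into a probability.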

We use a latent size of 128. The generator is built from blocks consisting of a transposed convolution layer followed by batch normalization. The activation function is ReLU, unlike the LeakyReLU used in the discriminator. At the end of the generator, a tanh activation is applied, so the output pixel values lie in [-1, 1].

In [None]:
latent_size = 128
generator = nn.Sequential(
    nn.ConvTranspose2d(latent_size,512,kernel_size=4,stride=1,padding=0,bias = False),
    nn.BatchNorm2d(512),
    nn.ReLU(True),
    
    nn.ConvTranspose2d(512,256,kernel_size=4,stride=2,padding=1,bias = False),
    nn.BatchNorm2d(256),
    nn.ReLU(True),
    
    nn.ConvTranspose2d(256,128,kernel_size=4,stride=2,padding=1,bias = False),
    nn.BatchNorm2d(128),
    nn.ReLU(True),
    
    nn.ConvTranspose2d(128,64,kernel_size=4,stride=2,padding=1,bias = False),
    nn.BatchNorm2d(64),
    nn.ReLU(True),
    
    nn.ConvTranspose2d(64,3,kernel_size=4,stride=2,padding=1,bias = False),
    nn.Tanh()
)
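
The generator mirrors the discriminator: it starts from a 1 x 1 latent "image" and grows it to 64 x 64. The sketch below checks this with the transposed-convolution output-size formula, `(n - 1) * s - 2p + k`, assuming `output_padding=0` and `dilation=1` (the PyTorch defaults, which the layers above do not override). The helper name `conv_transpose_out` is illustrative.

```python
def conv_transpose_out(n, kernel, stride, padding):
    """Spatial output size of a square transposed convolution,
    assuming output_padding=0 and dilation=1."""
    return (n - 1) * stride - 2 * padding + kernel

# (kernel_size, stride, padding) of the five ConvTranspose2d layers in the generator
layers = [(4, 1, 0), (4, 2, 1), (4, 2, 1), (4, 2, 1), (4, 2, 1)]

size = 1  # the latent tensor has spatial size 1 x 1
sizes = []
for k, s, p in layers:
    size = conv_transpose_out(size, k, s, p)
    sizes.append(size)

print(sizes)  # [4, 8, 16, 32, 64]
```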

In this block we test the generator with some random latent tensors. The goal is simply to check that the generator runs and produces a batch of fake images of the expected shape; since it is still untrained, the images are just noise.

In [None]:
xb = torch.randn(batch_size, latent_size, 1, 1) # random latent tensors
fake_images = generator(xb)
print(fake_images.shape)
show_images(fake_images)

In [None]:
generator = to_device(generator,device)

This part is the heart of the tutorial, where we train the GAN. We train the discriminator first. To do so, we first clear the discriminator's gradients; this resets the gradients accumulated from the previous step, not the weights themselves. Real images are then passed through the discriminator. The discriminator's task is to distinguish real images from the fake ones produced by the generator, so we try to minimize its loss. We use binary cross-entropy as the loss function: predictions on real images are compared against targets of 1, and predictions on fake images against targets of 0. Next, a batch of fake images is generated and passed through the discriminator to compute the fake loss. The real and fake losses are then summed to give the total loss, which is backpropagated to update the discriminator's weights in the last part of the block. Finally, the function returns the loss together with the real score and the fake score, that is, the discriminator's mean prediction on real and on fake images.

In [None]:
def train_discriminator(real_images, opt_d):
    # Clear discriminator gradients
    opt_d.zero_grad()

    # Pass real images through discriminator
    real_preds = discriminator(real_images)
    real_targets = torch.ones(real_images.size(0), 1, device=device)
    real_loss = F.binary_cross_entropy(real_preds, real_targets)
    real_score = torch.mean(real_preds).item()
    
    # Generate fake images
    latent = torch.randn(batch_size, latent_size, 1, 1, device=device)
    fake_images = generator(latent)

    # Pass fake images through discriminator
    fake_targets = torch.zeros(fake_images.size(0), 1, device=device)
    fake_preds = discriminator(fake_images)
    fake_loss = F.binary_cross_entropy(fake_preds, fake_targets)
    fake_score = torch.mean(fake_preds).item()

    # Update discriminator weights
    loss = real_loss + fake_loss
    loss.backward()
    opt_d.step()
    return loss.item(), real_score, fake_score
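
Calling `opt_d.zero_grad()` clears gradients and leaves the weights alone. The toy sketch below illustrates this on a single linear layer standing in for the discriminator; the layer, optimizer, and input are placeholders chosen only for the demonstration.

```python
import torch
import torch.nn as nn

layer = nn.Linear(3, 1)  # toy stand-in for the discriminator
opt = torch.optim.SGD(layer.parameters(), lr=0.1)

# One backward pass populates the gradients
layer(torch.ones(2, 3)).sum().backward()
had_grad = layer.weight.grad is not None

weights_before = layer.weight.detach().clone()
opt.zero_grad()

# Gradients are cleared (set to None or to zeros, depending on the PyTorch version)
grad_cleared = layer.weight.grad is None or bool((layer.weight.grad == 0).all())
# The weights themselves are untouched
weights_unchanged = torch.equal(layer.weight.detach(), weights_before)

print(had_grad, grad_cleared, weights_unchanged)
```

Without this call, gradients from successive `backward()` calls would accumulate, so each update would mix in stale gradients from earlier batches.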

This block is similar to the one above, but here we train the generator. The generator is trained through the discriminator: its fake images are scored by the discriminator, and the loss is computed against targets of 1, so the loss is small when the discriminator is fooled. It is a competition between the generator and the discriminator. Ideally, this game reaches an equilibrium in which the generator produces images that the discriminator cannot tell apart from real ones. Just like the discriminator, the generator updates its weights by backpropagating its loss; only the generator's optimizer takes a step here, so the discriminator's weights are left unchanged.

In [None]:
def train_generator(opt):
    opt.zero_grad()
    latent = torch.randn(batch_size,latent_size,1,1,device=device)
    fake_images = generator(latent)
    preds = discriminator(fake_images)
    targets = torch.ones(batch_size,1,device = device)
    loss = F.binary_cross_entropy(preds,targets)
    loss.backward()
    opt.step()
    return loss.item()
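
As a reference point for reading the losses later, the sketch below computes what both losses would be in the idealized equilibrium, where the discriminator cannot tell real from fake and outputs 0.5 for every image. The batch of eight predictions is a made-up placeholder, and real training runs rarely settle exactly at these values.

```python
import torch
import torch.nn.functional as F

# Idealized case: the discriminator outputs 0.5 for every image
preds = torch.full((8, 1), 0.5)
ones = torch.ones(8, 1)
zeros = torch.zeros(8, 1)

# Discriminator loss = real loss + fake loss, as in train_discriminator
loss_d = (F.binary_cross_entropy(preds, ones)
          + F.binary_cross_entropy(preds, zeros)).item()

# Generator loss against targets of 1, as in train_generator
loss_g = F.binary_cross_entropy(preds, ones).item()

print(round(loss_d, 4), round(loss_g, 4))  # 1.3863 0.6931
```

These are 2 ln 2 and ln 2 respectively. A discriminator loss far below this usually means the discriminator is winning, and a generator loss far above it usually means the generator is being caught.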
    

In [None]:
sample_dir = 'generated'
os.makedirs(sample_dir, exist_ok=True)

In this block, we save the images produced by the generator so that we can later compare them with the real images.

In [None]:
def save_samples(index, latent_tensors, show=True):
    fake_images = generator(latent_tensors)
    fake_fname = 'generated-images-{0:0=4d}.png'.format(index)
    save_image(denorm(fake_images), os.path.join(sample_dir, fake_fname), nrow=8)
    print('Saving', fake_fname)
    if show:
        fig, ax = plt.subplots(figsize=(8, 8))
        ax.set_xticks([]); ax.set_yticks([])
        # Denormalize to [0, 1] before plotting, as is done for the saved file
        ax.imshow(make_grid(denorm(fake_images).cpu().detach(), nrow=8).permute(1, 2, 0))

In [None]:
fixed_latent = torch.randn(64, latent_size, 1, 1, device=device)

In [None]:
save_samples(0, fixed_latent)

This block defines the training loop for the GAN. It sets up the optimizers, using Adam for both networks, and records the losses and scores of both networks for each epoch.

In [None]:
def fit(epochs, lr, start_idx=1):
    torch.cuda.empty_cache()
    
    # Losses & scores
    losses_g = []
    losses_d = []
    real_scores = []
    fake_scores = []
    
    # Create optimizers
    opt_d = torch.optim.Adam(discriminator.parameters(), lr=lr, betas=(0.5, 0.999))
    opt_g = torch.optim.Adam(generator.parameters(), lr=lr, betas=(0.5, 0.999))
    
    for epoch in range(epochs):
        for real_images, _ in tqdm(train_dl):
            # Train discriminator
            loss_d, real_score, fake_score = train_discriminator(real_images, opt_d)
            # Train generator
            loss_g = train_generator(opt_g)
            
        # Record losses & scores
        losses_g.append(loss_g)
        losses_d.append(loss_d)
        real_scores.append(real_score)
        fake_scores.append(fake_score)
        
        # Log losses & scores (last batch)
        print("Epoch [{}/{}], loss_g: {:.4f}, loss_d: {:.4f}, real_score: {:.4f}, fake_score: {:.4f}".format(
            epoch+1, epochs, loss_g, loss_d, real_score, fake_score))
    
        # Save generated images
        save_samples(epoch+start_idx, fixed_latent, show=False)
    
    return losses_g, losses_d, real_scores, fake_scores
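
Note that `fit` records only the last batch's values for each epoch, which can make the curves noisy. One possible alternative, sketched here with made-up per-batch numbers, is to average over all batches of an epoch. `epoch_mean` is a hypothetical helper and not part of the notebook.

```python
def epoch_mean(batch_values):
    """Average a list of per-batch values into one per-epoch value."""
    return sum(batch_values) / len(batch_values)

# Made-up discriminator losses for the batches of a single epoch
batch_losses_d = [0.9, 0.4, 1.1, 0.6]

last_batch = batch_losses_d[-1]        # what fit() currently records
averaged = epoch_mean(batch_losses_d)  # smoother alternative

print(last_batch, round(averaged, 2))
```

To use this inside `fit`, the per-batch losses would be collected in a list within the inner loop and averaged before being appended to the history.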

This block sets two very important hyperparameters, the learning rate and the number of epochs, and then trains the network. After it runs, we have the loss and score histories with which to assess the model.

In [None]:
lr = 0.0002
epochs = 500
history = fit(epochs, lr)

In [None]:
losses_g, losses_d, real_scores, fake_scores = history

In [None]:
from IPython.display import Image
Image('./generated/generated-images-0184.png')


The last two blocks plot the loss and the real and fake scores against the epoch.

In [None]:
plt.plot(losses_d, '-')
plt.plot(losses_g, '-')
plt.xlabel('epoch')
plt.ylabel('loss')
plt.legend(['Discriminator', 'Generator'])
plt.title('Losses');

In [None]:
plt.plot(real_scores, '-')
plt.plot(fake_scores, '-')
plt.xlabel('epoch')
plt.ylabel('score')
plt.legend(['Real', 'Fake'])
plt.title('Scores');

And with that, we have the plot of real versus fake scores over the course of training!

Thank you.
[References](https://www.kaggle.com/ibtesama/generative-adversarial-networks-demystified)