### Implementation of a very simple GAN (Generative Adversarial Network)

This is my first attempt at implementing an algorithm from a research paper; in this case, 'Generative Adversarial Nets' (https://arxiv.org/pdf/1406.2661.pdf).

For this implementation, I am drawing on several resources and basically just dipping my toes into the world of applied ML, with two main goals. First, I would like to improve my coding skills when it comes to implementing ML/DL algorithms, as I would eventually like to be able to read complex publications and implement them myself. This will also help me stay up to date with new advancements in the field. Second, it will hopefully keep me in touch with the technical side and prevent me from simply being an 'implementer of GitHub projects'.

PS: the code for this implementation combines the code shown in the following Medium article: https://towardsdatascience.com/converting-deep-learning-research-papers-to-code-f-f38bbd87352f and the following YouTube tutorial: https://www.youtube.com/watch?v=OljTVUVzPpM. I made only small changes and added more detailed comments for clarity.


In [None]:
# Import required libraries
import torch
from torch import nn
import torchvision

from torch import optim
from torchvision.transforms import transforms

import numpy as np
from tqdm.notebook import tqdm
from PIL import Image
import matplotlib.pyplot as plt

import warnings
warnings.filterwarnings("ignore")

In [None]:
!jupyter nbextension enable --py widgetsnbextension

Initialize the device and hyperparameters, and set the dimension of the latent space (noise).
Load the MNIST dataset and specify the necessary transforms and preprocessing.

In [None]:
# use the GPU if cuda is available. otherwise use the CPU
device = 'cuda:0' if torch.cuda.is_available() else 'cpu'

lr = 3e-4               # learning rate = 3e-4
image_dim = 28*28*1     # 28*28*1=784 (MNIST data)
batch_size = 32         # batch size
num_epochs = 50        # number of epochs to train GAN

torch.manual_seed(7)
noise_dim = 64
#noise = torch.randn((batch_size, noise_dim))

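# transforms.Compose chains several transforms that are applied to each input image;
# here the image is first converted to a PyTorch tensor with values in [0, 1] and
# then normalized with mean 0.5 and std 0.5, which maps the values to [-1, 1]
# (the same range as the Generator's Tanh output)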
transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.5,), (0.5,))
])

mnist_images = torchvision.datasets.MNIST(
    root = '/home/achalhoub/dev/research_implementations/GAN_pt/input/',
    transform=transform)

#x, _ = mnist_images[7777]
#plt.imshow(x.numpy()[0], cmap='gray')
loader = torch.utils.data.DataLoader(mnist_images, batch_size=batch_size, shuffle=True)
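
To make the normalization step concrete, here is a small plain-Python sketch of the arithmetic that Normalize((0.5,), (0.5,)) applies to each pixel after ToTensor() has scaled it to [0, 1]. The helper names normalize and denormalize are made up for this illustration and are not part of torchvision.

```python
# Plain-Python sketch of the per-pixel arithmetic behind
# transforms.Normalize((0.5,), (0.5,)): out = (x - mean) / std.
# 'normalize' and 'denormalize' are illustrative helpers, not torchvision functions.

MEAN = 0.5
STD = 0.5

def normalize(pixel, mean=MEAN, std=STD):
    """Map a pixel from [0, 1] (after ToTensor) to [-1, 1]."""
    return (pixel - mean) / std

def denormalize(value, mean=MEAN, std=STD):
    """Invert the mapping, e.g. before displaying a generated image."""
    return value * std + mean

for pixel in (0.0, 0.5, 1.0):
    print(pixel, "->", normalize(pixel))
```

Black pixels (0.0) map to -1.0 and white pixels (1.0) map to 1.0, which is the same range that the Generator's Tanh output covers.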

Creating the Discriminator, which looks at an input image and outputs a value close to '0' if it thinks the image is fake and close to '1' if it thinks the image is real.

In [None]:
# defines a custom nn module, used for custom models.
# a sequential model is instantiated, made up of an input layer of
# size image_dim (28*28*1 = 784 for MNIST) feeding a single hidden layer
# with 10000 nodes and a ReLU activation function, and finally an output
# layer of size 1 with a sigmoid activation function.

# a method for the forward pass is also created. it expects an already
# flattened image batch of shape [batch_size, image_dim] (the flattening is
# done in the training loop) and returns the output of the sigmoid function,
# i.e. the estimated probability that each image is real.

class Discriminator(nn.Module):
    def __init__(self, image_dim):
        super().__init__()
        self.linear = nn.Sequential(
            nn.Linear(image_dim, 10000),
            nn.ReLU(),
            nn.Linear(10000, 1),
            nn.Sigmoid()
        )

    def forward(self, img):
        # forward pass; img is expected to be already flattened to [batch_size, image_dim]
        out = self.linear(img)

        return out

Creating the Generator, which produces an output image tensor (fake image) from input random noise. The fake image has the same size as a real image.

In [None]:
# defines a custom nn module, used for custom models.
# a sequential model is instantiated, made up of an input layer of
# size noise_dim (64 here), two hidden layers with 10000 nodes and
# 4000 nodes, each followed by a LeakyReLU activation function, and
# finally an output layer of size image_dim (28*28*1 = 784).

# a method for the forward pass is also created, which passes the input
# noise of shape [batch_size, noise_dim] through the Generator and returns
# the 'fake' image it produces as a flat tensor of shape [batch_size, image_dim].

class Generator(nn.Module):
    def __init__(self, noise_dim):
        super().__init__()
        self.linear = nn.Sequential(
            nn.Linear(noise_dim, 10000),
            nn.LeakyReLU(),
            nn.Linear(10000, 4000),
            nn.LeakyReLU(),
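            # note: image_dim is read from the notebook's global scope
            nn.Linear(4000, image_dim),
            # Tanh() squashes the output to [-1, 1], matching the normalized MNIST images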
            nn.Tanh()
        )

    def forward(self, noise):
        # pass the noise batch through the network to produce fake images
        out = self.linear(noise)

        return out
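
To get a feel for how large these two fully connected networks are, the sketch below counts their trainable parameters by hand, using the fact that a Linear layer with n_in inputs and n_out outputs holds n_in * n_out weights plus n_out biases. The helper linear_params is hypothetical and only mirrors the layer sizes used in this notebook.

```python
def linear_params(n_in, n_out):
    """Number of trainable parameters in a fully connected layer with bias."""
    return n_in * n_out + n_out

image_dim = 28 * 28 * 1   # 784, as in the notebook
noise_dim = 64            # as in the notebook

# (inputs, outputs) of each Linear layer, copied from the two model definitions
discr_layers = [(image_dim, 10000), (10000, 1)]
gen_layers = [(noise_dim, 10000), (10000, 4000), (4000, image_dim)]

discr_total = sum(linear_params(i, o) for i, o in discr_layers)
gen_total = sum(linear_params(i, o) for i, o in gen_layers)

print(f"Discriminator parameters: {discr_total:,}")   # 7,860,001
print(f"Generator parameters:     {gen_total:,}")     # 43,790,784
```

In the notebook itself, the same totals should come out of sum(p.numel() for p in discr.parameters()) and the equivalent call on gen.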

Initializing the Generator and Discriminator, their optimizers, and the loss function.

In [None]:
# initialize the models and assign them to the device available
discr = Discriminator(image_dim=image_dim).to(device)
gen = Generator(noise_dim=noise_dim).to(device)

# initialize the optimizers for both models. here we use Adam
opt_d = optim.Adam(discr.parameters(), lr=lr)
opt_g = optim.Adam(gen.parameters(), lr=lr)

# initialize the loss function. the objective in the paper corresponds to BCE (Binary Cross Entropy)
criterion = nn.BCELoss()
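
The link between BCE and the objective in the paper can be checked by hand. The sketch below is a plain-Python version of the per-sample formula documented for nn.BCELoss, -(y * log(p) + (1 - y) * log(1 - p)); the function name bce and the two example probabilities are made up for this illustration. Note that nn.BCELoss additionally averages over the batch by default.

```python
import math

def bce(p, y):
    """Binary cross entropy for one prediction p in (0, 1) and one label y in {0, 1}."""
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))

d_real = 0.9   # hypothetical D(x) for a real image
d_fake = 0.2   # hypothetical D(G(z)) for a fake image

loss_real = bce(d_real, 1)   # reduces to -log(D(x))
loss_fake = bce(d_fake, 0)   # reduces to -log(1 - D(G(z)))

# minimizing the summed loss is the same as maximizing log(D(x)) + log(1 - D(G(z)))
objective = math.log(d_real) + math.log(1 - d_fake)
print(loss_real + loss_fake, -objective)
```

The two printed numbers agree, which is why minimizing BCE with target 1 for real images and target 0 for fake images trains the Discriminator towards the objective in the paper.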

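Train the model. The tutorial I am following trains the whole GAN (discriminator + generator) for 500 epochs, training the discriminator for 4 epochs and the generator for 3 epochs within each epoch of the whole GAN. This notebook instead trains for num_epochs = 50 epochs and, for every batch, takes one discriminator step followed by one generator step.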

In [None]:
torch.autograd.set_detect_anomaly(True)
torch.set_printoptions(threshold=10000)

In [None]:
# define the loop for the overall GAN training (num_epochs epochs).
# 'total' is used to specify the total number of expected
# iterations (https://www.geeksforgeeks.org/python-how-to-make-a-terminal-progress-bar-using-tqdm/)

discr_losses = []
gen_losses = []

for epoch in tqdm(range(num_epochs), total=num_epochs):

    # each batch yields (images, labels); the labels are not needed to train the GAN
    for idx, (real, _) in enumerate(loader):

        #############################
        # Training the discriminator
        #############################

        # the real batch is flattened to [batch_size, image_dim] and moved to
        # the device we previously defined. passing it through the Discriminator
        # performs the forward pass and returns, for each image, an output value
        # from the sigmoid activation function (a value between 0 and 1).

        # 'criterion' measures the Binary Cross Entropy between the target and
        # the predicted probability. 'torch.ones_like(discr_real_results)' is a
        # tensor of ones with the same shape as the predictions, which stands
        # for the label of the real images (remember, fake=0 & real=1).
        # (https://pytorch.org/docs/stable/generated/torch.nn.BCELoss.html)

        # fake images are then generated by the Generator from random noise and
        # scored by the Discriminator in the same way, this time against a
        # target of zeros. note that 'fake_images' is not detached here, so
        # 'backward(retain_graph=True)' is used to keep the graph alive for the
        # Generator update further down.

        # Discriminator wants to maximize 'log(D(x)) + log(1 - D(G(z)))'

        # flatten the real batch: [32, 1, 28, 28] -> [32, 784]
        real = real.view(-1, image_dim).to(device)
        discr_real_results = discr(real).view(-1)                       # [32]
        
        discr_real_loss = criterion(
            discr_real_results, torch.ones_like(discr_real_results))
        
        # create random input noise for the generator
        noise = torch.randn(batch_size, noise_dim).to(device)           # [32, 64]
        fake_images = gen(noise)                                        # [32, 784]

        discr_fake_results = discr(fake_images).view(-1)                # [32]
        discr_fake_loss = criterion(
            discr_fake_results, torch.zeros_like(discr_fake_results))

        discr_loss = (discr_real_loss + discr_fake_loss) / 2
        discr.zero_grad()
        discr_loss.backward(retain_graph=True)
        opt_d.step()


        #############################
        # Training the generator
        #############################

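        # Generator wants to minimize 'log(1 - D(G(z)))', but a better thing
        # to do is to maximize 'log(D(G(z)))', which pushes the Generator in the
        # same direction but gives stronger gradients early in training, when
        # the Discriminator easily rejects the fake images (vanishing gradient problem)
        new_discr_fake_results = discr(fake_images).view(-1)
        gen_loss = criterion(new_discr_fake_results, torch.ones_like(new_discr_fake_results))
        gen.zero_grad()
        gen_loss.backward()
        # update only the Generator's parameters
        opt_g.step()

        # keep track of the losses for this batch
        discr_losses.append(discr_loss.item())
        gen_losses.append(gen_loss.item())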
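
The vanishing-gradient remark in the Generator step can be illustrated numerically. Writing D(G(z)) = sigmoid(a) for the Discriminator's pre-sigmoid output a, the derivative of log(1 - sigmoid(a)) with respect to a is -sigmoid(a), while the derivative of -log(sigmoid(a)) is -(1 - sigmoid(a)). The plain-Python sketch below (helper names and the example value of a are hypothetical) evaluates both for a fake image that the Discriminator confidently rejects.

```python
import math

def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))

def grad_saturating(a):
    """d/da of log(1 - sigmoid(a)), the Generator loss in the original minimax objective."""
    return -sigmoid(a)

def grad_non_saturating(a):
    """d/da of -log(sigmoid(a)), i.e. BCE against a target of ones."""
    return -(1.0 - sigmoid(a))

a = -6.0  # hypothetical pre-sigmoid output: D(G(z)) = sigmoid(-6), roughly 0.0025
print("D(G(z))            :", sigmoid(a))
print("saturating grad    :", grad_saturating(a))
print("non-saturating grad:", grad_non_saturating(a))
```

Both gradients have the same sign, so both losses push the Generator to raise D(G(z)), but only the second keeps a useful magnitude while the Discriminator is winning.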

Plot the original image and the last fake image generated by the Generator to compare the results.
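
The notebook stops before the plotting code, so here is a minimal sketch of what such a comparison could look like. It assumes flat image vectors of length 784 with values in [-1, 1], as produced by the training loop; the helper names to_image and plot_real_vs_fake are hypothetical, and random NumPy arrays stand in for a real batch and for the Generator output so that the snippet runs on its own.

```python
import numpy as np
import matplotlib.pyplot as plt

def to_image(flat, side=28):
    """Reshape a flat vector in [-1, 1] to a (side, side) array in [0, 1]."""
    img = np.asarray(flat, dtype=np.float32).reshape(side, side)
    return (img + 1.0) / 2.0

def plot_real_vs_fake(real_flat, fake_flat):
    """Show one real and one generated image side by side."""
    fig, axes = plt.subplots(1, 2, figsize=(6, 3))
    for ax, flat, title in zip(axes, (real_flat, fake_flat), ("real", "fake")):
        ax.imshow(to_image(flat), cmap="gray", vmin=0.0, vmax=1.0)
        ax.set_title(title)
        ax.axis("off")
    return fig

# stand-in data (assumption): in the notebook these would be something like
# real[0].detach().cpu().numpy() and fake_images[0].detach().cpu().numpy()
rng = np.random.default_rng(7)
real_example = rng.uniform(-1.0, 1.0, size=784)
fake_example = rng.uniform(-1.0, 1.0, size=784)

fig = plot_real_vs_fake(real_example, fake_example)
```

Undoing the normalization before plotting keeps the gray levels of the real and the generated image directly comparable.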