# Lab 3: Generative Adversarial Networks

In this lab, we will explore several generative adversarial network (GAN) architectures. The lab is organized as follows:
<ol>
<li>Vanilla GAN (DCGAN)</li>
<li>Conditional GAN (CGAN)</li>
<li>Auxiliary Classifier GAN (AC-GAN) </li>
<li>Bi-directional GAN (BiGAN) </li>
</ol>


## Module 1: Vanilla GAN (using DCGANs)

Let us first look at a standard generative adversarial network, using the architecture known as DCGAN (deep convolutional GAN).

First, let us define some constants and import some libraries. We also define a custom weight initialization function.


In [None]:
import os
import random
import torch
import torch.nn as nn
import torch.nn.parallel
import torch.backends.cudnn as cudnn
import torch.optim as optim
import torch.utils.data
import torchvision.datasets as dset
import torchvision.transforms as transforms
import torchvision.utils as vutils
from torch.autograd import Variable
from collections import namedtuple

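# let us define some parameters
workers = 2        # number of data-loading worker processes
batchSize = 4      # mini-batch size
nz = 10            # size of the latent noise vector z
ngf = 64           # base number of generator feature maps
ndf = 64           # base number of discriminator feature maps
niter = 1          # number of epochs
lr = 0.0002        # learning rate for Adam
beta1 = 0.5        # beta1 for Adam
cuda = torch.cuda.is_available()  # use the GPU only if one is present
ngpu = 1           # number of GPUs to use
outf = './output'
manualSeed = 67
nc = 1             # number of image channels (MNIST is grayscale)

# seed the random number generators so that runs are repeatable
random.seed(manualSeed)
torch.manual_seed(manualSeed)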

# custom weights initialization called on netG and netD
def weights_init(m):
    classname = m.__class__.__name__
    if classname.find('Conv') != -1:
        m.weight.data.normal_(0.0, 0.02)
    elif classname.find('BatchNorm') != -1:
        m.weight.data.normal_(1.0, 0.02)
        m.bias.data.fill_(0)

A GAN consists of a generator network and a discriminator network, trained in alternation. The generator takes in noise and generates an image. The discriminator takes in an image and decides whether it was produced by the generator or drawn from the training dataset.
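
Formally, this alternating training corresponds to the two-player minimax game of the original GAN formulation. Here $D(x)$ is the discriminator's estimated probability that image $x$ is real and $G(z)$ is the image generated from noise $z$; the symbols $p_{\text{data}}$ and $p_z$ for the data and noise distributions are notation introduced for this note:

$$\min_G \max_D \; \mathbb{E}_{x \sim p_{\text{data}}}\big[\log D(x)\big] + \mathbb{E}_{z \sim p_z}\big[\log\big(1 - D(G(z))\big)\big]$$

The discriminator tries to make this quantity large, while the generator tries to make it small.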

Let us define the discriminator network. In keeping with the guidelines of the DCGAN paper, we use LeakyReLUs as the non-linearities in the discriminator. We will also define a criterion that computes the binary cross-entropy loss.

In [None]:
class _netD(nn.Module):
    def __init__(self):
        super(_netD, self).__init__()
        self.main = nn.Sequential(
            # input is (nc) x 64 x 64
            nn.Conv2d(nc, ndf, 4, 2, 1, bias=False),
            nn.LeakyReLU(0.2, inplace=True),
            # state size. (ndf) x 32 x 32
            nn.Conv2d(ndf, ndf * 2, 4, 2, 1, bias=False),
            nn.BatchNorm2d(ndf * 2),
            nn.LeakyReLU(0.2, inplace=True),
            # state size. (ndf*2) x 16 x 16
            nn.Conv2d(ndf * 2, ndf * 4, 4, 2, 1, bias=False),
            nn.BatchNorm2d(ndf * 4),
            nn.LeakyReLU(0.2, inplace=True),
            # state size. (ndf*4) x 8 x 8
            nn.Conv2d(ndf * 4, ndf * 8, 4, 2, 1, bias=False),
            nn.BatchNorm2d(ndf * 8),
            nn.LeakyReLU(0.2, inplace=True),
            # state size. (ndf*8) x 4 x 4
            nn.Conv2d(ndf * 8, 1, 4, 1, 0, bias=False),
            nn.Sigmoid()
        )

    def forward(self, input):
        output = self.main(input)
        return output.view(-1, 1)
    
netD = _netD()
netD.apply(weights_init)
print(netD)

criterion = nn.BCELoss()
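
To make the criterion concrete, here is a minimal pure-Python sketch of the mean binary cross-entropy that `nn.BCELoss` computes by default. The function name `binary_cross_entropy` and the clipping constant `eps` are choices made for this illustration; PyTorch handles the `log(0)` case in its own way.

```python
import math

def binary_cross_entropy(probs, targets, eps=1e-12):
    """Mean binary cross-entropy over a batch of predicted probabilities."""
    total = 0.0
    for p, t in zip(probs, targets):
        # clip to avoid log(0); eps is a choice made for this sketch
        p = min(max(p, eps), 1.0 - eps)
        total += -(t * math.log(p) + (1.0 - t) * math.log(1.0 - p))
    return total / len(probs)

# a confident, correct discriminator gets a small loss ...
print(binary_cross_entropy([0.9, 0.1], [1, 0]))
# ... while an undecided one (0.5 everywhere) gets log(2)
print(binary_cross_entropy([0.5, 0.5], [1, 0]))
```

A discriminator that cannot tell real from fake outputs 0.5 for every image, so its loss sits at log(2), roughly 0.693.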

Now, let us define the generator network and create an instance of the class.

In [None]:
class _netG(nn.Module):
    def __init__(self):
        super(_netG, self).__init__()
        self.main = nn.Sequential(
            # input is Z, going into a convolution
            nn.ConvTranspose2d(     nz, ngf * 8, 4, 1, 0, bias=False),
            nn.BatchNorm2d(ngf * 8),
            nn.ReLU(True),
            # state size. (ngf*8) x 4 x 4
            nn.ConvTranspose2d(ngf * 8, ngf * 4, 4, 2, 1, bias=False),
            nn.BatchNorm2d(ngf * 4),
            nn.ReLU(True),
            # state size. (ngf*4) x 8 x 8
            nn.ConvTranspose2d(ngf * 4, ngf * 2, 4, 2, 1, bias=False),
            nn.BatchNorm2d(ngf * 2),
            nn.ReLU(True),
            # state size. (ngf*2) x 16 x 16
            nn.ConvTranspose2d(ngf * 2,     ngf, 4, 2, 1, bias=False),
            nn.BatchNorm2d(ngf),
            nn.ReLU(True),
            # state size. (ngf) x 32 x 32
            nn.ConvTranspose2d(    ngf,      nc, 4, 2, 1, bias=False),
            nn.Tanh()
            # state size. (nc) x 64 x 64
        )

    def forward(self, input):
        output = self.main(input)
        return output
    
netG = _netG()
netG.apply(weights_init)
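
The "state size" comments in both networks can be checked with the standard output-size formulas for convolutions and transposed convolutions. The sketch below is plain arithmetic and assumes dilation 1 and no output padding, which matches the layers defined above; the helper names `conv_out` and `conv_transpose_out` are made up for this illustration.

```python
def conv_out(size, kernel, stride, padding):
    """Output side length of nn.Conv2d (dilation 1)."""
    return (size + 2 * padding - kernel) // stride + 1

def conv_transpose_out(size, kernel, stride, padding):
    """Output side length of nn.ConvTranspose2d (dilation 1, no output padding)."""
    return (size - 1) * stride - 2 * padding + kernel

# (kernel, stride, padding) of each layer, read off the network definitions
d_layers = [(4, 2, 1)] * 4 + [(4, 1, 0)]
g_layers = [(4, 1, 0)] + [(4, 2, 1)] * 4

size, d_sizes = 64, []          # the discriminator starts from a 64x64 image
for k, s, p in d_layers:
    size = conv_out(size, k, s, p)
    d_sizes.append(size)

size, g_sizes = 1, []           # the generator starts from a 1x1 noise "image"
for k, s, p in g_layers:
    size = conv_transpose_out(size, k, s, p)
    g_sizes.append(size)

print(d_sizes)  # [32, 16, 8, 4, 1]
print(g_sizes)  # [4, 8, 16, 32, 64]
```

The generator therefore mirrors the discriminator: each stride-2 layer doubles the spatial size where the discriminator halves it.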

We will use the Adam optimizer to train both networks. Let us set it up.

In [None]:
# setup optimizer
optimizerD = optim.Adam(netD.parameters(), lr=lr, betas=(beta1, 0.999))
optimizerG = optim.Adam(netG.parameters(), lr=lr, betas=(beta1, 0.999))

Let us see the sort of images that are generated by an untrained generator network using the random weight initialization function:

In [None]:
import torch
import torchvision
import torchvision.transforms as transforms
import matplotlib.pyplot as plt
import numpy as np

%matplotlib inline

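nbatchSize = 64
noise = torch.FloatTensor(nbatchSize, nz, 1, 1)
noise.normal_(0, 1)
# keep the noise on the same device as the generator, which is still on
# the CPU the first time this cell is run
noise = noise.to(next(netG.parameters()).device)
noise = Variable(noise)
output = netG(noise)
print(output.size())

output = output.cpu()
output = output.data
# the Tanh output lies in [-1, 1]; normalize=True rescales it to [0, 1] for display
output = torchvision.utils.make_grid(output, normalize=True)
output = output.permute(1, 2, 0)
plt.imshow(output.numpy())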

Now, we will move the networks to the GPU so that training runs faster. Note that if the networks are on the GPU, all the data we pass to them must be on the GPU too, and vice versa. Otherwise, PyTorch raises an error.

In [None]:
input = torch.FloatTensor(batchSize, 1, 64, 64)
noise = torch.FloatTensor(batchSize, nz, 1, 1)
fixed_noise = torch.FloatTensor(batchSize, nz, 1, 1).normal_(0, 1)
label = torch.FloatTensor(batchSize)
real_label = 1
fake_label = 0

#cuda=False
if cuda:
    netD.cuda()
    netG.cuda()
    criterion.cuda()
    input, label = input.cuda(), label.cuda()
    noise, fixed_noise = noise.cuda(), fixed_noise.cuda()

fixed_noise = Variable(fixed_noise)
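
A common alternative to calling `.cuda()` on each object is to pick a single `torch.device` and create or move everything onto it. The following is only a sketch of that pattern, not the code used in this lab; `toy_net` is a hypothetical stand-in for `netD` or `netG`.

```python
import torch
import torch.nn as nn

# pick the GPU when one is available, otherwise fall back to the CPU
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

# toy_net is a hypothetical stand-in for netD / netG
toy_net = nn.Linear(3, 1).to(device)

# create the data directly on the same device as the network
x = torch.randn(2, 3, device=device)
out = toy_net(x)
print(out.shape, out.device)
```

Because the network and its input share one `device` object, the same cell runs unchanged on machines with and without a GPU.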

In this experiment, we will use the MNIST dataset to train the GAN. The torchvision package provides easy access to MNIST and other commonly used datasets. Let us load the dataset now. We will resize the images from 28x28 to 64x64, as required by the networks we defined above.

In [None]:
# define a transformation to resize the image to 64x64
transform = transforms.Compose(
    [transforms.Resize(64),  # Resize replaces the deprecated transforms.Scale
     transforms.CenterCrop(64),
     transforms.ToTensor(),
     # mean 0 and std 1 leave the pixel values in [0, 1] unchanged
     transforms.Normalize((0,), (1,))])

# load the dataset (download=False expects the files to already be under root)
trainset = torchvision.datasets.MNIST(root='../../data/lab3', train=True,
                                        download=False, transform=transform)
trainloader = torch.utils.data.DataLoader(trainset, batch_size=batchSize,
                                          shuffle=True, num_workers=workers)
classes = ('0','1','2','3','4','5','6','7','8','9')

# print some random training images
dataiter = iter(trainloader)
images, labels = next(dataiter)
img = torchvision.utils.make_grid(images)
npimg = img.numpy()
plt.imshow(np.transpose(npimg, (1, 2, 0)))
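
The `Normalize` step in the transform computes `(pixel - mean) / std` for each channel. The small sketch below reproduces that arithmetic in plain Python to show what the settings used here do. The second setting, mean 0.5 and std 0.5, is not used in this lab; it is shown only as a common alternative that maps pixels onto [-1, 1], the output range of the generator's Tanh.

```python
def normalize(pixel, mean, std):
    """Per-pixel arithmetic of transforms.Normalize: (pixel - mean) / std."""
    return (pixel - mean) / std

# columns: original pixel, mean 0 / std 1 (this lab), mean 0.5 / std 0.5 (alternative)
for pixel in (0.0, 0.5, 1.0):
    print(pixel, normalize(pixel, 0.0, 1.0), normalize(pixel, 0.5, 0.5))
```

With mean 0 and std 1 the values come out unchanged, so the training images stay in [0, 1].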

Now, let us train the network. The code below runs for roughly 1000 batches. You can run this block again and again, or increase the iteration limit, to train for longer.

In [None]:
from __future__ import print_function

for epoch in range(niter):
    for i, data in enumerate(trainloader, 0):
        if i >= 1000:
            print('done 1000 iterations')
            break
        if i%100==0:
            print(i, end=' ')
            
        ############################
        # (1) Update D network: maximize log(D(x)) + log(1 - D(G(z)))
        ###########################
        # train with real
        netD.zero_grad()
        real_cpu, _ = data
        batch_size = real_cpu.size(0)
        if cuda:
            real_cpu = real_cpu.cuda()
        input.resize_as_(real_cpu).copy_(real_cpu)
        label.resize_(batch_size).fill_(real_label)
        inputv = Variable(input)
        labelv = Variable(label)

        output = netD(inputv)
        errD_real = criterion(output, labelv)
        errD_real.backward()
        D_x = output.data.mean()

        # train with fake
        noise.resize_(batch_size, nz, 1, 1).normal_(0, 1)
        noisev = Variable(noise)
        fake = netG(noisev)
        labelv = Variable(label.fill_(fake_label))
        output = netD(fake.detach())
        errD_fake = criterion(output, labelv)
        errD_fake.backward()
        D_G_z1 = output.data.mean()
        errD = errD_real + errD_fake
        optimizerD.step()

        ############################
        # (2) Update G network: maximize log(D(G(z)))
        ###########################
        netG.zero_grad()
        labelv = Variable(label.fill_(real_label))  # fake labels are real for generator cost
        output = netD(fake)
        errG = criterion(output, labelv)
        errG.backward()
        D_G_z2 = output.data.mean()
        optimizerG.step()
        
        # display some generated images every 100 iterations
        if i%100 == 0:
            fake = netG(fixed_noise)
            fake = fake.data
            fake = fake[0:16,:,:,:]  # at most 16 images (fixed_noise holds batchSize of them)
            fake = fake.cpu()
            fake = torchvision.utils.make_grid(fake)
            fake = fake.permute(1,2,0)
            plt.imshow(fake.numpy())
            plt.show()  # draw a separate figure for each snapshot
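
In step (2) of the training loop, the generator is updated using `real_label` as the target for fake images. With a target of 1, the binary cross-entropy reduces to -log(D(G(z))), so minimizing it is the same as maximizing log(D(G(z))), as the comment in the loop states. The sketch below compares this loss with the log(1 - D(G(z))) term from step (1); the function names and the discriminator outputs are made up for illustration.

```python
import math

def generator_loss(d_of_fake):
    """BCE against target 1: -log(D(G(z)))."""
    return -math.log(d_of_fake)

def minimax_term(d_of_fake):
    """The log(1 - D(G(z))) term from the discriminator objective."""
    return math.log(1.0 - d_of_fake)

# d_of_fake values are made-up discriminator outputs for illustration
for d_of_fake in (0.01, 0.5, 0.99):
    print(d_of_fake, generator_loss(d_of_fake), minimax_term(d_of_fake))
```

Both quantities fall as D(G(z)) rises, so either one rewards the generator for fooling the discriminator. The first is much larger when the discriminator confidently rejects a fake, which is the situation early in training.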

You can go back to the display cell we made earlier and rerun it to see some of the images the generator creates once it is trained.

### Questions/Exercises
<ol>
<li> How long do we have to train the DCGAN to get good results? Can you plot the generator and discriminator losses and see whether they correlate with image quality? (Hint: the generator loss is errG and the discriminator loss is errD.)</li>
<li> What is the relationship between the input random noise and the output produced by the generator? Can you vary the input noise a little and observe the effect on the generated images? </li>
<li> Is there any way to control the class of the generated images by changing the input noise vector?</li>
</ol>
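
As a possible starting point for the second exercise, the sketch below builds a batch of noise vectors that move in a straight line from one random vector to another. The helper `interpolate_noise` is a hypothetical name introduced here, and linear interpolation is only one of several reasonable ways to vary the noise.

```python
import torch

def interpolate_noise(z_start, z_end, steps):
    """Return `steps` noise vectors evenly spaced on the line from z_start to z_end."""
    weights = torch.linspace(0.0, 1.0, steps).view(steps, 1, 1, 1)
    return (1.0 - weights) * z_start + weights * z_end

nz = 10  # same latent size as in the lab
z_start = torch.randn(1, nz, 1, 1)
z_end = torch.randn(1, nz, 1, 1)
path = interpolate_noise(z_start, z_end, steps=8)
print(path.shape)  # torch.Size([8, 10, 1, 1])
# passing `path` (moved to the generator's device) through netG would show
# how the generated image changes along the line
```

The resulting tensor has the same `(batch, nz, 1, 1)` layout as the noise used elsewhere in the lab, so it can be fed to the generator directly.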