<a href="https://colab.research.google.com/github/thejayden/FYP/blob/main/dcgan_for_cifar_solution.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In this notebook, we will implement a Deep Convolutional GAN (DCGAN) from [Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks](https://arxiv.org/pdf/1511.06434.pdf). Both the generator and the discriminator are built from convolutional layers. The dataset used here is CIFAR-10.

**Before executing the cell, go to Runtime -> Change Runtime Type -> GPU**

In [2]:
import time 

# for building GAN
import torch
import torch.nn as nn  
import numpy as np
from torchvision import transforms
from torch.utils.data import DataLoader
import torch.optim as optim
import torchvision.datasets as dset
import torchvision.utils as vutils

# for visualization
import matplotlib.pyplot as plt
import matplotlib.animation as animation
from IPython.display import HTML
from torchvision.utils import make_grid
%matplotlib inline

# Using GPU


In [3]:
device = "cuda" if torch.cuda.is_available() else "cpu"
print(device)

cuda


In [4]:
X = torch.randn(3, 2).to(device)
print(X.device)

cuda:0


# Prepare CIFAR-10 dataset

### Prepare data

In [5]:
transform = transforms.Compose([
    transforms.Resize(32),
    transforms.ToTensor(),
    transforms.Normalize((0.5, 0.5, 0.5), 
    (0.5, 0.5, 0.5)),
])
dataset = dset.CIFAR10(
    root='input/data',
    train=True,
    download=True,
    transform=transform
)

batch_size = 128        # Batch size during training

# Create the dataloader
dataloader = torch.utils.data.DataLoader(dataset, batch_size=batch_size, shuffle=True)

Downloading https://www.cs.toronto.edu/~kriz/cifar-10-python.tar.gz to input/data/cifar-10-python.tar.gz


  0%|          | 0/170498071 [00:00<?, ?it/s]

Extracting input/data/cifar-10-python.tar.gz to input/data


In [6]:
len(dataset)

50000

### Visualization

In [7]:
label_names = ['airplane', 'automobile', 'bird', 'cat', 'deer', 'dog', 'frog', 'horse', 'ship', 'truck']

In [8]:
# Show a random image
idx = np.random.choice(len(dataset)) # get a random index in range [0, len(dataset))
img, label = dataset[idx]

print(dataset[idx])
print('Image size: {}'.format(img.shape))
print('Label: {}'.format(label_names[label]))
plt.axis('off')
# img is a 3 x 32 x 32 tensor normalized to [-1, 1];
# map it back to [0, 1] and move channels last (32 x 32 x 3) for imshow
plt.imshow(((img + 1) / 2).permute(1, 2, 0))
plt.show()

(tensor([[[ 0.5137,  0.5843,  0.5608,  ..., -0.3176, -0.1843, -0.3961],
         [ 0.9451,  0.9843,  0.9686,  ..., -0.1765, -0.0039, -0.1529],
         [ 0.9529,  0.9843,  0.9451,  ..., -0.1137, -0.1216, -0.1608],
         ...,
         [-0.0667, -0.0431, -0.0431,  ..., -0.2078, -0.1765, -0.1451],
         [-0.0745, -0.0196, -0.0431,  ..., -0.1608, -0.1843, -0.1922],
         [-0.2627, -0.1608, -0.1922,  ..., -0.2392, -0.2627, -0.2863]],

        [[ 0.5373,  0.5608,  0.5451,  ..., -0.3255, -0.2392, -0.4510],
         [ 0.9686,  0.9686,  0.9686,  ..., -0.1765, -0.0275, -0.1765],
         [ 0.9608,  0.9686,  0.9608,  ..., -0.1294, -0.1373, -0.1608],
         ...,
         [ 0.0039,  0.0431,  0.0667,  ..., -0.1216, -0.1216, -0.1137],
         [-0.0118,  0.0431,  0.0510,  ..., -0.0902, -0.1294, -0.1529],
         [-0.2235, -0.1137, -0.1294,  ..., -0.2078, -0.2157, -0.2392]],

        [[ 0.6000,  0.6314,  0.6157,  ..., -0.5686, -0.3804, -0.6078],
         [ 0.9765,  0.9922,  0.9843,  ..., -

In [None]:
def show_images(image_tensor, num_images=25, nrow=5, save=False): 
  image_tensor = image_tensor.detach().to('cpu') 
  # convert the output values to the expected range of float pixel values [0, 1]
  image_tensor = (image_tensor + 1)/2  
  img = make_grid(image_tensor[:num_images], nrow=nrow).permute(1,2,0).squeeze()
  # if save is True, just return the image  
  if save:
    return img
  plt.axis('off')
  plt.imshow(img)
  plt.show()
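
The `Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))` transform used when preparing the data computes `(x - 0.5) / 0.5` per channel, which maps pixel values from [0, 1] to [-1, 1]. `show_images` undoes this with `(x + 1) / 2`. The sketch below replays that round trip on plain floats; the helper names `normalize` and `denormalize` are illustrative only and are not part of torchvision.

```python
def normalize(x, mean=0.5, std=0.5):
    # Same arithmetic as transforms.Normalize for a single channel value
    return (x - mean) / std

def denormalize(y):
    # Inverse used in show_images: map [-1, 1] back to [0, 1]
    return (y + 1) / 2

pixels = [0.0, 0.25, 0.5, 1.0]          # example pixel values after ToTensor()
normalized = [normalize(p) for p in pixels]
restored = [denormalize(v) for v in normalized]

print(normalized)
print(restored)
```

The [-1, 1] range matches the generator's Tanh output, so real and generated images share the same value range.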

In [None]:
for i, data in enumerate(dataloader):
  X, _ = data  
  print(X.shape)
  break

show_images(X, 64, 8)

# Model definition
In this section, we will define the generator and discriminator architectures.


### Generator Model

**Each layer of generator:**


*   Transposed convolution for upsampling
*   Use batchnorm except for the last layer
*   Apply ReLU activation for all layers except for the output, which uses Tanh










Remember that the output size of a transposed convolution is:
$$\text{output size} = (\text{input size} - 1) \times \text{stride} - 2 \times \text{padding} + \text{kernel size}$$
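
As a quick check of this formula, the sketch below applies it to the kernel size, stride and padding of the four transposed convolutions in this notebook's generator, starting from the 1 x 1 noise input. The helper `conv_transpose_out` is a hypothetical name, and the formula assumes a dilation of 1 and no output padding.

```python
def conv_transpose_out(size, kernel_size, stride, padding):
    # Output size of a transposed convolution (dilation 1, no output_padding)
    return (size - 1) * stride - 2 * padding + kernel_size

# (kernel_size, stride, padding) for the four generator layers
gen_layers = [(4, 1, 0), (4, 2, 1), (4, 2, 1), (4, 2, 1)]

size = 1  # the noise vector enters as a 1 x 1 feature map
gen_sizes = []
for k, s, p in gen_layers:
    size = conv_transpose_out(size, k, s, p)
    gen_sizes.append(size)

print(gen_sizes)
```

The spatial size goes 1, 4, 8, 16, 32, so the last layer produces 32 x 32 images, the same size as CIFAR-10.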

Useful functions in building generator:


*   [ConvTranspose2d](https://pytorch.org/docs/stable/generated/torch.nn.ConvTranspose2d.html)
*   [BatchNorm2d](https://pytorch.org/docs/stable/generated/torch.nn.BatchNorm2d.html?highlight=batchnorm#torch.nn.BatchNorm2d)
*   [ReLU](https://pytorch.org/docs/stable/generated/torch.nn.ReLU.html?highlight=relu#torch.nn.ReLU)
*   [Tanh](https://pytorch.org/docs/stable/generated/torch.nn.Tanh.html?highlight=tanh#torch.nn.Tanh)

In [None]:
nz = 100                # Size of z latent vector (i.e. size of generator input)
nc = 3                  # Number of channels in the training images. For color images this is 3
hidden_dim = 64         # Size of feature maps in generator and discriminator for CNN

In [None]:
# Generator Code
class Generator(nn.Module):
    def __init__(self):
        super(Generator, self).__init__()
        self.main = nn.Sequential(
            # layer 1, input is Z (noise), going into a convolution
            nn.ConvTranspose2d(in_channels=nz, out_channels=hidden_dim * 8, kernel_size=4, stride=1, padding=0, bias=False),
            nn.BatchNorm2d(hidden_dim * 8),
            nn.ReLU(True),
            # size (hidden_dim*8) x 4 x 4

            # layer 2
            nn.ConvTranspose2d(in_channels=hidden_dim * 8, out_channels=hidden_dim * 4, kernel_size=4, stride=2, padding=1, bias=False),
            nn.BatchNorm2d(hidden_dim * 4),
            nn.ReLU(True),
            # size (hidden_dim*4) x 8 x 8

            # layer 3
            nn.ConvTranspose2d(in_channels=hidden_dim * 4, out_channels=hidden_dim * 2, kernel_size=4, stride=2, padding=1, bias=False),
            nn.BatchNorm2d(hidden_dim * 2),
            nn.ReLU(True),
            # size (hidden_dim*2) x 16 x 16

            # layer 4 (last layer)
            # state size. (hidden_dim*2) x 16 x 16
            nn.ConvTranspose2d(in_channels=hidden_dim * 2, out_channels=nc, kernel_size=4, stride=2, padding=1, bias=False),
            nn.Tanh()
        )
         # state size. (nc) x 32 x 32

    def forward(self, input):
        '''
        Forward pass of the generator: given a batch of noise vectors,
        returns a batch of generated images with values in [-1, 1].
        Parameters:
            input: a noise tensor of shape (batch_size, nz, 1, 1)
        '''
        return self.main(input)

In [None]:
# apply to Generator and Discriminator network
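def init_weights(m):
  if isinstance(m, (nn.Conv2d, nn.ConvTranspose2d)):
    # convolution weights: N(0, 0.02)
    nn.init.normal_(m.weight, mean=0.0, std=0.02)
  elif isinstance(m, nn.BatchNorm2d):
    # batchnorm scale is centered at 1.0 (as in the PyTorch DCGAN tutorial);
    # a mean of 0.0 would scale normalized activations toward zero
    nn.init.normal_(m.weight, mean=1.0, std=0.02)
    nn.init.constant_(m.bias, val=0.0)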

In [None]:
# Create the generator
netG = Generator().to(device)

# Apply the init_weights function to randomly initialize all weights
#  to mean=0, stdev=0.02.
netG.apply(init_weights)

# Print the model
print(netG)
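
To get a feel for the size of the printed model, the number of learnable parameters can be worked out by hand. This sketch assumes the layer configuration shown in the `Generator` class (4 x 4 kernels, `bias=False` on every transposed convolution, one scale and one shift per BatchNorm channel); the helper names are illustrative. The result should agree with `sum(p.numel() for p in netG.parameters())`.

```python
nz, nc, hidden_dim = 100, 3, 64  # values set earlier in the notebook

def conv_params(in_ch, out_ch, kernel_size=4):
    # Weight count of a (transposed) convolution with bias=False
    return in_ch * out_ch * kernel_size * kernel_size

def batchnorm_params(channels):
    # BatchNorm2d learns one scale and one shift per channel
    return 2 * channels

# Channel counts from the noise input to the RGB output
channels = [nz, hidden_dim * 8, hidden_dim * 4, hidden_dim * 2, nc]

conv_total = sum(conv_params(i, o) for i, o in zip(channels[:-1], channels[1:]))
bn_total = sum(batchnorm_params(c) for c in channels[1:-1])  # no batchnorm on the output layer
gen_param_count = conv_total + bn_total

print(conv_total, bn_total, gen_param_count)
```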

In [None]:
def noise_vector(num, dim):
  '''
  Function for creating noise vectors: Given the dimensions (num, dim)
    creates a tensor of that shape filled with random numbers from the normal distribution.
    
  num: number of noise vectors, num = batch size in training process
  dim: dimension of each noise vector

  return: noise vectors of shape (num, dim, 1, 1)
  '''
  return torch.randn(num, dim, 1, 1)

In [None]:
z = noise_vector(1, 100)
print(z.shape)

In [None]:
z

In [None]:
gen = Generator()
fake = gen(z).detach()
print(fake[0].shape)

plt.axis('off')
# generator output is in [-1, 1] (Tanh); rescale to [0, 1] and move channels last for imshow
plt.imshow(((fake[0] + 1) / 2).permute(1, 2, 0))
plt.show()

### Discriminator Model

**Each layer of discriminator:**


*   Convolution for downsampling
*   Use batchnorm except for the first and last layers
*   Apply LeakyReLU activation with a slope of 0.2 for all layers except the output, which uses Sigmoid




Remember that the output size of a convolution is:
$$\text{output size} = \left\lfloor \frac{\text{input size} + 2 \times \text{padding} - \text{kernel size}}{\text{stride}} \right\rfloor + 1$$
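
The same kind of check works for the discriminator. The sketch below applies the convolution formula to the kernel size, stride and padding of the four `Conv2d` layers in this notebook, starting from a 32 x 32 image. `conv_out` is a hypothetical helper name, and a dilation of 1 is assumed.

```python
def conv_out(size, kernel_size, stride, padding):
    # Output size of a convolution (dilation 1); integer division rounds down
    return (size + 2 * padding - kernel_size) // stride + 1

# (kernel_size, stride, padding) for the four discriminator layers
disc_layers = [(4, 2, 1), (4, 2, 1), (4, 2, 1), (4, 1, 0)]

size = 32  # CIFAR-10 images are 32 x 32
disc_sizes = []
for k, s, p in disc_layers:
    size = conv_out(size, k, s, p)
    disc_sizes.append(size)

print(disc_sizes)
```

The spatial size goes 32, 16, 8, 4, 1, so each image is reduced to a single value that the Sigmoid turns into a probability.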

Useful functions in building discriminator:


*   [Conv2d](https://pytorch.org/docs/stable/generated/torch.nn.Conv2d.html?highlight=conv2d#torch.nn.Conv2d)
*   [BatchNorm2d](https://pytorch.org/docs/stable/generated/torch.nn.BatchNorm2d.html?highlight=batchnorm#torch.nn.BatchNorm2d)
*   [LeakyReLU](https://pytorch.org/docs/stable/generated/torch.nn.LeakyReLU.html?highlight=leaky%20relu#torch.nn.LeakyReLU)
*   [Sigmoid](https://pytorch.org/docs/stable/nn.functional.html?highlight=sigmoid#torch.nn.functional.sigmoid)

In [None]:
class Discriminator(nn.Module):
    def __init__(self):
        super(Discriminator, self).__init__()
        self.main = nn.Sequential(
            # layer 1, input is image of size nc x 32 x 32
            nn.Conv2d(nc, hidden_dim, 4, 2, 1, bias=False),
            nn.LeakyReLU(0.2, inplace=True),
            # state size. (hidden_dim) x 16 x 16

            # layer 2
            nn.Conv2d(hidden_dim, hidden_dim * 2, 4, 2, 1, bias=False),
            nn.BatchNorm2d(hidden_dim * 2),
            nn.LeakyReLU(0.2, inplace=True),
            # state size. (hidden_dim*2) x 8 x 8

            # layer 3
            nn.Conv2d(hidden_dim * 2, hidden_dim * 4, 4, 2, 1, bias=False),
            nn.BatchNorm2d(hidden_dim * 4),
            nn.LeakyReLU(0.2, inplace=True),
            # state size. (hidden_dim*4) x 4 x 4

            # layer 4
            nn.Conv2d(hidden_dim * 4, 1, 4, 1, 0, bias=False),
            nn.Sigmoid()
            # output size 1 (probability)
        )
    def forward(self, input):
        return self.main(input)

In [None]:
# Create the Discriminator
netD = Discriminator().to(device)

# Apply the init_weights function to randomly initialize all weights
#  to mean=0, stdev=0.02.
netD.apply(init_weights)

# Print the model
print(netD)

# Start Training

Useful function:



*   [torch.optim.Adam](https://pytorch.org/docs/stable/optim.html?highlight=adam#torch.optim.Adam)




In [None]:
lr = 0.0002                         # Learning rate for optimizers
beta1 = 0.5                         # Beta1 hyperparam for Adam optimizers

criterion = nn.BCELoss()            # Loss function

# Setup Adam optimizers for both G and D
optimizerD = optim.Adam(netD.parameters(), lr=lr, betas=(beta1, 0.999))
optimizerG = optim.Adam(netG.parameters(), lr=lr, betas=(beta1, 0.999))

# Latent vectors to visualize the progression of the generator
fixed_noise = torch.randn(64, nz, 1, 1, device=device)

# Establish convention for real and fake labels during training
real_label = 1 
fake_label = 0
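
For a prediction `p` and a target `y`, binary cross-entropy is `-(y*log(p) + (1-y)*log(1-p))`, which `nn.BCELoss` averages over the batch. With `real_label = 1` this reduces to `-log(p)`, and with `fake_label = 0` to `-log(1-p)`. The sketch below evaluates both cases for a single made-up discriminator output. The `bce` helper is illustrative only and leaves out the numerical safeguards of the PyTorch implementation.

```python
import math

def bce(p, y):
    # Binary cross-entropy for one prediction p in (0, 1) and target y in {0, 1}
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))

real_label, fake_label = 1, 0
d_out = 0.9  # hypothetical discriminator output for one image

loss_if_real = bce(d_out, real_label)   # target 1: loss is -log(0.9), small
loss_if_fake = bce(d_out, fake_label)   # target 0: loss is -log(1 - 0.9), large

print(round(loss_if_real, 4), round(loss_if_fake, 4))
```

This is why the generator step in the training loop labels fake images as real: minimizing the loss against `real_label` is the same as maximizing `log(D(G(z)))`.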


In [None]:
num_epochs = 30          # Number of training epochs

# Lists to keep track of progress
img_list = []
G_losses = []
D_losses = []
iters = 0

print("Starting Training Loop...")

for epoch in range(num_epochs):
    
    for i, data in enumerate(dataloader, 0):

        ############################
        # (1) Update D network
        ###########################

        netD.zero_grad()

        ## Train with all-real batch
        # Format batch
        real_cpu = data[0].to(device)
        b_size = real_cpu.size(0)
        label = torch.full((b_size,), real_label, dtype=torch.float, device=device)
        
        # Forward pass real batch through D
        output = netD(real_cpu).view(-1)

        # Calculate loss on all-real batch
        errD_real = criterion(output, label)

        # Calculate gradients for D in backward pass
        errD_real.backward()
        D_x = output.mean().item()

        ## Train with all-fake batch
        # Generate batch of latent vectors
        noise = torch.randn(b_size, nz, 1, 1, device=device)

        # Generate fake image batch with G
        fake = netG(noise)
        label.fill_(fake_label)

        # Classify all fake batch with D
        output = netD(fake.detach()).view(-1)

        # Calculate D's loss on the all-fake batch
        errD_fake = criterion(output, label)

        # Calculate the gradients for this batch, accumulated (summed) with previous gradients
        errD_fake.backward()
        D_G_z1 = output.mean().item()

        # Compute error of D as sum over the fake and the real batches
        errD = errD_real + errD_fake
        # Update D
        optimizerD.step()

        ############################
        # (2) Update G network: maximize log(D(G(z)))
        ###########################

        netG.zero_grad()
        label.fill_(real_label)        # fake labels are real for generator cost

        # Classify all fake batch with D 
        output = netD(fake).view(-1)

        # Calculate G's loss based on this output
        errG = criterion(output, label)

        # Calculate gradients for G
        errG.backward()
        D_G_z2 = output.mean().item()
        
        # Update G
        optimizerG.step()
        
        # Output training stats
        if i % 50 == 0:
            print('[%d/%d][%d/%d]\tLoss_D: %.4f\tLoss_G: %.4f\tD(x): %.4f\tD(G(z)): %.4f / %.4f'
                  % (epoch, num_epochs, i, len(dataloader),
                     errD.item(), errG.item(), D_x, D_G_z1, D_G_z2))
        
        # Save Losses for plotting later
        G_losses.append(errG.item())
        D_losses.append(errD.item())
        
        # Check how the generator is doing by saving G's output on fixed_noise
        if (iters % 500 == 0) or ((epoch == num_epochs-1) and (i == len(dataloader)-1)):
            with torch.no_grad():
                fake = netG(fixed_noise).detach().cpu()
            img_list.append(vutils.make_grid(fake, padding=2, normalize=True))
            
        iters += 1
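
A little arithmetic shows what to expect from this loop. The sketch assumes the settings used in this notebook (50000 training images, `batch_size = 128`, `num_epochs = 30`, a snapshot every 500 iterations) and the `DataLoader` default of keeping the last partial batch.

```python
import math

dataset_size = 50000   # len(dataset) reported earlier
batch_size = 128
num_epochs = 30

# DataLoader keeps the last partial batch by default (drop_last=False)
batches_per_epoch = math.ceil(dataset_size / batch_size)
total_iters = num_epochs * batches_per_epoch

# Size of the final, smaller batch in each epoch
last_batch = dataset_size - (batches_per_epoch - 1) * batch_size

# Snapshots are saved when iters % 500 == 0, plus once at the very last iteration
snapshots = len(range(0, total_iters, 500))
if (total_iters - 1) % 500 != 0:
    snapshots += 1

print(batches_per_epoch, total_iters, last_batch, snapshots)
```

The smaller final batch is the reason the loop reads `b_size` from `real_cpu.size(0)` instead of reusing `batch_size`.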

In [None]:
torch.save(netG.state_dict(), 'generator.pth')
torch.save(netD.state_dict(), 'discriminator.pth')

# Visualization

In [None]:
plt.figure(figsize=(18,5))
plt.title("Generator and Discriminator Loss During Training")
plt.plot(G_losses,label="G")
plt.plot(D_losses,label="D")
plt.xlabel("iterations")
plt.ylabel("loss")
plt.xscale('linear')
plt.ylim(0, 15)
plt.legend()  # show the "G" and "D" labels set in plt.plot above
plt.show()

In [None]:
#%%capture
fig = plt.figure(figsize=(8,8))
plt.axis("off")
ims = [[plt.imshow(np.transpose(i,(1,2,0)), animated=True)] for i in img_list]
ani = animation.ArtistAnimation(fig, ims, interval=1000, repeat_delay=1000, blit=True)

HTML(ani.to_jshtml())

In [None]:
# Load pre-trained model
# Remember to define hyper parameters such as nz, nc before loading

# Uncomment this part
# netG = Generator().to(device)
# netG.load_state_dict(torch.load("./generator.pth"))
# netG.eval()

# netD = Discriminator().to(device)
# netD.load_state_dict(torch.load("./discriminator.pth"))
# netD.eval()
# print("Loaded the models!")

# opt = torch.optim.Adam(netD.parameters(), lr=0.01)

In [None]:
# noise vector
noise = noise_vector(25, nz).to(device)
# pass the noise vectors to trained generator
imgs = netG(noise)
show_images(imgs)

# What's next


*   Use other datasets to train a GAN, for example [CelebA](http://mmlab.ie.cuhk.edu.hk/projects/CelebA.html), which is also available in *torchvision*
*   Explore methods to improve the stability of GAN training. If you are interested, read more about Wasserstein GAN and gradient penalty: [WGAN-GP](https://jonathan-hui.medium.com/gan-wasserstein-gan-wgan-gp-6a1a2aa1b490)
*   Explore Conditional GAN, which allows you to control the output.

Want to include a GAN in your next project? Take a look at the GitHub repository [GANs Awesome Applications](https://github.com/nashory/gans-awesome-applications)!

