### **Experiment 6.2**

- Implement a fully convolutional DCGAN-like model (https://arxiv.org/abs/1511.06434)
- Train the model on the CelebA dataset to generate new faces
- Requirements:
  1. Use W&B (Weights & Biases) for tracking
  2. Show the model's capability to generate images
  3. Evaluate and track during training using one quantitative metric (Inception Score is used here)

In [1]:
# Required Imports
import os
import shutil
from tqdm import tqdm
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd

import torch
import torch.nn as nn
import torch.nn.functional as F
import torchvision
import ignite
from PIL import Image
from natsort import natsorted
from ignite.metrics import FID 
from torch import Tensor, optim
from torch.autograd import Variable
from torchvision import datasets, models, transforms
from torchvision.utils import save_image
from torch.utils.data import DataLoader, Dataset
from torchvision.datasets import ImageFolder
from torch.utils.tensorboard import SummaryWriter
from pytorch_gan_metrics import get_inception_score_and_fid, get_inception_score, get_fid

In [4]:
# Install Wandb and its dependencies 
# !pip install wandb



In [5]:
# Log in to W&B account
import wandb
os.environ["WANDB_NOTEBOOK_NAME"] = "Exercise6_2CelebA.ipynb"
wandb.login()

# Project link: https://wandb.ai/tarzilianams/cudalab-assignment6-dcgan

[34m[1mwandb[0m: Currently logged in as: [33mtarzilianams[0m. Use [1m`wandb login --relogin`[0m to force relogin


True

In [6]:
# Downloading and Importing celebA dataset using DataLoader

In [7]:
image_size = 64     # 64x64
batch_size = 128  # Batch size of 128 used

train_transform = transforms.Compose([
    transforms.Resize((image_size, image_size)),
    transforms.CenterCrop(image_size),
    transforms.ToTensor(),
    transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5)),   # mean/std
])

celeba_dataset = datasets.CelebA('./', split="train", download=False, transform=train_transform)
celeba_loader = torch.utils.data.DataLoader(dataset=celeba_dataset, batch_size=batch_size, shuffle=True, drop_last=False)  # Reuse batch_size (128) defined above

In [8]:
# Size of CelebA dataset
len(celeba_dataset)

162770

In [9]:
# Image tensor dimension
for i in celeba_loader:
    print(i[0].shape)  # (batch, channel (RGB), height, width)
    break

torch.Size([128, 3, 64, 64])


The following issue occurred repeatedly while trying to re-download the CelebA dataset through the torchvision dataset API: "The daily quota of the file img_align_celeba.zip is exceeded and it can't be downloaded. This is a limitation of Google Drive and can only be overcome by trying again later." The alternative workaround was to download the dataset manually.
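
If the manually downloaded images are unpacked into a single flat folder rather than the layout `datasets.CelebA` expects, a small custom dataset is one way to load them. The sketch below is only an illustration: the class name `FlatImageFolder` and the folder path in the usage note are made up for this example and are not part of the notebook.

```python
from pathlib import Path
from typing import Callable, Optional

from PIL import Image
from torch.utils.data import Dataset


class FlatImageFolder(Dataset):
    """Hypothetical helper: serves every image file found directly inside one folder."""

    def __init__(self, root: str, transform: Optional[Callable] = None):
        extensions = {".jpg", ".jpeg", ".png"}
        # Sort the paths so the sample order is reproducible across runs
        self.paths = sorted(p for p in Path(root).iterdir() if p.suffix.lower() in extensions)
        self.transform = transform

    def __len__(self) -> int:
        return len(self.paths)

    def __getitem__(self, index: int):
        image = Image.open(self.paths[index]).convert("RGB")
        if self.transform is not None:
            image = self.transform(image)
        # Dummy label so that `for images, _ in loader` still unpacks correctly
        return image, 0
```

It could then be used as `FlatImageFolder("./img_align_celeba", transform=train_transform)`, where the folder name is an assumption about where the archive was extracted.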

We define only the training dataset and loader, not a validation loader, because all DCGAN operations are performed on training data: the generator produces fake images and the discriminator classifies each image as real (1) or fake (0).

The neural network in DCGAN uses convolutional layers. In fact, in the whole DCGAN architecture, there are no fully connected layers. Hence this architecture is a fully convolutional network.
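
Because there are no fully connected layers, the output resolution is determined entirely by the kernel size, stride and padding of each layer. As a quick sanity check, the sketch below applies the standard transposed-convolution size formula (assuming dilation 1 and no output padding) to the layer settings used by the generator in this notebook. The helper name `conv_transpose_out` is made up for this example.

```python
def conv_transpose_out(size: int, kernel: int, stride: int, padding: int) -> int:
    """Output height/width of a transposed convolution (dilation=1, output_padding=0)."""
    return (size - 1) * stride - 2 * padding + kernel


# (kernel, stride, padding) of the five transposed convolutions in the generator
generator_layers = [(4, 1, 0), (4, 2, 1), (4, 2, 1), (4, 2, 1), (4, 2, 1)]

size = 1  # the noise vector enters as a 1x1 feature map with nv_size channels
generator_sizes = []
for kernel, stride, padding in generator_layers:
    size = conv_transpose_out(size, kernel, stride, padding)
    generator_sizes.append(size)

print(generator_sizes)  # [4, 8, 16, 32, 64]
```

Each stride-2 layer doubles the resolution, so four of them after the initial 4x4 map give the 64x64 output.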

In [10]:
# Use cuda gpu device
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

The next code block contains the entire generator code.

In [11]:
# nv_size -  Size of the noise vector used by generator as input to first convolutional layer (initialized as 100)
# Kernel size, Stride, Padding = 4, 1, 0 (for first layer); 4, 2, 1 for subsequent layers 
# Start from 512 output channels -> 256 -> 128 -> 64 -> 3 channels (RGB) of colored images
# Final Image dimension: 64 x 64 x 3

class GeneratorDCGAN(nn.Module):
    def __init__(self, nv_size: int = 100):
        super().__init__()
        self.nv_size = nv_size
        # Sequential container used to build the generator model
        self.network = nn.Sequential(
            nn.ConvTranspose2d(nv_size, 512, 4, 1, 0, bias=False),
            nn.BatchNorm2d(512),        # Batch Normalization for normalizing layer inputs
            nn.ReLU(True),              # ReLu Activation function

            nn.ConvTranspose2d(512, 256, 4, 2, 1, bias=False),
            nn.BatchNorm2d(256),
            nn.ReLU(True),

            nn.ConvTranspose2d(256, 128, 4, 2, 1, bias=False),
            nn.BatchNorm2d(128),
            nn.ReLU(True),

            nn.ConvTranspose2d(128, 64, 4, 2, 1, bias=False),
            nn.BatchNorm2d(64),
            nn.ReLU(True),

            nn.ConvTranspose2d(64, 3, 4, 2, 1, bias=False),
            nn.Tanh()                   # Tanh function
        )

    # forward pass of the noise vector through the generator network
    def forward(self, batch_size: int) -> Tensor:
        z = torch.randn(batch_size, self.nv_size, 1, 1, device=device)
        return self.network(z)


The discriminator model is almost the reverse of the generator model.

In [12]:
# The discriminator takes an image batch as input, so no noise vector (nv_size) is involved
# Kernel size, Stride, Padding = 4, 1, 0 (for last layer); 4, 2, 1 otherwise
# Start from 3 input channels (RGB) -> 64 -> 128 -> 256 -> 512 -> 1 channel (to classify image as real (1) or fake (0))

class DiscriminatorDCGAN(nn.Module):
    def __init__(self):
        super().__init__()
        # Sequential container used to build the discriminator model
        self.network = nn.Sequential(
            nn.Conv2d(3, 64, 4, 2, 1, bias=False),
            nn.BatchNorm2d(64),
            nn.LeakyReLU(0.2, inplace=True),  # LeakyReLU activation with slope 0.2

            nn.Conv2d(64, 128, 4, 2, 1, bias=False),
            nn.BatchNorm2d(128),
            nn.LeakyReLU(0.2, inplace=True),

            nn.Conv2d(128, 256, 4, 2, 1, bias=False),
            nn.BatchNorm2d(256),
            nn.LeakyReLU(0.2, inplace=True),

            nn.Conv2d(256, 512, 4, 2, 1, bias=False),
            nn.BatchNorm2d(512),
            nn.LeakyReLU(0.2, inplace=True),

            nn.Conv2d(512, 1, 4, 1, 0, bias=False),
            nn.Sigmoid()                      # Squashes the output to a probability in (0, 1)
        )

    # Forward passes real/fake image batch through discriminator
    def forward(self, x: Tensor) -> Tensor:
        return self.network(x).flatten()
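
The layer settings of the discriminator can be checked with the usual convolution output-size formula. The helper below is a small sketch (the name `conv_out` is made up for this note) and assumes dilation 1, as in the layers above.

```python
def conv_out(size: int, kernel: int, stride: int, padding: int) -> int:
    """Output height/width of a standard convolution with dilation=1."""
    return (size + 2 * padding - kernel) // stride + 1


# (kernel, stride, padding) of the five convolutions in the discriminator
discriminator_layers = [(4, 2, 1), (4, 2, 1), (4, 2, 1), (4, 2, 1), (4, 1, 0)]

size = 64  # input images are 64x64
discriminator_sizes = []
for kernel, stride, padding in discriminator_layers:
    size = conv_out(size, kernel, stride, padding)
    discriminator_sizes.append(size)

print(discriminator_sizes)  # [32, 16, 8, 4, 1]
```

The final 1x1 map with a single channel is what `flatten()` turns into one score per image.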

We next define a function that initializes weights for both the aforementioned models.

In [13]:
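# Function that initializes the model weights of the generator and the discriminator
def weight_initializer(model: nn.Module):
    for child in model.network.children():
        # nn.ConvTranspose2d is not a subclass of nn.Conv2d, so both are listed explicitly;
        # otherwise the generator's transposed convolutions would keep their default init
        if isinstance(child, (nn.Conv2d, nn.ConvTranspose2d)):
            nn.init.normal_(child.weight.data, 0.0, 0.02)
        if isinstance(child, nn.BatchNorm2d):
            nn.init.normal_(child.weight.data, 1.0, 0.02)
            nn.init.constant_(child.bias.data, 0)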
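
One detail worth checking in an initializer like this is the type test: `nn.ConvTranspose2d` and `nn.Conv2d` are sibling classes, so an `isinstance` check against `nn.Conv2d` alone does not match the generator's layers. The sketch below shows this on a toy model and uses `Module.apply` as an alternative way to visit every submodule. The function name `init_dcgan_weights` and the toy model are illustrative only.

```python
import torch.nn as nn


def init_dcgan_weights(module: nn.Module) -> None:
    """Illustrative initializer: N(0, 0.02) for conv weights, N(1, 0.02) for BatchNorm scale."""
    if isinstance(module, (nn.Conv2d, nn.ConvTranspose2d)):
        nn.init.normal_(module.weight, mean=0.0, std=0.02)
    elif isinstance(module, nn.BatchNorm2d):
        nn.init.normal_(module.weight, mean=1.0, std=0.02)
        nn.init.constant_(module.bias, 0.0)


# Toy stand-in for the first generator block
toy = nn.Sequential(
    nn.ConvTranspose2d(8, 4, 4, 1, 0, bias=False),
    nn.BatchNorm2d(4),
    nn.ReLU(True),
)
toy.apply(init_dcgan_weights)  # apply() calls the function on every submodule

print(isinstance(toy[0], nn.Conv2d))  # False
```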


In [15]:
# Image generation for IS/FID metric calculations
from tqdm import trange

def generate_imgs(generator, size=5000, batch_size=128):
    generator.eval()
    imgs = []
    with torch.no_grad():
        for start in trange(0, size, batch_size,
                            desc='Evaluating', ncols=0, leave=False):
            end = min(start + batch_size, size)
            # GeneratorDCGAN.forward samples its own noise, so it only needs the batch size
            imgs.append(generator(end - start).cpu())
    generator.train()
    imgs = torch.cat(imgs, dim=0)
    imgs = (imgs + 1) / 2   # rescale from [-1, 1] to [0, 1]
    return imgs

Next, we define the function used for training the discriminator and generator networks.

In [16]:
# Training function
def training_function():
    
    # Wandb Initialization
    run = wandb.init()
    nv_size = run.config.nv_size
    run.name = f'DCGAN-{nv_size}'  # nv_size of current iteration

    # Respective models
    discriminator = DiscriminatorDCGAN().to(device)
    generator = GeneratorDCGAN(nv_size).to(device)
    
    # Initialize weights for both models
    weight_initializer(discriminator)
    weight_initializer(generator)
    
    # Other initializations
    lr = 0.0002
    beta1 = 0.5    # default is 0.9, using 0.5 here for more stable training and faster convergence.
    beta2 = 0.999  # default on documentation

    # Using Adam optimizer for training for both models:
    Discriminator_optimizer = optim.Adam(discriminator.parameters(),lr=lr, betas=(beta1,beta2))
    Generator_optimizer = optim.Adam(generator.parameters(),lr=lr, betas=(beta1,beta2))
    
    # Criterion - Binary Cross Entropy Loss
    criterion = nn.BCELoss().to(device) 

    for epoch in range(5):
        for images,_ in celeba_loader:
            
            # get images and image batch size
            real_imgs = images.to(device)
            batch_size = images.shape[0] 
            
            # target to classify image as real
            target = torch.ones(batch_size, dtype=torch.float, device=device)
            # generate fake image batch with Generator
            fake_imgs = generator(batch_size)

            # Discriminator Loss and optimizer
            Discriminator_loss_real = criterion(discriminator(real_imgs), target)                 # target = real
            Discriminator_loss_fake = criterion(discriminator(fake_imgs.detach()), 1-target)      # 1-target = fake
            Discriminator_loss = Discriminator_loss_real + Discriminator_loss_fake
            Discriminator_optimizer.zero_grad()
            Discriminator_loss.backward()
            Discriminator_optimizer.step()

            # Generator Loss and optimizer
            Generator_loss = criterion(discriminator(fake_imgs), target)
            Generator_optimizer.zero_grad()
            Generator_loss.backward()
            Generator_optimizer.step()
            
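            # Using Inception Score (IS) for evaluating the quality of generated/fake images.
            # pytorch_gan_metrics expects images in [0, 1], so undo the Tanh range of [-1, 1] first.
            IS = get_inception_score((fake_imgs.detach() + 1) / 2)
            
            # Log losses and IS (IS[0] is the mean score, IS[1] its standard deviation)
            wandb.log({'Discriminator loss': Discriminator_loss.item(), 'Generator loss': Generator_loss.item(), 'Inception Score (IS)': IS[0]})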
            

    discriminator.eval()
    generator.eval()

    # Check for every configuration by generating 15 images and calculating discriminator's score
    fake_batch_size = 15
    with torch.no_grad():   # inference only, no gradients needed
        fake_imgs = generator(batch_size=fake_batch_size)
        preds = discriminator(fake_imgs)
    
    # Create Wandb Table and Display Discriminator Predictions to visualize the model capability
    sweep_table = wandb.Table(columns=['img', 'Prediction_Discriminator', 'nv_size'])
    for index in range(fake_batch_size):
        sweep_table.add_data(wandb.Image(fake_imgs[index]), preds[index].item(), nv_size)
    wandb.log({"Predictions_CelebA": sweep_table})


We also initialize a sweep configuration for W&B, used later to track the loss curves of the generator and discriminator.

In [19]:
sweep_configuration = {
    'name': 'Fully Convolutional DCGAN',
    'metric': { 'name': 'Generator loss', 'goal': 'minimize' },  # must match the key passed to wandb.log
    'method': 'grid',
    'parameters': {
        'nv_size': {
            'values': [10, 50, 100, 200, 300, 400]
        }
    }
}
sweep_id = wandb.sweep(sweep_configuration, project='cudalab-assignment6-dcgan')

Create sweep with ID: 1hb4xwzr
Sweep URL: https://wandb.ai/tarzilianams/cudalab-assignment6-dcgan/sweeps/1hb4xwzr


After the requisite configurations and functions are defined, all we need to do is start the W&B agent, which runs the sweep using the training function above.

In [18]:
wandb.agent(sweep_id, function=training_function, project='cudalab-assignment6-dcgan')

[34m[1mwandb[0m: Agent Starting Run: ggcfl4s7 with config:
[34m[1mwandb[0m: 	nv_size: 10


0,1
Discriminator loss,▂▇▆█▆▄▆▄▇▆▅▄▆▅▃▃▄▂▂▂▃▄▃▃▂▄▂▂▅▁▃▆▂▃▂▅▂▂▂▅
Generator loss,█▂▂▄▁▃▁▂▁▂▃▂▃▃▂▄▃▃▂▂▄▃▂▃▃▃▃▄▅▃▅▄▂▃▄▅▃▃▄▃
Inception Score (IS),▁▅▅▇▇▆▆▇▅█▆▅▅▅▄▆▅▅▆▅▅▅▄▅▅▄▆▅▅▆▅▆▆▅▅▇▅▅▅▅

Run summary:
Discriminator loss,1.59594
Generator loss,1.47087
Inception Score (IS),1.98362


[34m[1mwandb[0m: Agent Starting Run: s7f9fq1d with config:
[34m[1mwandb[0m: 	nv_size: 50


0,1
Discriminator loss,▂▅▂▆▂▂▄▂▃▂▂▂▃▂▁▁▂▂▂▁▃▂▅▃▂▁▂▂▁▁▁▅▃▁▂▂█▅▁▁
Generator loss,█▅▂▁▂▂▂▂▂▂▂▁▁▂▂▂▂▂▂▂▂▃▃▂▂▂▂▂▂▂▂▁▁▂▃▂▃▁▂▂
Inception Score (IS),▁▂▅▇▇▇▇█▇▆▅▆▅▇▅▆▇▆▆▆▆▇▆▆▆▆▆▆▆▅▇▆▆▆▇▆▇▆▆▆

Run summary:
Discriminator loss,0.22253
Generator loss,3.40306
Inception Score (IS),2.1759


[34m[1mwandb[0m: Agent Starting Run: w9d2h2z9 with config:
[34m[1mwandb[0m: 	nv_size: 100


0,1
Discriminator loss,▂▁▁▁▁▁▁▁▁▁▁▁▁▁█▅▄▄▅▄▄▅▄▃▅▄▄▄▆▄▄▃▃▃▄▃▄▆▄▅
Generator loss,▃████████████▇▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▂▁▁▁▁▁▁▂▁▁
Inception Score (IS),▁▁▁▁▁▁▁▁▁▁▁▁▁▁▄▆▆█▇▇▇▆▇▇▆▆▆█▆▆▆▆▆▆▆▆▆▆▆▆

Run summary:
Discriminator loss,1.72069
Generator loss,0.78656
Inception Score (IS),2.09346


[34m[1mwandb[0m: Agent Starting Run: 0jkab7m2 with config:
[34m[1mwandb[0m: 	nv_size: 200


0,1
Discriminator loss,▁▁▁▁▆▄▃▆▅▇▅▇▅▄▃▄▄▃▃▄▃▄▃▆▅▄▅▂▄▃▇█▂▄▇▃▄▂▃▃
Generator loss,▆▆██▂▁▁▂▁▁▁▂▁▁▂▁▂▂▁▁▁▁▁▂▁▁▁▁▂▁▂▁▁▁▁▁▂▁▂▁
Inception Score (IS),▁▁▁▁▅▇▆▇▆█▅▇▇▆▆▆▆▅▆▇▆▇▇▇▆▆▇▆▆▇▆▆▆▆▇▆▆▇▆▆

Run summary:
Discriminator loss,0.19292
Generator loss,3.63657
Inception Score (IS),2.08783


[34m[1mwandb[0m: Agent Starting Run: 2f7m15rv with config:
[34m[1mwandb[0m: 	nv_size: 300


0,1
Discriminator loss,█▁▁▁▂▄▅▄▄▄▄▇▅▄▄▄▄▃▃▃▃▃▂▅▃▃▇▃▄▄▃▃▅▃▅▅▃▃▃▇
Generator loss,▅▇▇█▁▁▂▂▂▁▂▂▂▁▁▁▁▁▁▂▁▁▁▁▁▁▂▁▂▁▁▁▂▁▂▂▁▁▁▁
Inception Score (IS),▁▁▁▁▃▄▆▇█▇▇▇▆▇█▇▆▇▅▅▇▆▅▇▇▇▇▆▆▆▅▆▇▇▇▆▇▆▇▇

Run summary:
Discriminator loss,0.28519
Generator loss,2.90507
Inception Score (IS),2.01914


[34m[1mwandb[0m: Sweep Agent: Waiting for job.
[34m[1mwandb[0m: Sweep Agent: Exiting.


In [20]:
# Training for nv_size = 400

wandb.agent(sweep_id, function=training_function, project='cudalab-assignment6-dcgan')

[34m[1mwandb[0m: Agent Starting Run: ci7nwru3 with config:
[34m[1mwandb[0m: 	nv_size: 400


0,1
Discriminator loss,█▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁
Generator loss,▄▁███▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▆▆▆▆▆▆▆▆
Inception Score (IS),▇▃▄▇▇▃▄▄▁▄▂▆▇▄▅▅▃▇▆▇▅▇▅█▄▅▃▂▄▅▅▃▆█▄▃▆█▆▃

Run summary:
Discriminator loss,0.0
Generator loss,40.85154
Inception Score (IS),1.02188


[34m[1mwandb[0m: Sweep Agent: Waiting for job.
[34m[1mwandb[0m: Sweep Agent: Exiting.


**Observations from the loss and Inception Score curves above**:

After running the training function several times, the following observations were made. (The curves for a given nv_size always differ somewhat from run to run.)

1. For small nv_size (10, 30), the generator loss increases for a few steps, decreases again at around 200 steps, and in most cases then stays in the same range (below 5). The discriminator loss consequently stays below 1 for the most part. Since both losses stay bounded and neither network overpowers the other, training works fairly well for these input dimensions.
2. As nv_size increases, the generator loss often (though not always) takes longer to converge. For nv_size = 100, for example, it rises above 40 and stays in that range for the next 2k steps before eventually converging. While the generator loss is elevated, the discriminator loss drops to almost zero and the Inception Score falls to about 1, which indicates that the generated images have lost variety, or that they are less clear and less meaningful. A higher score is better, as it means the DCGAN can generate many distinct images.
3. From an input dimensionality of roughly 300 upwards (observed at nv_size = 400 in the run above), the generator loss explodes and stays high throughout training, while the discriminator loss vanishes and the Inception Score stays low. This behaviour does not appear in every training run, but it gives a general picture: a higher input dimensionality eventually leads to persistently high generator loss, vanishing discriminator loss and a low Inception Score.
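
To make the link between a score of about 1 and lost variety concrete, the sketch below computes the Inception Score formula, exp of the mean KL divergence between p(y|x) and p(y), directly from a matrix of class probabilities. This is a toy illustration and not the `pytorch_gan_metrics` implementation: the real metric takes its probabilities from an Inception network with 1000 classes, whereas this example uses hand-made predictions over 10 classes.

```python
import numpy as np


def inception_score_from_probs(probs: np.ndarray, eps: float = 1e-12) -> float:
    """IS = exp(mean over images of KL(p(y|x) || p(y))) for rows of class probabilities."""
    marginal = probs.mean(axis=0, keepdims=True)  # p(y), averaged over all images
    kl = (probs * (np.log(probs + eps) - np.log(marginal + eps))).sum(axis=1)
    return float(np.exp(kl.mean()))


num_classes = 10
# Every image receives the same uninformative prediction -> no variety, no confidence
collapsed = np.full((100, num_classes), 1.0 / num_classes)
# Confident one-hot predictions spread evenly over all classes -> maximal score
diverse = np.eye(num_classes)[np.arange(100) % num_classes]

print(inception_score_from_probs(collapsed))  # approximately 1.0
print(inception_score_from_probs(diverse))    # approximately 10.0
```

The score is bounded below by 1 and above by the number of classes, so values near 1 mean the per-image predictions carry almost no information beyond the marginal.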

To summarize: as nv_size increases, the generator loss usually reaches a high value early in training and takes longer to converge. Once it has converged, the generated images are more sensible. Beyond a certain input dimensionality, however, the generator loss explodes and the discriminator loss vanishes.
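
As a rough reference for reading these curves, the binary cross-entropy values can be worked out by hand for two cases: a discriminator that outputs 0.5 for every image, and the final generator loss of 40.85 logged for nv_size = 400. The sketch below uses plain Python. The helper `bce` is written for this note and mirrors the natural-log definition used by `nn.BCELoss`.

```python
import math


def bce(prediction: float, target: float) -> float:
    """Binary cross-entropy of a single prediction (natural logarithm)."""
    return -(target * math.log(prediction) + (1 - target) * math.log(1 - prediction))


# Case 1: the discriminator cannot tell real from fake, D(x) = D(G(z)) = 0.5
d_loss_balanced = bce(0.5, 1.0) + bce(0.5, 0.0)  # real term + fake term = 2 ln 2
g_loss_balanced = bce(0.5, 1.0)                  # ln 2
print(round(d_loss_balanced, 4), round(g_loss_balanced, 4))  # 1.3863 0.6931

# Case 2: the logged generator loss is a batch mean of -log D(G(z)), so exp(-loss)
# is the geometric mean of the discriminator's outputs on the fake images
implied_prob = math.exp(-40.85154)
print(implied_prob)  # on the order of 1e-18
```

A generator loss near 40 therefore means the discriminator rejects the fake images with near certainty, which matches the discriminator loss of 0.0 in the same run.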

For comparison, here is my previous run, which shows high generator loss at nv_size = 300 (in the current sweep, a similar curve is observed at 400 instead):
https://wandb.ai/tarzilianams/cudalab-assignment6-dcgan/sweeps/abiotco1?workspace=user-tarzilianams