<div class="alert alert-block alert-info">
<b>Number of points for this notebook:</b> 2
<br>
<b>Deadline:</b> May 23, 2020 (Saturday) 23:00
</div>

# Exercise 11.1. Generative adversarial networks (GANs). DCGAN: Deep convolutional GAN

The goal of this exercise is to get familiar with generative adversarial networks and specifically DCGAN. The model was proposed by [Radford et al., 2015](https://arxiv.org/pdf/1511.06434.pdf).

DCGAN is one of the simplest GAN models and is relatively easy to train.

In [0]:
skip_training = True  # Set this flag to True before validation and submission

In [0]:
# During evaluation, this cell sets skip_training to True
# skip_training = True

In [0]:
import os
import numpy as np
import matplotlib.pyplot as plt
from IPython import display

import torch
import torchvision
import torch.nn as nn
from torch.nn import functional as F
from torchvision import transforms

import tools
import tests

In [35]:
# When running on your own computer, you can specify the data directory by:
# data_dir = tools.select_data_dir('/your/local/data/directory')
data_dir = tools.select_data_dir()

The data directory is ../data


In [0]:
device = torch.device('cuda:0')
# device = torch.device('cpu')

In [0]:
if skip_training:
    # The models are always evaluated on CPU
    device = torch.device("cpu")

# Data

We will use MNIST data in this exercise. Note that we re-scale images so that the pixel intensities are in the range [-1, 1].
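
As a quick sanity check of the re-scaling: `Normalize((0.5,), (0.5,))` computes `(x - mean) / std` per pixel, so the `[0, 1]` range produced by `ToTensor()` maps onto `[-1, 1]`. The sketch below reproduces that arithmetic in plain Python. The helper names `normalize` and `denormalize` are made up for illustration and are not part of torchvision.

```python
def normalize(x, mean=0.5, std=0.5):
    """Per-pixel arithmetic applied by transforms.Normalize: (x - mean) / std."""
    return (x - mean) / std


def denormalize(y):
    """Inverse mapping from [-1, 1] back to [0, 1] (hypothetical helper)."""
    return (y + 1) / 2


for pixel in (0.0, 0.25, 1.0):
    scaled = normalize(pixel)
    print(f"{pixel:.2f} -> {scaled:+.2f} -> {denormalize(scaled):.2f}")
```

The inverse mapping `(y + 1) / 2` is what brings generated images back to `[0, 1]` when a downstream tool expects that range.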

In [0]:
transform = transforms.Compose([
    transforms.ToTensor(),  # Transform to tensor
    transforms.Normalize((0.5,), (0.5,))  # Scale images to [-1, 1]
])

trainset = torchvision.datasets.MNIST(root=data_dir, train=True, download=True, transform=transform)

batch_size = 100
data_loader = torch.utils.data.DataLoader(trainset, batch_size=batch_size, shuffle=True)

# Generative adversarial networks

Our task is to train a generative model of the data, that is, a model from which we can draw samples whose distribution is similar to the distribution of the training data (MNIST digits in our case).

## Generator

The generative model that we are going to train is:
\begin{align}
z &\sim N(0, I)
\\
x &= G(z)
\end{align}
that is, the data is generated by applying a nonlinear transformation to samples drawn from the standard normal distribution.
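
To make the two sampling steps concrete, here is a minimal sketch that draws $z$ and pushes it through a stand-in for $G$. The network `toy_G` (a single transposed convolution followed by `tanh`) is purely hypothetical and only illustrates the tensor shapes; it is not the architecture used in this exercise.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

batch_size, nz = 4, 10

# z ~ N(0, I): one nz-dimensional standard normal sample per image.
# The trailing singleton dimensions let a transposed convolution consume z directly.
z = torch.randn(batch_size, nz, 1, 1)

# Hypothetical stand-in for G: one transposed convolution that expands 1x1 to 28x28.
toy_G = nn.Sequential(
    nn.ConvTranspose2d(nz, 1, kernel_size=28, bias=False),
    nn.Tanh(),
)

with torch.no_grad():
    x = toy_G(z)  # x = G(z)

print(x.shape)
```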

We are going to model $G$ with a deep neural network created below. In DCGAN, the generator is made of only transposed convolutional layers `ConvTranspose2d`.
The proposed architecture for the generator:
* `ConvTranspose2d` layer with `kernel_size=4`, `stride=2`, `4*ngf` output channels, no bias,
   followed by `BatchNorm2d` and ReLU
* `ConvTranspose2d` layer with `kernel_size=4`, `stride=2`, `2*ngf` output channels, no bias,
   followed by `BatchNorm2d` and ReLU
* `ConvTranspose2d` layer with `kernel_size=4`, `stride=2`, `ngf` output channels, no bias,
   followed by `BatchNorm2d` and ReLU
* `ConvTranspose2d` layer with `kernel_size=4`, `stride=2`, `nc` output channels, no bias,
   followed by `tanh`.
   
The `tanh` nonlinearity guarantees that the output is between -1 and 1, which matches our scaling of the training data.
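
The layer list above does not fix the padding, and the padding determines whether four stride-2 layers end at exactly 28×28. For `ConvTranspose2d` with `dilation=1` and `output_padding=0`, the output size is `(in - 1) * stride - 2 * padding + kernel_size`. The sketch below applies this formula layer by layer. The padding values `(1, 0, 0, 1)` are an assumption, one possible choice that maps a 1×1 input to 28×28, and are not part of the proposed architecture.

```python
def conv_transpose_out(size, kernel_size=4, stride=2, padding=0):
    """Output size of ConvTranspose2d with dilation=1 and output_padding=0."""
    return (size - 1) * stride - 2 * padding + kernel_size


size = 1  # z has spatial size 1x1
sizes = [size]
for padding in (1, 0, 0, 1):  # assumed padding, one value per layer
    size = conv_transpose_out(size, padding=padding)
    sizes.append(size)

print(sizes)  # [1, 2, 6, 14, 28]
```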

**The exact architecture is not tested in this assignment. If the description is incomplete, fill in the missing pieces as you see fit.**

In [0]:
class Generator(nn.Module):
    def __init__(self, nz=10, ngf=64, nc=1):
        """GAN generator.
        
        Args:
          nz:  Number of elements in the latent code.
          ngf: Base size (number of channels) of the generator layers.
          nc:  Number of channels in the generated images.
        """
        super(Generator, self).__init__()
        # YOUR CODE HERE
        self.conv1 = nn.ConvTranspose2d(in_channels=nz, out_channels=4*ngf, kernel_size=4, stride=2,padding=1, bias=False)
        self.conv2 = nn.ConvTranspose2d(in_channels=4*ngf, out_channels=2*ngf, kernel_size=4, stride=2, bias=False)
        self.conv3 = nn.ConvTranspose2d(in_channels=2*ngf, out_channels=ngf, kernel_size=4, stride=2, bias=False)
        self.conv4 = nn.ConvTranspose2d(in_channels=ngf, out_channels=nc, kernel_size=4, stride=2, padding=1, bias=False)
        
        self.bn1 = nn.BatchNorm2d(4*ngf)
        self.bn2 = nn.BatchNorm2d(2*ngf)
        self.bn3 = nn.BatchNorm2d(ngf)
        
        #raise NotImplementedError()

    def forward(self, z, verbose=False):
        """Generate images by transforming the given noise tensor.
        
        Args:
          z of shape (batch_size, nz, 1, 1): Tensor of noise samples. We use the last two singleton dimensions
                          so that we can feed z to the generator without reshaping.
          verbose (bool): Whether to print intermediate shapes (True) or not (False).
        
        Returns:
          out of shape (batch_size, nc, 28, 28): Generated images.
        """
        # YOUR CODE HERE
        x = F.relu(self.bn1(self.conv1(z)))  # (b, nz, 1, 1) -> (b, 4*ngf, 2, 2)
        x = F.relu(self.bn2(self.conv2(x)))  # (b, 4*ngf, 2, 2) -> (b, 2*ngf, 6, 6)
        x = F.relu(self.bn3(self.conv3(x)))  # (b, 2*ngf, 6, 6) -> (b, ngf, 14, 14)
        x = torch.tanh(self.conv4(x))  # (b, ngf, 14, 14) -> (b, nc, 28, 28)

        return x
        #raise NotImplementedError()

In [62]:
def test_Generator_shapes():
    nz = 10
    netG = Generator(nz, ngf=64, nc=1)

    batch_size = 32
    noise = torch.randn(batch_size, nz, 1, 1)
    out = netG(noise, verbose=True)

    assert out.shape == torch.Size([batch_size, 1, 28, 28]), f"Bad shape of out: out.shape={out.shape}"
    print('Success')

test_Generator_shapes()

Success


### Loss for training the generator

The generative model will be guided by a discriminator whose task is to separate (classify) data into two classes:
* true data (samples from the training set)
* generated data (samples generated by the generator).

In [0]:
# Establish convention for real and fake labels during training
real_label = 1
fake_label = 0

The task of the generator is to confuse the discriminator as much as possible, which is the case when the distribution produced by the generator perfectly replicates the data distribution.

In the cell below, you need to implement the loss function which is used to train the generator. The loss should be the `binary_cross_entropy` loss computed with `real_label` as targets for the generated samples.
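
With `real_label = 1` as the target, the binary cross-entropy of a discriminator output $D(G(z))$ reduces to $-\log D(G(z))$, so the generator is rewarded when the discriminator rates its samples as real. The plain-Python sketch below shows this on made-up discriminator outputs; the helper `bce` and the values in `d_out` are assumptions for illustration only.

```python
import math


def bce(prob, target):
    """Binary cross-entropy for a single prediction prob in (0, 1)."""
    return -(target * math.log(prob) + (1 - target) * math.log(1 - prob))


real_label = 1

# Hypothetical discriminator outputs D(G(z)) for three generated images.
d_out = [0.1, 0.5, 0.9]

per_sample = [bce(p, real_label) for p in d_out]
g_loss = sum(per_sample) / len(per_sample)  # mean over the batch

print([round(value, 4) for value in per_sample])
print(round(g_loss, 4))
```

The per-sample loss shrinks as the discriminator output approaches 1, and it equals $\log 2 \approx 0.693$ when the discriminator is undecided (output 0.5).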

In [0]:
def generator_loss(D, fake_images):
    """Loss computed to train the GAN generator.

    Args:
      D: The discriminator whose forward function takes inputs of shape (batch_size, nc, 28, 28)
         and produces outputs of shape (batch_size, 1).
      fake_images of shape (batch_size, nc, 28, 28): Fake images produced by the generator.

    Returns:
      loss: The mean of the binary cross-entropy losses computed for all the samples in the batch.

    Notes:
    - Make sure that you process on the device given by `fake_images.device`.
    - Use values of global variables `real_label`, `fake_label` to produce the right targets.
    """
    # YOUR CODE HERE
    d_out = D(fake_images)
    # full_like copies shape, dtype and device from the discriminator output
    targets = torch.full_like(d_out, real_label)
    loss = F.binary_cross_entropy(d_out, targets)
    return loss
    #raise NotImplementedError()

In [65]:
tests.test_generator_loss(generator_loss)

loss: tensor(1.0730)
expected: tensor(1.0730)
Success


## Discriminator

In DCGAN, the discriminator is a stack of only convolutional layers.

The proposed architecture for the discriminator:
* `Conv2d` layer with `kernel_size=4`, `stride=2`, `ndf` output channels, no bias,
   followed by LeakyReLU(0.2)
* `Conv2d` layer with `kernel_size=4`, `stride=2`, `2*ndf` output channels, no bias,
   followed by LeakyReLU(0.2)
* `Conv2d` layer with `kernel_size=4`, `stride=2`, `4*ndf` output channels, no bias,
   followed by LeakyReLU(0.2)
* `Conv2d` layer with `kernel_size=4`, `stride=2`, 1 output channel, no bias,
   followed by `sigmoid`.

**The exact architecture is not tested in this assignment. If the description is incomplete, fill in the missing pieces as you see fit.**
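
As with the generator, the padding is left open, and it decides whether four stride-2 convolutions reduce a 28×28 image to a single value. For `Conv2d` with `dilation=1`, the output size is `floor((in + 2 * padding - kernel_size) / stride) + 1`. The sketch below traces the spatial size through the stack. The padding values `(1, 0, 0, 1)` are an assumption, chosen so that the final feature map is 1×1.

```python
def conv_out(size, kernel_size=4, stride=2, padding=0):
    """Output size of Conv2d with dilation=1 (floor division, as in PyTorch)."""
    return (size + 2 * padding - kernel_size) // stride + 1


size = 28  # MNIST images are 28x28
sizes = [size]
for padding in (1, 0, 0, 1):  # assumed padding, one value per layer
    size = conv_out(size, padding=padding)
    sizes.append(size)

print(sizes)  # [28, 14, 6, 2, 1]
```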

In [0]:
class Discriminator(nn.Module):
    def __init__(self, nc=1, ndf=64):
        """GAN discriminator.
        
        Args:
          nc:  Number of channels in images.
          ndf: Base size (number of channels) of the discriminator layers.
        """
        # YOUR CODE HERE
        super(Discriminator, self).__init__()
        self.conv1 = nn.Conv2d(nc, ndf, 4, stride=2, padding=1, bias=False)
        self.conv2 = nn.Conv2d(ndf, 2*ndf, 4, stride=2, bias=False)
        self.conv3 = nn.Conv2d(2*ndf, 4*ndf, 4, stride=2, bias=False)
        self.conv4 = nn.Conv2d(4*ndf, 1, 4, stride=2, padding=1, bias=False)  # one output channel: a single score per image
        
        self.bn1 = nn.BatchNorm2d(ndf)
        self.bn2 = nn.BatchNorm2d(2*ndf)
        self.bn3 = nn.BatchNorm2d(4*ndf)
        
        self.l_relu = nn.LeakyReLU(0.2)
        #raise NotImplementedError()

    def forward(self, x, verbose=False):
        """Classify given images into real/fake.
        
        Args:
          x of shape (batch_size, 1, 28, 28): Images to be classified.
        
        Returns:
          out of shape (batch_size,): Probabilities that images are real. All elements should be between 0 and 1.
        """
        # YOUR CODE HERE
        x = self.l_relu(self.bn1(self.conv1(x))) #b,nc,28,28 -> b,ndf,14,14
        x = self.l_relu(self.bn2(self.conv2(x))) #b,ndf,14,14 -> b,2*ndf,6,6
        x = self.l_relu(self.bn3(self.conv3(x))) #b,2*ndf,6,6 -> b,4*ndf,2,2
        x = torch.sigmoid(self.conv4(x))  # (b, 4*ndf, 2, 2) -> (b, 1, 1, 1)
        return x.reshape(-1)  # (b,); unlike squeeze(), this keeps the batch dimension when b == 1
        
        #raise NotImplementedError()

In [67]:
def test_Discriminator_shapes():
    batch_size = 32
    netD = Discriminator(nc=1, ndf=64)

    images = torch.ones(32, 1, 28, 28)
    out = netD(images, verbose=True)

    assert out.shape == torch.Size([batch_size]), f"Bad shape of out: out.shape={out.shape}"
    print('Success')

test_Discriminator_shapes()

Success


### Loss for training the discriminator

The discriminator is trained to solve a binary classification problem: to separate real data from generated samples. Thus, the output of the discriminator should be a scalar between 0 and 1.

You need to implement the loss function used to train the discriminator. The discriminator is trained with the `binary_cross_entropy` loss, using `real_label` as targets for real samples and `fake_label` as targets for generated samples.
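
The two loss terms can be illustrated on made-up discriminator outputs. This is only a sketch: the values in `d_real` and `d_fake` are hypothetical, and `torch.full_like` is one convenient way (an assumption, not a requirement of the exercise) to build targets that match the discriminator output in shape, dtype and device.

```python
import torch
from torch.nn import functional as F

real_label, fake_label = 1, 0

# Hypothetical discriminator outputs (probabilities that an image is real).
d_real = torch.tensor([0.9, 0.8, 0.7])  # D(x) for real images
d_fake = torch.tensor([0.1, 0.3, 0.2])  # D(G(z)) for generated images

# Real samples are pushed towards 1, generated samples towards 0.
loss_real = F.binary_cross_entropy(d_real, torch.full_like(d_real, real_label))
loss_fake = F.binary_cross_entropy(d_fake, torch.full_like(d_fake, fake_label))

print(loss_real.item(), loss_fake.item())
print(d_real.mean().item(), d_fake.mean().item())
```

Here `loss_real` equals the mean of $-\log D(x)$ and `loss_fake` equals the mean of $-\log(1 - D(G(z)))$; both are scalar tensors.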

In [0]:
def discriminator_loss(D, real_images, fake_images):
    """Loss computed to train the GAN discriminator.

    Args:
      D: The discriminator.
      real_images of shape (batch_size, nc, 28, 28): Real images.
      fake_images of shape (batch_size, nc, 28, 28): Fake images produced by the generator.

    Returns:
      d_loss_real: The mean of the binary cross-entropy losses computed on the real_images.
      D_real: Mean output of the discriminator for real_images. This is useful for tracking convergence.
      d_loss_fake: The mean of the binary cross-entropy losses computed on the fake_images.
      D_fake: Mean output of the discriminator for fake_images. This is useful for tracking convergence.

    Notes:
    - Make sure that you process on the device given by `fake_images.device`.
    - Use values of global variables `real_label`, `fake_label` to produce the right targets.
    """
    # YOUR CODE HERE
    d_fake = D(fake_images)
    d_real = D(real_images)
    # full_like copies shape, dtype and device from the discriminator outputs
    targets_fake = torch.full_like(d_fake, fake_label)
    targets_real = torch.full_like(d_real, real_label)

    loss_fake = F.binary_cross_entropy(d_fake, targets_fake)
    loss_real = F.binary_cross_entropy(d_real, targets_real)
    return loss_real, d_real.mean().item(), loss_fake, d_fake.mean().item()
    #raise NotImplementedError()

In [69]:
def test_discriminator_loss():
    netD = Discriminator(nc=1, ndf=64)
    real_images = fake_images = torch.ones(32, 1, 28, 28)

    d_loss_real, D_real, d_loss_fake, D_fake = discriminator_loss(netD, real_images, fake_images)
    assert d_loss_real.shape == torch.Size([]), "d_loss_real should be a scalar tensor."
    assert 0 < D_real < 1, "D_real should be a scalar between 0 and 1."
    assert d_loss_fake.shape == torch.Size([]), "d_loss_fake should be a scalar tensor."
    assert 0 < D_fake < 1, "D_fake should be a scalar between 0 and 1."
    print('Success')

test_discriminator_loss()

Success


In [70]:
tests.test_discriminator_loss(discriminator_loss)

d_loss_real: tensor(0.3635)
expected d_loss_real: tensor(0.3635)
D_real: 0.699999988079071
expected D_real: 0.699999988079071
d_loss_fake: 0.22839301824569702
expected d_loss_fake: tensor(0.2284)
D_fake: 0.20000000298023224
expected D_fake: 0.20000000298023224
Success


# Training GANs

We will now train a GAN. To assess the quality of the generated samples, we will use a simple scorer loaded in the cell below.

In [71]:
from scorer import Scorer
scorer = Scorer()
scorer.to(device)

Sequential(
  (fc1): Linear(in_features=784, out_features=256, bias=True)
  (relu1): ReLU()
  (drop1): Dropout(p=0.2, inplace=False)
  (fc2): Linear(in_features=256, out_features=256, bias=True)
  (relu2): ReLU()
  (drop2): Dropout(p=0.2, inplace=False)
  (out): Linear(in_features=256, out_features=10, bias=True)
)


Scorer(
  (model): MLP(
    (model): Sequential(
      (fc1): Linear(in_features=784, out_features=256, bias=True)
      (relu1): ReLU()
      (drop1): Dropout(p=0.2, inplace=False)
      (fc2): Linear(in_features=256, out_features=256, bias=True)
      (relu2): ReLU()
      (drop2): Dropout(p=0.2, inplace=False)
      (out): Linear(in_features=256, out_features=10, bias=True)
    )
  )
)

In [0]:
# Create the network
nz = 10
netG = Generator(nz=nz, ngf=64, nc=1)
netD = Discriminator(nc=1, ndf=64)

netD = netD.to(device)
netG = netG.to(device)

### Training loop

Implement the training loop in the cell below. The recommended hyperparameters:
* Optimizer of the discriminator: Adam with learning rate 0.0002 and `betas=(0.5, 0.999)`
* Optimizer of the generator:     Adam with learning rate 0.0002 and `betas=(0.5, 0.999)`

Hints:
- We will use the scorer defined above to assess the quality of the generated samples. The desired score of 0.7 should be reached within 15-20 epochs.
- You can use the following code to track the training progress. The code plots some generated images and computes the score that we use to evaluate the trained model. Note that the images fed to the scorer need to be normalized to be in the range [0, 1].
```
with torch.no_grad():
    # Plot generated images
    z = torch.randn(144, nz, 1, 1, device=device)
    samples = netG(z)
    tools.plot_generated_samples(samples)
    
    # Compute score
    z = torch.randn(1000, nz, 1, 1, device=device)
    samples = netG(z)
    samples = (samples + 1) / 2  # Re-normalize to [0, 1]
    score = scorer(samples)
```
- The quality of the generated images should be good (better than with the PixelCNN model).
- You can track `D_real` and `D_fake` returned by function `discriminator_loss()`. When it is hard for the discriminator to separate real and fake images, their values are close to 0.5.
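
One common way to organize a GAN training iteration is to alternate a discriminator update and a generator update. The sketch below runs a single such iteration on CPU. The tiny fully connected networks `G` and `D` and the random `real_images` batch are hypothetical stand-ins so that the snippet is self-contained; they are not the DCGAN models or the MNIST data of this exercise.

```python
import torch
import torch.nn as nn
from torch.nn import functional as F

torch.manual_seed(0)
nz, batch_size = 10, 8

# Tiny hypothetical stand-ins for the generator and the discriminator.
G = nn.Sequential(nn.Flatten(), nn.Linear(nz, 28 * 28), nn.Tanh(),
                  nn.Unflatten(1, (1, 28, 28)))
D = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 1), nn.Sigmoid(),
                  nn.Flatten(0))  # output shape: (batch_size,)

d_optim = torch.optim.Adam(D.parameters(), lr=0.0002, betas=(0.5, 0.999))
g_optim = torch.optim.Adam(G.parameters(), lr=0.0002, betas=(0.5, 0.999))

# Placeholder "real" batch with values in [-1, 1].
real_images = torch.rand(batch_size, 1, 28, 28) * 2 - 1

# Discriminator step: real images -> 1, generated images -> 0.
d_optim.zero_grad()
z = torch.randn(batch_size, nz, 1, 1)
fake_images = G(z).detach()  # detach: no gradients flow into G here
d_real, d_fake = D(real_images), D(fake_images)
d_loss = (F.binary_cross_entropy(d_real, torch.ones_like(d_real))
          + F.binary_cross_entropy(d_fake, torch.zeros_like(d_fake)))
d_loss.backward()
d_optim.step()

# Generator step: try to make D output 1 for freshly generated images.
g_optim.zero_grad()
z = torch.randn(batch_size, nz, 1, 1)
d_out = D(G(z))
g_loss = F.binary_cross_entropy(d_out, torch.ones_like(d_out))
g_loss.backward()  # also fills D's gradients; they are cleared by the next d_optim.zero_grad()
g_optim.step()

print(d_loss.item(), g_loss.item(), d_real.mean().item(), d_fake.mean().item())
```

In a full loop, the same two steps would run for every mini-batch, with the batch size of the noise tensor taken from the current batch of real images.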

In [0]:
### plt.imshow(tmp[0].permute(1,2,0), cmap='gray')
# transforms.ToPILImage()(tmp[5])

In [79]:
import time
if not skip_training:
    # YOUR CODE HERE
    d_optim = torch.optim.Adam(params=netD.parameters(), lr=0.0002, betas=(0.5, 0.999))
    g_optim = torch.optim.Adam(params=netG.parameters(), lr=0.0002, betas=(0.5, 0.999))
    for epoch in range(15):
        running_d_loss_real = []
        running_d_loss_fake = []
        running_g_loss = []
        running_d_acc_real = []
        running_d_acc_fake = []
        start = time.time()
        for i, (real_images, labels) in enumerate(data_loader):
            real_images, labels = real_images.to(device), labels.to(device)
            
            d_optim.zero_grad()  # g_optim.zero_grad() is called below, right before the generator step
            netD.train()
            netG.train()
            
            # Generate fake images (detached, so no gradients flow into the generator here)
            z = torch.randn(real_images.size(0), nz, 1, 1, device=device)
            fake_images = netG(z).detach()
            
            #calculate loss
            d_loss_real, mean_real, d_loss_fake, mean_fake = discriminator_loss(netD, real_images, fake_images)
            
            #train discriminator
            d_loss = d_loss_fake + d_loss_real
            d_loss.backward()
            d_optim.step()
            
            # Train generator on a fresh batch of noise (same size as the current real batch)
            g_optim.zero_grad()
            z = torch.randn(real_images.size(0), nz, 1, 1, device=device)
            gen_images = netG(z)
            g_loss = generator_loss(netD, gen_images)
            g_loss.backward()
            g_optim.step()
            
            running_d_loss_real.append(d_loss_real.item())
            running_d_loss_fake.append(d_loss_fake.item())
            running_d_acc_real.append(mean_real)
            running_d_acc_fake.append(mean_fake)
            running_g_loss.append(g_loss.item())
            if i%100==0: print(i, end=" ")

        end = time.time()
        # D_real / D_fake are mean discriminator outputs, not accuracies
        print(f"\n{epoch} d_loss_real:{np.mean(running_d_loss_real):.4f} "
              f"d_loss_fake:{np.mean(running_d_loss_fake):.4f} "
              f"D_fake:{np.mean(running_d_acc_fake):.4f} D_real:{np.mean(running_d_acc_real):.4f} "
              f"g_loss:{np.mean(running_g_loss):.4f} time:{end - start:.1f}s")

        with torch.no_grad():
            # Plot generated images
            z = torch.randn(144, nz, 1, 1, device=device)
            samples = netG(z)
            tools.plot_generated_samples(samples)

            # Compute score
            z = torch.randn(1000, nz, 1, 1, device=device)
            samples = netG(z)
            samples = (samples + 1) / 2  # Re-normalize to [0, 1]
            score = scorer(samples)
            print("score:", score)

    #raise NotImplementedError()


In [80]:
# Save the model to disk (the pth-files will be submitted automatically together with your notebook)
if not skip_training:
    tools.save_model(netG, '11_dcgan_g.pth')
    tools.save_model(netD, '11_dcgan_d.pth')
else:
    nz = 10
    netG = Generator(nz=nz, ngf=64, nc=1)
    netD = Discriminator(nc=1, ndf=64)

    tools.load_model(netG, '11_dcgan_g.pth', device)
    tools.load_model(netD, '11_dcgan_d.pth', device)

Do you want to save the model (type yes to confirm)? yes
Model saved to 11_dcgan_g.pth.
Do you want to save the model (type yes to confirm)? yes
Model saved to 11_dcgan_d.pth.


In [81]:
# Evaluate generated samples
with torch.no_grad():
    z = torch.randn(1000, nz, 1, 1, device=device)
    samples = (netG(z) + 1) / 2
    score = scorer(samples)

print(f'The trained DCGAN achieves a score of {score:.5f}')
assert score >= 0.7, "Poor GAN score! Check your architecture and training."
print('Success')

The trained DCGAN achieves a score of 0.71475
Success
