# Deep Convolutional GANs

In this notebook, you'll build a GAN using convolutional layers in the generator and discriminator. This is called a Deep Convolutional GAN, or DCGAN for short. The DCGAN architecture was introduced in a paper first posted in 2015 and has produced impressive results in generating new images; you can read the [original paper](https://arxiv.org/pdf/1511.06434.pdf).

You'll be training a DCGAN on the [CIFAR10](https://www.cs.toronto.edu/~kriz/cifar.html) dataset. It consists of 32x32 color images from 10 classes, such as airplanes, dogs, and trucks. This dataset is much more complex and diverse than MNIST, which justifies the use of the DCGAN architecture.

<img src='assets/cifar10_data.png' width=80% />


So, our goal is to create a DCGAN that can generate new, realistic-looking images. We'll go through the following steps to do this:
* Load in and pre-process the CIFAR10 dataset
* **Define discriminator and generator networks**
* Train these adversarial networks
* Visualize the loss over time and some sample, generated images

In this notebook, we will focus on defining the networks.

#### Deeper Convolutional Networks

Since this dataset is more complex than our MNIST data, we'll need a deeper network to accurately identify patterns in these images and be able to generate new ones. Specifically, we'll use a series of convolutional or transpose convolutional layers in the discriminator and generator. It's also necessary to use batch normalization to get these convolutional networks to train. 

Besides these changes in network structure, training the discriminator and generator networks should be the same as before. That is, the discriminator will alternate training on real and fake (generated) images, and the generator will aim to trick the discriminator into thinking that its generated images are real!

## Discriminator

Here you'll build the discriminator. This is a convolutional classifier like you've built before, only without any maxpooling layers. 
* The inputs to the discriminator are 32x32x3 tensor images
* You'll want a few convolutional, hidden layers
* Then a fully connected layer for the output; as before, we want a sigmoid output, but we'll add that in the loss function, [BCEWithLogitsLoss](https://pytorch.org/docs/stable/nn.html#bcewithlogitsloss), later
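
To illustrate why the discriminator can output a raw logit, here is a minimal sketch comparing `BCEWithLogitsLoss` on logits with `BCELoss` on sigmoid outputs. The tensors `logits` and `targets` are stand-ins invented for this example, not values from the notebook.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical stand-ins: raw discriminator outputs for 8 images, all labeled "real" (1).
logits = torch.randn(8, 1)
targets = torch.ones(8, 1)

# Option 1: the loss applies the sigmoid internally (numerically more stable).
loss_from_logits = nn.BCEWithLogitsLoss()(logits, targets)

# Option 2: apply the sigmoid explicitly, then use plain binary cross-entropy.
loss_from_probs = nn.BCELoss()(torch.sigmoid(logits), targets)

# Both routes compute the same quantity up to floating-point error.
same = torch.allclose(loss_from_logits, loss_from_probs, atol=1e-6)
print(same)
```

Because the sigmoid lives inside the loss, adding an extra `nn.Sigmoid()` at the end of the discriminator would apply it twice.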

<img src='assets/conv_discriminator.png' width=80%/>

For the depths of the convolutional layers, I suggest starting with 32 filters in the first layer and doubling the depth with each added layer (to 64, 128, etc.). Note that the DCGAN paper does all of its downsampling with strided convolutional layers and no maxpooling layers.

You'll also want to use batch normalization with [nn.BatchNorm2d](https://pytorch.org/docs/stable/nn.html#batchnorm2d) on each layer **except** the first convolutional layer and final, linear output layer. 

#### Helper `ConvBlock` module 

In general, each layer should look something like convolution > batch norm > leaky ReLU, so we'll define a **custom torch Module** to put these layers together. This module will create a sequential series of a convolutional layer, an optional batch norm layer, and a leaky ReLU activation.

Note: It is also suggested that you use a **kernel_size of 4** and a **stride of 2** for strided convolutions.
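
As a rough guide to how a strided convolution changes the spatial size, the sketch below applies the standard output-size formula `floor((n + 2p - k) / s) + 1`. The helper name `conv_output_size` and the choice of `padding=1` are assumptions made for this illustration; the note above only fixes the kernel size and stride.

```python
def conv_output_size(size: int, kernel_size: int, stride: int = 1, padding: int = 0) -> int:
    """Spatial output size of a convolution (dilation 1) along one dimension."""
    return (size + 2 * padding - kernel_size) // stride + 1


# With kernel_size=4, stride=2 and (assumed) padding=1, each layer halves the size.
sizes = [32]
for _ in range(3):
    sizes.append(conv_output_size(sizes[-1], kernel_size=4, stride=2, padding=1))
print(sizes)  # [32, 16, 8, 4]

# Without padding, the same layer maps 32 to 15 instead of 16.
print(conv_output_size(32, kernel_size=4, stride=2))  # 15
```

Tracking these sizes layer by layer tells you how many features reach the final linear layer.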

#### First exercise

Implement the `ConvBlock` module below and use it for your implementation of the `Discriminator` module. Your discriminator should take a 32x32x3 image as input and output a single logit.

In [1]:
import torch
import torch.nn as nn

import tests

In [2]:
class ConvBlock(nn.Module):
    """
    A convolutional block is made of up to 3 layers: Conv -> BatchNorm (optional) -> LeakyReLU.
    args:
    - in_channels: number of channels in the input to the conv layer
    - out_channels: number of filters in the conv layer
    - kernel_size: filter dimension of the conv layer
    - batch_norm: whether to use batch norm or not
    """
    def __init__(self, in_channels: int, out_channels: int, kernel_size: int, batch_norm: bool = True):
        super(ConvBlock, self).__init__()
        ####
        # IMPLEMENT HERE
        ####
        if batch_norm:
            # Strided convolution (stride 2, no padding) downsamples in place of maxpooling.
            self.model = nn.Sequential(
                nn.Conv2d(in_channels, out_channels, kernel_size=kernel_size, stride=2),
                nn.BatchNorm2d(out_channels),
                nn.LeakyReLU(inplace=True),
            )
        else:
            # Without batch norm, the convolution keeps PyTorch's default stride of 1.
            self.model = nn.Sequential(
                nn.Conv2d(in_channels, out_channels, kernel_size=kernel_size),
                nn.LeakyReLU(inplace=True),
            )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        ####
        # IMPLEMENT HERE
        ####
        return self.model(x)

In [3]:
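class Discriminator(nn.Module):
    """
    The discriminator model adapted from the DCGAN paper. It should only contain a few layers.
    args:
    - conv_dim: control the number of filters
    """
    def __init__(self, conv_dim: int):
        super(Discriminator, self).__init__()
        ####
        # IMPLEMENT HERE
        # Filter counts are hard-coded here (32, 64, 128, 256); conv_dim is not used.
        # Spatial size per layer for a 32x32 input: 32 -> 29 -> 14 -> 6 -> 2.
        self.model = nn.Sequential(
            ConvBlock(3, 32, 4, False),  # no batch norm on the first layer
            ConvBlock(32, 64, 3),
            ConvBlock(64, 128, 4),
            ConvBlock(128, 256, 4),
            nn.Flatten(),                # 256 * 2 * 2 = 1024 features
            nn.Linear(1024, 1),          # single logit; BCEWithLogitsLoss applies the sigmoid
        )
        ####

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        ####
        # IMPLEMENT HERE
        ####
        return self.model(x)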

In [4]:
discriminator = Discriminator(64)
print(discriminator)

Discriminator(
  (model): Sequential(
    (0): ConvBlock(
      (model): Sequential(
        (0): Conv2d(3, 32, kernel_size=(4, 4), stride=(1, 1))
        (1): LeakyReLU(negative_slope=0.01, inplace=True)
      )
    )
    (1): ConvBlock(
      (model): Sequential(
        (0): Conv2d(32, 64, kernel_size=(3, 3), stride=(2, 2))
        (1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (2): LeakyReLU(negative_slope=0.01, inplace=True)
      )
    )
    (2): ConvBlock(
      (model): Sequential(
        (0): Conv2d(64, 128, kernel_size=(4, 4), stride=(2, 2))
        (1): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (2): LeakyReLU(negative_slope=0.01, inplace=True)
      )
    )
    (3): ConvBlock(
      (model): Sequential(
        (0): Conv2d(128, 256, kernel_size=(4, 4), stride=(2, 2))
        (1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (2): LeakyReLU(

In [5]:
tests.check_discriminator(discriminator)

Congrats, you successfully implemented your discriminator


## Generator

Next, you'll build the generator network. The input will be our noise vector `z`, as before. The output will pass through a `tanh` activation, but this time it will have size 32x32, which is the size of our CIFAR10 images.

<img src='assets/conv_generator.png' width=80% />

What's new here is that we'll use transpose convolutional layers to create our new images.
* The first layer is a fully connected layer whose output is reshaped into a deep and narrow layer, something like 4x4x512.
* Then, we use batch normalization and a leaky ReLU activation.
* Next is a series of [transpose convolutional layers](https://pytorch.org/docs/stable/nn.html#convtranspose2d), where you typically halve the depth and double the width and height of the previous layer.
* We'll apply batch normalization and ReLU to all but the last of these layers; to the last one, we'll apply only a `tanh` activation.
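
The first two steps in the list above can be sketched as follows: a fully connected layer projects the latent vector, the result is reshaped into a 512-channel 4x4 feature map, and one transpose convolution halves the depth while doubling the width and height. The batch size of 16 and `padding=1` are assumptions chosen for this illustration.

```python
import torch
import torch.nn as nn

latent_dim = 128
z = torch.randn(16, latent_dim)  # hypothetical batch of 16 latent vectors

# Fully connected projection, then reshape to (batch, channels, height, width).
fc = nn.Linear(latent_dim, 512 * 4 * 4)
x = fc(z).view(-1, 512, 4, 4)
print(x.shape)  # torch.Size([16, 512, 4, 4])

# One transpose convolution: depth 512 -> 256, spatial size 4x4 -> 8x8.
upsample = nn.ConvTranspose2d(512, 256, kernel_size=4, stride=2, padding=1)
y = upsample(x)
print(y.shape)  # torch.Size([16, 256, 8, 8])
```

Note that PyTorch stores images as channels-first, so the "4x4x512" layer appears as shape `(512, 4, 4)` per sample.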

#### Helper `DeconvBlock` module

For each of these layers, the general scheme is transpose convolution > batch norm > ReLU, so we'll define a module to put these layers together. This module will create a sequential series of a transpose convolutional layer, an optional batch norm layer, and an activation. We'll create these using PyTorch's Sequential container, which takes in a list of layers and applies them in the order they are passed to the Sequential constructor.

Note: It is also suggested that you use a **kernel_size of 4** and a **stride of 2** for transpose convolutions.
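
For transpose convolutions, the spatial output size follows `(n - 1) * s - 2p + k` when dilation is 1 and there is no output padding. The sketch below uses a hypothetical helper, `deconv_output_size`, and assumes `padding=1`, which the note above does not specify.

```python
def deconv_output_size(size: int, kernel_size: int, stride: int = 1, padding: int = 0) -> int:
    """Spatial output size of a transpose convolution (dilation 1, no output padding)."""
    return (size - 1) * stride - 2 * padding + kernel_size


# With kernel_size=4, stride=2 and (assumed) padding=1, each layer doubles the size.
sizes = [4]
for _ in range(3):
    sizes.append(deconv_output_size(sizes[-1], kernel_size=4, stride=2, padding=1))
print(sizes)  # [4, 8, 16, 32]

# With padding=0, the same layer maps 4 to 10 rather than 8.
print(deconv_output_size(4, kernel_size=4, stride=2))  # 10
```

Three such layers are enough to go from a 4x4 feature map to a 32x32 image.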

#### Second exercise

Implement the `DeconvBlock` module below and use it for your implementation of the `Generator` module. Your generator should take a latent vector of dimension 128 as input and output a 32x32x3 image.

In [14]:
class DeconvBlock(nn.Module):
    """
    A "de-convolutional" block is made of up to 3 layers: ConvTranspose -> BatchNorm -> Activation.
    With batch norm, the activation is a leaky ReLU; without it (output layer), it is a tanh.
    args:
    - in_channels: number of channels in the input to the conv layer
    - out_channels: number of filters in the conv layer
    - kernel_size: filter dimension of the conv layer
    - stride: stride of the conv layer
    - padding: padding of the conv layer
    - batch_norm: whether to use batch norm or not
    """
    def __init__(self, 
                 in_channels: int, 
                 out_channels: int, 
                 kernel_size: int, 
                 stride: int,
                 padding: int,
                 batch_norm: bool = True):
        ####
        # IMPLEMENT HERE
        super(DeconvBlock, self).__init__()
        if batch_norm:
            # Hidden block: transpose convolution -> batch norm -> leaky ReLU.
            self.model = nn.Sequential(
                nn.ConvTranspose2d(in_channels, out_channels, kernel_size, stride=stride, padding=padding),
                nn.BatchNorm2d(out_channels),
                nn.LeakyReLU(inplace=True),
            )
        else:
            # Output block: transpose convolution -> tanh, so values lie in [-1, 1].
            self.model = nn.Sequential(
                nn.ConvTranspose2d(in_channels, out_channels, kernel_size, stride=stride, padding=padding),
                nn.Tanh(),
            )
        ####
        
    def forward(self, x: torch.Tensor) -> torch.Tensor:
        ####
        # IMPLEMENT HERE
        ####
        return self.model(x)

In [15]:
class Generator(nn.Module):
    """
    The generator model adapted from DCGAN
    args:
    - latent_dim: dimension of the latent vector
    - conv_dim: control the number of filters in the convtranspose layers
    """
    def __init__(self, latent_dim: int, conv_dim: int = 32):
        super(Generator, self).__init__()
        ####
        # IMPLEMENT HERE
        ####
        # Project the latent vector to 512 * 4 * 4 = 8192 features.
        self.fc1 = nn.Linear(latent_dim, 8192)
        self.norm1 = nn.BatchNorm1d(8192)
        self.lrelu = nn.LeakyReLU()
        # Filter counts are hard-coded here; conv_dim is not used.
        # Spatial size per layer: 4 -> 10 -> 22 -> 25 -> 28 -> 31 -> 32.
        self.tconv1 = DeconvBlock(512, 256, 4, 2, 0, True)
        self.tconv2 = DeconvBlock(256, 128, 4, 2, 0, True)
        self.tconv3 = DeconvBlock(128, 64, 4, 1, 0, True)
        self.tconv4 = DeconvBlock(64, 32, 4, 1, 0, True)
        self.tconv5 = DeconvBlock(32, 16, 4, 1, 0, True)
        self.tconv6 = DeconvBlock(16, 3, 4, 1, 1, False)  # output layer: tanh, no batch norm
        
    def forward(self, x: torch.Tensor) -> torch.Tensor:
        ####
        # IMPLEMENT HERE
        ####
        x = self.fc1(x)
        x = self.norm1(x)
        x = self.lrelu(x)
        # Reshape the flat features into a 512-channel 4x4 feature map.
        x = x.view(x.size(0), 512, 4, 4)
        x = self.tconv1(x)
        x = self.tconv2(x)
        x = self.tconv3(x)
        x = self.tconv4(x)
        x = self.tconv5(x)
        x = self.tconv6(x)
        
        return x

In [16]:
generator = Generator(128)
print(generator)

Generator(
  (fc1): Linear(in_features=128, out_features=8192, bias=True)
  (norm1): BatchNorm1d(8192, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
  (lrelu): LeakyReLU(negative_slope=0.01)
  (tconv1): DeconvBlock(
    (model): Sequential(
      (0): ConvTranspose2d(512, 256, kernel_size=(4, 4), stride=(2, 2))
      (1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (2): LeakyReLU(negative_slope=0.01, inplace=True)
    )
  )
  (tconv2): DeconvBlock(
    (model): Sequential(
      (0): ConvTranspose2d(256, 128, kernel_size=(4, 4), stride=(2, 2))
      (1): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (2): LeakyReLU(negative_slope=0.01, inplace=True)
    )
  )
  (tconv3): DeconvBlock(
    (model): Sequential(
      (0): ConvTranspose2d(128, 64, kernel_size=(4, 4), stride=(1, 1))
      (1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (2): LeakyReL

In [17]:
tests.check_generator(generator, 128)

RuntimeError: mat1 and mat2 shapes cannot be multiplied (2048x1 and 128x8192)
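
The shapes in this error (`2048x1` against `128x8192`) suggest that the test passes a latent tensor of shape `(batch, latent_dim, 1, 1)`, for example 16 x 128 x 1 x 1, so the linear layer sees a last dimension of 1 instead of 128. That reading is an inference from the message, not something confirmed by the `tests` module. Under that assumption, one possible fix is to flatten the latent tensor before the fully connected layer. The sketch below is a hypothetical variant named `GeneratorSketch`, not the reference solution; it also uses `padding=1` so that each transpose convolution doubles the spatial size.

```python
import torch
import torch.nn as nn


class GeneratorSketch(nn.Module):
    """Hypothetical generator that accepts (batch, latent_dim) or (batch, latent_dim, 1, 1)."""

    def __init__(self, latent_dim: int):
        super().__init__()
        self.fc = nn.Linear(latent_dim, 512 * 4 * 4)
        self.norm = nn.BatchNorm1d(512 * 4 * 4)
        self.act = nn.LeakyReLU()
        self.deconvs = nn.Sequential(
            nn.ConvTranspose2d(512, 256, kernel_size=4, stride=2, padding=1),  # 4 -> 8
            nn.BatchNorm2d(256),
            nn.ReLU(),
            nn.ConvTranspose2d(256, 128, kernel_size=4, stride=2, padding=1),  # 8 -> 16
            nn.BatchNorm2d(128),
            nn.ReLU(),
            nn.ConvTranspose2d(128, 3, kernel_size=4, stride=2, padding=1),    # 16 -> 32
            nn.Tanh(),
        )

    def forward(self, z: torch.Tensor) -> torch.Tensor:
        # Flatten any trailing singleton dimensions: (B, D, 1, 1) -> (B, D).
        z = z.view(z.size(0), -1)
        x = self.act(self.norm(self.fc(z)))
        x = x.view(x.size(0), 512, 4, 4)
        return self.deconvs(x)


z = torch.randn(16, 128, 1, 1)  # assumed test input shape
images = GeneratorSketch(latent_dim=128)(z)
print(images.shape)  # torch.Size([16, 3, 32, 32])
```

The same flattening line could be added at the top of the `forward` method of the `Generator` above if the test input does have this shape.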