# DCGAN
A DCGAN is a direct extension of the GAN that uses convolutional and convolutional-transpose layers in the discriminator and generator, respectively.


<img src="https://i0.wp.com/neptune.ai/wp-content/uploads/DCGAN-generator-discriminator.png" alt="drawing" width="1300"/>

In [None]:
import torch
import torch.nn as nn
import torch.optim as optim
import torchvision
import torchvision.datasets as datasets
import torchvision.transforms as transforms
from torch.utils.data import DataLoader
from torch.utils.tensorboard import SummaryWriter

In [None]:
device = torch.device("cuda:0" if (torch.cuda.is_available()) else "cpu")

print(device) 

## Hyperparameters


In [None]:
# Batch size used in training. The DCGAN paper uses a batch size of 128.
BATCH_SIZE = 128
# Spatial size of the training images. This implementation assumes 64x64;
# for any other size, the structures of D and G must be changed.
IMAGE_SIZE = 64
# Number of channels in the input images: 1 for grayscale (MNIST, used here), 3 for RGB.
CHANNELS_IMG = 1
# Length of the latent vector z.
NOISE_DIM = 100
# Base depth of the feature maps carried through the generator.
FEATURES_GEN = 64
# Base depth of the feature maps propagated through the discriminator.
FEATURES_DISC = 64
# Number of training epochs. Training longer will probably give better results but takes more time.
NUM_EPOCHS = 5
# Learning rate for the optimizers. The DCGAN paper uses 0.0002.
LEARNING_RATE = 0.0002
# beta1 hyperparameter for the Adam optimizers. The DCGAN paper uses 0.5.
beta1 = 0.5

## Discriminator
The discriminator is made up of 
1. strided convolution layers, 
2. [batch norm layers](https://pytorch.org/docs/stable/generated/torch.nn.BatchNorm2d.html), and
3. LeakyReLU activations. 

The input is a 1x64x64 input image and the output is a scalar probability that the input is from the real data distribution.

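How does `Conv2d` change the shape of an image batch? The cell below applies a single 5x5 filter with stride 1 and no padding to a batch of 16 RGB images of size 32x32.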

In [None]:
image = torch.rand(16, 3, 32, 32)
conv_filter = torch.nn.Conv2d(in_channels=3,out_channels=1, kernel_size=5,stride=1, padding=0)
output_feature = conv_filter(image)
print(output_feature.shape)
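
The shape printed above follows the standard convolution size rule, `out = floor((in + 2*padding - kernel_size) / stride) + 1`, applied to height and width independently. A minimal sketch of that arithmetic in plain Python (the helper name `conv2d_out_size` is ours, not a PyTorch function, and it assumes dilation 1):

```python
def conv2d_out_size(size, kernel_size, stride=1, padding=0):
    """Spatial output size of a 2D convolution along one axis (dilation assumed to be 1)."""
    return (size + 2 * padding - kernel_size) // stride + 1


# The 32x32 example above: kernel 5, stride 1, no padding.
example_out = conv2d_out_size(32, kernel_size=5, stride=1, padding=0)
print(example_out)  # 28, so the batch shape becomes (16, 1, 28, 28)

# Kernel 4, stride 2, padding 1 halves the spatial size at each step.
sizes = [64]
for _ in range(4):
    sizes.append(conv2d_out_size(sizes[-1], kernel_size=4, stride=2, padding=1))
print(sizes)  # [64, 32, 16, 8, 4]
```

The second loop uses the kernel/stride/padding combination that the discriminator in this notebook relies on, which is why four such layers take a 64x64 image down to 4x4.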


You can define the network like this:
```python
discriminator = nn.Sequential(
    # input layer
    nn.Conv2d(channel_image, features_d, 4, 2, 1, bias=False),
    nn.LeakyReLU(0.2, inplace=True),
    # hidden layer 1
    nn.Conv2d(features_d, features_d * 2, 4, 2, 1, bias=False),
    nn.BatchNorm2d(features_d * 2),
    nn.LeakyReLU(0.2, inplace=True),
    # hidden layer 2
    nn.Conv2d(features_d * 2, features_d * 4, 4, 2, 1, bias=False),
    nn.BatchNorm2d(features_d * 4),
    nn.LeakyReLU(0.2, inplace=True),
    # hidden layer 3
    nn.Conv2d(features_d * 4, features_d * 8, 4, 2, 1, bias=False),
    nn.BatchNorm2d(features_d * 8),
    nn.LeakyReLU(0.2, inplace=True),
    # output layer
    nn.Conv2d(features_d * 8, 1, 4, 1, 0, bias=False),
    nn.Sigmoid()
)
```
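
As a quick sanity check on the layout above, the sketch below builds the same stack with illustrative values (`channel_image = 1` for grayscale input and a small `features_d = 8`, both chosen here only to keep the example light) and confirms that a batch of 64x64 images is reduced to one probability per image. The helper `down_block` is our own shorthand for the repeated hidden layers.

```python
import torch
import torch.nn as nn

channel_image, features_d = 1, 8  # illustrative values, not the notebook's settings


def down_block(in_ch, out_ch):
    # Strided conv + batch norm + LeakyReLU, as in the hidden layers above
    return [
        nn.Conv2d(in_ch, out_ch, 4, 2, 1, bias=False),
        nn.BatchNorm2d(out_ch),
        nn.LeakyReLU(0.2, inplace=True),
    ]


discriminator = nn.Sequential(
    nn.Conv2d(channel_image, features_d, 4, 2, 1, bias=False),
    nn.LeakyReLU(0.2, inplace=True),
    *down_block(features_d, features_d * 2),
    *down_block(features_d * 2, features_d * 4),
    *down_block(features_d * 4, features_d * 8),
    nn.Conv2d(features_d * 8, 1, 4, 1, 0, bias=False),
    nn.Sigmoid(),
)

batch = torch.rand(2, channel_image, 64, 64)  # two random stand-in images
scores = discriminator(batch)
print(scores.shape)  # torch.Size([2, 1, 1, 1])
```

Calling `scores.reshape(-1)` flattens this to one value per image, which is the shape a BCE loss expects.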

In [None]:
class Discriminator(nn.Module):
    def __init__(self, channels_img, features_d):
        super(Discriminator, self).__init__()
        self.disc = nn.Sequential(
            # input: N x channels_img x 64 x 64
            nn.Conv2d(
                channels_img, features_d, kernel_size=4, stride=2, padding=1
            ),
            nn.LeakyReLU(0.2),
            # state size: features_d x 32 x 32
            # _block(in_channels, out_channels, kernel_size, stride, padding)
            self._block(features_d, features_d * 2, 4, 2, 1),
            # state size: (features_d * 2) x 16 x 16
            self._block(features_d * 2, features_d * 4, 4, 2, 1),
            # state size: (features_d * 4) x 8 x 8
            self._block(features_d * 4, features_d * 8, 4, 2, 1),
            # state size: (features_d * 8) x 4 x 4; the Conv2d below reduces it to 1x1
            nn.Conv2d(features_d * 8, 1, kernel_size=4, stride=2, padding=0),
            nn.Sigmoid(),
        )

    def _block(self, in_channels, out_channels, kernel_size, stride, padding):
        return nn.Sequential(
            nn.Conv2d(
                in_channels,
                out_channels,
                kernel_size,
                stride,
                padding,
                bias=False,
            ),
            nn.BatchNorm2d(out_channels),
            nn.LeakyReLU(0.2),
        )

    def forward(self, x):
        return self.disc(x)

In [None]:
# Create the Discriminator
disc = Discriminator(CHANNELS_IMG, FEATURES_DISC).to(device)

# Print the model
print(disc)

## Generator
The generator consists of
1. convolutional-transpose layers, 
2. batch norm layers, and 
3. ReLU activations. 

The input is a latent vector, $z$, drawn from a standard normal distribution, and the output is a 1x64x64 grayscale image (with `CHANNELS_IMG = 1`; an RGB dataset would give 3x64x64). The strided conv-transpose layers transform the latent vector into a volume with the same shape as an image.
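
The way a 1x1 latent "pixel" grows into a 64x64 image can be traced with the transposed-convolution size rule, `out = (in - 1) * stride - 2 * padding + kernel_size`. The sketch below is plain Python; the helper name `conv_transpose2d_out_size` is ours, and it assumes dilation 1 and no output padding.

```python
def conv_transpose2d_out_size(size, kernel_size, stride=1, padding=0):
    """Spatial output size of a 2D transposed convolution along one axis."""
    return (size - 1) * stride - 2 * padding + kernel_size


# First layer: kernel 4, stride 1, padding 0 turns the 1x1 latent input into 4x4.
gen_sizes = [conv_transpose2d_out_size(1, kernel_size=4, stride=1, padding=0)]

# Remaining layers: kernel 4, stride 2, padding 1 doubles the size each time.
for _ in range(4):
    gen_sizes.append(
        conv_transpose2d_out_size(gen_sizes[-1], kernel_size=4, stride=2, padding=1)
    )
print(gen_sizes)  # [4, 8, 16, 32, 64]
```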

<img src="https://pytorch.org/tutorials/_images/dcgan_generator.png" alt="drawing" width="600"/>

In [None]:
class Generator(nn.Module):
    def __init__(self, channels_noise, channels_img, features_g):
        super(Generator, self).__init__()
        self.net = nn.Sequential(
            # Input: N x channels_noise x 1 x 1
            self._block(channels_noise, features_g * 16, 4, 1, 0),  # img: 4x4
            self._block(features_g * 16, features_g * 8, 4, 2, 1),  # img: 8x8
            self._block(features_g * 8, features_g * 4, 4, 2, 1),  # img: 16x16
            self._block(features_g * 4, features_g * 2, 4, 2, 1),  # img: 32x32
            nn.ConvTranspose2d(
                features_g * 2, channels_img, kernel_size=4, stride=2, padding=1
            ),
            # Output: N x channels_img x 64 x 64
            nn.Tanh(),
        )

    def _block(self, in_channels, out_channels, kernel_size, stride, padding):
        return nn.Sequential(
            nn.ConvTranspose2d(
                in_channels,
                out_channels,
                kernel_size,
                stride,
                padding,
                bias=False,
            ),
            nn.BatchNorm2d(out_channels),
            nn.ReLU(),
        )

    def forward(self, x):
        return self.net(x)


In [None]:
gen = Generator(NOISE_DIM, CHANNELS_IMG, FEATURES_GEN).to(device)

# Print the model
print(gen)

## Loss Functions and Optimizers
In the paper, the authors also give **some tips** about
1. how to set up the optimizers, 
2. how to calculate the loss functions, and 
3. how to initialize the model weights.

<img src="https://imgur.com/8r5d9bX.png" alt="drawing" width="900"/>



With $D$ and $G$ set up, we can specify how they learn through the loss functions and optimizers. We will use the Binary Cross Entropy loss (BCELoss) function, which is defined in PyTorch as:

$$\ell(x, y) = L = \{l_1,\dots,l_N\}^\top, \quad l_n = - \left[ y_n \cdot \log x_n + (1 - y_n) \cdot \log (1 - x_n) \right]$$

Notice that this function computes **both log components of the objective function**, $\log(D(x))$ and $\log(1-D(G(z)))$. The target $y$ selects which part of the BCE equation is used.

When we train we define our real label as $y=1$ and the fake label as $y=0$.  
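
To make the role of $y$ concrete, here is a small plain-Python sketch of the per-element BCE formula (the function `bce` is our own helper, not the PyTorch implementation). With $y=1$ only the $-\log x$ term survives, and with $y=0$ only the $-\log(1-x)$ term does.

```python
import math


def bce(x, y):
    """Binary cross entropy for one prediction x in (0, 1) and one target y in {0, 1}."""
    return -(y * math.log(x) + (1 - y) * math.log(1 - x))


d_out = 0.9  # a hypothetical discriminator output

loss_if_real = bce(d_out, 1.0)  # equals -log(D(x))
loss_if_fake = bce(d_out, 0.0)  # equals -log(1 - D(G(z)))
print(round(loss_if_real, 4), round(loss_if_fake, 4))  # 0.1054 2.3026
```

Because BCE carries a leading minus sign, minimizing it is the same as maximizing the corresponding log term.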

We will set up two separate optimizers, one for $D$ and one for $G$. As specified in the DCGAN paper, both are Adam optimizers with learning rate 0.0002 and beta1 = 0.5. 

In [None]:
# Initialize BCELoss function
criterion = nn.BCELoss()

# Establish convention for real and fake labels during training
real_label = 1.
fake_label = 0.

# Setup Adam optimizers for both G and D
optimizerD = optim.Adam(disc.parameters(), lr=LEARNING_RATE, betas=(beta1, 0.999))
optimizerG = optim.Adam(gen.parameters(), lr=LEARNING_RATE, betas=(beta1, 0.999))

Note that we will train the generator by maximizing
$$\max_G \frac{1}{m} \sum_{i=1}^{m} \log\left(D\left(G\left(z^{(i)}\right)\right)\right)$$
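
A commonly cited reason for maximizing $\log(D(G(z)))$ rather than minimizing $\log(1 - D(G(z)))$ is that the latter gives very small gradients early in training, when D rejects fakes easily. The numeric sketch below assumes D ends in a sigmoid over a logit `a`; the function names are ours, and the derivatives are taken with respect to that logit.

```python
import math


def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))


def grad_saturating(a):
    # d/da of log(1 - sigmoid(a)): the term G would minimize in the original objective
    return -sigmoid(a)


def grad_non_saturating(a):
    # d/da of -log(sigmoid(a)): the loss G minimizes when it maximizes log(D(G(z)))
    return -(1.0 - sigmoid(a))


a = -5.0  # logit for a fake that D confidently rejects, D(G(z)) is about 0.0067
print(sigmoid(a), grad_saturating(a), grad_non_saturating(a))
```

At this point the saturating form has a gradient magnitude below 0.01, while the non-saturating form is close to 1, so the generator receives a much stronger learning signal.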

<img src="https://imgur.com/F55vNtT.png" alt="drawing" width="1200"/>



## Dataset

In [None]:
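# Named `transform` (singular) so it does not shadow the imported `transforms` module,
# which would break this cell if it were run a second time.
transform = transforms.Compose(
    [
        transforms.Resize(IMAGE_SIZE),
        transforms.ToTensor(),
        # Scale pixel values from [0, 1] to [-1, 1] to match the generator's Tanh output
        transforms.Normalize(
            [0.5 for _ in range(CHANNELS_IMG)], [0.5 for _ in range(CHANNELS_IMG)]
        ),
    ]
)
# https://pytorch.org/vision/stable/generated/torchvision.datasets.MNIST.html#torchvision.datasets.MNIST
dataset = datasets.MNIST(root="dataset/", train=True, transform=transform,
                         download=True)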


dataloader = DataLoader(dataset, batch_size=BATCH_SIZE, shuffle=True)

## Weight Initialization
The DCGAN paper specifies that all model weights be randomly initialized from a normal distribution with mean 0 and standard deviation 0.02. The `weights_init` function takes a layer as input and reinitializes convolutional and convolutional-transpose weights this way; batch normalization scale weights are drawn around 1 with the same standard deviation, and their biases are set to 0. The function is applied to every layer of the models after they are created.


In [None]:
def weights_init(m):
    classname = m.__class__.__name__
    if classname.find('Conv') != -1:
        nn.init.normal_(m.weight.data, 0.0, 0.02)
    elif classname.find('BatchNorm') != -1:
        nn.init.normal_(m.weight.data, 1.0, 0.02)
        nn.init.constant_(m.bias.data, 0)

In [None]:
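# apply() calls weights_init on every submodule. Calling weights_init(gen) directly
# would only inspect the top-level class name ("Generator") and change nothing.
gen.apply(weights_init)
disc.apply(weights_init)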
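
`nn.Module.apply(fn)` calls `fn` on every submodule and then on the module itself, which is what lets a per-layer function such as `weights_init` reach the convolution and batch norm layers nested inside a model. A small self-contained check on a toy network (the toy layers are our own choice, not the notebook's models):

```python
import torch
import torch.nn as nn


def weights_init(m):
    classname = m.__class__.__name__
    if classname.find('Conv') != -1:
        nn.init.normal_(m.weight.data, 0.0, 0.02)
    elif classname.find('BatchNorm') != -1:
        nn.init.normal_(m.weight.data, 1.0, 0.02)
        nn.init.constant_(m.bias.data, 0)


torch.manual_seed(0)
toy = nn.Sequential(nn.Conv2d(3, 64, 4, 2, 1, bias=False), nn.BatchNorm2d(64))
toy.apply(weights_init)  # visits Conv2d, BatchNorm2d, then the Sequential container

conv_std = toy[0].weight.std().item()  # should be close to 0.02
bn_mean = toy[1].weight.mean().item()  # should be close to 1.0
print(conv_std, bn_mean)
```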

## Preparation of Training 

In [None]:
writer_real = SummaryWriter("logs1/real")
writer_fake = SummaryWriter("logs1/fake")
step = 0

gen.train()
disc.train()

## Preparation of Evaluation 
We will report some statistics and, every 100 batches, push our `fixed_noise` batch through the generator to visually track the progress of G's training. The training loop prints $Loss_D$ and $Loss_G$; $D(x)$ and $D(G(z))$ are not printed but are useful to keep in mind when reading the losses. The statistics are:

* **$Loss_D$** - discriminator loss, computed from the losses on the all-real and all-fake batches, $\log(D(x)) + \log(1 - D(G(z)))$. The training loop averages the two BCE terms.
* **$Loss_G$** - generator loss, calculated as $\log(D(G(z)))$.
* **$D(x)$** - the average output (across the batch) of the discriminator for the all-real batch. This should start close to 1, then theoretically converge to 0.5 as G gets better. Think about why this is.
* **$D(G(z))$** - the average discriminator output for the all-fake batch. This should start near 0 and converge to 0.5 as G gets better. Think about why this is.
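
As a reference point for reading these numbers: if D outputs 0.5 for every input, each BCE term equals $\ln 2 \approx 0.693$. The plain-Python sketch below assumes the discriminator loss is averaged over the real and fake halves, as the training loop in this notebook does; `bce` is our own helper.

```python
import math


def bce(x, y):
    """Binary cross entropy for one prediction x in (0, 1) and one target y in {0, 1}."""
    return -(y * math.log(x) + (1 - y) * math.log(1 - x))


d_real = d_fake = 0.5  # D cannot tell real from fake

loss_disc = (bce(d_real, 1.0) + bce(d_fake, 0.0)) / 2  # averaged, as in the training loop
loss_gen = bce(d_fake, 1.0)
print(loss_disc, loss_gen, math.log(2))  # all three are about 0.6931
```

Losses hovering near 0.69 for both networks are therefore consistent with, though not proof of, a discriminator that is guessing.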

In [None]:
# Create batch of latent vectors that we will use to visualize
#  the progression of the generator
fixed_noise = torch.randn(BATCH_SIZE, NOISE_DIM, 1, 1).to(device)
print(fixed_noise.shape)

## Training

In [None]:
print("Starting Training Loop...")

for epoch in range(NUM_EPOCHS):
    # Target labels not needed! <3 unsupervised
    for batch_idx, (real, _) in enumerate(dataloader):
        real = real.to(device)
        # Match the real batch size: the last batch may be smaller than BATCH_SIZE
        noise = torch.randn(real.size(0), NOISE_DIM, 1, 1, device=device)
        fake = gen(noise)

        ############################
        # (1) Update D network: maximize log(D(x)) + log(1 - D(G(z)))
        ###########################
        disc_real = disc(real).reshape(-1)
        loss_disc_real = criterion(disc_real, torch.ones_like(disc_real))
        disc_fake = disc(fake.detach()).reshape(-1)
        loss_disc_fake = criterion(disc_fake, torch.zeros_like(disc_fake))
        loss_disc = (loss_disc_real + loss_disc_fake) / 2
        disc.zero_grad()
        loss_disc.backward()
        optimizerD.step()

        ############################
        # (2) Update G network: min log(1 - D(G(z))) <-> max log(D(G(z)))
        ###########################
        output = disc(fake).reshape(-1)
        loss_gen = criterion(output, torch.ones_like(output))
        gen.zero_grad()
        loss_gen.backward()
        optimizerG.step()

        
        # Print losses occasionally and print to tensorboard
        if batch_idx % 100 == 0:
            # Adjacent f-strings avoid the run of spaces a backslash continuation would embed
            print(
                f"Epoch [{epoch}/{NUM_EPOCHS}] Batch {batch_idx}/{len(dataloader)} "
                f"Loss D: {loss_disc:.4f}, loss G: {loss_gen:.4f}"
            )

            with torch.no_grad():
                fake = gen(fixed_noise)
                # take out (up to) 32 examples
                img_grid_real = torchvision.utils.make_grid(
                    real[:32], normalize=True
                )
                img_grid_fake = torchvision.utils.make_grid(
                    fake[:32], normalize=True
                )

                writer_real.add_image("Real", img_grid_real, global_step=step)
                writer_fake.add_image("Fake", img_grid_fake, global_step=step)
            
            writer_real.add_scalar("Loss_Gen",loss_gen.item(), global_step=step)
            writer_real.add_scalar("Loss_Dis",loss_disc.item(), global_step=step)
            step += 1
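
The discriminator step in the loop calls `fake.detach()`, while the generator step reuses `fake` without it. The sketch below illustrates the difference on tiny stand-in networks (`toy_gen` and `toy_disc` are hypothetical linear layers, not the notebook's models): `detach()` cuts the autograd graph, so the discriminator's backward pass leaves the generator's gradients untouched.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
toy_gen = nn.Linear(4, 4)   # stand-in for the generator
toy_disc = nn.Linear(4, 1)  # stand-in for the discriminator

fake = toy_gen(torch.randn(8, 4))

# Discriminator step: detach() cuts the graph, so no gradient reaches toy_gen.
toy_disc(fake.detach()).mean().backward()
gen_grad_after_d_step = toy_gen.weight.grad  # still None

# Generator step: the same `fake`, not detached, so gradients flow back to toy_gen.
toy_disc(fake).mean().backward()
gen_grad_after_g_step = toy_gen.weight.grad

print(gen_grad_after_d_step is None, gen_grad_after_g_step is not None)  # True True
```

Detaching also keeps the generator's graph intact for the second backward pass, which is why the loop can compute `fake` once and use it in both steps.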

## References:
1. https://pytorch.org/tutorials/beginner/dcgan_faces_tutorial.html 
2. https://github.com/aladdinpersson/Machine-Learning-Collection/blob/master/ML/Pytorch/GANs/2.%20DCGAN/
