<div style="line-height:0.5">
<div style="line-height:1">
<h1 style="color:#BF66F2 ">  Deep Convolutional Generative Adversarial Network <br> in PyTorch 1 </h1>
<div style="line-height:0.2">
<h4> Implementation of a DCGAN model, trained on the MNIST dataset. </h4>
<span style="display: inline-block;">
    <h3 style="color: lightblue; display: inline;">Keywords:</h3>
    import another notebook + nn.ConvTranspose2d() + nn.LeakyReLU() + transforms.Normalize() + torchvision.utils.make_grid()
</span>
</div> 
</div> 

<h2 style="color:#BF66F2 "> Recap: DCGANs </h2>
<div style="margin-top: -29px;">
A DCGAN is a GAN architecture designed specifically for image generation. <br>
DCGANs use convolutional neural networks (CNNs) for both the generator and the discriminator, which makes them well suited to image synthesis tasks.
</div>

<h3> Differences between DCGANs and standard GANs: </h3>
<div style="margin-top: -20px;">


+ Architecture: <br> 
DCGANs use convolutional layers in both the generator and discriminator networks, allowing them to capture spatial features in the images. <br> In contrast, standard GANs typically use fully connected layers, which are less effective for image generation tasks.

+ Stabilized Training: DCGANs introduce techniques that stabilize training, such as batch normalization in both the generator and the discriminator <br> 
and specific activation functions, such as Leaky ReLU in the discriminator. 
<br> These techniques help mitigate issues like mode collapse and vanishing gradients.

+ Preprocessing: DCGANs need little preprocessing of the data, often just scaling the pixel values to the range [-1, 1], which matches the output range of the generator's Tanh layer. <br>
In contrast, some standard GANs may require more complex data preprocessing steps.

+ Image Generation Quality: DCGANs are known for producing higher-quality images with more realistic details compared to standard GANs. <br> This is due to the use of convolutional layers, which are better at capturing local image patterns and structures.
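
The scaling mentioned under Preprocessing can be checked by hand. The following is a minimal sketch in plain Python, assuming the pixel values already lie in [0, 1] (as they do after `transforms.ToTensor()`) and using the mean and standard deviation of 0.5 that this notebook passes to `transforms.Normalize()`. The helper names `normalize` and `denormalize` are hypothetical and not part of torchvision.

```python
def normalize(pixel, mean=0.5, std=0.5):
    """Map a pixel from [0, 1] to [-1, 1], the same arithmetic as Normalize(mean, std)."""
    return (pixel - mean) / std


def denormalize(value, mean=0.5, std=0.5):
    """Inverse mapping, from [-1, 1] back to [0, 1], e.g. for displaying generated images."""
    return value * std + mean


for p in (0.0, 0.25, 0.5, 1.0):
    print(p, "->", normalize(p))
```

The resulting [-1, 1] range matches the output of the generator's final `nn.Tanh()` layer, so real and generated images share the same scale.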

In [2]:
#%%script echo Uncomment if not on Colab
%ls -lt /content

total 4
drwxr-xr-x 1 root root 4096 Jul 20 13:28 sample_data/


In [1]:
%%script echo Skipping:Just for Colab, used to import other module (another ipynb notebook)
from google.colab import drive
drive.mount('/content/drive')
%run '/content/drive/MyDrive/my_folder/torch_21_DCGANs_model.ipynb'

Skipping:Just for Colab, used to import other module (another ipynb notebook)


In [6]:
import torch
import torch.nn as nn
import torch.optim as optim
import torchvision
import torchvision.datasets as datasets
import torchvision.transforms as transforms
from torch.utils.data import DataLoader
from torch.utils.tensorboard import SummaryWriter

In [7]:
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
device

device(type='cuda')

In [8]:
class Discriminator(nn.Module):
    """ Discriminator Network model.

    Args:
        - channels_img: Number of image channels
        - features_d: Feature depth 
    """    
    def __init__(self, channels_img, features_d):
        super(Discriminator, self).__init__()
        # Create the discriminator model as a sequential module
        self.disc = nn.Sequential(
            # First conv block, takes an image and downsamples it (input: N x channels_img x 64 x 64)
            nn.Conv2d(channels_img, features_d, kernel_size=4, stride=2, padding=1),  # img: 32x32
            nn.LeakyReLU(0.2),
            ### Additional conv blocks
            # _block(in_channels, out_channels, kernel_size, stride, padding)
            self._block(features_d, features_d * 2, 4, 2, 1),  # img: 16x16
            self._block(features_d * 2, features_d * 4, 4, 2, 1),  # img: 8x8
            self._block(features_d * 4, features_d * 8, 4, 2, 1),  # img: 4x4
            # Output conv layer: after the last "_block" the feature map is 4x4 (the Conv2d below reduces it to 1x1)
            nn.Conv2d(features_d * 8, 1, kernel_size=4, stride=2, padding=0),
            
            nn.Sigmoid(),
        )

    def _block(self, in_channels, out_channels, kernel_size, stride, padding):
        """ Build a convolutional block: Conv2d followed by LeakyReLU (the BatchNorm2d layer is currently commented out). """
        return nn.Sequential(
            nn.Conv2d(
                in_channels,
                out_channels,
                kernel_size,
                stride,
                padding,
                bias=False,
            ),
            # nn.BatchNorm2d(out_channels),
            nn.LeakyReLU(0.2),
        )

    def forward(self, x):
        """ Perform forward pass."""
        return self.disc(x)


class Generator(nn.Module):
    """ Generator Network model.

    Args:
        - channels_noise: Noise channels
        - channels_img: Image channels
        - features_g: Feature depth
    """    
    def __init__(self, channels_noise, channels_img, features_g):
        super(Generator, self).__init__()
        self.net = nn.Sequential(
            #### Transposed-conv (deconv) blocks that upsample the noise (input: N x channels_noise x 1 x 1)
            self._block(channels_noise, features_g * 16, 4, 1, 0),  # img: 4x4
            self._block(features_g * 16, features_g * 8, 4, 2, 1),  # img: 8x8
            self._block(features_g * 8, features_g * 4, 4, 2, 1),  # img: 16x16
            self._block(features_g * 4, features_g * 2, 4, 2, 1),  # img: 32x32
            # Output layer (N x channels_img x 64 x 64)
            nn.ConvTranspose2d(
                features_g * 2, channels_img, kernel_size=4, stride=2, padding=1
            ),
            # Tanh activation for image pixels 
            nn.Tanh(),
        )

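    def _block(self, in_channels, out_channels, kernel_size, stride, padding):
        """ Build an upsampling block: ConvTranspose2d followed by ReLU (the BatchNorm2d layer is currently commented out). """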
        return nn.Sequential(
            nn.ConvTranspose2d(
                in_channels,
                out_channels,
                kernel_size,
                stride,
                padding,
                bias=False,
            ),
            # nn.BatchNorm2d(out_channels),
            nn.ReLU(),
        )

    def forward(self, x):
        """ Perform forward pass."""
        return self.net(x)
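
The spatial sizes noted in the comments of both networks follow from the standard output-size formulas for `nn.Conv2d` and `nn.ConvTranspose2d`. As a rough check, the sketch below applies those formulas in plain Python to the kernel size, stride and padding of each layer, assuming dilation 1 and no output padding (the defaults used above). The helper names `conv_out` and `conv_transpose_out` are hypothetical.

```python
def conv_out(size, kernel, stride, padding):
    """Output size of nn.Conv2d along one spatial dimension (dilation 1)."""
    return (size + 2 * padding - kernel) // stride + 1


def conv_transpose_out(size, kernel, stride, padding):
    """Output size of nn.ConvTranspose2d along one dimension (dilation 1, output_padding 0)."""
    return (size - 1) * stride - 2 * padding + kernel


# (kernel_size, stride, padding) of each layer, copied from the two classes above
disc_layers = [(4, 2, 1)] * 4 + [(4, 2, 0)]
gen_layers = [(4, 1, 0)] + [(4, 2, 1)] * 4

disc_sizes = [64]
for k, s, p in disc_layers:
    disc_sizes.append(conv_out(disc_sizes[-1], k, s, p))

gen_sizes = [1]
for k, s, p in gen_layers:
    gen_sizes.append(conv_transpose_out(gen_sizes[-1], k, s, p))

print("Discriminator:", disc_sizes)  # [64, 32, 16, 8, 4, 1]
print("Generator:", gen_sizes)       # [1, 4, 8, 16, 32, 64]
```

The discriminator halves the image at every strided convolution until a single value per image remains, while the generator mirrors that path from a 1x1 noise vector up to a 64x64 image.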

In [None]:
def initialize_weights(model):
    """ Initialize weights of the given model """
    for m in model.modules():
        if isinstance(m, (nn.Conv2d, nn.ConvTranspose2d, nn.BatchNorm2d)):
            nn.init.normal_(m.weight.data, 0.0, 0.02)


def test():
    N, in_channels, H, W = 8, 3, 64, 64
    noise_dim = 100
    x = torch.randn((N, in_channels, H, W))
    disc = Discriminator(in_channels, 8)
    assert disc(x).shape == (N, 1, 1, 1), "Discriminator test failed"
    gen = Generator(noise_dim, in_channels, 8)
    z = torch.randn((N, noise_dim, 1, 1))
    assert gen(z).shape == (N, in_channels, H, W), "Generator test failed"
    print("Success, tests passed!")
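
`initialize_weights` draws every convolutional weight from a normal distribution with mean 0 and standard deviation 0.02. The sketch below is one way to check the effect on a single layer; the layer dimensions and the seed are arbitrary assumptions chosen for illustration, not values taken from the training run.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)  # arbitrary seed, only to make the printed numbers repeatable

# A single layer with the same kernel settings as the discriminator blocks
conv = nn.Conv2d(64, 128, kernel_size=4, stride=2, padding=1, bias=False)
nn.init.normal_(conv.weight, mean=0.0, std=0.02)

# Empirical statistics over all 64 * 128 * 4 * 4 weights
weight_mean = conv.weight.mean().item()
weight_std = conv.weight.std().item()
print(f"mean={weight_mean:.5f}, std={weight_std:.5f}")
```

With this many weights, the measured mean should sit very close to 0 and the measured standard deviation very close to 0.02.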

In [9]:
""" Hyperparameters."""
LEARNING_RATE = 2e-4      #0.0002
BATCH_SIZE = 128
IMAGE_SIZE = 64
CHANNELS_IMG = 1
NOISE_DIM = 100
NUM_EPOCHS = 5
FEATURES_DISC = 64
FEATURES_GEN = 64

In [10]:
""" Define a sequence of image transformations using Compose() that are applied to the training images:
    1. Resize the images to the specified IMAGE_SIZE
    2. Convert the input image to a PyTorch tensor
    3. Normalize the pixel values of the images to the range [-1, 1]:
        the mean and standard deviation are both set to 0.5 for each channel.
"""
# Use a name that does not shadow the imported torchvision.transforms module,
# so that this cell can be re-run without errors.
mnist_transforms = transforms.Compose(
    [
        transforms.Resize(IMAGE_SIZE),
        transforms.ToTensor(),
        transforms.Normalize(
            [0.5 for _ in range(CHANNELS_IMG)], [0.5 for _ in range(CHANNELS_IMG)]
        ),
    ]
)

In [11]:
dataset = datasets.MNIST(root="dataset/", train=True, transform=mnist_transforms, download=True)   # channels_img=1

Downloading http://yann.lecun.com/exdb/mnist/train-images-idx3-ubyte.gz
Downloading http://yann.lecun.com/exdb/mnist/train-images-idx3-ubyte.gz to dataset/MNIST/raw/train-images-idx3-ubyte.gz
100%|██████████| 9912422/9912422 [00:00<00:00, 290796178.58it/s]
Extracting dataset/MNIST/raw/train-images-idx3-ubyte.gz to dataset/MNIST/raw

Downloading http://yann.lecun.com/exdb/mnist/train-labels-idx1-ubyte.gz
Downloading http://yann.lecun.com/exdb/mnist/train-labels-idx1-ubyte.gz to dataset/MNIST/raw/train-labels-idx1-ubyte.gz
100%|██████████| 28881/28881 [00:00<00:00, 20065544.78it/s]
Extracting dataset/MNIST/raw/train-labels-idx1-ubyte.gz to dataset/MNIST/raw

Downloading http://yann.lecun.com/exdb/mnist/t10k-images-idx3-ubyte.gz
Downloading http://yann.lecun.com/exdb/mnist/t10k-images-idx3-ubyte.gz to dataset/MNIST/raw/t10k-images-idx3-ubyte.gz
100%|██████████| 1648877/1648877 [00:00<00:00, 134231811.59it/s]
Extracting dataset/MNIST/raw/t10k-images-idx3-ubyte.gz to dataset/MNIST/raw

Downloading http://yann.lecun.com/exdb/mnist/t10k-labels-idx1-ubyte.gz
Downloading http://yann.lecun.com/exdb/mnist/t10k-labels-idx1-ubyte.gz to dataset/MNIST/raw/t10k-labels-idx1-ubyte.gz
100%|██████████| 4542/4542 [00:00<00:00, 5642929.14it/s]
Extracting dataset/MNIST/raw/t10k-labels-idx1-ubyte.gz to dataset/MNIST/raw

In [12]:
dataloader = DataLoader(dataset, batch_size=BATCH_SIZE, shuffle=True)

gen = Generator(NOISE_DIM, CHANNELS_IMG, FEATURES_GEN).to(device)
disc = Discriminator(CHANNELS_IMG, FEATURES_DISC).to(device)
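
The number of batches per epoch follows from the dataset size and `BATCH_SIZE`. The short sketch below reproduces it in plain Python, assuming the 60,000 images of the MNIST training split and the `DataLoader` default of keeping the final, smaller batch (`drop_last=False`). The names `NUM_TRAIN_IMAGES`, `batches_per_epoch` and `last_batch_size` are introduced here for illustration only.

```python
import math

NUM_TRAIN_IMAGES = 60_000  # size of the MNIST training split
BATCH_SIZE = 128           # same value as in the hyperparameters cell

# With the default drop_last=False, DataLoader keeps the final, smaller batch
batches_per_epoch = math.ceil(NUM_TRAIN_IMAGES / BATCH_SIZE)
last_batch_size = NUM_TRAIN_IMAGES - (batches_per_epoch - 1) * BATCH_SIZE

print(batches_per_epoch)  # 469
print(last_batch_size)    # 96
```

The value 469 is the batch count that appears in the training log. The last batch of each epoch holds only 96 real images, so code that assumes exactly `BATCH_SIZE` real samples per batch is off for that one step.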

In [13]:
initialize_weights(gen)
initialize_weights(disc)

In [14]:
### Optimizers and Loss
opt_gen = optim.Adam(gen.parameters(), lr=LEARNING_RATE, betas=(0.5, 0.999))
opt_disc = optim.Adam(disc.parameters(), lr=LEARNING_RATE, betas=(0.5, 0.999))
criterion = nn.BCELoss()
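
To get a feel for the values that `nn.BCELoss()` produces, the binary cross-entropy of a single prediction can be computed by hand. The sketch below is a simplified plain-Python version: the helper `bce` is hypothetical, it handles one prediction at a time (whereas `nn.BCELoss()` averages over the batch by default), and it does not clamp the logarithm for predictions of exactly 0 or 1.

```python
import math


def bce(prediction, target):
    """Binary cross-entropy for a single probability in (0, 1) and a target of 0 or 1."""
    return -(target * math.log(prediction) + (1 - target) * math.log(1 - prediction))


# An undecided discriminator outputs 0.5 for every image: loss = ln 2
print(round(bce(0.5, 1), 4))  # 0.6931

# Generator loss when the discriminator rates a fake image as only 0.1% likely to be real
print(round(bce(0.001, 1), 4))  # 6.9078
```

Read this way, a discriminator loss close to ln 2 ≈ 0.6931 corresponds to a discriminator that cannot yet tell real from fake, and a generator loss of about 7 or more means the discriminator gives the fakes a probability of roughly 0.1% or less of being real.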

In [15]:
""" Create a fixed_noise tensor that is fed to the generator at logging time, so that generated images stay comparable across training steps:
1. Sample random numbers from a standard normal distribution (mean 0, standard deviation 1).
    The tensor has 4 dimensions: batch size (32 samples), number of noise channels (NOISE_DIM = 100), height (1), and width (1).
2. Move the tensor to the specified device
"""

fixed_noise = torch.randn(32, NOISE_DIM, 1, 1).to(device)

In [16]:
""" SummaryWriter writes entries directly to event files in the log_dir, to be consumed by TensorBoard. """
writer_real = SummaryWriter("logs/real")
writer_fake = SummaryWriter("logs/fake")

In [17]:
step = 0
gen.train()
disc.train()

Discriminator(
  (disc): Sequential(
    (0): Conv2d(1, 64, kernel_size=(4, 4), stride=(2, 2), padding=(1, 1))
    (1): LeakyReLU(negative_slope=0.2)
    (2): Sequential(
      (0): Conv2d(64, 128, kernel_size=(4, 4), stride=(2, 2), padding=(1, 1), bias=False)
      (1): LeakyReLU(negative_slope=0.2)
    )
    (3): Sequential(
      (0): Conv2d(128, 256, kernel_size=(4, 4), stride=(2, 2), padding=(1, 1), bias=False)
      (1): LeakyReLU(negative_slope=0.2)
    )
    (4): Sequential(
      (0): Conv2d(256, 512, kernel_size=(4, 4), stride=(2, 2), padding=(1, 1), bias=False)
      (1): LeakyReLU(negative_slope=0.2)
    )
    (5): Conv2d(512, 1, kernel_size=(4, 4), stride=(2, 2))
    (6): Sigmoid()
  )
)
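
The training loop that follows passes `fake.detach()` to the discriminator during the discriminator step and the undetached `fake` during the generator step. The sketch below illustrates the difference on two tiny linear layers, which are hypothetical stand-ins for the real networks: with `detach()`, no gradient reaches the generator; without it, the gradient flows back through the discriminator into the generator.

```python
import torch
import torch.nn as nn

# Tiny stand-ins for the real networks (hypothetical, for illustration only)
toy_gen = nn.Linear(4, 4)
toy_disc = nn.Linear(4, 1)

noise = torch.randn(8, 4)
fake = toy_gen(noise)

# Discriminator step: detach() cuts the graph, so no gradient reaches the generator
toy_disc(fake.detach()).mean().backward()
gen_grad_after_disc_step = toy_gen.weight.grad
disc_grad_after_disc_step = toy_disc.weight.grad

# Generator step: without detach(), gradients flow through the discriminator into the generator
toy_disc(fake).mean().backward()
gen_grad_after_gen_step = toy_gen.weight.grad

print(gen_grad_after_disc_step is None)       # True
print(disc_grad_after_disc_step is not None)  # True
print(gen_grad_after_gen_step is not None)    # True
```

Detaching also leaves the generator's computation graph intact after the discriminator's backward pass, which is why the same `fake` batch can be reused for the generator step.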

In [18]:
""" Training. 
N.B.
Check torch_20_GAN file for further info.
"""
for epoch in range(NUM_EPOCHS):
    # Class labels are not needed: GAN training is unsupervised
    for batch_idx, (real, _) in enumerate(dataloader):
        real = real.to(device)
        noise = torch.randn(BATCH_SIZE, NOISE_DIM, 1, 1).to(device)
        fake = gen(noise)

        ############## Train Discriminator
        disc_real = disc(real).reshape(-1)
        loss_disc_real = criterion(disc_real, torch.ones_like(disc_real))
        disc_fake = disc(fake.detach()).reshape(-1)
        loss_disc_fake = criterion(disc_fake, torch.zeros_like(disc_fake))
        loss_disc = (loss_disc_real + loss_disc_fake) / 2
        ### Gradients to zero, backpropagate and update Discriminator's parameters
        disc.zero_grad()
        loss_disc.backward()
        opt_disc.step()

        ##################### Train Generator
        output = disc(fake).reshape(-1)
        loss_gen = criterion(output, torch.ones_like(output))
        ### Gradients to zero, backpropagate and update Generator's parameters
        gen.zero_grad()
        loss_gen.backward()
        opt_gen.step()

        # Print losses + print to Tensorboard
        if batch_idx % 100 == 0:
            print(f"Epoch [{epoch}/{NUM_EPOCHS}] Batch {batch_idx}/{len(dataloader)} \
                    Loss D: {loss_disc:.4f}, loss G: {loss_gen:.4f}")

            with torch.no_grad():
                fake = gen(fixed_noise)
                # take out (up to) 32 examples
                img_grid_real = torchvision.utils.make_grid(real[:32], normalize=True)
                img_grid_fake = torchvision.utils.make_grid(fake[:32], normalize=True)
                ## Log generated fake and real images to TensorBoard
                writer_real.add_image("Real", img_grid_real, global_step=step)
                writer_fake.add_image("Fake", img_grid_fake, global_step=step)

            step += 1

Epoch [0/5] Batch 0/469                   Loss D: 0.6908, loss G: 0.7850
Epoch [0/5] Batch 100/469                   Loss D: 0.0004, loss G: 7.8235
Epoch [0/5] Batch 200/469                   Loss D: 0.0003, loss G: 8.9830
Epoch [0/5] Batch 300/469                   Loss D: 0.0001, loss G: 9.2160
Epoch [0/5] Batch 400/469                   Loss D: 0.0001, loss G: 10.2485
Epoch [1/5] Batch 0/469                   Loss D: 0.0000, loss G: 10.7900
Epoch [1/5] Batch 100/469                   Loss D: 0.0000, loss G: 10.9549
Epoch [1/5] Batch 200/469                   Loss D: 0.0000, loss G: 11.0135
Epoch [1/5] Batch 300/469                   Loss D: 0.0000, loss G: 11.7651
Epoch [1/5] Batch 400/469                   Loss D: 0.0000, loss G: 12.0667
Epoch [2/5] Batch 0/469                   Loss D: 0.0000, loss G: 12.1739
Epoch [2/5] Batch 100/469                   Loss D: 0.0000, loss G: 12.4651
Epoch [2/5] Batch 200/469                   Loss D: 0.0000, loss G: 12.6769
Epoch [2/5] Batch 300/