<a href="https://colab.research.google.com/github/Leila828/DeepLearning_projects/blob/master/Generate_images_with_GANs.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

Generator
A GAN generator takes a random noise vector as input and produces a generated image. To make its architecture more reusable, you will pass both input and output shapes as parameters to the model. This way, you can use the same model with different sizes of input noise and images of varying shapes.

You will find torch.nn already imported for you as nn. You can also access a custom gen_block() function, which returns a block consisting of a linear layer, batch norm, and ReLU activation. You will use it as a building block for the generator.

In [1]:
def gen_block(in_dim, out_dim):
    return nn.Sequential(
        nn.Linear(in_dim, out_dim),
        nn.BatchNorm1d(out_dim),
        nn.ReLU(inplace=True)
    )

In [3]:
import torch.nn as nn

In [4]:
class Generator(nn.Module):
    def __init__(self, in_dim, out_dim):
        super(Generator, self).__init__()
        # Define generator block
        self.generator = nn.Sequential(
            gen_block(in_dim, 256),
            gen_block(256, 512),
            gen_block(512, 1024),
            # Add linear layer
            nn.Linear(1024, out_dim),
            # Add activation
            nn.Sigmoid()
        )

    def forward(self, x):
        # Pass input through generator
        return self.generator(x)
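
As a quick sanity check, the sketch below rebuilds the generator compactly and passes a batch of noise through it. The noise size of 16 and the flattened 28x28 output are assumptions chosen for illustration; the text above does not fix them.

```python
import torch
import torch.nn as nn

def gen_block(in_dim, out_dim):
    # Linear layer -> batch norm -> ReLU, as defined above
    return nn.Sequential(
        nn.Linear(in_dim, out_dim),
        nn.BatchNorm1d(out_dim),
        nn.ReLU(inplace=True),
    )

class Generator(nn.Module):
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.generator = nn.Sequential(
            gen_block(in_dim, 256),
            gen_block(256, 512),
            gen_block(512, 1024),
            nn.Linear(1024, out_dim),
            nn.Sigmoid(),
        )

    def forward(self, x):
        return self.generator(x)

z_dim, im_dim = 16, 28 * 28    # assumed sizes, for illustration only
gen = Generator(z_dim, im_dim)
noise = torch.randn(8, z_dim)  # batch of 8 noise vectors
fake = gen(noise)
print(fake.shape)              # one flattened image per noise vector
```

Because the last layer is a sigmoid, every generated pixel value falls between 0 and 1.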

With the generator defined, the next step in building a GAN is to construct the discriminator. It takes the generator's output as input, and produces a binary prediction: is the input generated or real?

You will find torch.nn already imported for you as nn. You can also access a custom disc_block() function, which returns a block consisting of a linear layer followed by a LeakyReLU activation. You will use it as a building block for the discriminator.

In [5]:
def disc_block(in_dim, out_dim):
    return nn.Sequential(
        nn.Linear(in_dim, out_dim),
        nn.LeakyReLU(0.2)
    )

In [None]:
class Discriminator(nn.Module):
    def __init__(self, im_dim):
        super(Discriminator, self).__init__()
        self.disc = nn.Sequential(
            disc_block(im_dim, 1024),
            disc_block(1024, 512),
            # Define last discriminator block
            disc_block(512,256),
            # Add a linear layer
            nn.Linear(256, 1),
        )

    def forward(self, x):
        # Define the forward method
        return self.disc(x)
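
Note that the discriminator ends in a plain linear layer, so it outputs raw scores (logits) rather than probabilities. This fits the nn.BCEWithLogitsLoss criterion used in the loss functions later on, which applies the sigmoid internally. A minimal sketch of that equivalence, using made-up logits:

```python
import torch
import torch.nn as nn

logits = torch.tensor([[2.0], [-1.0], [0.5]])  # made-up discriminator outputs
labels = torch.ones_like(logits)               # pretend every image is real

# Loss computed directly on logits (sigmoid applied inside the criterion)
loss_from_logits = nn.BCEWithLogitsLoss()(logits, labels)
# Same loss computed by applying the sigmoid explicitly first
loss_from_probs = nn.BCELoss()(torch.sigmoid(logits), labels)
print(loss_from_logits.item(), loss_from_probs.item())
```

The two values agree up to floating-point error; the logits version is generally preferred for numerical stability.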

Deep convolutional GANs

Define a convolutional generator following the DCGAN guidelines discussed in the last video.

torch.nn has been pre-imported as nn for your convenience. Additionally, a custom function dc_gen_block() is available, which returns a block consisting of a transposed convolution, batch norm, and ReLU activation. This function serves as the building block for the convolutional generator. You can get familiar with dc_gen_block()'s definition below.

In [6]:
def dc_gen_block(in_dim, out_dim, kernel_size, stride):
    return nn.Sequential(
        nn.ConvTranspose2d(in_dim, out_dim, kernel_size, stride=stride),
        nn.BatchNorm2d(out_dim),
        nn.ReLU()
    )

In [7]:
class DCGenerator(nn.Module):
    def __init__(self, in_dim, kernel_size=4, stride=2):
        super(DCGenerator, self).__init__()
        self.in_dim = in_dim
        self.gen = nn.Sequential(
            dc_gen_block(in_dim, 1024, kernel_size, stride),
            dc_gen_block(1024, 512, kernel_size, stride),
            # Add last generator block
            dc_gen_block(512,256, kernel_size, stride),
            # Add transposed convolution
            nn.ConvTranspose2d(256, 3, kernel_size, stride=stride),
            # Add tanh activation
            nn.Tanh()
        )

    def forward(self, x):
        x = x.view(len(x), self.in_dim, 1, 1)
        return self.gen(x)
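
To see what image size this generator produces, it helps to trace the spatial size through the four transposed convolutions. With padding 0 (the default), a transposed convolution maps a side length n to (n - 1) * stride + kernel_size. The helper name conv_transpose_out below is hypothetical and only serves this illustration.

```python
def conv_transpose_out(size, kernel_size=4, stride=2):
    # Output side length of ConvTranspose2d with padding=0, dilation=1,
    # output_padding=0 (the defaults used in DCGenerator)
    return (size - 1) * stride + kernel_size

size = 1  # forward() reshapes the noise to (in_dim, 1, 1)
sizes = [size]
for _ in range(4):  # three dc_gen_block layers plus the final ConvTranspose2d
    size = conv_transpose_out(size)
    sizes.append(size)
print(sizes)  # [1, 4, 10, 22, 46]
```

With the default kernel size and stride, the generator therefore outputs 3-channel images of 46x46 pixels.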

Convolutional Discriminator
With the DCGAN's generator ready, the last step before you can proceed to training it is to define the convolutional discriminator.

torch.nn is imported for you under its usual alias. To build the convolutional discriminator, you will use a custom dc_disc_block() function, which returns a block consisting of a convolution followed by batch norm and a leaky ReLU activation. You can inspect dc_disc_block()'s definition below.

In [8]:
def dc_disc_block(in_dim, out_dim, kernel_size, stride):
    return nn.Sequential(
        nn.Conv2d(in_dim, out_dim, kernel_size, stride=stride),
        nn.BatchNorm2d(out_dim),
        nn.LeakyReLU(0.2),
    )

In [9]:
class DCDiscriminator(nn.Module):
    def __init__(self, kernel_size=4, stride=2):
        super(DCDiscriminator, self).__init__()
        self.disc = nn.Sequential(
            # Add first discriminator block
            dc_disc_block(3, 512, kernel_size, stride),
            dc_disc_block(512, 1024, kernel_size, stride),
            # Add a convolution
            nn.Conv2d(1024, 1, kernel_size, stride=stride),
        )

    def forward(self, x):
        # Pass input through sequential block
        x = self.disc(x)
        return x.view(len(x), -1)
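
The sketch below traces how three stride-2 convolutions with kernel size 4 shrink the spatial size, assuming 46x46 inputs (the size DCGenerator produces with its default kernel size and stride). To keep it fast, it uses small hypothetical channel counts instead of 512 and 1024, and it omits batch norm and activations, which do not change shapes.

```python
import torch
import torch.nn as nn

def conv_out(size, kernel_size=4, stride=2):
    # Output side length of Conv2d with padding=0 and dilation=1
    return (size - kernel_size) // stride + 1

# Stand-in stack: same kernel size and stride, small hypothetical channel counts
stack = nn.Sequential(
    nn.Conv2d(3, 8, 4, stride=2),
    nn.Conv2d(8, 16, 4, stride=2),
    nn.Conv2d(16, 1, 4, stride=2),
)
x = torch.randn(2, 3, 46, 46)   # batch of 2 assumed 46x46 RGB images
out = stack(x)
flat = out.view(len(out), -1)   # same flattening as in forward()
print(out.shape, flat.shape)
print(conv_out(conv_out(conv_out(46))))  # side length after three layers
```

Under these assumptions each image yields a 4x4 map, that is 16 logits rather than a single score. A criterion such as nn.BCEWithLogitsLoss with its default mean reduction averages over all of them.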

Generator loss
Before you can train your GAN, you need to define loss functions for both the generator and the discriminator. You will start with the former.

Recall that the generator's job is to produce fake images that fool the discriminator into classifying them as real. Therefore, the generator incurs a loss if the images it generates are classified by the discriminator as fake (label 0).

Define the gen_loss() function that calculates the generator loss. It takes four arguments:

gen, the generator model
disc, the discriminator model
num_images, the number of images in batch
z_dim, the size of the input random noise

In [10]:
import torch

def gen_loss(gen, disc, num_images, z_dim):
    # Define random noise (standard normal)
    noise = torch.randn(num_images, z_dim)
    # Generate fake image
    fake = gen(noise)
    # Get discriminator's prediction on the fake image
    disc_pred = disc(fake)
    # Compute generator loss: the generator wants fakes to be labeled real (1)
    criterion = nn.BCEWithLogitsLoss()
    loss = criterion(disc_pred, torch.ones_like(disc_pred))
    return loss
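
To make the role of the all-ones target concrete, here is a small sketch with two made-up discriminator logits: one where the discriminator is fooled and one where it catches the fake.

```python
import torch
import torch.nn as nn

criterion = nn.BCEWithLogitsLoss()

fooled = torch.tensor([[3.0]])    # made-up logit: discriminator says "real"
caught = torch.tensor([[-3.0]])   # made-up logit: discriminator says "fake"

# The target is always 1: the generator is rewarded when fakes pass as real
loss_fooled = criterion(fooled, torch.ones_like(fooled))
loss_caught = criterion(caught, torch.ones_like(caught))
print(loss_fooled.item(), loss_caught.item())
```

The generator's loss is small (about 0.05) when the discriminator is fooled and large (about 3.05) when the fake is detected.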

Discriminator loss
It's time to define the loss for the discriminator. Recall that the discriminator's job is to classify images as either real or fake. Therefore, the discriminator incurs a loss if it classifies the generator's outputs as real (label 1) or the real images as fake (label 0).

Define the disc_loss() function that calculates the discriminator loss. It takes five arguments:

gen, the generator model
disc, the discriminator model
real, a sample of real images from the training data
num_images, the number of images in batch
z_dim, the size of the input random noise

In [11]:
def disc_loss(gen, disc, real, num_images, z_dim):
    criterion = nn.BCEWithLogitsLoss()
    noise = torch.randn(num_images, z_dim)
    fake = gen(noise)
    # Get discriminator's predictions for fake images
    disc_pred_fake = disc(fake)
    # Calculate the fake loss component (target label 0)
    fake_loss = criterion(disc_pred_fake, torch.zeros_like(disc_pred_fake))
    # Get discriminator's predictions for real images
    disc_pred_real = disc(real)
    # Calculate the real loss component (target label 1)
    real_loss = criterion(disc_pred_real, torch.ones_like(disc_pred_real))
    disc_loss = (real_loss + fake_loss) / 2
    return disc_loss
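
One optional refinement, not part of the function above: since this loss is only used to update the discriminator, the fake images can be detached from the generator's graph so that no gradients are computed for the generator. A minimal sketch with tiny stand-in linear models (hypothetical, for illustration only):

```python
import torch
import torch.nn as nn

gen = nn.Linear(4, 6)    # stand-in generator: noise of size 4 -> "image" of size 6
disc = nn.Linear(6, 1)   # stand-in discriminator: "image" -> one logit
criterion = nn.BCEWithLogitsLoss()

noise = torch.randn(8, 4)
fake = gen(noise).detach()   # cut the graph: gradients stop here
pred = disc(fake)
loss = criterion(pred, torch.zeros_like(pred))
loss.backward()

print(gen.weight.grad)                # None: no gradient reached the generator
print(disc.weight.grad is not None)   # True: the discriminator received gradients
```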

Training loop
Finally, all the hard work you put into defining the model architectures and loss functions comes to fruition: it's training time! Your job is to implement and execute the GAN training loop. Note: a break statement is placed after the first batch of data to avoid a long runtime.

The two optimizers, disc_opt and gen_opt, have been initialized as Adam() optimizers. The functions to compute the losses that you defined earlier, gen_loss() and disc_loss(), are available to you. A dataloader is also prepared for you.

Recall that:

disc_loss()'s arguments are: gen, disc, real, cur_batch_size, z_dim.
gen_loss()'s arguments are: gen, disc, cur_batch_size, z_dim.
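
The optimizer setup itself is not shown here. A minimal sketch of what it might look like follows; the learning rate and betas are assumptions (values commonly used for DCGANs), and the tiny linear models are hypothetical stand-ins for gen and disc.

```python
import torch
import torch.nn as nn

gen = nn.Linear(16, 32)   # stand-in for the generator
disc = nn.Linear(32, 1)   # stand-in for the discriminator

lr = 0.0002               # assumed learning rate
betas = (0.5, 0.999)      # assumed Adam betas

# One optimizer per network, each seeing only its own parameters
gen_opt = torch.optim.Adam(gen.parameters(), lr=lr, betas=betas)
disc_opt = torch.optim.Adam(disc.parameters(), lr=lr, betas=betas)
```

Keeping two separate optimizers means that disc_opt.step() only updates the discriminator and gen_opt.step() only updates the generator.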

In [None]:
from torch.utils.data import DataLoader
# Assuming 'dataset' is defined and contains your data
dataloader = DataLoader(dataset, batch_size=32, shuffle=True) # Create a DataLoader instance


In [12]:
for epoch in range(1):
    for real in dataloader:
        cur_batch_size = len(real)

        disc_opt.zero_grad()
        # Calculate discriminator loss
        # (stored as d_loss so the disc_loss() function is not overwritten)
        d_loss = disc_loss(gen, disc, real, cur_batch_size, 16)
        # Compute gradients
        d_loss.backward()
        disc_opt.step()

        gen_opt.zero_grad()
        # Calculate generator loss
        # (stored as g_loss so the gen_loss() function is not overwritten)
        g_loss = gen_loss(gen, disc, cur_batch_size, 16)
        # Compute generator gradients
        g_loss.backward()
        gen_opt.step()

        print(f"Generator loss: {g_loss}")
        print(f"Discriminator loss: {d_loss}")
        break



Generating images
Now that you have designed and trained your GAN, it's time to evaluate the quality of the images it can generate. For a start, you will perform a visual inspection to see whether the generated images resemble Pokémon at all. To do this, you will create random noise as input for the generator, pass it to the model, and plot the outputs.

The Deep Convolutional Generator with trained weights is available to you as gen. torch and matplotlib.pyplot as plt are already imported for you.

In [14]:
import torch
import matplotlib.pyplot as plt

In [15]:
num_images_to_generate = 5
# Create random noise tensor
noise = torch.randn(num_images_to_generate, 16)

# Generate images
with torch.no_grad():
    fake = gen(noise)
print(f"Generated tensor shape: {fake.shape}")

for i in range(num_images_to_generate):
    # Slice fake to select i-th image
    image_tensor = fake[i, :,:,:]
    # Permute the image dimensions
    image_tensor_permuted = image_tensor.permute(1,2,0)
    plt.imshow(image_tensor_permuted)
    plt.show()
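
Because DCGenerator ends in a Tanh, its outputs lie in [-1, 1], while plt.imshow expects float RGB values in [0, 1] and clips anything outside that range. A small sketch of rescaling before plotting, using a random tensor as a stand-in for one generated image:

```python
import torch

# Stand-in for one generated image in channel-first layout, values in [-1, 1]
image_tensor = torch.rand(3, 46, 46) * 2 - 1

# Map [-1, 1] to [0, 1], then move channels last for plotting
image_for_plot = ((image_tensor + 1) / 2).permute(1, 2, 0)
print(image_for_plot.shape, image_for_plot.min().item(), image_for_plot.max().item())
```

The resulting tensor can be passed to plt.imshow() in place of image_tensor_permuted.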


Fréchet Inception Distance
Visual inspection of the generated images is a great start. But once they look okay, a more precise, quantitative evaluation will help you understand the generator's performance. You will evaluate your GAN using the Fréchet Inception Distance, or FID.

Two tensors with fake and real images, 32 examples each, are available to you as fake and real, respectively. Use them to compute the FID!

In [16]:
# Import FrechetInceptionDistance
from torchmetrics.image.fid import FrechetInceptionDistance

# Instantiate FID
fid = FrechetInceptionDistance(feature=64)


# Update FID with fake images
fid.update((fake * 255).to(torch.uint8), real=False)
# Update FID with real images
fid.update((real * 255).to(torch.uint8), real=True)

# Compute the metric
fid_score = fid.compute()
print(fid_score)
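
For intuition, the FID is the Fréchet distance between two Gaussians fitted to Inception features of the real and fake images: ||mu_r - mu_f||^2 + Tr(S_r + S_f - 2 (S_r S_f)^(1/2)). The sketch below computes that distance with NumPy on random stand-in feature vectors. It skips the Inception network entirely and is not meant to mirror how torchmetrics implements the metric; the function name frechet_distance is hypothetical.

```python
import numpy as np

def frechet_distance(feats_a, feats_b):
    # Fit a Gaussian (mean, covariance) to each set of feature vectors
    mu_a, mu_b = feats_a.mean(axis=0), feats_b.mean(axis=0)
    cov_a = np.cov(feats_a, rowvar=False)
    cov_b = np.cov(feats_b, rowvar=False)
    # Tr((cov_a cov_b)^(1/2)) equals the sum of the square roots of the
    # eigenvalues of cov_a @ cov_b, which are real and non-negative in theory
    eigvals = np.linalg.eigvals(cov_a @ cov_b)
    tr_sqrt = np.sqrt(np.clip(eigvals.real, 0, None)).sum()
    diff = mu_a - mu_b
    return diff @ diff + np.trace(cov_a) + np.trace(cov_b) - 2 * tr_sqrt

rng = np.random.default_rng(0)
real_feats = rng.normal(size=(500, 8))   # stand-in "real" features
fake_feats = real_feats + 1.0            # same spread, mean shifted by 1 per dimension

print(frechet_distance(real_feats, real_feats))  # close to 0
print(frechet_distance(real_feats, fake_feats))  # close to 8, the squared mean shift
```

Lower values mean the two feature distributions are closer, which is why a lower FID indicates generated images that are statistically more similar to the real ones.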
