# CSE5CV - Generative Adversarial Networks (GANs)

In this lab we will explore two different Generative Adversarial Network (GAN) architectures pretrained on different datasets.

Most of the code in this lab is provided to you.

By the end of this lab, you should be able to:
* Implement and run inference on a pre-trained GAN

## Colab preparation

Google Colab is a free online service for editing and running code in notebooks like this one. To get started, follow the steps below:

1. Click the "Copy to Drive" button at the top of the page. This will open a new tab with the title "Copy of...". This is a copy of the lab notebook which is saved in your personal Google Drive. **Continue working in that copy, otherwise you will not be able to save your work**. You may close the original Colab page (the one which displays the "Copy to Drive" button).
2. Run the code cell below to prepare the Colab coding environment by downloading sample files. Note that if you close this notebook and come back to work on it again later, you will need to run this cell again.

In [None]:
!git clone https://github.com/ltu-cse5cv/cse5cv-labs.git
%cd cse5cv-labs/Lab08

## Packages
In this lab we will be using the following packages:
* *PyTorch* to work with pretrained GANs
* *numpy* to represent image data
* *matplotlib* for visualization

In [None]:
import numpy as np
import torch
from matplotlib import pyplot as plt

Refer to the `Packages` notebook for more information on packages we have used before.

## Generative Adversarial Networks (GANs)
You have covered GANs in detail in your lectures, but here we present a brief recap of what GANs are, and at a basic level, how they work.

In previous labs you have seen examples of using a trained deep learning model to take some input and make some sort of prediction on that input, whether it be classification, detection or segmentation.

Another application for deep learning is one where we want a network to *generate* synthetic data, that is, data that doesn't exist in the real world. Generative Adversarial Networks (GANs) have recently become very popular in this space.

A GAN consists of two networks, a generator and a discriminator, that compete against one another. The task of the generator is to create synthetic data that resembles the data in a training dataset, whilst the task of the discriminator is to predict whether a given sample is real (taken from the training dataset) or synthesized by the generator.

The end goal of the generator is to increase the error rate of the discriminator, meaning the discriminator can no longer reliably determine whether a sample is real or synthetic. Once an acceptable error rate has been achieved (training is finished), the generator alone is used to generate synthetic data.
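
The competition between the two networks is often written as a pair of binary cross-entropy losses. The following is a minimal NumPy sketch of that idea, not the training code of the models used in this lab. The function names and the example scores are made up for illustration, and the generator loss shown is the common "non-saturating" variant.

```python
import numpy as np


def discriminator_loss(d_real, d_fake):
    """Binary cross-entropy: real samples should score 1, synthetic samples 0."""
    d_real, d_fake = np.asarray(d_real), np.asarray(d_fake)
    return float(-np.mean(np.log(d_real)) - np.mean(np.log(1.0 - d_fake)))


def generator_loss(d_fake):
    """Non-saturating generator loss: low when the discriminator scores fakes near 1."""
    return float(-np.mean(np.log(np.asarray(d_fake))))


# Hypothetical discriminator outputs (probability that a sample is real)
confident = discriminator_loss(d_real=[0.9, 0.95], d_fake=[0.1, 0.05])
fooled = discriminator_loss(d_real=[0.5, 0.5], d_fake=[0.5, 0.5])

print(f'Discriminator loss when it tells real from fake: {confident:.3f}')
print(f'Discriminator loss when it can only guess:       {fooled:.3f}')
print(f'Generator loss, fakes scored 0.1: {generator_loss([0.1]):.3f}')
print(f'Generator loss, fakes scored 0.5: {generator_loss([0.5]):.3f}')
```

When the discriminator is reduced to guessing (every score is 0.5), its loss rises to 2·ln 2 ≈ 1.386, which is the situation the generator is working towards.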

There are many variations of GANs, trained in different ways and used for different tasks. Some generate synthetic images that resemble images from an existing dataset, some generate music matching the style of existing artists, others apply the style of a dataset to new images, and some can even generate new images from a text description.

Training GANs can be a lengthy and tricky process. Because of this, in this lab we will use two publicly available pre-trained GANs. These models are hosted on the [PyTorch hub](https://pytorch.org/hub/).

# 1. PGAN
The first GAN architecture we will explore is based on the [Progressive Growing of GANs for Improved Quality, Stability, and Variation](https://arxiv.org/abs/1710.10196) paper. The main contribution of this paper was a new training methodology for GANs to speed up training and improve stability.

## 1.1 Pre-Trained Network
We will be using a network that is pretrained on the [celebA HQ dataset](https://github.com/nperraud/download-celebA-HQ) (based on the [celebA dataset](https://mmlab.ie.cuhk.edu.hk/projects/CelebA.html)). This dataset consists of face images of celebrities.

This network is trained to generate synthetic images (of size 512x512px) that look similar to the data distribution in the celebA dataset. If you are interested, you can look at the [PGAN source code on GitHub](https://github.com/facebookresearch/pytorch_GAN_zoo/blob/master/models/progressive_gan.py).

In the code cell below, we download a pre-trained PGAN model from the [PyTorch hub](https://pytorch.org/hub/). The size of this model is around 270MB, so it may take a few minutes to download for the first time.

In [None]:
pgan = torch.hub.load('facebookresearch/pytorch_GAN_zoo:hub', 'PGAN', model_name='celebAHQ-512', pretrained=True, useGPU=False)

## 1.2 Model Input
Now that we have a network, how do we get it to generate images?

In past labs, we needed to feed our networks some input, which they used to generate predictions. In a similar way, we need to feed our GAN some input data, but in this case that input data consists of random noise!

Our model produces 512x512px images. To generate each image it requires a 512-dimensional vector filled with random values. Because we always need a batch dimension, the input to our network should be a tensor of shape *(**N**, 512)*, where ***N*** is the number of images to generate (our batch size).

The PGAN implementation we are using includes a method to construct this random data (sampled from a normal distribution with mean=0, std=1). In the code cell below we create a tensor of shape *(4, 512)* filled with random data.

In [None]:
noise, _ = pgan.buildNoiseData(n_samples=4)
print(f'Shape of noise: {noise.shape}')
print(f'Min/Max value of noise: {noise.min()}/{noise.max()}')
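
For intuition about what this tensor contains, the sketch below draws a comparable array with NumPy. It is a stand-in that assumes only what is stated above (independent values from a normal distribution with mean 0 and standard deviation 1). It does not reproduce how `buildNoiseData()` is implemented, and the seed is arbitrary.

```python
import numpy as np

rng = np.random.default_rng(seed=0)  # arbitrary seed, only for repeatability

# One row per image to generate, 512 random values per row
toy_noise = rng.standard_normal(size=(4, 512)).astype(np.float32)

print(f'Shape: {toy_noise.shape}')
print(f'Mean: {toy_noise.mean():.3f}, std: {toy_noise.std():.3f}')
print(f'Min/Max: {toy_noise.min():.3f}/{toy_noise.max():.3f}')
```

Note that the minimum and maximum are not fixed bounds: most values fall within a few standard deviations of 0, but the exact extremes change with every draw.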

## 1.3 Image Generation
Now that we have created input noise for our GAN, the next step is to pass it through our model to generate some images!

Given the way the pre-trained model is implemented, to perform the forward pass through our model we need to call the *`test()`* method.

In the code cell below, we generate 4 new images by passing the noise tensor to our GAN.

In [None]:
with torch.no_grad():
    generated_images = pgan.test(noise)
print(f'Shape of output: {generated_images.shape}')
print(f'Min/Max value of output: {generated_images.min()}/{generated_images.max()}')

## 1.4 Postprocessing and Visualization
Now that we have some generated images, all that remains is to postprocess the image data into a *numpy* array, then visualize it!

The postprocessing steps required for each image are:
* Get the values into the range [0, 1]
* Convert the tensor to a *numpy* array
* Transpose the array from CHW ordering to HWC ordering
* Scale the values into the range [0, 255] and convert to datatype uint8
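
To make the arithmetic in these steps concrete, here is a small NumPy-only sketch applied to a hand-made 3x2x2 array. The array `toy_output` is a hypothetical stand-in for a generator output, and the tensor-to-numpy conversion step is skipped because the data is already a *numpy* array.

```python
import numpy as np

# Hypothetical 3x2x2 "generator output" in CHW order (values chosen by hand)
toy_output = np.array(
    [
        [[-1.0, 0.0], [0.5, 1.0]],   # channel 0
        [[1.0, 0.5], [0.0, -1.0]],   # channel 1
        [[0.0, 0.0], [0.0, 0.0]],    # channel 2
    ],
    dtype=np.float32,
)

low, high = toy_output.min(), toy_output.max()

# Shift and scale into [0, 1]
scaled = (toy_output - low) / max(high - low, 1e-5)

# Reorder the axes from CHW to HWC
hwc = np.transpose(scaled, (1, 2, 0))

# Scale to [0, 255] and convert to uint8 (fractional parts are truncated)
toy_image = (hwc * 255).astype(np.uint8)

print(toy_image.shape, toy_image.dtype)
print(toy_image[0, 0])  # the three channel values of the pixel at row 0, column 0
```

For the top-left pixel, the channel values -1, 1 and 0 become 0, 255 and 127: the value 0 sits halfway through the range, maps to 127.5, and is truncated to 127 by the conversion to uint8.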

In the code cell below we provide the function *`display_image()`* that you have used in previous labs, and the function *`postprocess_image()`* which performs the steps outlined above. We then postprocess and display each generated image.

In [None]:
def display_image(image, title=None):
    fig, axes = plt.subplots(figsize=(12, 8))

    if image.ndim == 2:
        axes.imshow(image, cmap='gray', vmin=0, vmax=255)
    else:
        axes.imshow(image)

    if title is not None:
        plt.title(title)

    plt.show()


def postprocess_image(tensor):
    """Postprocesses a generated image in a tensor, producing a numpy array

    Args:
        tensor (3xHxW torch.Tensor): The tensor to postprocess

    Returns:
        (HxWx3 ndarray): The processed tensor converted to a numpy array
    """
    # Determine min/max values of tensor
    low, high = float(tensor.min()), float(tensor.max())

    # Subtract the minimum value from all values in tensor (making all values positive with min=0)
    tensor = tensor - low

    # Divide all values by the range of values in the tensor (making all values between [0, 1])
    tensor = tensor / max(high - low, 1e-5)

    # Convert tensor to numpy array
    image = tensor.detach().cpu().numpy()

    # Transpose to get ordering HWC
    image = np.transpose(image, (1, 2, 0))

    # Convert [0, 1] float32 into [0, 255] uint8
    image = (image * 255).astype(np.uint8)

    return image


# Postprocess and display the generated images
for image_tensor in generated_images:
    proc_image = postprocess_image(image_tensor)
    display_image(proc_image)

Very cool results! You have just successfully used a GAN to take a vector filled with random noise, and generate synthetic images of people (none of these people exist in real life!). Whilst you may still see some artefacts in the images, this is still a very impressive result!

## 1.5 Pulling it Together
You've now seen all the bits and pieces involved to generate synthetic images of people using a GAN. To make this a bit easier to use, we will now write a function that will take a number of images to generate, then generate that many images!

**Task**: Write a function *`generate_images`* that:
* Takes a `model` and `num_images` as arguments
* Creates `num_images` noise vectors
* Performs the forward pass through the model
* Then for every generated image:
    * Postprocesses and displays the image
    
At the end of the code cell is some code to test your solution. Try running this test a few times and see all the different types of faces you can generate (some of them are very convincingly real)!

In [None]:
# TODO: Your function here



# Test your solution
generate_images(pgan, 5)

In [None]:
#@title Task solution

def generate_images(model, num_images):
    # Generate random noise
    noise, _ = model.buildNoiseData(n_samples=num_images)

    # Perform forward pass
    with torch.no_grad():
        generated_images = model.test(noise)

    # Postprocess and display each generated image
    for image_tensor in generated_images:
        proc_image = postprocess_image(image_tensor)
        display_image(proc_image)

## 1.6 Digging Deeper
We've seen that the generated images depend on the random noise vector fed to the GAN. How do small perturbations in the noise impact the generated image?

In the code cell below, we generate a single random noise vector, then add a set of small perturbations to see how they impact the generated image.  
If you are interested, feel free to add other values to the `perturbations` list to see how they impact the generated image.

In [None]:
# Set of perturbations to apply
perturbations = [-0.2, -0.1, 0, 0.1, 0.2]

# Generate random noise vector
noise, _ = pgan.buildNoiseData(n_samples=1)

# Create a tensor combining perturbations
noise_tensor = torch.cat([noise + pert for pert in perturbations], dim=0)

# Perform forward pass
with torch.no_grad():
    generated_images = pgan.test(noise_tensor)

# Postprocess all images
postprocessed_images = [postprocess_image(image_tensor) for image_tensor in generated_images]

# Create a plot to display all images in a single row
fig, axes = plt.subplots(nrows=1, ncols=len(perturbations), figsize=(19, 10))

# Display images and set the titles to the perturbation applied
for ax, image, pert in zip(axes, postprocessed_images, perturbations):
    ax.imshow(image)
    ax.set_title(pert)
plt.show()
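
The line that builds `noise_tensor` relies on two things: adding a scalar to a tensor adds it to every component, and concatenating along `dim=0` stacks the shifted copies into one batch. The following NumPy sketch mirrors that step, with `np.concatenate` standing in for `torch.cat`. The array `toy_noise` is a random stand-in rather than output from `buildNoiseData()`.

```python
import numpy as np

perturbations = [-0.2, -0.1, 0, 0.1, 0.2]

rng = np.random.default_rng(seed=0)  # arbitrary seed
toy_noise = rng.standard_normal(size=(1, 512))  # one noise vector with a batch dimension

# Each `toy_noise + pert` has shape (1, 512); stacking five of them gives (5, 512)
toy_batch = np.concatenate([toy_noise + pert for pert in perturbations], axis=0)

print(toy_batch.shape)
# Every component of a row is shifted by the same amount
print(np.allclose(toy_batch[4] - toy_batch[2], 0.2))
```

Each image in the row therefore comes from the same vector with all 512 components nudged by the same amount.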

**Question**: PGAN was trained to produce images from random noise vectors drawn from a normal distribution with a mean of 0 and a standard deviation of 1, which is what *`buildNoiseData()`* produces. Consider using much larger perturbations, such as -50 and 50. These push the input well outside the region the model was trained on. Generally speaking, neural networks only work well on inputs similar to those they were trained on. A "working" GAN is one which produces somewhat realistic and highly varied images. So, what do you expect will happen if you set `perturbations = [-500, -50, 0.0, 50, 500]`?

<details>
<summary style='cursor:pointer;'><u>Answer</u></summary>
Outside the trained region, there is no guarantee of much variation, because the model was never explicitly trained to produce it. If you run the above code, you'll see that there's not much difference between -50 and -500, even though we saw quite a noticeable difference when adding/subtracting 0.2. That is, the PGAN model outputs nearly the same image for large regions of space outside the trained region.
</details>
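
One way to get a feel for why such large perturbations behave differently is to look at the input vectors themselves, before the network is involved. The sketch below is an illustration under stated assumptions, not an analysis of PGAN: it uses a random NumPy stand-in for the noise vector and a hypothetical helper, `cosine_similarity`, to compare the directions of perturbed vectors.

```python
import numpy as np


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))


rng = np.random.default_rng(seed=0)  # arbitrary seed
toy_noise = rng.standard_normal(512)  # stand-in for one noise vector

small = cosine_similarity(toy_noise - 0.2, toy_noise + 0.2)
large = cosine_similarity(toy_noise - 50, toy_noise - 500)

print(f'Similarity of noise-0.2 and noise+0.2: {small:.4f}')
print(f'Similarity of noise-50 and noise-500:  {large:.4f}')
```

With a shift of 0.2 the random noise still determines most of the vector's direction, whereas shifts of -50 and -500 swamp the noise and leave two vectors pointing almost the same way. Whether this fully explains the near-identical images depends on how the generator treats its input, which this sketch does not model.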

# 2. DCGAN
The second GAN architecture we will explore is based on the [Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks](https://arxiv.org/abs/1511.06434) paper. The main contribution of this paper was the introduction of a specific CNN-based GAN architecture.

The steps for setting this up are nearly identical to PGAN, so we won't spend too much time in this section.

## 2.1 Pre-Trained Network
We will be using a network that is pretrained on the [FashionGen dataset](https://arxiv.org/abs/1806.08317). This dataset consists of images of people wearing different items of fashion.

This network is trained to generate synthetic images (of size 64x64px) that look similar to the data distribution in the FashionGen dataset. If you are interested, you can look at the [DCGAN source code on GitHub](https://github.com/facebookresearch/pytorch_GAN_zoo/blob/master/models/DCGAN.py).

In the code cell below, we download a pre-trained DCGAN model from the [PyTorch hub](https://pytorch.org/hub/). The size of this model is around 40MB, so it shouldn't take too long to download.

In [None]:
dcgan = torch.hub.load('facebookresearch/pytorch_GAN_zoo:hub', 'DCGAN', pretrained=True, useGPU=False)

## 2.2 Model Input

Much like before, we need to feed this GAN random noise. To generate an image, this model requires a 120-dimensional vector. This model also has a method we can use to generate the vectors for us.

In [None]:
noise, _ = dcgan.buildNoiseData(n_samples=4)
print(f'Shape of noise: {noise.shape}')
print(f'Min/Max value of noise: {noise.min()}/{noise.max()}')

## 2.3 Image Generation
We generate images in exactly the same way as we did for PGAN.

In [None]:
with torch.no_grad():
    generated_images = dcgan.test(noise)
print(f'Shape of output: {generated_images.shape}')
print(f'Min/Max value of output: {generated_images.min()}/{generated_images.max()}')

## 2.4 Postprocessing and Visualization
The postprocessing steps are also identical to those for PGAN!

In [None]:
for image_tensor in generated_images:
    proc_image = postprocess_image(image_tensor)
    display_image(proc_image)

## 2.5 Pulling it Together
Just like for PGAN, we were able to generate synthetic images resembling the dataset that DCGAN was trained on.

Luckily for us, the interface to both PGAN and DCGAN was identical, allowing us to reuse the functions we wrote.
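
Reusing the function works because it only relies on two methods, `buildNoiseData()` and `test()`, and not on the type of the model. The sketch below illustrates this with a hypothetical stand-in class, `ToyGenerator`, that returns random arrays instead of real images. The class, its sizes and the helper `generate_batch` are invented for illustration and are not part of the pretrained models' code.

```python
import numpy as np


class ToyGenerator:
    """Hypothetical stand-in exposing the two methods used in this lab."""

    def __init__(self, noise_dim, image_size, seed=0):
        self.noise_dim = noise_dim
        self.image_size = image_size
        self.rng = np.random.default_rng(seed)

    def buildNoiseData(self, n_samples):
        # Returns a pair, mirroring how the lab unpacks `noise, _ = ...`
        noise = self.rng.standard_normal((n_samples, self.noise_dim))
        return noise, None

    def test(self, noise):
        # A real generator would run a forward pass; here we return random "images"
        n = noise.shape[0]
        return self.rng.uniform(-1, 1, size=(n, 3, self.image_size, self.image_size))


def generate_batch(model, num_images):
    """Works with any object that has buildNoiseData() and test()."""
    noise, _ = model.buildNoiseData(n_samples=num_images)
    return model.test(noise)


toy_pgan = ToyGenerator(noise_dim=512, image_size=8)
toy_dcgan = ToyGenerator(noise_dim=120, image_size=4)

print(generate_batch(toy_pgan, 5).shape)
print(generate_batch(toy_dcgan, 5).shape)
```

The same `generate_batch` call handles both objects even though their noise and image sizes differ, which is the property that lets `generate_images()` be reused for DCGAN.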

In the code cell below we call the *`generate_images()`* function you wrote in section 1.5 with the DCGAN model to generate synthetic images. Run this cell a few times to see the different types of fashion images you can generate!

In [None]:
generate_images(dcgan, 5)

## 2.6 Digging Deeper
We can also visualize the impact of small perturbations on the random noise vector.

In [None]:
# Set of perturbations to apply
perturbations = [-0.2, -0.1, 0, 0.1, 0.2]

# Generate random noise vector
noise, _ = dcgan.buildNoiseData(n_samples=1)

# Create a tensor combining perturbations
noise_tensor = torch.cat([noise + pert for pert in perturbations], dim=0)

# Perform forward pass
with torch.no_grad():
    generated_images = dcgan.test(noise_tensor)

# Postprocess all images
postprocessed_images = [postprocess_image(image_tensor) for image_tensor in generated_images]

# Create a plot to display all images in a single row
fig, axes = plt.subplots(nrows=1, ncols=len(perturbations), figsize=(19, 10))

# Display images and set the titles to the perturbation applied
for ax, image, pert in zip(axes, postprocessed_images, perturbations):
    ax.imshow(image)
    ax.set_title(pert)
plt.show()

# Summary
In this lab we used two different GAN architectures pretrained on different datasets to generate synthetic images. These images were generated from random noise, which is a very impressive feat! If you are interested in looking at other existing GAN architectures, you can take a look at [this GitHub repository](https://github.com/eriklindernoren/PyTorch-GAN) which has implementations of popular GANs.

# Next Lab
In the next lab, we will use Microsoft Azure for image classification inference.