<img align='center' style='max-width: 1000px' src='banner.png'>

<img align='right' style='max-width: 200px; height: auto' src='hsg_logo.png'>

## Lab 09 - Generative Adversarial Networks (GANs)

GSERM Summer School 2024, Deep Learning: Fundamentals and Applications, University of St. Gallen

The lab environment is based on Jupyter Notebooks (https://jupyter.org), which provide an interactive platform for performing a variety of statistical evaluations and data analyses. In this lab, we will learn how to apply a deep learning technique referred to as **Generative Adversarial Networks (GANs)**. Unlike standard feedforward neural networks, GANs consist of two networks, a generator and a discriminator, which are trained together in a game-theoretic framework to generate realistic synthetic data.

GANs were introduced by *Ian Goodfellow* and his colleagues in 2014 and have since revolutionized the field of generative modeling. They are capable of generating high-quality data across various domains, including images, text, and audio. The generator network creates synthetic data, while the discriminator network attempts to distinguish between real and synthetic data. Through this adversarial process, the generator improves its ability to create realistic data over time.

In this lab, we will use the `PyTorch` library to implement and train a **Generative Adversarial Network**. The network will be trained on the Fashion-MNIST dataset, which consists of grayscale images of various fashion items. Once the network is trained, we will evaluate its performance by visually inspecting the quality of the generated images.

The figure below illustrates a high-level view of the machine learning process we aim to establish in this lab.

<img align="center" style="max-width: 800px" src="gan_pipeline.png">

As always, please don't hesitate to ask your questions during the lab, post them in our CANVAS (StudyNet) forum (https://learning.unisg.ch), or send us an email (using the course email).

## 1. Lab Objectives

After today's lab, you should be able to:

> 1. **Understand Generative Adversarial Network (GAN) Design:** Learn the fundamental concepts and architectural design of GANs.
> 2. **Implement and Train a GAN Model:** Gain hands-on experience with PyTorch to implement, train, and evaluate GAN models.
> 3. **Apply GAN Models to Generate Synthetic Data:** Use GANs to generate realistic fashion images using the Fashion-MNIST dataset.
> 4. **Evaluate and Interpret Model Performance:** Evaluate the GAN model's performance using relevant metrics and interpret the generated results.
> 5. **Visualize and Interpret Generated Images:** Visualize the images generated to gain insights into the model's ability to capture the underlying data distribution.

Before we start, let's watch a motivational video:

In [None]:
from IPython.display import YouTubeVideo
# AlphaFold: The Making of a Scientific Breakthrough
# YouTubeVideo('gg7WjuFs8F4', width=800, height=600)

## 2. Setup of the Jupyter Notebook Environment

Similar to the previous labs, we need to import several Python libraries that facilitate data analysis and visualization. We will primarily use `PyTorch`, `NumPy`, `Scikit-learn`, `Matplotlib`, `Seaborn`, and a few utility libraries throughout this lab.

We start by importing `numpy` and utility libraries. Here, we also import the `pickle` module to save and reuse some Python objects. `pickle` "serializes" an object before writing it to a file: the object is converted to a byte stream and can then be reconstructed either later on in the script or in another script.

In [None]:
# import python data science and utility libraries
import os
from tqdm import tqdm
from datetime import datetime
import numpy as np
import pickle as pkl
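
To make the serialization idea more concrete, here is a minimal sketch of a `pickle` round trip. The dictionary `losses` and the in-memory buffer are illustrative assumptions; later in the lab, the generated images are written to a file on disk instead.

```python
import io
import pickle as pkl

# hypothetical object to store; any picklable Python object works
losses = {'epoch': 3, 'discriminator': [1.21, 1.05], 'generator': [0.88, 0.93]}

# serialize the object into a byte stream (an in-memory buffer stands in for a file)
buffer = io.BytesIO()
pkl.dump(losses, buffer)

# rewind the buffer and reconstruct a copy of the object
buffer.seek(0)
restored = pkl.load(buffer)

# the content is preserved, but the restored object is a new object
print(restored == losses)
print(restored is losses)
```

Only unpickle files that you created yourself or that come from a trusted source, since loading a pickle can execute arbitrary code.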

Importing `PyTorch` data download and transform libraries:

In [None]:
# import pytorch datasets and transforms
import torchvision.datasets as datasets
from torchvision import transforms

Import the `PyTorch` machine learning and deep learning libraries:

In [None]:
# import pytorch libraries
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim

Import the `Matplotlib` and `Seaborn` data visualization libraries:

In [None]:
# import visualization libraries
import matplotlib.pyplot as plt
import seaborn as sns

Set global plotting theme and parameters:

In [None]:
# set seaborn theme
sns.set_theme()

# set general plotting parameters
plt.rcParams['figure.figsize'] = [10, 5]
plt.rcParams['figure.dpi']= 150

Enable inline plotting with `Matplotlib`:

In [None]:
%matplotlib inline

Create notebook folder structure to store the data as well as the trained neural network models:

In [None]:
# create the data sub-directory
data_directory = './data_gan'
os.makedirs(data_directory, exist_ok=True)

# create the models sub-directory
models_directory = './models_gan'
os.makedirs(models_directory, exist_ok=True)

Set a random `seed` value to obtain reproducible results:

In [None]:
# init deterministic seed
seed_value = 123
np.random.seed(seed_value) # set numpy seed
torch.manual_seed(seed_value); # set pytorch seed CPU
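
The effect of a fixed seed can be illustrated with a short sketch. It assumes two separate `np.random.RandomState` objects (a choice made here so that the notebook's global random state stays untouched); the variable names are illustrative.

```python
import numpy as np

# two independent generators that start from the same seed
first_rng = np.random.RandomState(123)
second_rng = np.random.RandomState(123)

# identical seeds yield identical pseudo-random sequences
first_draw = first_rng.uniform(-1, 1, size=3)
second_draw = second_rng.uniform(-1, 1, size=3)

# drawing again from the same generator continues the sequence
third_draw = second_rng.uniform(-1, 1, size=3)

print(np.array_equal(first_draw, second_draw))
print(np.array_equal(second_draw, third_draw))
```

The same principle applies to `torch.manual_seed`: the seed fixes the starting point of the sequence, so the order in which random numbers are consumed still matters for reproducibility.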

Google Colab provides free GPUs for running notebooks. However, if you execute this notebook as is, it will use your device's CPU. To run the lab on a GPU, go to `Runtime` > `Change runtime type` and set the Runtime type to `GPU` in the drop-down menu. Running this lab on a CPU is fine, but you will find that GPU computing is faster. *CUDA* indicates that the lab is being run on a GPU.

Enable GPU computing by setting the device flag and initializing a CUDA seed:

In [None]:
# set cpu or gpu enabled device
device = torch.device('cuda' if torch.cuda.is_available() else 'mps' if torch.backends.mps.is_available() else 'cpu').type

# init deterministic GPU seed
torch.mps.manual_seed(seed_value)
torch.cuda.manual_seed(seed_value)

# log type of device enabled
print('[LOG] notebook with {} computation enabled'.format(str(device)))

Let's determine if we have access to a GPU provided by environments such as `Google Colab`:

In [None]:
!nvidia-smi

## 3. Dataset Download and Data Assessment

In this lab, we will use the popular **FashionMNIST** dataset, which you have already seen in lab 04 **"Artificial Neural Networks (ANNs)"**. Back then, we used the dataset to train a simple neural network to classify the fashion articles. In this lab, we are going to train a model, consisting of two networks, to create its own images based on the **FashionMNIST** items.

The **Fashion-MNIST database** is a large dataset of Zalando articles commonly used for training various image processing systems. The database is widely used for training and testing in the field of machine learning. Let's take a brief look at a few sample images from the dataset:

<img align="center" style="max-width: 500px; height: 300px" src="FashionMNIST.png">

Source: https://www.kaggle.com/c/insar-fashion-mnist-challenge

Further details on the dataset can be obtained from Zalando Research's [GitHub page](https://github.com/zalandoresearch/fashion-mnist).

The **Fashion-MNIST database** is an image dataset of Zalando's article images, consisting of **70,000 images** in total. The dataset is divided into **60,000 training examples** and **10,000 evaluation examples**. Each example is a **28x28 grayscale image**, associated with a **label from 10 classes**. Zalando created this dataset to replace the popular **MNIST** handwritten digits dataset. It is a useful addition as it is a bit more complex but still very easy to use. It shares the same image size and train/test split structure as MNIST, making it a drop-in replacement. It requires minimal efforts in preprocessing and formatting the distinct images.

Let's download, transform, and inspect the training images of the dataset. To do so, we first define the directory in which we aim to store the training data:

In [None]:
train_path = data_directory + '/train_fmnist'

Now, let's download the training data accordingly:

In [None]:
# define pytorch transformation into tensor format
transf = transforms.Compose([transforms.ToTensor()])

# download and transform images
fashion_mnist_data = datasets.FashionMNIST(root=train_path, train=True, transform=transf, download=True)

Verify the number of training images downloaded:

In [None]:
# determine the number of training data images
len(fashion_mnist_data)

Next, let's inspect a few of the downloaded training images:

In [None]:
# select and set a (random) image id
image_id = 7779

# retrieve image exhibiting the image id
fashion_mnist_data[image_id]

Ok, that is hard to read: each entry of the dataset is a tuple that holds the image tensor together with its label. Let's now separate the image from its label information:

In [None]:
fashion_mnist_image, fashion_mnist_label = fashion_mnist_data[image_id]

We can verify the label of our selected image:

In [None]:
fashion_mnist_label

Ok, we know that the numerical label is 6. Each image is associated with a label from 0 to 9, representing one of the fashion items. So what does 6 mean? Is it a bag? A pullover? 

The order of the classes can be found on Zalando Research's [GitHub page](https://github.com/zalandoresearch/fashion-mnist). We need to map each numerical label to its fashion item, which will be useful throughout the lab:

In [None]:
fashion_classes = {0: 'T-shirt/top',
                    1: 'Trouser',
                    2: 'Pullover',
                    3: 'Dress',
                    4: 'Coat',
                    5: 'Sandal',
                    6: 'Shirt',
                    7: 'Sneaker',
                    8: 'Bag',
                    9: 'Ankle boot'}

So, we can determine the fashion item that the label represents:

In [None]:
fashion_classes[fashion_mnist_label]

Great, let's now visually inspect our sample image: 

In [None]:
# define tensor to image transformation
trans = transforms.ToPILImage()

# set image plot title 
plt.title('Example: {}, Label: {}'.format(str(image_id), fashion_classes[fashion_mnist_label]))

# plot fashion mnist sample image
plt.imshow(trans(fashion_mnist_image), cmap='gray')

That's it! Unlike our previous labs, where we built classifiers that required evaluation on test data, this lab does not use a test set. We will train a model to generate new images, which cannot be validated against a test set in the way a classifier's predictions can. To train this lab's model, we will only use the training set of the `FashionMNIST` dataset, as it contains a sufficient number of images for our purposes.

## 4. Neural Network Implementation

In this section, we implement the architectures of the two **neural networks** that will constitute our **GAN** model. We aim to train our **GAN** to generate new images—artificial images based on the **FashionMNIST** dataset that no human has ever drawn, created, or dreamt of. Before we dive into the theory, let's briefly revisit the process to be established. The following illustration provides a bird's-eye view:

<img align="center" style="max-width: 800px" src="gan_pipeline.png">

We will build the two models that together constitute the **GAN** architecture. We will start with the construction of the `Discriminator` and then proceed to the `Generator`. After this, we will instantiate both models along with the other required components.


### 4.1 Implementing the Discriminator Network

The discriminative network $D$, which we name `Discriminator`, consists of four **fully-connected layers**. These layers aim to learn **non-linear feature combinations** that enable the detection of patterns. In fully-connected layers, all inputs are connected to all activation units of the next layer.

Let's implement the `Discriminator`. This is a binary classifier as described above. The input size to the first layer is 28x28 = 784, since our **FashionMNIST** images are 28x28 pixels. The output size of the last layer is 1, which corresponds to the model's classification of the input.

In [None]:
# implement the Discriminator network architecture
class DiscriminatorNet(nn.Module):

    # define the class constructor
    def __init__(self):

        # call super class constructor
        super(DiscriminatorNet, self).__init__()
        
        # specify fc layer 1: in 28*28, out 128
        self.fc1 = nn.Linear(28*28, 128, bias=True, device=device) # the linearity W*x+b
        self.activation1 = nn.LeakyReLU(0.2, inplace=True) # the non-linearity
        
        # specify fc layer 2: in 128, out 64
        self.fc2 = nn.Linear(128, 64, bias=True, device=device) # the linearity W*x+b
        self.activation2 = nn.LeakyReLU(0.2, inplace=True) # the non-linearity

        # specify fc layer 3: in 64, out 32
        self.fc3 = nn.Linear(64, 32, bias=True, device=device) # the linearity W*x+b
        self.activation3 = nn.LeakyReLU(0.2, inplace=True) # the non-linearity
        
        # specify fc layer 4: in 32, out 1
        self.fc4 = nn.Linear(32, 1, bias=True, device=device) # the linearity W*x+b

        # dropout layer
        self.dropout = nn.Dropout(0.3)
        
    # define network forward pass
    def forward(self, x):

        # flatten image
        x = x.view(-1, 28*28)

        # define fc layer 1 forward pass and add dropout
        x = self.activation1(self.fc1(x))
        x = self.dropout(x)

        # define fc layer 2 forward pass and add dropout
        x = self.activation2(self.fc2(x))
        x = self.dropout(x)

        # define fc layer 3 forward pass and add dropout
        x = self.activation3(self.fc3(x))
        x = self.dropout(x)
        
        # define fc layer 4 forward pass
        out = self.fc4(x)

        # return forward pass result
        return out
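
As a quick sanity check of the input and output shapes described above, the following sketch pushes a random batch through a compact stand-in network. `toy_discriminator` is a hypothetical `nn.Sequential` replica that only mirrors the layer sizes of `DiscriminatorNet`; it is an assumption made for illustration and runs on the CPU.

```python
import torch
import torch.nn as nn

# compact, hypothetical stand-in that mirrors the layer sizes of DiscriminatorNet
toy_discriminator = nn.Sequential(
    nn.Flatten(),
    nn.Linear(28 * 28, 128), nn.LeakyReLU(0.2), nn.Dropout(0.3),
    nn.Linear(128, 64), nn.LeakyReLU(0.2), nn.Dropout(0.3),
    nn.Linear(64, 32), nn.LeakyReLU(0.2), nn.Dropout(0.3),
    nn.Linear(32, 1),
)

# random batch of 5 single-channel 28x28 "images"
dummy_images = torch.rand(5, 1, 28, 28)

# forward pass without gradient tracking
with torch.no_grad():
    logits = toy_discriminator(dummy_images)

# one raw score (logit) per image, not yet a probability
print(logits.shape)
```

Note that the network returns a raw score per image. No sigmoid is applied inside the network, which matches the architecture implemented above.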

### 4.2 Implementing the Generator Network

Our generative network $G$, which we name `Generator`, consists of four **fully-connected layers**. These layers aim to learn **non-linear feature combinations** that enable the detection and, later, generation of patterns. In fully-connected layers, all inputs are connected to all activation units of the next layer.

In [None]:
# implement the Generator network architecture
class GeneratorNet(nn.Module):

    # define the class constructor
    def __init__(self):

        # call super class constructor
        super(GeneratorNet, self).__init__()
        
        # specify fc layer 1: in 100, out 32
        self.fc1 = nn.Linear(100, 32, bias=True, device=device) # the linearity W*x+b
        self.activation1 = nn.LeakyReLU(0.2, inplace=True) # the non-linearity

        # specify fc layer 2: in 32, out 64
        self.fc2 = nn.Linear(32, 64, bias=True, device=device) # the linearity W*x+b
        self.activation2 = nn.LeakyReLU(0.2, inplace=True) # the non-linearity

        # specify fc layer 3: in 64, out 128
        self.fc3 = nn.Linear(64, 128, bias=True, device=device) # the linearity W*x+b
        self.activation3 = nn.LeakyReLU(0.2, inplace=True) # the non-linearity
        
        # specify fc layer 4: in 128, out 28*28
        self.fc4 = nn.Linear(128, 28*28, bias=True, device=device) # the linearity W*x+b
       
        # dropout layer 
        self.dropout = nn.Dropout(0.3)

    # define network forward pass
    def forward(self, x):

        # define fc layer 1 forward pass and add dropout
        x = self.activation1(self.fc1(x))
        x = self.dropout(x)

        # define fc layer 2 forward pass and add dropout
        x = self.activation2(self.fc2(x))
        x = self.dropout(x)

        # define fc layer 3 forward pass and add dropout
        x = self.activation3(self.fc3(x))
        x = self.dropout(x)

        # define fc layer 4 with tanh applied
        out = self.fc4(x).tanh()

        # return forward pass result
        return out

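You might notice that we apply the `tanh` function to the output of the last layer of our `Generator`. `tanh` squashes every generated pixel value into the range $[-1, 1]$, which is the same range to which the real images are rescaled before they are fed to the `Discriminator` during training. This way, the `Discriminator` receives real and fake inputs on the same scale.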
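
The value ranges involved can be sketched in plain Python. The example assumes that real pixels lie in $[0, 1]$ after `ToTensor` and are rescaled via `x * 2 - 1`, as done in the training loop of this lab; the sample values and variable names are illustrative.

```python
import math

# real pixels after ToTensor lie in [0, 1]; the training loop rescales them via x * 2 - 1
pixel_values = [0.0, 0.25, 0.5, 1.0]
rescaled_pixels = [x * 2 - 1 for x in pixel_values]

# tanh squashes any pre-activation of the last Generator layer into (-1, 1)
pre_activations = [-10.0, -1.0, 0.0, 1.0, 10.0]
generated_pixels = [math.tanh(a) for a in pre_activations]

# inverse mapping, e.g. for displaying generated images in the [0, 1] range
displayable_pixels = [(g + 1) / 2 for g in generated_pixels]

print(rescaled_pixels)
print(generated_pixels)
print(displayable_pixels)
```

The inverse mapping `(g + 1) / 2` is one simple way to bring generated images back into the $[0, 1]$ range before plotting them.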

### 4.3 Generative Adversarial Network Model Instantiation

Now that we have implemented the GAN's `Discriminator` and `Generator` networks, we are ready to instantiate models of both for training.

In [None]:
discriminator = DiscriminatorNet()
generator = GeneratorNet()

Let's push the initialized `Discriminator` and `Generator` models to the enabled computing device:

In [None]:
discriminator = discriminator.to(device)
generator = generator.to(device)

Let's double-check if our model was deployed to the GPU, if available:

In [None]:
!nvidia-smi

Once the models are initialized, we can visualize the model structures and review the implemented network architectures by executing the following cells. We start with the `Discriminator`:

In [None]:
print('[LOG] Discriminator architecture:\n\n{}\n'.format(discriminator))

And now, let's review the `Generator`:

In [None]:
print('[LOG] Generator architecture:\n\n{}\n'.format(generator))

Looks like it worked as intended? Brilliant! Finally, let's look into the number of model parameters that we aim to train in the next steps of the notebook. Again, we start with the `Discriminator`. The number of parameters, if everything is defined correctly, should be: 
$$(784+1) * 128 + (128+1) * 64 + (64+1) * 32 + (32+1) * 1 = 110,849$$

Don't hesitate to revisit our **CNN** lab if you are unsure how to count the number of parameters. Let's verify this calculation:

In [None]:
# init the number of model parameters
num_params_discriminator = 0

# iterate over the distinct parameters
for param in discriminator.parameters():

    # collect number of parameters
    num_params_discriminator += param.numel()
    
# print the number of model parameters
print('[LOG] Number of Discriminator model parameters to be trained: {}.'.format(num_params_discriminator))

Now, let's look into the number of model parameters of the `Generator`. The number of parameters, if everything is defined correctly, should be: 
$$(100+1) * 32 + (32+1) * 64 + (64+1) * 128 + (128+1) * 784 = 114,800$$

Let's verify this calculation:

In [None]:
# init the number of model parameters
num_params_generator = 0

# iterate over the distinct parameters
for param in generator.parameters():

    # collect number of parameters
    num_params_generator += param.numel()

# print the number of model parameters
print('[LOG] Number of Generator model parameters to be trained: {}.'.format(num_params_generator))

Okay, our 'simple' **GAN** model already encompasses an impressive number of parameters: 110,849 + 114,800 = **225,649** model parameters to be trained.
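
The two formulas above follow the same pattern: a fully-connected layer with $n_{in}$ inputs and $n_{out}$ outputs holds $(n_{in} + 1) \cdot n_{out}$ parameters (weights plus biases). A small sketch in plain Python reproduces the counts; the helper `count_dense_parameters` is a hypothetical function introduced only for this illustration.

```python
def count_dense_parameters(layer_sizes):
    """Count weights and biases of a stack of fully-connected layers."""
    total = 0
    for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:]):
        # n_in * n_out weights plus n_out biases, i.e. (n_in + 1) * n_out
        total += (n_in + 1) * n_out
    return total

# layer sizes of both networks, from input to output
discriminator_sizes = [28 * 28, 128, 64, 32, 1]
generator_sizes = [100, 32, 64, 128, 28 * 28]

n_discriminator = count_dense_parameters(discriminator_sizes)
n_generator = count_dense_parameters(generator_sizes)

print(n_discriminator, n_generator, n_discriminator + n_generator)  # 110849 114800 225649
```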

Now that we have implemented both networks of the GAN, we are ready to train the model. However, before starting the training, we need to define an appropriate loss function. Remember, we discussed in the theory section above (see 4.1.1) that we want to use **Binary Cross-Entropy (BCE)** loss with logits, so we do not have to manually define a sigmoid function in the network.

Let's instantiate the **BCEWithLogitsLoss** by executing the following PyTorch command:

In [None]:
# define the optimization criterion / loss function
criterion = nn.BCEWithLogitsLoss()
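
To illustrate what "BCE with logits" means, here is a scalar sketch in plain Python. It compares the two-step route (sigmoid, then BCE) with the standard numerically stable formula on the raw logit. This is not PyTorch's actual implementation, and all function and variable names are illustrative assumptions.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def bce_from_probability(p, y):
    """Plain binary cross-entropy on a probability p."""
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))

def bce_with_logits(x, y):
    """Numerically stable binary cross-entropy on a raw logit x."""
    return max(x, 0.0) - x * y + math.log1p(math.exp(-abs(x)))

# for moderate logits, both routes agree
example_logit, example_label = 2.0, 1.0
loss_two_step = bce_from_probability(sigmoid(example_logit), example_label)
loss_fused = bce_with_logits(example_logit, example_label)
print(loss_two_step, loss_fused)

# for an extreme logit, sigmoid saturates to exactly 1.0 in floating point and
# log(1 - p) would fail, while the fused formula stays finite
extreme_loss = bce_with_logits(50.0, 0.0)
print(extreme_loss)
```

By default, `nn.BCEWithLogitsLoss` additionally averages the per-sample losses over the mini-batch.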

Next, let's also push the initialized `criterion` computation to the enabled computing `device`:

In [None]:
criterion = criterion.to(device)

Based on the loss magnitude of a certain mini-batch, PyTorch automatically computes the gradients. Even better, based on the gradient, the library also helps us in the optimization and update of the network parameters $\theta$. We also set the learning rate to 0.02 for our **Discriminator** and to 0.002 for our **Generator**:

In [None]:
# set different learning rates for both networks
discriminator_learning_rate = 0.02
generator_learning_rate = 0.002

Following the advice of [Soumith Chintala](https://github.com/soumith/ganhacks), we use the **Stochastic Gradient Descent** (`SGD`) optimizer for our **Discriminator**, and the `Adam` optimizer for our **Generator**. 

In [None]:
# create optimizers for the discriminator and generator
discriminator_optimizer = optim.SGD(params=discriminator.parameters(), lr=discriminator_learning_rate) 
generator_optimizer = optim.Adam(params=generator.parameters(), lr=generator_learning_rate) 

That's it! We are finally done with the implementation. Now, let's get down to training.

## 5. Neural Network Model Training

In this section, we will train a **Generative Adversarial Network (GAN)** model (as implemented in the section above) using the **FashionMNIST** images. Specifically, we will take a detailed look at the distinct training steps and how to monitor the training progress.

### 5.1 Preparing the Network Training

So far, we have pre-processed the dataset, implemented the GAN, and defined the loss function. Let's now start training the model for **20 epochs** with a **mini-batch size of 64** FashionMNIST images per batch. This means that the entire dataset will be fed through the network 20 times in chunks of 64 images, yielding **938 mini-batches** per epoch (60,000 images / 64 images per mini-batch = 937.5, rounded up, so the last mini-batch holds only 32 images). After processing each mini-batch, the parameters of both networks will be updated.

In [None]:
# specify the training parameters
num_epochs = 20 # number of training epochs
mini_batch_size = 64 # size of the mini-batches

Furthermore, let's specify and instantiate a corresponding PyTorch data loader that feeds the image tensors to our neural network:

In [None]:
train_loader = torch.utils.data.DataLoader(fashion_mnist_data, batch_size=mini_batch_size, shuffle=True)

We can verify the length of the training `DataLoader`, which should correspond to **938 mini-batches**:

In [None]:
len(train_loader)
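
The expected number of mini-batches can also be derived by hand. The sketch below assumes the default `DataLoader` behaviour of keeping the final, incomplete mini-batch; the variable names are illustrative.

```python
import math

num_images = 60000       # size of the FashionMNIST training set
mini_batch_size = 64     # as specified above

# the final, incomplete mini-batch is kept, hence the rounding up
num_mini_batches = math.ceil(num_images / mini_batch_size)

# number of images left over for the last mini-batch
last_batch_size = num_images - (num_mini_batches - 1) * mini_batch_size

print(num_mini_batches, last_batch_size)  # 938 32
```

Because the last mini-batch is smaller, the training loop in this lab reads the batch size from the data itself instead of assuming 64 images per batch.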

Please remember that our `Discriminator` will attempt to classify samples as either *real* or *fake*. We therefore have to define what these labels will be.

As this is a binary classification task, we define:
>- $1$ as the label for real images: $y = 1$
>- $0$ as the label for fake images: $y = 0$

In [None]:
# establish convention for real and fake labels during training
real_label = 1
fake_label = 0

Lastly, we create a batch of **latent vectors**, exhibiting 100 dimensions each, that we will use later to visualize the progress of the `Generator`. We will call it `fixed_noise`, as it will remain fixed. This will allow us to take 4 images (we define a sample size of 4) and see how the results evolve in the evaluation section.

In [None]:
# define size of latent vector
z_size = 100

# define sample size
sample_size = 4

# draw latent vectors of size z_size uniformly from the interval [-1, 1)
fixed_noise = np.random.uniform(-1, 1, size=(sample_size, z_size))

# convert numpy array into tensor, and convert data to float
fixed_noise = torch.from_numpy(fixed_noise).float()

# push the fixed vector to the device that's enabled
fixed_noise = fixed_noise.to(device)
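
As an aside, a comparable batch of latent vectors could also be drawn with PyTorch alone. The following is a sketch of that alternative, not the approach used in this lab: it assumes a dedicated `torch.Generator` (so the global random state is left untouched) and will not reproduce the exact numbers that NumPy produces above.

```python
import torch

z_size = 100      # size of each latent vector
sample_size = 4   # number of latent vectors

# dedicated generator so the sketch does not disturb the global random state
noise_generator = torch.Generator().manual_seed(123)

# torch.rand samples from [0, 1); scaling and shifting maps it to [-1, 1)
alternative_noise = torch.rand(sample_size, z_size, generator=noise_generator) * 2 - 1

print(alternative_noise.shape, alternative_noise.dtype)
print(alternative_noise.min().item(), alternative_noise.max().item())
```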

### 5.2 Running the Network Training

Finally, we start training the model according to the following adversarial training protocol:

>1. Train the `Discriminator` on the real images.
>2. Generate fake images with the `Generator` and train the `Discriminator` on them.
>3. Perform a backward pass through the `Discriminator` and update its parameters $θ_{D}$.
>4. Train the `Generator` based on the `Discriminator`'s output on the fake data.
>5. Perform a backward pass through the `Generator` and update its parameters $θ_{G}$.


To ensure effective learning while training our **Generative Adversarial Network (GAN)** model, we will monitor the losses of both networks as training progresses. To this end, we record the loss of every mini-batch and average these losses at the end of each epoch. Based on these values, we can assess the training progress and determine whether the losses are stabilizing, indicating that the model might not improve any further. Keep in mind that the two networks compete with each other, so neither loss is expected to decrease steadily. The following elements of the network training code below should be given particular attention:

>- `backward()`, called on a loss (e.g. `generator_loss.backward()`), computes the gradients of the binary cross-entropy loss with respect to the network parameters.
>- `step()`, called on an optimizer (e.g. `generator_optimizer.step()`), updates the network parameters based on the gradients.
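
One detail of the training protocol deserves a closer look: during the `Discriminator` update, the fake images are passed through `detach()`, so that no gradients flow back into the `Generator`. The sketch below demonstrates this with two tiny linear layers, `tiny_generator` and `tiny_discriminator`, which are hypothetical stand-ins chosen for illustration and not the networks of this lab.

```python
import torch
import torch.nn as nn

# hypothetical toy stand-ins for the Generator and the Discriminator
tiny_generator = nn.Linear(4, 3)
tiny_discriminator = nn.Linear(3, 1)
toy_criterion = nn.BCEWithLogitsLoss()

# batch of 8 latent vectors and the corresponding fake samples
toy_z = torch.rand(8, 4) * 2 - 1
toy_fake = tiny_generator(toy_z)

# Discriminator step: detach() cuts the graph, so no gradient reaches the Generator
fake_labels = torch.zeros(8)
loss_d = toy_criterion(tiny_discriminator(toy_fake.detach()).view(-1), fake_labels)
loss_d.backward()
generator_grad_after_d_step = tiny_generator.weight.grad

# Generator step: no detach(), and the labels are flipped to "real"
real_labels = torch.ones(8)
loss_g = toy_criterion(tiny_discriminator(toy_fake).view(-1), real_labels)
loss_g.backward()
generator_grad_after_g_step = tiny_generator.weight.grad

# no Generator gradient after the Discriminator step, but one after the Generator step
print(generator_grad_after_d_step is None)
print(generator_grad_after_g_step is not None)
```

The backward pass of the `Generator` step also deposits gradients in the `Discriminator`'s parameters, which is one reason why the gradients are reset with `zero_grad()` before each update.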


In [None]:
# initialize list of the generated (fake) images
fake_images = []

# initialize collection of epoch losses
discriminator_epoch_losses = []
generator_epoch_losses = []

# set networks to training mode
discriminator.train()
generator.train()

# init and wrap range of training iterations
training_epochs = tqdm(range(0, num_epochs), position=0, leave=True)

# train the GANs
for epoch in training_epochs:

    # initialize collection of batch losses
    discriminator_batch_losses = []
    generator_batch_losses = []

    # iterate over mini batches
    for i, data in enumerate(train_loader):
        
        # determine real images and ignore class labels
        real_images = data[0]

        # determine the actual batch size, since the last mini-batch may hold fewer images
        batch_size = real_images.size(0)

        # --------------------------------------------------------------------------
        # (1) train Discriminator network
        # --------------------------------------------------------------------------

        #### train with real images

        # push real images to compute device
        real_images = real_images.to(device)

        # create tensor of same size as mini-batch and filled with 1's (real_label)
        label = torch.full((batch_size,), real_label, dtype=torch.float, device=device)

        # rescale input images from [0, 1] to [-1, 1], the output range of the Generator's tanh
        real_images = real_images * 2 - 1

        # run forward pass through Discriminator
        output = discriminator(real_images).view(-1)

        # reset graph gradients
        discriminator.zero_grad()

        # determine loss on Discriminator
        discriminator_loss_real = criterion(output, label)

        # run backward pass
        discriminator_loss_real.backward()
    
        #### train with fake images

        # generate batch of latent vectors
        z = np.random.uniform(-1, 1, size=(batch_size, z_size))

        # convert numpy array into tensor, and convert data to float
        z = torch.from_numpy(z).float()

        # push the z vector to the device that's enabled
        z = z.to(device)

        # generate fake image batch with Generator
        fake = generator(z)

        # fills label tensor with 0's (fake_label)
        label.fill_(fake_label)

        # classify all fake batch with Discriminator
        output = discriminator(fake.detach()).view(-1)

        # get discriminator loss on the fake batch
        discriminator_loss_fake = criterion(output, label)

        # run backward pass
        discriminator_loss_fake.backward()

        #### update Discriminator model parameters

        # compute error of Discriminator as sum of loss over the fake and the real batches
        discriminator_loss = discriminator_loss_fake + discriminator_loss_real

        # update Discriminator parameters
        discriminator_optimizer.step()

        # --------------------------------------------------------------------------
        # (2) train Generator network
        # --------------------------------------------------------------------------

        # reset graph gradients
        generator.zero_grad()

        # fake labels are real for generator
        label.fill_(real_label)

        # since we just updated D, perform another forward pass of fake batch through the Discriminator
        output = discriminator(fake).view(-1)

        # get Generator loss based on this output
        generator_loss = criterion(output, label)

        # run backward pass
        generator_loss.backward()

        #### update Generator model parameters

        # update Generator parameters
        generator_optimizer.step()

        # --------------------------------------------------------------------------
        # (3) evaluate generative adversarial network
        # --------------------------------------------------------------------------

        # collect training losses
        discriminator_batch_losses.append(discriminator_loss.item())
        generator_batch_losses.append(generator_loss.item())

        # set Generator to eval mode for generating samples
        generator.eval()

        # make Generator generate samples from the fixed noise, without tracking gradients
        with torch.no_grad():
            samples = generator(fixed_noise)

        # detach samples from the graph and move them to the cpu for storage
        samples = samples.detach().cpu()

        # append generated fixed samples to the fake_images list
        fake_images.append(samples)

        # set Generator back to train mode
        generator.train()

    # determine mean mini-batch loss of epoch
    discriminator_epoch_loss = np.mean(discriminator_batch_losses)
    generator_epoch_loss = np.mean(generator_batch_losses)

    # collect mean mini-batch loss of epoch
    discriminator_epoch_losses.append(discriminator_epoch_loss)
    generator_epoch_losses.append(generator_epoch_loss)

    # print training progress and training losses
    now = datetime.utcnow().strftime('%Y-%m-%d %H:%M:%S')
    training_epochs.set_description(
        (
            '[LOG {}] epoch: {}, disc.-loss: {}, gen.-loss: {}'.format(str(now), str(epoch).zfill(6), str(round(discriminator_epoch_loss, 6)), str(round(generator_epoch_loss, 6)))
        )
    )

    # set filename of actual discriminator and generator model
    dis_model_name = 'gan_dis_model_epoch_{}.pth'.format(str(epoch).zfill(4))
    gen_model_name = 'gan_gen_model_epoch_{}.pth'.format(str(epoch).zfill(4))

    # save current model to models directory
    torch.save(discriminator.state_dict(), os.path.join(models_directory, dis_model_name))
    torch.save(generator.state_dict(), os.path.join(models_directory, gen_model_name))

# open pickle file output stream
with open('fake_images.pkl', 'wb') as f:

    # save generated samples with pickle
    pkl.dump(fake_images, f)

## 6. Generative Adversarial Network Model Evaluation

As we do not have a test set, the evaluation of a **Generative Adversarial Network (GAN)** model does not resemble that of a typical classifier. First, we base our evaluation on the progression of the losses of our two adversarial models. Then, and more interestingly, we examine the images generated by the `Generator` from the fixed noise we created. In this regard, you could say that the training and testing of the network happen simultaneously.

### 6.1 Training Loss Evaluation

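Let's visualize and inspect the loss per training iteration (mini-batch). Note that the training loop re-initializes the batch-loss lists at the start of every epoch, so the following two plots cover the mini-batches of the final epoch only. We'll start with the `Discriminator`'s loss: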

In [None]:
# prepare plot
fig, ax = plt.subplots(figsize=(20, 8))

# convert losses to numpy arrays
discriminator_batch_losses = np.array(discriminator_batch_losses)

# add grid
ax.grid(linestyle='dotted')

# plot losses of the Discriminator network
plt.plot(discriminator_batch_losses, label='discriminator-loss (green)', c='tab:green')

# add axis legends
ax.set_xlabel("[Training mini-batch $mb_i$]", fontsize=14)
ax.set_ylabel("[Classification Error of Discriminator $D$, $L^{BCE}$]", fontsize=14)

# add plot legends
plt.legend()

# add plot title
plt.title('Training Iterations $mb_i$ vs. Discriminator Loss $L^{BCE}$', fontsize=16);

Now, let's visualize the `Generator`'s loss:

In [None]:
# prepare plot
fig, ax = plt.subplots(figsize=(20, 8))

# convert losses to numpy arrays
generator_batch_losses = np.array(generator_batch_losses)

# add grid
ax.grid(linestyle='dotted')

# plot losses of the Generator network
plt.plot(generator_batch_losses, label='generator-loss (orange)', c='tab:orange')

# add axis legends
ax.set_xlabel("[Training mini-batch $mb_i$]", fontsize=14)
ax.set_ylabel("[Classification Error of Generator $G$, $L^{BCE}$]", fontsize=14)

# add plot legends
plt.legend()

# add plot title
plt.title('Training Iterations $mb_i$ vs. Generator Loss $L^{BCE}$', fontsize=16);
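
Per-batch GAN losses are typically noisy, which can make the underlying trend hard to read. One common way to smooth such a curve is a simple moving average. The sketch below is illustrative only: the helper `moving_average` and the window size are assumptions and not part of the lab code, and the demo uses a short synthetic array rather than the recorded losses.

```python
import numpy as np

# hypothetical helper (not part of the lab code): simple moving average
def moving_average(values, window):

    # uniform kernel of length `window`
    kernel = np.ones(window) / window

    # 'valid' keeps only positions where the window fully overlaps the data
    return np.convolve(values, kernel, mode='valid')

# synthetic stand-in for a noisy batch-loss curve
demo_losses = np.array([1.0, 2.0, 3.0, 4.0, 5.0])

# smooth with a window of 3 mini-batches
smoothed = moving_average(demo_losses, window=3)
print(smoothed)
```

Applied to the arrays above, a call such as `plt.plot(moving_average(generator_batch_losses, 100))` would overlay a smoothed curve; the window of 100 mini-batches is an arbitrary choice.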

Although the batch losses fluctuate strongly, they suggest that the `Discriminator` starts off with a low loss that progressively increases, while the `Generator`'s loss decreases throughout the training. Let's plot the mean epoch losses of both models to get a clearer overview:

In [None]:
# prepare plot
fig, ax = plt.subplots(figsize=(20,8))

# convert losses to numpy arrays
discriminator_epoch_losses = np.array(discriminator_epoch_losses)
generator_epoch_losses = np.array(generator_epoch_losses)

# add grid
ax.grid(linestyle='dotted')

# plot mean epoch losses of the Discriminator and Generator network
plt.plot(discriminator_epoch_losses, label='discriminator-loss (green)', c='tab:green')
plt.plot(generator_epoch_losses, label='generator-loss (orange)', c='tab:orange')

# add axis labels (the x-axis counts epochs here, not mini-batches)
ax.set_xlabel("[Training epoch $e_i$]", fontsize=14)
ax.set_ylabel("[Classification Error $L^{BCE}$]", fontsize=14)

# add plot legends
plt.legend()

# add plot title
plt.title('Training Epochs $e_i$ vs. Generator and Discriminator Loss $L^{BCE}$', fontsize=16);

Okay, fantastic. The mean losses of both networks settle after the first few epochs. The `Discriminator` starts off strong with a very low loss, while the `Generator` has a very high loss: at that point, the `Generator` clearly has no idea what to do. Then the trends reverse as the `Generator` learns what fools the `Discriminator`, i.e., it gets better at faking images.

We see that the loss of the `Generator` is consistently a bit lower than that of the `Discriminator` after a few epochs. You could then hypothesize that the `Generator` is often able to fool the `Discriminator`.
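
A useful reference value when reading these curves: in the idealized equilibrium of the original GAN formulation, the `Discriminator` can no longer tell real from fake and outputs 0.5 for every image, so each binary cross-entropy term equals $\ln 2 \approx 0.693$. The sketch below computes this value in plain Python. The helper `bce` is a hypothetical stand-in for the lab's loss function, and whether the recorded `Discriminator` loss is the sum or the mean of its real and fake terms depends on the training loop, which is not shown in this section.

```python
import math

# binary cross-entropy for a single prediction p in (0, 1) and label y in {0, 1}
# (hypothetical helper, written out by hand for illustration)
def bce(p, y):
    return -(y * math.log(p) + (1.0 - y) * math.log(1.0 - p))

# a maximally uncertain discriminator outputs 0.5 for real and fake images
loss_real = bce(0.5, 1.0)
loss_fake = bce(0.5, 0.0)

print(round(loss_real, 4))              # 0.6931
print(round(loss_real + loss_fake, 4))  # 1.3863
```

Losses hovering near these values would be consistent with a `Discriminator` that is close to guessing, whereas values far below them indicate that one network currently has the upper hand.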

### 6.2 Generated Images Evaluation

Let's now inspect the images that were generated during the training of our **Generative Adversarial Network (GAN)** model. To do so, we start by defining a function that we will use to display the generated samples. These samples were all created from the same fixed noise, which lets us follow the `Generator`'s progress on identical inputs.

In [None]:
# create function to view the image samples
def view_samples(epoch, samples):

    # initialize plot
    fig, axes = plt.subplots(figsize=(10,7), nrows=1, ncols=4, sharey=True, sharex=True)
    
    # adjust padding between subplots
    fig.tight_layout(pad=5.0)

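    # iterate over the fake images saved at the first mini-batch of the given epoch
    # (each epoch consists of 938 mini-batches); remember, we save 4 images together at each iteration
    # note: we index the `samples` argument rather than the global `fake_images`
    for i, (ax, img) in enumerate(zip(axes.flatten(), samples[epoch*938])):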
        
        # create title for each subplot
        ax.set_title(f'Epoch {epoch+1}, Sample {i+1}')

        # detach image
        img = img.detach()

        # disable axes
        ax.xaxis.set_visible(False)
        ax.yaxis.set_visible(False)

        # show 28 by 28 grayscale image
        ax.imshow(img.reshape((28,28)).cpu(), cmap='Greys_r')
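
The stride of 938 used in the function above presumably corresponds to the number of mini-batches per epoch. As a quick sanity check, the arithmetic can be sketched as follows. The Fashion-MNIST training split contains 60,000 images; the batch size of 64 is an assumption, since the data loader is not shown in this section.

```python
import math

# assumptions (not shown in this section): training-set size and batch size
num_train_images = 60000   # size of the Fashion-MNIST training split
batch_size = 64            # hypothetical value; use the one from your data loader

# number of mini-batches per epoch when the last, smaller batch is kept
batches_per_epoch = math.ceil(num_train_images / batch_size)
print(batches_per_epoch)   # 938

# index of the first saved entry of a given (zero-based) epoch
epoch = 2
first_index = epoch * batches_per_epoch
print(first_index)         # 1876
```

If you train with a different batch size, the hard-coded 938 would need to change accordingly.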

We now "unpickle" the samples we saved using the `pickle` library:

In [None]:
# open pickle file input stream
with open('fake_images.pkl', 'rb') as f:

    # load generated samples with pickle
    samples = pkl.load(f)

We will call our function at each epoch to inspect the progress:

In [None]:
# iterate over epochs
for i in range(num_epochs):
  
    # call function to view the 4 samples
    view_samples(i, samples)

Cool, right? The samples generated by the `Generator` start off very poorly. They are totally random in the first epoch and still quite bad in the second, although clear progress is already visible. They then progressively improve, to the point where we can clearly see them representing clothes similar to those in the **Fashion-MNIST** dataset, and the quality stabilizes over time. This rapid improvement and subsequent stabilization is consistent with the progression of the `Generator`'s loss.

Interestingly, we sometimes see the same sample switching fashion class between epochs. As you can see, the model - whose parameters are updated at each iteration - does not care whether it outputs a shoe or a shirt; it simply aims at minimizing its loss.
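
The effect described above can be illustrated with a toy example: the noise input stays fixed, so any change in the output comes from the parameter update alone. This is a minimal sketch in NumPy, not the lab's `Generator`. The function `toy_generator`, the shapes, and the random "update" are all hypothetical stand-ins.

```python
import numpy as np

rng = np.random.default_rng(seed=0)

# fixed noise vector, drawn once and reused (stand-in for the lab's fixed noise)
fixed_noise = rng.standard_normal(8)

# hypothetical toy "generator": a single linear layer
def toy_generator(weights, noise):
    return weights @ noise

# parameters before and after a (simulated) training update
weights_before = rng.standard_normal((4, 8))
weights_after = weights_before + 0.5 * rng.standard_normal((4, 8))

out_before = toy_generator(weights_before, fixed_noise)
out_after = toy_generator(weights_after, fixed_noise)

# same input, different output: only the parameters have changed
print(np.allclose(out_before, out_after))
```

Nothing ties a given noise vector to a fashion class, so an update can move its output from one class to another.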

## 7. Lab Summary

In this lab, you achieved the following key learning outcomes:

> 1. **Understanding the GAN Architecture:** Mastered the fundamental concepts and architectural design of Generative Adversarial Networks (GANs), enhancing your comprehension of deep learning models tailored for generating synthetic data.
> 2. **Model Implementation and Training:** Developed practical skills in implementing and training a GAN model using PyTorch, applying it to the Fashion-MNIST dataset to generate realistic fashion images.
> 3. **Evaluating Model Performance:** Gained expertise in evaluating the performance of GAN models through metrics such as loss functions for both the generator and discriminator, and visually assessing the quality of generated images.
> 4. **Visualization and Interpretation of Generated Data:** Learned to visualize and interpret the generated fashion images, providing deeper insights into the model's ability to capture the underlying distribution of the training data.

This lab provided insights into designing, implementing, training, and evaluating GANs for generating synthetic data. It equipped you with tools and techniques for effective model building, evaluation, and application. These skills are invaluable for succeeding in deep learning.