<a href="https://colab.research.google.com/github/Sauder1973/AU_PythonFacialPalette/blob/master/AU_cGAN_Workshop_WSauder_2022.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# **Conditional Generative Adversarial Network Workshop (cGAN)**

## Presented by Wes Sauder - MScIS Student - Athabasca University

November 12, 2023

Welcome to the workshop!!



# **Goals and Objectives:**

This workshop guides the learner through the process of building an understanding of GANs.  Over the course of the workshop, this Colab notebook covers:
 
*   Initial Environment Setup.
*   Lesson I  : **Data Loader** for Training images  -  *Real Time*
*   Lesson II : **Construct** a cGAN model  - *Real Time*
*   Lesson III: **Train Model**  -  *For you to try after the workshop; takes approximately 1 hour to run, example code provided*
*   Lesson IV: **Execute Model** - *Real Time with a 'saved' model*
*   Lesson V  : Evaluate effect of **'input noise'** - *Real Time Experiments*
*   Lesson VI: **Imputation** of the model's Latent Space - *Real Time Experiments*

Each section is clearly marked in the code below, along with instructions on how to access the test data.

**The Goal** is to train a Conditional GAN or cGAN model for several different facial expressions.  

Supplied with this notebook are **Four** facial expressions, which were previously clustered based on their similarity to each other and assigned to specific expression categories:



1.   Neutral Face
2.   Happy Face
3.   Angry Face
4.   Sad Face

Once a model has been trained, images generated by the cGAN can be recalled, altered by changing the input 'Noise' vector, and 'imputed' (interpolated), producing a continuum of results between two different categories.

# **Datasets**

The data used in this workshop are images.  We may not think of images as 'data' in the traditional tabular sense.  However, an image is a grid of pixel values indexed by (x, y) position.  The images are frames captured from a larger set of videos called the ['RAVDESS dataset'](https://zenodo.org/record/1188976), or The Ryerson Audio-Visual Database of Emotional Speech and Song.

***Reference:***
*Livingstone SR, Russo FA (2018) The Ryerson Audio-Visual Database of Emotional Speech and Song (RAVDESS): A dynamic, multimodal set of facial and vocal expressions in North American English. PLoS ONE 13(5): e0196391. https://doi.org/10.1371/journal.pone.0196391.*

***License information***
*“The Ryerson Audio-Visual Database of Emotional Speech and Song (RAVDESS)” by Livingstone & Russo is licensed under CC BY-NC-SA 4.0.*

**Description**

The dataset contains the complete set of 7356 RAVDESS files (total size: 24.8 GB). Recordings of each of the 24 actors are provided in three modality formats: Audio-only (16bit, 48kHz .wav), Audio-Video (720p H.264, AAC 48kHz, .mp4), and Video-only (no sound). Note, there are no song files for Actor_18.



**File naming convention**

Each of the 7356 RAVDESS files has a unique filename. The filename consists of a 7-part numerical identifier (e.g., 02-01-06-01-02-01-12.mp4). These identifiers define the stimulus characteristics:

Filename identifiers:

*   Modality (01 = full-AV, 02 = video-only, 03 = audio-only).
*   Vocal channel (01 = speech, 02 = song).
*   Emotion (01 = neutral, 02 = calm, 03 = happy, 04 = sad, 05 = angry, 06 = fearful, 07 = disgust, 08 = surprised).
*   Emotional intensity (01 = normal, 02 = strong). **NOTE**: There is no strong intensity for the ‘neutral’ emotion.
*   Statement (01 = “Kids are talking by the door”, 02 = “Dogs are sitting by the door”).
*   Repetition (01 = 1st repetition, 02 = 2nd repetition).
*   Actor (01 to 24. Odd numbered actors are male, even numbered actors are female).

**Filename example: 02-01-06-01-02-01-12.mp4**

*   Video-only (02)
*   Speech (01)
*   Fearful (06)
*   Normal intensity (01)
*   Statement “dogs” (02)
*   1st Repetition (01)
*   12th Actor (12): female, as the actor ID number is even.
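
The naming convention above can be decoded mechanically. The following is a minimal sketch, not part of the RAVDESS distribution or of this workshop's code; the function name `parse_ravdess_filename` and the dictionary keys it returns are illustrative choices made here.

```python
import os

# Lookup tables transcribed from the identifier list above.
MODALITY = {"01": "full-AV", "02": "video-only", "03": "audio-only"}
VOCAL_CHANNEL = {"01": "speech", "02": "song"}
EMOTION = {
    "01": "neutral", "02": "calm", "03": "happy", "04": "sad",
    "05": "angry", "06": "fearful", "07": "disgust", "08": "surprised",
}
INTENSITY = {"01": "normal", "02": "strong"}
STATEMENT = {
    "01": "Kids are talking by the door",
    "02": "Dogs are sitting by the door",
}


def parse_ravdess_filename(filename):
    """Decode a 7-part RAVDESS filename (hypothetical helper)."""
    stem = os.path.splitext(os.path.basename(filename))[0]
    parts = stem.split("-")
    if len(parts) != 7:
        raise ValueError(f"expected 7 identifiers, got {len(parts)}: {filename!r}")
    modality, channel, emotion, intensity, statement, repetition, actor = parts
    actor_id = int(actor)
    return {
        "modality": MODALITY[modality],
        "vocal_channel": VOCAL_CHANNEL[channel],
        "emotion": EMOTION[emotion],
        "intensity": INTENSITY[intensity],
        "statement": STATEMENT[statement],
        "repetition": int(repetition),
        "actor": actor_id,
        # Odd-numbered actors are male, even-numbered actors are female.
        "actor_sex": "female" if actor_id % 2 == 0 else "male",
    }


info = parse_ravdess_filename("02-01-06-01-02-01-12.mp4")
print(info["emotion"], info["actor"], info["actor_sex"])
```

Keeping the lookup tables as plain dictionaries makes an unknown code fail loudly with a `KeyError` rather than silently producing a wrong label.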


**Please Note!!**
In this workshop, only footage of **Actor 12** is used, and only a subset of the facial expressions: the clustered expressions listed above.

# Initial Environment Setup


**Runtime Setup**

At this time, please attempt the following:

1.   Navigate to the top ribbon
2.   Select **Runtime**
3.   Select **Change Runtime Type**
4.   Select **Hardware accelerator** and choose **GPU**
5.   Select **Runtime shape** and choose **High-RAM**

Please note, unless you have a Colab Pro Account, you may have some issues selecting these options based on availability.  If you are unable to select these options, or get 'kicked' out of the session, please try again at a later time.




**Packages and Libraries Installation**

The following Python libraries are required to build the GANs, as well as to visualize the results and process the data.

In [1]:
# Torch Library
import torch
from torch import nn
import torch.nn.functional as F

# Torch Visualization
from torchvision import transforms
import torchvision.transforms as T
from torchvision.datasets import MNIST
from torchvision.utils import make_grid
from torchvision.transforms import Resize
import torchvision

# Torch Dataloader
from torch.utils.data import Dataset
from torch.utils.data import DataLoader

# Python Analysis Packages
import pandas as pd
import numpy as np

import math
import matplotlib.pyplot as plt
#from matplotlib import pyplot as plt
import random

import os
import pickle
import datetime
import pytz

from PIL import Image

from tqdm.auto import tqdm

**Custom Functions**

The following functions are required later in the code. 

In [2]:
def show_tensor_images(image_tensor, num_images=25, size=(1, 28, 28), nrow=5, show=True):
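    '''
    Function for visualizing images: given a tensor of images, the number of images, and
    the number of images per row, plots the images in a uniform grid.
    Note: the `size` argument is not used by this function.
    '''
    # Map values from [-1, 1] to [0, 1] for display (assumes inputs are normalized to [-1, 1])
    image_tensor = (image_tensor + 1) / 2
    image_unflat = image_tensor.detach().cpu()
    image_grid = make_grid(image_unflat[:num_images], nrow=nrow)
    #axes = plt.gca()
    #axes.set_aspect(.5)
    # make_grid returns (C, H, W); matplotlib expects (H, W, C)
    plt.imshow(image_grid.permute(1, 2, 0).squeeze())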
    if show:
        plt.show()

**Setup Global Variables**

In [3]:
# Set the Seed
torch.manual_seed(999) # Set for our testing purposes, please do not change!
random.seed(999)

# Set the Run_Training Variable

Run_Training = False

# Lesson I - Loading Data into the Dataloader

To save time, please connect to GitHub now and pull the image data required to train the GAN.  While we discuss some basic theory, the data will download into the session's file space for later use.

**Get Data from GitHub and Save In Colab Session**

To speed up the exercise, pre-clustered facial expressions have been cropped from video and stored on [GitHub](https://github.com/Sauder1973/faceImages).  Please note that you will load the data into the dataloader but are unlikely to use it for training during the workshop, since training a working GAN model takes longer than the workshop itself.  However, we will make sure these steps work properly so that you can try this notebook afterwards.


In [7]:
!git clone https://github.com/Sauder1973/faceImages.git



fatal: destination path 'faceImages' already exists and is not an empty directory.
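
The `fatal: destination path ... already exists` message above appears when the clone cell is re-run in a session where the repository is already present, so it is harmless here. A minimal sketch of a re-runnable alternative follows, assuming `git` is available on the path. The function `clone_if_missing` and its `runner` parameter are hypothetical names introduced for illustration, not part of the workshop code.

```python
import os
import subprocess


def clone_if_missing(repo_url, dest_dir, runner=subprocess.run):
    """Clone repo_url into dest_dir unless dest_dir already exists.

    Returns True if a clone was attempted, False if it was skipped.
    `runner` is injectable so the logic can be exercised without network access.
    """
    if os.path.isdir(dest_dir):
        print(f"'{dest_dir}' already exists, skipping clone.")
        return False
    runner(["git", "clone", repo_url, dest_dir], check=True)
    return True


# In Colab this could be called as:
# clone_if_missing("https://github.com/Sauder1973/faceImages.git", "faceImages")
```

Skipping the clone when the directory exists keeps the cell safe to run repeatedly, at the cost of not picking up upstream changes until the directory is removed.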


Once complete, check the file browser to the left of the code cell, under **Files**.  You should see a new directory containing the image files required for training.  
**Please note:**  For this workshop, you may not be able to actually run the dataloader and perform the training in a single step.  This code will allow you to run trials after the workshop.  In the meantime, to ensure you have everything you need, continue to run all the sections until we reach ***Lesson III: Training***

In [8]:
fileName = '/content/faceImages/FaceClusters.csv'

#Read CSV

df_FaceClusters = pd.read_csv(fileName)

df_FaceClusters


| Unnamed: 0 | ClassLabel | FileNumber | FileName | EmotionLabel |
|---|---|---|---|---|
| 0 | 4 | 1136 | Image_1136_FullFace.jpg | Angry |
| 1 | 4 | 1137 | Image_1137_FullFace.jpg | Angry |
| 2 | 4 | 1138 | Image_1138_FullFace.jpg | Angry |
| 3 | 4 | 1234 | Image_1234_FullFace.jpg | Angry |
| 4 | 4 | 1323 | Image_1323_FullFace.jpg | Angry |
| ... | ... | ... | ... | ... |
| 1209 | 24 | 4623 | Image_4623_FullFace.jpg | Happy |
| 1210 | 24 | 4628 | Image_4628_FullFace.jpg | Happy |
| 1211 | 24 | 4714 | Image_4714_FullFace.jpg | Happy |
| 1212 | 24 | 4715 | Image_4715_FullFace.jpg | Happy |
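
The `Unnamed: 0` column in the output above is what pandas produces when the first column of a CSV has no header, which typically happens when a row index was saved by an earlier `to_csv` call. The sketch below shows one way to read that column back as the index. It uses a small in-memory sample copied from the rows shown above rather than the real file, and it assumes the first column really is a saved index.

```python
import io

import pandas as pd

# In-memory stand-in for FaceClusters.csv (three rows copied from the output above).
sample_csv = io.StringIO(
    ",ClassLabel,FileNumber,FileName,EmotionLabel\n"
    "0,4,1136,Image_1136_FullFace.jpg,Angry\n"
    "1,4,1137,Image_1137_FullFace.jpg,Angry\n"
    "1212,24,4715,Image_4715_FullFace.jpg,Happy\n"
)

# index_col=0 uses the unnamed first column as the row index
# instead of exposing it as a data column called 'Unnamed: 0'.
df_sample = pd.read_csv(sample_csv, index_col=0)

print(list(df_sample.columns))
print(df_sample["EmotionLabel"].value_counts().to_dict())
```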


Next, confirm that the directory 'faceImages' exists and that the zip file can be found within it.  Then execute the following cell to unzip ('inflate') the image archive, which places the files in a new directory called 'FaceFiles'.  The dataloader will pull its files from this directory.

Please note that once the session times out or disconnects, Colab automatically removes these files.  You will therefore need to repeat this process in each new session, even if you have completed these steps before.  This storage is only temporary, unlike a traditional 'Google Drive' account, which is persistent (non-volatile).

In [None]:
!unzip /content/faceImages/Image_0115_FullFace.zip -d FaceFiles
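
As an alternative to the shell command, the same extraction can be done from Python with the standard-library `zipfile` module, which makes it easy to check that the archive exists first. This is a minimal sketch; `extract_zip` is a hypothetical helper name introduced here, not part of the workshop code.

```python
import os
import zipfile


def extract_zip(zip_path, dest_dir):
    """Extract zip_path into dest_dir and return the number of files extracted."""
    if not os.path.isfile(zip_path):
        raise FileNotFoundError(f"Zip archive not found: {zip_path}")
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(dest_dir)
        # Count regular files only, ignoring directory entries.
        return sum(1 for entry in archive.infolist() if not entry.is_dir())


# In Colab this could be called as:
# extract_zip("/content/faceImages/Image_0115_FullFace.zip", "FaceFiles")
```

Returning the file count gives a quick sanity check that the archive was not empty.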

**Load Data into Memory using PyTorch Data Loader**

For background on the dataloader, see [PyTorch Data Loader](https://pytorch.org/tutorials/beginner/basics/data_tutorial.html).
This PyTorch utility can speed up training by loading batches of images from storage in background worker processes while the model trains.

Please note that this cGAN implementation differs slightly from a regular dataloader process: it not only pulls the images, it also attaches a 'condition' label to each image file.  In this exercise, there are **Four** Conditions.  


1.   Neutral Face
2.   Happy Face
3.   Angry Face
4.   Sad Face
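
To illustrate what attaching a condition to each image means, here is a minimal sketch of a map-style dataset that returns an (image, condition index) pair per item. It is not the workshop's actual dataset class. The class name `ConditionedImageDataset` and the `load_image` parameter are assumptions made for this sketch. PyTorch's `DataLoader` works with map-style objects that implement `__getitem__` and `__len__`; in a real notebook the class would typically subclass `torch.utils.data.Dataset`, and `load_image` would open the file with PIL and apply transforms.

```python
import os


class ConditionedImageDataset:
    """Map-style dataset: item i is (image, condition_index)."""

    def __init__(self, records, image_dir, load_image):
        # records: list of (file_name, condition_index) tuples
        self.records = list(records)
        self.image_dir = image_dir
        # load_image: callable that turns a file path into an image object
        self.load_image = load_image

    def __len__(self):
        return len(self.records)

    def __getitem__(self, index):
        file_name, condition_index = self.records[index]
        image = self.load_image(os.path.join(self.image_dir, file_name))
        return image, condition_index


# Stand-in loader so the sketch runs without image files; it just echoes the path.
dataset = ConditionedImageDataset(
    records=[("Image_1136_FullFace.jpg", 2), ("Image_4623_FullFace.jpg", 1)],
    image_dir="FaceFiles",
    load_image=lambda path: path,
)
print(len(dataset), dataset[0])
```

Passing the loader in as a parameter keeps the pairing logic separate from image decoding, so the two can be tested independently.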



**Facial Expression Conditions File**

The first file to load is a table of file names together with their conditions.  The dataloader uses it both to load each file and to apply the condition as its label.
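
The `ClassLabel` values in the table shown earlier (for example 4 and 24) look like cluster identifiers rather than consecutive class indices. A cGAN is commonly conditioned on a one-hot class vector, which needs indices running from 0 to n-1. The sketch below shows one possible way to derive such indices from the `EmotionLabel` column. It is an assumption about how the conditions could be encoded, not the workshop's implementation, and `build_label_index` and `one_hot` are hypothetical helpers.

```python
def build_label_index(emotion_labels):
    """Map each distinct emotion label to a contiguous integer, in sorted order."""
    return {label: i for i, label in enumerate(sorted(set(emotion_labels)))}


def one_hot(index, num_classes):
    """Return a list of length num_classes with 1.0 at position index."""
    vector = [0.0] * num_classes
    vector[index] = 1.0
    return vector


# A few (FileName, EmotionLabel) rows copied from the table shown earlier.
rows = [
    ("Image_1136_FullFace.jpg", "Angry"),
    ("Image_1137_FullFace.jpg", "Angry"),
    ("Image_4623_FullFace.jpg", "Happy"),
]

label_index = build_label_index(emotion for _, emotion in rows)
conditions = {
    name: one_hot(label_index[emotion], len(label_index))
    for name, emotion in rows
}
print(label_index)
```

Sorting the labels before numbering them makes the mapping reproducible across sessions, which matters when a saved model is reloaded later.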


In [None]:
#

# Generator and Discriminator Class Creation.

Now that the basic theory has been discussed, the tutorial continues by constructing the Generator first.  The model can be inspected once the class has been instantiated.

**Generator and Noise**