# **Neural Network Systems For Realistic Facial Generation**
**Carter Smith**

**Computer Science Topics H**

# What I Learned About Neural Networks

Here is a brief summary of what I learned about neural networks while researching, as well as how I will apply this learning to my project.

**What is a neural network & generative adversarial network?**

A neural network is a computational model made of layers of simple interconnected units ("neurons") whose connection weights are adjusted during training. A GAN, or generative adversarial network, consists of two neural networks, a generator and a discriminator, that compete against each other to produce a generated output. The simplest way to think about this is to picture the generator as an art forger and the discriminator as a detective. The forger wants to create a fake painting realistic enough to fool the detective. The detective decides whether the painting is real based on his knowledge of what real paintings look like. If he determines that the painting is fake, the forger changes his approach to make the next one more convincing. The detective and forger are therefore always in an adversarial relationship.
![alt text](https://miro.medium.com/max/958/1*-gFsbymY9oJUQJ-A3GTfeg.png)

**Generators**

The process of generating an image with a GAN begins with a random number (seed) and, in this project, ends with a 1024x1024 image. At the start of training, the generator's output looks like randomly colored pixels. Over time, with feedback from the discriminator judging its outputs, it begins to create images that more closely resemble those in the dataset. As the generator's results improve, the discriminator must become more refined in its comparisons, which in turn pushes the generator to produce even more realistic images. Our goal is to use a generator structured like this one to create a realistic face image.

![alt text](https://paper-attachments.dropbox.com/s_257816875AB071372D764C43DFB8A4901E0EBAB81800E986F1F30C17BAA9E957_1557670710523_Architecture-of-the-generator-a-and-discriminator-b-of-our-cGAN-model-The-generator.ppm.png)
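
To make the jump from a tiny input to a 1024x1024 image more concrete, here is a minimal sketch of how the image size could grow inside the generator. It assumes, as in StyleGAN-style generators, that the image starts as a 4x4 grid and is doubled in size at each stage. The function name `synthesis_resolutions` and the 4x4 starting size are illustrative assumptions, not something taken from my notebook.

```python
def synthesis_resolutions(start=4, final=1024):
    """List the square resolutions visited when doubling from start up to final."""
    resolutions = [start]
    while resolutions[-1] < final:
        resolutions.append(resolutions[-1] * 2)
    return resolutions


res = synthesis_resolutions()
print(res)                        # [4, 8, 16, 32, 64, 128, 256, 512, 1024]
print(len(res) - 1, "doublings")  # 8 doublings
```

Under this assumption, only eight doublings are needed to go from a 4x4 grid to the final 1024x1024 face.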

**Complete Structures**

What makes a Generative Adversarial Network so effective is that it has two networks that work against each other to refine its understanding of real data.

The generator takes random noise and expands it into a high-quality fake image. The discriminator is shown both fake images and real images from the dataset, and for each one it outputs a decimal score indicating how likely it thinks the image is to be real.

In the early stages of training, the discriminator can easily tell a real face from randomly scrambled pixels. Each training iteration improves the generator only slightly, which is why hundreds of thousands of iterations are needed to create a well-functioning model.

![alt text](https://miro.medium.com/max/1204/0*LodTiw8Mc84eFtGF.png)
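
The way the discriminator's decimal score drives both networks can be sketched with the classic GAN loss (binary cross-entropy). This is only an illustration in NumPy: the scores below are made-up numbers, the function names are my own, and StyleGAN2's real training objective is more involved than this.

```python
import numpy as np


def bce(predictions, targets, eps=1e-7):
    """Binary cross-entropy averaged over a batch of scores in (0, 1)."""
    p = np.clip(predictions, eps, 1.0 - eps)
    return float(-np.mean(targets * np.log(p) + (1.0 - targets) * np.log(1.0 - p)))


def discriminator_loss(real_scores, fake_scores):
    # The discriminator wants real images scored near 1 and fakes near 0.
    return (bce(real_scores, np.ones_like(real_scores))
            + bce(fake_scores, np.zeros_like(fake_scores)))


def generator_loss(fake_scores):
    # The generator wants its fakes scored near 1, i.e. mistaken for real.
    return bce(fake_scores, np.ones_like(fake_scores))


# Hypothetical scores early in training: the discriminator is rarely fooled.
early_fake = np.array([0.05, 0.10, 0.02])
# Hypothetical scores later in training: fakes are much harder to tell apart.
late_fake = np.array([0.45, 0.55, 0.50])
real = np.array([0.90, 0.95, 0.85])

print("generator loss early:    ", generator_loss(early_fake))
print("generator loss late:     ", generator_loss(late_fake))
print("discriminator loss early:", discriminator_loss(real, early_fake))
print("discriminator loss late: ", discriminator_loss(real, late_fake))
```

With these example scores, the generator's loss falls as its fakes become more convincing, while the discriminator's loss rises, which is the adversarial relationship described above.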

**Latent Vectors**

The latent vector is an important concept because it is at the center of the whole image generation process. The vector itself is an array of random numbers (512 of them for the network used here), not an integer. In my code, an integer 'seed' is used to create this random noise reproducibly (in the expand_seed() method). The generator expands the latent vector into a realistic face image. The discriminator works in the opposite direction: it reduces an input image down to a single score that represents its decision on whether the image is real.
![alt text](https://miro.medium.com/max/1400/0*kHJ_LsPi-jz_CreZ.png)
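
The relationship between an integer seed and a latent vector can be shown in a few lines of NumPy. This sketch mirrors what the expand_seed() method does later in the notebook. The helper name `seed_to_latent` is my own, and the vector size of 512 is taken from the network used in this project.

```python
import numpy as np

VECTOR_SIZE = 512  # latent size of the FFHQ network used in this notebook


def seed_to_latent(seed, vector_size=VECTOR_SIZE):
    """Expand one integer seed into a (1, vector_size) array of Gaussian noise."""
    rnd = np.random.RandomState(seed)
    return rnd.randn(1, vector_size)


z_a = seed_to_latent(7992)
z_b = seed_to_latent(7992)
z_c = seed_to_latent(7993)

print(z_a.shape)                 # (1, 512)
print(np.array_equal(z_a, z_b))  # True: the same seed gives the same vector
print(np.array_equal(z_a, z_c))  # False: a different seed gives different noise
```

Because the same seed always produces the same vector, a face can be regenerated later just by remembering its seed.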


**Datasets**

After research, I found that the standard dataset used for face-generating GANs is the Flickr-Faces-HQ (FFHQ) dataset, found at https://github.com/NVlabs/ffhq-dataset. The set includes 70,000 high-quality, aligned PNG face images at 1024x1024 pixels. This is the dataset that the StyleGAN2 face model is trained on.

Here is an example of a face that is included in the dataset:

![alt text](https://i.ibb.co/VSty5Xb/51983.png)


# Implementing TensorFlow & Using StyleGAN2

In [None]:
%tensorflow_version 1.x
from google.colab import drive
drive.mount('/content/drive', force_remount=True)

TensorFlow 1.x selected.
Go to this URL in a browser: https://accounts.google.com/o/oauth2/auth?client_id=947318989803-6bn6qk8qdgf4n4g3pfee6491hc0brc4i.apps.googleusercontent.com&redirect_uri=urn%3aietf%3awg%3aoauth%3a2.0%3aoob&response_type=code&scope=email%20https%3a%2f%2fwww.googleapis.com%2fauth%2fdocs.test%20https%3a%2f%2fwww.googleapis.com%2fauth%2fdrive%20https%3a%2f%2fwww.googleapis.com%2fauth%2fdrive.photos.readonly%20https%3a%2f%2fwww.googleapis.com%2fauth%2fpeopleapi.readonly

Enter your authorization code:
··········
Mounted at /content/drive


This selects TensorFlow 1.x and mounts Google Drive in Google Colab. StyleGAN2 only supports TensorFlow 1.x.

You will have to get a Google authorization code from [this API link](https://accounts.google.com/o/oauth2/auth?client_id=947318989803-6bn6qk8qdgf4n4g3pfee6491hc0brc4i.apps.googleusercontent.com&redirect_uri=urn%3aietf%3awg%3aoauth%3a2.0%3aoob&response_type=code&scope=email%20https%3a%2f%2fwww.googleapis.com%2fauth%2fdocs.test%20https%3a%2f%2fwww.googleapis.com%2fauth%2fdrive%20https%3a%2f%2fwww.googleapis.com%2fauth%2fdrive.photos.readonly%20https%3a%2f%2fwww.googleapis.com%2fauth%2fpeopleapi.readonly). It will be formatted like '4/yAEWLEB7HRm7odqobaddR97CDnuQ50khZ_6ZcnHAJ-Vk8FCkDUHWzV4'.


Mounting Google Drive also determines where the generated images will be saved. Part of the reason I decided to use Google Colab was that I can have them saved directly to Google Drive.



In [None]:
!git clone https://github.com/NVlabs/stylegan2.git

Cloning into 'stylegan2'...
remote: Enumerating objects: 93, done.
remote: Total 93 (delta 0), reused 0 (delta 0), pack-reused 93
Unpacking objects: 100% (93/93), done.


In [None]:
!ls /content/stylegan2/

dataset_tool.py  LICENSE.txt		 README.md	   run_training.py
dnnlib		 metrics		 run_generator.py  test_nvcc.cu
Dockerfile	 pretrained_networks.py  run_metrics.py    training
docs		 projector.py		 run_projector.py


This clones NVIDIA's 'StyleGAN2' repository into the Colab environment (GAN = Generative Adversarial Network). This network, released in December 2019, was among the most recent and most advanced image-generating machine learning models at the time of writing. Here is a brief overview of StyleGAN2's capabilities, in its authors' words:

*The style-based GAN architecture (StyleGAN) yields state-of-the-art results in data-driven unconditional generative image modeling. Our improved model redefines the state of the art in unconditional image modeling, both in terms of existing distribution quality metrics as well as perceived image quality.*

In my case, I will be using the network to create faces, although the same architecture can be trained to generate other objects (like cars, animals, or houses).

StyleGAN2 is a project by NVIDIA; the face model used here was trained on a dataset of tens of thousands of 1024x1024 face images.





# Using StyleGAN2 by running Python Code

In [None]:
import sys
sys.path.insert(0, "/content/stylegan2")

import dnnlib

This adds the cloned StyleGAN2 directory to Python's module search path, which allows its libraries (such as dnnlib) to be imported into Colab.
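
To show what sys.path.insert() does on its own, here is a small sketch that does not need StyleGAN2 at all. It writes a one-line module into a temporary directory, adds that directory to the search path, and imports it. The module name `demo_module` is hypothetical and only used for illustration.

```python
import importlib
import os
import sys
import tempfile

# Create a throwaway directory holding a one-line module (hypothetical name).
module_dir = tempfile.mkdtemp()
with open(os.path.join(module_dir, "demo_module.py"), "w") as f:
    f.write("GREETING = 'hello from demo_module'\n")

# Putting the directory first on sys.path makes its modules importable.
sys.path.insert(0, module_dir)
importlib.invalidate_caches()
demo_module = importlib.import_module("demo_module")

print(demo_module.GREETING)
```

The cell above works the same way: once "/content/stylegan2" is on the path, `import dnnlib` finds the dnnlib folder inside the cloned repository.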

In [None]:
import argparse
import numpy as np
import pickle
import PIL.Image
import dnnlib
import dnnlib.tflib as tflib
import re
import sys
import pretrained_networks

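This imports all the libraries required for StyleGAN2 to operate. Some are part of Python's standard library (argparse, pickle, re, sys), some are third-party Python packages (NumPy, PIL.Image), and some ship with StyleGAN2 itself (dnnlib, its TensorFlow helper dnnlib.tflib, and pretrained_networks).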

In [None]:
##This method takes random integer seeds (e.g. 7000) and expands each one into an array of random noise.
##According to the way StyleGAN2 works, vector_size is 512, so each seed is expanded into an array of 512 random numbers.
##Each array is filled using NumPy's RandomState(seed), so the same seed always produces the same noise. The generator later turns these values into an image.
def expand_seed(seeds, vector_size):
    result = []

    for seed in seeds:
        rnd = np.random.RandomState(seed)
        result.append(rnd.randn(1, vector_size))

    return result

##This method generates the actual face images using the FFHQ-trained StyleGAN2 network.
##It takes in 'Gs' (the generator network loaded from the .pkl file), the list of expanded seed vectors to generate, and
##truncation_psi, a StyleGAN2 parameter that trades variety for image quality by pulling results toward an 'average' face. (I only included it because it was recommended)
##The network's options are collected in 'Gs_kwargs' and passed to Gs.run() as keyword arguments
def generate_images(Gs, seeds, truncation_psi):
    noise_vars = [var for name, var in Gs.components.synthesis.vars.items() if name.startswith('noise')]
    
    Gs_kwargs = dnnlib.EasyDict()
    Gs_kwargs.output_transform = dict(func=tflib.convert_images_to_uint8, nchw_to_nhwc=True)
    Gs_kwargs.randomize_noise = False

    if truncation_psi is not None:
        Gs_kwargs.truncation_psi = truncation_psi

    ##This loop takes each expanded seed vector in turn, sets the network's per-layer noise variables to fresh random values
    ##(unseeded, so fine details can differ between runs), and gives the vector to the network to process in Gs.run()
    ##It then does the actual image conversion by taking the array of pixels returned in images[0] and saving it as an RGB image using PIL.Image.fromarray().save()
    for seed_idx, seed in enumerate(seeds):
        print('%d/%d' % (seed_idx, len(seeds)))
        rnd = np.random.RandomState()
        tflib.set_vars({var: rnd.randn(*var.shape.as_list()) for var in noise_vars})
        images = Gs.run(seed, None, **Gs_kwargs)
        path = f"/content/drive/My Drive/machinelearning/image{seed_idx}.png"
        PIL.Image.fromarray(images[0], 'RGB').save(path)

This cell defines the methods that use random latent 'seeds' to create random faces from StyleGAN2's model.
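
The effect of truncation_psi can be sketched outside the network. In StyleGAN, truncation pulls a latent vector toward an average latent by a factor psi, which is why lower values give more "typical" faces. The code below is a simplified NumPy illustration under that assumption: the zero vector standing in for the network's learned average and the function name `truncate` are my own, not part of StyleGAN2's code.

```python
import numpy as np


def truncate(w, w_avg, psi):
    """Pull latent w toward the average latent w_avg by a factor psi."""
    return w_avg + psi * (w - w_avg)


rnd = np.random.RandomState(0)
w_avg = np.zeros((1, 512))  # stand-in for the network's learned average latent
w = rnd.randn(1, 512)       # stand-in for one latent vector

for psi in (1.0, 0.5, 0.0):
    w_t = truncate(w, w_avg, psi)
    # Distance from the average shrinks in proportion to psi.
    print(psi, float(np.linalg.norm(w_t - w_avg)))
```

With psi = 1.0 the vector is unchanged, with psi = 0.0 every seed collapses to the average, and the value 0.5 used in this notebook sits halfway between the two.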



In [None]:
##This main method configures StyleGAN2's 'dnnlib' to run locally (setting the output directory in Google Drive and the 'generate-images' run description),
##and loads the pretrained network (stored in Python's serialized .pkl format) to run on the dedicated Colab GPU
def main():
    sc = dnnlib.SubmitConfig()
    sc.num_gpus = 1
    sc.submit_target = dnnlib.SubmitTarget.LOCAL
    sc.local.do_not_copy_source_files = True
    sc.run_dir_root = "/content/drive/My Drive/machinelearning"
    sc.run_desc = 'generate-images'
    network_pkl = 'gdrive:networks/stylegan2-ffhq-config-f.pkl'

    ##This loads the network, assigning the generator to the Gs variable, and defines the seeds to expand and generate
    ##It does this by calling the expand_seed() and generate_images() methods
    _G, _D, Gs = pretrained_networks.load_networks(network_pkl)
    vector_size = Gs.input_shape[1]
    seeds = expand_seed(range(7992, 7995), vector_size)
    generate_images(Gs, seeds, truncation_psi=0.5)

if __name__ == "__main__":
    main()

##If I had more time, I would have experimented with the latent vector to change the images, encoded my own face, etc.

Downloading http://d36zk2xti64re0.cloudfront.net/stylegan2/networks/stylegan2-ffhq-config-f.pkl ... done
Setting up TensorFlow plugin "fused_bias_act.cu": Preprocessing... Compiling... Loading... Done.
Setting up TensorFlow plugin "upfirdn_2d.cu": Preprocessing... Compiling... Loading... Done.
0/3
1/3
2/3


This cell defines a main method that actually generates random images from a range of integer seeds.
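
One simple way to experiment with the latent vector, as mentioned in the closing comment of the cell above, is to blend two of them. The sketch below is a possible starting point rather than something I ran against the network: it builds evenly spaced vectors on the line between two seeds. The helper names `latent_from_seed` and `interpolate` are hypothetical.

```python
import numpy as np


def latent_from_seed(seed, vector_size=512):
    """Expand an integer seed into a (1, vector_size) latent vector."""
    return np.random.RandomState(seed).randn(1, vector_size)


def interpolate(z_start, z_end, steps):
    """Return `steps` latent vectors evenly spaced from z_start to z_end."""
    return [(1.0 - t) * z_start + t * z_end for t in np.linspace(0.0, 1.0, steps)]


z_start = latent_from_seed(7992)
z_end = latent_from_seed(7993)
frames = interpolate(z_start, z_end, steps=5)

print(len(frames), frames[0].shape)  # 5 (1, 512)
```

Since generate_images() expects a list of already expanded vectors, `frames` could in principle be passed to it in place of `seeds` to save a sequence of faces that gradually changes from one seed's face to the other's.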

**MAKE SURE GPU ACCELERATION IS ENABLED FOR THE NOTEBOOK** 