# Generative Adversarial Networks (GANs)

## Introduction

Generative Adversarial Networks (GANs), introduced by Ian Goodfellow et al. in 2014 [[1]](#ref1), are a widely used framework for generative modeling. A GAN consists of two neural networks, a generator and a discriminator, that are trained simultaneously through an adversarial process. The generator aims to produce data that is indistinguishable from real data, while the discriminator attempts to tell real data from generated data.

In this tutorial, we'll explore the architecture of GANs, understand how generators and discriminators work, delve into the mathematical foundations, and implement a GAN for image generation using the MNIST dataset. We'll also discuss some of the latest developments in GAN research.

![GAN Architecture](https://miro.medium.com/max/1400/1*LsfkZXI1i1OcsYbG_H8bOw.png)

*Image Source: [Medium](https://medium.com/)*

## Table of Contents

1. [Understanding GAN Architecture](#1)
   - [Generator Network](#1.1)
   - [Discriminator Network](#1.2)
2. [Mathematical Foundations](#2)
   - [GAN Objective Function](#2.1)
   - [Training Process](#2.2)
3. [Implementing a GAN for Image Generation](#3)
   - [Dataset Preparation](#3.1)
   - [Building the Generator](#3.2)
   - [Building the Discriminator](#3.3)
   - [Defining the GAN](#3.4)
   - [Training the GAN](#3.5)
   - [Generating Images](#3.6)
4. [Latest Developments in GANs](#4)
   - [DCGAN](#4.1)
   - [Wasserstein GAN](#4.2)
   - [StyleGAN](#4.3)
   - [CycleGAN](#4.4)
5. [Conclusion](#5)
6. [References](#6)


<a id="1"></a>
## 1. Understanding GAN Architecture

A GAN consists of two neural networks:

- **Generator (G)**: Learns to generate fake data resembling the real data.
- **Discriminator (D)**: Learns to distinguish between real and fake data.

They are trained simultaneously in a minimax game.

![GAN Training Process](https://miro.medium.com/max/1400/1*yAPb-BHZUAD4vRYPkYcMhA.png)

*Image Source: [Medium](https://medium.com/)*

<a id="1.1"></a>
### Generator Network

The generator takes a random noise vector $z$ sampled from a prior distribution (e.g., Gaussian) and transforms it into data resembling the real data distribution.

$$
G(z; \theta_g)
$$

- $\theta_g$: Parameters of the generator network.
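
As a rough illustration of "noise in, data out", the sketch below uses a hypothetical one-dimensional generator that is just an affine map of Gaussian noise. The name `toy_generator` and the parameter values are assumptions made for this example; a real generator is a neural network whose parameters are learned.

```python
import random

def toy_generator(z, theta_g):
    """Hypothetical 1-D generator: an affine map G(z) = shift + scale * z."""
    shift, scale = theta_g
    return shift + scale * z

rng = random.Random(0)        # fixed seed so the sketch is repeatable
theta_g = (4.0, 0.5)          # illustrative parameters, not learned here
noise = [rng.gauss(0.0, 1.0) for _ in range(10_000)]   # z ~ N(0, 1)
samples = [toy_generator(z, theta_g) for z in noise]

# The generated samples follow N(4, 0.5^2), so their mean sits near the shift.
sample_mean = sum(samples) / len(samples)
print(f"mean of generated samples: {sample_mean:.2f}")
```

Changing `theta_g` changes the distribution of the outputs, which is what training adjusts in a real GAN.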

<a id="1.2"></a>
### Discriminator Network

The discriminator receives an input (either real data or generated data) and outputs a probability indicating whether the input is real or fake.

$$
D(x; \theta_d) \in [0, 1]
$$

- $\theta_d$: Parameters of the discriminator network.
- $x$: Input data (real or generated).
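
A minimal sketch of the same idea in one dimension, assuming a hypothetical logistic discriminator: a linear score is squashed by a sigmoid so the output always lies in [0, 1]. The names `toy_discriminator` and `theta_d`, and the parameter values, are illustrative only.

```python
import math

def sigmoid(a):
    """Map any real-valued score to the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-a))

def toy_discriminator(x, theta_d):
    """Hypothetical 1-D discriminator: D(x) = sigmoid(w * x + b)."""
    w, b = theta_d
    return sigmoid(w * x + b)

theta_d = (2.0, -6.0)   # illustrative parameters; decision boundary at x = 3

p_low = toy_discriminator(0.0, theta_d)        # far below the boundary: near 0
p_boundary = toy_discriminator(3.0, theta_d)   # on the boundary: exactly 0.5
p_high = toy_discriminator(6.0, theta_d)       # far above the boundary: near 1
print(p_low, p_boundary, p_high)
```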

<a id="2"></a>
## 2. Mathematical Foundations

GANs are trained using a minimax game where the generator and discriminator have opposing objectives.

<a id="2.1"></a>
### GAN Objective Function

The objective function for the GAN is defined as:

$$
\min_{G} \max_{D} V(D, G) = \mathbb{E}_{x \sim p_{\text{data}}(x)}[\log D(x)] + \mathbb{E}_{z \sim p_z(z)}[\log (1 - D(G(z)))]
$$

- **Generator Objective:** Minimize $\log (1 - D(G(z)))$, i.e., generate data that the discriminator classifies as real.
- **Discriminator Objective:** Maximize $\log D(x) + \log (1 - D(G(z)))$, i.e., correctly classify real and fake data.
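
To get a feel for the value function, the sketch below estimates $V(D, G)$ from a handful of made-up discriminator outputs. The function name `value_function` and the sample probabilities are assumptions for illustration. When the discriminator cannot tell real from fake and outputs 0.5 everywhere, the value is $-\log 4$, which is the equilibrium value derived in the original paper [[1]](#ref1).

```python
import math

def value_function(d_real, d_fake):
    """Sample estimate of V(D, G).

    d_real: discriminator outputs D(x) on real samples, each in (0, 1).
    d_fake: discriminator outputs D(G(z)) on generated samples, each in (0, 1).
    """
    real_term = sum(math.log(p) for p in d_real) / len(d_real)
    fake_term = sum(math.log(1.0 - p) for p in d_fake) / len(d_fake)
    return real_term + fake_term

# A discriminator that separates real from fake well: V is close to 0.
confident = value_function([0.9, 0.95, 0.8], [0.1, 0.05, 0.2])

# A discriminator that is completely fooled: V equals -log(4).
fooled = value_function([0.5, 0.5, 0.5], [0.5, 0.5, 0.5])

print(f"confident D: {confident:.4f}")
print(f"fooled D:    {fooled:.4f}")
```

The discriminator pushes this number up toward 0, while the generator pushes it down.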

<a id="2.2"></a>
### Training Process

1. **Update Discriminator:**
   - Maximize the probability of assigning the correct label to both real and generated data.
2. **Update Generator:**
   - Minimize the probability that the discriminator correctly identifies generated data as fake.
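
The original paper [[1]](#ref1) notes that $\log(1 - D(G(z)))$ can saturate early in training, when the discriminator easily rejects generated samples, and suggests having the generator maximize $\log D(G(z))$ instead. The implementation in Section 3 follows that non-saturating variant, since its `generator_loss` is a cross-entropy against labels of 1. The sketch below compares the gradient of each loss with respect to the discriminator's pre-sigmoid score (logit) on a generated sample. The function names are hypothetical.

```python
import math

def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))

def grad_saturating(logit):
    """d/da of log(1 - sigmoid(a)): the minimax generator loss."""
    return -sigmoid(logit)

def grad_non_saturating(logit):
    """d/da of -log(sigmoid(a)): the non-saturating generator loss."""
    return -(1.0 - sigmoid(logit))

# logit << 0 means the discriminator is confident the sample is fake.
for logit in (-6.0, 0.0, 6.0):
    print(
        f"logit={logit:+.1f}  "
        f"saturating={grad_saturating(logit):+.4f}  "
        f"non-saturating={grad_non_saturating(logit):+.4f}"
    )
```

At a logit of -6 the minimax gradient is almost zero while the non-saturating gradient is close to -1, so the generator still receives a useful learning signal when it is losing badly.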

In [None]:
%pip install tensorflow matplotlib

In [None]:
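# Pseudocode for training, kept as comments so that the cell runs without error.
# k is the number of discriminator updates per generator update
# (the original paper used k = 1 in its experiments).
#
# for each training iteration:
#     for k steps:
#         sample a minibatch of real data x ~ p_data(x)
#         sample a minibatch of noise z ~ p_z(z)
#         update the discriminator by ascending its stochastic gradient:
#             ∇_θd [log D(x) + log(1 - D(G(z)))]
#     sample a minibatch of noise z ~ p_z(z)
#     update the generator by descending its stochastic gradient:
#         ∇_θg [log(1 - D(G(z)))]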

<a id="3"></a>
## 3. Implementing a GAN for Image Generation

We'll implement a simple GAN using TensorFlow and Keras to generate images similar to the MNIST handwritten digits.

<a id="3.1"></a>
### Dataset Preparation

We'll use the MNIST dataset, which consists of 28x28 grayscale images of handwritten digits.

In [None]:
# Import necessary libraries
import tensorflow as tf
from tensorflow.keras import layers
import numpy as np
import matplotlib.pyplot as plt

# Load and preprocess the dataset
(train_images, train_labels), (_, _) = tf.keras.datasets.mnist.load_data()
train_images = train_images.reshape(train_images.shape[0], 28, 28, 1).astype('float32')

# Normalize the images to [-1, 1]
train_images = (train_images - 127.5) / 127.5

BUFFER_SIZE = 60000
BATCH_SIZE = 256

# Batch and shuffle the data
train_dataset = tf.data.Dataset.from_tensor_slices(train_images).shuffle(BUFFER_SIZE).batch(BATCH_SIZE)
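
The rescaling in the cell above maps pixel values from [0, 255] to [-1, 1], which matches the range of the generator's `tanh` output. The plotting code later in this tutorial applies the inverse map. The following sketch checks both directions on single pixel values; the helper names `normalize` and `denormalize` are made up for this example.

```python
def normalize(pixel):
    """Map a pixel value in [0, 255] to [-1, 1]."""
    return (pixel - 127.5) / 127.5

def denormalize(value):
    """Inverse map, from [-1, 1] back to [0, 255]."""
    return value * 127.5 + 127.5

darkest = normalize(0)       # -1.0
brightest = normalize(255)   # 1.0
round_trip = denormalize(normalize(200))

print(darkest, brightest, round_trip)
```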

<a id="3.2"></a>
### Building the Generator

The generator network transforms a random noise vector into a 28x28x1 image.

In [None]:
# Generator Model
def make_generator_model():
    model = tf.keras.Sequential()
    model.add(layers.Dense(7*7*256, use_bias=False, input_shape=(100,)))
    model.add(layers.BatchNormalization())
    model.add(layers.LeakyReLU())

    model.add(layers.Reshape((7, 7, 256)))
    assert model.output_shape == (None, 7, 7, 256)  # None is the batch size

    model.add(layers.Conv2DTranspose(128, (5, 5), strides=(1, 1), padding='same', use_bias=False))
    assert model.output_shape == (None, 7, 7, 128)
    model.add(layers.BatchNormalization())
    model.add(layers.LeakyReLU())

    model.add(layers.Conv2DTranspose(64, (5, 5), strides=(2, 2), padding='same', use_bias=False))
    assert model.output_shape == (None, 14, 14, 64)
    model.add(layers.BatchNormalization())
    model.add(layers.LeakyReLU())

    model.add(layers.Conv2DTranspose(1, (5, 5), strides=(2, 2), padding='same', use_bias=False, activation='tanh'))
    assert model.output_shape == (None, 28, 28, 1)

    return model

# Create the generator
generator = make_generator_model()

# Generate a sample noise vector
noise = tf.random.normal([1, 100])
generated_image = generator(noise, training=False)

# Display the generated image
plt.imshow(generated_image[0, :, :, 0], cmap='gray')
plt.show()

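The generator starts with a dense layer whose output is reshaped into a 7x7x256 tensor. Three transposed convolutions (sometimes loosely called deconvolutions) follow: the first keeps the 7x7 resolution, and the next two each double it, giving 14x14 and then 28x28x1. The final `tanh` activation keeps the output in [-1, 1], matching the normalized training images.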
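
The shape assertions in the generator can be reproduced by hand. With `padding='same'`, a Keras transposed convolution multiplies the spatial size by its stride, so the sketch below just walks through the three layers. The helper name `conv_transpose_same_output` is an assumption for this example.

```python
def conv_transpose_same_output(size, stride):
    """Spatial output size of a transposed convolution with 'same' padding."""
    return size * stride

# Number of units the first Dense layer must produce before the reshape.
dense_units = 7 * 7 * 256
print(dense_units)  # 12544

# Strides of the three Conv2DTranspose layers in make_generator_model.
size = 7
sizes = [size]
for stride in (1, 2, 2):
    size = conv_transpose_same_output(size, stride)
    sizes.append(size)

print(sizes)  # [7, 7, 14, 28]
```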

<a id="3.3"></a>
### Building the Discriminator

The discriminator network classifies 28x28x1 images as real or fake.

In [None]:
# Discriminator Model
def make_discriminator_model():
    model = tf.keras.Sequential()
    model.add(layers.Conv2D(64, (5, 5), strides=(2, 2), padding='same',
                                     input_shape=[28, 28, 1]))
    model.add(layers.LeakyReLU())
    model.add(layers.Dropout(0.3))

    model.add(layers.Conv2D(128, (5, 5), strides=(2, 2), padding='same'))
    model.add(layers.LeakyReLU())
    model.add(layers.Dropout(0.3))

    model.add(layers.Flatten())
    model.add(layers.Dense(1))

    return model

# Create the discriminator
discriminator = make_discriminator_model()

# Test the discriminator
decision = discriminator(generated_image)
print(decision)

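The discriminator uses strided convolutional layers to extract features from the input image and outputs a single unnormalized score (a logit) rather than a probability, because the final `Dense(1)` layer has no activation. Training pushes this score toward positive values for real images and negative values for fake ones. Applying a sigmoid to the score gives the probability that the input is real; the loss in the next section does this internally through `from_logits=True`.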
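
The discriminator's shapes can be traced the same way as the generator's. With `padding='same'`, a strided convolution produces `ceil(size / stride)` outputs per spatial dimension. The sketch below follows the two `Conv2D` layers and then shows how a sigmoid turns the final `Dense(1)` output into a probability. The helper names are hypothetical.

```python
import math

def conv_same_output(size, stride):
    """Spatial output size of a convolution with 'same' padding."""
    return math.ceil(size / stride)

size = 28
for stride in (2, 2):            # strides of the two Conv2D layers
    size = conv_same_output(size, stride)

flattened = size * size * 128    # 128 filters in the second Conv2D layer
print(size, flattened)           # 7 6272

def sigmoid(score):
    """Convert the Dense(1) output into a probability of 'real'."""
    return 1.0 / (1.0 + math.exp(-score))

undecided = sigmoid(0.0)         # a score of 0 corresponds to probability 0.5
print(undecided)
```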

<a id="3.4"></a>
### Defining the GAN

We'll define the loss functions and optimizers for both the generator and discriminator.

In [None]:
# Define loss functions and optimizers
cross_entropy = tf.keras.losses.BinaryCrossentropy(from_logits=True)

# Discriminator loss
def discriminator_loss(real_output, fake_output):
    real_loss = cross_entropy(tf.ones_like(real_output), real_output)
    fake_loss = cross_entropy(tf.zeros_like(fake_output), fake_output)
    total_loss = real_loss + fake_loss
    return total_loss

# Generator loss
def generator_loss(fake_output):
    return cross_entropy(tf.ones_like(fake_output), fake_output)

# Optimizers
generator_optimizer = tf.keras.optimizers.Adam(1e-4)
discriminator_optimizer = tf.keras.optimizers.Adam(1e-4)

<a id="3.5"></a>
### Training the GAN

We'll define the training loop to update the generator and discriminator.

In [None]:
# Used by the training loop to refresh the notebook output
from IPython import display

# Training parameters
EPOCHS = 50
noise_dim = 100
num_examples_to_generate = 16

# Fixed noise, reused every epoch so progress is comparable across epochs
seed = tf.random.normal([num_examples_to_generate, noise_dim])

# One optimization step for both networks on a single batch
@tf.function
def train_step(images):
    # Match the noise batch to the image batch; the last batch may be smaller
    noise = tf.random.normal([tf.shape(images)[0], noise_dim])

    with tf.GradientTape() as gen_tape, tf.GradientTape() as disc_tape:
        generated_images = generator(noise, training=True)

        real_output = discriminator(images, training=True)
        fake_output = discriminator(generated_images, training=True)

        gen_loss = generator_loss(fake_output)
        disc_loss = discriminator_loss(real_output, fake_output)

    gradients_of_generator = gen_tape.gradient(gen_loss, generator.trainable_variables)
    gradients_of_discriminator = disc_tape.gradient(disc_loss, discriminator.trainable_variables)

    generator_optimizer.apply_gradients(zip(gradients_of_generator, generator.trainable_variables))
    discriminator_optimizer.apply_gradients(zip(gradients_of_discriminator, discriminator.trainable_variables))

# Generate a 4x4 grid of images, save it to disk, and display it
def generate_and_save_images(model, epoch, test_input):
    predictions = model(test_input, training=False)

    plt.figure(figsize=(4, 4))

    for i in range(predictions.shape[0]):
        plt.subplot(4, 4, i+1)
        # Undo the [-1, 1] normalization before plotting
        plt.imshow(predictions[i, :, :, 0] * 127.5 + 127.5, cmap='gray')
        plt.axis('off')

    plt.savefig('image_at_epoch_{:04d}.png'.format(epoch))
    plt.show()

# Training loop
def train(dataset, epochs):
    for epoch in range(epochs):
        for image_batch in dataset:
            train_step(image_batch)

        # Replace the previous output with samples from the current generator
        display.clear_output(wait=True)
        generate_and_save_images(generator, epoch + 1, seed)

# Start training
train(train_dataset, EPOCHS)

During training, the generator tries to produce images that can fool the discriminator, while the discriminator tries to become better at distinguishing real images from fake ones.
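
To make this tug-of-war concrete, the sketch below recomputes the two losses from Section 3.4 by hand, using binary cross-entropy on raw logits in plain Python. It is an illustration under assumed logit values, not the TensorFlow implementation itself, and the `toy_` names are hypothetical.

```python
import math

def bce_from_logit(label, logit):
    """Binary cross-entropy for one example, computed stably from a logit."""
    return max(logit, 0.0) - logit * label + math.log1p(math.exp(-abs(logit)))

def mean(values):
    return sum(values) / len(values)

def toy_discriminator_loss(real_logits, fake_logits):
    """Real samples are labeled 1 and generated samples are labeled 0."""
    real_loss = mean([bce_from_logit(1.0, a) for a in real_logits])
    fake_loss = mean([bce_from_logit(0.0, a) for a in fake_logits])
    return real_loss + fake_loss

def toy_generator_loss(fake_logits):
    """The generator wants its samples to be labeled 1 (real)."""
    return mean([bce_from_logit(1.0, a) for a in fake_logits])

# Case 1: the discriminator is undecided (all logits are 0).
d_undecided = toy_discriminator_loss([0.0, 0.0], [0.0, 0.0])   # log(4)
g_undecided = toy_generator_loss([0.0, 0.0])                   # log(2)

# Case 2: the discriminator separates real (+3) from fake (-3) confidently.
d_confident = toy_discriminator_loss([3.0, 3.0], [-3.0, -3.0])
g_confident = toy_generator_loss([-3.0, -3.0])

print(f"undecided: D loss {d_undecided:.4f}, G loss {g_undecided:.4f}")
print(f"confident: D loss {d_confident:.4f}, G loss {g_confident:.4f}")
```

When the discriminator gets better, its loss falls and the generator's loss rises, and the reverse happens when the generator improves. Neither loss is expected to decrease steadily the way a loss does in ordinary supervised training.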

<a id="3.6"></a>
### Generating Images

After training, we can use the generator to produce new images.

In [None]:
# Generate new images from fresh noise (not the fixed seed used during training)
noise = tf.random.normal([num_examples_to_generate, noise_dim])
predictions = generator(noise, training=False)

fig = plt.figure(figsize=(4, 4))

for i in range(predictions.shape[0]):
    plt.subplot(4, 4, i+1)
    plt.imshow(predictions[i, :, :, 0] * 127.5 + 127.5, cmap='gray')
    plt.axis('off')

plt.show()

After enough training, the generated images should resemble handwritten digits from the MNIST dataset, although individual samples may still be blurry or malformed.

<a id="4"></a>
## 4. Latest Developments in GANs

GANs have evolved significantly since their introduction, with numerous variants addressing different challenges and applications.

<a id="4.1"></a>
### 4.1 Deep Convolutional GAN (DCGAN)

**DCGANs** [[2]](#ref2) introduce architectural guidelines to improve the stability of GANs:

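- Replace pooling layers with strided convolutions (discriminator) and transposed convolutions (generator).
- Use batch normalization in both the generator and discriminator, except at the generator's output layer and the discriminator's input layer.
- Remove fully connected hidden layers for deeper architectures.
- Use ReLU activation in the generator for all layers except the output, which uses tanh, and LeakyReLU in the discriminator.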

<a id="4.2"></a>
### 4.2 Wasserstein GAN (WGAN)

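**Wasserstein GANs** [[3]](#ref3) aim to reduce training instability and mode collapse by training the generator to minimize an approximation of the Wasserstein-1 distance (Earth Mover's distance) between the real and generated distributions, instead of the original minimax objective.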
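
For intuition about the distance itself: in one dimension, the Earth Mover's distance between two equally sized samples is the average gap between their sorted values. The sketch below uses that special case. A WGAN does not compute the distance this way; it estimates it with a learned critic. The function name `wasserstein_1d` is hypothetical.

```python
def wasserstein_1d(a, b):
    """Wasserstein-1 distance between two equally sized 1-D samples."""
    if len(a) != len(b):
        raise ValueError("this sketch assumes samples of equal size")
    return sum(abs(x - y) for x, y in zip(sorted(a), sorted(b))) / len(a)

real = [0.0, 1.0, 2.0, 3.0]
shifted = [x + 5.0 for x in real]    # same shape, moved 5 units to the right

print(wasserstein_1d(real, real))     # 0.0
print(wasserstein_1d(real, shifted))  # 5.0
```

The distance grows smoothly with the size of the shift, even when the two samples do not overlap, which is the property that motivates its use as a training signal.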

**Key Changes:**

- Replace the discriminator with a critic that outputs real-valued scores.
- Remove the sigmoid activation in the output layer of the critic.
- Use weight clipping to enforce a Lipschitz constraint.
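
Weight clipping itself is a small operation: after each critic update, every weight is forced into a fixed interval `[-c, c]`. The sketch below shows it on a plain list of numbers, with the clipping constant 0.01 taken from the defaults in the WGAN paper [[3]](#ref3). The function name `clip_weights` is an assumption; in TensorFlow the same effect is usually obtained with `tf.clip_by_value` on each critic variable.

```python
def clip_weights(weights, c=0.01):
    """Clamp every weight to the interval [-c, c]."""
    return [max(-c, min(c, w)) for w in weights]

critic_weights = [-0.5, -0.005, 0.0, 0.02]   # made-up values for illustration
clipped = clip_weights(critic_weights)

print(clipped)  # [-0.01, -0.005, 0.0, 0.01]
```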

In [None]:
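# WGAN critic loss: the critic maximizes mean(real) - mean(fake),
# so the quantity to minimize is its negative.
def critic_loss(real_output, fake_output):
    return tf.reduce_mean(fake_output) - tf.reduce_mean(real_output)

# WGAN generator loss: raise the critic's score on generated samples.
# Named separately so it does not overwrite generator_loss from Section 3.
def wgan_generator_loss(fake_output):
    return -tf.reduce_mean(fake_output)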

<a id="4.3"></a>
### 4.3 StyleGAN

**StyleGAN** [[4]](#ref4) introduces a new generator architecture capable of synthesizing high-resolution, photorealistic images.

**Key Innovations:**

- Style-based generator architecture.
- Adaptive Instance Normalization (AdaIN) layers.
- Stochastic variation through noise inputs.
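
AdaIN normalizes each feature map to zero mean and unit variance, then rescales and shifts it with a style-dependent scale and bias. The sketch below applies this to a single flattened channel in plain Python. The function name `adain` and the input values are assumptions for illustration; in StyleGAN the scale and bias are produced from the style vector by a learned transformation.

```python
import math

def adain(feature_map, style_scale, style_bias, eps=1e-8):
    """Adaptive instance normalization for one flattened channel.

    Normalizes the channel, then applies the style's scale and bias.
    """
    n = len(feature_map)
    mean = sum(feature_map) / n
    variance = sum((v - mean) ** 2 for v in feature_map) / n
    std = math.sqrt(variance + eps)
    return [style_scale * (v - mean) / std + style_bias for v in feature_map]

channel = [1.0, 2.0, 3.0, 4.0]   # made-up activations
styled = adain(channel, style_scale=2.0, style_bias=10.0)

# After AdaIN, the channel's statistics are set by the style, not the input.
styled_mean = sum(styled) / len(styled)
styled_std = math.sqrt(sum((v - styled_mean) ** 2 for v in styled) / len(styled))
print(f"mean={styled_mean:.3f}, std={styled_std:.3f}")
```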

**Applications:**

- Generating high-resolution human faces that are often difficult to distinguish from real photos.
- Fine-grained control over image attributes.

<a id="4.4"></a>
### 4.4 CycleGAN

**CycleGAN** [[5]](#ref5) enables image-to-image translation without paired training data.

**Key Concepts:**

- **Cycle Consistency Loss:** Encourages that translating an image from one domain to the other and back approximately reconstructs the original image.
- **Two Generators and Two Discriminators:** For mapping between the two domains.
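
The cycle-consistency idea can be sketched with toy "images" represented as lists of numbers and two hypothetical translators: `g_xy` maps domain X to Y and `f_yx` maps back. The loss is the L1 reconstruction error after a round trip in each direction, averaged per pixel here for readability. All names and values below are assumptions for illustration.

```python
def l1_distance(a, b):
    """Mean absolute difference between two equally sized 'images'."""
    return sum(abs(p - q) for p, q in zip(a, b)) / len(a)

def cycle_consistency_loss(x_batch, y_batch, g_xy, f_yx):
    """L1 cycle loss over x -> G(x) -> F(G(x)) and y -> F(y) -> G(F(y))."""
    forward = sum(l1_distance(f_yx(g_xy(x)), x) for x in x_batch) / len(x_batch)
    backward = sum(l1_distance(g_xy(f_yx(y)), y) for y in y_batch) / len(y_batch)
    return forward + backward

def g_xy(image):
    """Toy translator X -> Y: doubles every pixel."""
    return [2.0 * v for v in image]

def f_exact(image):
    """Toy translator Y -> X that inverts g_xy exactly."""
    return [v / 2.0 for v in image]

def f_biased(image):
    """Toy translator Y -> X with a constant error of 0.1 per pixel."""
    return [v / 2.0 + 0.1 for v in image]

x_batch = [[0.0, 0.5, 1.0], [0.2, 0.4, 0.6]]
y_batch = [[0.0, 1.0, 2.0], [0.4, 0.8, 1.2]]

exact_loss = cycle_consistency_loss(x_batch, y_batch, g_xy, f_exact)
biased_loss = cycle_consistency_loss(x_batch, y_batch, g_xy, f_biased)
print(f"exact inverse: {exact_loss:.3f}, biased inverse: {biased_loss:.3f}")
```

The loss is zero only when the two translators undo each other, which is the constraint that stands in for paired training data.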

<a id="5"></a>
## 5. Conclusion

GANs have opened up exciting possibilities in generative modeling, enabling the creation of realistic images, videos, and more. Understanding the interplay between the generator and discriminator is crucial for effectively training GANs. Ongoing research continues to address challenges such as training stability, mode collapse, and scalability, leading to more powerful and versatile GAN architectures.

<a id="6"></a>
## 6. References

1. <a id="ref1"></a>Goodfellow, I., et al. (2014). *Generative Adversarial Nets*. [NeurIPS 2014](https://papers.nips.cc/paper/5423-generative-adversarial-nets)
2. <a id="ref2"></a>Radford, A., Metz, L., & Chintala, S. (2015). *Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks*. [arXiv:1511.06434](https://arxiv.org/abs/1511.06434)
3. <a id="ref3"></a>Arjovsky, M., Chintala, S., & Bottou, L. (2017). *Wasserstein GAN*. [arXiv:1701.07875](https://arxiv.org/abs/1701.07875)
4. <a id="ref4"></a>Karras, T., Laine, S., & Aila, T. (2018). *A Style-Based Generator Architecture for Generative Adversarial Networks*. [arXiv:1812.04948](https://arxiv.org/abs/1812.04948)
5. <a id="ref5"></a>Zhu, J., et al. (2017). *Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks*. [arXiv:1703.10593](https://arxiv.org/abs/1703.10593)

---

This notebook provides an in-depth exploration of GANs. You can run the code cells to see how GANs are implemented and experiment with different architectures and datasets.