<center>
    <h1>Generative Adversarial Networks (GAN)</h1>
</center>

# Brief Recap of Generative Adversarial Networks (GANs)

- Generative Adversarial Networks (GANs) are a class of deep learning models introduced by Ian Goodfellow and his colleagues in 2014.

- GANs consist of two neural networks that are trained simultaneously in a competitive process:

    1. **Generator**: This network learns to create data (e.g., images) that resembles the training data.
    
    2. **Discriminator**: This network learns to distinguish between real data from the training set and fake data produced by the generator.

- The two networks are locked in a constant battle, with the generator trying to produce increasingly realistic data to fool the discriminator, and the discriminator becoming better at detecting fake data.

## GAN Architecture

- Generative Adversarial Networks (GANs) consist of two main components: the Generator and the Discriminator.

- These two neural networks work in tandem to create a powerful generative model.

### How do GANs work?

The training process of GANs can be likened to a game between a counterfeiter (generator) and a detective (discriminator):

1. The generator creates fake data from random noise.

2. The discriminator is presented with both real and fake data and tries to distinguish between them.
3. Based on the discriminator's feedback, the generator adjusts its parameters to produce more convincing fakes.
4. This process continues iteratively, with both networks improving over time.
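
The game described above is commonly written as the minimax objective from the original GAN formulation. The symbols are introduced here only for illustration and are not used elsewhere in this notebook: $D(x)$ is the discriminator's estimated probability that $x$ is real, $G(z)$ is the generator's output for a noise vector $z$, and $p_{\text{data}}$ and $p_z$ are the data and noise distributions.

$$
\min_G \max_D V(D, G) = \mathbb{E}_{x \sim p_{\text{data}}}\big[\log D(x)\big] + \mathbb{E}_{z \sim p_z}\big[\log\big(1 - D(G(z))\big)\big]
$$

The discriminator tries to make $V$ large by scoring real samples near 1 and generated samples near 0, while the generator tries to make $V$ small by pushing $D(G(z))$ towards 1.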

<center>
    <img src="static/image2.gif" alt="GAN Workflow" style="width:50%;">
</center>

### Generator

The Generator is responsible for creating synthetic data that resembles the training dataset.

**Structure:**

- Input Layer: Takes random noise as input.

- Hidden Layers: A series of dense and transposed convolutional layers that progressively upsample the input.
- Output Layer: Produces an image of the desired size.

### Discriminator

The Discriminator acts as a binary classifier, distinguishing between real images from the dataset and fake images produced by the Generator.

**Structure:**

- Input Layer: Takes an image as input.

- Hidden Layers: A series of convolutional layers that downsample the input.
- Output Layer: A single neuron that outputs the probability of the input being real.

<center>
    <img src="static/image1.png" alt="GAN Workflow" style="width:50%;">
</center>

### Challenges in Training GANs

While powerful, GANs are notoriously difficult to train due to several challenges:

- **Mode Collapse**: The generator may learn to produce only a limited variety of outputs.

- **Convergence Issues**: Finding the right balance between the generator and discriminator can be tricky.
- **Evaluation Metrics**: Quantitatively assessing the quality of generated samples is challenging.

## Applications of GANs

GANs have found numerous applications across various fields:

- **Image Generation**: Creating realistic images, artwork, and even faces of non-existent people.

- **Image-to-Image Translation**: Transforming images from one domain to another (e.g., sketches to photos, day to night scenes).
- **Super-Resolution**: Enhancing the resolution and quality of low-resolution images.
- **Text-to-Image Synthesis**: Generating images based on textual descriptions.
- **Data Augmentation**: Creating synthetic data to enhance training datasets for other machine learning models.


In this notebook, we'll implement a basic GAN architecture to generate handwritten digits similar to those in the MNIST dataset. Through this process, we'll explore the fundamental concepts and challenges of training GANs using TensorFlow.


# Implementing GANs with TensorFlow

## Generator Model

Before moving on to MNIST, we start with a small toy GAN that works on 2D data points. The create_generator_model() function defines the architecture of its generator. Let's break it down:

In [1]:
import tensorflow as tf  # needed for tf.random.normal in the usage snippet further down
from tensorflow import keras
# Sample code to create a generator model
def create_generator_model():
    model = keras.Sequential([
        keras.layers.Dense(16, input_dim=100, activation='relu'),
        keras.layers.Dense(32, activation='relu'),
        keras.layers.Dense(2, activation='linear')
    ])
    return model

This function creates a sequential model using Keras, which is a linear stack of layers. Here's what each layer does:

1. **Input Layer**: 
   ```python
   keras.layers.Dense(16, input_dim=100, activation='relu')
   ```
   - This is the first layer of the generator.
   
   - It takes an input of dimension 100 (typically random noise).
   - It has 16 neurons (units) and uses the ReLU activation function.
   - The ReLU function helps introduce non-linearity, allowing the model to learn complex patterns.
   - For more information about the Dense layer, refer to this link: [TensorFlow Dense Layer Documentation](https://www.tensorflow.org/api_docs/python/tf/keras/layers/Dense)

2. **Hidden Layer**:
   ```python
   keras.layers.Dense(32, activation='relu')
   ```
   - This is a hidden layer with 32 neurons.
   
   - It also uses the ReLU activation function.
   - This layer helps the model learn more complex representations of the data.

3. **Output Layer**:
   ```python
   keras.layers.Dense(2, activation='linear')
   ```
   - This is the output layer of the generator.

   - It has 2 neurons, corresponding to the two dimensions of our generated data points.
   - It uses a linear activation function, which means it outputs raw values without any transformation.

To use this generator in your GAN:

1. Create an instance of the model:
   ```python
   generator = create_generator_model()
   ```

2. You can then use this generator to produce fake samples:
   ```python
   noise = tf.random.normal([batch_size, 100])
   generated_samples = generator(noise, training=True)
   ```

**Important Points:**
- The generator's task is to transform the random input noise (100-dimensional vector) into a 2-dimensional point that resembles the real data distribution. 

- As the GAN training progresses, the generator will learn to produce points that will be increasingly similar to the real data.

- This generator architecture is relatively simple, suitable for generating 2D Gaussian data. For more complex data (like images), you would typically use a more sophisticated architecture, possibly including convolutional layers for upsampling.
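
To get a feel for how small this toy generator is, its trainable parameters can be counted by hand. The sketch below is plain Python rather than a Keras call, and the helper name `dense_param_count` is made up for this illustration. It assumes every Dense layer keeps its bias, which is the Keras default.

```python
def dense_param_count(n_inputs: int, n_units: int, use_bias: bool = True) -> int:
    """Weights plus (optional) biases of one fully connected layer."""
    return n_inputs * n_units + (n_units if use_bias else 0)

# Layer widths of the toy generator: 100 -> 16 -> 32 -> 2
layer_sizes = [100, 16, 32, 2]

# One entry per Dense layer, pairing each layer's input width with its output width
per_layer = [dense_param_count(a, b) for a, b in zip(layer_sizes, layer_sizes[1:])]
total_params = sum(per_layer)

print(per_layer)     # [1616, 544, 66]
print(total_params)  # 2226
```

If the model has been built, `generator.summary()` should report the same per-layer counts.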

## Discriminator Model

The discriminator in a GAN is responsible for distinguishing between real data samples and fake samples generated by the generator. Let's examine its architecture:

In [2]:
def create_discriminator_model():
    model = keras.Sequential([
        keras.layers.Dense(32, input_dim=2, activation='relu'),
        keras.layers.Dense(16, activation='relu'),
        keras.layers.Dense(1, activation='sigmoid')
    ])
    return model

This function creates a sequential model for the discriminator. Let's break down each layer:

**1. Input Layer:**
```python
keras.layers.Dense(32, input_dim=2, activation='relu')
```
- This is the input layer of the discriminator.

- It expects input data with 2 dimensions, corresponding to our 2D Gaussian data points.
- It has 32 neurons and uses the ReLU activation function.
- The ReLU function introduces non-linearity, allowing the model to learn complex patterns.

**2. Hidden Layer**
```python
keras.layers.Dense(16, activation='relu')
```
- This is a hidden layer with 16 neurons.

- It also uses the ReLU activation function.
- This layer helps the model learn more sophisticated representations of the input data.

**3. Output Layer**
```python
keras.layers.Dense(1, activation='sigmoid')
```
- This is the output layer of the discriminator.

- It has a single neuron, which outputs a probability between 0 and 1.
- The sigmoid activation function is used to squash the output into this probability range.
- An output close to 1 indicates the discriminator believes the input is real, while an output close to 0 suggests it thinks the input is fake.


To use this discriminator in your GAN:

1. Create an instance of the model:
   ```python
   discriminator = create_discriminator_model()
   ```

2. You can then use this discriminator to classify real and fake samples:
   ```python
   real_output = discriminator(real_samples, training=True)
   fake_output = discriminator(generated_samples, training=True)
   ```

**Important Points:**

- The discriminator's role is to learn to distinguish between the real data points and the fake points produced by the generator.

- As training progresses, the discriminator should become better at this task, which in turn forces the generator to produce more realistic samples.
- This architecture is suitable for 2D data. For more complex data types, such as images, you would typically use a more sophisticated architecture, possibly including convolutional layers for feature extraction.
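
The sigmoid in the output layer is what turns the last neuron's raw value into a probability. Below is a minimal plain-Python sketch of that mapping. The example input values and the 0.5 decision threshold are assumptions chosen for illustration, not something the model above enforces.

```python
import math

def sigmoid(x: float) -> float:
    """Squash a raw score into the (0, 1) range."""
    return 1.0 / (1.0 + math.exp(-x))

# Hypothetical pre-activation values of the discriminator's output neuron
for raw_score in (-4.0, 0.0, 4.0):
    p_real = sigmoid(raw_score)
    verdict = "real" if p_real >= 0.5 else "fake"  # assumed threshold
    print(f"raw score {raw_score:+.1f} -> P(real) = {p_real:.3f} -> {verdict}")
```

A raw score of 0 maps to exactly 0.5, meaning the discriminator is undecided. Large positive scores approach 1 and large negative scores approach 0.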


# Let's Build a Real-World Project to Understand GANs Better

# The MNIST Handwritten Digit Recognition Problem

## Key Features:

- **Content**: 70,000 grayscale images of handwritten digits (0-9)

- **Image Size**: 28x28 pixels
- **Split**: 60,000 training images, 10,000 test images
- For more information about the dataset, refer to this link: [TensorFlow MNIST Dataset](https://www.tensorflow.org/datasets/catalog/mnist)

## Our Challenge:

In this notebook, we'll use MNIST to train a Generative Adversarial Network (GAN). Our goal is to create a model that can generate new, realistic handwritten digits that resemble those in the MNIST dataset.

This project will demonstrate:
- Data preprocessing for GANs

- Designing generator and discriminator networks
- Training adversarial models
- Evaluating generated images

By working with MNIST, we'll gain practical experience in building and training GANs, a powerful class of generative models in deep learning.

## Loading and Preprocessing the data for GAN

In [None]:
import tensorflow as tf
from tensorflow import keras
import numpy as np
import matplotlib.pyplot as plt

# Load the MNIST dataset
(x_train, _), (x_test, _) = keras.datasets.mnist.load_data()

print("Training data shape:", x_train.shape)
print("Test data shape:", x_test.shape)

Here, for learning purposes, we'll use a subset of the dataset to train and test our model.

In [4]:
# Use a subset of the data (e.g., 10% for training, 5% for testing)
x_train = x_train[:6000]
x_test = x_test[:500]

### Normalizing pixel values

In the code below, we normalize the pixel values of the MNIST images from [0, 255] to the range [-1, 1]. This is important for several reasons:

- Consistent Scale: It ensures all pixel values are on the same scale, which helps the neural network learn more effectively.

- Centered Data: By subtracting 127.5, we center the data around zero, which can improve training stability.
- Range [-1, 1]: This range is particularly useful for GANs, as the generator's output layer often uses a tanh activation, which produces values in this range.
- Improved Gradient Flow: Normalized data can lead to better gradient flow during backpropagation, potentially speeding up training.


In [21]:
# Normalize the images to [-1, 1]
x_train = (x_train.astype('float32') - 127.5) / 127.5
x_test = (x_test.astype('float32') - 127.5) / 127.5
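
As a quick check of the arithmetic, the same formula can be applied to individual pixel values in plain Python. The helper names `normalize` and `denormalize` are made up for this sketch. The inverse mapping is the one used later in the notebook when generated images are displayed.

```python
def normalize(pixel: float) -> float:
    """Map a pixel value from [0, 255] to [-1, 1]."""
    return (pixel - 127.5) / 127.5

def denormalize(value: float) -> float:
    """Map a value from [-1, 1] back to [0, 255]."""
    return value * 127.5 + 127.5

for pixel in (0, 127.5, 255):
    print(pixel, "->", normalize(pixel))
# 0 -> -1.0
# 127.5 -> 0.0
# 255 -> 1.0
```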

### Adding channel dimension

In the code below, we add a channel dimension to the images, for the following reasons:

- Convolutional Layers: Most deep learning frameworks, including TensorFlow, expect input images for convolutional layers to have a shape of (height, width, channels).

- Grayscale Images: MNIST images are grayscale, so we add a single channel. For color images, this would typically be 3 channels (RGB).

- Compatibility: This reshaping ensures our data is compatible with the input shape expected by our GAN models.
- Consistency: By applying this to both training and test sets, we maintain consistency across our dataset.

- The resulting shape for each image becomes (28, 28, 1), where:
    - 28 is the height of the image
    - 28 is the width of the image
    - 1 is the number of channels (grayscale)

In [6]:
# Add channel dimension
x_train = x_train.reshape(x_train.shape[0], 28, 28, 1)
x_test = x_test.reshape(x_test.shape[0], 28, 28, 1)

In the code below, we create a dataset from our training data:

- tf.data.Dataset: This is TensorFlow's API for building efficient input pipelines. It allows for fast data loading and preprocessing.

- from_tensor_slices(): This function creates a dataset from our training data (x_train). It allows us to work with our data in smaller, manageable pieces.

- shuffle():
    - BUFFER_SIZE = 6000: The number of elements held in the shuffle buffer. Because this equals the size of our training subset, the whole dataset is shuffled uniformly.
    - Shuffling helps prevent the model from learning the order of the training data, which could lead to overfitting.

- batch():
    - BATCH_SIZE = 64: This sets how many images will be processed in each training step.
    - Batching allows for more efficient computation and helps in stabilizing the training process.

In [7]:
# Create tf.data.Dataset
BUFFER_SIZE = 6000
BATCH_SIZE = 64

train_dataset = tf.data.Dataset.from_tensor_slices(x_train).shuffle(BUFFER_SIZE).batch(BATCH_SIZE)
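
With these settings it is easy to work out how many training steps one epoch contains. The sketch below uses plain Python. It relies on `batch()` keeping the final, smaller batch, which is its default behaviour (`drop_remainder=False`).

```python
import math

num_examples = 6000  # size of the training subset
batch_size = 64      # BATCH_SIZE above

# Number of batches the dataset yields per epoch when the remainder is kept
steps_per_epoch = math.ceil(num_examples / batch_size)

# Size of the final, partial batch
last_batch_size = num_examples - (steps_per_epoch - 1) * batch_size

print(steps_per_epoch)  # 94
print(last_batch_size)  # 48
```

The last batch of each epoch therefore holds 48 real images instead of 64. The `train_step` function defined later always draws BATCH_SIZE noise vectors, so in that final step the fake batch is slightly larger than the real one.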

## Designing the model

### Generator Model

The architecture below progressively upsamples the input noise to generate a 28x28 pixel image, suitable for the MNIST dataset. The use of transposed convolutions allows the model to learn upsampling, effectively reversing the process of a typical convolutional network.

**Key Components:**
- Input: Takes a 100-dimensional noise vector.

- Dense Layer: Expands the input to 7x7x256, preparing for reshaping.
- Reshape: Converts the dense output to a 3D shape for convolutional processing.
- Transposed Convolutions: Used for upsampling, increasing the image size.
- Batch Normalization: Stabilizes learning by normalizing layer inputs.
- LeakyReLU: Activation function that allows a small gradient for negative inputs.
- Output: A single channel image (28x28x1) with tanh activation, producing values in [-1, 1].

In [8]:
def make_generator_model():
    model = keras.Sequential([
        # Input layer
        keras.layers.Dense(7*7*256, use_bias=False, input_shape=(100,)),
        keras.layers.BatchNormalization(),
        keras.layers.LeakyReLU(),
        
        # Reshape to start the convolutional structure
        keras.layers.Reshape((7, 7, 256)),
        
        # First transposed-convolution block: stride 1 keeps the 7x7 size (256 -> 128 channels)
        keras.layers.Conv2DTranspose(128, (5, 5), strides=(1, 1), padding='same', use_bias=False),
        keras.layers.BatchNormalization(),
        keras.layers.LeakyReLU(),
        
        # Second block: stride 2 upsamples 7x7 -> 14x14
        keras.layers.Conv2DTranspose(64, (5, 5), strides=(2, 2), padding='same', use_bias=False),
        keras.layers.BatchNormalization(),
        keras.layers.LeakyReLU(),
        
        # Output layer: stride 2 upsamples 14x14 -> 28x28
        keras.layers.Conv2DTranspose(1, (5, 5), strides=(2, 2), padding='same', use_bias=False, activation='tanh')
    ])
    return model
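
The shapes flowing through this generator can be traced by hand. With `padding='same'`, a transposed convolution multiplies the spatial size by its stride. The sketch below is plain Python with a made-up helper name, `conv_transpose_same_size`, and does not call Keras.

```python
def conv_transpose_same_size(size: int, stride: int) -> int:
    """Spatial output size of a transposed convolution with padding='same'."""
    return size * stride

# (filters, stride) of the three Conv2DTranspose layers in make_generator_model()
blocks = [(128, 1), (64, 2), (1, 2)]

dense_units = 7 * 7 * 256  # units in the first Dense layer
shapes = [(7, 7, 256)]     # shape right after the Reshape layer
for filters, stride in blocks:
    height, width, _ = shapes[-1]
    shapes.append((conv_transpose_same_size(height, stride),
                   conv_transpose_same_size(width, stride),
                   filters))

print(dense_units)  # 12544
print(shapes)       # [(7, 7, 256), (7, 7, 128), (14, 14, 64), (28, 28, 1)]
```

Only the two stride-2 layers enlarge the image. The stride-1 layer keeps the 7x7 size and reduces the number of channels.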

### Discriminator Model

The architecture below progressively reduces the spatial dimensions of the input while increasing the number of feature channels. The final dense layer outputs a single value (a logit). Once a sigmoid is applied, this value can be interpreted as the discriminator's confidence that the input image is real. Here the sigmoid is applied inside the loss function.

**Key Components:**
- Input: Accepts 28x28x1 images (MNIST format).

- Convolutional Layers: Extract features from the input images.
    - First layer: 64 filters
    - Second layer: 128 filters
    - Both use 5x5 kernels and stride of 2 for downsampling

- LeakyReLU: Activation function that allows a small gradient for negative inputs, helping with the learning process.

- Dropout: Helps prevent overfitting by randomly setting input units to 0 during training.

- Flatten: Converts the 2D feature maps to a 1D vector for the dense layer.

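- Dense Output: A single neuron with no activation, so it outputs a raw score (logit) rather than a probability. The sigmoid that turns this score into the probability of the input being real is applied inside the loss function.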

In [9]:
def make_discriminator_model():
    model = keras.Sequential([
        # First convolutional layer
        keras.layers.Conv2D(64, (5, 5), strides=(2, 2), padding='same', input_shape=[28, 28, 1]),
        keras.layers.LeakyReLU(),
        keras.layers.Dropout(0.3),
        
        # Second convolutional layer
        keras.layers.Conv2D(128, (5, 5), strides=(2, 2), padding='same'),
        keras.layers.LeakyReLU(),
        keras.layers.Dropout(0.3),
        
        # Flatten the output
        keras.layers.Flatten(),
        
        # Output layer
        keras.layers.Dense(1)
    ])
    return model
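
The discriminator's shapes can be traced the same way. With `padding='same'`, a strided convolution produces an output of size ceil(input / stride). This is a plain-Python sketch with a made-up helper name, `conv_same_size`. Dropout and LeakyReLU are left out because they do not change shapes.

```python
import math

def conv_same_size(size: int, stride: int) -> int:
    """Spatial output size of a convolution with padding='same'."""
    return math.ceil(size / stride)

# (filters, stride) of the two Conv2D layers in make_discriminator_model()
conv_layers = [(64, 2), (128, 2)]

shapes = [(28, 28, 1)]  # input image
for filters, stride in conv_layers:
    height, width, _ = shapes[-1]
    shapes.append((conv_same_size(height, stride),
                   conv_same_size(width, stride),
                   filters))

height, width, channels = shapes[-1]
flattened_features = height * width * channels  # input width of the final Dense(1)

print(shapes)              # [(28, 28, 1), (14, 14, 64), (7, 7, 128)]
print(flattened_features)  # 6272
```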

In [None]:
# Create the generator and discriminator models
generator = make_generator_model()
discriminator = make_discriminator_model()

### Loss Functions

In a GAN, we need separate loss functions for the discriminator and the generator. These functions guide the training process for each network.

We use binary cross-entropy as our base loss function. The from_logits=True parameter indicates that the model output hasn't been passed through a sigmoid function yet.

In [11]:
# Define loss functions
cross_entropy = keras.losses.BinaryCrossentropy(from_logits=True)
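
For a single label $y \in \{0, 1\}$ and a raw discriminator output (logit) $x$, binary cross-entropy computed from logits corresponds to the expression below. The symbols $y$, $x$ and $\sigma$ are introduced here for illustration only.

$$
\ell(y, x) = -\big[\, y \log \sigma(x) + (1 - y) \log\big(1 - \sigma(x)\big) \big], \qquad \sigma(x) = \frac{1}{1 + e^{-x}}
$$

With its default settings, the Keras loss object averages this quantity over all elements in the batch.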

#### Discriminator Loss

The discriminator loss measures how well the discriminator can distinguish between real and fake images:

- real_loss: Calculates the loss for real images. The target is 1 (tf.ones_like), as we want the discriminator to classify real images as real.

- fake_loss: Calculates the loss for fake images. The target is 0 (tf.zeros_like), as we want the discriminator to classify fake images as fake.
- total_loss: The sum of real and fake losses, giving equal importance to both tasks.

In [12]:
def discriminator_loss(real_output, fake_output):
    real_loss = cross_entropy(tf.ones_like(real_output), real_output)
    fake_loss = cross_entropy(tf.zeros_like(fake_output), fake_output)
    total_loss = real_loss + fake_loss
    return total_loss

#### Generator Loss

- The generator loss measures how well the generator can fool the discriminator.

- We use tf.ones_like(fake_output) as the target, meaning we want the generator to produce images that the discriminator classifies as real (1).
- The loss is lower when the discriminator is more easily fooled by the generated images.

In [13]:
def generator_loss(fake_output):
    return cross_entropy(tf.ones_like(fake_output), fake_output)
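
To see how the two losses behave, here is a minimal plain-Python sketch for a single real and a single fake sample. It is not the Keras implementation: `bce_with_logits`, `toy_discriminator_loss` and `toy_generator_loss` are names made up for this example, and the logit values are hypothetical. The numerically stable form `max(x, 0) - x*y + log(1 + exp(-|x|))` is used for the cross-entropy.

```python
import math

def bce_with_logits(target: float, logit: float) -> float:
    """Binary cross-entropy for one sample, computed directly from a logit."""
    return max(logit, 0.0) - logit * target + math.log1p(math.exp(-abs(logit)))

def toy_discriminator_loss(real_logit: float, fake_logit: float) -> float:
    # Real samples are labelled 1, generated samples are labelled 0
    return bce_with_logits(1.0, real_logit) + bce_with_logits(0.0, fake_logit)

def toy_generator_loss(fake_logit: float) -> float:
    # The generator wants its samples to be labelled 1 (real)
    return bce_with_logits(1.0, fake_logit)

# Undecided discriminator (logit 0 means probability 0.5 for every sample)
print(round(toy_discriminator_loss(0.0, 0.0), 4))   # 1.3863
print(round(toy_generator_loss(0.0), 4))            # 0.6931

# Confident, correct discriminator (hypothetical logits +4 and -4)
print(round(toy_discriminator_loss(4.0, -4.0), 4))  # 0.0363
print(round(toy_generator_loss(-4.0), 4))           # 4.0181
```

When the discriminator separates real from fake confidently, its own loss is close to 0 while the generator's loss is large. At the undecided point, each cross-entropy term equals log 2.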

### Optimizers

In our GAN, we need separate optimizers for the generator and discriminator. These optimizers are responsible for updating the model parameters during training.

In [14]:
# Define optimizers
generator_optimizer = keras.optimizers.Adam(1e-4)
discriminator_optimizer = keras.optimizers.Adam(1e-4)

## Training the model

- We first define a train_step function that defines a single training iteration for our GAN.

- It's decorated with @tf.function for improved performance through graph execution.

In [15]:
# Training step
@tf.function
def train_step(images):
    noise = tf.random.normal([BATCH_SIZE, 100])

    with tf.GradientTape() as gen_tape, tf.GradientTape() as disc_tape:
        generated_images = generator(noise, training=True)

        real_output = discriminator(images, training=True)
        fake_output = discriminator(generated_images, training=True)

        gen_loss = generator_loss(fake_output)
        disc_loss = discriminator_loss(real_output, fake_output)

    gradients_of_generator = gen_tape.gradient(gen_loss, generator.trainable_variables)
    gradients_of_discriminator = disc_tape.gradient(disc_loss, discriminator.trainable_variables)

    generator_optimizer.apply_gradients(zip(gradients_of_generator, generator.trainable_variables))
    discriminator_optimizer.apply_gradients(zip(gradients_of_discriminator, discriminator.trainable_variables))

    return gen_loss, disc_loss

**Key Steps of train_step:**

- Generate Noise: Create random noise as input for the generator.

- Generate Fake Images: Use the generator to create fake images from the noise.
- Discriminator Evaluation: Get the discriminator's output for both real and fake images.
- Calculate Losses: Compute losses for both the generator and discriminator.
- Compute Gradients: Use GradientTape to calculate gradients for both networks.
- Update Models: Apply the calculated gradients to update the model parameters.

In [16]:
# Function to get available devices
def get_available_devices():
    devices = tf.config.list_physical_devices()
    return [d for d in devices if d.device_type in ['CPU', 'GPU']]

# training loop
def train(dataset, epochs):
    # Get available devices
    available_devices = get_available_devices()
    device_type = "GPU" if any(d.device_type == "GPU" for d in available_devices) else "CPU"
    
    print(f"Training on {device_type}")

    for epoch in range(epochs):
        for image_batch in dataset:
            gen_loss, disc_loss = train_step(image_batch)
        
        # Print losses every 10 epochs
        if (epoch + 1) % 10 == 0:
            print(f"Epoch {epoch+1}, Gen Loss: {gen_loss:.4f}, Disc Loss: {disc_loss:.4f}")
        
        # Generate and save images every 10 epochs
        if (epoch + 1) % 10 == 0:
            generate_and_save_images(generator, epoch + 1, seed)

**Key Components of train loop:**

- Epoch Loop: Iterates through the specified number of epochs.

- Batch Processing: For each epoch, it processes the dataset batch by batch.
- Training Step: Calls the train_step function for each batch, updating both the generator and discriminator.
- Loss Reporting: Every 10 epochs, prints the generator and discriminator losses from the last batch of that epoch, allowing us to monitor training progress.
- Image Generation: Every 10 epochs, it generates and saves sample images using the current state of the generator. This helps visualize the improvement in image quality over time.
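
The `(epoch + 1) % 10 == 0` check in the loop determines which epochs report losses and save images. The plain-Python sketch below lists them for the 50 epochs used later in this notebook. `REPORT_EVERY` is a made-up name for the interval that is hard-coded in `train`.

```python
EPOCHS = 50        # value used later in this notebook
REPORT_EVERY = 10  # hypothetical name for the interval hard-coded in train()

# range() is zero-based, so epoch + 1 is the human-readable epoch number
report_epochs = [epoch + 1 for epoch in range(EPOCHS)
                 if (epoch + 1) % REPORT_EVERY == 0]

print(report_epochs)  # [10, 20, 30, 40, 50]
```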

The function below visualizes the progress of our GAN. It generates sample images using the current state of the generator and saves them as a grid.

In [17]:
import os

def generate_and_save_images(model, epoch, test_input):
    predictions = model(test_input, training=False)
    fig = plt.figure(figsize=(4, 4))

    for i in range(predictions.shape[0]):
        plt.subplot(4, 4, i+1)
        plt.imshow(predictions[i, :, :, 0] * 127.5 + 127.5, cmap='gray')
        plt.axis('off')

    # Check if directory exists, if not, create it
    output_dir = 'data'
    if not os.path.exists(output_dir):
        os.makedirs(output_dir)

    # Save the figure in the output directory
    plt.savefig(os.path.join(output_dir, f'image_at_epoch_{epoch:04d}.png'))
    plt.close()
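
The `{epoch:04d}` format in the file name pads the epoch number with zeros. A small plain-Python sketch shows why that is convenient: the saved files sort in training order even when listed alphabetically. The helper `image_path` and the epoch values are made up for this illustration.

```python
import os

def image_path(epoch: int, output_dir: str = "data") -> str:
    """Build the same file path that generate_and_save_images() writes to."""
    return os.path.join(output_dir, f"image_at_epoch_{epoch:04d}.png")

epochs = [10, 100, 20]  # hypothetical epoch numbers, deliberately out of order

padded = sorted(os.path.basename(image_path(e)) for e in epochs)
unpadded = sorted(f"image_at_epoch_{e}.png" for e in epochs)

print(padded)    # ['image_at_epoch_0010.png', 'image_at_epoch_0020.png', 'image_at_epoch_0100.png']
print(unpadded)  # ['image_at_epoch_10.png', 'image_at_epoch_100.png', 'image_at_epoch_20.png']
```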

## Set training configurations and visualize the results

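Creates a fixed batch of 16 noise vectors (stored in the variable seed). Reusing the same noise every time sample images are generated makes the saved grids directly comparable across epochs.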

In [18]:
seed = tf.random.normal([16, 100])

Sets the number of training epochs to 50 and calls the train function to start the GAN training process.

In [None]:
EPOCHS = 50
train(train_dataset, EPOCHS)

The code below generates a new set of 16 random images using the trained generator and displays them in a 4x4 grid. It also rescales the pixel values from [-1, 1] to [0, 255] for proper display.

In [None]:
num_examples_to_generate = 16
random_vector_for_generation = tf.random.normal([num_examples_to_generate, 100])
generated_images = generator(random_vector_for_generation, training=False)

fig = plt.figure(figsize=(4, 4))
for i in range(num_examples_to_generate):
    plt.subplot(4, 4, i+1)
    plt.imshow(generated_images[i, :, :, 0] * 127.5 + 127.5, cmap='gray')
    plt.axis('off')
plt.show()