# Generative Adversarial Networks (GANs) for Dog Image Generation
### Project Overview
This project focuses on using Generative Adversarial Networks (GANs) to generate images of dogs. GANs are a class of machine learning models that consist of two neural networks: a generator and a discriminator. The generator creates images, while the discriminator evaluates them, distinguishing between real and generated images. The goal is to train the GAN so that the generated images become indistinguishable from real dog images.
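
The adversarial setup described above is usually written as a two-player minimax game. The formulation below is the standard GAN objective; the symbols $G$, $D$, $p_{\text{data}}$, and $p_z$ are notation introduced here for illustration and are not names used elsewhere in this notebook.

$$
\min_G \max_D V(D, G) = \mathbb{E}_{x \sim p_{\text{data}}(x)}\big[\log D(x)\big] + \mathbb{E}_{z \sim p_z(z)}\big[\log\big(1 - D(G(z))\big)\big]
$$

Here $D(x)$ is the discriminator's estimated probability that image $x$ is real, and $G(z)$ is the image the generator produces from a noise vector $z$.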

### Learning Algorithms and Task
- Type of Learning: Unsupervised Deep Learning (generative modeling)
- Algorithms: Generative Adversarial Networks (GANs)
- Task: Image Generation

### Motivation and Goal
The primary motivation behind this project is to explore the capabilities of GANs in generating realistic images. Specifically, we aim to create images of dogs that can be convincingly classified as real by a pre-trained classifier. This project will not only enhance understanding of GANs but also contribute to the creative frontier of machine learning, where models can generate new and lifelike data.

### Data Source and Provenance
The dataset used for this project is the Stanford Dogs Dataset, which was built from images and annotations in the ImageNet database. It includes images of dogs and their corresponding annotations, detailing the breed and bounding box coordinates.

### Dataset Citation
#### Primary Reference
Khosla, A., Jayadevaprakash, N., Yao, B., & Fei-Fei, L. (2011). Stanford Dogs Dataset. Retrieved from http://vision.stanford.edu/aditya86/ImageNetDogs/

#### Secondary Reference
Deng, J., Dong, W., Socher, R., Li, L.-J., Li, K., & Fei-Fei, L. (2009). ImageNet: A Large-Scale Hierarchical Image Database. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

### Data Description
The dataset consists of:
- Images: JPG format images of dogs.
- Annotations: XML files containing metadata such as image dimensions, dog breed, and bounding box coordinates.

### Data Size
- Number of Images: 20,580
- Annotation Fields: filename, width, height, class, xmin, ymin, xmax, ymax
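
To make the annotation fields concrete, the sketch below parses one annotation from a string. The XML content is a hypothetical example written to mirror the fields listed above, and `parse_annotation_string` is an illustrative helper name, not a function from the dataset tooling.

```python
import xml.etree.ElementTree as ET

# Hypothetical annotation, written to mirror the fields listed above
SAMPLE_XML = """
<annotation>
  <filename>n02085620_7</filename>
  <size><width>250</width><height>188</height><depth>3</depth></size>
  <object>
    <name>Chihuahua</name>
    <bndbox><xmin>71</xmin><ymin>1</ymin><xmax>192</xmax><ymax>180</ymax></bndbox>
  </object>
</annotation>
"""

def parse_annotation_string(xml_text):
    """Extract the eight annotation fields from one XML document."""
    root = ET.fromstring(xml_text.strip())
    record = {
        'filename': root.findtext('filename'),
        'class': root.findtext('object/name'),
    }
    # Image dimensions live under <size>
    for field in ('width', 'height'):
        record[field] = int(root.findtext(f'size/{field}'))
    # Bounding box corners live under <object>/<bndbox>
    for field in ('xmin', 'ymin', 'xmax', 'ymax'):
        record[field] = int(root.findtext(f'object/bndbox/{field}'))
    return record

record = parse_annotation_string(SAMPLE_XML)
print(record)
```

Only the first `object` element is read here, so an image with several annotated dogs would yield a single box.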

## Exploratory Data Analysis (EDA)
### Initial Data Loading and Inspection
We started by loading the dataset and parsing the annotation files to create a DataFrame. The DataFrame provides a structured view of the annotations associated with each image.

In [None]:
import os
import xml.etree.ElementTree as ET
import pandas as pd
import matplotlib.pyplot as plt
import cv2
import numpy as np
from PIL import Image

In [None]:
# Define the paths
images_path = 'all-dogs/'
annotations_path = 'Annotation/'

# Function to parse annotation files
def parse_annotation(annotation_file):
    tree = ET.parse(annotation_file)
    root = tree.getroot()
    
    annotation_data = {
        'filename': root.find('filename').text,
        'width': int(root.find('size/width').text),
        'height': int(root.find('size/height').text),
        'class': root.find('object/name').text,
        'xmin': int(root.find('object/bndbox/xmin').text),
        'ymin': int(root.find('object/bndbox/ymin').text),
        'xmax': int(root.find('object/bndbox/xmax').text),
        'ymax': int(root.find('object/bndbox/ymax').text),
    }
    
    return annotation_data

# Load all annotations into a DataFrame
annotations = []
for annotation_folder in os.listdir(annotations_path):
    folder_path = os.path.join(annotations_path, annotation_folder)
    for annotation_file in os.listdir(folder_path):
        annotation_path = os.path.join(folder_path, annotation_file)
        annotation_data = parse_annotation(annotation_path)
        annotations.append(annotation_data)

annotations_df = pd.DataFrame(annotations)
print(annotations_df.head())

### Visualizations
To understand the dataset better, we visualized some sample images with their bounding boxes.

In [None]:
# Visualize some sample images with bounding boxes
def plot_image_with_bounding_box(image_path, annotation):
    image = cv2.imread(image_path)
    if image is None:
        print(f"Warning: Could not read image {image_path}. Skipping...")
        return
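    # Cast to plain ints in case the row holds NumPy integer types, which OpenCV may reject
    top_left = (int(annotation['xmin']), int(annotation['ymin']))
    bottom_right = (int(annotation['xmax']), int(annotation['ymax']))
    # OpenCV colors are BGR, so (255, 0, 0) draws a blue box
    cv2.rectangle(image, top_left, bottom_right, (255, 0, 0), 2)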
    plt.imshow(cv2.cvtColor(image, cv2.COLOR_BGR2RGB))
    plt.title(annotation['class'])
    plt.show()

# Plot a few sample images
sample_annotations = annotations_df.sample(3)
for idx, row in sample_annotations.iterrows():
    image_file = row['filename'] + '.jpg'
    image_path = os.path.join(images_path, image_file)
    plot_image_with_bounding_box(image_path, row)

### Data Cleaning and Transformation
We checked for missing values and ensured that the bounding box coordinates were within the image dimensions.

In [None]:
# Check for missing values
print(annotations_df.isnull().sum())

# Ensure bounding box coordinates are within image dimensions
annotations_df['xmin'] = annotations_df[['xmin', 'width']].min(axis=1)
annotations_df['ymin'] = annotations_df[['ymin', 'height']].min(axis=1)
annotations_df['xmax'] = annotations_df[['xmax', 'width']].min(axis=1)
annotations_df['ymax'] = annotations_df[['ymax', 'height']].min(axis=1)

# Remove rows with negative or empty boxes (the upper bounds were already enforced above)
annotations_df = annotations_df[(annotations_df['xmin'] >= 0) & (annotations_df['ymin'] >= 0) &
                                (annotations_df['xmax'] > annotations_df['xmin']) &
                                (annotations_df['ymax'] > annotations_df['ymin'])]

print(annotations_df.shape)

### Normalization and Preprocessing
We normalized the images to have pixel values between 0 and 1, as GANs typically require normalized inputs.

In [None]:
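# Function to normalize images to the [0, 1] range
def normalize_image(image):
    # float32 halves memory use compared with float64, which matters for ~20,000 images
    return image.astype('float32') / 255.0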

# Function to load and preprocess images using annotations
def load_and_preprocess_images(images_path, annotations_path):
    processed_images = []
    annotations = []

    for annotation_folder in os.listdir(annotations_path):
        folder_path = os.path.join(annotations_path, annotation_folder)
        for annotation_file in os.listdir(folder_path):
            annotation_path = os.path.join(folder_path, annotation_file)
            annotation_data = parse_annotation(annotation_path)
            
            image_file = annotation_data['filename'] + '.jpg'
            image_path = os.path.join(images_path, image_file)
            
            # Skip if the path contains %s
            if '%s' in image_path:
                continue
            
            try:
                # Force 3 channels so grayscale or RGBA files do not break the final array
                image = Image.open(image_path).convert('RGB')
            except FileNotFoundError:
                continue

            if image is not None:
                xmin, ymin, xmax, ymax = (annotation_data['xmin'], annotation_data['ymin'],
                                          annotation_data['xmax'], annotation_data['ymax'])
                w = np.min((xmax - xmin, ymax - ymin))
                bbox = (xmin, ymin, xmin + w, ymin + w)
                cropped_image = image.crop(bbox)
                resized_image = cropped_image.resize((128, 128))
                normalized_image = normalize_image(np.array(resized_image))
                processed_images.append(normalized_image)
                annotations.append(annotation_data)
    
    return np.array(processed_images), annotations

# Load and preprocess all images
processed_images, annotations = load_and_preprocess_images(images_path, annotations_path)

# Display the first image as a preview
if len(processed_images) > 0:
    # Images were loaded with PIL, so they are already RGB; no BGR conversion is needed
    plt.imshow(processed_images[0])
    plt.title("Preview of the first processed image")
    plt.show()

# Print the image count and array shape for confirmation
print(f'Total images: {len(processed_images)}, shape: {processed_images.shape}')
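
The crop step is easier to follow on a single image. The sketch below applies the same idea (square crop from the top-left of the box, resize, scale to [0, 1]) to a synthetic image. The image, the bounding box, and the helper name `crop_square_and_normalize` are all made up for illustration.

```python
import numpy as np
from PIL import Image

def crop_square_and_normalize(image, bbox, size=128):
    """Crop a square from the top-left of bbox, resize, and scale to [0, 1]."""
    xmin, ymin, xmax, ymax = bbox
    side = min(xmax - xmin, ymax - ymin)  # shorter side of the box
    square = image.crop((xmin, ymin, xmin + side, ymin + side))
    resized = square.convert('RGB').resize((size, size))
    return np.asarray(resized, dtype=np.float32) / 255.0

# Synthetic 250x188 image and a made-up bounding box, for illustration only
rng = np.random.default_rng(0)
pixels = rng.integers(0, 256, size=(188, 250, 3), dtype=np.uint8)
demo_image = Image.fromarray(pixels)
demo_bbox = (71, 1, 192, 180)  # width 121, height 179, so the square side is 121

result = crop_square_and_normalize(demo_image, demo_bbox)
print(result.shape, result.dtype, result.min(), result.max())
```

Because the square is anchored at the top-left corner, a tall box loses its lower portion. Centering the square inside the box would be one alternative worth trying.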

## Perform Analysis Using Deep Learning Models

In [None]:
import tensorflow as tf
from tensorflow.keras.layers import Dense, Reshape, Flatten, LeakyReLU, Conv2D, Conv2DTranspose, Dropout, BatchNormalization
from tensorflow.keras.models import Sequential
from tensorflow.keras.optimizers import Adam
import numpy as np

gpus = tf.config.list_physical_devices('GPU')
if gpus:
    print("GPUs available:")
    for gpu in gpus:
        print(f"- {gpu}")
else:
    print("No GPUs available. Training will be performed on CPU.")

### Basic GAN
We'll start with a basic GAN model and gradually move to more advanced architectures.
#### Model Architecture
The generator is a neural network that takes a random noise vector as input and transforms it into an image. The discriminator is a neural network that takes an image as input and outputs a single value representing whether the image is real or fake.

In [None]:
# Define GAN components
def build_generator_gan():
    model = Sequential()
    model.add(Dense(256, input_dim=100))
    model.add(LeakyReLU(alpha=0.2))
    model.add(Dense(512))
    model.add(LeakyReLU(alpha=0.2))
    model.add(Dense(1024))
    model.add(LeakyReLU(alpha=0.2))
    model.add(Dense(128 * 128 * 3, activation='tanh'))
    model.add(Reshape((128, 128, 3)))
    return model

def build_discriminator_gan():
    model = Sequential()
    model.add(Flatten(input_shape=(128, 128, 3)))
    model.add(Dense(512))
    model.add(LeakyReLU(alpha=0.2))
    model.add(Dense(256))
    model.add(LeakyReLU(alpha=0.2))
    model.add(Dense(1, activation='sigmoid'))
    return model
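
As a rough sanity check on the dense architecture, the sketch below counts its trainable parameters by hand. The layer widths are copied from the two builder functions above; `dense_params` is a helper introduced here for illustration.

```python
def dense_params(n_in, n_out):
    """Weights plus biases of one fully connected layer."""
    return n_in * n_out + n_out

# Layer widths from build_generator_gan: 100 -> 256 -> 512 -> 1024 -> 128*128*3
generator_widths = [100, 256, 512, 1024, 128 * 128 * 3]
generator_total = sum(
    dense_params(a, b) for a, b in zip(generator_widths, generator_widths[1:])
)

# Layer widths from build_discriminator_gan: 128*128*3 -> 512 -> 256 -> 1
discriminator_widths = [128 * 128 * 3, 512, 256, 1]
discriminator_total = sum(
    dense_params(a, b) for a, b in zip(discriminator_widths, discriminator_widths[1:])
)

print(f"generator parameters:     {generator_total:,}")
print(f"discriminator parameters: {discriminator_total:,}")
```

The generator comes to roughly 51 million parameters, nearly all of them in the final layer that maps 1,024 units to the 49,152 output pixels.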

#### Compile and Train the GAN
The GAN is compiled and trained by alternating between training the discriminator and the generator.

In [None]:
def compile_and_train_gan(generator, discriminator, epochs, batch_size, save_interval):
    discriminator.compile(loss='binary_crossentropy', optimizer=Adam(0.0002, 0.5), metrics=['accuracy'])
    discriminator.trainable = False

    gan = Sequential([generator, discriminator])
    gan.compile(loss='binary_crossentropy', optimizer=Adam(0.0002, 0.5))

    train_gan(gan, generator, discriminator, epochs, batch_size, save_interval)

### Deep Convolutional GAN (DCGAN)
#### Model Architecture
The DCGAN generator uses convolutional layers to create more detailed and high-quality images. The DCGAN discriminator uses convolutional layers to better distinguish between real and fake images.

In [None]:
# Generator model
def build_generator_dcgan():
    model = Sequential()
    model.add(Dense(128 * 16 * 16, activation="relu", input_dim=100))
    model.add(Reshape((16, 16, 128)))
    model.add(Conv2DTranspose(128, kernel_size=4, strides=2, padding="same"))
    model.add(BatchNormalization(momentum=0.8))
    model.add(LeakyReLU(alpha=0.2))
    model.add(Conv2DTranspose(64, kernel_size=4, strides=2, padding="same"))
    model.add(BatchNormalization(momentum=0.8))
    model.add(LeakyReLU(alpha=0.2))
    model.add(Conv2DTranspose(3, kernel_size=4, strides=2, padding="same", activation='tanh'))
    return model

# Discriminator model
def build_discriminator_dcgan():
    model = Sequential()
    model.add(Conv2D(64, kernel_size=4, strides=2, input_shape=(128, 128, 3), padding="same"))
    model.add(LeakyReLU(alpha=0.2))
    model.add(Dropout(0.3))
    model.add(Conv2D(128, kernel_size=4, strides=2, padding="same"))
    model.add(LeakyReLU(alpha=0.2))
    model.add(Dropout(0.3))
    model.add(Flatten())
    model.add(Dense(1, activation='sigmoid'))
    return model
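
The spatial sizes in the DCGAN can be checked with simple arithmetic. The sketch below assumes the Keras behavior for `padding="same"`, where a transposed convolution multiplies the size by the stride and a regular convolution divides it (rounding up). The helper names are introduced here for illustration.

```python
import math

def same_padding_conv_size(size, stride):
    """Output size of a strided convolution with padding='same'."""
    return math.ceil(size / stride)

def same_padding_transpose_size(size, stride):
    """Output size of a transposed convolution with padding='same'."""
    return size * stride

# Generator: three stride-2 transposed convolutions after the 16x16 reshape
gen_sizes = [16]
for _ in range(3):
    gen_sizes.append(same_padding_transpose_size(gen_sizes[-1], 2))

# Discriminator: two stride-2 convolutions on a 128x128 input
disc_sizes = [128]
for _ in range(2):
    disc_sizes.append(same_padding_conv_size(disc_sizes[-1], 2))

# The last discriminator conv layer has 128 filters
flattened = disc_sizes[-1] ** 2 * 128

print("generator sizes:    ", gen_sizes)
print("discriminator sizes:", disc_sizes)
print("flattened features: ", flattened)
```

The generator reaches 128x128 only because it starts at 16x16 and doubles three times, so changing the reshape size or the number of upsampling layers changes the output resolution.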

#### Compile and Train the DCGAN
The DCGAN is compiled and trained similarly to the basic GAN, but using the DCGAN generator and discriminator.

In [None]:
def compile_and_train_dcgan(generator, discriminator, epochs, batch_size, save_interval):
    discriminator.compile(loss='binary_crossentropy', optimizer=Adam(0.0002, 0.5), metrics=['accuracy'])
    discriminator.trainable = False

    gan = Sequential([generator, discriminator])
    gan.compile(loss='binary_crossentropy', optimizer=Adam(0.0002, 0.5))

    train_gan(gan, generator, discriminator, epochs, batch_size, save_interval)

#### Training Function

In [None]:
def train_gan(gan, generator, discriminator, epochs, batch_size, save_interval):
    valid = np.ones((batch_size, 1))
    fake = np.zeros((batch_size, 1))

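    # Note: each "epoch" here is a single training step on one random batch
    for epoch in range(epochs):
        idx = np.random.randint(0, processed_images.shape[0], batch_size)
        # Rescale real images from [0, 1] to [-1, 1] to match the generator's tanh output
        real_imgs = processed_images[idx] * 2.0 - 1.0

        noise = np.random.normal(0, 1, (batch_size, 100))
        gen_imgs = generator.predict(noise, verbose=0)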

        d_loss_real = discriminator.train_on_batch(real_imgs, valid)
        d_loss_fake = discriminator.train_on_batch(gen_imgs, fake)
        d_loss = 0.5 * np.add(d_loss_real, d_loss_fake)

        noise = np.random.normal(0, 1, (batch_size, 100))
        g_loss = gan.train_on_batch(noise, valid)

        if epoch % save_interval == 0:
            print(f"{epoch} [D loss: {d_loss[0]} | D accuracy: {100 * d_loss[1]}] [G loss: {g_loss}]")
            save_images(generator, epoch)

def save_images(generator, epoch, folder='images/'):
    os.makedirs(folder, exist_ok=True)  # savefig fails if the folder does not exist
    noise = np.random.normal(0, 1, (25, 100))
    gen_imgs = generator.predict(noise, verbose=0)
    gen_imgs = 0.5 * gen_imgs + 0.5  # Rescale from [-1, 1] to [0, 1]

    fig, axs = plt.subplots(5, 5, figsize=(10, 10))
    count = 0
    for i in range(5):
        for j in range(5):
            axs[i, j].imshow(gen_imgs[count])
            axs[i, j].axis('off')
            count += 1
    # Save before showing so the file is written regardless of how the backend handles display
    fig.savefig(os.path.join(folder, f"gan_generated_image_epoch_{epoch}.png"))
    plt.show()
    plt.close(fig)
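
The labels passed to `train_on_batch` decide what each network is pushed toward. The NumPy sketch below recomputes the binary cross-entropy losses by hand for a small batch. The discriminator outputs are hypothetical numbers, and `binary_crossentropy` is a local helper written for this example, not the Keras function.

```python
import numpy as np

def binary_crossentropy(y_true, y_pred, eps=1e-7):
    """Mean binary cross-entropy; predictions are clipped to avoid log(0)."""
    y_pred = np.clip(y_pred, eps, 1 - eps)
    losses = -(y_true * np.log(y_pred) + (1 - y_true) * np.log(1 - y_pred))
    return float(np.mean(losses))

# Hypothetical discriminator outputs for one small batch
d_on_real = np.array([0.9, 0.8, 0.7])  # ideally close to 1
d_on_fake = np.array([0.2, 0.1, 0.4])  # ideally close to 0

# Discriminator step: real images labeled 1, generated images labeled 0
d_loss_real = binary_crossentropy(np.ones(3), d_on_real)
d_loss_fake = binary_crossentropy(np.zeros(3), d_on_fake)
d_loss = 0.5 * (d_loss_real + d_loss_fake)

# Generator step: the same generated images, but labeled 1 ("valid")
g_loss = binary_crossentropy(np.ones(3), d_on_fake)

print(f"d_loss={d_loss:.4f}, g_loss={g_loss:.4f}")
```

When the discriminator confidently rejects the generated images, its own loss is small while the generator's loss is large, which is the tension the alternating updates rely on.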

### Train and Compare the Models
Train both the basic GAN and DCGAN models, and compare their performance by evaluating the generated images.

In [None]:
# Train the basic GAN
generator_gan = build_generator_gan()
discriminator_gan = build_discriminator_gan()
compile_and_train_gan(generator_gan, discriminator_gan, epochs=20000, batch_size=64, save_interval=1000)

In [None]:
# Train the DCGAN
generator_dcgan = build_generator_dcgan()
discriminator_dcgan = build_discriminator_dcgan()
compile_and_train_dcgan(generator_dcgan, discriminator_dcgan, epochs=20000, batch_size=64, save_interval=1000)

### Hyperparameter Tuning

In [None]:
import random

# Define hyperparameter space
param_grid = {
    'learning_rate': [0.0002, 0.0001, 0.00005],
    'beta1': [0.5, 0.4, 0.6],
    'batch_size': [64, 128, 256]
}

# Random search to select hyperparameters
def random_search(param_grid, n_iter=10):
    best_params = None
    best_performance = float('inf')

    for _ in range(n_iter):
        params = {key: random.choice(values) for key, values in param_grid.items()}
        performance = train_gan_with_params(params)
        if performance < best_performance:
            best_performance = performance
            best_params = params

    return best_params

# Train GAN with selected hyperparameters
def train_gan_with_params(params):
    learning_rate = params['learning_rate']
    beta1 = params['beta1']
    batch_size = params['batch_size']
    
    generator = build_generator_gan()
    discriminator = build_discriminator_gan()
    
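    # Use a separate optimizer instance per model: recent Keras versions can raise an
    # error when one optimizer is reused across models with different trainable variables
    discriminator.compile(loss='binary_crossentropy', optimizer=Adam(learning_rate, beta_1=beta1), metrics=['accuracy'])
    discriminator.trainable = False
    
    gan = Sequential([generator, discriminator])
    gan.compile(loss='binary_crossentropy', optimizer=Adam(learning_rate, beta_1=beta1))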
    
    train_gan(gan, generator, discriminator, epochs=5000, batch_size=batch_size, save_interval=1000)
    
    # Placeholder metric: a random number, so the search does not yet rank
    # configurations meaningfully. Replace this with an actual evaluation metric.
    performance = random.random()
    return performance

# Perform random search
best_params = random_search(param_grid, n_iter=10)
print(f'Best hyperparameters: {best_params}')
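
The grid above has 3 × 3 × 3 = 27 combinations, and `random.choice` samples each parameter independently, so ten draws can repeat a configuration. One possible alternative, sketched below, enumerates the grid and samples distinct configurations. The function name `sample_unique_configs` and the fixed seed are assumptions made for this example.

```python
import itertools
import random

# Same search space as param_grid above
search_space = {
    'learning_rate': [0.0002, 0.0001, 0.00005],
    'beta1': [0.5, 0.4, 0.6],
    'batch_size': [64, 128, 256],
}

def sample_unique_configs(space, n_iter, seed=0):
    """Draw up to n_iter distinct configurations from the full grid."""
    keys = list(space)
    all_configs = [
        dict(zip(keys, values)) for values in itertools.product(*space.values())
    ]
    rng = random.Random(seed)  # seeded so the draw is reproducible
    return rng.sample(all_configs, min(n_iter, len(all_configs)))

configs = sample_unique_configs(search_space, n_iter=10)
print(f"{len(configs)} distinct configurations drawn from the grid")
for config in configs[:3]:
    print(config)
```

Each sampled dictionary has the same keys that `train_gan_with_params` reads, so it could be passed to that function in place of the independently sampled `params`.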

## Analysis Using Deep Learning Models: GAN vs DCGAN

In this analysis, we compare two Generative Adversarial Network (GAN) models: the basic GAN and the Deep Convolutional GAN (DCGAN). The objective is to evaluate which model performs better in generating realistic images of dogs. We will discuss the differences between the models, their architectures, and the outcomes of their training.

### Model Architectures
#### Basic GAN
- Generator: The basic GAN generator uses fully connected (dense) layers to transform a noise vector into a 128x128x3 image.
- Discriminator: The discriminator flattens the input image and passes it through several dense layers to classify it as real or fake.
#### DCGAN
- Generator: The DCGAN generator uses convolutional transpose layers to upsample the noise vector into a 128x128x3 image. This helps in capturing spatial hierarchies and generating more detailed images.
- Discriminator: The DCGAN discriminator uses convolutional layers to downsample the input image, enabling it to better distinguish between real and fake images by capturing local patterns.

### Results
#### Basic GAN
The images generated by the basic GAN were quite blurry and lacked distinct features. The model struggled to create coherent shapes or structures.
#### DCGAN
The DCGAN generated images with more structure and detail compared to the basic GAN. Visible patterns and textures indicated that the DCGAN was better at capturing and replicating the features of the original dataset.

### Discussion
The DCGAN outperformed the basic GAN in generating more realistic images. This can be attributed to the following reasons:
- Convolutional Layers: DCGAN uses convolutional layers, which are better at capturing spatial hierarchies in images. This helps in generating more detailed and structured outputs.
- Batch Normalization: The use of batch normalization in the DCGAN helps in stabilizing the training process and improving the quality of generated images.
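- Staged Upsampling: The DCGAN builds the image through successive upsampling stages (16x16 up to 128x128), which lets it model features at several scales instead of predicting every pixel from one dense layer, resulting in higher-quality images.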

### Conclusion
The DCGAN produced more realistic images than the basic GAN, judged by visual inspection of the generated samples. The convolutional layers and batch normalization are the most likely reasons for the improvement. Training both models for more steps and experimenting with hyperparameters could lead to even better results. This analysis shows the value of architectures designed for image data when the goal is high-quality generated output.