<center>
    <img src="images/personal_logo.png"/>
</center>

# Deep Unsupervised Learning - Final Project
### Juan Carlos Garzon Pico
### Viviane Alves

---

<br>

<div align="center">
  
[![GitHub](https://badges.aleen42.com/src/github.svg)](https://github.com/Juank0621)
[![LinkedIn](https://img.shields.io/badge/LinkedIn-Profile-blue?logo=linkedin)](https://www.linkedin.com/in/juancarlosgarzon)
![Python](https://badges.aleen42.com/src/python.svg)

</div>

### CIFAR10 AI System

We are developing an AI system based on deep learning techniques such as Convolutional Autoencoders (CAE), Variational Autoencoders (VAE), and Generative Adversarial Networks (GANs). The system is built around the CIFAR10 dataset; the face-related experiments later in this notebook (super-resolution with a VAE-GAN) use the CelebA dataset instead. With these models we target feature extraction, attribute classification, noise reduction, and image generation, including synthetic face generation.

In [None]:
import os
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.ticker import MaxNLocator
import seaborn as sns
from PIL import Image
from tqdm.auto import tqdm   
#from tqdm.rich import tqdm  # Import tqdm.rich for progress bars

import torch
import torch.nn as nn
from torch.utils.data import DataLoader
from torchvision import transforms
from torchvision.datasets import CIFAR10
import torchvision.transforms.functional as TF
import torch.nn.functional as F
from torchvision.models import resnet18  # Import ResNet18
from torchsummary import summary

from sklearn.metrics import confusion_matrix, ConfusionMatrixDisplay

torch.set_float32_matmul_precision('medium')

## PyTorch and GPU Information

The cells below print the PyTorch version, the CUDA and cuDNN versions that PyTorch was built against, and the number of GPUs visible to PyTorch.

- The first cell prints the PyTorch version in use.
- The second cell retrieves and prints the CUDA and cuDNN versions used by PyTorch.
- The third cell prints the number of GPUs available to PyTorch.

Together these checks confirm that the environment is correctly set up for GPU acceleration.

In [2]:
print("PyTorch Version:", torch.__version__)

PyTorch Version: 2.6.0+cu124


In [3]:
# Get the CUDA version used by PyTorch
cuda_version = torch.version.cuda
print("CUDA Version:", cuda_version)

# Get the cuDNN version used by PyTorch
cudnn_version = torch.backends.cudnn.version()
print("cuDNN Version:", cudnn_version)

CUDA Version: 12.4
cuDNN Version: 90100


In [4]:
# Get the number of GPUs available
num_gpus = torch.cuda.device_count()
print("Num GPUs Available:", num_gpus)

Num GPUs Available: 1


Let's make sure that we have access to a GPU. We can use the `nvidia-smi` command to check.

In [5]:
!nvidia-smi

Fri Apr  4 11:07:29 2025       
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.183.01             Driver Version: 535.183.01   CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|   0  NVIDIA GeForce RTX 4080        Off | 00000000:01:00.0  On |                  N/A |
| 30%   32C    P2              35W / 320W |    840MiB / 16376MiB |     10%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+
                                                                    

## Dataset Directory

Here we set the `DataLoader` options and select the device (GPU if available, otherwise CPU) used in the rest of the notebook.

In [7]:
kwargs = {'num_workers': 8, 'pin_memory': True} # DataLoader optimization for better performance on CUDA.

In [None]:
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

## Generative Adversarial Network (GAN)

## CelebA Dataset for Image Super-Resolution

We will use the CelebA dataset for training a Variational Autoencoder (VAE) combined with a Generative Adversarial Network (GAN) to perform image super-resolution.

### Workflow

1. **Dataset Preparation**: Load the CelebA dataset and preprocess it for low-resolution and high-resolution image pairs.
2. **Model Architecture**: Combine VAE and GAN to create a VAE-GAN model for super-resolution.
3. **Training**: Train the model to generate high-resolution images from low-resolution inputs.
4. **Evaluation**: Use metrics like PSNR and SSIM to evaluate the quality of the generated images.

In [None]:
# Import CelebA dataset
from torchvision.datasets import CelebA

# Define transformations for CelebA dataset
transform = transforms.Compose([
    transforms.CenterCrop(178),  # Crop to square
    transforms.Resize((64, 64)),  # High-resolution target
    transforms.ToTensor(),
    transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))
])

# Load CelebA dataset
celeba_dataset = CelebA(root='./data', split='train', transform=transform, download=True)

# Create low-resolution version for input
low_res_transform = transforms.Compose([
    transforms.CenterCrop(178),  # Same square crop as the high-resolution target
    transforms.Resize((16, 16)),  # Low-resolution input
    transforms.Resize((64, 64)),  # Upscale back to match high-resolution size
    transforms.ToTensor(),
    transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))
])

celeba_low_res_dataset = CelebA(root='./data', split='train', transform=low_res_transform, download=False)
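
Because `celeba_dataset` and `celeba_low_res_dataset` are separate objects, every image is decoded twice, and a pair only lines up when both datasets are indexed identically. One possible alternative, sketched here under the assumption that high-resolution batches are already tensors of shape `(N, 3, 64, 64)`, derives the low-resolution input on the fly with `torch.nn.functional.interpolate`. The helper name `make_low_res` and its defaults are illustrative and not part of the original notebook.

```python
import torch
import torch.nn.functional as F


def make_low_res(high_res: torch.Tensor, low_size: int = 16) -> torch.Tensor:
    """Downsample a batch to low_size x low_size, then upsample back to the input size.

    Hypothetical helper: mirrors the 64 -> 16 -> 64 resize of `low_res_transform`,
    but works on tensors so each pair always comes from the same crop.
    """
    height, width = high_res.shape[-2:]
    # Antialiased bilinear downsampling discards high-frequency detail.
    small = F.interpolate(high_res, size=(low_size, low_size),
                          mode="bilinear", align_corners=False, antialias=True)
    # Upsample back so the model input matches the target resolution.
    return F.interpolate(small, size=(height, width),
                         mode="bilinear", align_corners=False)


# Random stand-in for a batch of normalized CelebA crops in [-1, 1].
batch = torch.rand(4, 3, 64, 64) * 2 - 1
low_res_batch = make_low_res(batch)
print(low_res_batch.shape)
```

With a helper like this, a single `DataLoader` over `celeba_dataset` would be enough to produce aligned (low-resolution, high-resolution) pairs per batch.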

## VAE-GAN Architecture

The VAE-GAN combines the Variational Autoencoder (VAE) with a Generative Adversarial Network (GAN) to improve the quality of generated images. The VAE acts as the generator, and the GAN discriminator ensures the generated images are realistic.
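
The training code in this section expects `vae(x)` to return three values, which in a standard VAE are the reconstruction, the latent mean, and the latent log-variance. The VAE itself is not defined in this part of the notebook, so the following is only a minimal sketch of a generator with that interface. The class name `SketchVAE`, the layer widths, and `latent_dim=128` are assumptions made for illustration.

```python
import torch
import torch.nn as nn


class SketchVAE(nn.Module):
    """Hypothetical convolutional VAE for 3x64x64 inputs (not the notebook's model)."""

    def __init__(self, latent_dim: int = 128):
        super().__init__()
        # Encoder: spatial resolution 64 -> 32 -> 16 -> 8.
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=4, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=4, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(64, 128, kernel_size=4, stride=2, padding=1), nn.ReLU(),
            nn.Flatten(),
        )
        self.fc_mu = nn.Linear(128 * 8 * 8, latent_dim)
        self.fc_logvar = nn.Linear(128 * 8 * 8, latent_dim)
        self.fc_decode = nn.Linear(latent_dim, 128 * 8 * 8)
        # Decoder: 8 -> 16 -> 32 -> 64; Tanh matches inputs normalized to [-1, 1].
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(128, 64, kernel_size=4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(64, 32, kernel_size=4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(32, 3, kernel_size=4, stride=2, padding=1), nn.Tanh(),
        )

    def reparameterize(self, mu, logvar):
        # z = mu + sigma * eps keeps sampling differentiable w.r.t. mu and logvar.
        std = torch.exp(0.5 * logvar)
        return mu + std * torch.randn_like(std)

    def forward(self, x):
        h = self.encoder(x)
        mu, logvar = self.fc_mu(h), self.fc_logvar(h)
        z = self.reparameterize(mu, logvar)
        h = self.fc_decode(z).view(-1, 128, 8, 8)
        return self.decoder(h), mu, logvar


vae_sketch = SketchVAE()
recon, mu, logvar = vae_sketch(torch.randn(2, 3, 64, 64))
print(recon.shape, mu.shape, logvar.shape)
```

In the super-resolution setting, the input to such a model would be the upscaled low-resolution image and the reconstruction would be compared against the high-resolution target.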

In [None]:
# Define VAE-GAN Discriminator
class Discriminator(nn.Module):
    def __init__(self):
        super(Discriminator, self).__init__()
        self.model = nn.Sequential(
            nn.Conv2d(3, 64, kernel_size=4, stride=2, padding=1),
            nn.LeakyReLU(0.2, inplace=True),
            nn.Conv2d(64, 128, kernel_size=4, stride=2, padding=1),
            nn.BatchNorm2d(128),
            nn.LeakyReLU(0.2, inplace=True),
            nn.Conv2d(128, 256, kernel_size=4, stride=2, padding=1),
            nn.BatchNorm2d(256),
            nn.LeakyReLU(0.2, inplace=True),
            nn.Conv2d(256, 512, kernel_size=4, stride=2, padding=1),
            nn.BatchNorm2d(512),
            nn.LeakyReLU(0.2, inplace=True),
            nn.Flatten(),
            nn.Linear(512 * 4 * 4, 1),
            nn.Sigmoid()
        )

    def forward(self, x):
        return self.model(x)
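
The `nn.Linear(512 * 4 * 4, 1)` layer only fits if the input images are 64x64. As a quick sanity check, the standard convolution output-size formula `floor((n + 2p - k) / s) + 1` can be applied to the four stride-2 convolutions. The helper `conv2d_out_size` is a hypothetical name used for this sketch.

```python
def conv2d_out_size(size: int, kernel_size: int = 4, stride: int = 2, padding: int = 1) -> int:
    """Spatial output size of a Conv2d layer (dilation 1)."""
    return (size + 2 * padding - kernel_size) // stride + 1


size = 64  # input resolution assumed by the discriminator
sizes = [size]
for _ in range(4):  # four stride-2 convolutions
    size = conv2d_out_size(size)
    sizes.append(size)

print(sizes)            # [64, 32, 16, 8, 4]
print(512 * size ** 2)  # 8192 features entering nn.Linear
```

Feeding a different resolution (for example 32x32 CIFAR10 images) would change the final feature-map size, so the `nn.Linear` input dimension would have to change with it.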

In [None]:
# Define VAE-GAN Training Loop
def train_vae_gan(vae, discriminator, dataloader, optimizer_vae, optimizer_disc, num_epochs=50):
    # NOTE: `dataloader` is currently unused; the loop pairs the two CelebA datasets
    # sample by sample (batch size 1) and relies on their identical ordering.
    # `vae_loss_function` is assumed to be defined elsewhere in the notebook.
    criterion = nn.BCELoss()
    for epoch in range(num_epochs):
        for high_res, low_res in zip(celeba_dataset, celeba_low_res_dataset):
            # Each dataset item is (image, attributes); keep the image and add a batch dimension
            high_res = high_res[0].unsqueeze(0).to(device)
            low_res = low_res[0].unsqueeze(0).to(device)

            # Train Discriminator
            optimizer_disc.zero_grad()
            real_labels = torch.ones((high_res.size(0), 1), device=device)
            fake_labels = torch.zeros((high_res.size(0), 1), device=device)

            real_output = discriminator(high_res)
            fake_images, mu, logvar = vae(low_res)  # keep mu and logvar for the VAE loss
            fake_output = discriminator(fake_images.detach())  # no gradient into the VAE here

            real_loss = criterion(real_output, real_labels)
            fake_loss = criterion(fake_output, fake_labels)
            disc_loss = real_loss + fake_loss
            disc_loss.backward()
            optimizer_disc.step()

            # Train VAE (Generator)
            optimizer_vae.zero_grad()
            fake_output = discriminator(fake_images)
            gen_loss = criterion(fake_output, real_labels)  # generator wants fakes classified as real
            # Reconstruct the high-resolution target rather than the blurry input
            vae_loss = vae_loss_function(high_res, fake_images, mu, logvar) + gen_loss
            vae_loss.backward()
            optimizer_vae.step()

        # Losses reported are those of the last sample in the epoch
        print(f"Epoch [{epoch + 1}/{num_epochs}], Disc Loss: {disc_loss.item():.4f}, VAE Loss: {vae_loss.item():.4f}")
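
The training loop calls `vae_loss_function`, which is not defined in this section. A common choice, sketched below as an assumption rather than the notebook's actual definition, is a reconstruction term plus the closed-form KL divergence between the approximate posterior and a standard normal prior. The name `vae_loss_sketch`, the use of MSE, and the `kl_weight` parameter are illustrative; the argument order (reference image, reconstruction, mean, log-variance) follows the call in the loop.

```python
import torch
import torch.nn.functional as F


def vae_loss_sketch(target, recon, mu, logvar, kl_weight: float = 1.0):
    """Hypothetical VAE loss: summed MSE reconstruction term plus KL divergence."""
    recon_loss = F.mse_loss(recon, target, reduction="sum")
    # KL(q(z|x) || N(0, I)) in closed form for a diagonal Gaussian posterior.
    kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
    # Divide by the batch size so the scale does not depend on it.
    return (recon_loss + kl_weight * kl) / target.size(0)


# Perfect reconstruction with a posterior equal to the prior gives zero loss.
images = torch.zeros(2, 3, 4, 4)
zero_loss = vae_loss_sketch(images, images, torch.zeros(2, 8), torch.zeros(2, 8))
print(zero_loss.item())
```

When this term is added to the adversarial loss, the relative weighting matters: a summed reconstruction term over a 64x64 image is typically much larger than a single binary cross-entropy value, so a scaling factor on one of the two may be needed.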

## Evaluation Metrics: PSNR and SSIM

We use Peak Signal-to-Noise Ratio (PSNR) and Structural Similarity Index Measure (SSIM) to evaluate the quality of the generated high-resolution images.
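
PSNR is defined as `10 * log10(data_range**2 / MSE)`, so the result depends on the assumed pixel range. Since the transforms in this notebook normalize images to [-1, 1], either the images should be rescaled to [0, 1] before using `data_range=1.0`, or `data_range=2.0` should be passed. The NumPy function below is a small illustrative re-implementation (the name `psnr_numpy` is hypothetical) and is not a replacement for the scikit-image version.

```python
import numpy as np


def psnr_numpy(reference: np.ndarray, test: np.ndarray, data_range: float = 1.0) -> float:
    """PSNR in dB: 10 * log10(data_range**2 / MSE)."""
    mse = np.mean((reference.astype(np.float64) - test.astype(np.float64)) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return float(10.0 * np.log10(data_range ** 2 / mse))


reference = np.zeros((64, 64, 3))
noisy = np.full((64, 64, 3), 0.1)  # constant error of 0.1, so MSE = 0.01

print(psnr_numpy(reference, noisy))                  # about 20 dB
print(psnr_numpy(reference, noisy, data_range=2.0))  # about 26 dB: same error, wider range
```

The second call shows how the same pixel error yields a PSNR roughly 6 dB higher when the range is declared as 2.0, which is why the declared range has to match the data.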

In [None]:
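from skimage.metrics import peak_signal_noise_ratio as psnr
from skimage.metrics import structural_similarity as ssim

def to_unit_range(image):
    # Undo Normalize((0.5, ...), (0.5, ...)): map [-1, 1] back to [0, 1] and return an (H, W, C) array.
    # Assumes the VAE output uses the same [-1, 1] normalization as the targets.
    image = (image.detach().squeeze(0).cpu() * 0.5 + 0.5).clamp(0, 1)
    return image.numpy().transpose(1, 2, 0)

def evaluate_metrics(high_res, generated):
    high_res_np = to_unit_range(high_res)
    generated_np = to_unit_range(generated)

    psnr_value = psnr(high_res_np, generated_np, data_range=1.0)
    # `channel_axis` replaces the deprecated `multichannel` argument (scikit-image >= 0.19)
    ssim_value = ssim(high_res_np, generated_np, channel_axis=-1, data_range=1.0)

    return psnr_value, ssim_value

# Example usage (assumes a trained `vae` is available)
high_res = celeba_dataset[0][0].unsqueeze(0).to(device)
low_res = celeba_low_res_dataset[0][0].unsqueeze(0).to(device)
vae.eval()
with torch.no_grad():  # inference only, no gradients needed
    generated, _, _ = vae(low_res)
psnr_value, ssim_value = evaluate_metrics(high_res, generated)
print(f"PSNR: {psnr_value:.2f}, SSIM: {ssim_value:.4f}")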

## Citation

```bibtex
@inproceedings{liu2015faceattributes,
  title = {Deep Learning Face Attributes in the Wild},
  author = {Liu, Ziwei and Luo, Ping and Wang, Xiaogang and Tang, Xiaoou},
  booktitle = {Proceedings of International Conference on Computer Vision (ICCV)},
  month = {December},
  year = {2015}
}
```