# Face Morph VAE Demo on UTKFace Dataset

This notebook demonstrates training and evaluating a Variational Autoencoder (VAE) for face morphing on the UTKFace dataset.  
We build a convolutional VAE that encodes 64x64 RGB face images into a 100-dimensional latent space and reconstructs them.  
We visualize reconstructions, generate new samples, and perform latent space interpolation between face images.

---

## Setup & Imports


In [None]:
import os
import time
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.utils.data import Dataset, DataLoader, random_split
from torchvision import transforms
import torchvision
import matplotlib.pyplot as plt
import PIL.Image

from model import VAE_UTKFace, ResidualBlock
from utils import (
    train_model,
    show_reconstructions,
    sample_from_latent,
    show_interpolation_candidates,
    interpolate_faces,
)

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")


## Dataset Preparation

We load the UTKFace dataset images from a directory, resize them to 64x64, and convert to tensors.  
A custom Dataset class handles loading images without labels since the VAE is unsupervised.  
We split the dataset into 90% training and 10% validation.

Dataset source: https://www.kaggle.com/datasets/jangedoo/utkface-new

In [None]:
seed = 68
torch.manual_seed(seed)

BATCH_SIZE = 64
encoding_size = 100
input_channel = 3
learning_rate = 1e-3
num_epochs = 50
image_path = "/kaggle/input/utkface-new/UTKFace"

transform = transforms.Compose([
    transforms.Resize((64, 64)),
    transforms.ToTensor(),  # scales pixel values to [0, 1], the target range BCE expects
])

class UTKFaceDataset(Dataset):
    def __init__(self, root_dir, transform=None):
        self.root_dir = root_dir
        self.transform = transform
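        # Sort the listing: os.listdir order is arbitrary, and a stable order
        # keeps the seeded train/validation split reproducible.
        self.images = sorted(
            f for f in os.listdir(root_dir) if f.lower().endswith(('.jpg', '.png'))
        )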

    def __len__(self):
        return len(self.images)

    def __getitem__(self, idx):
        img_path = os.path.join(self.root_dir, self.images[idx])
        image = PIL.Image.open(img_path).convert('RGB')
        if self.transform:
            image = self.transform(image)
        return image

dataset = UTKFaceDataset(image_path, transform=transform)

val_size = int(0.1 * len(dataset))
train_size = len(dataset) - val_size
train_dataset, val_dataset = random_split(dataset, [train_size, val_size])

train_dl = DataLoader(train_dataset, batch_size=BATCH_SIZE, shuffle=True, drop_last=True)
val_dl = DataLoader(val_dataset, batch_size=BATCH_SIZE, shuffle=False, drop_last=True)


## Model, Loss, Optimizer & Scheduler

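We instantiate the VAE and define `BCEWithLogitsLoss` as the reconstruction loss, which means the decoder is expected to output raw logits and the targets are pixel values in [0, 1]. We optimize with Adam and create a `ReduceLROnPlateau` scheduler that halves the learning rate once the monitored loss has failed to improve for more than 5 epochs.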
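
For reference, a VAE objective combines the reconstruction term with a KL divergence between the approximate posterior and a standard normal prior. The sketch below shows one common formulation. How `train_model` actually combines the terms is not shown in this notebook, so the function names `reparameterize` and `vae_loss`, the sum-over-pixels reduction, and the `beta` argument are assumptions made for illustration.

```python
import torch
import torch.nn.functional as F


def reparameterize(mu, logvar):
    """Draw z = mu + sigma * eps with eps ~ N(0, I), so gradients reach mu and logvar."""
    std = torch.exp(0.5 * logvar)
    eps = torch.randn_like(std)
    return mu + std * eps


def vae_loss(logits, target, mu, logvar, beta=1.0):
    """Batch-averaged loss: BCE-with-logits reconstruction plus beta-weighted KL."""
    batch = target.size(0)
    # Reconstruction: the decoder emits raw logits, targets are pixels in [0, 1].
    recon = F.binary_cross_entropy_with_logits(logits, target, reduction="sum") / batch
    # Closed-form KL(N(mu, sigma^2) || N(0, I)), summed over latent dimensions.
    kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp()) / batch
    return recon + beta * kl, recon, kl


# Toy tensors with the notebook's shapes: 64x64 RGB images, 100-dimensional latent.
torch.manual_seed(0)
target = torch.rand(4, 3, 64, 64)
logits = torch.randn(4, 3, 64, 64)
mu, logvar = torch.zeros(4, 100), torch.zeros(4, 100)

z = reparameterize(mu, logvar)
total, recon, kl = vae_loss(logits, target, mu, logvar, beta=0.5)
```

With `mu = 0` and `logvar = 0` the posterior equals the prior, so the KL term vanishes and the total loss reduces to the reconstruction term.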


In [None]:
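model = VAE_UTKFace(input_channel, encoding_size).to(device)
# Reconstruction loss on raw decoder logits (sigmoid is applied inside the loss).
loss_fn = nn.BCEWithLogitsLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=learning_rate)
# Halves the learning rate on a plateau. This only takes effect if
# scheduler.step(<validation loss>) is called once per epoch.
scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(optimizer, mode='min', patience=5, factor=0.5)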


## Training Loop

We train the model for `num_epochs` (50) epochs.  
During training we apply gradient clipping (with `max_norm=50.0`) and cyclical β annealing of the KL term.  
After training, we plot the training and validation loss curves.
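
Cyclical β annealing repeatedly ramps the KL weight from 0 up to its maximum and then holds it there until the next cycle starts. The schedule used inside `train_model` is not shown in this notebook, so the sketch below is only one plausible form: the linear ramp and the names `cyclical_beta`, `cycle_len`, `ramp_fraction`, and `beta_max` are assumptions.

```python
def cyclical_beta(step, cycle_len, ramp_fraction=0.5, beta_max=1.0):
    """KL weight at a given step under a cyclical linear schedule.

    Within each cycle, beta rises linearly from 0 to beta_max over the first
    `ramp_fraction` of the cycle (0 < ramp_fraction <= 1), then stays at beta_max.
    """
    position = (step % cycle_len) / cycle_len  # progress through the cycle, in [0, 1)
    return beta_max * min(1.0, position / ramp_fraction)


# Two cycles of 10 steps each: beta restarts from 0 at step 10.
schedule = [cyclical_beta(s, cycle_len=10) for s in range(20)]
```

Restarting from a small β at the beginning of each cycle gives the decoder periods in which reconstruction dominates the objective, which is the usual motivation for this kind of schedule.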


In [None]:
train_loss, val_loss = train_model(model, num_epochs, train_dl, val_dl, loss_fn, optimizer, clip_norm=True, max_norm=50.0)

plt.plot(train_loss, label='Train Loss')
plt.plot(val_loss, label='Validation Loss')
plt.legend()
plt.grid(True)
plt.show()
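
The `clip_norm=True, max_norm=50.0` arguments presumably correspond to norm-based gradient clipping. Since `train_model` lives in `utils` and is not shown here, that mapping is an assumption. The sketch below shows the standard PyTorch pattern on a small stand-in model (not `VAE_UTKFace`).

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
net = nn.Linear(8, 1)  # stand-in model for illustration only
opt = torch.optim.Adam(net.parameters(), lr=1e-3)

# Large inputs to provoke large gradients.
x, y = torch.randn(16, 8) * 100, torch.randn(16, 1)
loss = nn.functional.mse_loss(net(x), y)

opt.zero_grad()
loss.backward()
# clip_grad_norm_ rescales all gradients in place so their global L2 norm is at
# most max_norm, and returns the norm measured before clipping.
norm_before = torch.nn.utils.clip_grad_norm_(net.parameters(), max_norm=50.0)
norm_after = torch.sqrt(sum(p.grad.pow(2).sum() for p in net.parameters()))
opt.step()
```

Clipping must happen after `backward()` and before `optimizer.step()`; otherwise the update uses the unclipped gradients.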


## Visualizing Reconstructions and Sampling

We display reconstructions on the validation dataset and generate new samples from the latent space.
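
Generating new samples from a VAE amounts to drawing latent vectors from the standard normal prior and decoding them. The internals of `sample_from_latent` are not shown in this notebook, so the sketch below uses a hypothetical stand-in decoder rather than the real one inside `VAE_UTKFace`.

```python
import torch
import torch.nn as nn

latent_dim = 100  # matches encoding_size in this notebook

# Hypothetical stand-in decoder that maps a latent vector to image logits.
decoder = nn.Sequential(
    nn.Linear(latent_dim, 3 * 64 * 64),
    nn.Unflatten(1, (3, 64, 64)),
)

decoder.eval()
with torch.no_grad():
    z = torch.randn(16, latent_dim)     # z ~ N(0, I), the VAE prior
    images = torch.sigmoid(decoder(z))  # logits -> pixel intensities in [0, 1]
```

The sigmoid is needed because a model trained with `BCEWithLogitsLoss` outputs logits rather than pixel values.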


In [None]:
show_reconstructions(model, val_dl)
sample_from_latent(model)


## Latent Space Interpolation

We take the first two faces of the first validation batch (the validation split is random, but `val_dl` is not shuffled), display them, and then interpolate between their latent vectors to morph one face into the other.
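
The core of latent interpolation is a convex combination of two latent vectors, each of which is then decoded into an image. The implementation of `interpolate_faces` is not shown in this notebook, so the helper `interpolate_latents` below is a hypothetical sketch of the linear case only.

```python
import torch


def interpolate_latents(z1, z2, steps=10):
    """Return `steps` latent vectors evenly spaced on the line from z1 to z2, inclusive."""
    weights = torch.linspace(0.0, 1.0, steps, device=z1.device).unsqueeze(1)  # (steps, 1)
    # Broadcasts to (steps, latent_dim): weight 0 gives z1, weight 1 gives z2.
    return (1.0 - weights) * z1.unsqueeze(0) + weights * z2.unsqueeze(0)


# Toy latent codes with the notebook's dimensionality.
z1, z2 = torch.zeros(100), torch.ones(100)
path = interpolate_latents(z1, z2, steps=10)
```

In practice `z1` and `z2` would typically be the encoder means of the two faces, and each row of `path` would be passed through the decoder to produce one frame of the morph.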


In [None]:
val_iter = iter(val_dl)
val_batch = next(val_iter)
img1, img2 = val_batch[0], val_batch[1]

show_interpolation_candidates(img1, img2)

interp_grid = interpolate_faces(model, img1, img2, steps=10)
plt.figure(figsize=(14, 2))
plt.axis("off")
plt.title("Final Latent Interpolation After Training")
plt.imshow(interp_grid.permute(1, 2, 0).cpu())
plt.show()


## Reconstructing Random Noise Inputs

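As a further experiment, we feed Gaussian noise images to the model and visualize the reconstructions. Unlike the training images, whose pixel values lie in [0, 1], these inputs are drawn from a standard normal distribution, so they fall well outside the data the model was trained on.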


In [None]:
sample_input = torch.randn(20, 3, 64, 64, device=device)  # allocate directly on the target device
show_reconstructions(model, sample_input=sample_input)


## Conclusion

In this project, we implemented a convolutional Variational Autoencoder that learns a 100-dimensional latent representation of 64x64 face images from the UTKFace dataset.
The model reconstructs validation faces and generates new faces by decoding samples drawn from the latent space.
Latent space interpolation produces a gradual morph between two faces, which suggests that nearby latent vectors decode to similar faces.
Together, these experiments illustrate how VAEs can be used for unsupervised representation learning and generative modeling of image data.