<a href="https://colab.research.google.com/github/mukul-mschauhan/Machine-Learning-Projects/blob/master/Age_Detection_Using_Resnet.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

### Age and Gender Detection Using the UTKFace Dataset

https://www.kaggle.com/datasets/jangedoo/utkface-new

UTKFace is a large-scale face dataset with a long age span (0 to 116 years). It contains over 20,000 face images annotated with age, gender, and ethnicity. The images cover wide variation in pose, facial expression, illumination, occlusion, and resolution. The dataset can be used for a variety of tasks, such as face detection, age estimation, age progression/regression, and landmark localization. The notebook proceeds in the following steps:


* Imports
* Dataset + DataLoader
* Model definition
* model = AgeGenderNet()
* Optimizer = ...
* Train / validate functions
* Training loop

In [2]:
!pip -q install kagglehub

import os, glob, random
import numpy as np

import torch
import torch.nn as nn
from torch.utils.data import Dataset, DataLoader

import torchvision
from torchvision import transforms
from PIL import Image

import kagglehub

# Download dataset
path = kagglehub.dataset_download("jangedoo/utkface-new")

print("Raw dataset path:", path)

Downloading from https://www.kaggle.com/api/v1/datasets/download/jangedoo/utkface-new?dataset_version_number=1...


100%|██████████| 331M/331M [00:01<00:00, 214MB/s]

Extracting files...





Raw dataset path: /root/.cache/kagglehub/datasets/jangedoo/utkface-new/versions/1


### Find the actual folder that contains images

In [3]:
import os, glob

def find_image_dir(base_path):
    best_dir = None
    best_count = 0
    for root, _, _ in os.walk(base_path):
        jpg_count = len(glob.glob(os.path.join(root, "*.jpg")))
        if jpg_count > best_count:
            best_count = jpg_count
            best_dir = root
    return best_dir, best_count

IMAGE_DIR, n_jpg = find_image_dir(path)

print("Detected IMAGE_DIR:", IMAGE_DIR)
print("Number of .jpg images:", n_jpg)

Detected IMAGE_DIR: /root/.cache/kagglehub/datasets/jangedoo/utkface-new/versions/1/UTKFace
Number of .jpg images: 23708


### Collect valid images + split into train/val

What this block does

* glob(...) collects all .jpg files.

* Filters out malformed files by checking the UTKFace naming convention, ``age_gender_race_*.jpg``. Both parts[0] (age) and parts[1] (gender) must consist of digits only.

* Shuffles and splits 80/20 into training and validation paths.

Why it's needed

* UTKFace datasets often contain a few bad files (broken names, non-image metadata, etc.).

* Filtering prevents runtime crashes later.
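
As a rough illustration of the naming check described above, the sketch below parses age and gender from a filename. The helper name `parse_utkface_name` and the sample filenames are invented for this example and are not part of the notebook.

```python
import os

def parse_utkface_name(path):
    """Return (age, gender) parsed from a UTKFace-style filename, or None.

    Hypothetical helper for illustration. It mirrors the notebook's check:
    the first two underscore-separated fields must be digit strings.
    """
    parts = os.path.basename(path).split("_")
    if len(parts) >= 2 and parts[0].isdigit() and parts[1].isdigit():
        return int(parts[0]), int(parts[1])
    return None

# Sample names are invented for the example.
print(parse_utkface_name("/data/25_0_2_20170116174525125.jpg"))  # (25, 0)
print(parse_utkface_name("/data/readme_notes.jpg"))              # None
```

Returning `None` instead of raising lets the caller decide whether to skip the file or log it.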

In [4]:
import random

all_images = glob.glob(os.path.join(IMAGE_DIR, "*.jpg"))

valid_images = []
for p in all_images:
    name = os.path.basename(p)
    parts = name.split("_")
    if len(parts) >= 2 and parts[0].isdigit() and parts[1].isdigit():
        valid_images.append(p)

random.seed(42)
random.shuffle(valid_images)

split = int(0.8 * len(valid_images))
train_paths = valid_images[:split]
val_paths   = valid_images[split:]

print("Train:", len(train_paths), "Val:", len(val_paths))
assert len(train_paths) > 0 and len(val_paths) > 0


Train: 18966 Val: 4742


### Transforms: converting image → tensor with consistent size

What this block does

* Resize: makes all images 224x224 so batching works.
* Augmentation (train only): horizontal flip helps generalization.
* ToTensor: converts PIL image to torch.Tensor in shape [C,H,W].
* Normalize: matches ImageNet normalization because the backbone is pretrained on ImageNet.

Why it's needed

* Without resizing: the DataLoader can't stack images of different sizes.
* Without ToTensor: you'll get collation errors because batches need tensors.
* Without normalize: pretrained backbones perform worse (input distribution mismatch).
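
To make the normalization step concrete, here is a small pure-Python sketch of the per-channel arithmetic: an 8-bit value is scaled to [0, 1] (as ToTensor does for 8-bit images) and then standardized as (x - mean) / std (as Normalize does). The function name `normalize_pixel` is made up for this example.

```python
# ImageNet channel statistics, the same values used in the transforms here.
MEAN = (0.485, 0.456, 0.406)
STD = (0.229, 0.224, 0.225)

def normalize_pixel(rgb_uint8):
    """Scale one 8-bit RGB pixel to [0, 1], then standardize each channel."""
    scaled = [v / 255.0 for v in rgb_uint8]                      # ToTensor-style scaling
    return [(s - m) / d for s, m, d in zip(scaled, MEAN, STD)]   # Normalize-style step

black = normalize_pixel((0, 0, 0))
white = normalize_pixel((255, 255, 255))
print([round(v, 3) for v in black])
print([round(v, 3) for v in white])
```

After this step, inputs fall roughly in the range -2.1 to 2.6 rather than 0 to 1, which is the scale the pretrained weights were trained on.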

In [5]:
from torchvision import transforms

train_tfms = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.RandomHorizontalFlip(p=0.5),
    transforms.ToTensor(),
    transforms.Normalize(mean=(0.485,0.456,0.406), std=(0.229,0.224,0.225)),
])

val_tfms = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize(mean=(0.485,0.456,0.406), std=(0.229,0.224,0.225)),
])


### Dataset class: read image + parse age/gender from filename

What this block does

* Implements a PyTorch Dataset:

* __len__ tells how many samples.

* __getitem__ returns one sample.

Each sample returns:

* img: tensor [3,224,224]

* age: float tensor (e.g., 25.0)

* gender: integer tensor (0 or 1)

Why it's safe

* If filename parsing fails: it "moves on" to the next item.

* If an image is corrupted: it also moves on.

* This avoids dataloader crashes mid-epoch.
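
One caveat of the "move on" strategy is that, written recursively, it never reaches a good item if every file is bad. The sketch below shows a bounded alternative that tries each item at most once. It is not the notebook's implementation: `load_with_fallback`, `load_fn`, and `toy_load` are hypothetical names used only for illustration.

```python
def load_with_fallback(paths, idx, load_fn):
    """Try paths[idx], then the following items (wrapping around), once each.

    `load_fn` is a hypothetical callable that returns a sample or raises on a
    bad file. Raises RuntimeError if no item can be loaded.
    """
    n = len(paths)
    for offset in range(n):
        candidate = paths[(idx + offset) % n]
        try:
            return load_fn(candidate)
        except Exception:
            continue  # skip the bad item and move on to the next one
    raise RuntimeError("no loadable samples in dataset")

# Toy loader: names starting with "bad" fail, everything else loads.
def toy_load(name):
    if name.startswith("bad"):
        raise ValueError(name)
    return name.upper()

print(load_with_fallback(["bad_1", "bad_2", "ok_3"], 0, toy_load))
```

Note that either variant returns a neighbouring sample in place of the bad one, so a few samples may appear twice per epoch.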

In [6]:
import os
from PIL import Image
import torch
from torch.utils.data import Dataset

class UTKFaceDatasetSafe(Dataset):
    def __init__(self, image_paths, transform=None):
        self.paths = image_paths
        self.transform = transform

    def __len__(self):
        return len(self.paths)

    def __getitem__(self, idx):
        path = self.paths[idx]
        filename = os.path.basename(path)

        parts = filename.split("_")
        if len(parts) < 2 or (not parts[0].isdigit()) or (not parts[1].isdigit()):
            return self.__getitem__((idx + 1) % len(self.paths))

        age = float(parts[0])
        gender = int(parts[1])

        try:
            img = Image.open(path).convert("RGB")
        except Exception:
            return self.__getitem__((idx + 1) % len(self.paths))

        if self.transform is None:
            raise ValueError("Transform is None. Please pass train_tfms/val_tfms.")
        img = self.transform(img)

        age = torch.tensor(age, dtype=torch.float32)
        gender = torch.tensor(gender, dtype=torch.long)

        return img, age, gender


### DataLoader: batch + shuffle + workers

What this block does

* Wraps dataset into a batch generator.

* shuffle=True for training so batches are random.

* shuffle=False for validation so evaluation is stable.

* num_workers=0 for maximum stability (especially in Colab).

* pin_memory=True can speed up host→GPU transfer.
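
For a sense of scale, the following sketch works out how many batches each loader yields per epoch, assuming the final partial batch is kept (the DataLoader default, drop_last=False). The split sizes are the ones printed earlier in the notebook; the helper names are made up.

```python
import math

BATCH_SIZE = 64
n_train, n_val = 18966, 4742   # split sizes printed earlier in the notebook

def num_batches(n_samples, batch_size):
    """Batches per epoch when the final partial batch is kept."""
    return math.ceil(n_samples / batch_size)

def last_batch_size(n_samples, batch_size):
    """Size of the final batch; equals batch_size when it divides evenly."""
    return n_samples % batch_size or batch_size

print(num_batches(n_train, BATCH_SIZE), last_batch_size(n_train, BATCH_SIZE))  # 297 22
print(num_batches(n_val, BATCH_SIZE), last_batch_size(n_val, BATCH_SIZE))      # 75 6
```

Because the last batch is smaller, per-batch metrics need to be weighted by batch size when averaged over an epoch.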

In [7]:
from torch.utils.data import DataLoader

train_ds = UTKFaceDatasetSafe(train_paths, transform=train_tfms)
val_ds   = UTKFaceDatasetSafe(val_paths, transform=val_tfms)

train_loader = DataLoader(train_ds, batch_size=64, shuffle=True, num_workers=0, pin_memory=True)
val_loader   = DataLoader(val_ds, batch_size=64, shuffle=False, num_workers=0, pin_memory=True)


### Model

What this block does

* Loads a pretrained ResNet18.

* Removes its classifier (fc) by replacing with Identity.

* The backbone outputs a feature vector f.

Two task-specific heads:

* Age regression head → outputs 1 number per image (age).
* Gender classification head → outputs 2 logits (male/female).

Why multi-task learning helps

* The backbone learns face features useful for both tasks.
* Gender task can regularize the representation and sometimes improves age too.

Why the gender head outputs "logits"
* CrossEntropyLoss expects raw logits, not probabilities.
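
To see the relationship between logits and probabilities, here is a minimal pure-Python softmax. CrossEntropyLoss applies the equivalent log-softmax internally, so the model itself should not apply softmax before the loss. The logit values below are invented for illustration.

```python
import math

def softmax(logits):
    """Convert raw logits to probabilities (numerically stable form)."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 0.5]          # made-up logits for one image: [class 0, class 1]
probs = softmax(logits)

# argmax can be taken on the logits directly; softmax preserves the ordering.
predicted = max(range(len(logits)), key=lambda i: logits[i])

print([round(p, 3) for p in probs], "predicted class:", predicted)
```

Softmax is only needed at inference time, when a calibrated-looking probability is wanted rather than just the predicted class.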

In [8]:
import torch.nn as nn
import torchvision

class AgeGenderNet(nn.Module):
    def __init__(self):
        super().__init__()

        m = torchvision.models.resnet18(weights=torchvision.models.ResNet18_Weights.DEFAULT)
        feat_dim = m.fc.in_features
        m.fc = nn.Identity()
        self.backbone = m

        self.age_head = nn.Sequential(
            nn.Linear(feat_dim, 128),
            nn.ReLU(),
            nn.Dropout(0.2),
            nn.Linear(128, 1)
        )

        self.gender_head = nn.Sequential(
            nn.Linear(feat_dim, 128),
            nn.ReLU(),
            nn.Dropout(0.2),
            nn.Linear(128, 2)
        )

    def forward(self, x):
        f = self.backbone(x)
        age = self.age_head(f).squeeze(1)
        gender_logits = self.gender_head(f)
        return age, gender_logits


### Loss Function

Age loss: SmoothL1

* Better than MSE when labels are noisy / outliers exist.

* Penalizes big errors more gently than MSE.

Gender loss: CrossEntropy

* Standard for multi-class classification (binary included).

Weighted sum

``Final loss L = L_age + λ ⋅ L_gender``

LAMBDA_GENDER balances tasks.
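
The difference between SmoothL1 and MSE is easiest to see on single residuals. The sketch below implements the per-sample Smooth L1 formula with beta = 1.0 (the default of nn.SmoothL1Loss) in plain Python and compares it with squared error. The error values and the gender loss of 0.3 are invented for illustration.

```python
def smooth_l1(error, beta=1.0):
    """Smooth L1 for a single residual: quadratic below beta, linear above."""
    a = abs(error)
    return 0.5 * a * a / beta if a < beta else a - 0.5 * beta

def squared_error(error):
    """Per-sample MSE term, for comparison."""
    return error * error

for err in (0.5, 5.0, 20.0):   # age errors in years, invented for illustration
    print(err, smooth_l1(err), squared_error(err))

# Combined objective for one sample, with a made-up gender loss value.
LAMBDA_GENDER = 1.0
total = smooth_l1(5.0) + LAMBDA_GENDER * 0.3
print(total)
```

A 20-year error costs 19.5 under Smooth L1 but 400 under squared error, so a handful of mislabeled faces pull on the gradients far less. For errors well above one year the age term is roughly the absolute error minus 0.5, which is why the logged total loss tends to sit a little below the age MAE plus the gender loss.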

In [13]:
age_loss_fn = nn.SmoothL1Loss()
gender_loss_fn = nn.CrossEntropyLoss()

LAMBDA_GENDER = 1.0

import torch

device = "cuda" if torch.cuda.is_available() else "cpu"
print("Using device:", device)
model = AgeGenderNet().to(device)

optimizer = torch.optim.AdamW(
    model.parameters(),
    lr=3e-4,
    weight_decay=1e-4
)

Using device: cuda


### Training

What happens in each iteration

* Move batch to GPU/CPU.

* Forward pass → age + gender logits.

* Compute two losses.

* Combine them into a single loss.

* Backprop + update weights.

Track metrics:

* MAE (years): average absolute age error.

* Accuracy: gender correctness.

Why ``detach()`` is used

* Prevents metric computation from creating extra graph nodes (saves memory).
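
The training function accumulates each per-batch metric multiplied by the batch size and divides by the dataset size at the end. The sketch below shows why: with a smaller final batch, a plain mean of batch means is biased. All numbers are made up, and `epoch_average` is a hypothetical helper.

```python
def epoch_average(batch_means, batch_sizes):
    """Sample-weighted average of per-batch means."""
    total = sum(m * n for m, n in zip(batch_means, batch_sizes))
    return total / sum(batch_sizes)

batch_mae = [5.0, 5.0, 9.0]      # made-up per-batch MAE values
batch_sizes = [64, 64, 22]       # the last batch is smaller

weighted = epoch_average(batch_mae, batch_sizes)
naive = sum(batch_mae) / len(batch_mae)
print(round(weighted, 3), round(naive, 3))
```

The weighted value equals the mean over all 150 samples, while the naive mean gives the 22-sample batch the same influence as a full one.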

In [10]:
def train_one_epoch(model, loader, optimizer, device):
    model.train()
    total_loss, total_age_mae, total_acc = 0.0, 0.0, 0.0

    for x, age, gender in loader:
        x = x.to(device)
        age = age.to(device)
        gender = gender.to(device)

        pred_age, pred_gender_logits = model(x)

        loss_age = age_loss_fn(pred_age, age)
        loss_gender = gender_loss_fn(pred_gender_logits, gender)
        loss = loss_age + LAMBDA_GENDER * loss_gender

        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

        mae = (pred_age.detach() - age).abs().mean().item()
        acc = (pred_gender_logits.detach().argmax(dim=1) == gender).float().mean().item()

        total_loss += loss.item() * x.size(0)
        total_age_mae += mae * x.size(0)
        total_acc += acc * x.size(0)

    n = len(loader.dataset)
    return total_loss / n, total_age_mae / n, total_acc / n

### Validation Step

In [11]:
@torch.no_grad()
def validate(model, loader, device):
    model.eval()
    total_loss, total_age_mae, total_acc = 0.0, 0.0, 0.0

    for x, age, gender in loader:
        x = x.to(device)
        age = age.to(device)
        gender = gender.to(device)

        pred_age, pred_gender_logits = model(x)

        loss_age = age_loss_fn(pred_age, age)
        loss_gender = gender_loss_fn(pred_gender_logits, gender)
        loss = loss_age + LAMBDA_GENDER * loss_gender

        mae = (pred_age - age).abs().mean().item()
        acc = (pred_gender_logits.argmax(dim=1) == gender).float().mean().item()

        total_loss += loss.item() * x.size(0)
        total_age_mae += mae * x.size(0)
        total_acc += acc * x.size(0)

    n = len(loader.dataset)
    return total_loss / n, total_age_mae / n, total_acc / n


In [None]:
best_val = 1e9
EPOCHS = 8

for epoch in range(1, EPOCHS + 1):
    tr_loss, tr_mae, tr_acc = train_one_epoch(model, train_loader, optimizer, device)
    va_loss, va_mae, va_acc = validate(model, val_loader, device)

    print(
        f"Epoch {epoch:02d} | "
        f"train: loss={tr_loss:.4f}, age_MAE={tr_mae:.2f}y, gender_acc={tr_acc:.3f} | "
        f"val: loss={va_loss:.4f}, age_MAE={va_mae:.2f}y, gender_acc={va_acc:.3f}"
    )

    if va_loss < best_val:
        best_val = va_loss
        torch.save(model.state_dict(), "best_age_gender.pt")
        print("  ✅ saved best_age_gender.pt")

Epoch 01 | train: loss=9.2638, age_MAE=9.42y, gender_acc=0.854 | val: loss=6.1858, age_MAE=6.39y, gender_acc=0.882
  ✅ saved best_age_gender.pt
Epoch 02 | train: loss=5.8652, age_MAE=6.09y, gender_acc=0.890 | val: loss=5.3252, age_MAE=5.55y, gender_acc=0.899
  ✅ saved best_age_gender.pt
Epoch 03 | train: loss=5.4347, age_MAE=5.69y, gender_acc=0.903 | val: loss=5.0699, age_MAE=5.29y, gender_acc=0.897
  ✅ saved best_age_gender.pt
Epoch 04 | train: loss=5.0609, age_MAE=5.33y, gender_acc=0.913 | val: loss=5.3151, age_MAE=5.51y, gender_acc=0.883
Epoch 05 | train: loss=4.8806, age_MAE=5.16y, gender_acc=0.920 | val: loss=5.3070, age_MAE=5.56y, gender_acc=0.912
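
The checkpoint rule in the loop keeps the weights from the epoch with the lowest validation loss so far. As a small check, the sketch below replays that rule on the validation losses from the log above, with a list append standing in for torch.save.

```python
val_losses = [6.1858, 5.3252, 5.0699, 5.3151, 5.3070]  # copied from the log above

best_val = float("inf")
saved_epochs = []
for epoch, va_loss in enumerate(val_losses, start=1):
    if va_loss < best_val:          # same comparison as in the training loop
        best_val = va_loss
        saved_epochs.append(epoch)  # stands in for torch.save(...)

print(saved_epochs, best_val)
```

This matches the log: checkpoints are written after epochs 1, 2, and 3, and the file on disk after epoch 5 still holds the epoch 3 weights, even though epoch 5 has the highest gender accuracy.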
