# 🔐 Face Recognition with Triplet Loss & Unknown Detection

## 📋 Project Overview

**Objective:** Face recognition over a fixed gallery of known identities, with unknown detection (open-set identification)
- **Known Identities:** 10 criminals
- **Images per person:** 20 face images
- **Unknown Detection:** Faces not in dataset classified as "Unknown"

## 🎯 Architecture

- **Backbone:** ResNet50 (pretrained on ImageNet)
- **Loss Function:** Triplet Loss with randomly sampled triplets (no hard negative mining is implemented in this notebook)
- **Embedding Size:** 128D
- **Distance Metric:** Cosine Similarity
- **Unknown Threshold:** Fixed at 0.6 by default; should be tuned on validation data

---

## 🚀 Quick Start (Google Colab)

1. **Enable GPU:** Runtime → Change runtime type → GPU
2. **Run all cells** from top to bottom
3. **Dataset downloads automatically** from Kaggle to Colab storage
4. **Training starts automatically** (no local downloads needed)


## 0. Check Environment

In [None]:
# Check GPU availability
import torch

print(f'PyTorch version: {torch.__version__}')
print(f'CUDA available: {torch.cuda.is_available()}')

if torch.cuda.is_available():
    print(f'GPU: {torch.cuda.get_device_name(0)}')
    print(f'GPU Memory: {torch.cuda.get_device_properties(0).total_memory / 1e9:.2f} GB')
else:
    print('⚠️ No GPU detected! Go to Runtime > Change runtime type > GPU')

## 1. Environment Setup

In [None]:
# Install required packages (if needed)
!pip install -q kaggle

print('✅ Packages installed')

In [None]:
# Core imports
import os
import json
import random
import shutil
import numpy as np
from pathlib import Path
from typing import List, Tuple, Dict
from collections import defaultdict
import warnings
warnings.filterwarnings('ignore')

# PyTorch
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.utils.data import Dataset, DataLoader
from torch.optim import Adam
from torch.optim.lr_scheduler import StepLR
import torchvision.models as models
import torchvision.transforms as transforms

# Image processing
from PIL import Image
import cv2

# Visualization
import matplotlib.pyplot as plt
from tqdm.auto import tqdm

# Metrics
from sklearn.metrics import accuracy_score

# Set seeds
SEED = 42
random.seed(SEED)
np.random.seed(SEED)
torch.manual_seed(SEED)
torch.cuda.manual_seed_all(SEED)

DEVICE = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
print(f'✅ Device: {DEVICE}')
print(f'📦 PyTorch: {torch.__version__}')

## 2. Kaggle Dataset Download (Auto)

**Dataset:** [Vasuki Patel Face Recognition Dataset](https://www.kaggle.com/datasets/vasukipatel/face-recognition-dataset)
- **31 identities** (we'll use first 10)
- **2562 celebrity face images**
- **Downloads to Colab storage** (not your local machine)

In [None]:
# Set up Kaggle credentials
# IMPORTANT: enter your own credentials at runtime; never hard-code or share an API key
from getpass import getpass
KAGGLE_USERNAME = input('Kaggle username: ')
KAGGLE_KEY = getpass('Kaggle API key: ')

# Create Kaggle config
os.makedirs(os.path.expanduser('~/.kaggle'), exist_ok=True)
kaggle_config = {'username': KAGGLE_USERNAME, 'key': KAGGLE_KEY}

with open(os.path.expanduser('~/.kaggle/kaggle.json'), 'w') as f:
    json.dump(kaggle_config, f)

os.chmod(os.path.expanduser('~/.kaggle/kaggle.json'), 0o600)

print('✅ Kaggle credentials configured')

In [None]:
# Download Face Recognition Dataset (31 classes)
!rm -rf ./data_raw ./data_prepared  # Clean previous downloads
!kaggle datasets download -d vasukipatel/face-recognition-dataset -p ./data_raw --unzip

print('\n📁 Dataset downloaded to Colab cloud storage!')
print('(Data stays in Colab, NOT downloaded to your local machine)')

# DEBUG: List all files to understand structure
print('\n🔍 DEBUG: Exploring dataset structure...')
!ls -laR ./data_raw | head -50

print('\n🔍 DEBUG: Finding all image files...')
!find ./data_raw -type f \( -name "*.jpg" -o -name "*.jpeg" -o -name "*.png" \) | head -20

In [None]:
# Prepare dataset: Select first 10 identities, 20 images each
from pathlib import Path
from glob import glob

RAW_DIR = Path('./data_raw')
PREPARED_DIR = Path('./data_prepared')

# Clean and recreate prepared directory
if PREPARED_DIR.exists():
    shutil.rmtree(PREPARED_DIR)
PREPARED_DIR.mkdir(exist_ok=True)

# Find all image files recursively
print('🔍 Searching for images...')
all_images = []
for ext in ['*.jpg', '*.jpeg', '*.png', '*.JPG', '*.JPEG', '*.PNG']:
    found = list(RAW_DIR.rglob(ext))
    all_images.extend(found)
    if found:
        print(f'   Found {len(found)} {ext} files')

print(f'\n📊 Total images found: {len(all_images)}')

if len(all_images) == 0:
    print('\n❌ ERROR: No images found!')
    print('\n💡 Troubleshooting:')
    print('   1. Check if dataset downloaded: !ls -la ./data_raw')
    print('   2. Check Kaggle credentials are correct')
    print('   3. Try manual download from: https://www.kaggle.com/datasets/vasukipatel/face-recognition-dataset')
    raise ValueError('No images found in dataset')

# Group images by parent folder (identity)
print('\n📂 Grouping images by identity...')
identity_images = defaultdict(list)

for img_path in all_images:
    # Get parent folder name as identity
    identity = img_path.parent.name
    
    # Skip if in root directory
    if identity in ['data_raw', '.']:
        continue
    
    identity_images[identity].append(img_path)

print(f'\n📊 Found {len(identity_images)} unique identities')

# Show identity distribution
print('\n📊 Images per identity (top 15):')
for identity, images in sorted(identity_images.items(), key=lambda x: len(x[1]), reverse=True)[:15]:
    print(f'   {identity}: {len(images)} images')

# Sort identities by number of images (descending)
sorted_identities = sorted(identity_images.items(), key=lambda x: len(x[1]), reverse=True)

# Select first 10 identities with at least 20 images
selected_count = 0
print(f'\n📦 Preparing dataset with 10 identities (20 images each)...')

for identity, images in sorted_identities:
    if selected_count >= 10:
        break
    
    if len(images) < 20:
        print(f'   ⚠️ Skipping {identity}: only {len(images)} images (need 20)')
        continue
    
    # Create folder
    new_identity = f'person_{selected_count}'
    new_dir = PREPARED_DIR / new_identity
    new_dir.mkdir(exist_ok=True)
    
    # Copy first 20 images
    for img_idx, img_path in enumerate(images[:20]):
        new_name = f'img_{img_idx:02d}{img_path.suffix}'
        shutil.copy(img_path, new_dir / new_name)
    
    print(f'   ✅ {new_identity}: 20 images (from {identity})')
    selected_count += 1

if selected_count < 10:
    print(f'\n❌ ERROR: Only found {selected_count} identities with 20+ images')
    print(f'\n💡 Solutions:')
    print(f'   1. Reduce images per person (change 20 to 10 or 15)')
    print(f'   2. Reduce number of identities (change 10 to {selected_count})')
    print(f'   3. Use a different dataset with more images per person')
    raise ValueError(f'Not enough identities with sufficient images')

print(f'\n✅ Dataset ready at: {PREPARED_DIR}')
print(f'   Total: {selected_count} identities × 20 images = {selected_count * 20} images')

# Verify
print('\n🔍 Verifying prepared dataset...')
!ls -la ./data_prepared/

# Update DATA_DIR
DATA_DIR = PREPARED_DIR

## 3. Dataset Class & DataLoaders

In [None]:
class TripletFaceDataset(Dataset):
    """
    Face Recognition Dataset with Triplet Mining
    Returns: (anchor, positive, negative) triplets
    """
    
    def __init__(self, data_dir: str, transform=None):
        self.data_dir = Path(data_dir)
        self.transform = transform
        
        # Build dataset index
        self.identities = sorted([d.name for d in self.data_dir.iterdir() if d.is_dir()])
        self.identity_to_idx = {name: idx for idx, name in enumerate(self.identities)}
        
        # Group images by identity
        self.identity_images = defaultdict(list)
        self.all_images = []
        
        for identity in self.identities:
            identity_dir = self.data_dir / identity
            # Match extensions case-insensitively (the preparation step keeps each file's original suffix)
            images = sorted(p for p in identity_dir.iterdir() if p.suffix.lower() in {'.jpg', '.jpeg', '.png'})
            
            for img_path in images:
                self.identity_images[identity].append(str(img_path))
                self.all_images.append((str(img_path), identity))
        
        print(f'📊 Dataset Summary:')
        print(f'   Identities: {len(self.identities)}')
        print(f'   Total images: {len(self.all_images)}')
        for identity in self.identities:
            print(f'   {identity}: {len(self.identity_images[identity])} images')
        
        if len(self.all_images) == 0:
            raise ValueError('Dataset is empty! Check data preparation step.')
    
    def __len__(self):
        return len(self.all_images)
    
    def __getitem__(self, idx):
        # Get anchor
        anchor_path, anchor_identity = self.all_images[idx]
        anchor_img = self._load_image(anchor_path)
        
        # Get positive (same identity, different image)
        positive_candidates = [p for p in self.identity_images[anchor_identity] if p != anchor_path]
        if not positive_candidates:
            positive_candidates = [anchor_path]  # Fallback
        positive_path = random.choice(positive_candidates)
        positive_img = self._load_image(positive_path)
        
        # Get negative (different identity)
        negative_identity = random.choice([i for i in self.identities if i != anchor_identity])
        negative_path = random.choice(self.identity_images[negative_identity])
        negative_img = self._load_image(negative_path)
        
        # Get label
        label = self.identity_to_idx[anchor_identity]
        
        return anchor_img, positive_img, negative_img, label
    
    def _load_image(self, path: str):
        img = Image.open(path).convert('RGB')
        if self.transform:
            img = self.transform(img)
        return img
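
`__getitem__` above picks the negative uniformly at random. Hard negative mining would instead prefer the negative that currently sits closest to the anchor in embedding space. The following is a rough sketch of that selection step on precomputed embeddings, using plain Python lists. `hardest_negative` is a hypothetical helper and is not wired into the dataset or the training loop.

```python
import math
from typing import Dict, List, Tuple


def euclidean(a: List[float], b: List[float]) -> float:
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def hardest_negative(
    anchor: List[float],
    anchor_identity: str,
    embeddings: Dict[str, List[List[float]]],
) -> Tuple[str, int, float]:
    """Return (identity, image index, distance) of the closest other-identity embedding."""
    best = None
    for identity, vectors in embeddings.items():
        if identity == anchor_identity:
            continue  # same identity would be a positive, not a negative
        for idx, vec in enumerate(vectors):
            dist = euclidean(anchor, vec)
            if best is None or dist < best[2]:
                best = (identity, idx, dist)
    if best is None:
        raise ValueError('need at least one other identity to mine a negative')
    return best


# Toy 2-D embeddings for three identities (illustrative values only)
toy = {
    'person_0': [[1.0, 0.0], [0.9, 0.1]],
    'person_1': [[0.0, 1.0], [0.8, 0.3]],
    'person_2': [[-1.0, 0.0]],
}
identity, idx, dist = hardest_negative(toy['person_0'][0], 'person_0', toy)
print(identity, idx, round(dist, 3))
```

In practice the embeddings would have to be recomputed periodically, since the hardest negative changes as the model trains.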

In [None]:
# Data transforms
train_transform = transforms.Compose([
    transforms.Resize((160, 160)),
    transforms.RandomHorizontalFlip(p=0.5),
    transforms.ColorJitter(brightness=0.2, contrast=0.2),
    transforms.RandomRotation(10),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
])

val_transform = transforms.Compose([
    transforms.Resize((160, 160)),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
])

print('✅ Transforms defined')

In [None]:
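# Create train/val datasets
# NOTE: both datasets read the same images from DATA_DIR (only the transforms differ),
# so validation metrics are measured on training images and will be optimistic.
print('\n📦 Loading datasets...')
train_dataset = TripletFaceDataset(DATA_DIR, transform=train_transform)

print('\n📦 Loading validation dataset...')
val_dataset = TripletFaceDataset(DATA_DIR, transform=val_transform)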

# DataLoaders
BATCH_SIZE = 16
train_loader = DataLoader(train_dataset, batch_size=BATCH_SIZE, shuffle=True, num_workers=2)
val_loader = DataLoader(val_dataset, batch_size=BATCH_SIZE, shuffle=False, num_workers=2)

print(f'\n✅ DataLoaders ready')
print(f'   Train batches: {len(train_loader)}')
print(f'   Val batches: {len(val_loader)}')
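
Both loaders above read the same folder, so the validation numbers are computed on images the model has already trained on. A minimal sketch of a per-identity hold-out split follows. `split_per_identity` and its `val_fraction` parameter are hypothetical names, not part of the notebook, and `TripletFaceDataset` would need to accept the resulting path lists instead of a directory.

```python
import random
from typing import Dict, List, Tuple


def split_per_identity(
    identity_images: Dict[str, List[str]],
    val_fraction: float = 0.2,
    seed: int = 42,
) -> Tuple[Dict[str, List[str]], Dict[str, List[str]]]:
    """Hold out a fraction of each identity's images for validation.

    Hypothetical helper: every identity keeps at least two training images
    (so a positive pair exists) and at least one validation image.
    """
    rng = random.Random(seed)
    train, val = {}, {}
    for identity, paths in identity_images.items():
        if len(paths) < 3:
            raise ValueError(f'{identity} needs at least 3 images to split')
        shuffled = sorted(paths)
        rng.shuffle(shuffled)
        n_val = max(1, int(len(shuffled) * val_fraction))
        n_val = min(n_val, len(shuffled) - 2)
        val[identity] = shuffled[:n_val]
        train[identity] = shuffled[n_val:]
    return train, val


# Example with the notebook's layout: 10 identities, 20 images each
demo = {f'person_{i}': [f'person_{i}/img_{j:02d}.jpg' for j in range(20)] for i in range(10)}
train_split, val_split = split_per_identity(demo)
print(len(train_split['person_0']), len(val_split['person_0']))
```

Splitting inside each identity keeps all 10 people in both sets, which is what a gallery-based recognizer needs.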

## 4. Model Architecture

Transfer Learning with ResNet50 backbone + Custom Embedding Head

In [None]:
class FaceEmbeddingModel(nn.Module):
    """
    Face Recognition Model with Triplet Loss
    - Pretrained ResNet50 backbone
    - Custom embedding head (128D)
    - L2 normalized embeddings
    """
    
    def __init__(self, embedding_size=128, pretrained=True):
        super().__init__()
        
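        # Load ResNet50; `weights=` replaces the deprecated `pretrained=` flag (torchvision >= 0.13)
        weights = models.ResNet50_Weights.IMAGENET1K_V1 if pretrained else None
        resnet = models.resnet50(weights=weights)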
        
        # Remove final FC layer
        self.backbone = nn.Sequential(*list(resnet.children())[:-1])
        
        # Freeze early layers (fine-tune only last blocks)
        for param in list(self.backbone.parameters())[:-20]:
            param.requires_grad = False
        
        # Embedding head
        self.embedding = nn.Sequential(
            nn.Linear(2048, 512),
            nn.ReLU(),
            nn.Dropout(0.3),
            nn.Linear(512, embedding_size)
        )
        
        self.embedding_size = embedding_size
    
    def forward(self, x):
        # Extract features
        features = self.backbone(x)
        features = features.view(features.size(0), -1)
        
        # Get embeddings
        embeddings = self.embedding(features)
        
        # L2 normalize
        embeddings = F.normalize(embeddings, p=2, dim=1)
        
        return embeddings


# Initialize model
model = FaceEmbeddingModel(embedding_size=128, pretrained=True).to(DEVICE)

# Model summary
total_params = sum(p.numel() for p in model.parameters())
trainable_params = sum(p.numel() for p in model.parameters() if p.requires_grad)

print(f'\n📐 Model Summary:')
print(f'   Total parameters: {total_params:,}')
print(f'   Trainable parameters: {trainable_params:,}')
print(f'   Embedding size: {model.embedding_size}')

## 5. Triplet Loss Implementation

In [None]:
class TripletLoss(nn.Module):
    """
    Triplet loss on pre-sampled (anchor, positive, negative) triplets
    L = max(0, ||a - p|| - ||a - n|| + margin), using unsquared L2 distances
    """
    
    def __init__(self, margin=0.5):
        super().__init__()
        self.margin = margin
    
    def forward(self, anchor, positive, negative):
        # Compute pairwise distances
        pos_dist = F.pairwise_distance(anchor, positive, p=2)
        neg_dist = F.pairwise_distance(anchor, negative, p=2)
        
        # Triplet loss
        loss = F.relu(pos_dist - neg_dist + self.margin)
        
        return loss.mean()


# Initialize loss and optimizer
criterion = TripletLoss(margin=0.5)
optimizer = Adam(filter(lambda p: p.requires_grad, model.parameters()), lr=0.0001)
scheduler = StepLR(optimizer, step_size=10, gamma=0.5)

print('✅ Loss function and optimizer configured')
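
Because `forward` L2-normalizes every embedding, Euclidean distance and cosine similarity rank pairs identically: for unit vectors, ||a - b||² = 2 - 2·cos(a, b). That is why a Euclidean triplet loss in training is compatible with cosine matching at inference. The sketch below checks the identity numerically in plain Python. `l2_normalize` is an illustrative stand-in for `F.normalize`, not notebook code, and the input vectors are arbitrary.

```python
import math
from typing import List


def l2_normalize(v: List[float]) -> List[float]:
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def squared_euclidean(a: List[float], b: List[float]) -> float:
    return sum((x - y) ** 2 for x, y in zip(a, b))


def cosine(a: List[float], b: List[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


a = l2_normalize([0.3, -1.2, 0.5, 2.0])
b = l2_normalize([1.0, 0.4, -0.7, 1.5])

lhs = squared_euclidean(a, b)      # ||a - b||^2
rhs = 2.0 - 2.0 * cosine(a, b)     # 2 - 2 cos(a, b)
print(f'{lhs:.6f} {rhs:.6f}')
```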

## 6. Training Loop

In [None]:
def train_epoch(model, loader, criterion, optimizer, device):
    """Train for one epoch"""
    model.train()
    total_loss = 0
    num_batches = 0
    
    pbar = tqdm(loader, desc='Training')
    
    for anchor, positive, negative, _ in pbar:
        anchor = anchor.to(device)
        positive = positive.to(device)
        negative = negative.to(device)
        
        optimizer.zero_grad()
        
        # Get embeddings
        anchor_emb = model(anchor)
        positive_emb = model(positive)
        negative_emb = model(negative)
        
        # Compute loss
        loss = criterion(anchor_emb, positive_emb, negative_emb)
        
        # Backward
        loss.backward()
        optimizer.step()
        
        total_loss += loss.item()
        num_batches += 1
        
        pbar.set_postfix({'loss': f'{loss.item():.4f}'})
    
    return total_loss / num_batches

In [None]:
@torch.no_grad()
def validate(model, loader, criterion, device):
    """Validate and compute verification accuracy"""
    model.eval()
    total_loss = 0
    num_batches = 0
    
    # Count triplets where the positive is closer to the anchor than the negative
    correct = 0
    total = 0
    
    pbar = tqdm(loader, desc='Validating')
    
    for anchor, positive, negative, _ in pbar:
        anchor = anchor.to(device)
        positive = positive.to(device)
        negative = negative.to(device)
        
        # Get embeddings
        anchor_emb = model(anchor)
        positive_emb = model(positive)
        negative_emb = model(negative)
        
        # Compute loss
        loss = criterion(anchor_emb, positive_emb, negative_emb)
        
        total_loss += loss.item()
        num_batches += 1
        
        # Triplet accuracy on the actual (anchor, positive, negative) embeddings
        pos_dist = F.pairwise_distance(anchor_emb, positive_emb)
        neg_dist = F.pairwise_distance(anchor_emb, negative_emb)
        correct += (pos_dist < neg_dist).sum().item()
        total += anchor.size(0)
        
        pbar.set_postfix({'loss': f'{loss.item():.4f}'})
    
    accuracy = correct / total if total > 0 else 0
    
    return total_loss / num_batches, accuracy

In [None]:
# Training configuration
NUM_EPOCHS = 50
SAVE_DIR = Path('./checkpoints')
SAVE_DIR.mkdir(exist_ok=True)

best_val_acc = 0
history = {'train_loss': [], 'val_loss': [], 'val_acc': []}

print('🚀 Starting training...\n')

for epoch in range(NUM_EPOCHS):
    print(f'\nEpoch {epoch + 1}/{NUM_EPOCHS}')
    print('-' * 50)
    
    # Train
    train_loss = train_epoch(model, train_loader, criterion, optimizer, DEVICE)
    
    # Validate
    val_loss, val_acc = validate(model, val_loader, criterion, DEVICE)
    
    # Scheduler step
    scheduler.step()
    
    # Log
    print(f'\nTrain Loss: {train_loss:.4f}')
    print(f'Val Loss: {val_loss:.4f} | Val Accuracy: {val_acc*100:.2f}%')
    
    # Save history
    history['train_loss'].append(train_loss)
    history['val_loss'].append(val_loss)
    history['val_acc'].append(val_acc)
    
    # Save best model
    if val_acc > best_val_acc:
        best_val_acc = val_acc
        torch.save({
            'epoch': epoch,
            'model_state_dict': model.state_dict(),
            'optimizer_state_dict': optimizer.state_dict(),
            'val_accuracy': best_val_acc,
            'embedding_size': model.embedding_size,
            'identities': train_dataset.identities
        }, SAVE_DIR / 'model.pth')
        print(f'💾 Saved best model with accuracy: {best_val_acc*100:.2f}%')

print(f'\n✅ Training complete! Best validation accuracy: {best_val_acc*100:.2f}%')
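
With `StepLR(step_size=10, gamma=0.5)` and one `scheduler.step()` per epoch, the learning rate halves every 10 epochs. The arithmetic can be reproduced without PyTorch. `step_lr` below is an illustrative function that mirrors the closed-form step decay, not the scheduler object itself.

```python
def step_lr(base_lr: float, epoch: int, step_size: int = 10, gamma: float = 0.5) -> float:
    """Learning rate in effect during `epoch` (0-based) under step decay."""
    return base_lr * gamma ** (epoch // step_size)


# Values matching the notebook's settings: lr=1e-4, 50 epochs
schedule = {epoch: step_lr(1e-4, epoch) for epoch in (0, 9, 10, 20, 30, 49)}
for epoch, lr in schedule.items():
    print(f'epoch {epoch + 1:2d}: lr = {lr:.3e}')
```

By the final epoch the rate has been halved four times, so the last ten epochs run at one sixteenth of the initial value.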

## 7. Inference & Unknown Detection

In [None]:
class FaceRecognitionSystem:
    """
    Face Recognition System with Unknown Detection
    """
    
    def __init__(self, model, identities, threshold=0.6):
        self.model = model
        self.identities = identities
        self.threshold = threshold
        self.embeddings_db = {}
        self.model.eval()
    
    def register_identity(self, identity_name: str, image_paths: List[str], transform):
        """Register an identity by computing average embedding"""
        embeddings = []
        
        for img_path in image_paths:
            img = Image.open(img_path).convert('RGB')
            img = transform(img).unsqueeze(0).to(DEVICE)
            
            with torch.no_grad():
                emb = self.model(img)
                embeddings.append(emb.cpu())
        
        # Average embedding
        avg_embedding = torch.cat(embeddings).mean(dim=0)
        self.embeddings_db[identity_name] = avg_embedding
    
    def predict(self, image_path: str, transform) -> Tuple[str, float]:
        """
        Predict identity for a face image
        Returns: (identity_name, similarity), or ('Unknown', best_similarity) when below the threshold
        """
        # Load and process image
        img = Image.open(image_path).convert('RGB')
        img = transform(img).unsqueeze(0).to(DEVICE)
        
        # Get embedding
        with torch.no_grad():
            query_emb = self.model(img).cpu()
        
        # Compare with database
        best_match = None
        best_similarity = -1
        
        for identity, db_emb in self.embeddings_db.items():
            # Cosine similarity
            similarity = F.cosine_similarity(query_emb, db_emb.unsqueeze(0)).item()
            
            if similarity > best_similarity:
                best_similarity = similarity
                best_match = identity
        
        # Check threshold
        if best_similarity >= self.threshold:
            return best_match, best_similarity
        else:
            return 'Unknown', best_similarity


print('✅ Inference system ready')
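
The decision rule in `predict` reduces to: find the gallery entry with the highest cosine similarity, and accept it only if that similarity reaches the threshold. Here is a dependency-free sketch of the same rule on toy vectors. `match_identity` and the gallery contents are made up for illustration.

```python
import math
from typing import Dict, List, Tuple


def cosine_similarity(a: List[float], b: List[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def match_identity(
    query: List[float],
    gallery: Dict[str, List[float]],
    threshold: float = 0.6,
) -> Tuple[str, float]:
    """Return the best-matching name, or 'Unknown' if the best similarity is below threshold."""
    best_name, best_sim = 'Unknown', -1.0
    for name, reference in gallery.items():
        sim = cosine_similarity(query, reference)
        if sim > best_sim:
            best_name, best_sim = name, sim
    if best_sim < threshold:
        return 'Unknown', best_sim
    return best_name, best_sim


gallery = {'person_0': [1.0, 0.0, 0.0], 'person_1': [0.0, 1.0, 0.0]}
print(match_identity([0.9, 0.1, 0.0], gallery))   # close to person_0
print(match_identity([0.0, 0.0, 1.0], gallery))   # orthogonal to every gallery entry
```

Note that the returned score is a similarity, not a calibrated probability, so it should be read relative to the threshold.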

In [None]:
# Load best model
checkpoint = torch.load(SAVE_DIR / 'model.pth', map_location=DEVICE)
model.load_state_dict(checkpoint['model_state_dict'])
model.eval()

# Create recognition system
recognition_system = FaceRecognitionSystem(
    model=model,
    identities=checkpoint['identities'],
    threshold=0.6  # Adjust based on validation
)

# Register all identities
print('📝 Registering identities...')
for identity in train_dataset.identities:
    image_paths = train_dataset.identity_images[identity]
    recognition_system.register_identity(identity, image_paths, val_transform)
    print(f'   ✅ {identity}')

print(f'\n✅ System ready with {len(recognition_system.embeddings_db)} identities')

In [None]:
# Test inference
test_image_path = str(list(DATA_DIR.glob('person_0/*.jpg'))[0])  # First image of person_0

identity, confidence = recognition_system.predict(test_image_path, val_transform)

print(f'\n🎯 Prediction:')
print(f'   Identity: {identity}')
print(f'   Confidence: {confidence:.4f}')
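
The 0.6 threshold is only a starting point. One way to tune it, assuming you have held-out similarity scores for genuine queries (known people) and impostor queries (people outside the gallery), is to sweep candidate thresholds and keep the one with the best balanced accuracy. The helper name `pick_threshold` and the sample scores are invented for illustration; they are not results from this notebook.

```python
from typing import List, Tuple


def pick_threshold(genuine: List[float], impostor: List[float]) -> Tuple[float, float]:
    """Sweep thresholds over observed scores; return (threshold, balanced accuracy).

    A query is accepted as known when similarity >= threshold.
    """
    if not genuine or not impostor:
        raise ValueError('need at least one genuine and one impostor score')
    best_t, best_acc = 0.0, -1.0
    for t in sorted(set(genuine) | set(impostor)):
        accept_rate = sum(s >= t for s in genuine) / len(genuine)    # known faces kept
        reject_rate = sum(s < t for s in impostor) / len(impostor)   # unknown faces rejected
        balanced = 0.5 * (accept_rate + reject_rate)
        if balanced > best_acc:  # ties keep the lowest threshold
            best_t, best_acc = t, balanced
    return best_t, best_acc


# Made-up scores for illustration only
genuine_scores = [0.91, 0.84, 0.78, 0.66, 0.95]
impostor_scores = [0.12, 0.35, 0.58, 0.41, 0.70]
threshold, accuracy = pick_threshold(genuine_scores, impostor_scores)
print(f'threshold={threshold:.2f}, balanced accuracy={accuracy:.2f}')
```

Impostor scores require faces of people who are not among the 10 registered identities, for example the dataset identities that were not selected during preparation.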

## 8. Download Trained Model

In [None]:
# Download model from Colab to your local machine
from google.colab import files

# Download model.pth
files.download('checkpoints/model.pth')

print('\n✅ Model downloaded! You can now use it in your project.')

## 9. Visualization

In [None]:
# Plot training history
fig, axes = plt.subplots(1, 2, figsize=(14, 5))

# Loss
axes[0].plot(history['train_loss'], label='Train Loss')
axes[0].plot(history['val_loss'], label='Val Loss')
axes[0].set_xlabel('Epoch')
axes[0].set_ylabel('Loss')
axes[0].set_title('Training & Validation Loss')
axes[0].legend()
axes[0].grid(True)

# Accuracy
axes[1].plot([a*100 for a in history['val_acc']], label='Val Accuracy')
axes[1].set_xlabel('Epoch')
axes[1].set_ylabel('Accuracy (%)')
axes[1].set_title('Validation Accuracy')
axes[1].legend()
axes[1].grid(True)

plt.tight_layout()
plt.savefig(SAVE_DIR / 'training_history.png', dpi=150)
plt.show()

## 📊 Summary

### Model Architecture
- **Backbone:** ResNet50 (pretrained)
- **Embedding Size:** 128D
- **Loss:** Triplet Loss (margin=0.5)
- **Metric:** Cosine Similarity

### Training Details
- **Known Identities:** 10 persons
- **Images per person:** 20
- **Epochs:** 50
- **Batch Size:** 16

### Inference
- **Threshold:** 0.6 (adjustable)
- **Unknown Detection:** Yes
- **Output:** Identity + Confidence score

### Files Generated
- `checkpoints/model.pth` - Best model weights
- `checkpoints/training_history.png` - Training curves

---

**Ready for deployment!** 🚀

### How to Use Model:
```python
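# Load model (FaceEmbeddingModel, FaceRecognitionSystem, DEVICE and val_transform are defined above)
checkpoint = torch.load('model.pth', map_location=DEVICE)
model = FaceEmbeddingModel(embedding_size=checkpoint['embedding_size'], pretrained=False).to(DEVICE)
model.load_state_dict(checkpoint['model_state_dict'])

# Register identities and predict
recognition_system = FaceRecognitionSystem(model, checkpoint['identities'], threshold=0.6)
# Call recognition_system.register_identity(name, image_paths, val_transform) for each identity first
identity, confidence = recognition_system.predict('test.jpg', val_transform)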
```