# Face Recognition Model Training with Triplet Loss

This notebook walks through training a face recognition model with triplet loss. It covers data preparation, model definition, the loss function, and the training loop.

## 1. Setup and Imports

First, let's import the necessary libraries and set up the environment.


In [None]:
# Import necessary libraries
import os
import random
import torch
import torch.nn as nn
import torch.nn.functional as F  # Provides F.relu, used in the model's forward pass
import torch.optim as optim
from torch.utils.data import DataLoader, Dataset
from torchvision import transforms
from PIL import Image

# Check if GPU is available
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')


## 2. Data Preparation

We will prepare the data by organizing it into triplets: an anchor image, a positive (a different image of the same person), and a negative (an image of a different person).

### 2.1 Organizing the Data

Ensure your data is organized in the following structure:

    /dataset/  
      /person1/
          img1.jpg
          img2.jpg
          ...
      /person2/
          img1.jpg
          img2.jpg
          ...
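
Before sampling triplets, it can help to confirm that the folder layout matches what is expected. The following is a minimal sketch, not part of the original notebook: `count_images_per_person` is a hypothetical helper, and the temporary directory with empty placeholder files stands in for a real dataset.

```python
import os
import tempfile

def count_images_per_person(dataset_path):
    """Return {person_folder_name: number_of_files} for each subfolder."""
    counts = {}
    for entry in os.scandir(dataset_path):
        if entry.is_dir():
            counts[entry.name] = sum(1 for f in os.scandir(entry.path) if f.is_file())
    return counts

# Build a tiny throwaway dataset (empty placeholder files) to demonstrate the check.
with tempfile.TemporaryDirectory() as root:
    for person, n_images in [("person1", 3), ("person2", 1)]:
        os.makedirs(os.path.join(root, person))
        for i in range(n_images):
            open(os.path.join(root, person, f"img{i + 1}.jpg"), "w").close()

    counts = count_images_per_person(root)
    # Only people with two or more images can supply an anchor/positive pair.
    usable_as_anchor = sorted(p for p, n in counts.items() if n >= 2)

print(counts)
print(usable_as_anchor)
```

A person with a single image can still serve as a negative, but cannot supply an anchor/positive pair.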


In [None]:

def create_triplets(dataset_path, num_triplets=1000):
  """
  Create triplets for training from the dataset.

  Parameters:
  - dataset_path (str): Path to the dataset directory.
  - num_triplets (int): Number of triplets to generate.

  Returns:
  - List of triplets (anchor, positive, negative) paths.
  """
  # Map each person folder to its files, ignoring empty folders
  images_by_person = {}
  for folder in os.scandir(dataset_path):
    if folder.is_dir():
      files = sorted(f.path for f in os.scandir(folder.path) if f.is_file())
      if files:
        images_by_person[folder.path] = files

  # Anchors need at least two images so that a distinct positive exists
  anchor_people = [p for p, imgs in images_by_person.items() if len(imgs) >= 2]
  if not anchor_people or len(images_by_person) < 2:
    raise ValueError('Need at least two people, one of them with two or more images.')

  triplets = []
  for _ in range(num_triplets):
    anchor_person = random.choice(anchor_people)
    # Draw two different images of the same person
    anchor, positive = random.sample(images_by_person[anchor_person], 2)

    # Draw the negative from any other person
    negative_person = random.choice([p for p in images_by_person if p != anchor_person])
    negative = random.choice(images_by_person[negative_person])

    triplets.append((anchor, positive, negative))

  return triplets


### 2.2 Dataset Class

Next, we'll create a custom dataset class to load the triplet images and apply necessary transformations.


In [None]:
class TripletDataset(Dataset):
  """
  Custom Dataset class for loading triplet images.
  """

  def __init__(self, triplet_list):
    self.triplet_list = triplet_list
    self.transform = transforms.Compose([
        transforms.Resize((224, 224)),  # Resize images to 224x224
        transforms.ToTensor(),           # Convert images to tensor
        transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])  # Normalize with ImageNet channel statistics
    ])

  def __len__(self):
    return len(self.triplet_list)

  def __getitem__(self, idx):
    anchor_path, positive_path, negative_path = self.triplet_list[idx]
    anchor_image = self.load_image(anchor_path)
    positive_image = self.load_image(positive_path)
    negative_image = self.load_image(negative_path)
    return anchor_image, positive_image, negative_image

  def load_image(self, image_path):
    """
    Load and transform the image from the specified path.
    """
    image = Image.open(image_path).convert('RGB')  # Load image and convert to RGB
    return self.transform(image)  # Apply transformations
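
To make the effect of the transform pipeline concrete, here is a small worked example in plain Python. It mimics what `ToTensor` (scaling to the range [0, 1]) followed by `Normalize` does to a single RGB pixel. The helper `normalize_pixel` and the sample pixel value are illustrative assumptions, not part of the notebook.

```python
# Per-channel statistics copied from the Normalize call in TripletDataset.
MEAN = [0.485, 0.456, 0.406]
STD = [0.229, 0.224, 0.225]

def normalize_pixel(rgb):
    """Mimic ToTensor (scale to [0, 1]) followed by Normalize for one RGB pixel."""
    scaled = [value / 255.0 for value in rgb]
    return [(s - m) / sd for s, m, sd in zip(scaled, MEAN, STD)]

pixel = (255, 0, 128)  # hypothetical example pixel
normalized = normalize_pixel(pixel)
print([round(v, 3) for v in normalized])
```

After normalization the values are roughly centered around zero instead of lying in [0, 255], which tends to make optimization better behaved.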


### 2.3 Creating DataLoader

Now, let's create the triplet list and the DataLoader for training.


In [None]:
# Create triplet data
triplet_list = create_triplets('path_to_your_dataset', num_triplets=1000)

# Create dataset and DataLoader with shuffling
train_dataset = TripletDataset(triplet_list)
train_loader = DataLoader(train_dataset, batch_size=32, shuffle=True)  # Shuffle data


## 3. Model Definition

Now, we'll define our face recognition model. We'll use a simple Convolutional Neural Network (CNN) that maps each image to a 128-dimensional embedding.


In [None]:
class FaceRecognitionModel(nn.Module):
  """
  A simple CNN model for face recognition.
  """

  def __init__(self):
    super(FaceRecognitionModel, self).__init__()
    self.conv1 = nn.Conv2d(3, 64, kernel_size=3, stride=1, padding=1)
    self.pool = nn.MaxPool2d(kernel_size=2, stride=2, padding=0)
    self.fc1 = nn.Linear(64 * 112 * 112, 128)  # 224x224 input -> 64 channels of 112x112 after conv + pool
    self.fc2 = nn.Linear(128, 128)  # Output embedding size

  def forward(self, x):
    x = self.pool(F.relu(self.conv1(x)))
    x = x.view(x.size(0), -1)  # Flatten the tensor
    x = F.relu(self.fc1(x))
    x = self.fc2(x)  # Embedding output
    return x
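
The input size of `fc1` follows from the layer arithmetic, which the sketch below reproduces in plain Python. `conv_output_size` is a hypothetical helper implementing the standard output-size formula for convolution and pooling layers, assuming dilation 1 and floor rounding.

```python
def conv_output_size(size, kernel_size, stride, padding):
    """Output height/width of a Conv2d or MaxPool2d layer (dilation 1, floor rounding)."""
    return (size + 2 * padding - kernel_size) // stride + 1

input_size = 224
after_conv = conv_output_size(input_size, kernel_size=3, stride=1, padding=1)
after_pool = conv_output_size(after_conv, kernel_size=2, stride=2, padding=0)

channels = 64
flattened = channels * after_pool * after_pool
fc1_params = flattened * 128 + 128  # weights plus biases

print(after_conv, after_pool, flattened, fc1_params)
```

With a 224x224 input, `fc1` alone holds over 100 million parameters. If the input resolution changes, the `64 * 112 * 112` term must change with it. Adding more convolution and pooling stages before the flatten step is one common way to shrink this layer.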


## 4. Triplet Loss Function

We will implement the triplet loss function, which pulls the anchor and positive embeddings together and pushes the anchor and negative embeddings apart until the negative is at least `margin` farther from the anchor than the positive.
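
Written out for a batch of $N$ triplets, the loss computed in the cell that follows can be expressed as shown here, where $d$ is the Euclidean distance between two embeddings, $m$ is the margin, and $a_i$, $p_i$, $n_i$ are the anchor, positive, and negative embeddings of the $i$-th triplet:

$$
\mathcal{L} = \frac{1}{N} \sum_{i=1}^{N} \max\bigl(0,\; d(a_i, p_i) - d(a_i, n_i) + m\bigr)
$$

A triplet contributes zero loss once its negative is at least $m$ farther from the anchor than its positive.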


In [None]:
def triplet_loss(anchor, positive, negative, margin=1.0):
  """
  Compute the triplet loss.

  Parameters:
  - anchor: Embedding for anchor image.
  - positive: Embedding for positive image.
  - negative: Embedding for negative image.
  - margin: Margin for triplet loss.

  Returns:
  - Computed loss.
  """
  pos_distance = torch.nn.functional.pairwise_distance(anchor, positive)
  neg_distance = torch.nn.functional.pairwise_distance(anchor, negative)
  loss = torch.mean(torch.clamp(pos_distance - neg_distance + margin, min=0.0))
  return loss
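
As a sanity check, the hand-written loss can be compared with PyTorch's built-in `nn.TripletMarginLoss` on a few hand-made embeddings. This is a sketch under stated assumptions: the toy tensors are illustrative values rather than real model outputs, and `triplet_loss` is repeated here only so the cell runs on its own.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def triplet_loss(anchor, positive, negative, margin=1.0):
    """Same computation as the notebook's triplet_loss, repeated so this cell runs alone."""
    pos_distance = F.pairwise_distance(anchor, positive)
    neg_distance = F.pairwise_distance(anchor, negative)
    return torch.mean(torch.clamp(pos_distance - neg_distance + margin, min=0.0))

# Two hand-made 2-D "embeddings" per role (toy values, not real model outputs).
anchor = torch.tensor([[0.0, 0.0], [0.0, 0.0]])
positive = torch.tensor([[3.0, 4.0], [3.0, 4.0]])   # distance 5 from the anchor
negative = torch.tensor([[6.0, 8.0], [0.0, 1.0]])   # distances 10 and 1

custom = triplet_loss(anchor, positive, negative, margin=1.0)
builtin = nn.TripletMarginLoss(margin=1.0, p=2)(anchor, positive, negative)

# Triplet 1: max(0, 5 - 10 + 1) = 0; triplet 2: max(0, 5 - 1 + 1) = 5; mean is about 2.5
print(custom.item(), builtin.item())
```

The first triplet already satisfies the margin and contributes nothing, while the second has its negative closer than its positive and is penalized. `pairwise_distance` adds a small epsilon for numerical stability, so the printed values may differ very slightly from the hand calculation.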


## 5. Training the Model

We will now implement the training function, which will iterate over the dataset and optimize the model using the triplet loss.


In [None]:
def train_model(model, train_loader, optimizer, num_epochs, device):
  """
  Train the face recognition model.

  Parameters:
  - model: The face recognition model.
  - train_loader: DataLoader for training data.
  - optimizer: Optimizer for training.
  - num_epochs: Number of epochs to train.
  - device: Device to run the model on (CPU or GPU).
  """
  model.train()  # Set the model to training mode
  for epoch in range(num_epochs):
    total_loss = 0.0
    for anchor, positive, negative in train_loader:
      anchor, positive, negative = anchor.to(device), positive.to(device), negative.to(device)

      optimizer.zero_grad()
      anchor_embedding = model(anchor)
      positive_embedding = model(positive)
      negative_embedding = model(negative)

      loss = triplet_loss(anchor_embedding, positive_embedding, negative_embedding)
      loss.backward()
      optimizer.step()

      total_loss += loss.item()

    avg_loss = total_loss / len(train_loader)
    print(f'Epoch [{epoch + 1}/{num_epochs}], Loss: {avg_loss:.4f}')


## 6. Putting It All Together

Now, let's run everything together: we will initialize the model, define the optimizer, and start training.


In [None]:
# Initialize the model and optimizer
model = FaceRecognitionModel().to(device)
optimizer = optim.Adam(model.parameters(), lr=0.001)  # Adjust learning rate as needed

# Set the number of epochs
num_epochs = 20  # Adjust as necessary

# Start training
train_model(model, train_loader, optimizer, num_epochs, device)


## 7. Conclusion

In this notebook, we created a face recognition model using triplet loss. We covered data preparation, model definition, loss calculation, and training. This basic framework can be expanded upon with more complex models, data augmentation, and optimization strategies for improved performance.

### Summary

* Data Preparation: Organized images into triplet format.
* Model Definition: Implemented a CNN for face embeddings.
* Loss Function: Used triplet loss for training.
* Training: Optimized the model with Adam and printed the average triplet loss per epoch.