## Triplet Loss

Triplet Loss is one of the losses used for contrastive learning. To train a model with this loss, the model does not need a final classification layer: the loss works directly on the embeddings $x_i$ that the model produces.

As before, the idea of the loss is to make the embeddings of the same person's faces close under some distance, and the embeddings of different people's faces far apart. The general formula of the loss is:

$$L(e, p, n) = \max\{d(e, p) - d(e, n) + \text{margin},\ 0\},$$

where
- $e$: the embedding of the input face (the model output), i.e. the anchor.
- $p$: the "positive" embedding for the input face, i.e. the embedding of an element that we want to be close to $e$. In our case, $e$ and $p$ must be the network outputs for two different photos of the same person.
- $n$: the "negative" embedding for the input face, i.e. the embedding of an element that we want to be far from $e$. In our case, $e$ and $n$ must be the network outputs for photos of two different people.
- $d(x, y)$: the distance metric used to compare embeddings.
- margin: a hyperparameter that controls how much farther the negative must be than the positive. The loss is zero only when $d(e, n) \ge d(e, p) + \text{margin}$.

**The embeddings $e$, $p$, and $n$ must be normalized before they are passed to the loss function.**
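To make the formula concrete, here is a minimal sketch that L2-normalizes three toy embeddings and evaluates the loss by hand. The helper name `triplet_loss`, the choice of Euclidean distance, and `margin=0.2` are illustrative assumptions rather than part of the assignment.

```python
import torch
import torch.nn.functional as F

def triplet_loss(e, p, n, margin=0.2):
    """max(d(e, p) - d(e, n) + margin, 0) with Euclidean d, averaged over the batch."""
    # Normalize first so that distances depend only on direction, not on vector length.
    e, p, n = (F.normalize(t, p=2, dim=1) for t in (e, p, n))
    d_ep = (e - p).norm(dim=1)
    d_en = (e - n).norm(dim=1)
    return F.relu(d_ep - d_en + margin).mean()

e = torch.tensor([[2.0, 0.0]])
p = torch.tensor([[3.0, 0.0]])   # same direction as e: distance 0 after normalization
n = torch.tensor([[0.0, 5.0]])   # orthogonal to e: distance sqrt(2) after normalization

easy = triplet_loss(e, p, n)     # 0 - sqrt(2) + 0.2 is negative, so the loss is 0
hard = triplet_loss(e, n, p)     # roles swapped: sqrt(2) - 0 + 0.2
```

A triplet that already satisfies the margin contributes nothing to the gradient, which is why the choice of triplets matters so much in practice.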

TripletLoss has many variations. Some have more hyperparameters, others use more than one positive and negative embedding at a time. Some propose a smart way to choose the negative embedding (for example, picking one on which the network currently performs poorly, i.e. one for which it considers $e$ and $n$ close).

An example implementation of TripletLoss can be found [here](https://pytorch.org/docs/stable/generated/torch.nn.TripletMarginWithDistanceLoss.html#torch.nn.TripletMarginWithDistanceLoss).
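A short usage sketch of that PyTorch class may help. The cosine distance function, the margin value, and the random tensors standing in for model outputs are assumptions made for illustration only.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def cosine_distance(x, y):
    # 0 for identical directions, 2 for opposite ones.
    return 1.0 - F.cosine_similarity(x, y, dim=1)

criterion = nn.TripletMarginWithDistanceLoss(
    distance_function=cosine_distance, margin=0.2
)

torch.manual_seed(0)
# Random stand-ins for a batch of 8 anchor / positive / negative embeddings.
anchor = torch.randn(8, 512, requires_grad=True)
positive = torch.randn(8, 512)
negative = torch.randn(8, 512)

loss = criterion(anchor, positive, negative)  # scalar, mean over the batch
loss.backward()                               # gradients flow back to the anchor
```

Cosine distance is scale-invariant, so explicit normalization is not needed with this particular distance function; with the default Euclidean distance it still is.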

Be prepared to tune TripletLoss in order to get good results when training the network.


**What to keep in mind when implementing Triplet Loss**:
- During training we usually want to monitor progress by computing some quality metric. Here there is no classification layer any more, so accuracy cannot be computed directly. You need to come up with a way to compute a quality metric on the validation set during training with Triplet Loss. Think about how this could be done.
- Most likely, training the network with Triplet Loss will require a custom Dataset/DataLoader that returns triplets of elements (anchor, positive, negative).
- Do not forget to normalize the embeddings before computing the loss! You can do this by hand or, for example, add a batchnorm layer without learnable parameters at the end of the network.
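One possible shape for such a triplet-returning dataset is sketched below. The class name `TripletDataset`, its constructor arguments, and the purely random choice of positives and negatives are assumptions for illustration; the notebook itself takes a different route and mines triplets inside balanced batches.

```python
import random
from collections import defaultdict
from torch.utils.data import Dataset

class TripletDataset(Dataset):
    """Wraps a dataset of (item, label) pairs and yields (anchor, positive, negative)."""

    def __init__(self, base, labels, seed=0):
        self.base = base
        self.labels = list(labels)
        self.rng = random.Random(seed)
        self.by_label = defaultdict(list)
        for idx, label in enumerate(self.labels):
            self.by_label[label].append(idx)
        # Only samples whose class has at least two members can act as anchors.
        self.anchors = [i for i, l in enumerate(self.labels)
                        if len(self.by_label[l]) > 1]

    def __len__(self):
        return len(self.anchors)

    def __getitem__(self, i):
        a_idx = self.anchors[i]
        label = self.labels[a_idx]
        # Positive: another sample of the same class; negative: any sample of another class.
        p_idx = self.rng.choice([j for j in self.by_label[label] if j != a_idx])
        n_label = self.rng.choice([l for l in self.by_label if l != label])
        n_idx = self.rng.choice(self.by_label[n_label])
        return self.base[a_idx][0], self.base[p_idx][0], self.base[n_idx][0]

# Toy data: strings stand in for images.
base = [("img0", 0), ("img1", 0), ("img2", 1), ("img3", 1), ("img4", 2)]
triplets = TripletDataset(base, labels=[y for _, y in base])
```

Random triplets quickly become too easy as training progresses, which is the motivation for the mining strategies described in the further reading.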

**Further reading on Triplet Loss**:

- The idea behind TripletLoss: https://en.wikipedia.org/wiki/Triplet_loss
- A good article on batch mining techniques for choosing positive and negative elements: https://omoindrot.github.io/triplet-loss#triplet-mining


In [1]:
import os
import random
import math
import itertools
from collections import defaultdict

import numpy as np
import pandas as pd
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim
from torch.utils.data import Dataset, DataLoader, Sampler, Subset, random_split
import torchvision
import torchvision.transforms as transforms
from torchvision.models import resnet34
from torch.optim.lr_scheduler import CosineAnnealingLR


import matplotlib.pyplot as plt
from PIL import Image, ImageDraw
import cv2
from tqdm.notebook import tqdm 

from defenitions import ArcFaceLoss
from models import get_recognition_model

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
device


device(type='cuda')

In [2]:

class BalancedBatchSampler(Sampler):
    """Yields batches of n_classes identities with n_samples images each (P x K batches)."""
    def __init__(self, dataset, n_classes, n_samples):
        super().__init__(dataset)
        self.labels = [item[1] for item in dataset.samples]
        self.labels_set = list(set(self.labels))
        self.label_to_indices = {label: np.where(np.array(self.labels) == label)[0]
                                 for label in self.labels_set}
        for l in self.labels_set:
            np.random.shuffle(self.label_to_indices[l])
        self.used_label_indices_count = {label: 0 for label in self.labels_set}
        
        self.n_classes = n_classes
        self.n_samples = n_samples
        self.dataset_size = len(dataset.samples)
        self.batch_size = self.n_classes * self.n_samples

    def __iter__(self):
        self.used_label_indices_count = {label: 0 for label in self.labels_set}
        
        num_batches = self.dataset_size // self.batch_size
        for _ in range(num_batches):
            classes = np.random.choice(self.labels_set, self.n_classes, replace=False)
            indices = []
            for class_ in classes:
                indices_for_class = self.label_to_indices[class_]
                start_index = self.used_label_indices_count[class_]
                
                batch_indices = indices_for_class[start_index : start_index + self.n_samples]
                
                if len(batch_indices) < self.n_samples:
                    remaining = self.n_samples - len(batch_indices)
                    batch_indices = np.concatenate([batch_indices, indices_for_class[:remaining]])
                    self.used_label_indices_count[class_] = remaining
                else:
                    self.used_label_indices_count[class_] += self.n_samples
                
                indices.extend(batch_indices)
            yield indices

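    def __len__(self):
        # Number of samples drawn per epoch (not the number of batches);
        # the training loops divide this value by batch_size to get the batch count.
        return self.dataset_size // self.batch_size * self.batch_size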
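The P x K batch layout produced by such a sampler can be illustrated with a small stateless sketch. The helper `sample_pk_batch` is hypothetical and simpler than the class above: it draws indices at random for every batch instead of keeping a per-class cursor.

```python
import numpy as np

def sample_pk_batch(labels, n_classes, n_samples, rng):
    """Pick n_classes identities, then n_samples indices for each of them."""
    labels = np.asarray(labels)
    classes = rng.choice(np.unique(labels), size=n_classes, replace=False)
    batch = []
    for c in classes:
        idx = np.flatnonzero(labels == c)
        # Sample with replacement only when the class has too few images.
        batch.extend(rng.choice(idx, size=n_samples, replace=len(idx) < n_samples))
    return np.array(batch)

rng = np.random.default_rng(0)
labels = np.repeat(np.arange(10), 6)   # 10 identities, 6 images each
batch = sample_pk_batch(labels, n_classes=4, n_samples=3, rng=rng)
counts = np.bincount(labels[batch], minlength=10)   # images per identity in the batch
```

Every selected identity appears exactly `n_samples` times, so each anchor is guaranteed to have positives in its batch, which in-batch triplet mining relies on.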

In [3]:
transform_train = transforms.Compose([
    transforms.RandomHorizontalFlip(p=0.5),
    transforms.ColorJitter(brightness=0.2, contrast=0.2),
    transforms.RandomAffine(degrees=10, translate=(0.05, 0.05), scale=(0.9, 1.1)),
    transforms.Resize((128, 128)),
    transforms.ToTensor(),
])

transform_val = transforms.Compose([
    transforms.Resize((128, 128)),
    transforms.ToTensor(),
])

aligned_root = './data/celeba_aligned_top_200' 
dataset = torchvision.datasets.ImageFolder(root=aligned_root)

train_size = int(0.8 * len(dataset))
val_size = len(dataset) - train_size
train_dataset, val_dataset = random_split(dataset, [train_size, val_size])

class TransformedSubset(Dataset):
    def __init__(self, subset, transform=None):
        self.subset = subset
        self.transform = transform
    
    def __getitem__(self, index):
        x, y = self.subset[index]
        if self.transform:
            x = self.transform(x)
        return x, y
    
    def __len__(self):
        return len(self.subset)

train_dataset_transformed = TransformedSubset(train_dataset, transform=transform_train)
val_dataset_transformed = TransformedSubset(val_dataset, transform=transform_val)

P = 16  # identities (classes) per batch
K = 4   # images per identity

train_samples = [dataset.samples[i] for i in train_dataset.indices]

class DummyDataset:
    def __init__(self, samples):
        self.samples = samples

train_dummy_dataset = DummyDataset(train_samples)

train_sampler = BalancedBatchSampler(train_dummy_dataset, n_classes=P, n_samples=K)

train_loader = DataLoader(train_dataset_transformed, batch_sampler=train_sampler, num_workers=0) 

val_loader = DataLoader(val_dataset_transformed, batch_size=64, shuffle=False, num_workers=0)

print(f"Loaded {len(dataset)} images.")
print(f"Training set: {len(train_dataset)} images.")
print(f"Validation set: {len(val_dataset)} images.")
print(f"Train DataLoader uses BalancedBatchSampler with P={P}, K={K}.")

Loaded 5574 images.
Training set: 4459 images.
Validation set: 1115 images.
Train DataLoader uses BalancedBatchSampler with P=16, K=4.




In [4]:
import torch.nn.functional as F

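def get_hard_triplets(embeddings, labels, margin=0.2):
    """Batch-hard triplet loss: each anchor uses its farthest positive and nearest negative."""
    pairwise_dist = torch.cdist(embeddings, embeddings, p=2)

    same_label = labels.unsqueeze(1) == labels.unsqueeze(0)
    # Positives share the anchor's label; the anchor itself (diagonal) is excluded.
    mask_positive = same_label.clone()
    mask_positive.fill_diagonal_(False)
    # Negatives have a different label, so the zero self-distance can never be selected.
    mask_negative = ~same_label

    if not mask_positive.any():
        # No positive pairs in the batch: return a zero that stays attached to the graph.
        return embeddings.sum() * 0.0

    # Hardest positive: the largest distance among same-label pairs.
    dist_ap, _ = torch.max(pairwise_dist * mask_positive.float(), dim=1)

    # Hardest negative: the smallest distance after pushing all non-negatives far away.
    dist_an, _ = torch.min(pairwise_dist + 1e5 * (~mask_negative).float(), dim=1)

    triplet_loss = F.relu(dist_ap - dist_an + margin)

    # Average only over anchors whose triplet still violates the margin.
    num_non_zero_triplets = (triplet_loss > 1e-16).sum()
    if num_non_zero_triplets == 0:
        return embeddings.sum() * 0.0

    loss = triplet_loss.sum() / num_non_zero_triplets
    return loss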
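Batch-hard mining is easy to check by hand on a few points. The sketch below uses a hypothetical helper `batch_hard_distances` and four 1-D "embeddings" chosen so that every distance can be read off directly. It is a sanity check under those assumptions, not a replacement for the training loss.

```python
import torch

def batch_hard_distances(embeddings, labels):
    """Return (farthest positive distance, nearest negative distance) per anchor."""
    dist = torch.cdist(embeddings, embeddings, p=2)
    same = labels.unsqueeze(0) == labels.unsqueeze(1)
    eye = torch.eye(len(labels), dtype=torch.bool, device=labels.device)
    pos_mask = same & ~eye   # same identity, excluding the anchor itself
    neg_mask = ~same         # different identity
    dist_ap = dist.masked_fill(~pos_mask, float("-inf")).max(dim=1).values
    dist_an = dist.masked_fill(~neg_mask, float("inf")).min(dim=1).values
    return dist_ap, dist_an

# Points at 0 and 1 belong to identity 0; points at 3 and 7 belong to identity 1.
x = torch.tensor([[0.0], [1.0], [3.0], [7.0]])
y = torch.tensor([0, 0, 1, 1])

dist_ap, dist_an = batch_hard_distances(x, y)
per_anchor_loss = torch.relu(dist_ap - dist_an + 0.2)
```

Here only the anchor at 3 violates the margin: its positive at 7 is 4 away, while its nearest negative at 1 is only 2 away. Note that the anchor's zero distance to itself must be masked out of the negative search, otherwise it would always win the minimum.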

### Triplet Loss

In [11]:
from tqdm.notebook import tqdm
import torch
import torch.nn.functional as F
from torch.optim.lr_scheduler import CosineAnnealingLR

def validate_with_knn(model, val_loader, device):
    """
    Validates a model trained with a metric loss using
    k-nearest neighbours with k=1, which is equivalent to Rank-1 accuracy.
    """
    model.eval()
    
    all_embeddings = []
    all_labels = []
    
    with torch.no_grad():
        for imgs, labels in val_loader:
            imgs = imgs.to(device)
            embeddings = model(imgs)
            embeddings = F.normalize(embeddings, p=2, dim=1) 
            
            all_embeddings.append(embeddings.cpu())
            all_labels.append(labels.cpu())
            
    gallery_embeddings = torch.cat(all_embeddings, dim=0)
    gallery_labels = torch.cat(all_labels, dim=0)
    
    if len(gallery_labels) == 0:
        return 0.0

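    # Cosine similarity between all pairs (the embeddings are already L2-normalized).
    similarity_matrix = torch.mm(gallery_embeddings, gallery_embeddings.T)

    # Exclude each sample from being its own nearest neighbour.
    similarity_matrix.fill_diagonal_(float('-inf'))

    _, nearest_indices = torch.max(similarity_matrix, dim=1)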
    
    predicted_labels = gallery_labels[nearest_indices]
    correct_predictions = (predicted_labels == gallery_labels).sum().item()
    
    accuracy = 100.0 * correct_predictions / len(gallery_labels)
    
    return accuracy


EMBEDDING_SIZE = 512
EPOCHS = 40
LR = 3e-5

CHECKPOINT_PATH = "./models/learn_facerec_triplet_only_best.pth"
CHECHPOINT_FROM_ZADANIE_2 = "./models/face_rec_zadanie2.pth"

facerec_model = get_recognition_model(embedding_size=EMBEDDING_SIZE).to(device)
facerec_model.load_state_dict(torch.load(CHECHPOINT_FROM_ZADANIE_2))

optimizer = torch.optim.AdamW(facerec_model.parameters(), lr=LR)
scheduler = CosineAnnealingLR(optimizer, T_max=EPOCHS, eta_min=1e-5) 

best_val_accuracy = 0.0

print("Starting training with Triplet Loss only...")

for epoch in range(EPOCHS):
    facerec_model.train()
    
    total_train_loss = 0
    
    num_batches = len(train_sampler) // train_sampler.batch_size
    progress_bar = tqdm(train_loader, total=num_batches, desc=f"Epoch {epoch+1}/{EPOCHS}")
    
    for imgs, labels in progress_bar:
        imgs = imgs.to(device)
        labels = labels.to(device)
        
        optimizer.zero_grad()
        
        embeddings = facerec_model(imgs)
        embeddings = F.normalize(embeddings, p=2, dim=1)
        
        loss = get_hard_triplets(embeddings, labels)
        
        loss.backward()
        optimizer.step()
        
        total_train_loss += loss.item()
        
        progress_bar.set_postfix(loss=loss.item())

    scheduler.step()
    
    avg_train_loss = total_train_loss / num_batches
    
    val_accuracy = validate_with_knn(facerec_model, val_loader, device)
    
    print(f"\nEpoch {epoch+1} done. Avg Triplet Train Loss: {avg_train_loss:.4f}")
    print(f"Validation k-NN Accuracy (Rank-1): {val_accuracy:.2f}%")
    
    if val_accuracy > best_val_accuracy:
        best_val_accuracy = val_accuracy
        torch.save(facerec_model.state_dict(), CHECKPOINT_PATH)
        print(f"model saved, Accuracy: {best_val_accuracy:.2f}% at epoch {epoch+1}")


Starting training with Triplet Loss only...

Epoch 1 done. Avg Triplet Train Loss: 1.5493
Validation k-NN Accuracy (Rank-1): 28.07%
model saved, Accuracy: 28.07% at epoch 1

Epoch 2 done. Avg Triplet Train Loss: 1.1924
Validation k-NN Accuracy (Rank-1): 25.92%

Epoch 3 done. Avg Triplet Train Loss: 0.8171
Validation k-NN Accuracy (Rank-1): 23.95%

Epoch 4 done. Avg Triplet Train Loss: 0.5778
Validation k-NN Accuracy (Rank-1): 21.70%

Epoch 5 done. Avg Triplet Train Loss: 0.4769
Validation k-NN Accuracy (Rank-1): 20.81%

Epoch 6 done. Avg Triplet Train Loss: 0.4356
Validation k-NN Accuracy (Rank-1): 21.35%

Epoch 7 done. Avg Triplet Train Loss: 0.4115
Validation k-NN Accuracy (Rank-1): 20.81%

Epoch 8 done. Avg Triplet Train Loss: 0.3957
Validation k-NN Accuracy (Rank-1): 21.35%

Epoch 9 done. Avg Triplet Train Loss: 0.3829
Validation k-NN Accuracy (Rank-1): 21.43%

Epoch 10 done. Avg Triplet Train Loss: 0.3722
Validation k-NN Accuracy (Rank-1): 20.99%

Epoch 11 done. Avg Triplet Train Loss: 0.3637
Validation k-NN Accuracy (Rank-1): 21.52%

Epoch 12 done. Avg Triplet Train Loss: 0.3563
Validation k-NN Accuracy (Rank-1): 21.17%

Epoch 13 done. Avg Triplet Train Loss: 0.3496
Validation k-NN Accuracy (Rank-1): 21.52%

Epoch 14 done. Avg Triplet Train Loss: 0.3438
Validation k-NN Accuracy (Rank-1): 21.17%

Epoch 15 done. Avg Triplet Train Loss: 0.3386
Validation k-NN Accuracy (Rank-1): 22.15%

Epoch 16 done. Avg Triplet Train Loss: 0.3340
Validation k-NN Accuracy (Rank-1): 22.06%

Epoch 17 done. Avg Triplet Train Loss: 0.3298
Validation k-NN Accuracy (Rank-1): 22.42%

Epoch 18 done. Avg Triplet Train Loss: 0.3260
Validation k-NN Accuracy (Rank-1): 21.88%

Epoch 19 done. Avg Triplet Train Loss: 0.3227
Validation k-NN Accuracy (Rank-1): 21.88%

Epoch 20 done. Avg Triplet Train Loss: 0.3194
Validation k-NN Accuracy (Rank-1): 21.79%

Epoch 21 done. Avg Triplet Train Loss: 0.3161
Validation k-NN Accuracy (Rank-1): 21.79%

Epoch 22 done. Avg Triplet Train Loss: 0.3132
Validation k-NN Accuracy (Rank-1): 21.79%

Epoch 23 done. Avg Triplet Train Loss: 0.3111
Validation k-NN Accuracy (Rank-1): 22.15%

Epoch 24 done. Avg Triplet Train Loss: 0.3087
Validation k-NN Accuracy (Rank-1): 21.70%

Epoch 25 done. Avg Triplet Train Loss: 0.3065
Validation k-NN Accuracy (Rank-1): 21.97%

Epoch 26 done. Avg Triplet Train Loss: 0.3045
Validation k-NN Accuracy (Rank-1): 21.79%

Epoch 27 done. Avg Triplet Train Loss: 0.3025
Validation k-NN Accuracy (Rank-1): 21.61%

Epoch 28 done. Avg Triplet Train Loss: 0.3009
Validation k-NN Accuracy (Rank-1): 21.52%

Epoch 29 done. Avg Triplet Train Loss: 0.2993
Validation k-NN Accuracy (Rank-1): 21.17%

Epoch 30 done. Avg Triplet Train Loss: 0.2974
Validation k-NN Accuracy (Rank-1): 20.99%

Epoch 31 done. Avg Triplet Train Loss: 0.2961
Validation k-NN Accuracy (Rank-1): 21.35%

Epoch 32 done. Avg Triplet Train Loss: 0.2946
Validation k-NN Accuracy (Rank-1): 20.72%

Epoch 33 done. Avg Triplet Train Loss: 0.2935
Validation k-NN Accuracy (Rank-1): 21.35%

Epoch 34 done. Avg Triplet Train Loss: 0.2922
Validation k-NN Accuracy (Rank-1): 20.63%

Epoch 35 done. Avg Triplet Train Loss: 0.2910
Validation k-NN Accuracy (Rank-1): 20.81%

Epoch 36 done. Avg Triplet Train Loss: 0.2899
Validation k-NN Accuracy (Rank-1): 20.27%

Epoch 37 done. Avg Triplet Train Loss: 0.2888
Validation k-NN Accuracy (Rank-1): 20.63%

Epoch 38 done. Avg Triplet Train Loss: 0.2877
Validation k-NN Accuracy (Rank-1): 20.54%

Epoch 39 done. Avg Triplet Train Loss: 0.2867
Validation k-NN Accuracy (Rank-1): 21.35%

Epoch 40 done. Avg Triplet Train Loss: 0.2856
Validation k-NN Accuracy (Rank-1): 21.52%

### ArcFace + Triplet Loss

In [7]:
import itertools
from tqdm.notebook import tqdm

N_PEOPLE_IN_DATASET = len(dataset.classes) 
EMBEDDING_SIZE = 512
EPOCHS = 20
LR = 3e-4
TRIPLET_WEIGHT = 5.0 

facerec_model = get_recognition_model(embedding_size=EMBEDDING_SIZE).to(device)

arcface_loss_fn = ArcFaceLoss(num_classes=N_PEOPLE_IN_DATASET, embedding_size=EMBEDDING_SIZE, margin=0.3).to(device)

optimizer = torch.optim.AdamW(
    itertools.chain(facerec_model.parameters(), arcface_loss_fn.parameters()), 
    lr=LR
)
scheduler = CosineAnnealingLR(optimizer, T_max=EPOCHS, eta_min=1e-5) 

best_val_accuracy = 0.0

for epoch in range(EPOCHS):
    facerec_model.train()
    arcface_loss_fn.train()
    
    total_train_loss = 0
    total_arcface_loss = 0
    total_triplet_loss = 0
    
    num_batches = len(train_sampler) // train_sampler.batch_size
    progress_bar = tqdm(train_loader, total=num_batches, desc=f"Epoch {epoch+1}/{EPOCHS}")
    
    for imgs, labels in progress_bar:
        imgs = imgs.to(device)
        labels = labels.to(device)
        
        optimizer.zero_grad()
        
        embeddings = facerec_model(imgs)
        embeddings = F.normalize(embeddings, p=2, dim=1)
        
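        # ArcFace supervises class separation; the batch-hard triplet term
        # is added on top with a fixed weight.
        loss_arcface = arcface_loss_fn(embeddings, labels)
        loss_triplet = get_hard_triplets(embeddings, labels)

        total_loss = loss_arcface + TRIPLET_WEIGHT * loss_triplet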
        
        total_loss.backward()
        optimizer.step()
        
        total_train_loss += total_loss.item()
        total_arcface_loss += loss_arcface.item()
        total_triplet_loss += loss_triplet.item()

    scheduler.step()
    
    avg_train_loss = total_train_loss / num_batches
    avg_arcface_loss = total_arcface_loss / num_batches
    avg_triplet_loss = total_triplet_loss / num_batches
    
    print(f"\nEpoch {epoch+1} done. Avg Train Loss: {avg_train_loss:.4f} "
          f"(ArcFace: {avg_arcface_loss:.4f}, Triplet: {avg_triplet_loss:.4f})")
    
    facerec_model.eval()
    arcface_loss_fn.eval()
    val_correct = 0
    val_total = 0
    with torch.no_grad():
        for imgs, labels in val_loader:
            imgs, labels = imgs.to(device), labels.to(device)
            embeddings = F.normalize(facerec_model(imgs), p=2, dim=1)
            W_norm = F.normalize(arcface_loss_fn.W, p=2, dim=1)
            logits = torch.mm(embeddings, W_norm.T) * arcface_loss_fn.s
            _, predicted = torch.max(logits, 1)
            val_total += labels.size(0)
            val_correct += (predicted == labels).sum().item()
            
    val_accuracy = 100 * val_correct / val_total
    print(f"Validation Accuracy: {val_accuracy:.2f}%")
    
    if val_accuracy > best_val_accuracy:
        best_val_accuracy = val_accuracy
        torch.save(facerec_model.state_dict(), './models/learn_facerec_triplet_arc.pth')
        torch.save(arcface_loss_fn.state_dict(), './models/learn_lossfn_triplet_arc.pth')
        print(f"model saved, Accuracy: {best_val_accuracy:.2f}% at epoch {epoch+1}")


Epoch 1 done. Avg Train Loss: 29.2352 (ArcFace: 25.4927, Triplet: 0.7485)
Validation Accuracy: 1.88%
model saved, Accuracy: 1.88% at epoch 1

Epoch 2 done. Avg Train Loss: 26.6383 (ArcFace: 24.2469, Triplet: 0.4783)
Validation Accuracy: 3.41%
model saved, Accuracy: 3.41% at epoch 2

Epoch 3 done. Avg Train Loss: 26.0565 (ArcFace: 23.7402, Triplet: 0.4633)
Validation Accuracy: 8.52%
model saved, Accuracy: 8.52% at epoch 3

Epoch 4 done. Avg Train Loss: 25.6462 (ArcFace: 23.3022, Triplet: 0.4688)
Validation Accuracy: 16.95%
model saved, Accuracy: 16.95% at epoch 4

Epoch 5 done. Avg Train Loss: 25.0800 (ArcFace: 22.6765, Triplet: 0.4807)
Validation Accuracy: 19.64%
model saved, Accuracy: 19.64% at epoch 5

Epoch 6 done. Avg Train Loss: 24.5485 (ArcFace: 22.0638, Triplet: 0.4969)
Validation Accuracy: 20.09%
model saved, Accuracy: 20.09% at epoch 6

Epoch 7 done. Avg Train Loss: 24.1210 (ArcFace: 21.5805, Triplet: 0.5081)
Validation Accuracy: 17.49%

Epoch 8 done. Avg Train Loss: 23.6928 (ArcFace: 21.0935, Triplet: 0.5199)
Validation Accuracy: 25.02%
model saved, Accuracy: 25.02% at epoch 8

Epoch 9 done. Avg Train Loss: 22.7759 (ArcFace: 20.0601, Triplet: 0.5432)
Validation Accuracy: 31.48%
model saved, Accuracy: 31.48% at epoch 9

Epoch 10 done. Avg Train Loss: 22.0330 (ArcFace: 19.2282, Triplet: 0.5610)
Validation Accuracy: 40.90%
model saved, Accuracy: 40.90% at epoch 10

Epoch 11 done. Avg Train Loss: 21.7650 (ArcFace: 18.8618, Triplet: 0.5807)
Validation Accuracy: 49.51%
model saved, Accuracy: 49.51% at epoch 11

Epoch 12 done. Avg Train Loss: 20.3880 (ArcFace: 17.3362, Triplet: 0.6104)
Validation Accuracy: 53.27%
model saved, Accuracy: 53.27% at epoch 12

Epoch 13 done. Avg Train Loss: 18.6354 (ArcFace: 15.4206, Triplet: 0.6429)
Validation Accuracy: 58.03%
model saved, Accuracy: 58.03% at epoch 13

Epoch 14 done. Avg Train Loss: 17.5083 (ArcFace: 14.1197, Triplet: 0.6777)
Validation Accuracy: 60.54%
model saved, Accuracy: 60.54% at epoch 14

Epoch 15 done. Avg Train Loss: 16.9895 (ArcFace: 13.5075, Triplet: 0.6964)
Validation Accuracy: 65.65%
model saved, Accuracy: 65.65% at epoch 15

Epoch 16 done. Avg Train Loss: 15.7178 (ArcFace: 12.1034, Triplet: 0.7229)
Validation Accuracy: 69.87%
model saved, Accuracy: 69.87% at epoch 16

Epoch 17 done. Avg Train Loss: 15.2444 (ArcFace: 11.5638, Triplet: 0.7361)
Validation Accuracy: 69.15%

Epoch 18 done. Avg Train Loss: 14.5760 (ArcFace: 10.8141, Triplet: 0.7524)
Validation Accuracy: 71.30%
model saved, Accuracy: 71.30% at epoch 18

Epoch 19 done. Avg Train Loss: 14.0741 (ArcFace: 10.2798, Triplet: 0.7589)
Validation Accuracy: 71.84%
model saved, Accuracy: 71.84% at epoch 19

Epoch 20 done. Avg Train Loss: 13.6458 (ArcFace: 9.8112, Triplet: 0.7669)
Validation Accuracy: 72.02%
model saved, Accuracy: 72.02% at epoch 20
