# Colab Tutorial
For a quick tutorial on how to use Google Colab, see:
https://towardsdatascience.com/getting-started-with-google-colab-f2fff97f594c

# Introduction
This notebook contains all the necessary code and imports to train the models presented in the thesis. 


In order to train a model, the following files need to be uploaded:

*   model.py
*   pit.py
*   cait.py
*   dataset.py
*   loss_functions.py
*   train.py
*   utils.py
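
Before running the later cells, it can help to confirm that every file in the list above is present on the instance. The following is a minimal sketch and not part of the original notebook; the helper name `missing_files` and the choice of directory are assumptions made for illustration.

```python
from pathlib import Path

# File names taken from the list above.
REQUIRED_FILES = [
    "model.py", "pit.py", "cait.py", "dataset.py",
    "loss_functions.py", "train.py", "utils.py",
]


def missing_files(directory=".", required=REQUIRED_FILES):
    """Return the required file names that are not found in `directory`."""
    root = Path(directory)
    return [name for name in required if not (root / name).is_file()]
```

Calling `missing_files("/content/visual_loop_closure_detection")` would return an empty list when everything has been uploaded, and the names of the absent files otherwise.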


To download the currently available SPED dataset see:
https://www.dropbox.com/s/aklu4tz3hurycj0/SPED_900.zip?dl=0

To get access to and download the Mapillary SLS dataset, permission can be requested at https://www.mapillary.com/dataset/places

After downloading these, they can be uploaded to Google Drive and then unpacked on a Colab instance.
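
The unpacking step can also be done from Python instead of the shell. This is a minimal sketch using only the standard library; the function name `extract_archives` and the example Drive path are assumptions, so adjust them to wherever the archives were actually uploaded.

```python
import zipfile
from pathlib import Path


def extract_archives(archive_paths, destination="."):
    """Extract each zip archive into `destination`; return the extracted member names."""
    destination = Path(destination)
    destination.mkdir(parents=True, exist_ok=True)
    extracted = []
    for archive in archive_paths:
        with zipfile.ZipFile(archive) as zf:
            zf.extractall(destination)
            extracted.extend(zf.namelist())
    return extracted


# Example call (hypothetical Drive location of the SPED archive):
# extract_archives(["/content/gdrive/MyDrive/SPED_900.zip"], destination="/content")
```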


After uploading the necessary datasets to the instance, the configuration dictionaries defined throughout the notebook can be modified to try different training schemes.





In [1]:
!nvidia-smi

Thu Jun 10 01:58:37 2021       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 465.27       Driver Version: 460.32.03    CUDA Version: 11.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|   0  Tesla P100-PCIE...  Off  | 00000000:00:04.0 Off |                    0 |
| N/A   47C    P0    30W / 250W |      0MiB / 16280MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Proces

In [2]:
# mount drive

from google.colab import drive
drive.mount('/content/gdrive')

Drive already mounted at /content/gdrive; to attempt to forcibly remount, call drive.mount("/content/gdrive", force_remount=True).


In [3]:
# !git clone https://github.com/oyvowm/visual_loop_closure_detection
# %cd visual_loop_closure_detection/

# # unzip the dataset
# import time
# import zipfile
# !unzip -q /content/gdrive/MyDrive/Master/MSLS/images_vol_1.zip

# time.sleep(5)
# !unzip -q /content/gdrive/MyDrive/Master/MSLS/images_vol_2.zip

# time.sleep(5)
# !unzip -q /content/gdrive/MyDrive/Master/MSLS/images_vol_3.zip

# time.sleep(5)
# !unzip -q /content/gdrive/MyDrive/Master/MSLS/images_vol_4.zip

# time.sleep(5)
# !unzip -q /content/gdrive/MyDrive/Master/MSLS/images_vol_5.zip

# time.sleep(5)
# !unzip -q /content/gdrive/MyDrive/Master/MSLS/images_vol_6.zip

# !unzip -q /content/gdrive/MyDrive/Master/MSLS/metadata.zip
# !unzip -q /content/gdrive/MyDrive/Master/MSLS/patch_v1.1.zip

In [4]:
%cd visual_loop_closure_detection/

/content/visual_loop_closure_detection


In [5]:
pip install -r /content/visual_loop_closure_detection/requirements.txt

Collecting git+https://github.com/rwightman/pytorch-image-models.git (from -r /content/visual_loop_closure_detection/requirements.txt (line 1))
  Cloning https://github.com/rwightman/pytorch-image-models.git to /tmp/pip-req-build-ubnlb96f
  Running command git clone -q https://github.com/rwightman/pytorch-image-models.git /tmp/pip-req-build-ubnlb96f
Building wheels for collected packages: timm
  Building wheel for timm (setup.py) ... done
  Created wheel for timm: filename=timm-0.4.11-cp37-none-any.whl size=372674 sha256=31440fc4b7e2963648065a991b1e49202dd221153091c3bac77761c6c1c1ff8b
  Stored in directory: /tmp/pip-ephem-wheel-cache-9k3pd6yk/wheels/20/b8/27/66bb141495c14daa67474754678277959ca333a352dab313a5
Successfully built timm


In [6]:
%cd visual_loop_closure_detection/

import torch
import torch.nn as nn
import datetime
import math  # used explicitly for the exponential learning rate decay below
import numpy as np
import time
import torch.backends.cudnn as cudnn
import json
import os
import torch.nn.functional as F
from torch.autograd import Variable
import torch.utils.data as data
from torch.utils.tensorboard import SummaryWriter
from torchvision.utils import make_grid
import torchvision.datasets as datasets
import torchvision
from torchvision import transforms
from PIL import Image
from torch.utils.data import Dataset
from matplotlib.pyplot import imshow
import matplotlib.pyplot as plt
import matplotlib
from timm.scheduler import create_scheduler
from timm.optim import create_optimizer_v2
from timm.data import RandomResizedCropAndInterpolation
from timm.data.random_erasing import RandomErasing  
from timm.data.mixup import *
from timm.data.auto_augment import *
from timm.loss import SoftTargetCrossEntropy, LabelSmoothingCrossEntropy
from tqdm import tqdm
from utils import *
from model import VisionTransformer, DistilledVisionTransformer, ArcFaceDeit
from pit import PoolingTransformer, DistilledPoolingTransformer
from cait import cait_models, cait_models_twoQ
from train import *
from datasets import *
from loss_functions import *


[Errno 2] No such file or directory: 'visual_loop_closure_detection/'
/content/visual_loop_closure_detection


In [7]:
training_config = {
            'continue_training': False,
            'original_model_path': '',
            'dataset_path': 'train_val',
            'learning_rate': 3e-5,
            'scheduler': 'exponential',
            'epochs': 20,
            'steps': 40, # steps until first learning rate restart for CosineAnnealing scheduler
            'mult': 5,   # gets multiplied with the steps variable after each restart
            'mixup': False,
            'save_model_name': '224x224_0.2margin_20epoch_faktisknormalize_triplemkkt_pit.pth', 
            'distilled': True, 
            'arcface': False, # when True, a distilled pre-trained model is used and 'distilled' should be set to False
            'save_frequency': 1, # frequency to save model
            'model': 'vit',
            'two_outputs': True, # needs to be True for vit and pit models when using triplet loss.
            'subsets_per_epoch': 5, # defines how many subsets constitute one epoch.
}

# in order to use pit, "pit_s_distill_819.pth" needs to be downloaded from: https://github.com/naver-ai/pit
# and then uploaded here so that the 'original_model_path' key matches its path.
if training_config['model'] == 'pit':
    training_config['original_model_path'] = '/content/pit_s_distill_819.pth'

# set random seeds
torch.manual_seed(0)
torch.cuda.manual_seed_all(0)
np.random.seed(0)


cait_config = {
        "img_size": (224,224),
        "patch_size": 16,
        "num_classes": 900,
        "embed_dim": 288,
        "depth": 24,
        "num_heads": 6,
        "qkv_bias": True,
        #"drop_path_rate": 0.1,
        "init_scale": 1e-5,
        "depth_token_only": 2,
        "triplet": True,
}



vit_config = {
        "img_size": (224,224),
        "patch_size": 16,
        "in_chans": 3,
        "num_classes": 900,
        "embed_dim": 384,
        "triplet": True,
        "depth": 12,
        "num_heads": 6,
        "hidden_mult": 4,
        "qkv_bias": True,
        #"drop_path": 0.1,
        "embed_fn": 'vit',     
}

pit_config = {
        "img_size": (224,224),
        "patch_size": 16,
        "stride": 8,
        "base_dims": [48, 48, 48],
        "depth": [2, 6, 4],
        "heads": [3, 6, 12],
        "mlp_ratio": 4,
        "triplet": True,
}

if training_config['model'] == 'vit':
    model_config = vit_config
    if training_config['distilled']:
        model = DistilledVisionTransformer(**model_config)
    elif training_config['arcface']:
        model = ArcFaceDeit(**model_config)
    else:
        model = VisionTransformer(**model_config)
    pre_trained_size = 224

elif training_config['model'] == 'cait':
    pre_trained_size = 384 # cait XS is pre-trained on 384x384 images. 
    model_config = cait_config
    if training_config['two_outputs']:
        model = cait_models_twoQ(**model_config)
    else:
        model = cait_models(**model_config)



else:
    model_config = pit_config
    if training_config['distilled']:
        model = DistilledPoolingTransformer(**model_config)
    else:
        model = PoolingTransformer(**model_config)
    pre_trained_size = 224
continue_training = training_config['continue_training']


device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')  # is_available must be called, not just referenced
torch.cuda.current_device()

torch.cuda.get_device_name(0)



'Tesla P100-PCIE-16GB'
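
The comments in `training_config` describe a few combinations of settings that should not be mixed. As a rough illustration, they can be turned into an explicit check. The function `check_training_config` is hypothetical and not part of the repository; it only encodes what the comments and the model-selection branches above state.

```python
def check_training_config(config, triplet=True):
    """Return a list of warnings for setting combinations the config comments advise against."""
    problems = []
    model_name = config.get("model")

    # The selection code treats every name other than 'vit' and 'cait' as PiT.
    if model_name not in ("vit", "pit", "cait"):
        problems.append(f"unknown model {model_name!r}; the notebook would fall back to PiT")

    # 'arcface' uses a distilled pre-trained model, so 'distilled' should be False.
    if config.get("arcface") and config.get("distilled"):
        problems.append("'arcface' is True, so 'distilled' should be set to False")

    # vit and pit need two outputs when training with triplet loss.
    if triplet and model_name in ("vit", "pit") and not config.get("two_outputs"):
        problems.append("'two_outputs' should be True for vit and pit with triplet loss")

    return problems
```

For the configuration shown above, `check_training_config(training_config, vit_config["triplet"])` would return an empty list.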

In [8]:
##### loading pre-trained model if initiating training: 


if continue_training == False:
    print('Initializing parameters from pre-trained model...')
    if training_config['model'] == 'vit':
        checkpoint = torch.hub.load_state_dict_from_url(
            url="https://dl.fbaipublicfiles.com/deit/deit_small_distilled_patch16_224-649709d9.pth",
            map_location="cpu", check_hash=True
        )
        checkpoint_model = checkpoint['model']
        # if a convolutional backbone is used:
        if model_config['embed_fn'] == 'convolution':
            conv_model = torch.load('/content/lvvit_s-224-83.3.pth.tar')

            # extract patch_embed keys from conv backbone model
            patch_embed_keys = []
            for k in conv_model:
              patch_embed_keys.append(k)
            for k in patch_embed_keys:
              if k[:5] != 'patch':
                conv_model.pop(k)
            
            # remove patch_embed keys from checkpoint model
            patch_embed_keys = []
            for k in checkpoint_model:
              if k[:5] == 'patch':
                patch_embed_keys.append(k)
            for k in patch_embed_keys:
              checkpoint_model.pop(k)
            
            # merge the two state_dicts
            for k in conv_model:
              checkpoint_model[k] = conv_model[k]

    elif training_config['model'] == 'cait':
        checkpoint = torch.hub.load_state_dict_from_url(
            url="https://dl.fbaipublicfiles.com/deit/XS24_384.pth",
            map_location="cpu", check_hash=True
        )
        
        checkpoint_model = {}
        for k in model.state_dict().keys():
            # if k == ('extra_token' or 'q2'): 
            # #    # initializing extra token with the same parameters as the cls token
            #     #checkpoint_model[k] = checkpoint["model"]['module.'+'cls_token']
            #     print(f'skipping {k}')
            try:
                checkpoint_model[k] = checkpoint["model"]['module.'+k]
            except KeyError:  # key is missing from the pre-trained checkpoint
                print(f'skipping {k}')
            
            


    else: # pit
        checkpoint_model = torch.load(training_config['original_model_path'], map_location='cpu')
    state_dict = model.state_dict()      # the state_dict of the new model
    if model_config['img_size'] != (224, 224) or pre_trained_size == 384: # and model_config['img_size'] != (384,384):
        print('>>> model pre-trained with different image size, interpolating position embeddings...')
        """ 
        taken from https://github.com/facebookresearch/deit/blob/ab5715372db8c6cad5740714b2216d55aeae052e/main.py
        modified to allow non-square images
        interpolate position embedding, used when diverging from the 224*224 input sizes
        """
        pos_embed_checkpoint = checkpoint_model['pos_embed']
        embedding_size = pos_embed_checkpoint.shape[-1]  # embedding size of pretrained models positional embedding
        num_patches = model.patch_embed.num_patches   # number of patches in the new model
        new_size_h = int(model_config['img_size'][0] / model_config['patch_size'])
        new_size_w = int(model_config['img_size'][1] / model_config['patch_size'])

        if training_config['model'] == 'cait': # in cait the extra tokens are not affected by positional embeddings
            orig_size = int((pos_embed_checkpoint.shape[-2]) ** 0.5) # num_patches along H,W in old model
            pos_tokens = pos_embed_checkpoint.reshape(-1, orig_size, orig_size, embedding_size).permute(0, 3, 1, 2) # (1, embed_dim, orig, orig)

            pos_tokens = torch.nn.functional.interpolate(
                pos_tokens, size=(new_size_h, new_size_w), mode='bicubic', align_corners=False) # interpolate into (1, embed_dim, new, new)
            new_pos_embed = pos_tokens.permute(0, 2, 3, 1).flatten(1, 2) # reshape into (1, num_patches, embed_dim)
            
        else:
            num_extra_tokens = model.pos_embed.shape[-2] - num_patches # extra tokens: cls_token, plus dist_token if distillation is used.
           
            # height (== width) for the checkpoint position embedding
            orig_size = int((pos_embed_checkpoint.shape[-2] - num_extra_tokens) ** 0.5) # num_patches along H,W in old model

            
            # class_token and dist_token are kept unchanged
            extra_tokens = pos_embed_checkpoint[:, :num_extra_tokens] 
            # only the position tokens associated with the image patch embeddings are interpolated
            pos_tokens = pos_embed_checkpoint[:, num_extra_tokens:]
            pos_tokens = pos_tokens.reshape(-1, orig_size, orig_size, embedding_size).permute(0, 3, 1, 2) # (1, embed_dim, orig, orig)

            pos_tokens = torch.nn.functional.interpolate(
                pos_tokens, size=(new_size_h, new_size_w), mode='bicubic', align_corners=False) # interpolate into (1, embed_dim, new, new)
            pos_tokens = pos_tokens.permute(0, 2, 3, 1).flatten(1, 2) # reshape into (1, num_patches, embed_dim)
            new_pos_embed = torch.cat((extra_tokens, pos_tokens), dim=1)

        checkpoint_model['pos_embed'] = new_pos_embed


    if not model_config['triplet']:
        checkpoint_model.pop('head.weight')
        checkpoint_model.pop('head.bias')
        if training_config['distilled'] and training_config['model'] != 'cait':
            checkpoint_model.pop('head_dist.weight')
            checkpoint_model.pop('head_dist.bias')
    model.load_state_dict(checkpoint_model, strict=False)

    model.to(device)



Initializing parameters from pre-trained model...


Downloading: "https://dl.fbaipublicfiles.com/deit/XS24_384.pth" to /root/.cache/torch/hub/checkpoints/XS24_384.pth




>>> model pre-trained with different image size, interpolating position embeddings...
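
The interpolation in the cell above resizes the checkpoint's grid of position embeddings to the grid implied by the new image size. The shape arithmetic can be sketched without PyTorch as follows. The helper names `patch_grid` and `checkpoint_grid` are assumptions made for illustration; the token counts follow from the 384x384 pre-training size and the 16x16 patch size given in the configs.

```python
import math


def patch_grid(img_size, patch_size):
    """Number of patches along height and width for an image and patch size."""
    height, width = img_size
    return height // patch_size, width // patch_size


def checkpoint_grid(num_pos_tokens, num_extra_tokens=0):
    """Side length of the square patch grid stored in a checkpoint's position embedding."""
    num_patches = num_pos_tokens - num_extra_tokens
    side = math.isqrt(num_patches)
    if side * side != num_patches:
        raise ValueError("position tokens do not form a square grid")
    return side


# CaiT XS24 pre-trained at 384x384 with 16x16 patches: 24 * 24 = 576 position tokens,
# and no extra tokens are part of the position embedding.
old_side = checkpoint_grid(576)
# The cait_config above uses 224x224 inputs.
new_h, new_w = patch_grid((224, 224), 16)
print(f"interpolate {old_side}x{old_side} -> {new_h}x{new_w}")
```

For a distilled DeiT checkpoint, the two extra tokens (class and distillation) would be excluded first, for example `checkpoint_grid(198, num_extra_tokens=2)`.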


In [9]:
##### LOAD DATA #####

# uncomment if using RandErasing, leads to unknown CUDA error for MSLS...
#torch.multiprocessing.set_start_method('spawn')

# Define transformations
validation_transform = transforms.Compose([
                                        transforms.Resize(model_config["img_size"]),
                                        transforms.ToTensor(),
                                        transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
                                        ])
training_transform = transforms.Compose([
                                        transforms.Resize((256,256)),
                                        #transforms.Resize(model_config["img_size"]),
                                        
                                        # As in https://github.com/rwightman/pytorch-image-models/blob/b4ebf9263e1c09a928b7d68f3011f7fff040ea5e/timm/data/transforms_factory.py
                                        # first do a resize + crop then horizontal flip and then additional rand_augments
                                        #RandomResizedCropAndInterpolation((224, 224), scale=(0.9,1), interpolation='bicubic'), 
                                        transforms.RandomCrop(model_config["img_size"]),
                                        transforms.RandomHorizontalFlip(0.5),
                                        rand_augment_transform('rand-m9-mstd0.5-inc1', {}),                     
                                        transforms.ToTensor(),
                                        transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
                                        #RandomErasing(0.25, mode='pixel', max_count=1),   
                                        ])
if model_config["triplet"]:
    print('triplet')
    # create initial dataset
    # positives are defined within a radius of posDistThr metres
    posDistThr = 5

    # negatives are defined outside a radius of negDistThr metres
    negDistThr = 15

    # number of negatives per triplet
    nNeg = 5

    # number of cached queries
    cached_queries = 1000

    # number of cached negatives
    cached_negatives = 5000

    # whether to use positive sampling
    positive_sampling = True

    # num positives
    if training_config['two_outputs']:
        num_positives = 2
    else:
        num_positives = 1

    # choose task to test on [im2im, seq2im, im2seq, seq2seq]
    task = 'im2im'
    # training_config['dataset_path']
    train_dataset = MSLS(root_dir = '', cities = '', transform = training_transform, mode = 'train', task = task,
                        negDistThr = negDistThr, posDistThr = posDistThr, nNeg = nNeg, cached_queries = cached_queries,
                        cached_negatives = cached_negatives, positive_sampling = positive_sampling,
                        num_positives = num_positives)
    
    
else:
    # create initial dataset
    dataset = datasets.ImageFolder(training_config['dataset_path'])
    # applies transformations and splits the datasets into a training and validation set
    training, validation = split_dataset(dataset, training_transform, validation_transform)
    print('length training set:',len(training),'length validation set:',len(validation))

    # samplers
    sampler_train = torch.utils.data.RandomSampler(training)
    sampler_val = torch.utils.data.SequentialSampler(validation)

    # create dataloaders
    train_loader = data.DataLoader(training, sampler=sampler_train, batch_size=64, num_workers=4, pin_memory=True)
    val_loader = data.DataLoader(validation, sampler=sampler_val, batch_size=64, num_workers=4, pin_memory=True)

    print(len(train_loader))
    i = (training[87][0])
    im = transforms.ToPILImage()(i).convert("RGB")
    imshow(np.asarray((im)))

    print(f' trainset: {training} valset: {validation}')






triplet
=====> trondheim
=====> london
#Sideways [727/3105]; #Night; [0/3105]
Forward and Day weighted with 1.0000
Sideways and Day weighted with 5.2710


  return array(a, dtype, copy=False, order=order)
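
The two distance thresholds passed to the `MSLS` dataset determine which database images may serve as positives or negatives for a query. The sketch below illustrates that idea only. It assumes the thresholds are in metres and that images falling between the two radii are used as neither; the exact boundary handling inside the `MSLS` class is not shown in this notebook, and `label_by_distance` is a hypothetical helper.

```python
def label_by_distance(distance_m, pos_thr=5.0, neg_thr=15.0):
    """Classify a database image relative to a query by their distance in metres."""
    if distance_m <= pos_thr:
        return "positive"
    if distance_m > neg_thr:
        return "negative"
    # Too far away to be a positive, too close to be a safe negative.
    return "ignored"


for d in (2.0, 10.0, 40.0):
    print(d, label_by_distance(d))
```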


In [10]:
save_path = os.path.join('/content/gdrive/MyDrive/Master', training_config['save_model_name'])
print(save_path)


# using the timm function create_optimizer_v2, as it allows disabling weight decay for certain parameters
# see: https://discuss.pytorch.org/t/weight-decay-in-the-optimizers-is-a-bad-idea-especially-with-batchnorm/16994/2
optimizer = create_optimizer_v2(model, 'adamw', training_config['learning_rate'], 5e-4)
print(optimizer)

if training_config['scheduler'] == 'exponential':
    exp_decay = math.exp(-0.15) # 0.20 when 20 epoch
    scheduler = torch.optim.lr_scheduler.ExponentialLR(optimizer, gamma=exp_decay)
else:
    scheduler = torch.optim.lr_scheduler.CosineAnnealingWarmRestarts(
            optimizer, 
            T_0=training_config['steps'], 
            T_mult=training_config['mult'],
            verbose=False,
            eta_min=1e-8
        )


# TODO: fix how mixup is used

mixup_fn = None  # only set when mixup is enabled

if training_config['arcface']:
    criterion = None
elif training_config['mixup']:
    print('using mixup')
    mixup_fn = Mixup(mixup_alpha = 0.4, cutmix_alpha = 0.4, num_classes=model_config["num_classes"]) # mixup 0.8 and cutmix 1 in deit paper
    criterion = SoftTargetCrossEntropy()
elif model_config['triplet']:
    if training_config['two_outputs']:
        criterion = TripletLossTwoInputs(margin=0.2, alpha=0.5).cuda()
    else:
        criterion = TripletLoss(margin=0.3).cuda()
else:
    print('not using mixup')
    criterion = nn.CrossEntropyLoss()




#### load model ###

if continue_training == False:
    train_loss, val_loss = [], []
    learning_rate_plot = []
    current_epoch = 0
    training_time = 0

if continue_training == True:
    print('---- continuing training ----')
    checkpoint = torch.load(save_path, map_location = device)
    model.load_state_dict(checkpoint['model_state_dict'])
    model.to(device)
    optimizer.load_state_dict(checkpoint['optimizer_state_dict'])
    current_epoch = checkpoint['epoch']
    loss = checkpoint['loss']
    train_loss = checkpoint['train_loss']
    val_loss = checkpoint['val_loss']
    learning_rate_plot = checkpoint['learning_rate_plot']
    scheduler.load_state_dict(checkpoint['scheduler_state_dict'])
    training_time = checkpoint['training_time']



/content/gdrive/MyDrive/Master/224x224_0.2margin_20epoch_faktisknormalize_triplemkkt_pit.pth
AdamW (
Parameter Group 0
    amsgrad: False
    betas: (0.9, 0.999)
    eps: 1e-08
    lr: 3e-05
    weight_decay: 0.0

Parameter Group 1
    amsgrad: False
    betas: (0.9, 0.999)
    eps: 1e-08
    lr: 3e-05
    weight_decay: 0.0005
)
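
With the `'exponential'` scheduler, the learning rate is multiplied by `exp(-0.15)` at each scheduler step. The sketch below reproduces that schedule in plain Python, assuming the scheduler is stepped once per epoch, which is what the per-epoch learning rates in the training log suggest. The function name `exponential_lr` is introduced here for illustration only.

```python
import math


def exponential_lr(base_lr, decay_rate, epoch):
    """Learning rate at the start of `epoch` (0-based) when scaled by exp(-decay_rate) per epoch."""
    return base_lr * math.exp(-decay_rate) ** epoch


# Base learning rate and decay rate as set in the cells above.
for epoch in range(4):
    print(f"epoch {epoch + 1}: {exponential_lr(3e-5, 0.15, epoch):.4e}")
```

The second value should come out close to the `2.58e-05` reported at the start of epoch 2 in the training output.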


In [11]:
# start the training
#torch.multiprocessing.set_start_method('spawn')
tb = SummaryWriter('/content/gdrive/MyDrive/rundeittriplet') # where to log tensorboard information

for epoch in range(current_epoch, training_config['epochs']):
    print(f"Epoch {epoch+1} of {training_config['epochs']}", flush=True)

    
    print(f"Current LR [Epoch Begin]: {scheduler.get_last_lr()}", flush=True)

    # set manual seeds per epoch
    np.random.seed(epoch)
    torch.manual_seed(epoch)
    torch.cuda.manual_seed_all(epoch)

    if model_config["triplet"]:
        model.train()
        train_epoch_loss, lrs, epoch_time = train_triplet(model, train_dataset, optimizer, scheduler,
                                                          epoch, device, criterion, training_config['model'], 
                                                          training_config['two_outputs'], training_config['subsets_per_epoch'])
        val_epoch_loss = 0
    else:
        
        lrs, train_epoch_loss, accuracy_train, epoch_time = train_one_epoch(model, train_loader, optimizer,
                                                  scheduler, epoch, device, mixup_fn, criterion, training_config['arcface'])
        val_epoch_loss, accuracy_val = validate_model(model, val_loader, device, training_config['arcface'])
    
    # appending losses and lrs
    train_loss.append(train_epoch_loss)
    if not model_config['triplet']:
        val_loss.append(val_epoch_loss)    
    learning_rate_plot.extend(lrs)
    training_time += epoch_time


    # SAVE MODEL
    if epoch % training_config['save_frequency'] == 0:
        save_model(epoch, model, optimizer, train_epoch_loss, train_loss, learning_rate_plot, 
                   scheduler, training_time, save_path, val_loss)


    # Tensorboard stuff

    # training loss/accuracy
    tb.add_scalar("Training Loss", train_epoch_loss, epoch)
    

    if not model_config['triplet']:
        tb.add_scalar("Training Accuracy", accuracy_train, epoch)
        # validation loss/accuracy
        tb.add_scalar("Validation Loss", val_epoch_loss, epoch)
        tb.add_scalar("Validation Accuracy", accuracy_val, epoch)
    
    if training_config['model'] == 'pit':
        tb.add_histogram("cls_token", model.cls_token[:, 0], epoch)
    else:
        tb.add_histogram("cls_token", model.cls_token, epoch)
    if model_config['triplet'] and training_config['two_outputs']:
        if training_config['model'] == 'cait':
            tb.add_histogram("second_token", model.extra_token, epoch)
        elif training_config['model'] == 'pit':
            # pass the epoch as the global step, as for the other histograms
            tb.add_histogram("second_token", model.cls_token[:, 1], epoch)
        else:
            tb.add_histogram("second_token", model.dist_token, epoch)

    
    
    print('\n' + f"Training loss: {train_epoch_loss:.3f}", flush=True)
    
    print('------------------------------------------------------------', flush=True)
    if not model_config['triplet']:
        
        print(f'Training accuracy: {accuracy_train}', '\n', flush=True)
        print(f"Validation loss: {val_epoch_loss:.3f}", flush=True)
        print(f'Validation accuracy: {accuracy_val}', flush=True)
        print('------------------------------------------------------------', flush=True)
print('Finished Training')

tb.flush()
tb.close()


plt.figure(figsize=(10, 7))
plt.plot(learning_rate_plot, color='blue', label='lr')
plt.xlabel('Iterations')
plt.ylabel('lr')
plt.legend()
#plt.savefig(f"outputs/lr_schedule_s{steps}_m{mult}.jpg")

plt.figure(figsize=(10, 7))
plt.plot(train_loss, color='orange', label='train loss')
plt.plot(val_loss, color='red', label='validation loss')
plt.xlabel('Epochs')
plt.ylabel('Loss')
plt.legend()
#plt.savefig(f"outputs/{loss_plot_name}.jpg")


print('\n\n')

Epoch 1 of 20
Current LR [Epoch Begin]: [3e-05, 3e-05]
Training for 4 subsets out of 4 in total


compute query descriptors: 17it [00:05,  3.20it/s]
compute positive descriptors: 21it [00:06,  3.39it/s]
compute negative descriptors: 75it [00:20,  3.67it/s]

>> Searching for hard negatives...



epoch subset 1 of 4: 100%|██████████| 97/97 [07:10<00:00,  4.44s/it]
compute query descriptors: 17it [00:05,  3.23it/s]
compute positive descriptors: 20it [00:05,  3.36it/s]
compute negative descriptors: 76it [00:20,  3.67it/s]

>> Searching for hard negatives...



epoch subset 2 of 4: 100%|██████████| 97/97 [07:11<00:00,  4.44s/it]
compute query descriptors: 17it [00:05,  3.31it/s]
compute positive descriptors: 21it [00:06,  3.35it/s]
compute negative descriptors: 76it [00:20,  3.67it/s]

>> Searching for hard negatives...



epoch subset 3 of 4: 100%|██████████| 97/97 [07:10<00:00,  4.44s/it]
compute query descriptors: 17it [00:05,  3.22it/s]
compute positive descriptors: 21it [00:06,  3.30it/s]
compute negative descriptors: 76it [00:20,  3.67it/s]

>> Searching for hard negatives...



epoch subset 4 of 4: 100%|██████████| 97/97 [07:10<00:00,  4.44s/it]

average data loading time: 0.015488102878491903
average batch time: 4.440780068181224
Saving model at /content/gdrive/MyDrive/Master/224x224_0.2margin_20epoch_faktisknormalize_triplemkkt_pit.pth






Training loss: 0.190
------------------------------------------------------------
Epoch 2 of 20
Current LR [Epoch Begin]: [2.5821239292751733e-05, 2.5821239292751733e-05]
Training for 4 subsets out of 4 in total


compute query descriptors: 17it [00:05,  3.22it/s]
compute positive descriptors: 21it [00:06,  3.33it/s]
compute negative descriptors: 75it [00:20,  3.64it/s]

>> Searching for hard negatives...



epoch subset 1 of 4: 100%|██████████| 97/97 [07:18<00:00,  4.53s/it]
compute query descriptors: 17it [00:05,  3.24it/s]
compute positive descriptors: 20it [00:06,  3.27it/s]
compute negative descriptors: 77it [00:21,  3.66it/s]

>> Searching for hard negatives...



epoch subset 2 of 4: 100%|██████████| 97/97 [07:18<00:00,  4.52s/it]
compute query descriptors: 17it [00:05,  3.29it/s]
compute positive descriptors: 21it [00:06,  3.39it/s]
compute negative descriptors: 75it [00:20,  3.66it/s]

>> Searching for hard negatives...



epoch subset 3 of 4: 100%|██████████| 96/96 [07:10<00:00,  4.49s/it]
compute query descriptors: 17it [00:05,  3.27it/s]
compute positive descriptors: 19it [00:05,  3.34it/s]
compute negative descriptors: 77it [00:20,  3.67it/s]

>> Searching for hard negatives...



epoch subset 4 of 4: 100%|██████████| 97/97 [07:16<00:00,  4.50s/it]

average data loading time: 0.015273458889904565
average batch time: 4.507711165327127
Saving model at /content/gdrive/MyDrive/Master/224x224_0.2margin_20epoch_faktisknormalize_triplemkkt_pit.pth






Training loss: 0.095
------------------------------------------------------------
Epoch 3 of 20
Current LR [Epoch Begin]: [2.2224546620451535e-05, 2.2224546620451535e-05]
Training for 4 subsets out of 4 in total


compute query descriptors: 17it [00:05,  3.18it/s]
compute positive descriptors: 20it [00:06,  3.32it/s]
compute negative descriptors: 76it [00:20,  3.66it/s]

>> Searching for hard negatives...



epoch subset 1 of 4: 100%|██████████| 97/97 [07:11<00:00,  4.45s/it]
compute query descriptors: 17it [00:05,  3.28it/s]
compute positive descriptors: 21it [00:06,  3.33it/s]
compute negative descriptors: 77it [00:20,  3.67it/s]

>> Searching for hard negatives...



epoch subset 2 of 4: 100%|██████████| 96/96 [07:05<00:00,  4.43s/it]
compute query descriptors: 17it [00:05,  3.26it/s]
compute positive descriptors: 22it [00:06,  3.41it/s]
compute negative descriptors: 74it [00:20,  3.65it/s]

>> Searching for hard negatives...



epoch subset 3 of 4: 100%|██████████| 97/97 [07:09<00:00,  4.43s/it]
compute query descriptors: 17it [00:05,  3.27it/s]
compute positive descriptors: 21it [00:06,  3.37it/s]
compute negative descriptors: 75it [00:20,  3.68it/s]

>> Searching for hard negatives...



epoch subset 4 of 4: 100%|██████████| 96/96 [07:16<00:00,  4.55s/it]

average data loading time: 0.015460688833127985
average batch time: 4.464514805245276
Saving model at /content/gdrive/MyDrive/Master/224x224_0.2margin_20epoch_faktisknormalize_triplemkkt_pit.pth






Training loss: 0.067
------------------------------------------------------------
Epoch 4 of 20
Current LR [Epoch Begin]: [1.9128844548653197e-05, 1.9128844548653197e-05]
Training for 4 subsets out of 4 in total


compute query descriptors: 17it [00:05,  3.22it/s]
compute positive descriptors: 20it [00:06,  3.29it/s]
compute negative descriptors: 76it [00:20,  3.65it/s]

>> Searching for hard negatives...



epoch subset 1 of 4: 100%|██████████| 97/97 [07:21<00:00,  4.55s/it]
compute query descriptors: 17it [00:05,  3.25it/s]
compute positive descriptors: 21it [00:06,  3.33it/s]
compute negative descriptors: 75it [00:20,  3.66it/s]

>> Searching for hard negatives...



epoch subset 2 of 4: 100%|██████████| 97/97 [07:15<00:00,  4.49s/it]
compute query descriptors: 17it [00:05,  3.29it/s]
compute positive descriptors: 21it [00:06,  3.34it/s]
compute negative descriptors: 76it [00:20,  3.65it/s]

>> Searching for hard negatives...



epoch subset 3 of 4: 100%|██████████| 96/96 [07:07<00:00,  4.46s/it]
compute query descriptors: 17it [00:05,  3.24it/s]
compute positive descriptors: 21it [00:06,  3.35it/s]
compute negative descriptors: 76it [00:20,  3.66it/s]

>> Searching for hard negatives...



epoch subset 4 of 4: 100%|██████████| 97/97 [07:26<00:00,  4.60s/it]

average data loading time: 0.015352782044915882
average batch time: 4.523585417473963
Saving model at /content/gdrive/MyDrive/Master/224x224_0.2margin_20epoch_faktisknormalize_triplemkkt_pit.pth






Training loss: 0.057
------------------------------------------------------------
Epoch 5 of 20
Current LR [Epoch Begin]: [1.6464349082820793e-05, 1.6464349082820793e-05]
Training for 4 subsets out of 4 in total


compute query descriptors: 17it [00:05,  3.20it/s]
compute positive descriptors: 21it [00:06,  3.24it/s]
compute negative descriptors: 77it [00:21,  3.66it/s]

>> Searching for hard negatives...



epoch subset 1 of 4: 100%|██████████| 97/97 [07:13<00:00,  4.47s/it]
compute query descriptors: 17it [00:05,  3.22it/s]
compute positive descriptors: 21it [00:06,  3.35it/s]
compute negative descriptors: 76it [00:20,  3.65it/s]

>> Searching for hard negatives...



epoch subset 2 of 4: 100%|██████████| 97/97 [07:15<00:00,  4.49s/it]
compute query descriptors: 17it [00:05,  3.27it/s]
compute positive descriptors: 21it [00:06,  3.34it/s]
compute negative descriptors: 75it [00:20,  3.66it/s]

>> Searching for hard negatives...



epoch subset 3 of 4: 100%|██████████| 97/97 [07:08<00:00,  4.42s/it]
compute query descriptors: 17it [00:05,  3.27it/s]
compute positive descriptors: 20it [00:05,  3.38it/s]
compute negative descriptors: 75it [00:20,  3.68it/s]

>> Searching for hard negatives...



epoch subset 4 of 4: 100%|██████████| 97/97 [07:09<00:00,  4.43s/it]

average data loading time: 0.014881027113531054
average batch time: 4.448632798244044
Saving model at /content/gdrive/MyDrive/Master/224x224_0.2margin_20epoch_faktisknormalize_triplemkkt_pit.pth






Training loss: 0.053
------------------------------------------------------------
Epoch 6 of 20
Current LR [Epoch Begin]: [1.417099658223044e-05, 1.417099658223044e-05]
Training for 4 subsets out of 4 in total


compute query descriptors: 17it [00:05,  3.14it/s]
compute positive descriptors: 20it [00:06,  3.25it/s]
compute negative descriptors: 76it [00:20,  3.66it/s]

>> Searching for hard negatives...



epoch subset 1 of 4: 100%|██████████| 97/97 [07:22<00:00,  4.57s/it]
compute query descriptors: 17it [00:05,  3.17it/s]
compute positive descriptors: 21it [00:06,  3.28it/s]
compute negative descriptors: 75it [00:20,  3.62it/s]

>> Searching for hard negatives...



epoch subset 2 of 4: 100%|██████████| 97/97 [07:32<00:00,  4.67s/it]
compute query descriptors: 17it [00:05,  3.17it/s]
compute positive descriptors: 20it [00:06,  3.32it/s]
compute negative descriptors: 76it [00:20,  3.64it/s]

>> Searching for hard negatives...



epoch subset 3 of 4: 100%|██████████| 97/97 [07:21<00:00,  4.55s/it]
compute query descriptors: 17it [00:05,  3.25it/s]
compute positive descriptors: 21it [00:06,  3.29it/s]
compute negative descriptors: 74it [00:20,  3.66it/s]

>> Searching for hard negatives...



epoch subset 4 of 4: 100%|██████████| 97/97 [07:16<00:00,  4.50s/it]

average data loading time: 0.015274064442546097
average batch time: 4.571268552357388
Saving model at /content/gdrive/MyDrive/Master/224x224_0.2margin_20epoch_faktisknormalize_triplemkkt_pit.pth






Training loss: 0.034
------------------------------------------------------------
Epoch 7 of 20
Current LR [Epoch Begin]: [1.2197089792217973e-05, 1.2197089792217973e-05]
Training for 4 subsets out of 4 in total


compute query descriptors: 17it [00:05,  3.19it/s]
compute positive descriptors: 20it [00:06,  3.25it/s]
compute negative descriptors: 77it [00:21,  3.66it/s]

>> Searching for hard negatives...



epoch subset 1 of 4: 100%|██████████| 97/97 [07:14<00:00,  4.48s/it]
compute query descriptors: 17it [00:05,  3.24it/s]
compute positive descriptors: 19it [00:05,  3.32it/s]
compute negative descriptors: 76it [00:20,  3.67it/s]

>> Searching for hard negatives...



epoch subset 2 of 4: 100%|██████████| 97/97 [07:27<00:00,  4.61s/it]
compute query descriptors: 17it [00:05,  3.24it/s]
compute positive descriptors: 21it [00:06,  3.38it/s]
compute negative descriptors: 76it [00:20,  3.66it/s]

>> Searching for hard negatives...



epoch subset 3 of 4: 100%|██████████| 97/97 [07:17<00:00,  4.51s/it]
compute query descriptors: 17it [00:05,  3.32it/s]
compute positive descriptors: 21it [00:06,  3.36it/s]
compute negative descriptors: 75it [00:20,  3.66it/s]

>> Searching for hard negatives...



epoch subset 4 of 4: 100%|██████████| 96/96 [07:21<00:00,  4.60s/it]

average data loading time: 0.016802641156415915
average batch time: 4.5519144362565465
Saving model at /content/gdrive/MyDrive/Master/224x224_0.2margin_20epoch_faktisknormalize_triplemkkt_pit.pth






Training loss: 0.031
------------------------------------------------------------
Epoch 8 of 20
Current LR [Epoch Begin]: [1.049813247333466e-05, 1.049813247333466e-05]
Training for 4 subsets out of 4 in total


compute query descriptors: 17it [00:05,  3.26it/s]
compute positive descriptors: 20it [00:05,  3.34it/s]
compute negative descriptors: 75it [00:20,  3.64it/s]

>> Searching for hard negatives...



epoch subset 1 of 4: 100%|██████████| 97/97 [07:21<00:00,  4.55s/it]
compute query descriptors: 17it [00:05,  3.27it/s]
compute positive descriptors: 21it [00:06,  3.31it/s]
compute negative descriptors: 76it [00:20,  3.67it/s]

>> Searching for hard negatives...



epoch subset 2 of 4: 100%|██████████| 97/97 [07:31<00:00,  4.65s/it]
compute query descriptors: 17it [00:05,  3.22it/s]
compute positive descriptors: 22it [00:06,  3.33it/s]
compute negative descriptors: 75it [00:20,  3.65it/s]

>> Searching for hard negatives...



epoch subset 3 of 4: 100%|██████████| 96/96 [07:47<00:00,  4.87s/it]
compute query descriptors: 17it [00:05,  3.21it/s]
compute positive descriptors: 20it [00:06,  3.25it/s]
compute negative descriptors: 76it [00:20,  3.65it/s]

>> Searching for hard negatives...



epoch subset 4 of 4: 100%|██████████| 96/96 [07:42<00:00,  4.82s/it]

average data loading time: 0.015993091726550168
average batch time: 4.722512589217468
Saving model at /content/gdrive/MyDrive/Master/224x224_0.2margin_20epoch_faktisknormalize_triplemkkt_pit.pth






Training loss: 0.034
------------------------------------------------------------
Epoch 9 of 20
Current LR [Epoch Begin]: [9.035826357366062e-06, 9.035826357366062e-06]
Training for 4 subsets out of 4 in total


compute query descriptors: 17it [00:05,  3.18it/s]
compute positive descriptors: 20it [00:06,  3.19it/s]
compute negative descriptors: 76it [00:20,  3.64it/s]

>> Searching for hard negatives...



epoch subset 1 of 4: 100%|██████████| 97/97 [07:17<00:00,  4.51s/it]
compute query descriptors: 17it [00:05,  3.23it/s]
compute positive descriptors: 20it [00:06,  3.33it/s]
compute negative descriptors: 75it [00:20,  3.66it/s]

>> Searching for hard negatives...



epoch subset 2 of 4: 100%|██████████| 97/97 [07:26<00:00,  4.61s/it]
compute query descriptors: 17it [00:05,  3.24it/s]
compute positive descriptors: 22it [00:06,  3.33it/s]
compute negative descriptors: 75it [00:20,  3.62it/s]

>> Searching for hard negatives...



epoch subset 3 of 4: 100%|██████████| 97/97 [07:39<00:00,  4.74s/it]
compute query descriptors: 17it [00:05,  3.28it/s]
compute positive descriptors: 21it [00:06,  3.36it/s]
compute negative descriptors: 76it [00:20,  3.65it/s]

>> Searching for hard negatives...



epoch subset 4 of 4: 100%|██████████| 97/97 [07:34<00:00,  4.69s/it]

average data loading time: 0.015789688247995277
average batch time: 4.635079992186163
Saving model at /content/gdrive/MyDrive/Master/224x224_0.2margin_20epoch_faktisknormalize_triplemkkt_pit.pth






Training loss: 0.030
------------------------------------------------------------
Epoch 10 of 20
Current LR [Epoch Begin]: [7.777207819376744e-06, 7.777207819376744e-06]
Training for 4 subsets out of 4 in total


compute query descriptors: 17it [00:05,  3.17it/s]
compute positive descriptors: 21it [00:06,  3.26it/s]
compute negative descriptors: 74it [00:20,  3.63it/s]

>> Searching for hard negatives...



epoch subset 1 of 4: 100%|██████████| 97/97 [07:47<00:00,  4.82s/it]
compute query descriptors: 17it [00:05,  3.15it/s]
compute positive descriptors: 20it [00:06,  3.28it/s]
compute negative descriptors: 76it [00:20,  3.65it/s]

>> Searching for hard negatives...



epoch subset 2 of 4: 100%|██████████| 97/97 [07:49<00:00,  4.84s/it]
compute query descriptors: 17it [00:05,  3.24it/s]
compute positive descriptors: 20it [00:06,  3.30it/s]
compute negative descriptors: 75it [00:20,  3.64it/s]

>> Searching for hard negatives...



epoch subset 3 of 4: 100%|██████████| 97/97 [07:27<00:00,  4.62s/it]
compute query descriptors: 17it [00:05,  3.24it/s]
compute positive descriptors: 21it [00:06,  3.33it/s]
compute negative descriptors: 75it [00:20,  3.67it/s]

>> Searching for hard negatives...



epoch subset 4 of 4: 100%|██████████| 97/97 [07:20<00:00,  4.54s/it]

average data loading time: 0.016177218599417776
average batch time: 4.702381470154241
Saving model at /content/gdrive/MyDrive/Master/224x224_0.2margin_20epoch_faktisknormalize_triplemkkt_pit.pth






Training loss: 0.025
------------------------------------------------------------
Epoch 11 of 20
Current LR [Epoch Begin]: [6.693904804452894e-06, 6.693904804452894e-06]
Training for 4 subsets out of 4 in total


compute query descriptors: 17it [00:05,  3.18it/s]
compute positive descriptors: 21it [00:06,  3.26it/s]
compute negative descriptors: 74it [00:20,  3.64it/s]

>> Searching for hard negatives...



epoch subset 1 of 4: 100%|██████████| 97/97 [07:13<00:00,  4.47s/it]
compute query descriptors: 17it [00:05,  3.19it/s]
compute positive descriptors: 22it [00:06,  3.30it/s]
compute negative descriptors: 74it [00:20,  3.64it/s]

>> Searching for hard negatives...



epoch subset 2 of 4: 100%|██████████| 97/97 [07:39<00:00,  4.74s/it]
compute query descriptors: 17it [00:05,  3.21it/s]
compute positive descriptors: 21it [00:06,  3.31it/s]
compute negative descriptors: 74it [00:20,  3.60it/s]

>> Searching for hard negatives...



epoch subset 3 of 4: 100%|██████████| 97/97 [07:42<00:00,  4.77s/it]
compute query descriptors: 17it [00:05,  3.24it/s]
compute positive descriptors: 20it [00:06,  3.28it/s]
compute negative descriptors: 76it [00:21,  3.60it/s]

>> Searching for hard negatives...



epoch subset 4 of 4: 100%|██████████| 97/97 [07:37<00:00,  4.72s/it]

average data loading time: 0.016411355475789494
average batch time: 4.673088701729922
Saving model at /content/gdrive/MyDrive/Master/224x224_0.2margin_20epoch_faktisknormalize_triplemkkt_pit.pth






Training loss: 0.021
------------------------------------------------------------
Epoch 12 of 20
Current LR [Epoch Begin]: [5.761497258622622e-06, 5.761497258622622e-06]
Training for 4 subsets out of 4 in total


compute query descriptors: 17it [00:05,  3.17it/s]
compute positive descriptors: 21it [00:06,  3.13it/s]
compute negative descriptors: 76it [00:21,  3.60it/s]

>> Searching for hard negatives...



epoch subset 1 of 4: 100%|██████████| 97/97 [07:50<00:00,  4.85s/it]
compute query descriptors: 17it [00:05,  3.13it/s]
compute positive descriptors: 21it [00:06,  3.24it/s]
compute negative descriptors: 76it [00:21,  3.60it/s]

>> Searching for hard negatives...



epoch subset 2 of 4: 100%|██████████| 97/97 [07:48<00:00,  4.82s/it]
compute query descriptors: 17it [00:05,  3.13it/s]
compute positive descriptors: 20it [00:06,  3.25it/s]
compute negative descriptors: 76it [00:20,  3.63it/s]

>> Searching for hard negatives...



epoch subset 3 of 4: 100%|██████████| 96/96 [07:34<00:00,  4.73s/it]
compute query descriptors: 17it [00:05,  3.16it/s]
compute positive descriptors: 21it [00:06,  3.30it/s]
compute negative descriptors: 74it [00:20,  3.57it/s]

>> Searching for hard negatives...



epoch subset 4 of 4: 100%|██████████| 97/97 [07:48<00:00,  4.83s/it]

average data loading time: 0.017156974289768426
average batch time: 4.808502886646478
Saving model at /content/gdrive/MyDrive/Master/224x224_0.2margin_20epoch_faktisknormalize_triplemkkt_pit.pth






Training loss: 0.016
------------------------------------------------------------
Epoch 13 of 20
Current LR [Epoch Begin]: [4.9589666466475954e-06, 4.9589666466475954e-06]
Training for 4 subsets out of 4 in total


compute query descriptors: 17it [00:05,  3.16it/s]
compute positive descriptors: 21it [00:06,  3.16it/s]
compute negative descriptors: 75it [00:21,  3.55it/s]

>> Searching for hard negatives...



epoch subset 1 of 4: 100%|██████████| 97/97 [07:51<00:00,  4.86s/it]
compute query descriptors: 17it [00:05,  3.22it/s]
compute positive descriptors: 22it [00:06,  3.29it/s]
compute negative descriptors: 75it [00:20,  3.62it/s]

>> Searching for hard negatives...



epoch subset 2 of 4: 100%|██████████| 96/96 [07:50<00:00,  4.90s/it]
compute query descriptors: 17it [00:05,  3.05it/s]
compute positive descriptors: 21it [00:06,  3.27it/s]
compute negative descriptors: 75it [00:20,  3.59it/s]

>> Searching for hard negatives...



epoch subset 3 of 4: 100%|██████████| 96/96 [07:54<00:00,  4.94s/it]
compute query descriptors: 17it [00:05,  3.10it/s]
compute positive descriptors: 21it [00:06,  3.27it/s]
compute negative descriptors: 75it [00:20,  3.62it/s]

>> Searching for hard negatives...



epoch subset 4 of 4:  18%|█▊        | 17/96 [01:24<06:24,  4.86s/it]

KeyboardInterrupt: ignored
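
The `Current LR [Epoch Begin]` values in the log above shrink by the same factor every epoch, which points to an exponential learning-rate decay. The sketch below checks that reading against the logged numbers. The initial learning rate of `3e-5` and the per-epoch factor `exp(-0.15)` are assumptions inferred from the log, not values taken from the training configuration, and `lr_at_epoch` is a hypothetical helper that is not part of `train.py`.

```python
import math

# Assumed values, inferred from the logged learning rates only.
INITIAL_LR = 3e-5
DECAY_PER_EPOCH = math.exp(-0.15)


def lr_at_epoch(epoch: int) -> float:
    """Learning rate at the start of a 1-indexed epoch under exponential decay."""
    return INITIAL_LR * DECAY_PER_EPOCH ** (epoch - 1)


# Values copied from the "Current LR [Epoch Begin]" lines in the log.
logged_lr = {
    5: 1.6464349082820793e-05,
    6: 1.417099658223044e-05,
    7: 1.2197089792217973e-05,
    8: 1.049813247333466e-05,
    9: 9.035826357366062e-06,
    10: 7.777207819376744e-06,
    11: 6.693904804452894e-06,
    12: 5.761497258622622e-06,
    13: 4.9589666466475954e-06,
}

for epoch, value in logged_lr.items():
    # Compare the assumed schedule with the logged value for this epoch.
    matches = math.isclose(lr_at_epoch(epoch), value, rel_tol=1e-6)
    print(f"epoch {epoch:2d}: logged {value:.6e}, matches assumed schedule: {matches}")
```

If the assumed schedule holds, the same helper can estimate the learning rate for the epochs that were not reached before the run was interrupted.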