## In this notebook we will see how to evaluate on the following benchmarks:

- Pittsburgh (pitts30k-val, pitts30k-test and pitts250k-test) [1]
- MapillarySLS [2]
- Cross Season [3]
- ESSEX [3]
- Inria Holidays [3]
- Nordland [3]
- SPED [3]

[1] NetVLAD: CNN architecture for weakly supervised place recognition (https://github.com/Relja/netvlad)

[2] Mapillary Street-Level Sequences: A Dataset for Lifelong Place Recognition (https://github.com/FrederikWarburg/mapillary_sls)

[3] VPR-Bench: An Open-Source Visual Place Recognition Evaluation Framework with Quantifiable Viewpoint and Appearance Change (https://github.com/MubarizZaffar/VPR-Bench)

You'll need to download the Pittsburgh dataset from [1] (email Relja to request access) and the MapillarySLS validation set from [2]. For the other datasets, see [3] for details on the VPR-Bench benchmark; its authors also host those datasets at https://surfdrive.surf.nl/files/index.php/s/sbZRXzYe3l0v67W (huge thanks to them).

---

**Note:** I rewrote the code for loading these datasets to keep evaluation consistent across all datasets and to make it faster. The original code for these datasets was slow for valid reasons. For instance, VPR-Bench calculates multiple metrics, including latency, which requires processing images one at a time in the forward pass, and MSLS offers several evaluation modes (Image_to_Image, Sequence_to_Sequence, Sequence_to_Image, among others). In this project we measure only recall@K, so validation can be sped up significantly. Therefore, you'll need to use the precomputed ground_truth that we provide in this repo (in the directory datasets).

That being said, all you need to do is download the dataset and place it in a specific directory (we will need the dataset images). After that, you can hard-code the directory path into a global variable, as we will show in the following steps.


In [None]:
from google.colab import drive
drive.mount('/content/drive')

In [None]:
# !rm -rf '/content/drive/MyDrive/geoloc_fcm/geoloc-fcm-gsv-cities'

In [None]:

# !git clone https://github.com/fracrumatte/geoloc-fcm-gsv-cities.git  '/content/drive/MyDrive/geoloc_fcm/geoloc-fcm-gsv-cities'


In [None]:
!pip install pytorch-metric-learning==1.6.3
!pip install faiss-gpu==1.7.2
!pip install pytorch-lightning==1.8.4
!pip install timm


In [None]:
%reload_ext autoreload
%autoreload 2

import sys
sys.path.append('/content/drive/MyDrive/geoloc_fcm/geoloc-fcm-gsv-cities') # append parent directory, we need it

import torch
from torch.utils.data import Dataset, DataLoader
import torchvision.transforms as T
import matplotlib.pyplot as plt
import numpy as np
from tqdm.notebook import tqdm
from utils.validation import get_validation_recalls

In [None]:
MEAN=[0.485, 0.456, 0.406]; STD=[0.229, 0.224, 0.225]

IM_SIZE = (320, 320)

def input_transform(image_size=IM_SIZE):
    return T.Compose([
        # T.Resize(image_size, interpolation=T.InterpolationMode.BICUBIC),
        T.Resize(image_size, interpolation=T.InterpolationMode.BILINEAR),
        T.ToTensor(),
        T.Normalize(mean=MEAN, std=STD)
    ])

In this project, we provide for each benchmark (or test dataset) a Dataset Class that encapsulates images sequentially as follows:

$[R_1, R_2, ..., R_n, Q_1, Q_2, ..., Q_m]$ where $R_i$ are the reference images and $Q_j$ are the queries. We keep the number of references and the number of queries as attributes of the object so that we can split the descriptors into references and queries later, when evaluating. We also store a ground_truth matrix that indicates which references are positives for each query.

**Note:** make sure that in every [BenchmarkClass].py the global variable DATASET_ROOT (the directory holding that dataset's images) is set correctly, otherwise you won't be able to run the following steps. GT_ROOT is the location of the precomputed ground_truth and filenames that WE PROVIDED (by default in ../datasets/).
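To make the layout concrete, here is a minimal sketch of a dataset object that stores references first and queries after them. The class `ToyBenchmark` and its string items are hypothetical stand-ins, not code from this repo; only the attribute names `num_references`, `num_queries` and `ground_truth` follow the notebook.

```python
class ToyBenchmark:
    """Hypothetical stand-in for a benchmark dataset class (not from the repo)."""

    def __init__(self, reference_items, query_items, ground_truth):
        # References come first, queries after: [R_1..R_n, Q_1..Q_m]
        self.items = list(reference_items) + list(query_items)
        self.num_references = len(reference_items)
        self.num_queries = len(query_items)
        # ground_truth[j] lists the reference indices that are positives for query j
        self.ground_truth = ground_truth

    def __len__(self):
        return len(self.items)

    def __getitem__(self, index):
        # Return (item, index); using the index as the label is an assumption
        return self.items[index], index


toy = ToyBenchmark(["r0", "r1", "r2"], ["q0", "q1"], ground_truth=[[0, 1], [2]])

# Iterating in order and slicing at num_references recovers the two groups
all_items = [toy[i][0] for i in range(len(toy))]
references = all_items[:toy.num_references]
queries = all_items[toy.num_references:]
```

Because the order is fixed, the descriptors computed over the whole dataset can be split with the same two slices.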

In [None]:

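from dataloaders.val.SF_Dataset import SF_Dataset
from dataloaders.val.TokyoDataset import Tokyo_Dataset

# NOTE: only the SF-XS and Tokyo classes are imported here. The other classes used
# below (CrossSeasonDataset, EssexDataset, InriaDataset, NordlandDataset, SPEDDataset,
# MSLS, PittsburghDataset) must be imported as well before their branches can run.
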
def get_val_dataset(dataset_name, input_transform=input_transform()):
    dataset_name = dataset_name.lower()

    if 'cross' in dataset_name:
        ds = CrossSeasonDataset(input_transform = input_transform)

    elif 'essex' in dataset_name:
        ds = EssexDataset(input_transform = input_transform)

    elif 'inria' in dataset_name:
        ds = InriaDataset(input_transform = input_transform)

    elif 'nordland' in dataset_name:
        ds = NordlandDataset(input_transform = input_transform)

    elif 'sped' in dataset_name:
        ds = SPEDDataset(input_transform = input_transform)

    elif 'msls' in dataset_name:
        ds = MSLS(input_transform = input_transform)

    elif 'pitts' in dataset_name:
        ds = PittsburghDataset(which_ds=dataset_name, input_transform = input_transform)
    elif 'sf_val' in dataset_name or 'sf_test' in dataset_name:
        ds = SF_Dataset(which_ds=dataset_name, input_transform = input_transform)

    elif 'tokyo_test' in dataset_name:
        ds = Tokyo_Dataset(which_ds=dataset_name, input_transform = input_transform)

    else:
        raise ValueError(f'Unknown validation dataset: {dataset_name}')

    num_references = ds.num_references
    num_queries = ds.num_queries
    ground_truth = ds.ground_truth
    return ds, num_references, num_queries, ground_truth



We define a function that takes a model, a dataloader and a device, and returns the descriptors (representations) the model produces for every image, concatenated in dataloader order.

In [None]:
def get_descriptors(model, dataloader, device):
    descriptors = []
    with torch.no_grad():
        for batch in tqdm(dataloader, 'Calculating descriptors...'):
            imgs, _ = batch  # labels are not needed to compute descriptors
            output = model(imgs.to(device)).cpu()
            descriptors.append(output)

    return torch.cat(descriptors)

Let's now load a pre-trained model

In [None]:

from main import VPRModel
# from main import trainer

# define which device you'd like to run experiments on (cuda:0 if you only have one gpu)
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")

model = VPRModel(
    backbone_arch='resnet18',
    layers_to_crop=[4],
    # agg_arch='ConvAP',
    # agg_config={'in_channels': 256,
    #             'out_channels': 256,
    #             's1': 2,
    #             's2': 2},
    agg_arch='MixVPR',
    agg_config={'in_channels': 256,
                'in_h': 20,
                'in_w': 20,
                'out_channels': 256,
                'mix_depth': 4,
                'mlp_ratio': 1,
                'out_rows': 4},
)



# map_location lets the checkpoint load on CPU-only machines as well
state_dict = torch.load('/content/drive/MyDrive/geoloc_fcm/LOGS/resnet18/lightning_logs/version_150/checkpoints/resnet18_epoch(09)_step(19520)_R1[0.7829]_R5[0.8595].ckpt', map_location=device)
model.load_state_dict(state_dict['state_dict'])

model.eval()
model = model.to(device)
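The `in_h` and `in_w` values in `agg_config` have to match the spatial size of the backbone's output. The sketch below shows the arithmetic under one assumption: that a ResNet-18 with its last layer cropped has a total stride of 16, as in the standard ResNet design. The helper `feature_map_size` is hypothetical and is not part of the repo. The integer division is exact only when the input size is a multiple of the stride, which holds for 320.

```python
def feature_map_size(image_size, stride=16):
    """Spatial size of a feature map for a given input size and total stride.

    stride=16 is an assumption about the truncated backbone, not a value
    read from the model.
    """
    height, width = image_size
    return height // stride, width // stride


IM_SIZE = (320, 320)  # same input size as the transform defined earlier
in_h, in_w = feature_map_size(IM_SIZE)
```

If you change `IM_SIZE`, recompute these two values and update `agg_config` to match.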


## Running validation on one of the benchmarks

In [None]:
# all_datasets = ['CrossSeason' ,'Essex' ,'Inria' ,'Nordland' ,'SPED' ,'MSLS']
val_dataset_name = 'sf_val'  # TODO: confirm which SF-XS split to use here
batch_size = 32

val_dataset, num_references, num_queries, ground_truth = get_val_dataset(val_dataset_name)
val_loader = DataLoader(val_dataset, num_workers=4, batch_size=batch_size)

descriptors = get_descriptors(model, val_loader, device)
print(f'Descriptor dimension {descriptors.shape[1]}')

# now we split into references and queries


In [None]:
# get_descriptors already returns CPU tensors, so .cpu() is a no-op kept for safety
r_list = descriptors[:num_references].cpu()
q_list = descriptors[num_references:].cpu()
recalls_dict, preds = get_validation_recalls(r_list=r_list,
                                             q_list=q_list,
                                             k_values=[1, 5, 10, 15, 20, 25],
                                             gt=ground_truth,
                                             print_results=True,
                                             dataset_name=val_dataset_name,
                                             )
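As a rough illustration of the metric, the sketch below computes recall@K with a brute-force nearest-neighbor search in pure Python. It is a stand-in under stated assumptions: the internals of `get_validation_recalls` are not shown in this notebook, and the helpers `top_k_predictions` and `recall_at_k` and the toy vectors are hypothetical. A query counts as a hit at K if at least one of its K nearest references is among its positives.

```python
def top_k_predictions(references, queries, k):
    """Indices of the k nearest references for each query (squared L2 distance)."""
    preds = []
    for q in queries:
        dists = [sum((a - b) ** 2 for a, b in zip(q, r)) for r in references]
        ranked = sorted(range(len(references)), key=lambda i: dists[i])
        preds.append(ranked[:k])
    return preds


def recall_at_k(preds, ground_truth, k_values):
    """Fraction of queries with at least one positive among their top-k predictions."""
    recalls = {}
    for k in k_values:
        hits = sum(
            1 for pred, positives in zip(preds, ground_truth)
            if set(pred[:k]) & set(positives)
        )
        recalls[k] = hits / len(preds)
    return recalls


# Toy 2-D descriptors: three references and two queries
references = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
queries = [[0.9, 0.1], [0.1, 0.2]]
ground_truth = [[1], [2]]  # positives (reference indices) for each query

preds = top_k_predictions(references, queries, k=2)
recalls = recall_at_k(preds, ground_truth, k_values=[1, 2])
```

In this toy case the first query is matched at rank 1 and the second only at rank 2, so recall@1 is 0.5 and recall@2 is 1.0.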

## Evaluating on all benchmarks

In [None]:
# val_dataset_names = ['CrossSeason' ,'Essex' ,'Inria', 'MSLS', 'SPED', 'Nordland', 'pitts30k_test', 'pitts250k_test', 'sf', 'tokyo']
val_dataset_names = ['tokyo_test','sf_test', 'sf_val']
batch_size = 32

for val_name in val_dataset_names:
    val_dataset, num_references, num_queries, ground_truth = get_val_dataset(val_name)
    val_loader = DataLoader(val_dataset, num_workers=4, batch_size=batch_size)
    print(f'Evaluating on {val_name}')
    descriptors = get_descriptors(model, val_loader, device)

    print(f'Descriptor dimension {descriptors.shape[1]}')
    r_list = descriptors[ : num_references]
    q_list = descriptors[num_references : ]

    recalls_dict, preds = get_validation_recalls(r_list=r_list,
                                                q_list=q_list,
                                                k_values=[1, 5, 10, 15, 20, 25],
                                                gt=ground_truth,
                                                print_results=True,
                                                dataset_name=val_name,
                                                faiss_gpu=False
                                                )
    del descriptors
    print('========> DONE!\n\n')

In [None]:
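# Qualitative analysis: for each query, record whether any of its top-k
# predictions is a true positive, together with those predictions.
gt = ground_truth
k_values = [5]
correct = {}  # query index -> [1 if a positive is in the top-k else 0, top-k predictions]
for q_idx, pred in enumerate(preds):
    for n in k_values:
        # if in top N then also in top NN, where NN > N
        correct[q_idx] = [0, pred[:n]]
        # np.isin replaces np.in1d, which is deprecated in recent NumPy releases
        if np.any(np.isin(pred[:n], gt[q_idx])):
            correct[q_idx] = [1, pred[:n]]
            break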

In [None]:
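# Collect, for every query, the entry later used to build its image path
# (the filter on incorrectly retrieved queries is currently disabled)
counter = 0

imgs_list = []
for i, val in enumerate(correct.values()):
    # if val[0] == 0:
    counter += 1
    imgs_list.append(val_dataset.qImages[i][7])

imgs_list[0]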


In [None]:
len(imgs_list)

In [None]:
import matplotlib.image as mpimg

fig, axes = plt.subplots(100, 5, figsize=(10, 200))
for i, ax in enumerate(axes.flat):
    ax.axis('off')
    # guard against having fewer images than grid cells
    if i < len(imgs_list):
        img = plt.imread('/content/drive/MyDrive/geoloc_fcm/extracted_datasets/sf_xs/val/queries/'+imgs_list[i]+'.jpg')
        ax.imshow(img)
        ax.set_title(f'Image {i}', fontsize=8)  # label each image with its index

plt.show()

# img = mpimg.imread('/content/drive/MyDrive/geoloc_fcm/extracted_datasets/sf_xs/val/queries/'+'@0553005.75@4174559.90@10@S@037.71676@-122.39858@F5LZ_39AImKuAaxYaZg_DQ@@180@@@@201704@@'+'.jpg')
# plt.imshow(img)
# plt.axis('off')
# plt.show()


In [None]:
# Compare the two models: queries that our model recognized and the other one did not.
# dict_our and dict_mix must be filled with the `correct` dictionary of each model
# before running this cell; they are left empty here.

dict_our = {}
dict_mix = {}

lis_res = []
for i in range(len(preds)):
    if i in dict_our and i in dict_mix:
        if dict_our[i][0] == 1 and dict_mix[i][0] == 0:
            lis_res.append(i)
            lis_res.append(dict_our[i][1])
            lis_res.append(dict_mix[i][1])

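l = len(lis_res) // 3

# NOTE: lis_res interleaves query indices with arrays of predicted reference indices;
# each entry has to be mapped to an image file name before it can be used in the path below.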
fig, axes = plt.subplots(l,3, figsize=(6, 40))
for i, ax in enumerate(axes.flat):
    if(i<100):
      img = plt.imread('/content/drive/MyDrive/geoloc_fcm/extracted_datasets/sf_xs/val/queries/'+lis_res[i]+'.jpg')
      ax.imshow(img)
      ax.axis('off')

plt.show()
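The comparison step can be sketched on its own with toy data. The helper `wins_over_baseline` and the two example dictionaries are hypothetical; they only assume that each dictionary maps a query index to `[is_correct, top_k_predictions]`, the same shape as the `correct` dictionary built in the qualitative analysis cell.

```python
def wins_over_baseline(correct_ours, correct_baseline):
    """Queries that our model retrieved correctly while the baseline did not.

    Returns a list of (query index, our predictions, baseline predictions).
    """
    wins = []
    # Only compare queries that were evaluated by both models
    for q_idx in sorted(correct_ours.keys() & correct_baseline.keys()):
        if correct_ours[q_idx][0] == 1 and correct_baseline[q_idx][0] == 0:
            wins.append((q_idx, correct_ours[q_idx][1], correct_baseline[q_idx][1]))
    return wins


# Toy results for three queries (prediction lists hold reference indices)
correct_ours = {0: [1, [4, 7]], 1: [0, [2, 3]], 2: [1, [5, 9]]}
correct_baseline = {0: [0, [8, 6]], 1: [0, [2, 1]], 2: [1, [5, 0]]}

wins = wins_over_baseline(correct_ours, correct_baseline)
```

Keeping each result as one tuple, rather than three consecutive list entries, makes it easier to plot a query next to both sets of predictions.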