# How to prepare a Faiss index file for the Place Recognition pipeline

This tutorial shows how to prepare a Faiss index file for the Place Recognition pipeline. The index is a binary file that stores the descriptors of the places in the database, and the `PlaceRecognitionPipeline` uses it to perform the Place Recognition task.

In [1]:
import os

import faiss
from hydra.utils import instantiate
import numpy as np
from omegaconf import OmegaConf
import torch
from torch.utils.data import DataLoader
from tqdm import tqdm

from opr.datasets.itlp import ITLPCampus

## Config

In [2]:
DATABASE_TRACK_DIR = "/home/docker_opr/Datasets/OpenPlaceRecognition/itlp_campus_outdoor/00_2023-02-10"

BATCH_SIZE = 64
NUM_WORKERS = 4
DEVICE = "cuda"

MODEL_CONFIG_PATH = "../configs/model/place_recognition/minkloc3d.yaml"
WEIGHTS_PATH = "../weights/place_recognition/minkloc3d_nclt.pth"

## Step 1 - Initialize dataset and dataloader

For this example, we will use the ITLP-Campus outdoor dataset with only LiDAR data.

In [3]:
db_dataset = ITLPCampus(
    dataset_root=DATABASE_TRACK_DIR,
    sensors=["lidar"],
    mink_quantization_size=0.5,
    load_semantics=False,
    load_text_descriptions=False,
    load_text_labels=False,
    load_aruco_labels=False,
    indoor=False,
)

In [4]:
len(db_dataset)

609

In [5]:
db_dataset[0].keys()

dict_keys(['idx', 'pose', 'pointcloud_lidar_coords', 'pointcloud_lidar_feats'])

In [6]:
db_dataloader = DataLoader(
    db_dataset,
    batch_size=BATCH_SIZE,
    shuffle=False,
    num_workers=NUM_WORKERS,
    collate_fn=db_dataset.collate_fn,
)


In [7]:
sample_batch = next(iter(db_dataloader))
sample_batch.keys()


dict_keys(['idxs', 'poses', 'pointclouds_lidar_coords', 'pointclouds_lidar_feats'])

## Step 2 - Initialize model

We will use Hydra's `instantiate` function to initialize the model. The model is `MinkLoc3D`, a simple LiDAR-only architecture.

In [8]:
model_config = OmegaConf.load(MODEL_CONFIG_PATH)
model = instantiate(model_config)
model.load_state_dict(torch.load(WEIGHTS_PATH))
model = model.to(DEVICE)
model.eval();



## Step 3 - Calculate descriptors

We will calculate a descriptor for each place in the dataset.

In [9]:
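descriptors = []
with torch.no_grad():  # inference only, no gradients needed
    for batch in tqdm(db_dataloader):
        # move every tensor in the batch to the model's device
        batch = {k: v.to(DEVICE) for k, v in batch.items()}
        final_descriptor = model(batch)["final_descriptor"]
        descriptors.append(final_descriptor.detach().cpu().numpy())

# stack the per-batch arrays into one (num_places, descriptor_dim) array
descriptors = np.concatenate(descriptors, axis=0)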

100%|██████████| 10/10 [00:01<00:00,  7.42it/s]
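
The 10 iterations reported by the progress bar follow from the dataset and batch sizes: 609 samples at 64 per batch give nine full batches plus one partial batch, which the `DataLoader` keeps because `drop_last` is left at its default. A quick check using only the numbers shown above (the variable names are illustrative):

```python
import math

num_samples = 609   # len(db_dataset), as printed earlier
batch_size = 64     # BATCH_SIZE from the config cell

# Number of batches when the final, smaller batch is kept.
num_batches = math.ceil(num_samples / batch_size)
# Size of that final batch.
last_batch_size = num_samples - (num_batches - 1) * batch_size

print(num_batches, last_batch_size)
```

This prints `10 33`, so the last batch carries 33 point clouds.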


In [10]:
descriptors.shape

(609, 256)
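
Faiss's Python bindings expect a C-contiguous `float32` array of shape `(n, d)`. The descriptors computed above will normally satisfy this already, but if a model or a post-processing step were to produce another dtype or a non-contiguous view, a small conversion step before `index.add` avoids errors. The sketch below uses random data as a stand-in for real descriptors, and `as_faiss_input` is a hypothetical helper name, not part of Faiss or OPR:

```python
import numpy as np


def as_faiss_input(array: np.ndarray) -> np.ndarray:
    """Return a 2-D, C-contiguous float32 array (copying only if needed)."""
    array = np.ascontiguousarray(array, dtype=np.float32)
    if array.ndim != 2:
        raise ValueError(f"expected shape (n, d), got {array.shape}")
    return array


# Stand-in data: float64 values in a transposed, non-contiguous view.
rng = np.random.default_rng(0)
fake_descriptors = rng.standard_normal((256, 609)).T  # shape (609, 256)

prepared = as_faiss_input(fake_descriptors)
print(prepared.shape, prepared.dtype, prepared.flags["C_CONTIGUOUS"])
```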

## Step 4 - Build faiss index & write it to disk

In [11]:
index = faiss.IndexFlatL2(descriptors.shape[1])
index.add(descriptors)
print(index.is_trained)
print(index.ntotal)

True
609
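
`IndexFlatL2` stores the vectors uncompressed and performs an exact, exhaustive search that reports squared Euclidean distances. To make the query-time behaviour concrete, here is a NumPy-only sketch of an equivalent computation on random stand-in data. `flat_l2_search` is an illustrative name; in practice one would call `index.search(queries, k)` on the Faiss index instead:

```python
import numpy as np


def flat_l2_search(database: np.ndarray, queries: np.ndarray, k: int):
    """Exact k-nearest-neighbour search using squared L2 distances."""
    # Pairwise squared distances, shape (num_queries, num_database).
    diff = queries[:, None, :] - database[None, :, :]
    sq_dists = np.einsum("qnd,qnd->qn", diff, diff)
    # Indices of the k smallest distances per query, closest first.
    order = np.argsort(sq_dists, axis=1)[:, :k]
    return np.take_along_axis(sq_dists, order, axis=1), order


rng = np.random.default_rng(0)
database = rng.standard_normal((609, 256)).astype(np.float32)  # stand-in descriptors
queries = database[:3].copy()  # query with vectors that are already in the database

distances, indices = flat_l2_search(database, queries, k=1)
print(indices.ravel())    # each query retrieves its own database entry
print(distances.ravel())  # at zero distance
```

Querying with vectors taken from the database itself is a convenient sanity check: every query should return its own position as the nearest neighbour.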


In [12]:
faiss.write_index(index, os.path.join(DATABASE_TRACK_DIR, "index.faiss"))


## Result

As a result, we now have a Faiss index file that can be used by the `PlaceRecognitionPipeline` to perform the Place Recognition task.

In this example, the file is saved in the same directory as the track data. In general, the index file can be saved anywhere; the minimum requirement for initializing the `PlaceRecognitionPipeline` is that the database directory contains both the `index.faiss` file and the `track.csv` file.

In [13]:
dirs = [d for d in os.listdir(DATABASE_TRACK_DIR) if os.path.isdir(os.path.join(DATABASE_TRACK_DIR, d))]
files = [f for f in os.listdir(DATABASE_TRACK_DIR) if os.path.isfile(os.path.join(DATABASE_TRACK_DIR, f))]


for item in sorted(dirs):
    print(item + "/")
for item in sorted(files):
    print(item)


aruco_labels/
back_cam/
front_cam/
lidar/
masks/
text_descriptions/
text_labels/
demo.mp4
index.faiss
meta_info.yml
track.csv
track_map.png
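
The minimum requirement stated above can be checked before the pipeline is constructed. The following sketch uses only the standard library; `missing_database_files` is a hypothetical helper, not part of the OPR package, and it is exercised here on a temporary directory rather than on the real track:

```python
import tempfile
from pathlib import Path

REQUIRED_FILES = ("index.faiss", "track.csv")  # per the requirement stated above


def missing_database_files(database_dir) -> list:
    """Return the required file names that are absent from database_dir."""
    root = Path(database_dir)
    return [name for name in REQUIRED_FILES if not (root / name).is_file()]


with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "track.csv").touch()
    before = missing_database_files(tmp)  # index not written yet
    (Path(tmp) / "index.faiss").touch()
    after = missing_database_files(tmp)   # both files present

print(before, after)
```

On a real track, calling `missing_database_files(DATABASE_TRACK_DIR)` should return an empty list once the index has been written.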
