slam-mer/visual_place_localization

Visual Place Recognition

A small C++ library for visual place recognition (VPR) that provides a common Database interface with two backends:

  • MegaLoc using ONNX Runtime for global descriptors and FAISS for search.
  • Optional: Bag-of-Words (BoW) using OpenCV for descriptor aggregation and FAISS for similarity search.

The core is intentionally minimal: you build descriptors (BoW or MegaLoc), add them to a database, and query for nearest neighbors.
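
To make the add/query flow concrete, here is a minimal, self-contained sketch of a brute-force nearest-neighbor database. It is an illustration only: `ToyDatabase`, `add`, and `query` are hypothetical names, and the library itself delegates the search to FAISS rather than looping over entries like this.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical toy database; NOT the library's Database class.
class ToyDatabase {
 public:
  explicit ToyDatabase(std::size_t dim) : dim_(dim) {}

  // Stores a descriptor and returns the index it was assigned.
  std::size_t add(const std::vector<float>& descriptor) {
    assert(descriptor.size() == dim_);
    entries_.push_back(descriptor);
    return entries_.size() - 1;
  }

  // Returns up to k (index, squared L2 distance) pairs, closest first.
  std::vector<std::pair<std::size_t, float>> query(
      const std::vector<float>& descriptor, std::size_t k) const {
    assert(descriptor.size() == dim_);
    std::vector<std::pair<std::size_t, float>> scored;
    scored.reserve(entries_.size());
    for (std::size_t i = 0; i < entries_.size(); ++i) {
      float dist = 0.0f;
      for (std::size_t d = 0; d < dim_; ++d) {
        const float diff = entries_[i][d] - descriptor[d];
        dist += diff * diff;
      }
      scored.emplace_back(i, dist);
    }
    // Only the k best entries need to be sorted.
    const std::size_t top = std::min(k, scored.size());
    std::partial_sort(
        scored.begin(), scored.begin() + top, scored.end(),
        [](const auto& a, const auto& b) { return a.second < b.second; });
    scored.resize(top);
    return scored;
  }

 private:
  std::size_t dim_;
  std::vector<std::vector<float>> entries_;
};
```

An exhaustive scan like this costs time linear in the number of stored descriptors per query, which is the kind of workload a GPU-backed FAISS index is meant to speed up.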

If you are using this module as part of SLAM-MER, see the main SLAM-MER project README.

Features

  • Single factory API (Database::create) for multiple VPR approaches.
  • GPU-accelerated nearest-neighbor search with FAISS.
  • ONNX Runtime inference wrapper for MegaLoc.
  • Simple CMake build and install.
  • Unit tests (ongoing).

Requirements

  • C++17 toolchain and CMake.
  • OpenCV development headers.
  • FAISS with GPU support.
  • ONNX Runtime (GPU build).
  • CUDA toolkit matching your GPU.

Project Layout

  • visual_place_localization/include: public headers.
  • visual_place_localization/src: implementation of databases and inference.
  • tests/: unit tests (requires -DBUILD_TESTS=ON).
  • thirdparty/: third-party dependencies (FAISS, ONNX Runtime).

Installation

1. Clone the repository and submodules:

git clone https://github.com/slam-mer/visual_place_localization.git
cd visual_place_localization
git submodule update --init --recursive

2. Set up the virtual environment

mise trust
mise install
uv sync

Optional Python extras:

# Development tooling (pre-commit)
uv sync --extra dev

3. Install prerequisites (only if using BoW)

sudo apt install libopencv-dev

Also install and configure GitHub CLI.

4. Install third-party dependencies

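# Build and install FAISS with GPU support.
# CUDA_VERSION and CUDA_ARCH must be set in your shell first (e.g. 12.8 and 86).
cd thirdparty/faiss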
mkdir -p build && cd build
cmake -DBUILD_SHARED_LIBS=ON \
      -DFAISS_ENABLE_GPU=ON \
      -DCUDAToolkit_ROOT=/usr/local/cuda-${CUDA_VERSION} \
      -DCMAKE_CUDA_ARCHITECTURES=${CUDA_ARCH} \
      -DBUILD_TESTING=OFF \
      -DFAISS_ENABLE_PYTHON=OFF \
      -DCMAKE_CUDA_COMPILER=/usr/local/cuda-${CUDA_VERSION}/bin/nvcc \
      -DCMAKE_BUILD_TYPE=Release \
      -DCMAKE_INSTALL_PREFIX=./install \
      -GNinja ..
ninja
ninja install
cd ../../..
# Download and unpack the prebuilt ONNX Runtime GPU release.
cd thirdparty
wget https://github.com/microsoft/onnxruntime/releases/download/v1.22.0/onnxruntime-linux-x64-gpu-1.22.0.tgz
tar -xvf onnxruntime-linux-x64-gpu-1.22.0.tgz
rm onnxruntime-linux-x64-gpu-1.22.0.tgz
cd ..

5. Build the visual place recognition library

cmake -B build -S . \
    -DCMAKE_BUILD_TYPE=Release \
    -DUSE_GPU_DATABASE=ON \
    -DBUILD_TESTS=OFF \
    -DCMAKE_INSTALL_PREFIX=./build/install
cmake --build build --parallel
cmake --install build

We use pre-commit hooks to double-check changes before committing and pushing to the repository. Install and enable them with:

pre-commit install
pre-commit run --all-files

After these steps, pre-commit will run on every git commit.

Usage

Build options:

  • CUDA_VERSION (default: 12.8 if not set)
  • CUDA_ARCH (default: 86 if not set)
  • BUILD_SHARED (default: ON)
  • USE_GPU_DATABASE (default: ON)
  • BUILD_TESTS (default: OFF)

C++ example (factory + add/query)

#include <database.hpp>
#include <opencv2/core.hpp>

using namespace visual_place_localization::database;

int main() {
  Config config;
  config.method = LocalizationApproach::BOW;
  config.vocabularyPath = "/path/to/vocabulary.yml.gz";

  Database::shared_ptr db = Database::create(config);

  // Example descriptor (BoW)
  cv::Mat descriptor(1, 1000, CV_32F, cv::Scalar(0.0f));
  db->add_to_database(descriptor);

  query_results results = db->query_database(0, descriptor, 5);
  return 0;
}
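
For readers unfamiliar with BoW aggregation, the following self-contained sketch shows the general idea: each local descriptor is assigned to its nearest visual word, and the image is summarized as a normalized histogram of word counts. It is a simplified illustration under stated assumptions, not the library's OpenCV-based implementation. `nearest_word` and `bow_histogram` are hypothetical helpers, and reading the 1 x 1000 descriptor above as one bin per word of a 1000-word vocabulary is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

using Vec = std::vector<float>;

// Index of the vocabulary word closest (squared L2) to a local descriptor.
std::size_t nearest_word(const Vec& descriptor,
                         const std::vector<Vec>& vocabulary) {
  assert(!vocabulary.empty());
  std::size_t best = 0;
  float best_dist = std::numeric_limits<float>::max();
  for (std::size_t w = 0; w < vocabulary.size(); ++w) {
    assert(vocabulary[w].size() == descriptor.size());
    float dist = 0.0f;
    for (std::size_t d = 0; d < descriptor.size(); ++d) {
      const float diff = vocabulary[w][d] - descriptor[d];
      dist += diff * diff;
    }
    if (dist < best_dist) {
      best_dist = dist;
      best = w;
    }
  }
  return best;
}

// Histogram of word occurrences (one bin per vocabulary word),
// normalized so that the bins sum to 1 for a non-empty image.
Vec bow_histogram(const std::vector<Vec>& local_descriptors,
                  const std::vector<Vec>& vocabulary) {
  Vec histogram(vocabulary.size(), 0.0f);
  for (const Vec& descriptor : local_descriptors) {
    histogram[nearest_word(descriptor, vocabulary)] += 1.0f;
  }
  if (!local_descriptors.empty()) {
    const float count = static_cast<float>(local_descriptors.size());
    for (float& bin : histogram) {
      bin /= count;
    }
  }
  return histogram;
}
```

The resulting histogram has a fixed length regardless of how many local features the image produced, which is what allows it to be stored in and queried from a single vector index.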

MegaLoc example (image -> descriptor)

#include <database.hpp>
#include <opencv2/imgcodecs.hpp>

#include <vector>

using namespace visual_place_localization::database;

int main() {
  Config config;
  config.method = LocalizationApproach::MEGALOC;
  config.onnxModelPath = "/path/to/megaloc.onnx";

  Database::shared_ptr db = Database::create(config);

  cv::Mat image = cv::imread("/path/to/image.png");
  if (image.empty()) {
    return 1;  // cv::imread returns an empty Mat when the file cannot be read
  }
  std::vector<cv::Mat> descriptors;
  cv::Mat globalDescriptor;

  db->add_to_database(image, descriptors, globalDescriptor);
  query_results results = db->query_database(0, globalDescriptor, 10);
  return 0;
}
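
Global descriptors are often compared by cosine similarity, which reduces to an inner product once both vectors are scaled to unit L2 norm. The sketch below illustrates that relationship in plain C++. Whether this library normalizes MegaLoc descriptors, and which FAISS metric it uses, is not stated here, so treat `l2_normalize` and `inner_product` as hypothetical helpers rather than part of the API.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Scales a descriptor to unit L2 norm in place.
// A zero vector is left unchanged to avoid dividing by zero.
void l2_normalize(std::vector<float>& v) {
  float sum_sq = 0.0f;
  for (float x : v) {
    sum_sq += x * x;
  }
  const float norm = std::sqrt(sum_sq);
  if (norm > 0.0f) {
    for (float& x : v) {
      x /= norm;
    }
  }
}

// Inner product of two equally sized descriptors.
// For unit-norm inputs this equals their cosine similarity.
float inner_product(const std::vector<float>& a, const std::vector<float>& b) {
  assert(a.size() == b.size());
  float sum = 0.0f;
  for (std::size_t i = 0; i < a.size(); ++i) {
    sum += a[i] * b[i];
  }
  return sum;
}
```

With unit-norm descriptors, ranking by largest inner product and ranking by smallest L2 distance give the same order, so either metric can be used for the search.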

CMake integration (consumer project)

find_package(visual_place_localization REQUIRED)
target_link_libraries(your_target PRIVATE visual_place_localization)

Unit Tests

Configure the build with -DBUILD_TESTS=ON before running the tests.

Tests can be found in the tests/ folder.

For BoW tests, run:

ctest --test-dir build -R bow_tests --output-on-failure

TODOs

  1. Add an executable example
  2. Add a script to create a BoW vocabulary
  3. Add a script to create the MegaLoc ONNX model

Citation

If you use this repository, please cite:

@inproceedings{merl2026revisiting,
  author = {Piedade, Valter and Manam, Lalit and Yamazaki, Masashi and Miraldo, Pedro},
  title = {Revisiting Monocular SLAM with Spatio-Temporal Scene Modeling},
  booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
  year = {2026}
}

Contact

Pedro Miraldo
Senior Principal Research Scientist, Mitsubishi Electric Research Laboratories, miraldo@merl.com

Valter Piedade
PhD Student in Electrical and Computer Engineering, Instituto Superior Tecnico, Lisbon (MERL consultant)

License

Released under the AGPL-3.0-or-later license, as described in LICENSE.md.

All files, except those listed in LICENSE-THIRD-PARTY.md, have:

Copyright (C) 2026 Mitsubishi Electric Research Laboratories (MERL)
SPDX-License-Identifier: AGPL-3.0-or-later

About

Module for image re-localization in SLAM-MER
