A small C++ library for visual place recognition (VPR) that provides a
common Database interface with two backends:
- MegaLoc using ONNX Runtime for global descriptors and FAISS for search.
- Optional: Bag-of-Words (BoW) using OpenCV for descriptor aggregation and FAISS for similarity search.
The core is intentionally minimal: you build descriptors (BoW or MegaLoc), add them to a database, and query for nearest neighbors.
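To make the add-then-query flow concrete, here is a minimal, standard-library-only sketch of a flat nearest-neighbor index. It is not the library's implementation: `FlatIndexSketch`, `add`, and `query` are hypothetical names chosen for illustration, and the exhaustive squared-L2 scan stands in for the FAISS search used by the real backends.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-in for a descriptor database (not the library's Database class).
class FlatIndexSketch {
public:
    explicit FlatIndexSketch(std::size_t dim) : dim_(dim) {}

    // Store one descriptor and return its index in insertion order.
    std::size_t add(const std::vector<float>& descriptor) {
        assert(descriptor.size() == dim_);
        descriptors_.push_back(descriptor);
        return descriptors_.size() - 1;
    }

    // Return up to k (index, squared L2 distance) pairs, closest first.
    std::vector<std::pair<std::size_t, float>> query(const std::vector<float>& q,
                                                     std::size_t k) const {
        assert(q.size() == dim_);
        std::vector<std::pair<std::size_t, float>> hits;
        hits.reserve(descriptors_.size());
        for (std::size_t i = 0; i < descriptors_.size(); ++i) {
            float dist = 0.0f;
            for (std::size_t d = 0; d < dim_; ++d) {
                const float diff = descriptors_[i][d] - q[d];
                dist += diff * diff;
            }
            hits.emplace_back(i, dist);
        }
        // Only the k best entries need to be ordered.
        k = std::min(k, hits.size());
        std::partial_sort(hits.begin(),
                          hits.begin() + static_cast<std::ptrdiff_t>(k),
                          hits.end(),
                          [](const auto& a, const auto& b) { return a.second < b.second; });
        hits.resize(k);
        return hits;
    }

private:
    std::size_t dim_;
    std::vector<std::vector<float>> descriptors_;
};
```

An exhaustive scan like this costs time linear in the number of stored descriptors per query, which is the work a GPU-backed FAISS index is meant to speed up.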
If you are using this module as part of SLAM-MER, see the main project README: SLAM-MER README.
- Single factory API (Database::create) for multiple VPR approaches.
- GPU-accelerated nearest-neighbor search with FAISS.
- ONNX Runtime inference wrapper for MegaLoc.
- Simple CMake build and install.
- Unit tests (ongoing).
- C++17 toolchain and CMake.
- OpenCV development headers.
- FAISS with GPU support.
- ONNX Runtime (GPU build).
- CUDA toolkit matching your GPU.
- visual_place_localization/include: public headers.
- visual_place_localization/src: implementation of databases and inference.
- tests/: unit tests (requires -DBUILD_TESTS=ON).
- thirdparty/: third-party dependencies (FAISS, ONNX Runtime).
git clone https://github.com/slam-mer/visual_place_localization.git
cd visual_place_localization
git submodule update --init --recursive

mise trust
mise install
uv sync

Optional Python extras:

# Development tooling (pre-commit)
uv sync --extra dev

sudo apt install libopencv-dev

Also install and configure GitHub CLI.
cd thirdparty/faiss
mkdir -p build && cd build
cmake -DBUILD_SHARED_LIBS=ON \
      -DFAISS_ENABLE_GPU=ON \
      -DCUDAToolkit_ROOT=/usr/local/cuda-${CUDA_VERSION} \
      -DCMAKE_CUDA_ARCHITECTURES=${CUDA_ARCH} \
      -DBUILD_TESTING=OFF \
      -DFAISS_ENABLE_PYTHON=OFF \
      -DCMAKE_CUDA_COMPILER=/usr/local/cuda-${CUDA_VERSION}/bin/nvcc \
      -DCMAKE_BUILD_TYPE=Release \
      -DCMAKE_INSTALL_PREFIX=./install \
      -GNinja ..
ninja
ninja install
cd ../../..

- ONNX Runtime (https://github.com/microsoft/onnxruntime)
cd thirdparty
wget https://github.com/microsoft/onnxruntime/releases/download/v1.22.0/onnxruntime-linux-x64-gpu-1.22.0.tgz
tar -xvf onnxruntime-linux-x64-gpu-1.22.0.tgz
rm onnxruntime-linux-x64-gpu-1.22.0.tgz
cd ..

cmake -B build -S . \
      -DCMAKE_BUILD_TYPE=Release \
      -DUSE_GPU_DATABASE=ON \
      -DBUILD_TESTS=OFF \
      -DCMAKE_INSTALL_PREFIX=./build/install
cmake --build build --parallel
cmake --install build

We use pre-commit hooks to double-check changes before committing and pushing to the repository. Install and enable them with:
pre-commit install
pre-commit run --all-files

After these steps, pre-commit will run on every git commit.

- CUDA_VERSION (default: 12.8 if not set)
- CUDA_ARCH (default: 86 if not set)
- BUILD_SHARED (default: ON)
- USE_GPU_DATABASE (default: ON)
- BUILD_TESTS (default: OFF)
#include <database.hpp>
#include <opencv2/core.hpp>
using namespace visual_place_localization::database;
int main() {
    Config config;
    config.method = LocalizationApproach::BOW;
    config.vocabularyPath = "/path/to/vocabulary.yml.gz";
    Database::shared_ptr db = Database::create(config);
    // Example descriptor (BoW)
    cv::Mat descriptor(1, 1000, CV_32F, cv::Scalar(0.0f));
    db->add_to_database(descriptor);
    query_results results = db->query_database(0, descriptor, 5);
    return 0;
}

#include <database.hpp>
#include <opencv2/imgcodecs.hpp>
#include <vector>
using namespace visual_place_localization::database;
int main() {
    Config config;
    config.method = LocalizationApproach::MEGALOC;
    config.onnxModelPath = "/path/to/megaloc.onnx";
    Database::shared_ptr db = Database::create(config);
    cv::Mat image = cv::imread("/path/to/image.png");
    std::vector<cv::Mat> descriptors;
    cv::Mat globalDescriptor;
    db->add_to_database(image, descriptors, globalDescriptor);
    query_results results = db->query_database(0, globalDescriptor, 10);
    return 0;
}

find_package(visual_place_localization REQUIRED)
target_link_libraries(your_target PRIVATE visual_place_localization)

Make sure to compile with -DBUILD_TESTS=ON.
Tests can be found in the tests/ folder.
For BoW tests, run:
ctest --test-dir build -R bow_tests --output-on-failure

- Add an executable example
- Add a script to create a BoW vocabulary
- Add a script to create the MegaLoc ONNX model
We provide two method implementations for visual place recognition:
- Bag-of-Words (BoW) using OpenCV for descriptor aggregation and FAISS for similarity search.
- MegaLoc using ONNX Runtime for global descriptors and FAISS for search.
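As an illustration of what "descriptor aggregation" means for the BoW backend, the sketch below turns a set of local descriptors into a single fixed-length histogram by assigning each one to its nearest visual word. It uses only the standard library and is not the OpenCV-based code in this repository: `Descriptor`, `nearest_word`, and `bow_histogram` are hypothetical names, and the squared-L2 assignment and sum-to-one normalization are assumptions made for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

using Descriptor = std::vector<float>;

// Index of the vocabulary word closest (squared L2) to a local descriptor.
std::size_t nearest_word(const Descriptor& d, const std::vector<Descriptor>& vocabulary) {
    assert(!vocabulary.empty());
    std::size_t best = 0;
    float best_dist = std::numeric_limits<float>::max();
    for (std::size_t w = 0; w < vocabulary.size(); ++w) {
        assert(vocabulary[w].size() == d.size());
        float dist = 0.0f;
        for (std::size_t i = 0; i < d.size(); ++i) {
            const float diff = vocabulary[w][i] - d[i];
            dist += diff * diff;
        }
        if (dist < best_dist) {
            best_dist = dist;
            best = w;
        }
    }
    return best;
}

// Aggregate local descriptors into one histogram over the vocabulary.
// Bins hold word frequencies, so they sum to 1 when locals is non-empty.
std::vector<float> bow_histogram(const std::vector<Descriptor>& locals,
                                 const std::vector<Descriptor>& vocabulary) {
    std::vector<float> hist(vocabulary.size(), 0.0f);
    for (const Descriptor& d : locals) {
        hist[nearest_word(d, vocabulary)] += 1.0f;
    }
    if (!locals.empty()) {
        const float count = static_cast<float>(locals.size());
        for (float& bin : hist) {
            bin /= count;
        }
    }
    return hist;
}
```

The resulting histogram has one entry per vocabulary word regardless of how many local descriptors the image produced, which is what lets it be stored and compared as a single vector in the similarity search.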
If you use this repository, please cite:
@inproceedings{merl2026revisiting,
author = {Piedade, Valter and Manam, Lalit and Yamazaki, Masashi and Miraldo, Pedro},
title = {Revisiting Monocular SLAM with Spatio-Temporal Scene Modeling},
booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
year = {2026}
}

Pedro Miraldo
Senior Principal Research Scientist
Mitsubishi Electric Research Laboratories
miraldo@merl.com
Valter Piedade
PhD Student in Electrical and Computer Engineering
Instituto Superior Tecnico, Lisbon
(MERL consultant)
Released under the AGPL-3.0-or-later license, as described in
LICENSE.md.
All files, except those listed in LICENSE-THIRD-PARTY.md, have:
Copyright (C) 2026 Mitsubishi Electric Research Laboratories (MERL)
SPDX-License-Identifier: AGPL-3.0-or-later