This is the PyTorch implementation of VMLoc (Variational Fusion For Learning-Based Multimodal Camera Localization), a simple and efficient neural architecture for multimodal camera localization.
VMLoc supports multiple combinations of Python and PyTorch versions, such as:

- python==3.8, pytorch==1.10.1, torchvision==0.11.2
- python==2.7, pytorch==0.4.1, torchvision==0.2.0
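As a quick sanity check, you can verify the installed versions and GPU availability from Python (a minimal sketch; any matching version pair from the list above should work):

```python
import torch
import torchvision

# Confirm the installed versions match one of the supported combinations.
print("pytorch:", torch.__version__)
print("torchvision:", torchvision.__version__)
print("CUDA available:", torch.cuda.is_available())
```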
We currently support the 7Scenes and Oxford RobotCar datasets. You can also write your own PyTorch dataloader for other datasets and place it in the `data` directory.
- For each sequence, you need to download the `stereo_centre`, `vo`, and `gps` tar files from the dataset website. The directory for each scene (e.g. `loop`) contains `.txt` files defining the train/test split.
- To make training faster, we pre-processed the images using `data/process_robotcar.py`. This script undistorts the images using the camera models provided by the dataset and scales them such that the shortest side is 256 pixels (see the resizing sketch after this list).
- Pixel and pose statistics must be calculated before any training. Use `data/dataset_mean.py`, which also saves the information at the proper location (see the statistics sketch after this list). Pre-computed values for RobotCar and 7Scenes can be found Here(7Scenes) and Here(RobotCar).
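For reference, the rescaling step mentioned above can be sketched as follows. This is a minimal illustration with PIL, not the actual code in `data/process_robotcar.py`; the undistortion step relies on the camera models shipped with the RobotCar dataset and is omitted here:

```python
from PIL import Image

def rescale_shortest_side(img: Image.Image, target: int = 256) -> Image.Image:
    """Scale an image so its shortest side equals `target`, keeping aspect ratio."""
    w, h = img.size
    scale = target / min(w, h)
    return img.resize((round(w * scale), round(h * scale)), Image.BILINEAR)
```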
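Similarly, the per-channel pixel statistics could be computed with a single streaming pass over the training images (a sketch assuming RGB inputs normalized to [0, 1]; `data/dataset_mean.py` may differ in details such as the save format and location):

```python
import numpy as np
from PIL import Image

def channel_stats(image_paths):
    """Compute per-channel pixel mean and std over a list of image files."""
    n = 0
    s = np.zeros(3)
    sq = np.zeros(3)
    for path in image_paths:
        arr = np.asarray(Image.open(path).convert("RGB"), dtype=np.float64) / 255.0
        n += arr.shape[0] * arr.shape[1]   # pixels seen so far
        s += arr.sum(axis=(0, 1))          # per-channel sum
        sq += (arr ** 2).sum(axis=(0, 1))  # per-channel sum of squares
    mean = s / n
    std = np.sqrt(sq / n - mean ** 2)      # Var[x] = E[x^2] - E[x]^2
    return mean, std
```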
The executable script is `training.py`. For example:
- Direct concatenation on `stairs` from `7Scenes`:

```
python -u -m training --dataset 7Scenes --scene stairs --mode concatenate \
--gpus 0 --color_jitter 0. --train_mean 0
```
- VMLoc on `stairs` from `7Scenes`:

```
python -u -m training --dataset 7Scenes --scene stairs --mode vmloc \
--gpus 0 --color_jitter 0. --train_mean 0 --train_mask 50
```
- VMLoc on `redkitchen` from `7Scenes`:

```
python -u -m training --dataset 7Scenes --scene redkitchen --mode vmloc \
--gpus 0 --color_jitter 0. --train_mean 0 --train_mask 150
```
- VMLoc on `loop` from `RobotCar`:

```
python -u -m training --dataset RobotCar --scene loop --mode vmloc \
--gpus 0 --gamma -3.0 --color_jitter 0.7 --train_mean 0 --train_mask 500
```
The meanings of the command-line parameters are documented in `training.py`. The values of the hyperparameters are defined in `tools/options.py`.
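For orientation, the flags used in the examples above might be declared roughly as follows. This is a hypothetical sketch; the authoritative definitions, defaults, and help strings live in `tools/options.py`:

```python
import argparse

# Hypothetical declaration of the flags used in the training examples;
# see tools/options.py for the real definitions and defaults.
parser = argparse.ArgumentParser()
parser.add_argument("--dataset", choices=["7Scenes", "RobotCar"])
parser.add_argument("--scene", type=str)
parser.add_argument("--mode", choices=["concatenate", "vmloc"])
parser.add_argument("--gpus", type=str, default="0")
parser.add_argument("--color_jitter", type=float, default=0.0)
parser.add_argument("--train_mean", type=int, default=0)
parser.add_argument("--train_mask", type=int, default=0)
parser.add_argument("--gamma", type=float, default=0.0)
parser.add_argument("--seed", type=int, default=2019)
parser.add_argument("--weights", type=str)
```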
Trained models for some of the experiments presented in the paper can be found here: Weights. The inference script is `testing.py`. Here are some examples, assuming the models are downloaded into `logs`.
- VMLoc on `redkitchen` from `7Scenes`:

```
python testing.py --mode vmloc --dataset 7Scenes --scene redkitchen \
--seed 2019 --weights record/red_kitchen.pth.tar
```
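Before running `testing.py`, it can be useful to inspect a downloaded checkpoint (a minimal sketch; the exact contents of the `.pth.tar` files are an assumption):

```python
import torch

# Load the checkpoint on CPU and list its top-level entries
# (typically a model state_dict plus training metadata).
ckpt = torch.load("record/red_kitchen.pth.tar", map_location="cpu")
print(list(ckpt.keys()))
```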
If you find this code useful for your research, please cite our paper:
```
@inproceedings{zhou2021vmloc,
  title={Vmloc: Variational fusion for learning-based multimodal camera localization},
  author={Zhou, Kaichen and Chen, Changhao and Wang, Bing and Saputra, Muhamad Risqi U and Trigoni, Niki and Markham, Andrew},
  booktitle={Proceedings of the AAAI Conference on Artificial Intelligence},
  volume={35},
  number={7},
  pages={6165--6173},
  year={2021}
}
```