📖 Paper: RA-L
📖 Pre-print: arXiv
📹 Video: YouTube
Control/Robotics Research Laboratory (CRRL), Department of Electrical and Computer Engineering, NYU Tandon School of Engineering
- SALSA: A novel, lightweight, and efficient framework for LiDAR place recognition that delivers state-of-the-art performance while maintaining real-time operation.
- SphereFormer: Used to extract local descriptors with radial and cubic window attention, improving localization performance for sparse distant points.
- Adaptive Pooling: A self-attention adaptive pooling module that fuses local descriptors into global tokens. It can aggregate an arbitrary number of points in a point cloud without pre-processing (a minimal sketch of such a pooling step is given after Fig. 1).
- MLP Mixer Token Aggregator: An MLP-Mixer-based aggregator that iteratively incorporates global context to generate a robust global scene descriptor.
Fig. 1: Overview of our SALSA framework to generate scene descriptors from point clouds for place recognition.
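To give an intuition for the adaptive pooling stage, the snippet below is a minimal, hypothetical sketch of self-attention pooling in PyTorch: a fixed number of learned query tokens cross-attend over an arbitrary number of local descriptors and fuse them into global tokens. It illustrates the idea only and is not the exact SALSA module; the class name, dimensions, and token count are illustrative.
import torch
import torch.nn as nn
class AdaptivePooling(nn.Module):  # illustrative name, not the repo class
    def __init__(self, dim=256, num_tokens=4, num_heads=4):
        super().__init__()
        # Learned query tokens: their count fixes the output size regardless
        # of how many local descriptors the scan produces.
        self.queries = nn.Parameter(torch.randn(num_tokens, dim))
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)
    def forward(self, local_desc):
        # local_desc: (B, N, dim), where N varies from scan to scan.
        q = self.queries.unsqueeze(0).expand(local_desc.shape[0], -1, -1)
        tokens, _ = self.attn(q, local_desc, local_desc)  # queries attend over all local descriptors
        return self.norm(tokens)                          # (B, num_tokens, dim) global tokens
pool = AdaptivePooling()
tokens = pool(torch.randn(1, 10000, 256))                 # 10,000 local descriptors -> 4 global tokens
print(tokens.shape)                                        # torch.Size([1, 4, 256])
Because the output size is set by the number of learned queries, no subsampling or padding of the input point cloud is required.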
conda create --name salsa python=3.10.11
conda activate salsa
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu117
pip install torch-scatter -f https://data.pyg.org/whl/torch-2.0.1+cu117.html
pip install torch-sparse -f https://data.pyg.org/whl/torch-2.0.1+cu117.html
pip install torch-cluster -f https://data.pyg.org/whl/torch-2.0.1+cu117.html
pip install -r requirements.txt
pip install --no-deps timm==0.9.7
Install SpTr from source.
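As an optional sanity check that the environment is set up correctly, the following lines should run without errors (the package names are the ones installed above):
import torch
print(torch.__version__, torch.cuda.is_available())  # expect a CUDA 11.7 build and a visible GPU
import torch_scatter, torch_sparse, torch_cluster    # PyG extension wheels installed above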
The model is trained on the MulRan Sejong 01/02 sequences and Apollo-Southbay (excluding Sunnyvale). Evaluation is performed on the 'easy set' (Apollo-Southbay Sunnyvale, SemanticKITTI, MulRan Sejong) and the 'hard set' (MulRan DCC1/DCC2, KITTI-360, ALITA). The datasets can be downloaded from the following links.
- MulRan dataset: ground truth data (*.csv) and LiDAR point clouds (Ouster.zip).
- Apollo-Southbay dataset.
- SemanticKITTI dataset (velodyne point clouds and calibration data for poses).
- ALITA dataset.
- KITTI-360 dataset (raw velodyne scans, calibrations and vehicle poses).
Create Training Pickle
cd src/data/datasets/
python southbay/generate_training_tuples.py --dataset_root <path_to_southbay_dataset>
python mulran/generate_training_tuples.py --dataset_root <path_to_mulran_dataset>
Create Evaluation Pickle
python mulran/generate_evaluation_sets.py --dataset_root <path_to_mulran_dataset> --sequence sejong
python mulran/generate_evaluation_sets.py --dataset_root <path_to_mulran_dataset> --sequence mulran
python southbay/generate_evaluation_sets.py --dataset_root <path_to_southbay_dataset>
python kitti/generate_evaluation_sets.py --dataset_root <path_to_kitti_dataset>
python kitti360/generate_evaluation_sets.py --dataset_root <path_to_kitti360_dataset>
python alita/generate_evaluation_sets.py --dataset_root <path_to_alita_dataset>
Navigate to the repository root, create a folder named checkpoints inside src to store the trained models, and start training. To change the model and training parameters, edit config/model.yaml and config/train.yaml, respectively.
mkdir -p src/checkpoints/SALSA/Model src/checkpoints/SALSA/PCA
python src/train.py
This will train the model on the generated training dataset and save the model for each epoch in src/checkpoints.
Learn PCA using trained model
python src/evaluate/pca.py
This will learn a PCA to compress and decorrelate the scene descriptors and store the learned PCA as a PyTorch model in src/checkpoints.
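For intuition, applying such a learned PCA at retrieval time amounts to a mean subtraction followed by a linear projection. The lines below are a minimal sketch; the file name and the 'mean'/'components' keys are hypothetical and not necessarily the exact format written by pca.py.
import torch
import torch.nn.functional as F
ckpt = torch.load('src/checkpoints/SALSA/PCA/pca.pth')        # hypothetical file name
mean, components = ckpt['mean'], ckpt['components']           # (D,) mean and (d, D) projection matrix
descriptor = torch.randn(1, components.shape[1])              # raw scene descriptor from SALSA
compressed = F.normalize((descriptor - mean) @ components.T)  # (1, d) compact, decorrelated descriptor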
python src/evaluate/SALSA/eval_salsa_sgv.py --dataset_root <path_to_dataset> --dataset_type <name_of_dataset> --only_global True
Our pre-trained models can also be downloaded from this link. After downloading, copy the contents into the 'src/checkpoints' directory.
The spread of Recall@1 before and after re-ranking for the best-performing models is plotted in the following figure.
Fig. 2: Box plot displaying Recall@1 across six datasets, with first to third quartile spans, whiskers for data variability, and internal lines as medians.
Fig. 3: Visualization of areas attended to by different tokens from the adaptive pooling layer. Each token focuses on different geometries: trees and traffic signs (green), road intersections (red), and distant points (blue).
Fig. 4: Point matches between query and target clouds using LoGG3D-Net and SALSA local descriptors. Matching colors indicate correspondences; circles highlight SALSA’s superior performance on sparse distant points.
Fig. 5: Comparison of LiDAR-only odometry and maps: (a) without loop detection, and (b) after online pose graph optimization from SALSA loop detections. The highlighted rectangles emphasize the map and odometry disparities due to loop closures.
If you find our work useful in your research, please consider citing our publication:
@article{goswami2024salsa,
title={SALSA: Swift Adaptive Lightweight Self-Attention for Enhanced LiDAR Place Recognition},
author={Goswami, Raktim Gautam and Patel, Naman and Krishnamurthy, Prashanth and Khorrami, Farshad},
journal={IEEE Robotics and Automation Letters},
year={2024},
}