
Semantic Indoor Place Recognition

The repo will be cleaned up soon.

[Figure: overview of the AEGIS-Net architecture]

Introduction

This repository contains the implementation of AEGIS-Net in PyTorch.

AEGIS-Net is an indoor place recognition network extended from our previous work CGiS-Net. It is a two-stage network: it first learns a semantic encoder-decoder to extract semantic features from coloured point clouds, and then learns a feature embedding module to generate global descriptors for place recognition.
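As a rough illustration of this two-stage layout, here is a minimal PyTorch sketch. All module names, layer choices and dimensions below are placeholders (the real network uses a KP-Conv encoder-decoder and an attention-based embedding), not the actual implementation:

```python
import torch
import torch.nn as nn

class SemanticEncoderDecoder(nn.Module):
    """Stage 1: per-point semantic features from coloured point clouds."""
    def __init__(self, in_dim=6, feat_dim=64, num_classes=20):
        super().__init__()
        # Stand-in for the KP-Conv encoder-decoder.
        self.encoder = nn.Sequential(nn.Linear(in_dim, feat_dim), nn.ReLU())
        self.decoder = nn.Linear(feat_dim, num_classes)

    def forward(self, points):           # points: (N, 6) = [x, y, z, r, g, b]
        feats = self.encoder(points)     # (N, feat_dim) semantic features
        logits = self.decoder(feats)     # (N, num_classes) for stage-1 training
        return feats, logits

class FeatureEmbedding(nn.Module):
    """Stage 2: pool per-point features into one global descriptor."""
    def __init__(self, feat_dim=64, desc_dim=256):
        super().__init__()
        self.proj = nn.Linear(feat_dim, desc_dim)

    def forward(self, feats):
        desc = self.proj(feats).max(dim=0).values    # simple max-pool stand-in
        return nn.functional.normalize(desc, dim=0)  # unit-length descriptor

points = torch.randn(1024, 6)
feats, _ = SemanticEncoderDecoder()(points)
descriptor = FeatureEmbedding()(feats)               # (256,) global descriptor
```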

Installation

This implementation has been tested on Ubuntu 18.04, 20.04 and 22.04.

  • For Ubuntu 18.04 installation, please see the instructions from the official KP-Conv repository INSTALL.md.

  • For Ubuntu 20.04 and 22.04 installation, the procedure is basically the same, except that different package versions are used (see the version check after the list):

    • Ubuntu 20.04: PyTorch 1.8.0, torchvision 0.9.0, CUDA 11.1, cuDNN 8.6.0

    • Ubuntu 22.04: PyTorch 1.13.0, torchvision 0.14.0, CUDA 11.7
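
To confirm that the installed versions match one of the configurations above, a quick check using standard PyTorch/torchvision attributes:

```python
import torch
import torchvision

# On the Ubuntu 22.04 setup this should report 1.13.0 / 0.14.0 / 11.7.
print(torch.__version__, torchvision.__version__, torch.version.cuda)
print("CUDA available:", torch.cuda.is_available())
```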

Experiments

Data

The ScanNetPR dataset can be downloaded here

├── ScanNetPR
│   ├── scans                              # folder to hold all the data
│   │   ├── scene0000_00
│   │   │   ├── input_pcd_0mean
│   │   │   │   ├── scene0000_00_0_sub.ply # zero-meaned point cloud file stored as [x, y, z, r, g, b]
│   │   │   │   ├── ...
│   │   │   ├── pose
│   │   │   │   ├── 0.txt                  # pose corresponding to the point cloud
│   │   │   │   ├── ...
│   │   │   ├── scene0000_00.txt           # scene information
│   │   ├── ...
│   ├── views/Tasks/Benchmark              # stores all the data split files from the ScanNet dataset
│   ├── VLAD_triplets                      # stores all the files necessary for generating training tuples
├── batch_limits.pkl                       # calibration file for KP-Conv
├── max_in_limits.pkl                      # calibration file for KP-Conv
├── neighbors_limits.pkl                   # calibration file for KP-Conv
└── other ScanNet related files ...
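
For reference, a minimal loader for one frame of this layout; it assumes the plyfile package and the usual PLY vertex property names (x, y, z, red, green, blue), which may differ from the actual files:

```python
import numpy as np
from plyfile import PlyData  # pip install plyfile (any PLY reader would do)

# Read one zero-meaned, coloured, sub-sampled point cloud.
ply = PlyData.read("ScanNetPR/scans/scene0000_00/input_pcd_0mean/scene0000_00_0_sub.ply")
v = ply["vertex"]
xyz = np.stack([v["x"], v["y"], v["z"]], axis=1)           # zero-meaned coordinates
rgb = np.stack([v["red"], v["green"], v["blue"]], axis=1)  # per-point colour

# The matching pose file is plain text; ScanNet poses are 4x4
# camera-to-world matrices.
pose = np.loadtxt("ScanNetPR/scans/scene0000_00/pose/0.txt")
assert pose.shape == (4, 4)
```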

Training stage 1:

In the first stage we train the semantic encoder and decoder on a SLAM-Segmentation task, i.e. semantic segmentation on coloured point clouds in local coordinate systems.

  1. Change the self.path variable in the datasets/ScannetSLAM.py file to the path of complete ScanNet dataset.

  2. Run the following to train the semantic encoder and decoder.

python train_ScannetSLAM.py
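
Schematically, this stage is ordinary supervised semantic segmentation: per-point logits against per-point labels under a cross-entropy loss. A toy training step, where the linear model stands in for the KP-Conv encoder-decoder and all shapes and hyperparameters are illustrative:

```python
import torch
import torch.nn as nn

# Toy stand-in for the KP-Conv encoder-decoder; 6 input channels
# ([x, y, z, r, g, b]) and 20 semantic classes are illustrative.
model = nn.Sequential(nn.Linear(6, 64), nn.ReLU(), nn.Linear(64, 20))
optimiser = torch.optim.Adam(model.parameters(), lr=1e-3)
criterion = nn.CrossEntropyLoss()

points = torch.randn(1024, 6)           # one local-frame coloured point cloud
labels = torch.randint(0, 20, (1024,))  # per-point semantic labels

logits = model(points)                  # (N, num_classes) per-point logits
loss = criterion(logits, labels)
optimiser.zero_grad()
loss.backward()
optimiser.step()
```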

The training usually takes a day. We also provide our pretrained encoder-decoder here if you want to skip the first training stage.

Please download the folder and put it in the results directory. The folder Log_2021-06-16_02-31-04 contains the model trained on the complete ScanNet dataset WITHOUT colour, and the folder Log_2021-06-16_02-42-30 contains the model trained on the complete ScanNet dataset WITH colour.

Training stage 2:

In the second stage, we train the feature embedding module to generate the global descriptors.

  1. Change the self.path variable in the datasets/ScannetTriple.py file to the path of ScanNetPR dataset.

  2. Run the training file as:

python feature_embedding_main.py --train

Train the model with different settings (see the example command and loss sketch after this list):

  • --num_feat changes the number of feature layers; default 3, chosen from [3, 1] for the attention version and [1, 3, 5] for the no-attention version;
  • --optimiser changes the optimiser; default Adam, chosen from [SGD, Adam];
  • --loss changes the loss function; default lazy_quadruplet, chosen from [triplet, lazy_triplet, lazy_quadruplet];
  • --no_att set to use the no-attention version;
  • --no_color set to use point clouds without colour.
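
For example, to train the no-attention variant on colourless point clouds with SGD and the lazy triplet loss (an illustrative combination of the flags above):

python feature_embedding_main.py --train --no_att --no_color --optimiser SGD --loss lazy_triplet

The "lazy" losses follow the PointNetVLAD convention of letting only the hardest negative contribute to the hinge. A minimal sketch of the lazy triplet variant, with illustrative margin, shapes and function name:

```python
import torch
import torch.nn.functional as F

def lazy_triplet_loss(anchor, positives, negatives, margin=0.5):
    # anchor: (D,), positives: (P, D), negatives: (N, D) global descriptors.
    d_pos = F.pairwise_distance(anchor.unsqueeze(0), positives).min()
    d_neg = F.pairwise_distance(anchor.unsqueeze(0), negatives).min()
    # 'Lazy': only the closest (hardest) negative enters the hinge.
    return F.relu(margin + d_pos - d_neg)

loss = lazy_triplet_loss(torch.randn(256), torch.randn(2, 256), torch.randn(18, 256))
```

The lazy quadruplet loss adds a second hinge term that also pushes negatives away from a randomly sampled extra negative, discouraging negatives from clustering together.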

Evaluation:

Run the file with the --test flag on, and perform evaluation with the --evaluate flag on (add --visualise to visualise the results):

python feature_embedding_main.py --test --evaluate --visualise
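
Under the hood, evaluation of this kind reduces to descriptor retrieval: each query descriptor is matched against a database, and a query counts as correct if any of its top-K retrieved entries is a true match. A schematic recall@K computation (function name and inputs are illustrative; the ground-truth match lists are dataset-specific and assumed given):

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

def average_recall_at_k(query_desc, db_desc, gt_matches, ks=(1, 2, 3)):
    # query_desc: (Q, D), db_desc: (M, D); gt_matches[q] lists the database
    # indices that count as correct retrievals for query q.
    nn = NearestNeighbors(n_neighbors=max(ks)).fit(db_desc)
    _, idx = nn.kneighbors(query_desc)  # (Q, max(ks)) retrieved indices
    return {k: float(np.mean([bool(set(idx[q, :k]) & set(gt_matches[q]))
                              for q in range(len(query_desc))]))
            for k in ks}
```

This is what the Top-1/2/3 average recall rates in the Results section measure.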

Visualisations

  • Kernel Visualisation: using the script from the KP-Conv repository, the kernel deformations can be displayed.

Results

Our AEGIS-Net is compared to a traditional baseline using SIFT + BoW and five deep-learning-based methods: NetVLAD, PointNetVLAD, MinkLoc3D, Indoor DH3D and CGiS-Net.

| Model \ Average Recall Rate | Top-1 | Top-2 | Top-3 | Epochs / Time Trained |
| --- | --- | --- | --- | --- |
| AEGIS-Net (default) | 65.09% | 74.26% | 79.06% | 20 epochs (4 days) |
| AEGIS-Net (no attention) | 55.13% | 66.19% | 71.95% | 20 epochs (4 days) |
| CGiS-Net (default) | 56.82% | 66.46% | 71.74% | 20 epochs (7 days) |
| CGiS-Net (default) | 61.12% | 70.23% | 75.06% | 60 epochs (21 days) |
| SIFT + BoW | 16.16% | 21.17% | 24.38% | - |
| NetVLAD | 21.77% | 33.81% | 41.49% | - |
| PointNetVLAD | 5.31% | 7.50% | 9.99% | - |
| MinkLoc3D | 3.32% | 5.81% | 8.27% | - |
| Indoor DH3D | 16.10% | 21.92% | 25.30% | - |

NOTE: AEGIS-Net (no attention) = CGiS-Net (3 feats, using feats 2, 4, 5).

Acknowledgment

In this project, we use parts of the official implementations of the following works:

Future Work

  • Test on the NAVER Indoor Localisation Dataset.

  • Test on outdoor datasets (e.g. the Oxford RobotCar Dataset).

  • Explore attention modules for better feature selection before constructing global descriptors.
