
IMFNet: Interpretable Multimodal Fusion for Point Cloud Registration, 2022


This repository is the implementation of IMFNet: Interpretable Multimodal Fusion for Point Cloud Registration.

The existing state-of-the-art point descriptors rely on structure information only and omit texture information. However, texture information is crucial for humans to distinguish parts of a scene. Moreover, current learning-based point descriptors are all black boxes, and it is unclear how the original points contribute to the final descriptors. In this paper, we propose a new multimodal fusion method to generate point cloud registration descriptors by considering both structure and texture information. Specifically, a novel attention-fusion module is designed to extract the weighted texture information for descriptor extraction. In addition, we propose an interpretable module to explain our neural network by visually showing which original points contribute to the final descriptors. We use the descriptor's channel value as the loss, backpropagate it to the target layer, and treat the gradient as the significance of each point to the final descriptor. This paper moves one step further towards explainable deep learning in the registration task. Comprehensive experiments on 3DMatch, 3DLoMatch and KITTI demonstrate that the multimodal fusion descriptor achieves state-of-the-art accuracy and improves the descriptor's distinctiveness. We also demonstrate that our interpretable module can explain the registration descriptor extraction.

Paper

FMR vs. RR


Feature-match recall and registration recall (log scale) on the 3DMatch benchmark.

The framework of IMFNet

The network architecture of the proposed IMFNet. The input is a point cloud and an image, and the output is the point descriptors. Inside the attention-fusion module, W is the weight matrix and FI is the point texture feature. The fusion feature (Ffe) of the point structure feature (Fpe) and the point texture feature (FI) is fed to the decoder module to produce the output descriptors. Finally, the descriptors are interpreted by DAM.

The Overall Framework

Please refer to our paper for more details.
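As a rough illustration of the attention-fusion idea described above, here is a minimal sketch: it attends from point structure features (Fpe) to image texture features to obtain a per-point texture feature (FI), then fuses both streams into Ffe for the decoder. The layer sizes, names, and module structure are our assumptions, not the repository's exact implementation.

import torch
import torch.nn as nn

class AttentionFusionSketch(nn.Module):
    # Illustrative attention-fusion block (assumed dimensions, not IMFNet's exact config).
    def __init__(self, d_point=256, d_image=256, d_model=256):
        super().__init__()
        self.q = nn.Linear(d_point, d_model)              # queries from Fpe
        self.k = nn.Linear(d_image, d_model)              # keys from image features
        self.v = nn.Linear(d_image, d_model)              # values from image features
        self.fuse = nn.Linear(d_point + d_model, d_model)

    def forward(self, f_pe, f_img):
        # f_pe: (N, d_point) point structure features; f_img: (M, d_image) image features
        q, k, v = self.q(f_pe), self.k(f_img), self.v(f_img)
        w = torch.softmax(q @ k.t() / q.shape[-1] ** 0.5, dim=-1)   # weight matrix W, (N, M)
        f_i = w @ v                                                 # point texture feature FI, (N, d_model)
        f_fe = self.fuse(torch.cat([f_pe, f_i], dim=-1))            # fusion feature Ffe
        return f_fe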

Visualization of DAM

Our DAM can visualize the distribution of point contributions to descriptor extraction.

(Example DAM visualizations comparing IMFNet and FCGF.)

Requirements

  • Ubuntu 18.04.1 or higher
  • CUDA 11.1 or higher
  • Python v3.6 or higher
  • PyTorch v1.8 or higher
  • MinkowskiEngine v0.5 or higher
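
A quick way to sanity-check the environment before training is to print the installed versions; this is just a convenience snippet, assuming the standard torch and MinkowskiEngine package names.

import torch
import MinkowskiEngine as ME

print("PyTorch:", torch.__version__)
print("CUDA available:", torch.cuda.is_available())
print("MinkowskiEngine:", ME.__version__)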

Dataset Download

For 3DMatch and 3DLoMatch, images are selected for each point cloud based on their covered content to construct a dataset of paired images and point clouds, named 3DImageMatch. Our experiments are conducted on this dataset. The dataset construction and training details are provided in the supplementary material. Download the 3DImageMatch/Kitti data. The extraction code is p2gl.

Please concatenate the files:

# 3DImageMatch
cat x00 x01 ... x17 > 3DImageMatch.zip
# Kitti
cat Kitti01 ... Kitti10 > Kitti.zip

Then, unzip the zip files.

Training

Train on 3DMatch

python train.py train_3DMatch.py

Train on Kitti

python train.py train_Kitti.py

Evaluating

To benchmark the trained weights, download the pretrained model file here . We also provide keypoints (5,000) and some other results here .

Evaluating on 3DMatch or 3DLoMatch

# Generating Descriptors
python generate_desc.py --source <Testing Set Path> --target <Output Path> --model <CheckPoint Path>
# Evaluating 3DMatch
python evaluation_3dmatch.py --pcloud_root <Testing Set Path> --out_root <Output Path> --desc_types ['IMFNet'] --desc_roots ['<Descriptors Path>'] --benchmarks "3DMatch"
# Evaluating 3DLoMatch
python evaluation_3dmatch.py --pcloud_root <Testing Set Path> --out_root <Output Path> --desc_types ['IMFNet'] --desc_roots ['<Descriptors Path>'] --benchmarks "3DLoMatch"
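
For intuition (this is not the evaluation script itself), feature-match recall is driven by nearest-neighbor correspondences between the descriptors of two fragments. A minimal mutual nearest-neighbor matching sketch over two descriptor arrays might look like the following, where desc_a and desc_b are placeholder NumPy arrays loaded from the generated descriptor files (the exact on-disk format may differ).

import numpy as np

def mutual_nn_matches(desc_a, desc_b):
    # desc_a: (Na, D), desc_b: (Nb, D) descriptor arrays for two fragments.
    # Pairwise squared distances: ||a - b||^2 = ||a||^2 + ||b||^2 - 2 a.b
    d2 = (desc_a ** 2).sum(1, keepdims=True) + (desc_b ** 2).sum(1) - 2.0 * desc_a @ desc_b.T
    nn_ab = d2.argmin(1)                     # best match in B for each descriptor in A
    nn_ba = d2.argmin(0)                     # best match in A for each descriptor in B
    mutual = nn_ba[nn_ab] == np.arange(len(desc_a))
    return np.stack([np.arange(len(desc_a))[mutual], nn_ab[mutual]], axis=1)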

Evaluating on Kitti

# Evaluating Kitti
python evaluation_kitti.py --save_dir <Output Path> --kitti_root <Testing Set Path>

Descriptor Activation Mapping

Visualize the target descriptor

python dam.py --target <target point index>
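
Conceptually, DAM takes a channel of the target descriptor as the loss, backpropagates it, and reads the gradient magnitude as each point's significance. Below is a heavily simplified sketch of that idea, assuming a model that maps per-point input features to per-point descriptors; the names are placeholders, not the repository's API, and dam.py implements the real procedure.

import torch

def dam_sketch(model, feats, target_index, channel=0):
    # feats: (N, C) per-point input features; model returns (N, D) descriptors.
    feats = feats.clone().requires_grad_(True)
    desc = model(feats)
    desc[target_index, channel].backward()      # backpropagate the chosen descriptor channel
    significance = feats.grad.norm(dim=1)       # gradient magnitude as per-point significance
    return significance / (significance.max() + 1e-12)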

Citing our work

Please cite the following paper if you use our code:

@article{huang2021imfnet,
  title={IMFNet: Interpretable Multimodal Fusion for Point Cloud Registration},
  author={Huang, Xiaoshui and Qu, Wentao and Zuo, Yifan and Fang, Yuming and Zhao, Xiaowei},
  journal={IEEE Robotics and Automation Letters},
  year={2022}
}
