Multi-Modal 3D Object Detection by Box Matching (FBMNet)

This repository is the official implementation of FBMNet.

Highlights

FBMNet uses a box matching strategy to remove the heavy dependency on a projection matrix that most existing multi-modal 3D detection methods have. Most importantly, FBMNet is considerably more robust in the following challenging cases:

1) Temporal asynchrony between LiDAR and camera sensors;

2) Spatial misalignment, including inaccurate calibration or misaligned sensor placement during deployment;

3) Degenerated images, including dropped or heavily disturbed images.

Introduction

Multi-modal 3D object detection has received growing attention because the information from different sensors, such as LiDAR and cameras, is complementary. Most fusion methods for 3D detection rely on accurate alignment and calibration between 3D point clouds and RGB images. However, this assumption is unreliable in real-world self-driving systems, where the alignment between modalities is easily degraded by asynchronous sensors and disturbed sensor placement. We propose a novel Fusion network by Box Matching (FBMNet) for multi-modal 3D detection. It offers an alternative route to cross-modal feature alignment: it learns the correspondence at the bounding-box level, which removes the dependency on calibration during inference. With the learned assignments between 3D and 2D object proposals, fusion for detection can be performed effectively by combining their ROI features. Extensive experiments on the nuScenes dataset demonstrate that our method is much more stable than existing fusion methods when dealing with challenging cases such as asynchronous sensors, misaligned sensor placement, and degenerated camera images. We hope that FBMNet can provide a practical solution to these challenging cases and improve safety in real autonomous driving scenarios.
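To make the box-level idea concrete, here is a minimal sketch of matching 3D proposals to 2D proposals and fusing their ROI features without any projection matrix. It is an illustration under stated assumptions, not the FBMNet implementation: the function names (`softmax`, `match_and_fuse`), the scaled dot-product similarity, the softmax assignment, and fusion by concatenation are all choices made for this example, and the toy feature vectors stand in for ROI features that the real network would learn.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def match_and_fuse(feats_3d, feats_2d):
    """Softly assign 2D proposals to each 3D proposal, then fuse ROI features.

    feats_3d: list of ROI feature vectors, one per 3D proposal (hypothetical).
    feats_2d: list of ROI feature vectors, one per 2D proposal (hypothetical).
    Both sets are assumed to share the same feature dimension.
    No calibration or projection matrix is used anywhere in this function.
    """
    dim = len(feats_2d[0])
    scale = 1.0 / math.sqrt(dim)  # assumption: scaled dot-product similarity
    assignments, fused = [], []
    for f3 in feats_3d:
        # Similarity of this 3D proposal to every 2D proposal.
        scores = [scale * sum(a * b for a, b in zip(f3, f2)) for f2 in feats_2d]
        weights = softmax(scores)  # soft assignment over the 2D proposals
        # Weighted sum of the 2D ROI features under that assignment.
        matched = [
            sum(w * f2[k] for w, f2 in zip(weights, feats_2d))
            for k in range(dim)
        ]
        assignments.append(weights)
        fused.append(list(f3) + matched)  # assumption: fuse by concatenation
    return assignments, fused


# Toy example: two 3D proposals, three 2D proposals, 2-dimensional features.
feats_3d = [[1.0, 0.0], [0.0, 1.0]]
feats_2d = [[4.0, 0.0], [0.0, 4.0], [1.0, 1.0]]
assignments, fused = match_and_fuse(feats_3d, feats_2d)
best = [row.index(max(row)) for row in assignments]
print(best)           # index of the strongest 2D match for each 3D proposal
print(len(fused[0]))  # fused feature length: 3D dim + 2D dim
```

Because the assignment depends only on feature similarity, a wrong extrinsic or a time offset between sensors does not enter this computation directly, which is the intuition behind the robustness claims above.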

Main Results

For more experiments, please refer to our paper.

Acknowledgement

We sincerely thank the authors of mmdetection3d, CenterPoint, TransFusion, MVP, BEVFusion and BEVFusion.

Citation

If you find this work useful in your research, please consider citing:

@article{liu2023multimodal,
      title={Multi-Modal 3D Object Detection by Box Matching}, 
      author={Zhe Liu and Xiaoqing Ye and Zhikang Zou and Xinwei He and Xiao Tan and Errui Ding and Jingdong Wang and Xiang Bai},
      year={2023},
      eprint={2305.07713},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}
