pyfu

Deep Sensor Fusion with Pyramid Fusion Networks for 3D Semantic Segmentation

Contributors:

Hannah Schieber*, Fabian Duerr*, Torsten Schoen and Jürgen Beyerer

'*' equal contribution
e-mail: hannah.schieber[at]fau.de

Abstract

Robust environment perception for autonomous vehicles is a tremendous challenge, which makes a diverse sensor set with, e.g., camera, lidar, and radar crucial. In the process of understanding the recorded sensor data, 3D semantic segmentation plays an important role. Therefore, this work presents a pyramid-based deep fusion architecture for lidar and camera to improve 3D semantic segmentation of traffic scenes. Individual sensor backbones extract feature maps of camera images and lidar point clouds. A novel Pyramid Fusion Backbone fuses these feature maps at different scales and combines the multimodal features in a feature pyramid to compute valuable multimodal, multi-scale features. The Pyramid Fusion Head aggregates these pyramid features and further refines them in a late fusion step, incorporating the final features of the sensor backbones. The approach is evaluated on two challenging outdoor datasets, and different fusion strategies and setups are investigated. It outperforms recent range-view-based lidar approaches as well as all previously proposed fusion strategies and architectures.
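To make the data flow described in the abstract more concrete, here is a toy illustration of multi-scale fusion followed by a top-down feature pyramid. It is not the network from this repository: feature maps are plain nested lists, the per-scale fusion and the aggregation are simple sums chosen as placeholders, the late fusion step is omitted, and all function names (`add`, `upsample2x`, `pyramid_fusion`, `fusion_head`) are hypothetical.

```python
def add(a, b):
    """Element-wise sum of two equally sized 2D feature maps."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def upsample2x(fmap):
    """Nearest-neighbour upsampling by a factor of two in both dimensions."""
    out = []
    for row in fmap:
        wide = [v for v in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out


def pyramid_fusion(lidar_feats, camera_feats):
    """Fuse per-scale features, then build a top-down pyramid.

    Both inputs are lists of 2D maps ordered from fine to coarse, where each
    level has half the resolution of the previous one.
    """
    # 1) Fuse the two modalities at every scale (sum is a placeholder choice).
    fused = [add(lf, cf) for lf, cf in zip(lidar_feats, camera_feats)]

    # 2) Top-down pathway: propagate coarse context into the finer levels.
    pyramid = [fused[-1]]
    for level in reversed(fused[:-1]):
        pyramid.insert(0, add(level, upsample2x(pyramid[0])))
    return pyramid


def fusion_head(pyramid):
    """Bring every pyramid level to the finest resolution and sum them."""
    out = pyramid[0]
    for i, level in enumerate(pyramid[1:], start=1):
        for _ in range(i):
            level = upsample2x(level)
        out = add(out, level)
    return out
```

In the actual architecture the sums would be replaced by learned layers; the sketch only shows how features at several resolutions can be fused per scale and then aggregated at the finest resolution.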

Quantitative Results

Results on SemanticKitti

Results on PandaSet

Code

pip install torch==1.6.0+cu101 torchvision==0.7.0+cu101 -f https://download.pytorch.org/whl/torch_stable.html
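After installation, a quick check can confirm that the pinned versions were actually picked up. The following is a minimal sketch that only reads package metadata, so it also runs when the packages are missing. The helper names `installed_version` and `check_environment` are hypothetical, and `importlib.metadata` requires Python 3.8 or newer.

```python
from importlib import metadata

# Versions pinned by the pip command above.
EXPECTED = {"torch": "1.6.0+cu101", "torchvision": "0.7.0+cu101"}


def installed_version(package):
    """Return the installed version string of `package`, or None if it is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def check_environment(expected=EXPECTED):
    """Map each package name to (installed_version, matches_expected)."""
    report = {}
    for package, wanted in expected.items():
        found = installed_version(package)
        report[package] = (found, found == wanted)
    return report


if __name__ == "__main__":
    for package, (found, ok) in check_environment().items():
        status = "ok" if ok else "MISMATCH"
        print(f"{package}: installed={found} expected={EXPECTED[package]} [{status}]")
```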

Parts of the code are borrowed from the EfficientPS and PSPNet repositories. Furthermore, it builds upon the fusion idea of Duerr et al.:

@INPROCEEDINGS{9287974,
  author = {Duerr, Fabian and Weigel, Hendrik and Maehlisch, Mirko and Beyerer, Jürgen},
  booktitle = {2020 Fourth IEEE International Conference on Robotic Computing (IRC)},
  title = {Iterative Deep Fusion for 3D Semantic Segmentation},
  year = {2020},
  pages = {391-397},
  doi = {10.1109/IRC.2020.00067}
}

We cannot provide the full training code, but the important parts of our network are publicly available. A training cycle and preprocessing have to be implemented by the user.
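As a starting point for the missing preprocessing, range-view lidar approaches commonly project the point cloud onto a 2D range image with a spherical projection. This repository does not specify its preprocessing, so the following is only a sketch under assumptions: the function name `project_to_range_view` is hypothetical, and the defaults (a 64 x 2048 image, vertical field of view from -25 to +3 degrees) are values often used for 64-beam sensors, not settings taken from this project.

```python
import math


def project_to_range_view(points, height=64, width=2048,
                          fov_up_deg=3.0, fov_down_deg=-25.0):
    """Project (x, y, z) lidar points to a height x width range image.

    Returns a nested list where each cell holds the range of the closest
    point that falls into it, or None if no point does.
    """
    fov_up = math.radians(fov_up_deg)
    fov_down = math.radians(fov_down_deg)
    fov = fov_up - fov_down

    image = [[None] * width for _ in range(height)]
    for x, y, z in points:
        rng = math.sqrt(x * x + y * y + z * z)
        if rng == 0.0:
            continue  # skip degenerate points at the sensor origin
        yaw = math.atan2(y, x)      # horizontal angle in [-pi, pi]
        pitch = math.asin(z / rng)  # vertical angle

        # Normalise both angles to [0, 1] and scale to pixel coordinates.
        u = 0.5 * (1.0 - yaw / math.pi) * width
        v = (1.0 - (pitch - fov_down) / fov) * height

        # Points outside the assumed field of view are clamped to the border.
        col = min(width - 1, max(0, int(math.floor(u))))
        row = min(height - 1, max(0, int(math.floor(v))))

        # Keep the nearest return per pixel.
        if image[row][col] is None or rng < image[row][col]:
            image[row][col] = rng
    return image
```

A real pipeline would typically store further channels per pixel (for example the coordinates, remission, and label of the selected point) and keep the point-to-pixel mapping so that predictions can be projected back to 3D.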

If you cite our work, please also consider citing the previous approach, Iterative Deep Fusion for 3D Semantic Segmentation.

@misc{https://doi.org/10.48550/arxiv.2205.13629,
  doi = {10.48550/ARXIV.2205.13629},
  url = {https://arxiv.org/abs/2205.13629},
  author = {Schieber, Hannah and Duerr, Fabian and Schoen, Torsten and Beyerer, Jürgen},
  keywords = {Computer Vision and Pattern Recognition (cs.CV), Artificial Intelligence (cs.AI), Robotics (cs.RO), FOS: Computer and information sciences},
  title = {Deep Sensor Fusion with Pyramid Fusion Networks for 3D Semantic Segmentation},
  publisher = {arXiv},
  year = {2022},
  copyright = {arXiv.org perpetual, non-exclusive license}
}
