Camera-based Semantic Scene Completion with Sparse Guidance Network

[Arxiv]

News

[2023/12]: We release the evaluation results and training code for SSCBench-KITTI-360.
[2023/12]: Our paper is on arXiv.
[2023/08]: SGN achieves state-of-the-art results on the camera-based SemanticKITTI 3D SSC (Semantic Scene Completion) task with 15.76% mIoU and 45.52% IoU.

Abstract

Semantic scene completion (SSC) aims to predict the semantic occupancy of each voxel in the entire 3D scene from limited observations, which is an emerging and critical task for autonomous driving. Recently, many studies have turned to camera-based SSC solutions due to the richer visual cues and cost-effectiveness of cameras. However, existing methods usually rely on sophisticated and heavy 3D models to process the lifted 3D features directly, which are not discriminative enough for clear segmentation boundaries. In this paper, we adopt the dense-sparse-dense design and propose an end-to-end camera-based SSC framework, termed SGN, to diffuse semantics from the semantic- and occupancy-aware seed voxels to the whole scene based on geometry prior and occupancy information. First, to dynamically select sparse seed voxels and provide occupancy-aware information, we redesign the sparse voxel proposal network to directly process points generated by depth prediction in a coarse-to-fine paradigm. Furthermore, by designing hybrid guidance (sparse semantic and geometry guidance) and effective voxel aggregation for spatial occupancy and geometry priors, we enhance the feature separation between different categories and expedite the convergence of semantic diffusion. Finally, we devise the multi-scale semantic diffusion module for flexible receptive fields while reducing computational cost. Extensive experimental results on the SemanticKITTI and SSCBench-KITTI-360 datasets demonstrate the superiority of our SGN over existing state-of-the-art methods. Even our light version, SGN-L, achieves notable scores of 14.80% mIoU and 45.45% IoU on the SemanticKITTI validation set with only 12.5 M parameters and 7.16 G training memory.

Method

Figure 1. Overall framework of SGN. The image encoder extracts 2D features, which the view transformation lifts into 3D features. An auxiliary occupancy head is then applied to provide geometry guidance. Before sparse semantic guidance, depth-based occupancy prediction produces the voxel proposals used to index seed features. Afterward, the voxel aggregation layer forms informative voxel features, which the multi-scale semantic diffusion module processes for the final semantic occupancy prediction. KT denotes the knowledge transfer layer for the geometry prior.
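
To make the idea of depth-based voxel proposals more concrete, here is a minimal sketch of the purely geometric step: back-projecting a predicted depth map into 3D points and marking the voxels those points fall into. This is not SGN's implementation (SGN refines proposals with a learned coarse-to-fine network). The function name, the pinhole intrinsics `fx, fy, cx, cy`, and the grid parameters are all hypothetical, and the grid is assumed to live in the camera frame for simplicity.

```python
import math


def depth_to_voxel_proposals(depth, fx, fy, cx, cy, origin, voxel_size, grid_shape):
    """Back-project a depth map and return the set of voxel indices hit by a point.

    depth:      2D list of metric depths (rows = v, columns = u); values <= 0 are invalid.
    origin:     (x, y, z) of the grid's minimum corner, in the camera frame (assumption).
    voxel_size: edge length of a cubic voxel, in the same unit as depth.
    grid_shape: number of voxels along (x, y, z).
    """
    proposals = set()
    for v, row in enumerate(depth):
        for u, d in enumerate(row):
            if d <= 0:
                continue  # skip pixels without a valid depth
            # Pinhole back-projection of pixel (u, v) with depth d.
            point = ((u - cx) * d / fx, (v - cy) * d / fy, d)
            # Quantize the 3D point to an integer voxel index.
            index = tuple(
                math.floor((p - o) / voxel_size) for p, o in zip(point, origin)
            )
            # Keep only indices that fall inside the grid.
            if all(0 <= i < n for i, n in zip(index, grid_shape)):
                proposals.add(index)
    return proposals


# Toy example with made-up intrinsics and a 4 x 4 x 4 grid.
toy_depth = [[1.0, 0.0], [2.0, 50.0]]
seeds = depth_to_voxel_proposals(
    toy_depth, fx=1.0, fy=1.0, cx=0.0, cy=0.0,
    origin=(0.0, 0.0, 0.0), voxel_size=1.0, grid_shape=(4, 4, 4),
)
print(sorted(seeds))  # [(0, 0, 1), (0, 2, 2)]
```

The resulting index set is only a candidate list; in a real pipeline the points would also be transformed by the camera extrinsics into the dataset's voxel grid frame.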

Getting Started

Installation

Please refer to VoxFormer to create the base environment. Some extra packages also need to be installed:

spconv-cu111==2.1.25
torch-scatter==2.0.8
torchmetrics>=0.9.0
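
As an optional sanity check after installation, a small script along these lines can report which of the extra packages are visible in the current environment. It is not part of the SGN repository; the helper name `installed_versions` is made up for illustration, and the script only reports versions without comparing them against the pins above.

```python
from importlib import metadata

# pip distribution names of the extra packages listed above.
# These are not necessarily the names used in `import` statements.
REQUIRED = ["spconv-cu111", "torch-scatter", "torchmetrics"]


def installed_versions(names):
    """Map each distribution name to its installed version, or None if missing."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions(REQUIRED).items():
        print(f"{name}: {version or 'NOT INSTALLED'}")
```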

Prepare Dataset

Please refer to the README in the preprocess folder for details.

Run and Eval

Train SGN with 4 GPUs

./tools/dist_train.sh ./projects/configs/sgn/sgn-T-one-stage-guidance.py 4

Evaluate SGN with 4 GPUs

./tools/dist_test.sh ./projects/configs/sgn/sgn-T-one-stage-guidance.py ./path/to/ckpts.pth 4

Model Zoo

Backbone	Dataset	Method	IoU (%)	mIoU (%)	Params (M)	Config	Download
R50	Sem.KITTI val/test	SGN-T	46.21/45.42	15.32/15.76	28.2	config	model
R50	KITTI360 val/test	SGN-T	47.50/47.06	19.07/18.25	28.2	config	model
R18	Sem.KITTI val/test	SGN-L	45.45/43.71	14.80/14.39	12.5	config	model
R18	KITTI360 val/test	SGN-L	46.67/46.64	17.11/16.95	12.5	config	model
R50	Sem.KITTI val/test	SGN-S	43.60/41.88	14.55/14.01	28.2	config	model
R50	KITTI360 val/test	SGN-S	46.13/46.22	18.29/17.71	28.2	config	model

Note that we used the checkpoints that performed best on the validation set during training to evaluate SGN on the test sets for both SemanticKITTI and SSCBench-KITTI-360 datasets.
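
For readers unfamiliar with the two metrics in the table, the sketch below shows how they are commonly computed in SSC benchmarks: IoU measures class-agnostic occupancy (occupied vs. empty), and mIoU averages the per-class IoU over the semantic classes. This is an illustrative approximation and not the repository's evaluation code. The label conventions (0 for empty, 255 for ignored voxels) and the choice to count classes with an empty union as 0 are assumptions.

```python
def ssc_metrics(pred, target, num_classes, empty_label=0, ignore_label=255):
    """Return (completion IoU, semantic mIoU) for flat sequences of voxel labels.

    Predictions are assumed to lie in [0, num_classes); target voxels equal to
    ignore_label are skipped.
    """
    tp = [0] * num_classes
    fp = [0] * num_classes
    fn = [0] * num_classes
    occ_tp = occ_fp = occ_fn = 0

    for p, t in zip(pred, target):
        if t == ignore_label:
            continue
        # Class-agnostic occupancy counts for the completion IoU.
        p_occ, t_occ = p != empty_label, t != empty_label
        if p_occ and t_occ:
            occ_tp += 1
        elif p_occ:
            occ_fp += 1
        elif t_occ:
            occ_fn += 1
        # Per-class counts for the semantic mIoU.
        if p == t:
            tp[p] += 1
        else:
            fp[p] += 1
            fn[t] += 1

    occ_union = occ_tp + occ_fp + occ_fn
    iou = occ_tp / occ_union if occ_union else 0.0

    class_ious = []
    for c in range(num_classes):
        if c == empty_label:
            continue  # the empty class is excluded from mIoU
        union = tp[c] + fp[c] + fn[c]
        # Assumption: a class absent from both pred and target scores 0.
        class_ious.append(tp[c] / union if union else 0.0)
    miou = sum(class_ious) / len(class_ious) if class_ious else 0.0
    return iou, miou


# Toy example: 3 labels (0 = empty, 1 and 2 = semantic classes).
toy_target = [0, 1, 1, 2, 2, 255]
toy_pred = [0, 1, 2, 2, 0, 1]
iou, miou = ssc_metrics(toy_pred, toy_target, num_classes=3)
print(round(iou, 4), round(miou, 4))  # 0.75 0.4167
```

In practice the counts are accumulated over every scene in the split before the ratios are taken, rather than averaged per scene.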

TODO

SemanticKITTI
SSCBench-KITTI-360
Data augmentation

Acknowledgement

Many thanks to these excellent open source projects:
