Geometry-guided Kernel Transformer

Efficient and Robust 2D-to-BEV Representation Learning via Geometry-guided Kernel Transformer
Shaoyu Chen*, Tianheng Cheng*, Xinggang Wang^†, Wenming Meng, Qian Zhang, Wenyu Liu

(*: equal contribution, †: corresponding author)

[arXiv Preprint]

News

October 14, 2022: We've released code & models for map-view segmentation.
June 9, 2022: We've released the tech report for Geometry-guided Kernel Transformer (GKT). This work is still in progress and code/models are coming soon. Please stay tuned! ☕️

Introduction

We present a novel and efficient 2D-to-BEV transformation, Geometry-guided Kernel Transformer (GKT).

GKT leverages geometric priors to guide the transformer to focus on discriminative regions when generating the BEV representation from surrounding-view image features.
GKT is based on kernel-wise attention and is highly efficient, especially with look-up table (LUT) indexing.
GKT is robust to camera deviation, making the 2D-to-BEV transformation more stable and reliable.
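
The three points above can be illustrated with a minimal, single-camera sketch in plain Python. It is not the repository's implementation: the function names (`project`, `build_kernel_lut`, `kernel_attention`), the pinhole model with intrinsics already scaled to the feature map, and the reading of a 7x1 kernel as 7 rows by 1 column are all assumptions made for illustration. The idea shown is that each BEV point is projected into the image, a small kernel window around the projection is selected, the window indices are precomputed once into a look-up table, and attention is computed only over those gathered features.

```python
import math

def project(point, K, R, t):
    """Project a 3D point to feature-map coordinates (u, v) with a pinhole model.

    Assumes K is already scaled to the feature-map resolution.
    Returns None when the point lies behind the camera.
    """
    # Camera-frame coordinates: X_c = R @ X + t
    xc = [sum(R[i][j] * point[j] for j in range(3)) + t[i] for i in range(3)]
    if xc[2] <= 0:
        return None
    u = K[0][0] * xc[0] / xc[2] + K[0][2]
    v = K[1][1] * xc[1] / xc[2] + K[1][2]
    return u, v

def build_kernel_lut(bev_points, K, R, t, feat_h, feat_w, kernel=(7, 1)):
    """Precompute, per BEV point, the flat feature indices of its kernel window.

    kernel=(rows, cols) is an assumed convention. Indices outside the
    feature map are dropped; points behind the camera get an empty list.
    """
    kh, kw = kernel
    lut = []
    for p in bev_points:
        uv = project(p, K, R, t)
        if uv is None:
            lut.append([])
            continue
        cu, cv = round(uv[0]), round(uv[1])
        indices = []
        for dv in range(-(kh // 2), kh // 2 + 1):
            for du in range(-(kw // 2), kw // 2 + 1):
                r, c = cv + dv, cu + du
                if 0 <= r < feat_h and 0 <= c < feat_w:
                    indices.append(r * feat_w + c)
        lut.append(indices)
    return lut

def kernel_attention(query, feats, indices):
    """Scaled dot-product attention of one BEV query over the gathered features."""
    d = len(query)
    if not indices:
        return [0.0] * d
    scores = [sum(q * k for q, k in zip(query, feats[i])) / math.sqrt(d)
              for i in indices]
    m = max(scores)  # subtract the max for numerical stability
    weights = [math.exp(s - m) for s in scores]
    z = sum(weights)
    return [sum(w / z * feats[i][c] for w, i in zip(weights, indices))
            for c in range(d)]

# Toy usage: one camera, a 16x16 feature map with 2 channels.
K = [[10.0, 0.0, 8.0], [0.0, 10.0, 8.0], [0.0, 0.0, 1.0]]
R = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
t = [0.0, 0.0, 0.0]
feats = [[1.0, 2.0] for _ in range(16 * 16)]
lut = build_kernel_lut([(0.0, 0.0, 5.0)], K, R, t, feat_h=16, feat_w=16)
bev_feature = kernel_attention([0.5, 0.5], feats, lut[0])
```

Because the table depends only on the camera parameters and the BEV grid, it can be built once and reused for every frame as long as the calibration is fixed, which is one way to read the efficiency claim about LUT indexing. Attending over a window rather than a single pixel also leaves some slack when the projected position is slightly off.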

Getting Started

git clone https://github.com/hustvl/GKT.git

Map-view nuScenes Segmentation

Models

Method	Kernel	mIoU (Setting 1)	mIoU (Setting 2)	FPS	model
CVT	-	39.3	37.2	34.1	model
GKT	7x1	41.4	38.0	45.6	model

Note: FPS is measured on a single 2080 Ti GPU.
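
For readers unfamiliar with the metric in the table, the following is a minimal sketch of intersection-over-union for a single predicted map against a binary ground-truth mask. The function name `iou`, the 0.5 threshold, and the handling of empty masks are assumptions for illustration; the exact thresholding and averaging used by the repository's evaluation are not described in this README.

```python
def iou(pred, target, threshold=0.5):
    """IoU between a predicted probability map and a binary mask.

    Both inputs are nested lists of the same shape. Predictions at or
    above `threshold` count as positive. Returns 0.0 when neither the
    prediction nor the target contains a positive cell (a convention
    chosen for this sketch).
    """
    intersection = 0
    union = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            p_pos = p >= threshold
            t_pos = bool(t)
            intersection += int(p_pos and t_pos)
            union += int(p_pos or t_pos)
    return intersection / union if union else 0.0

# Toy usage on a 2x2 BEV grid.
score = iou([[0.9, 0.2], [0.7, 0.1]], [[1, 0], [0, 1]])
```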

Usage

For map-view nuScenes segmentation, GKT is built mainly on the awesome CrossViewTransformer.

# map-view segmentation
cd segmentation

Prerequisites

# install dependencies
pip install -r requirements.txt
pip install -e .

Preparing the Dataset

Training / Testing / Benchmarking

Pretrained model

Download the pretrained model efficientnet-b4-6ed6700e.pth

mkdir pretrained_models
cd pretrained_models
# place the pretrained model here

Training

python scripts/train.py +experiment=gkt_nuscenes_vehicle_kernel_7x1.yaml  data.dataset_dir=<path/to/nuScenes> data.labels_dir=<path/to/labels>

Testing

We recommend using the absolute path of the checkpoint.

python scripts/eval.py +experiment=gkt_nuscenes_vehicle_kernel_7x1.yaml data.dataset_dir=<path/to/nuScenes> data.labels_dir=<path/to/labels> experiment.ckptt=<path/to/checkpoint>

Evaluating Speed

python scripts/speed.py +experiment=gkt_nuscenes_vehicle_kernel_7x1.yaml data.dataset_dir=<path/to/nuScenes> data.labels_dir=<path/to/labels>
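
The contents of scripts/speed.py are not reproduced in this README. As a rough illustration of how a frames-per-second figure is commonly obtained, the sketch below times repeated calls after a warm-up phase. The helper `measure_fps` and its default iteration counts are hypothetical and are not taken from the repository.

```python
import time

def measure_fps(run_once, warmup=10, iters=50):
    """Return calls per second of `run_once`, excluding warm-up calls.

    `run_once` is any zero-argument callable, e.g. one forward pass.
    When timing GPU code, the device must be synchronized before each
    clock read (for PyTorch, torch.cuda.synchronize()), otherwise only
    the kernel launch time is measured.
    """
    for _ in range(warmup):
        run_once()
    start = time.perf_counter()
    for _ in range(iters):
        run_once()
    elapsed = time.perf_counter() - start
    # Guard against a zero reading on very coarse clocks.
    return iters / max(elapsed, 1e-12)

# Toy usage with a CPU-only stand-in for a model forward pass.
fps = measure_fps(lambda: sum(range(1000)))
```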

3D Object Detection

Coming soon.

Acknowledgements

We sincerely appreciate the awesome repos cross_view_transformers and fiery!

License

GKT is released under the MIT License.

Citation

If you find GKT useful in your research or applications, please consider giving us a star 🌟 and citing it with the following BibTeX entry.

@article{GeokernelTransformer,
  title={Efficient and Robust 2D-to-BEV Representation Learning via Geometry-guided Kernel Transformer},
  author={Chen, Shaoyu and Cheng, Tianheng and Wang, Xinggang and Meng, Wenming and Zhang, Qian and Liu, Wenyu},
  journal={arXiv preprint arXiv:2206.04584},
  year={2022}
}
