SegNeXt: Rethinking Convolutional Attention Design for Semantic Segmentation (NeurIPS 2022)

This repository contains the official PyTorch implementation of SegNeXt: training and evaluation code, plus pre-trained models.

For Jittor users, https://github.com/Jittor/JSeg provides a Jittor version.

The paper is available on arXiv (arXiv:2209.08575).

The code is based on MMSegmentation v0.24.1.

Citation

If you find our repo useful for your research, please consider citing our paper:

@article{guo2022segnext,
  title={SegNeXt: Rethinking Convolutional Attention Design for Semantic Segmentation},
  author={Guo, Meng-Hao and Lu, Cheng-Ze and Hou, Qibin and Liu, Zhengning and Cheng, Ming-Ming and Hu, Shi-Min},
  journal={arXiv preprint arXiv:2209.08575},
  year={2022}
}


@article{guo2022visual,
  title={Visual Attention Network},
  author={Guo, Meng-Hao and Lu, Cheng-Ze and Liu, Zheng-Ning and Cheng, Ming-Ming and Hu, Shi-Min},
  journal={arXiv preprint arXiv:2202.09741},
  year={2022}
}


@inproceedings{ham,
  title={Is Attention Better Than Matrix Decomposition?},
  author={Zhengyang Geng and Meng-Hao Guo and Hongxu Chen and Xia Li and Ke Wei and Zhouchen Lin},
  booktitle={International Conference on Learning Representations},
  year={2021}
}

Results

Note: ImageNet pre-trained models can be found on TsingHua Cloud.

Rank 1 on Pascal VOC dataset: Leaderboard

ADE20K

Method	Backbone	Pretrained	Iters	mIoU(ss/ms)	Params	FLOPs	Config	Download
SegNeXt	MSCAN-T	IN-1K	160K	41.1/42.2	4M	7G	config	TsingHua Cloud
SegNeXt	MSCAN-S	IN-1K	160K	44.3/45.8	14M	16G	config	TsingHua Cloud
SegNeXt	MSCAN-B	IN-1K	160K	48.5/49.9	28M	35G	config	TsingHua Cloud
SegNeXt	MSCAN-L	IN-1K	160K	51.0/52.1	49M	70G	config	TsingHua Cloud

Cityscapes

Method	Backbone	Pretrained	Iters	mIoU(ss/ms)	Params	FLOPs	Config	Download
SegNeXt	MSCAN-T	IN-1K	160K	79.8/81.4	4M	56G	config	TsingHua Cloud
SegNeXt	MSCAN-S	IN-1K	160K	81.3/82.7	14M	125G	config	TsingHua Cloud
SegNeXt	MSCAN-B	IN-1K	160K	82.6/83.8	28M	276G	config	TsingHua Cloud
SegNeXt	MSCAN-L	IN-1K	160K	83.2/83.9	49M	578G	config	TsingHua Cloud

Note: FLOPs (G) are calculated with an input size of 512 $\times$ 512 for ADE20K and 2048 $\times$ 1024 for Cityscapes, using torchprofile (recommended: highly accurate, automatic MACs/FLOPs statistics).
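As a rough sanity check on the FLOPs columns, the sketch below assumes that the cost of a mostly convolutional model grows roughly in proportion to the number of input pixels. That proportionality is an assumption made here for illustration, not a claim from the authors. Under it, the Cityscapes figures should be close to the ADE20K figures multiplied by the pixel ratio of the two input sizes. The names `pixel_ratio` and `reported` are hypothetical helpers; the numbers are copied from the tables above.

```python
# Input sizes (width, height) stated in the note above.
ADE20K_SIZE = (512, 512)
CITYSCAPES_SIZE = (2048, 1024)

# (ADE20K GFLOPs, Cityscapes GFLOPs) copied from the result tables.
reported = {
    "MSCAN-T": (7, 56),
    "MSCAN-S": (16, 125),
    "MSCAN-B": (35, 276),
    "MSCAN-L": (70, 578),
}


def pixel_ratio(small, large):
    """Return how many times more pixels `large` has than `small`."""
    return (large[0] * large[1]) / (small[0] * small[1])


ratio = pixel_ratio(ADE20K_SIZE, CITYSCAPES_SIZE)

# Relative gap between the linear-scaling estimate and the reported value.
relative_errors = {}
for backbone, (ade_gflops, city_gflops) in reported.items():
    estimate = ade_gflops * ratio
    relative_errors[backbone] = abs(estimate - city_gflops) / city_gflops
    print(f"{backbone}: estimated {estimate:.0f}G, reported {city_gflops}G")
```

The estimates are not expected to match exactly: the table values are rounded, and the two datasets use different numbers of output classes.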

Installation

Install the dependencies and download ADE20K according to the guidelines in MMSegmentation.

pip install timm
cd SegNeXt
python setup.py develop

Training

We use 8 GPUs for training by default. Run:

./tools/dist_train.sh /path/to/config 8

Evaluation

To evaluate the model, run:

./tools/dist_test.sh /path/to/config /path/to/checkpoint_file 8 --eval mIoU
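For readers unfamiliar with the metric requested by `--eval mIoU`, here is a minimal sketch of mean intersection-over-union computed from a confusion matrix. It illustrates the standard definition only; MMSegmentation's own implementation may differ in details such as ignore-label handling. The function name `mean_iou` and the toy matrix are hypothetical.

```python
def mean_iou(confusion):
    """Mean IoU from a square confusion matrix.

    confusion[i][j] is the number of pixels whose ground-truth class is i
    and whose predicted class is j.
    """
    num_classes = len(confusion)
    ious = []
    for c in range(num_classes):
        true_positive = confusion[c][c]
        false_negative = sum(confusion[c]) - true_positive
        false_positive = sum(confusion[r][c] for r in range(num_classes)) - true_positive
        union = true_positive + false_positive + false_negative
        if union == 0:
            continue  # class appears in neither ground truth nor prediction
        ious.append(true_positive / union)
    return sum(ious) / len(ious)


# Toy 3-class example (made-up pixel counts, not real results).
toy_confusion = [
    [50, 10, 0],
    [5, 30, 5],
    [0, 0, 20],
]
print(f"mIoU = {mean_iou(toy_confusion):.4f}")
```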

FLOPs

Install torchprofile with:

pip install torchprofile

To calculate FLOPs for a model, run:

python tools/get_flops.py /path/to/config --shape 512 512

Contact

For technical problems, please create an issue.

If you have any private questions, please feel free to contact me at gmh20@mails.tsinghua.edu.cn.

Acknowledgment

Our implementation is mainly based on MMSegmentation, Segformer and Enjoy-Hamburger. Thanks to their authors.

LICENSE

This repo is under the Apache-2.0 license. For commercial use, please contact the authors.
