This is an introduction to the initial version of GFBN, a dual-branch multi-modal semantic segmentation network. The main innovations are a cross-attention correction module and a large-kernel convolution feature fusion module. The code is largely based on CMNeXt and CMX; if you use this code, please cite the two papers listed at the end of this page.
- 10/2023: initialized repository.
Prepare three datasets:
- NYU Depth V2, for RGB-Depth semantic segmentation.
- MFNet, for RGB-Thermal semantic segmentation.
- MCubeS, for multimodal material segmentation with RGB-A-D-N modalities.
Then, structure all datasets as follows:
```
data/
├── NYUDepthv2
│   ├── RGB
│   ├── HHA
│   └── Label
├── MFNet
│   ├── rgb
│   ├── ther
│   └── labels
├── MCubeS
│   ├── polL_color
│   ├── polL_aolp
│   ├── polL_dolp
│   ├── NIR_warped
│   └── SS
```
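As a sanity check before training, the layout above can be verified with a small script. This is a minimal sketch (not part of the original codebase); the expected sub-directories are taken directly from the tree shown in this README.

```python
from pathlib import Path

# Expected dataset layout, copied from the directory tree above.
EXPECTED = {
    "NYUDepthv2": ["RGB", "HHA", "Label"],
    "MFNet": ["rgb", "ther", "labels"],
    "MCubeS": ["polL_color", "polL_aolp", "polL_dolp", "NIR_warped", "SS"],
}

def missing_dirs(root="data"):
    """Return the expected sub-directories that do not exist under root."""
    root = Path(root)
    return [str(root / ds / sub)
            for ds, subs in EXPECTED.items()
            for sub in subs
            if not (root / ds / sub).is_dir()]
```

Running `missing_dirs()` from the repository root should return an empty list once all three datasets are in place.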
Coming soon!
Before training, please download the pre-trained SegFormer weights, e.g. `checkpoints/pretrained/segformer/mit_b2.pth`:
```
checkpoints/pretrained/segformer
├── mit_b2.pth
└── mit_b4.pth
```
To train a GFBN model, change the YAML file passed to `--cfg`. Example training commands using 4 A100 GPUs:
```bash
cd path/to/GFBN
export PYTHONPATH="path/to/GFBN"
python -m torch.distributed.launch --nproc_per_node=4 --use_env tools/train_mm.py --cfg configs/nyu_rgbd.yaml
python -m torch.distributed.launch --nproc_per_node=4 --use_env tools/train_mm.py --cfg configs/mcubes_rgbadn.yaml
```
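Note that `torch.distributed.launch` is deprecated in newer PyTorch releases (1.10+); `torchrun` is the drop-in replacement, with `--use_env` behavior implied by default. Assuming the same script arguments, the equivalent invocation would be:

```shell
# torchrun replaces "python -m torch.distributed.launch --use_env"
cd path/to/GFBN
export PYTHONPATH="path/to/GFBN"
torchrun --nproc_per_node=4 tools/train_mm.py --cfg configs/nyu_rgbd.yaml
```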
To evaluate GFBN models, download the respective model weights (GoogleDrive) and place them as:
```
output/
├── MCubeS
│   ├── cmnext_b2_mcubes_rgb.pth
│   ├── cmnext_b2_mcubes_rgba.pth
│   ├── cmnext_b2_mcubes_rgbad.pth
│   └── cmnext_b2_mcubes_rgbadn.pth
```
Then, point `--cfg` at the respective config file and run:
```bash
cd path/to/GFBN
export PYTHONPATH="path/to/GFBN"
CUDA_VISIBLE_DEVICES=0 python tools/val_mm.py --cfg configs/mcubes_rgbadn.yaml
```
This repository is released under the Apache-2.0 license. For commercial use, please contact the authors.
If you use the GFBN model, please cite the following works:
- DeLiVER & CMNeXt [PDF]
```
@article{zhang2023delivering,
  title={Delivering Arbitrary-Modal Semantic Segmentation},
  author={Zhang, Jiaming and Liu, Ruiping and Shi, Hao and Yang, Kailun and Reiß, Simon and Peng, Kunyu and Fu, Haodong and Wang, Kaiwei and Stiefelhagen, Rainer},
  journal={arXiv preprint arXiv:2303.01480},
  year={2023}
}
```
- CMX [PDF]
```
@article{liu2022cmx,
  title={CMX: Cross-Modal Fusion for RGB-X Semantic Segmentation with Transformers},
  author={Liu, Huayao and Zhang, Jiaming and Yang, Kailun and Hu, Xinxin and Stiefelhagen, Rainer},
  journal={arXiv preprint arXiv:2203.04838},
  year={2022}
}
```