Sci-Epiphany/GFBNext

Global feature-base multimodal semantic segmentation

Introduction

This repository contains the initial version of GFBN, a dual-branch network for multimodal semantic segmentation. Its main contributions are a cross-attention correction module and a feature fusion module built on large convolution kernels. The code is mainly based on CMNeXt and CMX. If you use this code, please cite the two papers listed in the Citations section.

Updates

  • 10/2023, init repository.

Data preparation

Prepare three datasets:

  • NYU Depth V2, for RGB-Depth semantic segmentation.
  • MFNet, for RGB-Thermal semantic segmentation.
  • MCubeS, for multimodal material segmentation with RGB-A-D-N modalities.

Then, organize all datasets as follows:

data/
├── NYUDepthv2
│   ├── RGB
│   ├── HHA
│   └── Label
├── MFNet
│   ├── rgb
│   ├── ther
│   └── labels
├── MCubeS
│   ├── polL_color
│   ├── polL_aolp
│   ├── polL_dolp
│   ├── NIR_warped
│   └── SS
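
Before training, it can help to confirm that the folders match the tree above. The following is a minimal sketch, not part of the repository: the helper name `find_missing_dirs` and the `EXPECTED_LAYOUT` dictionary are hypothetical, and the folder names are copied from the tree as written.

```python
from pathlib import Path

# Expected sub-folders per dataset, copied from the directory tree above.
# EXPECTED_LAYOUT and find_missing_dirs are illustrative names, not repo code.
EXPECTED_LAYOUT = {
    "NYUDepthv2": ["RGB", "HHA", "Label"],
    "MFNet": ["rgb", "ther", "labels"],
    "MCubeS": ["polL_color", "polL_aolp", "polL_dolp", "NIR_warped", "SS"],
}


def find_missing_dirs(data_root, layout=EXPECTED_LAYOUT):
    """Return the expected directories that do not exist under data_root."""
    root = Path(data_root)
    missing = []
    for dataset, subdirs in layout.items():
        for sub in subdirs:
            path = root / dataset / sub
            if not path.is_dir():
                missing.append(str(path))
    return missing


if __name__ == "__main__":
    missing = find_missing_dirs("data")
    if missing:
        print("Missing directories:")
        for path in missing:
            print("  " + path)
    else:
        print("Dataset layout looks complete.")
```

If you only work with one dataset, pass a reduced dictionary as `layout`, for example `{"MFNet": ["rgb", "ther", "labels"]}`.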

Model Zoo

MCubeS

Will come soon!

Training

Before training, download the pre-trained SegFormer weights and place them under checkpoints/pretrained/segformer, e.g. checkpoints/pretrained/segformer/mit_b2.pth:

checkpoints/pretrained/segformer
├── mit_b2.pth
└── mit_b4.pth

To train a GFBN model, pass the YAML config file for the target dataset via --cfg. Training examples using 4 A100 GPUs:

cd path/to/GFBN
export PYTHONPATH="path/to/GFBN"
python -m torch.distributed.launch --nproc_per_node=4 --use_env tools/train_mm.py --cfg configs/nyu_rgbd.yaml
python -m torch.distributed.launch --nproc_per_node=4 --use_env tools/train_mm.py --cfg configs/mcubes_rgbadn.yaml
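
If you have a different number of GPUs or want to queue several configs, the commands above can be generated programmatically. This is a minimal sketch under stated assumptions: `build_train_command` is a hypothetical helper, not part of the repository, and it only reproduces the flags shown above. Set `nproc_per_node` to the number of GPUs available on your machine.

```python
import shlex
import sys


def build_train_command(cfg, nproc_per_node=4, python=sys.executable):
    """Build the distributed launch command for one config file.

    Hypothetical helper: it mirrors the flags of the example commands
    and does not start any process itself.
    """
    return [
        python, "-m", "torch.distributed.launch",
        f"--nproc_per_node={nproc_per_node}",
        "--use_env",
        "tools/train_mm.py",
        "--cfg", cfg,
    ]


if __name__ == "__main__":
    configs = ["configs/nyu_rgbd.yaml", "configs/mcubes_rgbadn.yaml"]
    for cfg in configs:
        cmd = build_train_command(cfg)
        # Print only. To launch, pass cmd to subprocess.run(cmd, check=True)
        # from the repository root with PYTHONPATH set as in the commands above.
        print(shlex.join(cmd))
```

Returning the command as a list of arguments rather than a single string avoids shell quoting problems when a config path contains spaces.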

Evaluation

To evaluate GFBN models, download the respective model weights (GoogleDrive) and place them as follows:

output/
├── MCubeS
│   ├── cmnext_b2_mcubes_rgb.pth
│   ├── cmnext_b2_mcubes_rgba.pth
│   ├── cmnext_b2_mcubes_rgbad.pth
│   └── cmnext_b2_mcubes_rgbadn.pth


Then, set `--cfg` to the respective config file and run:
```bash
cd path/to/GFBN
export PYTHONPATH="path/to/GFBN"
CUDA_VISIBLE_DEVICES=0 python tools/val_mm.py --cfg configs/mcubes_rgbadn.yaml
```

License

This repository is released under the Apache-2.0 license. For commercial use, please contact the authors.

Citations

If you use the GFBN model, please cite the following works:

  • DeLiVER & CMNeXt
@article{zhang2023delivering,
  title={Delivering Arbitrary-Modal Semantic Segmentation},
  author={Zhang, Jiaming and Liu, Ruiping and Shi, Hao and Yang, Kailun and Reiß, Simon and Peng, Kunyu and Fu, Haodong and Wang, Kaiwei and Stiefelhagen, Rainer},
  journal={arXiv preprint arXiv:2303.01480},
  year={2023}
}
@article{liu2022cmx,
  title={CMX: Cross-Modal Fusion for RGB-X Semantic Segmentation with Transformers},
  author={Liu, Huayao and Zhang, Jiaming and Yang, Kailun and Hu, Xinxin and Stiefelhagen, Rainer},
  journal={arXiv preprint arXiv:2203.04838},
  year={2022}
}
