Mul-VMamba

Mul-VMamba: Multi-modal semantic segmentation using selection-fusion-based Vision-Mamba

Introduction

In traffic driving environments with complex lighting conditions, integrating multi-modal data (RGB, depth, infrared, and so on) can significantly improve the accuracy of semantic segmentation, providing precise information for downstream tasks such as autonomous driving. However, existing approaches emphasize segmentation accuracy at the expense of efficiency. To address this trade-off, we propose Mul-VMamba, a multi-modal semantic segmentation network based on the linear-complexity Selective State Space Model (S6, a.k.a. Mamba). Mul-VMamba establishes selection-fusion relationships among multi-modal features, enabling semantic segmentation with arbitrary combinations of input modalities. Specifically, the Mamba Spatial-consistency Selective Module (MSSM) adaptively extracts feature mapping relationships and filters out redundant features at the same spatial locations, preserving the spatial relationships between modalities. Additionally, the Mamba Cross-Fusion Module (MCFM) introduces a Cross Selective State Space Model (Cross-S6) that establishes the relationship between S6 and the multi-modal features, improving fusion performance.
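To illustrate why a selective state space model scales linearly with sequence length, here is a minimal single-channel sketch of the generic S6 recurrence. It is not the MSSM or MCFM module of Mul-VMamba, and it is not the CUDA kernel shipped in this repository. It assumes a diagonal state matrix, and the names `selective_scan_1d`, `delta`, `A`, `B`, and `C` are illustrative.

```python
import math


def selective_scan_1d(x, delta, A, B, C):
    """Run a single-channel selective scan in O(L * N) time.

    x:     list of L input scalars
    delta: list of L positive step sizes (input-dependent in S6)
    A:     list of N diagonal state-matrix entries (typically negative)
    B, C:  lists of L vectors of length N (input-dependent in S6)
    """
    n = len(A)
    h = [0.0] * n  # hidden state, carried from one step to the next
    ys = []
    for t in range(len(x)):
        for i in range(n):
            a_bar = math.exp(delta[t] * A[i])  # zero-order-hold discretization of A
            b_bar = delta[t] * B[t][i]         # simplified (Euler) discretization of B
            h[i] = a_bar * h[i] + b_bar * x[t]
        # Read out the state with the step-dependent projection C.
        ys.append(sum(C[t][i] * h[i] for i in range(n)))
    return ys
```

Each step touches the state once, so the cost grows linearly with the sequence length, whereas self-attention compares every pair of positions.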

Network

Dataset

Mcubes Link

The Mcubes_dataset should be organized in the following format:

Mcubes_dataset
├── polL_dolp
│   ├── outscene1208_2_0000000150.npy
│   ├── outscene1208_2_0000000180.npy
│   └── ...
├── polL_color
│   ├── outscene1208_2_0000000150.png
│   ├── outscene1208_2_0000000180.png
│   └── ...
├── polL_aolp_sin
├── polL_aolp_cos
├── list_folder
│   ├── all.txt
│   ├── test.txt
│   ├── train.txt
│   └── val.txt
├── SS
├── SSGT4MS
├── NIR_warped_mask
├── NIR_warped
└── GT
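Before training, it can help to confirm that a local copy matches this layout. The following is a minimal sketch, not part of the repository: the directory and split-file names are taken from the listing above, while the function name `missing_mcubes_entries` and the placeholder path are hypothetical.

```python
from pathlib import Path

# Sub-directories and split files taken from the layout listed above.
MCUBES_DIRS = [
    "polL_dolp", "polL_color", "polL_aolp_sin", "polL_aolp_cos",
    "list_folder", "SS", "SSGT4MS", "NIR_warped_mask", "NIR_warped", "GT",
]
MCUBES_SPLITS = ["all.txt", "test.txt", "train.txt", "val.txt"]


def missing_mcubes_entries(root):
    """Return the expected directories and split files that are absent under root."""
    root = Path(root)
    missing = [d for d in MCUBES_DIRS if not (root / d).is_dir()]
    missing += [
        f"list_folder/{name}"
        for name in MCUBES_SPLITS
        if not (root / "list_folder" / name).is_file()
    ]
    return missing


if __name__ == "__main__":
    # "path/to/Mcubes_dataset" is a placeholder; point it at your own copy.
    for entry in missing_mcubes_entries("path/to/Mcubes_dataset"):
        print("missing:", entry)
```

An empty result means every directory and split file from the listing is present; it does not check the contents of those directories.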

DELIVER Link

The DELIVER dataset should be organized in the following format:

DELIVER
├── depth
│   ├── cloud
│   │   ├── test
│   │   │   ├── MAP_10_point102
│   │   │   │   ├── 045050_depth_front.png
│   │   │   │   ├── ...
│   │   ├── train
│   │   └── val
│   ├── fog
│   ├── night
│   ├── rain
│   └── sun
├── event
├── hha
├── img
├── lidar
└── semantic
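The nesting above is modality, then condition, then split, then sequence. As a rough sketch of how that layout can be traversed (not the repository's own data loader), the helper below collects the frames of one modality and split across all conditions. The name `list_frames` is hypothetical, and the `.png` extension is assumed from the depth example shown above.

```python
from pathlib import Path

# Conditions and splits as they appear in the layout above.
CONDITIONS = ["cloud", "fog", "night", "rain", "sun"]
SPLITS = ["train", "val", "test"]


def list_frames(root, modality, split):
    """Collect frame files under root/<modality>/<condition>/<split>/<sequence>/.

    Assumes frames are stored as .png, as in the depth example above.
    """
    if split not in SPLITS:
        raise ValueError(f"unknown split: {split!r}")
    frames = []
    for condition in CONDITIONS:
        split_dir = Path(root) / modality / condition / split
        # Sort so the order is deterministic across runs and file systems.
        frames.extend(sorted(split_dir.glob("*/*.png")))
    return frames
```

For example, `list_frames("path/to/DELIVER", "depth", "test")` would gather the depth test frames from every condition; the path is a placeholder.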

Usage

Environment

python: 3.11.5, torch: 2.1.2, CUDA: 11.8

cd semseg/kernels/selective_scan
pip install .

Then install the other dependencies the code requires (the repository includes a requirements.txt).
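A quick way to compare the active environment against the versions listed above is sketched below. It is an illustrative helper, not part of the repository: `version_tuple` and `check_environment` are hypothetical names, and the check covers only Python and torch.

```python
import sys
from importlib import metadata


def version_tuple(text):
    """Turn a version string such as '2.1.2+cu118' into (2, 1, 2)."""
    core = text.split("+")[0]  # drop a local build suffix such as '+cu118'
    return tuple(int(part) for part in core.split(".") if part.isdigit())


def check_environment(expected_python=(3, 11), expected_torch=(2, 1, 2)):
    """Return human-readable warnings when versions differ from the README."""
    warnings = []
    found_python = tuple(sys.version_info[:2])
    if found_python != expected_python:
        warnings.append(
            f"Python {found_python[0]}.{found_python[1]} found, README lists 3.11.5"
        )
    try:
        torch_version = metadata.version("torch")
    except metadata.PackageNotFoundError:
        warnings.append("torch is not installed")
    else:
        if version_tuple(torch_version)[:3] != expected_torch:
            warnings.append(f"torch {torch_version} found, README lists 2.1.2")
    return warnings


if __name__ == "__main__":
    for line in check_environment():
        print("warning:", line)
```

Once torch is installed, `torch.version.cuda` reports the CUDA version that build was compiled against, which can be compared with 11.8 by hand.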

Training

Download the pre-trained VMamba-tiny weights.

cd path/to/Mul-VMamba
conda activate yourenv
export PYTHONPATH="path/to/Mul-VMamba"
python -m torch.distributed.launch --nproc_per_node=4 --use_env tools/train_mm.py --cfg configs/deliver_rgbdelmulmamba.yaml
python -m torch.distributed.launch --nproc_per_node=4 --use_env tools/train_mm.py --cfg configs/mcubes_rgbadnmulmamba.yaml

Evaluation

Modify the model path in the config file before running evaluation.

cd path/to/Mul-VMamba
conda activate yourenv
export PYTHONPATH="path/to/Mul-VMamba"
CUDA_VISIBLE_DEVICES=0 python tools/val_mm.py --cfg configs/deliver_rgbdel.yaml

Model Zoo

DELIVER dataset

Model-Modal	mIoU (%)	Weights
Mul-VMamba-RGB-D	66.52	GoogleDrive
Mul-VMamba-RGB-D-E	67.43	GoogleDrive
Mul-VMamba-RGB-D-E-L	68.98	GoogleDrive

Mcubes dataset

Model-Modal	mIoU (%)	Weights
Mul-VMamba-RGB-A	52.48	GoogleDrive
Mul-VMamba-RGB-A-D	53.29	GoogleDrive
Mul-VMamba-RGB-A-D-N	54.65	GoogleDrive
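For reference, mIoU is the per-class intersection-over-union averaged over classes. The sketch below computes it from a confusion matrix using the standard definition; it is not the repository's evaluation code, which may treat ignore labels or absent classes differently. The function name `mean_iou` is illustrative.

```python
def mean_iou(confusion):
    """Compute mIoU (in percent) from a square confusion matrix.

    confusion[i][j] counts pixels of ground-truth class i predicted as class j.
    Classes absent from both prediction and ground truth are skipped.
    """
    n = len(confusion)
    ious = []
    for c in range(n):
        tp = confusion[c][c]                               # correctly labeled pixels
        fn = sum(confusion[c]) - tp                        # class c pixels that were missed
        fp = sum(confusion[r][c] for r in range(n)) - tp   # pixels wrongly labeled c
        union = tp + fp + fn
        if union > 0:
            ious.append(tp / union)
    return 100.0 * sum(ious) / len(ious) if ious else 0.0
```

With the two-class matrix `[[1, 1], [0, 2]]`, the per-class IoUs are 1/2 and 2/3, so the mIoU is about 58.33.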
