Signal: Selective Interaction and Global-local Alignment for Multi-Modal Object Re-Identification

News🍎

Our paper has been accepted to AAAI 2026🌹! Paper is here.

Environment🍊

Our environment: Python 3.10.13, CUDA 11.8.

Set up the environment with the following steps:

conda create -n myenv python=3.10.13
conda activate myenv
cd {your path}
pip install -r requirements.txt

Datasets🍋

RGBNT201 GET
RGBNT100 GET
MSVR310 GET

Pretrained Model🍉

ViT-B-16 GET (code:52fu)

Training🍒

python train.py --config_file configs/RGBNT201/Signal.yml

Our Model🍇

Trained checkpoints (.pth files) for each dataset:

Dataset	mAP (%)	Rank-1 (%)	Checkpoint
RGBNT201	80.3	85.2	Signal_model.pth
RGBNT100	86.3	97.6	Signal_model.pth
MSVR310	53.2	72.4	Signal_model.pth
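For readers unfamiliar with the two metrics in the table, the following is a minimal sketch of how mAP and Rank-1 are commonly computed from a query-by-gallery distance matrix. It is an illustration only: the function name `rank1_and_map` is hypothetical, and the sketch omits the same-camera filtering that ReID evaluation protocols usually apply, so it will not necessarily reproduce the numbers above.

```python
def rank1_and_map(dist, query_ids, gallery_ids):
    """Return (Rank-1, mAP) in percent for a query-by-gallery distance matrix."""
    hits, aps = 0, []
    for q, row in enumerate(dist):
        # Sort gallery indices by ascending distance to this query.
        order = sorted(range(len(row)), key=lambda g: row[g])
        matches = [gallery_ids[g] == query_ids[q] for g in order]
        if not any(matches):
            continue  # queries with no true match in the gallery are skipped
        hits += int(matches[0])  # Rank-1: is the nearest gallery item correct?
        # Average precision: mean of precision at every rank holding a true match.
        found, precisions = 0, []
        for rank, is_match in enumerate(matches, start=1):
            if is_match:
                found += 1
                precisions.append(found / rank)
        aps.append(sum(precisions) / len(precisions))
    if not aps:
        return 0.0, 0.0
    return 100.0 * hits / len(aps), 100.0 * sum(aps) / len(aps)


if __name__ == "__main__":
    # Toy data: 2 queries, 3 gallery items (identities are made up).
    dist = [[0.1, 0.9, 0.5],
            [0.2, 0.8, 0.3]]
    print(rank1_and_map(dist, query_ids=[1, 2], gallery_ids=[1, 2, 1]))
```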

Test🥝

python test.py --config_file configs/RGBNT201/Signal.yml

Introduction🧅️

To address multi-modal object ReID challenges, we propose Signal, a selective interaction and global-local alignment framework with three components:

Selective Interaction Module (SIM): Selects important patch tokens from multi-modal features via intra-modal and inter-modal token selection.
Global Alignment Module (GAM): Simultaneously aligns multi-modal features by minimizing the volume of 3D polyhedra in the Gramian space.
Local Alignment Module (LAM): Refines fine-grained alignment via deformable sampling, handling pixel-level misalignment.
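To make the Gramian-volume idea behind GAM concrete, here is a minimal sketch in plain Python. It assumes the volume in question is that of the parallelepiped spanned by three L2-normalized modality embeddings, which equals the square root of the determinant of their Gram matrix. The function names and the `rgb`, `nir`, `tir` arguments are illustrative, and the loss actually used in this repository may be defined differently.

```python
import math


def normalize(v):
    """Scale a vector to unit L2 norm."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def gram_matrix(vectors):
    """Matrix of pairwise dot products between the given vectors."""
    return [[sum(a * b for a, b in zip(u, v)) for v in vectors] for u in vectors]


def det3(m):
    """Determinant of a 3x3 matrix via cofactor expansion along the first row."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def gramian_volume(rgb, nir, tir):
    """Volume spanned by three normalized embeddings: sqrt(det(Gram matrix))."""
    g = gram_matrix([normalize(rgb), normalize(nir), normalize(tir)])
    # Clamp tiny negative values caused by floating-point round-off.
    return math.sqrt(max(det3(g), 0.0))


if __name__ == "__main__":
    print(gramian_volume([1, 0, 0], [0, 1, 0], [0, 0, 1]))  # mutually orthogonal
    print(gramian_volume([1, 2, 3], [1, 2, 3], [1, 2, 3]))  # identical directions
```

Under this assumption the volume shrinks toward zero as the three embeddings point in the same direction, which is why minimizing it pulls all modalities together at once rather than pair by pair.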

Contributions🥬

We propose a novel selective interaction and global-local alignment framework named Signal for multi-modal object ReID, which effectively addresses the challenges of background interference and multi-modal misalignment.
We propose the Selective Interaction Module (SIM) to leverage inter-modal and intra-modal information for selecting important patch tokens, thereby mitigating background interference in multi-modal fusion.
We propose the Global Alignment Module (GAM) to simultaneously align multi-modal features by minimizing the volume of 3D polyhedra in the Gramian space.
We propose the Local Alignment Module (LAM) to align local features in a shift-aware manner, effectively addressing pixel-level misalignment across modalities.
Extensive experiments on three multi-modal object ReID datasets validate the effectiveness of our method.
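The token selection attributed to SIM in the list above can be pictured as a top-k filter over patch tokens. The sketch below is only an illustration under assumed scoring rules: it scores each patch by cosine similarity to its own modality's global token (intra-modal) and to the other modalities' global tokens (inter-modal), then keeps the union of the top-k patches from each score. The function names, the scoring rule, and the union step are hypothetical and are not taken from this repository.

```python
import math


def cosine(u, v, eps=1e-12):
    """Cosine similarity between two vectors, guarded against zero norms."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / max(nu * nv, eps)


def top_k_indices(scores, k):
    """Indices of the k highest scores."""
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]


def select_tokens(patches, own_global, other_globals, k):
    """Return sorted indices of patches kept by intra- or inter-modal top-k."""
    # Intra-modal score: agreement with this modality's own global token.
    intra = [cosine(p, own_global) for p in patches]
    # Inter-modal score: mean agreement with the other modalities' global tokens.
    inter = [sum(cosine(p, g) for g in other_globals) / len(other_globals)
             for p in patches]
    keep = set(top_k_indices(intra, k)) | set(top_k_indices(inter, k))
    return sorted(keep)


if __name__ == "__main__":
    patches = [[1, 0], [0, 1], [1, 1], [-1, 0]]  # toy 2-D patch tokens
    print(select_tokens(patches, own_global=[1, 0], other_globals=[[0, 1]], k=1))
```

Patches outside the kept set would then be excluded from fusion, which is the mechanism by which such a filter could suppress background regions.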

Overall Framework🍠

GAM

LAM

Results🥂

Performance on RGBNT201

Performance on RGBNT100&MSVR310

Token Visual

Offsets Visual

Notes 🍩