
Multispectral Object Detection via Cross-Modal Conflict-Aware Learning

This repo is the official implementation for Multispectral Object Detection via Cross-Modal Conflict-Aware Learning. The paper has been accepted to ACM MM 2023 Oral.

News

[2024.01.10] Code released!

[2023.08.18] Our paper is ready!

Introduction

Abstract. Multispectral object detection has gained significant attention due to its potential in all-weather applications, particularly those involving visible (RGB) and infrared (IR) images. Despite substantial advancements in this domain, current methodologies primarily rely on rudimentary accumulation operations to combine complementary information from disparate modalities, overlooking the semantic conflicts that arise from the intrinsic heterogeneity among modalities. To address this issue, we propose a novel learning network, the Cross-modal Conflict-Aware Learning Network (CALNet), that takes into account semantic conflicts and complementary information within multi-modal input. Our network comprises two pivotal modules: the Cross-Modal Conflict Rectification Module (CCR) and the Selected Cross-modal Fusion (SCF) Module. The CCR module mitigates modal heterogeneity by examining contextual information of analogous pixels, thus alleviating semantic conflicts within the multi-modal information. Subsequently, semantically coherent information is supplied to the SCF module, which fuses multi-modal features by assessing intra-modal importance to select semantically rich features and mining inter-modal complementary information. To assess the effectiveness of our proposed method, we develop a two-stream one-stage detector based on CALNet for multispectral object detection. Comprehensive experimental outcomes demonstrate that our approach considerably outperforms existing methods in resolving the cross-modal semantic conflict issue and achieves state-of-the-art detection accuracy.
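
To make the fusion idea concrete, the snippet below is a minimal, illustrative PyTorch sketch of importance-gated cross-modal fusion in the spirit of the SCF module. It is not the authors' implementation; the module name, channel count, and gating design are assumptions made purely for illustration.

# Illustrative sketch only, not the CALNet implementation.
# Assumes RGB and IR feature maps of identical shape (B, C, H, W).
import torch
import torch.nn as nn

class GatedFusion(nn.Module):
    def __init__(self, channels):
        super().__init__()
        # Predict a per-pixel importance gate from the concatenated modalities.
        self.gate = nn.Sequential(
            nn.Conv2d(2 * channels, channels, kernel_size=1),
            nn.Sigmoid(),
        )

    def forward(self, rgb_feat, ir_feat):
        # w close to 1 trusts the RGB feature; close to 0 trusts the IR feature.
        w = self.gate(torch.cat([rgb_feat, ir_feat], dim=1))
        # Keep the semantically richer features and mix in complementary ones.
        return w * rgb_feat + (1.0 - w) * ir_feat

fuse = GatedFusion(channels=256)
rgb = torch.randn(1, 256, 32, 32)   # hypothetical RGB feature map
ir = torch.randn(1, 256, 32, 32)    # hypothetical IR feature map
print(fuse(rgb, ir).shape)          # torch.Size([1, 256, 32, 32])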

Dataset

DroneVehicle

The dataset is available for download at the link below. Many heartfelt thanks to the dataset authors for their meticulous collection and expert labeling efforts!

DroneVehicle

Processed labels for our carefully organized dataset:

- [Label] (download password: 5sow)

Pre-trained weights download:

- [Weight] (download password: zvi2)

CALNet Overview

Visualization of Detection

Installation

We have tested the following OS and software versions:

  • OS: Ubuntu 18.04
  • CUDA: 11.3

Install

Note: the versions must satisfy CUDA Driver Version ≥ CUDA Toolkit Version (runtime version) = torch.version.cuda.

a. Create a conda virtual environment and activate it, e.g.,

conda create -n mmdet python=3.8 -y   # python 3.8 is an example; any version supported by PyTorch 1.10 works
conda activate mmdet
conda install pytorch==1.10.1 cudatoolkit==11.3.1 torchvision==0.11.2 -c pytorch
pip install -r requirements.txt
cd utils/nms_rotated
python setup.py develop  # or "pip install -v -e ."
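
After installation, an optional sanity check (a small Python sketch, assuming a standard PyTorch install) confirms that the runtime CUDA version matches the toolkit requested above and that the GPU is visible:

# Optional environment check; run inside the activated conda environment.
import torch
print(torch.__version__)          # expected: 1.10.1
print(torch.version.cuda)         # expected: 11.3
print(torch.cuda.is_available())  # should print True on a GPU machine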

Train

python train.py

Install DOTA_devkit

Download DOTA_devkit:

- [DOTA_devkit] (download password: peoe)

(Optional install: it is just a tool to split high-resolution images and evaluate oriented bounding boxes (OBB).)

cd yolov5_obb/DOTA_devkit
sudo apt-get install swig
swig -c++ -python polyiou.i
python setup.py build_ext --inplace
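
To verify that the extension compiled, the following minimal check can be run from inside the DOTA_devkit directory (an assumption based on the standard DOTA_devkit layout, where the compiled polyiou module is placed in that directory):

# Optional check; run from the DOTA_devkit directory after build_ext.
import polyiou
print("polyiou imported successfully")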

Test

python valtest.py --save-json --name 'obb_demo'
python tools/TestJson2VocClassTxt.py --json_path 'runs/val/obb_demo/best_obb_predictions.json' --save_path 'runs/val/obb_demo/obb_predictions_Txt'
python DOTA_devkit-master/dota_evaluation_task1.py 

Acknowledgment

Our code is mainly based on yolov5. Many thanks to the authors!

Citation

If this work is useful for your research, please consider citing:

@inproceedings{he2023multispectral,
  title={Multispectral Object Detection via Cross-Modal Conflict-Aware Learning},
  author={He, Xiao and Tang, Chang and Zou, Xin and Zhang, Wei},
  booktitle={Proceedings of the 31st ACM International Conference on Multimedia},
  pages={1465--1474},
  year={2023}
}

@article{he2023object,
  title={Object Detection in Hyperspectral Image via Unified Spectral-Spatial Feature Aggregation},
  author={He, Xiao and Tang, Chang and Liu, Xinwang and Zhang, Wei and Sun, Kun and Xu, Jiangfeng},
  journal={arXiv preprint arXiv:2306.08370},
  year={2023}
}
