hu-xh/CPNet

Cross-Modal Fusion and Progressive Decoding Network For RGB-D Salient Object Detection

The paper was accepted by the International Journal of Computer Vision (IJCV) on January 11, 2024. The paper link is: Link

CPNet

Most existing RGB-D salient object detection (SOD) methods pursue higher performance by integrating additional modules, such as feature enhancement and edge generation. These modules, however, inevitably introduce feature redundancy and performance degradation. To address this, we design a cross-modal fusion and progressive decoding network (CPNet) for RGB-D SOD. The network contains only three indispensable parts: feature encoding, feature fusion, and feature decoding. In the feature encoding part, we adopt a two-stream Swin Transformer encoder to extract multi-level, multi-scale features from RGB images and depth images separately, modeling global information. In the feature fusion part, we design a cross-modal attention fusion module that uses the attention mechanism to fuse multi-modality and multi-level features. In the feature decoding part, we design a progressive decoder that gradually fuses low-level features and filters out noise to predict salient objects accurately. Extensive experiments on six benchmarks demonstrate that our network surpasses 12 state-of-the-art methods on four metrics. We also verify that, for the RGB-D SOD task, adding the feature enhancement module and the edge generation module does not improve detection performance under this framework, which offers new insight into the salient object detection task. Our code is available at https://github.com/hu-xh/CPNet.

Network Architecture

fig1.png
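The three parts named in the abstract (two-stream encoding, cross-modal attention fusion, progressive decoding) can be illustrated with a toy PyTorch sketch. This is not the authors' implementation. Small convolutional stages stand in for the Swin Transformer encoders, the fusion is a generic channel-attention gate, and all class names (`TinyEncoder`, `ChannelAttentionFusion`, `ProgressiveDecoder`, `ToyRGBDSaliencyNet`) and channel widths are hypothetical.

```python
import torch
from torch import nn
import torch.nn.functional as F


class TinyEncoder(nn.Module):
    """Stand-in for one encoder stream (assumption: NOT a Swin Transformer)."""

    def __init__(self, in_ch, widths=(16, 32, 64)):
        super().__init__()
        stages, prev = [], in_ch
        for w in widths:
            stages.append(nn.Sequential(
                nn.Conv2d(prev, w, 3, stride=2, padding=1),
                nn.ReLU(inplace=True)))
            prev = w
        self.stages = nn.ModuleList(stages)

    def forward(self, x):
        feats = []
        for stage in self.stages:
            x = stage(x)
            feats.append(x)
        return feats  # ordered from low level (high resolution) to high level


class ChannelAttentionFusion(nn.Module):
    """Generic attention-gated fusion of RGB and depth features at one level."""

    def __init__(self, ch):
        super().__init__()
        self.gate = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(2 * ch, 2 * ch, 1),
            nn.Sigmoid())
        self.merge = nn.Conv2d(2 * ch, ch, 1)

    def forward(self, rgb, depth):
        both = torch.cat([rgb, depth], dim=1)
        # Reweight channels of both modalities, then project back to `ch` channels.
        return self.merge(both * self.gate(both))


class ProgressiveDecoder(nn.Module):
    """Top-down decoder: upsample the deeper feature, merge with the next shallower one."""

    def __init__(self, widths=(16, 32, 64)):
        super().__init__()
        self.reduce = nn.ModuleList([
            nn.Conv2d(deep + shallow, shallow, 3, padding=1)
            for shallow, deep in zip(widths[:-1], widths[1:])])
        self.head = nn.Conv2d(widths[0], 1, 1)

    def forward(self, feats):
        x = feats[-1]
        for level in range(len(feats) - 2, -1, -1):
            x = F.interpolate(x, size=feats[level].shape[-2:],
                              mode="bilinear", align_corners=False)
            x = F.relu(self.reduce[level](torch.cat([x, feats[level]], dim=1)))
        return self.head(x)


class ToyRGBDSaliencyNet(nn.Module):
    """Encode each modality, fuse level by level, decode progressively."""

    def __init__(self, widths=(16, 32, 64)):
        super().__init__()
        self.rgb_encoder = TinyEncoder(3, widths)
        self.depth_encoder = TinyEncoder(1, widths)
        self.fusions = nn.ModuleList([ChannelAttentionFusion(w) for w in widths])
        self.decoder = ProgressiveDecoder(widths)

    def forward(self, rgb, depth):
        fused = [fuse(r, d) for fuse, r, d in zip(
            self.fusions, self.rgb_encoder(rgb), self.depth_encoder(depth))]
        logits = self.decoder(fused)
        logits = F.interpolate(logits, size=rgb.shape[-2:],
                               mode="bilinear", align_corners=False)
        return torch.sigmoid(logits)  # saliency map with values in [0, 1]
```

Calling `ToyRGBDSaliencyNet()(rgb, depth)` with an RGB batch of shape `(N, 3, H, W)` and a depth batch of shape `(N, 1, H, W)` returns a saliency map of shape `(N, 1, H, W)`.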

Results and Saliency maps

We perform quantitative and qualitative comparisons with 12 RGB-D SOD methods on six RGB-D datasets.

fig2.jpg

fig3.jpg
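As an illustration of how a predicted saliency map can be scored against its ground-truth mask, the sketch below computes the mean absolute error (MAE), a metric commonly reported for SOD. Whether MAE is one of the four metrics used in the paper is an assumption, since this README does not name them. The function name and the 8-bit rescaling rule are also illustrative choices.

```python
import numpy as np


def mean_absolute_error(pred, gt):
    """Average per-pixel absolute difference between two maps scaled to [0, 1]."""
    pred = np.asarray(pred, dtype=np.float64)
    gt = np.asarray(gt, dtype=np.float64)
    if pred.shape != gt.shape:
        raise ValueError("prediction and ground truth must have the same shape")
    # Assumption: values above 1 indicate an 8-bit image in the range [0, 255].
    if pred.max() > 1.0:
        pred = pred / 255.0
    if gt.max() > 1.0:
        gt = gt / 255.0
    return float(np.abs(pred - gt).mean())
```

Lower values are better. A prediction identical to the ground truth scores 0.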

Prerequisites

  • Python 3.6
  • PyTorch 1.10.2
  • Torchvision 0.11.3
  • NumPy 1.19.2
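A quick way to compare an environment against the versions listed above is sketched below. It uses only the standard library. The helper names (`parse_version`, `find_mismatches`, `installed_versions`) are hypothetical, and the comparison ignores local build suffixes such as `+cu113`.

```python
# Versions taken from the prerequisites list; keys are import names.
REQUIRED = {"torch": "1.10.2", "torchvision": "0.11.3", "numpy": "1.19.2"}


def parse_version(text):
    """Return the leading numeric parts, e.g. '1.10.2+cu113' -> (1, 10, 2)."""
    core = text.split("+")[0]
    parts = []
    for piece in core.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def find_mismatches(installed, required=REQUIRED):
    """Return {package: (installed, required)} for missing or differing versions."""
    mismatches = {}
    for name, wanted in required.items():
        found = installed.get(name)
        if found is None or parse_version(found) != parse_version(wanted):
            mismatches[name] = (found, wanted)
    return mismatches


def installed_versions(names):
    """Import each package if available and read its __version__ attribute."""
    versions = {}
    for name in names:
        try:
            module = __import__(name)
        except ImportError:
            continue
        versions[name] = getattr(module, "__version__", None)
    return versions
```

Running `find_mismatches(installed_versions(REQUIRED))` returns an empty dictionary when all three packages match the listed versions.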

Pretrained Model

Download the following .pth file and put it in the main folder:

  • Swin-B (fetch code: ja95)
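Loading a downloaded .pth file into a backbone might look like the following sketch. The helper name `load_backbone_weights` is hypothetical, and the handling of a top-level `"model"` key is an assumption based on how some released Swin checkpoints are packaged. The repository's own loading code may differ.

```python
import torch
from torch import nn


def load_backbone_weights(backbone: nn.Module, checkpoint_path: str):
    """Load weights from a .pth file and report keys that did not line up."""
    checkpoint = torch.load(checkpoint_path, map_location="cpu")
    # Assumption: the weights may be nested under a "model" key.
    if isinstance(checkpoint, dict) and "model" in checkpoint:
        state_dict = checkpoint["model"]
    else:
        state_dict = checkpoint
    # strict=False tolerates missing or extra keys, but not shape mismatches.
    result = backbone.load_state_dict(state_dict, strict=False)
    return result.missing_keys, result.unexpected_keys
```

Inspecting the two returned lists is a cheap check that the checkpoint actually matches the backbone. Long lists usually mean the wrong file or a naming mismatch.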

Datasets

Results

You can download the tested result maps from Baidu Pan (https://pan.baidu.com/s/1PlmqAvlAwSzsH2YGR4VzKQ) with the fetch code: dq2w.

You can download the results .pth file from Baidu Pan (https://pan.baidu.com/s/1x6wQf-RceapsZanH4PfbGg) with the fetch code: 50lu.

Contact

Feel free to e-mail me at 1558239392@qq.com.
