POT: Prototypical Optimal Transport for Weakly Supervised Semantic Segmentation

This repository contains the implementation of POT: Prototypical Optimal Transport for Weakly Supervised Semantic Segmentation (CVPR 2025).

Abstract

Weakly Supervised Semantic Segmentation (WSSS) leverages Class Activation Maps (CAMs) to extract spatial information from image-level labels. However, CAMs primarily highlight the most discriminative foreground regions, leading to incomplete results. Prototype-based methods attempt to address this limitation by employing prototype CAMs instead of classifier CAMs. Nevertheless, existing prototype-based methods typically use a single prototype for each class, which is insufficient to capture all attributes of the foreground features due to the significant intra-class variations across different images. Consequently, these methods still struggle with incomplete CAM predictions. In this paper, we propose a novel framework called Prototypical Optimal Transport (POT) for WSSS. POT enhances CAM predictions by dividing features into multiple clusters and activating each cluster using its prototype. In this process, a similarity-aware optimal transport is employed to assign features to the most probable clusters. This similarity-aware strategy ensures the prioritization of significant cluster prototypes, thereby improving the accuracy of feature assignment. Additionally, we introduce an adaptive OT-based consistency loss to refine feature representations. This framework effectively overcomes the limitations of single-prototype methods, providing more complete and accurate CAM predictions. Extensive experimental results on standard WSSS benchmarks (PASCAL VOC and MS COCO) demonstrate that our method significantly improves the quality of CAMs and achieves state-of-the-art performance. The source code is available at https://github.com/jianwang91/POT.
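To make the assignment step in the abstract more concrete, the sketch below softly assigns features to several prototypes with plain entropic optimal transport (Sinkhorn-Knopp iterations). It is an illustration under stated assumptions, not the similarity-aware variant used in POT: the function name `sinkhorn_assign`, the parameters `eps` and `n_iters`, and the optional `weights` argument (a stand-in for prioritizing some prototypes) are all hypothetical and do not come from this repository.

```python
import numpy as np

def sinkhorn_assign(features, prototypes, weights=None, eps=0.05, n_iters=50):
    """Softly assign N features to K prototypes with entropic optimal transport.

    Generic Sinkhorn-Knopp sketch; all names and defaults are illustrative.
    """
    # Cosine similarity between L2-normalised features (N, D) and prototypes (K, D).
    f = features / np.linalg.norm(features, axis=1, keepdims=True)
    p = prototypes / np.linalg.norm(prototypes, axis=1, keepdims=True)
    sim = f @ p.T  # shape (N, K)

    n, k = sim.shape
    # Every feature carries the same amount of mass.
    row_marginal = np.full(n, 1.0 / n)
    # Mass each prototype should receive: uniform unless weights are given.
    if weights is None:
        col_marginal = np.full(k, 1.0 / k)
    else:
        w = np.asarray(weights, dtype=float)
        col_marginal = w / w.sum()

    # Gibbs kernel; subtracting the max keeps the exponent numerically stable.
    kernel = np.exp((sim - sim.max()) / eps)
    u = np.ones(n)
    v = np.ones(k)
    for _ in range(n_iters):
        u = row_marginal / (kernel @ v)    # rescale rows towards row_marginal
        v = col_marginal / (kernel.T @ u)  # rescale columns towards col_marginal

    plan = u[:, None] * kernel * v[None, :]  # transport plan, total mass 1
    return plan, sim

# Toy example: four 2-D features and two prototypes.
feats = np.array([[1.0, 0.1], [0.9, 0.0], [0.1, 1.0], [0.0, 0.8]])
protos = np.array([[1.0, 0.0], [0.0, 1.0]])
plan, sim = sinkhorn_assign(feats, protos)
print(plan.argmax(axis=1))  # index of the prototype each feature is assigned to
```

Each row of `plan` can be read as a soft cluster assignment for one feature. Changing `weights` shifts how much mass each prototype is allowed to absorb, which is one simple way to favor some prototypes over others.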

Framework

Environment

  • Python >= 3.8
  • PyTorch >= 1.8.0
  • Torchvision
  • scikit-image
  • numpy
  • opencv-python
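A quick way to check the environment before training is to query the installed package versions. The snippet below is a minimal sketch. It assumes the requirements are installed under their usual PyPI distribution names (`torch`, `torchvision`, `scikit-image`, `numpy`, `opencv-python`), and the helper `installed_versions` is hypothetical, not part of this repository.

```python
import sys
from importlib import metadata

# PyPI distribution names assumed for the requirements listed above.
REQUIRED = ["torch", "torchvision", "scikit-image", "numpy", "opencv-python"]

def installed_versions(packages=REQUIRED):
    """Return {package: version string, or None if the package is not installed}."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions

if __name__ == "__main__":
    print("python", sys.version.split()[0])
    for name, version in installed_versions().items():
        print(name, version or "MISSING")
```

A package reported as `MISSING` may still be importable if it was installed under a different distribution name (for example, a headless OpenCV build), so treat the output as a hint rather than a definitive check.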

Usage

Step 1. Prepare Dataset

Follow the previous method, CLIP-ES (https://github.com/linyq2117/CLIP-ES), to prepare the dataset and the base CAM npy files, or download the CAM npy files directly:

  • cams.zip (Baidu): https://pan.baidu.com/s/1S0lzyInvYD4FtxKKLAuepQ?pwd=mn2p (code: mn2p)
  • cams.zip (Google Drive): https://drive.google.com/file/d/1NCXf4ZHfdpD1yVoIaoIJS-rXsk-8i71E/view?usp=sharing
  • CAM visualization images for comparison: https://drive.google.com/file/d/1OanoQypkbhWry6PhvC1qBK7D0siWQdRr/view?usp=sharing
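Before training, it can help to confirm what a downloaded CAM npy file actually contains. The sketch below makes no assumption about the file layout: it loads one file and reports the shape and dtype of each entry. The helper name `describe_npy` is hypothetical and not part of this repository.

```python
import numpy as np

def describe_npy(path):
    """Load a .npy file and summarise its contents without assuming a layout."""
    # allow_pickle=True is required when the file stores a Python dict;
    # only enable it for files from a source you trust.
    obj = np.load(path, allow_pickle=True)
    # A pickled dict comes back as a 0-d object array; .item() unwraps it.
    if isinstance(obj, np.ndarray) and obj.dtype == object and obj.ndim == 0:
        obj = obj.item()
    if isinstance(obj, dict):
        summary = {}
        for key, value in obj.items():
            arr = np.asarray(value)
            summary[key] = (arr.shape, str(arr.dtype))
        return summary
    return {"array": (obj.shape, str(obj.dtype))}
```

For example, `print(describe_npy("path/to/one_cam_file.npy"))` (a placeholder path) lists each stored entry with its shape, which makes it easy to spot an incomplete download or an unexpected layout.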

Step 2. Train POT

Execute the following script to start the training process.

bash run_voc.sh

We provide the weights and training log for the current ResNet50 version here: exp: https://pan.baidu.com/s/1KoEHksu199Xb3T9JnU1dRw?pwd=1qgx (code: 1qgx)

Step 3. Train Fully Supervised Segmentation Models

To train fully supervised segmentation models, we refer to DeepLab v2 and seamv1.

bash test.sh

Results

Citation

If you find this work useful for your research, please consider citing our paper:

@inproceedings{wang2025pot,
  title={POT: Prototypical Optimal Transport for Weakly Supervised Semantic Segmentation},
  author={Wang, Jian and Dai, Tianhong and Zhang, Bingfeng and Yu, Siyue and Lim, Eng Gee and Xiao, Jimin},
  booktitle={Proceedings of the Computer Vision and Pattern Recognition Conference},
  pages={15055--15064},
  year={2025}
}

We borrow code from SIPE and CLIP-ES. Thanks for their excellent work.
