Railroad is not a Train: Saliency as Pseudo-pixel Supervision for Weakly Supervised Semantic Segmentation (CVPR 2021)
Seungho Lee1,*, Minhyun Lee1,*, Jongwuk Lee2, Hyunjung Shim1
* indicates an equal contribution
1 School of Integrated Technology, Yonsei University
2 Department of Computer Science and Engineering, Sungkyunkwan University
Existing studies in weakly-supervised semantic segmentation (WSSS) using image-level weak supervision have several limitations: sparse object coverage, inaccurate object boundaries, and co-occurring pixels from non-target objects. To overcome these challenges, we propose a novel framework, namely Explicit Pseudo-pixel Supervision (EPS), which learns from pixel-level feedback by combining two weak supervisions: the image-level label provides the object identity via the localization map, and the saliency map from an off-the-shelf saliency detection model offers rich boundaries. We devise a joint training strategy to fully utilize the complementary relationship between the two sources of information. Our method can obtain accurate object boundaries and discard co-occurring pixels, thereby significantly improving the quality of pseudo-masks.
12 Jul, 2021: Initial upload
19 Aug, 2021: Minor update on information about dCRF and the pre-trained model of the segmentation networks
- Please see the issues: dCRF and pre-trained model
28 Aug, 2021: Major updates about MS-COCO 2014 dataset and minor updates (cleanup)
15 Apr, 2022: Minor update on how to set up 'cls_labels.npy' for the MS-COCO 2017 dataset
- Please see the issue: coco17
22 Feb, 2023: Minor update on the download link for coco dataset (Masks, Saliency maps)
- Python 3.6
- PyTorch >= 1.0.0
- Torchvision >= 0.2.2
- MXNet
- Pillow
- opencv-python (opencv for Python)
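
A quick way to confirm that an environment matches the list above is to compare installed versions against the stated minimums. The sketch below is only an illustration: the helper names (`parse_version`, `meets_minimum`, `check_environment`) are hypothetical and not part of this repository, and only the two packages with an explicit minimum are checked.

```python
import re
import sys

# Minimums copied from the list above. MXNet, Pillow and opencv-python
# have no stated minimum, so they are not checked here.
MINIMUMS = {"torch": (1, 0, 0), "torchvision": (0, 2, 2)}


def parse_version(text):
    """Turn '1.7.1+cu110' into (1, 7, 1); non-numeric suffixes are ignored."""
    parts = []
    for chunk in text.split("+")[0].split("."):
        match = re.match(r"\d+", chunk)
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def meets_minimum(installed, minimum):
    """Compare version tuples, padding the shorter one with zeros."""
    width = max(len(installed), len(minimum))
    padded_installed = tuple(installed) + (0,) * (width - len(installed))
    padded_minimum = tuple(minimum) + (0,) * (width - len(minimum))
    return padded_installed >= padded_minimum


def check_environment():
    """Return a list of human-readable notes (empty if nothing stands out)."""
    notes = []
    if sys.version_info[:2] != (3, 6):
        notes.append("Python 3.6 is listed; found %d.%d" % sys.version_info[:2])
    for name, minimum in MINIMUMS.items():
        try:
            module = __import__(name)
        except ImportError:
            notes.append("%s is not installed" % name)
            continue
        if not meets_minimum(parse_version(module.__version__), minimum):
            notes.append("%s %s is older than the listed minimum" % (name, module.__version__))
    return notes
```

Calling `check_environment()` returns a list of notes, so an empty list means nothing stood out.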
- PASCAL VOC 2012
- Images
- Saliency maps using PFAN
- MS-COCO 2014
- Pretrained models
- MS-COCO 2017
- Execute the bash file for training, inference and evaluation.
    # Please see these files for the detail of execution.

    # PASCAL VOC 2012
    # Baseline
    bash script/vo12_cls.sh
    # EPS
    bash script/voc12_eps.sh

    # MS-COCO 2014
    # Baseline
    bash script/coco_cls.sh
    # EPS
    bash script/coco_eps.sh
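
When several runs need to be launched from Python (for example from a scheduler), the script selection can be wrapped in a small helper. This is a minimal sketch under stated assumptions: the keys `voc12`, `coco`, `baseline` and `eps` and the functions `build_command` and `run` are hypothetical names chosen for illustration, while the script paths are copied verbatim from the commands listed in this section.

```python
import subprocess

# Script paths copied verbatim from this README; the dictionary keys are
# hypothetical labels introduced for this example.
SCRIPTS = {
    ("voc12", "baseline"): "script/vo12_cls.sh",
    ("voc12", "eps"): "script/voc12_eps.sh",
    ("coco", "baseline"): "script/coco_cls.sh",
    ("coco", "eps"): "script/coco_eps.sh",
}


def build_command(dataset, method):
    """Return the argv list that launches the script for a dataset/method pair."""
    key = (dataset.lower(), method.lower())
    if key not in SCRIPTS:
        raise ValueError("unknown dataset/method pair: %r" % (key,))
    return ["bash", SCRIPTS[key]]


def run(dataset, method):
    """Run the script from the repository root; raises on a non-zero exit code."""
    return subprocess.run(build_command(dataset, method), check=True)
```

For example, `run("voc12", "eps")` is equivalent to typing `bash script/voc12_eps.sh` in the repository root.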
- We provide checkpoints, training logs, and performances for each method and each dataset.
Please see the details from the script files.
| Dataset | Method | Train (mIoU) | Checkpoint | Training log |
|---|---|---|---|---|
| PASCAL VOC 2012 | Base | 47.05 | Download | voc12_cls.log |
| PASCAL VOC 2012 | EPS | 69.22 | Download | voc12_eps.log |
| MS-COCO 2014 | Base | 31.23 | Download | coco_cls.log |
| MS-COCO 2014 | EPS | 37.15 | Download | coco_eps.log |

- dCRF hyper-parameters
- We did not use dCRF for our pseudo-masks; we used it only for the comparison in the paper.
- Among the candidates (including OAA and PSA), we chose the dCRF hyper-parameters used in the ResNet101-based DeepLabV2.
- Please see the official DeepLab website for more information.
CRF parameters: bi_w = 4, bi_xy_std = 67, bi_rgb_std = 3, pos_w = 3, pos_xy_std = 1.
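
For readers who want to reproduce the dCRF comparison, the parameters above could be passed to the `pydensecrf` package roughly as follows. This is a sketch, not the code used for the paper: the mapping of the DeepLab-style names onto `pydensecrf` keyword arguments, the number of inference iterations (`n_iters=10`) and the function names `pairwise_kwargs` and `apply_dcrf` are all assumptions.

```python
# dCRF hyper-parameters copied from the line above.
DCRF_PARAMS = {
    "bi_w": 4, "bi_xy_std": 67, "bi_rgb_std": 3,
    "pos_w": 3, "pos_xy_std": 1,
}


def pairwise_kwargs(params=DCRF_PARAMS):
    """Map the DeepLab-style names onto pydensecrf keyword arguments (assumed mapping)."""
    gaussian = {"sxy": params["pos_xy_std"], "compat": params["pos_w"]}
    bilateral = {
        "sxy": params["bi_xy_std"],
        "srgb": params["bi_rgb_std"],
        "compat": params["bi_w"],
    }
    return gaussian, bilateral


def apply_dcrf(image, probs, n_iters=10):
    """Refine class probabilities with a dense CRF and return a label map.

    image: uint8 array of shape (H, W, 3).
    probs: float array of shape (C, H, W) holding softmax probabilities.
    n_iters is an assumed value; this README does not state it.
    """
    # Imported lazily so the parameter mapping can be inspected without
    # numpy or pydensecrf installed.
    import numpy as np
    import pydensecrf.densecrf as dcrf
    from pydensecrf.utils import unary_from_softmax

    n_classes, height, width = probs.shape
    gaussian, bilateral = pairwise_kwargs()

    crf = dcrf.DenseCRF2D(width, height, n_classes)
    crf.setUnaryEnergy(np.ascontiguousarray(unary_from_softmax(probs)))
    crf.addPairwiseGaussian(**gaussian)  # position-only smoothness term
    crf.addPairwiseBilateral(rgbim=np.ascontiguousarray(image), **bilateral)  # colour-aware term

    refined = np.array(crf.inference(n_iters)).reshape(n_classes, height, width)
    return refined.argmax(axis=0)
```

Keeping the parameter mapping in its own function makes it easy to swap in another candidate setting without touching the inference code.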
- We utilize DeepLab-V2 for the segmentation network.
- Please see deeplab-pytorch for the implementation in PyTorch.
- We used the pretrained model for the VGG16-based network from the official DeepLab source, and the one for the ResNet101-based network from the official OAA source.
This code is heavily borrowed from PSA. Thanks to Jiwoon Ahn.