
Not All Pixels Are Equal: Learning Pixel Hardness for Semantic Segmentation

Official PyTorch Implementation

Xin Xiao, Daiguo Zhou, Jiagao Hu, Yi Hu and Yongchao Xu

Abstract: Semantic segmentation has recently witnessed great progress. Despite the impressive overall results, segmentation performance in some hard areas (e.g., small objects or thin parts) is still not promising. A straightforward solution is hard sample mining, which is widely used in object detection. Yet, most existing hard pixel mining strategies for semantic segmentation rely on a pixel's loss value, which tends to decrease during training. Intuitively, the pixel hardness for segmentation mainly depends on image structure and is expected to be stable. In this paper, we propose to learn pixel hardness for semantic segmentation, leveraging hardness information contained in global and historical loss values. More precisely, we add a gradient-independent branch that learns a hardness level (HL) map by maximizing the hardness-weighted segmentation loss, which is minimized for the segmentation head. This encourages large hardness values in difficult areas, leading to an appropriate and stable HL map. Despite its simplicity, the proposed method can be applied to most segmentation methods with no extra cost during inference and only marginal extra cost during training. Without bells and whistles, it achieves consistent and significant improvement (1.37% mIoU on average) over most popular semantic segmentation methods on the Cityscapes dataset, and demonstrates good generalization ability across domains.
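To make the two-objective setup concrete, below is a minimal PyTorch sketch of the idea described above. It is an illustration, not the repository's actual code: the weighting function (2·sigmoid), the detach-based gradient isolation, and the mean-pinning regularizer are all assumptions on our part; see the configs and the paper for the exact formulation.

import torch
import torch.nn.functional as F

def hardness_weighted_losses(seg_logits, hl_logits, target, ignore_index=255):
    # seg_logits: (N, C, H, W) from the segmentation head.
    # hl_logits:  (N, 1, H, W) from the hardness-level (HL) branch.
    # target:     (N, H, W) ground-truth labels.
    ce = F.cross_entropy(seg_logits, target,
                         ignore_index=ignore_index, reduction="none")
    valid = (target != ignore_index).float()
    n_valid = valid.sum().clamp(min=1.0)

    # Assumed mapping of HL logits to positive per-pixel weights in (0, 2).
    hl = 2.0 * torch.sigmoid(hl_logits.squeeze(1))

    # Segmentation head: minimize the hardness-weighted loss. The HL map
    # is detached, so this term sends no gradient into the HL branch.
    seg_loss = (hl.detach() * ce * valid).sum() / n_valid

    # HL branch: maximize the same weighted loss, i.e. minimize its
    # negative. The CE term is detached, so this term cannot push the
    # segmentation head toward higher loss ("gradient-independent" branch).
    hl_loss = -(hl * ce.detach() * valid).sum() / n_valid

    # Assumed regularizer: pin the mean weight near 1 so the HL map stays
    # bounded and only relatively hard pixels receive large weights.
    mean_hl = (hl * valid).sum() / n_valid
    hl_loss = hl_loss + (mean_hl - 1.0) ** 2

    return seg_loss, hl_loss

Because the two objectives see each other only through detached tensors, the HL map settles into a stable weighting driven by where the loss has historically been high, rather than chasing the per-iteration loss value.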


Usage

To reproduce the results in the paper, we recommend following the instructions below. Other versions of PyTorch and mmcv have not been tested, but they may work.

Requirements

  • PyTorch == 1.8.2
  • mmcv-full == 1.4.5

Getting started

1. Install dependencies

Step 1: Create a conda environment and activate it.

conda create -n HardnessLevel python=3.7
conda activate HardnessLevel

Step 2: Install PyTorch and torchvision

pip3 install torch==1.8.2 torchvision==0.9.2 --extra-index-url https://download.pytorch.org/whl/lts/1.8/cu111

Step 3: Install mmcv-full

pip install -U openmim 
mim install mmcv-full==1.4.5
pip3 install matplotlib numpy packaging prettytable cityscapesscripts

2. Data preparation

cd mmsegmentation
mkdir data

Please follow the mmsegmentation instructions for data preparation; the expected Cityscapes layout is sketched below.
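For reference, mmsegmentation expects Cityscapes to be arranged roughly as follows (the authoritative layout is defined by the mmsegmentation docs, so treat this as a guide):

mmsegmentation
└── data
    └── cityscapes
        ├── leftImg8bit
        │   ├── train
        │   └── val
        └── gtFine
            ├── train
            └── val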

3. Training

For instance, train PSPNet-ResNet101 with HL on Cityscapes using 4 GPUs:

bash ./tools/dist_train.sh configs/pspnet_hl/pspnet_r101-d8_769x769_40k_cityscapes_hl.py 4

4. Testing

For instance, test PSPNet-ResNet101 with HL on Cityscapes using 4 GPUs:

bash ./tools/dist_test.sh  configs/pspnet_hl/pspnet_r101-d8_769x769_40k_cityscapes_hl.py /path/pspnet.pth 4 --eval mIoU

Pretrained weights: Baidu Netdisk (extraction code: jahh) or Google Drive. Replace /path/pspnet.pth with the path where you store the checkpoint; you should get 80.65 mIoU on the val set.

Results

Training logs can be found here. Experiments were run on a machine with 8 A100-40GB GPUs. By using the HL map collected from PSPNet, we also achieve consistent improvement over Mask2Former, a recent paradigm for semantic segmentation.

Extensions

1. Domain generalization for semantic segmentation

Evaluate GTAV → Cityscapes domain generalization performance by:

bash ./tools/dist_test.sh  configs/gta_hl/deeplab_gta2city_res101_hl.py /path/gta_hl.pth 4 --eval mIoU

Pretrained weights: Baidu Netdisk (extraction code: ujra) or Google Drive. Replace /path/gta_hl.pth with the path where you store the checkpoint; you should get 43.06 mIoU on the val set. Please refer to DAFormer for more details.

2. Semi-supervised semantic segmentation

Cityscapes: results are obtained with DeepLabv3+ on a ResNet-101 backbone, under varying fractions of labeled data.

Labeled fraction     1/16   1/8    1/4    1/2
SupOnly              65.7   72.5   74.4   77.8
U2PL (paper)         70.3   74.4   76.5   79.1
U2PL (reproduced)    71.1   75.2   75.9   78.4
U2PL + HL            72.6   76.0   76.6   79.6
UniMatch (paper)     75.7   77.3   78.7   -
UniMatch + HL        76.2   78.2   78.9   -

Note: the UniMatch results are obtained with the original version of UniMatch (not the CVPR 2023 version).

License

This project is released under the Apache 2.0 license.

Acknowledgment

This code is built on the mmsegmentation repository. Thanks a lot for their great work!

Citation

@misc{xiao2023pixels,
      title={Not All Pixels Are Equal: Learning Pixel Hardness for Semantic Segmentation}, 
      author={Xin Xiao and Daiguo Zhou and Jiagao Hu and Yi Hu and Yongchao Xu},
      year={2023},
      eprint={2305.08462},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}