This repo contains the code for our IJCAI 2022 Long Oral paper "Beyond the Prototype: Divide-and-conquer Proxies for Few-shot Segmentation" by Chunbo Lang, Binfei Tu, Gong Cheng and Junwei Han.
Abstract: Few-shot segmentation, which aims to segment unseen-class objects given only a handful of densely labeled samples, has received widespread attention from the community. Existing approaches typically follow the prototype learning paradigm to perform meta-inference, which fails to fully exploit the underlying information from support image-mask pairs, resulting in various segmentation failures, e.g., incomplete objects, ambiguous boundaries, and distractor activation. To this end, we propose a simple yet versatile framework in the spirit of divide-and-conquer. Specifically, a novel self-reasoning scheme is first implemented on the annotated support image, and then the coarse segmentation mask is divided into multiple regions with different properties. Leveraging effective masked average pooling operations, a series of support-induced proxies are thus derived, each playing a specific role in conquering the above challenges. Moreover, we devise a unique parallel decoder structure that integrates proxies with similar attributes to boost the discrimination power. Our proposed approach, named divide-and-conquer proxies (DCP), allows for the development of appropriate and reliable information as a guide at the "episode" level, not just about the object cues themselves. Extensive experiments on PASCAL-5i and COCO-20i demonstrate the superiority of DCP over conventional prototype-based approaches (up to 5~10% on average), which also establishes a new state-of-the-art.
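The masked average pooling operation mentioned in the abstract can be sketched as follows. This is a minimal illustration of the generic operation, not the repository's actual implementation; the tensor shapes and function name are assumptions. Each support-induced proxy is obtained by pooling support features over one of the mask regions produced by the divide step.

```python
import torch

def masked_average_pooling(features: torch.Tensor, mask: torch.Tensor) -> torch.Tensor:
    """Pool feature maps over a binary mask region into a single vector (a "proxy").

    features: (B, C, H, W) support feature maps
    mask:     (B, 1, H, W) binary region mask (1 inside the region, 0 outside)
    returns:  (B, C) region descriptor
    """
    masked = features * mask                       # zero out features outside the region
    area = mask.sum(dim=(2, 3)).clamp(min=1e-5)    # region size per image; avoid divide-by-zero
    return masked.sum(dim=(2, 3)) / area           # spatial average restricted to the region
```

Applying this with different region masks (e.g., object interior, boundary, background) yields the series of proxies that the parallel decoder then integrates.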
- Python 3.8
- PyTorch 1.7.0
- cuda 11.0
- torchvision 0.8.1
- tensorboardX 2.14
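Assuming a pip-based environment, the pinned dependencies above could be installed with something like the following (version pins copied from the list above; CUDA toolkit setup is not shown):

```shell
pip install torch==1.7.0 torchvision==0.8.1 tensorboardX==2.14
```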
- Download the pre-trained backbones from here and put them into the `DCP/initmodel` directory.
- Change configurations via the `.yaml` files in `DCP/config`, then run the `.sh` scripts for training and testing.
- Stage 1: Meta-training: `sh train.sh`
- Stage 2: Meta-testing: `sh test.sh`
Performance comparison with the state-of-the-art approach (i.e., PFENet) in terms of average mIoU across all folds.
- PASCAL-5i

  | Backbone | Method | 1-shot | 5-shot |
  | :-: | :-: | :-: | :-: |
  | VGG16 | PFENet | 58.00 | 59.00 |
  | VGG16 | DCP (ours) | 61.31 (+3.31) | 65.84 (+6.84) |
  | ResNet50 | PFENet | 60.80 | 61.90 |
  | ResNet50 | DCP (ours) | 62.80 (+2.00) | 67.80 (+5.90) |

- COCO-20i

  | Backbone | Method | 1-shot | 5-shot |
  | :-: | :-: | :-: | :-: |
  | ResNet101 | PFENet | 38.50 | 42.70 |
  | ResNet50 | DCP (ours) | 41.39 (+2.89) | 46.48 (+3.78) |
This repo is mainly built upon PFENet, SCL, and SemSeg. Thanks for their great work!
- Support different backbones
- Multi-GPU training
If you find our work and this repository useful, please consider giving it a star ⭐ and a citation 📚.
@InProceedings{lang2022dcp,
  title={Beyond the Prototype: Divide-and-conquer Proxies for Few-shot Segmentation},
  author={Lang, Chunbo and Tu, Binfei and Cheng, Gong and Han, Junwei},
  booktitle={Proceedings of the International Joint Conference on Artificial Intelligence (IJCAI)},
  year={2022}
}