The PyTorch implementation of Expand Prototype Modalities
- Python 3.6
- pytorch>=1.6.0
- torchvision
- CUDA>=9.0
- pydensecrf from https://github.com/lucasb-eyer/pydensecrf
- other packages (opencv-python, etc.)
- Clone this repository.
- Data preparation.
Download the PASCAL VOC 2012 devkit following the instructions here. It is suggested to make a soft link to the downloaded dataset. Then download the annotations of the VOC 2012 trainaug set (containing 10582 images) from here and place them all as `VOC2012/SegmentationClassAug/xxxxxx.png`. Download the image-level labels `cls_label.npy` from here and place it into `voc12/`, or generate it yourself (a sketch is given after this list).
- Download ImageNet pretrained backbones. We use ResNet-38 for initial seed generation and ResNet-101 for segmentation training. Download the pretrained ResNet-38 from here. ResNet-101 can be downloaded from here.
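If you choose to generate `cls_label.npy` yourself, the sketch below shows one way to build image-level multi-hot labels from the VOC XML annotations. The class order, list-file parsing, and output layout here are assumptions for illustration and may differ from the file this repo actually expects.

```python
# Hypothetical sketch for building voc12/cls_label.npy from VOC XML annotations.
# Class order, list parsing, and the dict-of-arrays layout are assumptions.
import os
import numpy as np
import xml.etree.ElementTree as ET

VOC_CLASSES = [
    'aeroplane', 'bicycle', 'bird', 'boat', 'bottle', 'bus', 'car', 'cat',
    'chair', 'cow', 'diningtable', 'dog', 'horse', 'motorbike', 'person',
    'pottedplant', 'sheep', 'sofa', 'train', 'tvmonitor']

def image_level_label(xml_path):
    # Multi-hot vector over the 20 foreground classes for one image.
    label = np.zeros(len(VOC_CLASSES), dtype=np.float32)
    for obj in ET.parse(xml_path).findall('object'):
        label[VOC_CLASSES.index(obj.find('name').text)] = 1.0
    return label

def build_cls_label(voc12_root, list_file='voc12/train_aug.txt',
                    out_file='voc12/cls_label.npy'):
    with open(list_file) as f:
        names = [l.strip().split()[0].split('/')[-1].split('.')[0]
                 for l in f if l.strip()]
    labels = {n: image_level_label(os.path.join(voc12_root, 'Annotations', n + '.xml'))
              for n in set(names)}
    np.save(out_file, labels)
```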
Download the trained models and category performance below.
| baseline | model | train mIoU | val mIoU | test mIoU | checkpoint (OneDrive) | category performance (test) |
| --- | --- | --- | --- | --- | --- | --- |
| PPC | contrast | 61.5 | 58.4 | - | [download] | |
| | affinitynet | 69.2 | - | | [download] | |
| | deeplabv1 | - | 67.7* | 67.4* | [download] | [link] |
| Ours | contrast | 64.4 | 61.3 | - | [download] | |
| | deeplabv2 | - | 69.0 | 69.6 | [download] | [link] |

\* indicates using densecrf.
The training results, including initial seeds, intermediate products, and pseudo masks, can be found here. Trained weights and masks will be published soon.
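The asterisked numbers in the table above use denseCRF post-processing, which is also why `pydensecrf` appears in the requirements. A minimal refinement sketch is given below; the pairwise kernel parameters are commonly used defaults, not necessarily the ones used for the reported numbers.

```python
# Hypothetical denseCRF refinement of softmax probabilities with pydensecrf.
# Kernel parameters (sxy, srgb, compat) are illustrative defaults.
import numpy as np
import pydensecrf.densecrf as dcrf
from pydensecrf.utils import unary_from_softmax

def crf_refine(image, probs, n_iters=10):
    # image: (H, W, 3) uint8 RGB array; probs: (C, H, W) softmax probabilities.
    c, h, w = probs.shape
    d = dcrf.DenseCRF2D(w, h, c)
    d.setUnaryEnergy(unary_from_softmax(probs))
    d.addPairwiseGaussian(sxy=3, compat=3)
    d.addPairwiseBilateral(sxy=80, srgb=13,
                           rgbim=np.ascontiguousarray(image), compat=10)
    q = d.inference(n_iters)
    return np.array(q).reshape(c, h, w).argmax(axis=0)  # (H, W) label map
```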
- Contrast train.

```bash
python contrast_clustering_train.py \
    --session_name $your_session_name \
    --network network.resnet38_contrast_clustering \
    --lr 0.01 --num_workers 8 --train_list voc12/train_aug.txt \
    --weights pretrained/ilsvrc-cls_rna-a1_cls1000_ep-0001.params \
    --voc12_root /home/subin/Datasets/VOC2012/VOCdevkit/VOC2012 \
    --tblog_dir ./tblog --batch_size 8 --max_epoches 8
```
- Contrast inference.

If you train from scratch, set `--weights` and then run:

```bash
python contrast_infer.py \
    --weights $contrast_weight \
    --infer_list $[voc12/val.txt | voc12/train.txt | voc12/train_aug.txt] \
    --out_cam $your_cam_npy_dir \
    --out_cam_pred $your_cam_png_dir \
    --out_crf $your_crf_png_dir
```
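The `--out_cam_pred` masks come from thresholding the CAM scores against a fixed background score. A minimal sketch of that step is below; it assumes each saved `.npy` file holds a dict mapping foreground class indices to score maps, which is an assumption about this repo's on-disk format, and the threshold value is only an example.

```python
# Hypothetical conversion of a saved CAM .npy into a hard pseudo-label PNG.
# Assumes {class_index: (H, W) score map} storage and an example threshold.
import numpy as np
from PIL import Image

def cam_to_pseudo_label(cam_npy_path, out_png_path, bg_threshold=0.20):
    cam_dict = np.load(cam_npy_path, allow_pickle=True).item()
    keys = sorted(cam_dict.keys())
    cams = np.stack([cam_dict[k] for k in keys], axis=0)            # (C, H, W)
    # Prepend a constant background score, then take the per-pixel argmax.
    bg = np.full((1,) + cams.shape[1:], bg_threshold, dtype=cams.dtype)
    winner = np.argmax(np.concatenate([bg, cams], axis=0), axis=0)  # 0 = background
    # Map argmax indices back to VOC label ids (foreground class index + 1).
    label = np.zeros(winner.shape, dtype=np.uint8)
    for i, k in enumerate(keys):
        label[winner == i + 1] = k + 1
    Image.fromarray(label).save(out_png_path)
```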
- Evaluation.

Following SEAM, we recommend using `--curve` to select an optimal background threshold.

```bash
python eval.py \
    --list VOC2012/ImageSets/Segmentation/$[val.txt | train.txt] \
    --predict_dir $your_result_dir \
    --gt_dir VOC2012/SegmentationClass \
    --comment $your_comments \
    --type $[npy | png] \
    --curve True
```
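`eval.py` scores the predictions against the ground-truth masks with mIoU, and `--curve True` repeats the scoring over a range of background thresholds to pick the best one. The sketch below only illustrates the underlying mIoU computation, not this repo's exact script.

```python
# Hypothetical mIoU computation from a confusion matrix over label maps.
import numpy as np

def mean_iou(pred_gt_pairs, num_classes=21, ignore_index=255):
    conf = np.zeros((num_classes, num_classes), dtype=np.int64)
    for pred, gt in pred_gt_pairs:          # each is an (H, W) integer label map
        valid = gt != ignore_index
        conf += np.bincount(
            gt[valid].astype(np.int64) * num_classes + pred[valid].astype(np.int64),
            minlength=num_classes ** 2).reshape(num_classes, num_classes)
    inter = np.diag(conf)
    union = conf.sum(axis=0) + conf.sum(axis=1) - inter
    iou = inter / np.maximum(union, 1)
    return iou.mean(), iou
```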
We sincerely thank Ye Du, the author of PPC, for open-sourcing their work, whose codebase this repository borrows from. We are also grateful to the owners of the original repositories that PPC builds on.
We also thank Mengyang Zhao for sharing their PyTorch codebase for faster mean shift; we adapted it so that certain scikit-learn based CPU computations can run on the GPU instead.
Without them, we could not have finished this work.
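For context on the mean-shift adaptation mentioned above: the idea is to replace scikit-learn's CPU clustering loop with batched tensor operations so the iterations run on the GPU. The rough sketch below is not Mengyang Zhao's actual implementation; the Gaussian kernel, bandwidth, and fixed iteration count are illustrative choices.

```python
# Rough sketch of mean shift on GPU with a Gaussian kernel in PyTorch.
# Bandwidth, iteration count, and kernel choice are illustrative only.
import torch

def mean_shift(features, bandwidth=0.5, n_iters=15):
    # features: (N, D) tensor already on the target device (e.g. CUDA).
    points = features.clone()
    for _ in range(n_iters):
        dist2 = torch.cdist(points, features) ** 2            # (N, N) squared distances
        weights = torch.exp(-dist2 / (2 * bandwidth ** 2))    # Gaussian kernel weights
        # Each point moves to the weighted mean of all data points.
        points = (weights @ features) / weights.sum(dim=1, keepdim=True)
    return points  # converged modes; nearby modes can then be merged into clusters
```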