
Concern on the details of the comparison results in Table-2 #2

Closed
PkuRainBow opened this issue Oct 30, 2020 · 4 comments

PkuRainBow commented Oct 30, 2020

Really nice paper!

We carefully read your work and find the experimental setting on Pascal-VOC in Table-2 (shown below) really interesting: in the last column of Table-2, all the methods use only 92 images as the labeled set and choose the train_aug set (10582 images) as the unlabeled set, according to the code:

wss/core/data_generator.py, lines 85 to 104 at commit 8069dbe:

_PASCAL_VOC_SEG_INFORMATION = DatasetDescriptor(
    splits_to_sizes={
        'train': 1464,
        'train_aug': 10582,
        'trainval': 2913,
        'val': 1449,
        # Splits for semi-supervised
        '4_clean': 366,
        '8_clean': 183,
        # Balanced 1/16 split
        # Sample with rejection, suffix represents the sample index
        # e.g., 16_clean_3 represents the 3rd random shuffle to sample 1/16
        # split, given a fixed random seed 8888
        '16_clean_3': 92,
        '16_clean_14': 92,
        '16_clean_20': 92,
        # More images
        'coco2voc_labeled': 91937,
        'coco2voc_unlabeled': 215340,
    },

and:

split_name=FLAGS.train_split_cls,

Our understanding is that FLAGS.train_split_cls represents the set of unlabeled images used for training, and its value is train_aug by default. So the number of unlabeled images is more than 100x the number of labeled images. Given that the total number of training iterations is set as training_number_of_steps=30000 with a batch size of 64, we will iterate over the 92 sampled labeled images for nearly 30000 x 64 / 92 ≈ 20869 epochs. Is my understanding correct?
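For concreteness, here is that arithmetic as a small sketch (the batch size of 64 is an assumption carried over from the 30000 x 64 / 92 computation above):

training_number_of_steps = 30000
batch_size = 64         # assumed, as in the 30000 x 64 / 92 computation
num_labeled = 92        # a '16_clean_*' split
num_unlabeled = 10582   # the 'train_aug' split

images_seen = training_number_of_steps * batch_size
print(images_seen / num_labeled)    # ~20869 epochs over the labeled set
print(images_seen / num_unlabeled)  # ~181 epochs over the unlabeled set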

If our understanding is correct, we are curious whether training for so many epochs on the 92 labeled images is a good choice. Besides, since the train_aug set (10582 images) contains the 92 labeled images, we guess all the methods also apply the pseudo-label-based/consistency-based losses to the labeled images (instead of only to the unlabeled images).
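As a quick sketch of what we mean by the overlap (the split file names here are hypothetical, not necessarily the repo's actual paths):

# Hypothetical split files; the actual names/paths in the repo may differ.
with open('16_clean_3.txt') as f:
    labeled_ids = set(f.read().split())
with open('train_aug.txt') as f:
    train_aug_ids = set(f.read().split())

# If this prints True, every labeled image also receives the
# pseudo-label/consistency losses.
print(labeled_ids.issubset(train_aug_ids))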

Many thanks, and we look forward to your explanation in case our understanding is wrong!

[Screenshot of Table-2 from the paper]


Yuliang-Zou commented Oct 30, 2020

Hi @PkuRainBow

  1. Yes, your understanding is correct. We use the same number of iterations for all the data splits because we need to iterate through the unlabeled set enough times (if you count the number of epochs based on the unlabeled set, then it is the same across splits).
  2. Yes, those 92 images are also in the unlabeled set. I follow the common practice in SSL classification here.

BTW, we sample those 92 images so that the number of pixels for each class is roughly balanced. You might not always get a good result if you pick an arbitrary set of 92 images (see Appendix C).
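A rough sketch of what such rejection sampling could look like (this is not our actual code; pixel_counts and the balance criterion here are assumptions, only the seed 8888 comes from the snippet above):

import random

def sample_balanced_split(image_ids, pixel_counts, k=92, seed=8888,
                          max_ratio=20.0, max_tries=1000):
    # Shuffle repeatedly and take the first k images, rejecting draws
    # whose per-class pixel totals are too imbalanced.
    rng = random.Random(seed)
    ids = list(image_ids)
    for _ in range(max_tries):
        rng.shuffle(ids)
        candidate = ids[:k]
        totals = {}
        for img in candidate:
            for cls, n in pixel_counts[img].items():
                totals[cls] = totals.get(cls, 0) + n
        if max(totals.values()) <= max_ratio * max(min(totals.values()), 1):
            return candidate
    raise RuntimeError('no balanced split found within max_tries')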


PkuRainBow commented Nov 2, 2020

@Yuliang-Zou Many thanks for your explanation. We still have a small concern about your experimental setting.

According to your explanation, your method will in fact train over the 92 labeled images for about 20869 epochs, which might cause serious overfitting in the supervised part of the training. We also found that the authors of CutMix face the same challenge, and we paste the discussion here: Britefury/cutmix-semisup-seg#5 (comment)

So we are really interested in how your experimental setting addresses the overfitting problem. We look forward to your explanation!

@Yuliang-Zou

I don't have a clear answer yet, but I guess it could be related to the training schedule. In the beginning, the supervised loss dominates the optimization; as we train for more iterations, the unsupervised loss starts to take effect and gradually dominates. Just for your reference, FixMatch (a semi-supervised classification method) has an experiment training on CIFAR-10 with only 10 labeled images, and it works quite well.
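One explicit version of this behavior, as a sketch (the sigmoid ramp-up is a common SSL heuristic, e.g. in Mean Teacher, not necessarily what this repo does):

import math

def unsup_weight(step, ramp_up_steps=10000, max_weight=1.0):
    # Sigmoid-shaped ramp-up from ~0 to max_weight: the supervised loss
    # dominates early, the unsupervised loss takes over later.
    t = min(step, ramp_up_steps) / ramp_up_steps
    return max_weight * math.exp(-5.0 * (1.0 - t) ** 2)

# total_loss = sup_loss + unsup_weight(step) * unsup_loss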

@PkuRainBow

@Yuliang-Zou Thanks for your reply. The balance between the supervised and unsupervised losses might be a good way to avoid this problem. If our understanding is correct, it is important to ensure that the unsupervised loss dominates in the late stage. However, there seems to be no explicit mechanism to guarantee such a schedule, so we guess that an explicit re-weighting scheme might address this problem.
