Questions about self-supervised learning on cifar10 #5

Closed
ZezhouCheng opened this issue Oct 18, 2020 · 7 comments

@ZezhouCheng

Thanks for sharing the code! This work is really interesting to me. My questions are as follows:

I'm trying to reproduce the results in Table 2. Specifically, I trained the models with and without self-supervised pre-training (SSP). However, the baselines (w/o SSP) consistently outperform the SSP models under different training rules (including None, Resample, and Reweight). The best precisions are presented below. For each experimental setting, I ran the training twice to check whether the results are stable, so there are two numbers per cell.

[image: results table for CIFAR-10-LT, with and without SSP, per training rule]

For your reference, I used the following commands:

  • Train Rotation
python pretrain_rot.py --dataset cifar10  --imb_factor 0.01 --arch resnet32
  • Train baseline
python train.py --dataset cifar10 --imb_factor 0.01 --arch resnet32 --train_rule None 
  • Train baseline + SSP
python train.py --dataset cifar10 --imb_factor 0.01 --arch resnet32 --train_rule None --pretrained_model xxx 
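
For reference, the objective that pretrain_rot.py optimizes is the rotation pretext task. Below is a minimal, generic sketch of that task (the helper names `rotation_batch` and `rot_head` are illustrative, not the repo's code): each image is rotated by 0/90/180/270 degrees and the network predicts the rotation as a 4-way classification problem.

```python
# Generic sketch of the rotation pretext task (illustrative, not the repo's exact code).
import torch
import torch.nn.functional as F

def rotation_batch(images):
    """images: (B, C, H, W) -> 4 rotated copies (4B, C, H, W) and rotation labels (4B,)."""
    rotated = torch.cat([torch.rot90(images, k, dims=(2, 3)) for k in range(4)], dim=0)
    labels = torch.arange(4, device=images.device).repeat_interleave(images.size(0))
    return rotated, labels

def rotation_loss(backbone, rot_head, images):
    """Self-supervised loss: cross-entropy on the predicted rotation (0/90/180/270)."""
    x, y = rotation_batch(images)
    logits = rot_head(backbone(x))  # rot_head: a 4-way linear classifier on backbone features
    return F.cross_entropy(logits, y)
```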
@ZezhouCheng
Author

I also tried the same experiments on CIFAR-100; the results are below:

[image: results table for CIFAR-100-LT, with and without SSP, per training rule]

I'm not sure if there's something I missed. For example, do you use the same optimization hyperparameters (e.g., learning rate) for all of these experiments? In my experience, self-supervised pretrained features generally require a different learning rate during linear evaluation.
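
To be concrete, a generic sketch of what "a different learning rate during linear evaluation" refers to is below (plain PyTorch, not this repo's train.py; `build_linear_eval`, `feat_dim`, and `head_lr` are placeholder names): the SSL backbone is frozen and only a linear head is trained, with its learning rate tuned separately from the end-to-end supervised baseline.

```python
# Generic linear-evaluation sketch (not this repo's train.py; names are placeholders):
# freeze the SSL backbone and train only a linear classifier, whose learning rate
# is typically tuned separately from the end-to-end supervised baseline.
import torch
import torch.nn as nn

def build_linear_eval(backbone, feat_dim=64, num_classes=10, head_lr=0.1):
    for p in backbone.parameters():
        p.requires_grad = False                      # keep SSL features fixed
    head = nn.Linear(feat_dim, num_classes)          # only this part is trained
    optimizer = torch.optim.SGD(head.parameters(), lr=head_lr,
                                momentum=0.9, weight_decay=0.0)
    return head, optimizer
```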

@YyzHarry
Owner

Hi there, thanks for your interest! I just took a quick look at your results, and they seem a bit odd to me. From my experience, with train_rule set to "None" (i.e., vanilla CE) on CIFAR-10-LT, the typical number should be around 70% (see also the baseline results from this paper and this paper). Also, for those baseline models, using the other train_rules should lead to better results than "None", which is not the case in your results. It looks to me as if the two rows are somehow reversed (or something else odd is happening). Could you please double-check that the results are correct? If so, we can look into it further and see what is going on.

@ZezhouCheng
Author

Thanks for the quick reply! I double-checked my experiments. Sorry, it seems I mistakenly fixed the training rule to 'DRW' when I ran vanilla CE on CIFAR-10. Below are the updated results.

[image: updated results table for CIFAR-10-LT]

However, adding SSP still does not improve performance. Here are the log files for the CIFAR-10 experiments:

https://www.dropbox.com/sh/ex0oiduxu93u3y0/AACRVJu_bxbjTJg3nc-beKaBa?dl=0

@ZezhouCheng
Author

I'm able to reproduce the baselines using https://github.com/kaidic/LDAM-DRW. I noticed that in train.py the batch size is set to 256, while the default batch size is 128 in the LDAM-DRW source code. This seems to make a difference on CIFAR-100. For example, I tested train.py with CE + random init. + Resample: 30% accuracy with batch size 256 vs. 34% accuracy with batch size 128, but there is still a gap with LDAM-DRW, which achieves 38.59% under the same setting. I'm not sure what other factors might cause this gap.

@YyzHarry
Owner

YyzHarry commented Oct 21, 2020

> the batch size is set to 256 while the default batch size is 128 in DRW source code

That's a good catch. I just checked the settings I used when I ran the experiments, and the batch size was 128 on CIFAR. There was likely some inconsistency in the default value introduced when I cleaned up the code (I've already updated it).

Regarding your questions, I quickly ran three experiments. For the baseline on CIFAR-10-LT with None, I got 71.03%. Using None + SSP, I got 73.99%. For DRW + SSP, I got 77.45%, which is even slightly higher than the number reported in our paper. I'm using the Rotation checkpoint I provided in this repo, which has 83.07% test accuracy, similar to yours. I also checked your log, which looks fine to me. So currently I'm not sure what causes the difference. I would suggest you try the Rotation SSP checkpoint I provided, to see if that makes any difference.

Otherwise, you may want to check whether the pre-trained weights are loaded correctly, as well as the exact training settings, such as the PyTorch version (1.4 for this repo) or the number of GPUs used (only 1 for the CIFAR experiments), etc.
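
For example, one quick way to sanity-check the loading step is sketched below (illustrative only: `model` is assumed to be the ResNet-32 being fine-tuned, and the checkpoint path and key names are placeholders). Load with strict=False and inspect which keys are missing or unexpected, and whether the backbone tensors actually changed.

```python
# Illustrative sanity check that the SSP checkpoint actually gets loaded.
# 'model' is assumed to be the ResNet-32 being fine-tuned; the checkpoint path
# and key names are placeholders (e.g., keys may carry a 'module.' prefix if the
# checkpoint was saved from DataParallel).
import torch

ckpt = torch.load('rotation_checkpoint.pth.tar', map_location='cpu')
state_dict = ckpt.get('state_dict', ckpt)

before = {k: v.clone() for k, v in model.state_dict().items()}
missing, unexpected = model.load_state_dict(state_dict, strict=False)
print('missing keys:', missing)        # e.g., the final classifier, which is re-initialized
print('unexpected keys:', unexpected)  # e.g., the rotation head, which is discarded

changed = sum((before[k] != v).any().item() for k, v in model.state_dict().items())
print(f'{changed} parameter tensors changed after loading')
```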

@ZezhouCheng
Author

Thanks for the help! Let me try these ideas.

@87nohigher

I ran into the same problem. Did you manage to solve it, and if so, how? Thanks!

