Cannot get the same acc? #8
Comments
When the batch size is changed, the learning rate should be scaled accordingly. In your case, the learning rate should be 0.1.
Thanks for your comments.
Hi! And sorry to bother you guys again,
Nothing; just make sure you are using cosine learning-rate decay and keep the linear relation between batch size and learning rate.
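The two rules above can be sketched in a few lines. This is a minimal illustration, not the repo's code: the base values (batch size 512, LR 0.2) are inferred from this thread (batch 256 should use LR 0.1), and the function names are made up for the example.

```python
import math

# Assumed base configuration, inferred from this thread:
# batch size 512 pairs with LR 0.2, so batch 256 pairs with LR 0.1.
BASE_BATCH, BASE_LR = 512, 0.2

def scaled_lr(batch_size, base_batch=BASE_BATCH, base_lr=BASE_LR):
    """Linear scaling rule: keep LR proportional to batch size."""
    return base_lr * batch_size / base_batch

def cosine_lr(step, total_steps, peak_lr):
    """Cosine decay from peak_lr at step 0 down to 0 at total_steps."""
    return 0.5 * peak_lr * (1 + math.cos(math.pi * step / total_steps))

peak = scaled_lr(256)            # 0.1 for batch size 256
start = cosine_lr(0, 100, peak)  # schedule starts at the peak LR
end = cosine_lr(100, 100, peak)  # and decays to 0 at the final step
```

In PyTorch this schedule is usually expressed with `torch.optim.lr_scheduler.CosineAnnealingLR` rather than by hand; the sketch just makes the two rules explicit.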
Many thanks~
Hello, sorry to bother you, @getterk96. I have the same problem and cannot reproduce the accuracy reported in the paper. Using the stage-2 cRT weights provided on GitHub (resnext50_crt_uni2bal) by the authors, with no changes to the config file except the weight path, I get the following accuracy on the val set:

Phase: val
Evaluation_accuracy_micro_top1: 0.490
Averaged F-measure: 0.478
Many_shot_accuracy_top1: 0.610
Median_shot_accuracy_top1: 0.459
Low_shot_accuracy_top1: 0.265

And when I test on the test set, I get:

Phase: test
Evaluation_accuracy_micro_top1: 0.481
Averaged F-measure: 0.467
Many_shot_accuracy_top1: 0.602
Median_shot_accuracy_top1: 0.445
Low_shot_accuracy_top1: 0.266

That is: many 60.2, median 44.5, few 26.6, all 48.1. But the result in Table 7 of the paper is many 61.8, median 46.2, few 27.4, all 49.6, which is quite different from my experimental results, so I am very confused. I have two guesses: one is that there is some problem with my dataset; the other is that the accuracy in the paper may be on the val set, since the gap to the paper is smaller on val than on test in my experiment. Can you tell me your judgment or your experimental results? Looking forward to your reply @bingykang
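For readers comparing these splits: the Many/Median/Low-shot numbers above are per-bucket top-1 accuracies, where each class is bucketed by its number of training images (the ImageNet-LT convention is more than 100 images for many-shot and fewer than 20 for few/low-shot). A minimal sketch, with illustrative names and toy counts that are not from the repo:

```python
def shot_accuracies(preds, labels, train_counts, many=100, few=20):
    """Per-bucket top-1 accuracy, bucketing classes by training-set size.

    preds/labels: per-sample predicted and true class ids.
    train_counts: maps class id -> number of training images for that class.
    """
    stats = {"many": [0, 0], "median": [0, 0], "low": [0, 0]}  # [correct, total]
    for p, y in zip(preds, labels):
        n = train_counts[y]
        key = "many" if n > many else ("low" if n < few else "median")
        stats[key][0] += int(p == y)
        stats[key][1] += 1
    out = {k: c / t if t else 0.0 for k, (c, t) in stats.items()}
    out["overall"] = sum(c for c, _ in stats.values()) / len(labels)
    return out

# Toy example with hypothetical counts: class 0 is many-shot,
# class 1 is median-shot, class 2 is low-shot.
accs = shot_accuracies(
    preds=[0, 1, 2, 2],
    labels=[0, 1, 2, 0],
    train_counts={0: 150, 1: 50, 2: 10},
)
```

The overall number is a plain micro top-1 over all samples, which is why "all" can sit between the bucket accuracies.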
You need to check the stage-1 training. In my experience the final result is strongly affected by stage 1. Also, the accuracy on ImageNet-LT is reproducible: on my side I get 44.7 after stage 1 and 49.9 after stage 2.
Oh I see, so the accuracy in the paper is on the test set, not the val set, just with larger variance and less stability, right? And the weights on GitHub are not the optimal ones, which explains why directly testing the authors' stage-2 weights gives lower accuracy. Okay, thank you very much!!! @getterk96
I appreciate the work you guys have done, and the contribution is remarkable!
I'm trying to rebuild a stage-1 model from the script named as
feat_uniform.yaml
that you provided. The only change I made was reducing the batch size from 512 to 256. After training the stage-1 model, I got the accuracies on ImageNet_LT as follows:
Then, I trained the stage-2 model using the script
cls_crt.yaml
, trying to get the accuracies reported in the paper. I got the following results:
I also used the pretrained model you provided as the base model for stage-2 training, and then got results approximately the same as those in the paper. I realise that the stage-1 training configuration may not be the optimal one you used to train the pretrained model. If so, could you please update the training script to a version that can reproduce the final accuracies?
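For context, the stage-2 (cRT) step that `cls_crt.yaml` drives boils down to freezing the stage-1 backbone and retraining only a freshly initialized linear classifier on class-balanced batches. Below is a minimal PyTorch sketch of that idea with dummy shapes and random data; the tiny backbone, the dimensions, and the single training step are placeholders, not the repo's actual ResNeXt-50 pipeline.

```python
import torch
import torch.nn as nn

# Dummy stand-ins so the sketch runs end to end; in practice the backbone is
# the trained stage-1 ResNeXt-50 and the batches come from a class-balanced
# sampler (e.g. torch.utils.data.WeightedRandomSampler).
FEAT_DIM, NUM_CLASSES = 16, 4
backbone = nn.Sequential(nn.Linear(8, FEAT_DIM), nn.ReLU())  # placeholder net

for p in backbone.parameters():
    p.requires_grad = False          # stage-1 representation stays fixed
backbone.eval()
w0 = backbone[0].weight.clone()      # snapshot to show the backbone is frozen

classifier = nn.Linear(FEAT_DIM, NUM_CLASSES)  # re-initialized classifier
optimizer = torch.optim.SGD(classifier.parameters(), lr=0.1, momentum=0.9)
criterion = nn.CrossEntropyLoss()

# One training step on random stand-in data.
images = torch.randn(32, 8)
targets = torch.randint(0, NUM_CLASSES, (32,))
with torch.no_grad():
    feats = backbone(images)         # features from the frozen backbone
loss = criterion(classifier(feats), targets)
optimizer.zero_grad()
loss.backward()
optimizer.step()                     # only the classifier is updated
```

Because only the classifier's parameters are handed to the optimizer, stage-2 quality depends entirely on the frozen stage-1 features, which is consistent with the comment above that results are very sensitive to stage-1 training.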
Thanks