
Unable to reproduce the results of the paper #49

Closed
GuohongLi opened this issue Nov 16, 2019 · 2 comments

Comments


GuohongLi commented Nov 16, 2019

For ImageNet_LT, I just used the default config in the code, but I cannot reproduce the results reported in Table 3(a) of the paper.

  1. For stage 1, my results are (last logs after training completed):
    Epoch: [30/30] Step: 440 Minibatch_loss_performance: 2.833 Minibatch_accuracy_micro: 0.438
    Epoch: [30/30] Step: 450 Minibatch_loss_performance: 2.886 Minibatch_accuracy_micro: 0.379
    Phase: val
    100%|██████████| 79/79 [01:40<00:00, 1.34it/s]
    Phase: val
    Evaluation_accuracy_micro_top1: 0.220
    Averaged F-measure: 0.175
    Many_shot_accuracy_top1: 0.427 Median_shot_accuracy_top1: 0.113 Low_shot_accuracy_top1: 0.007
    Training Complete.
    Best validation accuracy is 0.220 at epoch 30

The few/low-shot accuracy of 0.7% is actually better than the 0.4% of the Plain model in Table 3(a).


[The part below is important!]
2. However, for stage 2, my results are (last logs after training completed):
Epoch: [60/60] Step: 440 Minibatch_loss_feature: 0.569 Minibatch_loss_performance: 2.938 Minibatch_accuracy_micro: 0.566
Epoch: [60/60] Step: 450 Minibatch_loss_feature: 0.567 Minibatch_loss_performance: 2.845 Minibatch_accuracy_micro: 0.539
Phase: val
100%|██████████| 79/79 [01:34<00:00, 1.02it/s]
Phase: val
Evaluation_accuracy_micro_top1: 0.340
Averaged F-measure: 0.324
Many_shot_accuracy_top1: 0.401 Median_shot_accuracy_top1: 0.334 Low_shot_accuracy_top1: 0.197
Training Complete.
Best validation accuracy is 0.341 at epoch 48

However, the many-, median-, and few/low-shot accuracies are 40.1%, 33.4%, and 19.7%, which differ somewhat from the 43.2%, 35.1%, and 18.5% of the "Ours" model in Table 3(a).
I have retrained several times, and the many-shot accuracy always comes out somewhat lower than 43.2%.


Are there any tricks that have not been released?

zhmiao (Owner) commented Dec 19, 2019

@GuohongLi Thank you very much for asking, and sorry for the late reply. As noted in your other issue (#50 (comment)), we finally debugged the published code, and the current open-set performance is:

============
Phase: test

Evaluation_accuracy_micro_top1: 0.361
Averaged F-measure: 0.501
Many_shot_accuracy_top1: 0.442 Median_shot_accuracy_top1: 0.352 Low_shot_accuracy_top1: 0.175

==========

This is higher than what we reported in the paper. We updated some of the modules to use the clone() method, and set use_fc to False in the first stage. These changes lead to the proper results. Please give it a try. Thank you very much again.
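For readers unfamiliar with why clone() matters here, below is a minimal, hedged sketch (not the repo's actual module code) of the general PyTorch pitfall: some ops, such as torch.exp, save their output for the backward pass, so editing that output in place can break backward(). Cloning first keeps the saved tensor intact.

```python
import torch

# Hedged sketch: torch.exp saves its output for the backward pass.
# Editing `y` in place would invalidate that saved tensor; editing a
# clone leaves `y` untouched, so backward() still succeeds.
x = torch.ones(3, requires_grad=True)
y = torch.exp(x)        # backward of exp reuses y itself
safe = y.clone()        # independent copy of the data
safe[0] = 0.0           # in-place edit touches only the copy
y.sum().backward()      # works: y was never modified in place
print(safe[0].item(), round(x.grad[0].item(), 3))  # 0.0 2.718
```

The same reasoning applies inside any module whose forward pass mutates tensors that autograd still needs.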

For Places, the current config won't work either. The reason we could not reproduce the reported results is that we forgot that in the first stage we actually did not freeze the weights; we only freeze them in the second stage. We will update the corresponding code as soon as possible.
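The stage-wise freezing described above can be sketched as follows. This is a hedged illustration under assumed module names (backbone, classifier, configure_stage are not the repo's identifiers): stage 1 trains the feature extractor, while stage 2 freezes it and trains only the classifier on top.

```python
import torch.nn as nn

# Hypothetical two-stage setup; names are illustrative, not the repo's.
backbone = nn.Sequential(nn.Linear(8, 8), nn.ReLU(), nn.Linear(8, 8))
classifier = nn.Linear(8, 4)

def configure_stage(stage: int) -> None:
    """Stage 1: backbone trainable. Stage 2: backbone frozen."""
    for p in backbone.parameters():
        p.requires_grad = (stage == 1)

configure_stage(2)
backbone_frozen = all(not p.requires_grad for p in backbone.parameters())
classifier_trainable = all(p.requires_grad for p in classifier.parameters())
print(backbone_frozen, classifier_trainable)  # True True
```

In practice the optimizer for stage 2 would then be built only from parameters with requires_grad=True, so frozen weights receive no updates.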

zhmiao closed this as completed Dec 19, 2019
zhmiao (Owner) commented Feb 11, 2020

@GuohongLi Hello, we just updated the configuration files for Places, and the newly obtained results are a little better than reported. Please check out the updates. Thanks!
