
Unable to reproduce the results of the paper #49

Closed
GuohongLi opened this issue Nov 16, 2019 · 2 comments

Comments


GuohongLi commented Nov 16, 2019

For ImageNet_LT, I just used the default config in the code, but I cannot reproduce the results reported in Table 3(a) of the paper.

  1. For stage 1, my results are (last logs after training completed):
    Epoch: [30/30] Step: 440 Minibatch_loss_performance: 2.833 Minibatch_accuracy_micro: 0.438
    Epoch: [30/30] Step: 450 Minibatch_loss_performance: 2.886 Minibatch_accuracy_micro: 0.379
    Phase: val
    100%|██████████| 79/79 [01:40<00:00, 1.34it/s]
    Phase: val
    Evaluation_accuracy_micro_top1: 0.220
    Averaged F-measure: 0.175
    Many_shot_accuracy_top1: 0.427 Median_shot_accuracy_top1: 0.113 Low_shot_accuracy_top1: 0.007
    Training Complete.
    Best validation accuracy is 0.220 at epoch 30

The few/low-shot accuracy of 0.7% is actually better than the 0.4% of the Plain model in Table 3(a).


[The part below is important!]
2. However, for stage 2, my results are (last logs after training completed):
Epoch: [60/60] Step: 440 Minibatch_loss_feature: 0.569 Minibatch_loss_performance: 2.938 Minibatch_accuracy_micro: 0.566
Epoch: [60/60] Step: 450 Minibatch_loss_feature: 0.567 Minibatch_loss_performance: 2.845 Minibatch_accuracy_micro: 0.539
Phase: val
100%|██████████| 79/79 [01:34<00:00, 1.02it/s]
Phase: val
Evaluation_accuracy_micro_top1: 0.340
Averaged F-measure: 0.324
Many_shot_accuracy_top1: 0.401 Median_shot_accuracy_top1: 0.334 Low_shot_accuracy_top1: 0.197
Training Complete.
Best validation accuracy is 0.341 at epoch 48

However, the many-, median-, and few/low-shot accuracies are 40.1%, 33.4%, and 19.7%, which differ somewhat from the 43.2%, 35.1%, and 18.5% of the "Ours" model in Table 3(a).
I have retrained several times, and the many-shot accuracy always comes out somewhat lower than 43.2%.


Are there any tricks that have not been released?

zhmiao (Owner) commented Dec 19, 2019

@GuohongLi Thank you very much for asking, and sorry for the late reply. As noted in your other issue (#50 (comment)), we finally debugged the published code, and the current open-set performance is:

============
Phase: test

Evaluation_accuracy_micro_top1: 0.361
Averaged F-measure: 0.501
Many_shot_accuracy_top1: 0.442 Median_shot_accuracy_top1: 0.352 Low_shot_accuracy_top1: 0.175

==========

This is higher than what we reported in the paper. We updated some of the modules to use the clone() method, and set use_fc to False in the first stage. These changes lead to the proper results. Please give it a try. Thank you very much again.
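For readers unfamiliar with why clone() matters here, below is a minimal, hedged sketch (not the repo's actual module code) of the general PyTorch pitfall: some ops, such as torch.exp, save their output for the backward pass, so editing that output in place can break backward(). Cloning first keeps the saved tensor intact.

```python
import torch

# Hedged sketch: torch.exp saves its output for the backward pass.
# Editing `y` in place would invalidate that saved tensor; editing a
# clone leaves `y` untouched, so backward() still succeeds.
x = torch.ones(3, requires_grad=True)
y = torch.exp(x)        # backward of exp reuses y itself
safe = y.clone()        # independent copy of the data
safe[0] = 0.0           # in-place edit touches only the copy
y.sum().backward()      # works: y was never modified in place
print(safe[0].item(), round(x.grad[0].item(), 3))  # 0.0 2.718
```

The same reasoning applies inside any module whose forward pass mutates tensors that autograd still needs.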

For Places, the current config won't work either. The reason we could not reproduce the reported results is that we forgot that in the first stage we actually did not freeze the weights; we only freeze them in the second stage. We will update the corresponding code as soon as possible.
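The stage-wise freezing described above can be sketched as follows. This is a hedged illustration under assumed module names (backbone, classifier, configure_stage are not the repo's identifiers): stage 1 trains the feature extractor, while stage 2 freezes it and trains only the classifier on top.

```python
import torch.nn as nn

# Hypothetical two-stage setup; names are illustrative, not the repo's.
backbone = nn.Sequential(nn.Linear(8, 8), nn.ReLU(), nn.Linear(8, 8))
classifier = nn.Linear(8, 4)

def configure_stage(stage: int) -> None:
    """Stage 1: backbone trainable. Stage 2: backbone frozen."""
    for p in backbone.parameters():
        p.requires_grad = (stage == 1)

configure_stage(2)
backbone_frozen = all(not p.requires_grad for p in backbone.parameters())
classifier_trainable = all(p.requires_grad for p in classifier.parameters())
print(backbone_frozen, classifier_trainable)  # True True
```

In practice the optimizer for stage 2 would then be built only from parameters with requires_grad=True, so frozen weights receive no updates.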

zhmiao closed this as completed Dec 19, 2019
zhmiao (Owner) commented Feb 11, 2020

@GuohongLi Hello, we just updated the configuration files for Places, and the newly obtained results are a little better than reported. Please check out the updates. Thanks!
