Accuracy results on cifar100 #4

Closed
shuikehuo opened this issue Sep 14, 2019 · 4 comments

@shuikehuo commented Sep 14, 2019

The paper reports 75.30% accuracy on the clean test set, but I obtain 78.16% accuracy on the same test set. I use ResNet-50 with an SGD + momentum optimizer, trained for 350 epochs.

@rohan-anil (Collaborator) commented Sep 14, 2019

Hi Shuikehuo,

We used a Resnet-56 without batch norm from [1], which explains the accuracy difference (it is a weaker baseline). It was trained with the SGD optimizer for 50k steps at batch size 128.

The experiment shows the effect of noisy labels on test accuracy when training with the logistic loss versus the bi-tempered logistic loss. We expect the accuracy delta to remain similar even when training with a Resnet-50 (with batch norm) or a model of similar capacity. We will soon make the code for the Resnet-56 model without batch norm from [1] available to reproduce the results.

Thanks,

[1] Identity Matters in Deep Learning, Moritz Hardt, Tengyu Ma, https://arxiv.org/pdf/1611.04231.pdf
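
For reference, a minimal NumPy sketch of the bi-tempered logistic loss as described in the paper (the official implementation lives in the google/bi-tempered-loss repository; the function names and iteration count below are illustrative choices, not that repo's API):

```python
import numpy as np

def log_t(x, t):
    """Tempered logarithm; reduces to log(x) as t -> 1."""
    if t == 1.0:
        return np.log(x)
    return (x ** (1.0 - t) - 1.0) / (1.0 - t)

def exp_t(x, t):
    """Tempered exponential, the inverse of log_t; heavier-tailed than exp for t > 1."""
    if t == 1.0:
        return np.exp(x)
    return np.maximum(1.0 + (1.0 - t) * x, 0.0) ** (1.0 / (1.0 - t))

def tempered_softmax(activations, t, num_iters=20):
    """exp_t of the activations shifted by a normalization constant,
    found by fixed-point iteration (for t > 1 there is no closed form)."""
    mu = np.max(activations, axis=-1, keepdims=True)
    a0 = activations - mu
    a = a0
    for _ in range(num_iters):
        z = np.sum(exp_t(a, t), axis=-1, keepdims=True)
        a = a0 * z ** (1.0 - t)
    z = np.sum(exp_t(a, t), axis=-1, keepdims=True)
    lam = -log_t(1.0 / z, t) + mu
    return exp_t(activations - lam, t)

def bi_tempered_logistic_loss(activations, labels, t1, t2):
    """Bi-tempered loss for one-hot labels, with t1 <= 1 <= t2.
    t1 < 1 makes the loss bounded (robust to outliers); t2 > 1 makes
    the tempered softmax heavy-tailed (robust to noisy labels)."""
    probs = tempered_softmax(activations, t2)
    # The tiny epsilon only guards t1 == 1, where log_t(0) = -inf.
    loss = (labels * (log_t(labels + 1e-10, t1) - log_t(probs, t1))
            - (labels ** (2.0 - t1) - probs ** (2.0 - t1)) / (2.0 - t1))
    return np.sum(loss, axis=-1)
```

Setting t1 = t2 = 1 recovers ordinary softmax cross entropy, which makes for an easy sanity check.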

@Charles-Xie

@rohan-anil Is there any reason to use ResNet-56 without batch normalization? This network does not seem to be used much in experiments.

When I use ResNet-110 with BN (as introduced in the ResNet v1 paper), the accuracy delta (improvement) does not seem very pronounced, for either clean or noisy labels.

[attached image: results]

@eamid (Collaborator) commented Jan 15, 2020

Hi Chi,

Thank you for your interest in our method.

We used the Resnet-56 model because we had the baseline readily available (Moritz was at Google, and we used his codebase). I noticed that the bi-tempered loss still gives some improvement in your case. You might achieve an even larger improvement by tuning t1 and t2 (I would suggest trying a larger t2 value).

Ehsan
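
To make the tuning advice concrete, a quick illustration reusing bi_tempered_logistic_loss from the sketch above (the temperature pairs are illustrative values, not tuned recommendations): on an example the network classifies confidently but whose label is wrong, lowering t1 and raising t2 shrinks the loss, so noisy labels pull less on the model.

```python
import numpy as np

activations = np.array([8.0, 0.0, 0.0])  # network is confident about class 0
wrong_label = np.array([0.0, 1.0, 0.0])  # ...but the (noisy) label says class 1

# t1 = t2 = 1 is plain softmax cross entropy; the other pairs are illustrative.
for t1, t2 in [(1.0, 1.0), (0.9, 1.5), (0.5, 4.0)]:
    loss = bi_tempered_logistic_loss(activations, wrong_label, t1, t2)
    print(f"t1={t1}, t2={t2}: loss={loss:.3f}")
```

With these inputs the loss drops from about 8 (the logit gap, as with cross entropy) to well under 1 at (0.5, 4.0), which is the mechanism behind the robustness to label noise.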

@Charles-Xie

@eamid
Thanks a lot!

eamid closed this as completed Apr 23, 2020