
Reproduction of the performance on "worst" label of "CE". #5

Closed
Seojin-Kim opened this issue Sep 13, 2022 · 1 comment

Comments

@Seojin-Kim

Hello,

I'm trying to reproduce the performance of "CE" in the paper.

Under the "worst" label noise, the paper reports a test accuracy of 77.69 on CIFAR-10-N.

However, when I run the provided code on my machine, the test accuracy at the last epoch is only 67.89, and the model appears to overfit the noisy training labels.

Did you use a validation set for evaluation? Or could you point out what I'm missing?

Also, there's a discrepancy in the learning-rate schedule between the paper and the code.

In the code, the learning rate is decayed at the 60th epoch, but the paper says it is decayed at the 50th.

Could you check this?

Thank you.

[image attachment]

@weijiaheng
Collaborator

Hi,

To clarify, the reported numbers for each method are based on the following setting:
(1) the best-achieved test accuracy among 100 epochs;
(2) each method is trained on the whole noisy label set (all 50K training images).
We did not select the model using either a clean or a noisy held-out validation set. Since we aim to compare the potential of each method, we train each method on the whole noisy training set and report its best-achieved test accuracy.
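This reporting protocol (best test accuracy over 100 epochs, no held-out validation) can be sketched roughly as below; `train_one_epoch` and `evaluate` are hypothetical stand-ins for the repo's actual training and evaluation routines:

```python
def report_best_accuracy(train_one_epoch, evaluate, num_epochs=100):
    """Train on the full noisy set and report the best test accuracy
    seen across all epochs (no held-out validation set is used)."""
    best_acc = 0.0
    for epoch in range(num_epochs):
        train_one_epoch(epoch)      # trains on all 50K noisy-labeled images
        test_acc = evaluate(epoch)  # accuracy on the clean test set
        best_acc = max(best_acc, test_acc)
    return best_acc
```

Under this protocol, a run whose accuracy peaks mid-training and then degrades from overfitting (as in your log) would still be scored by its peak, which may explain the gap between your last-epoch 67.89 and the reported 77.69.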

As for the learning-rate decay, you may adopt our empirical implementation (the learning-rate decay is applied at the 60th epoch).
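A minimal sketch of that step schedule (a single decay at the 60th epoch, matching the code rather than the paper); the base learning rate and decay factor here are illustrative assumptions, not values confirmed by this thread:

```python
def learning_rate(epoch, base_lr=0.1, decay_epoch=60, decay_factor=0.1):
    """Step schedule with one decay at `decay_epoch`.

    base_lr and decay_factor are hypothetical illustrative values;
    only the decay epoch (60) comes from the discussion above.
    """
    return base_lr if epoch < decay_epoch else base_lr * decay_factor
```

In PyTorch code this would typically be expressed with `torch.optim.lr_scheduler.MultiStepLR(optimizer, milestones=[60])`.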
