How is robustness calculated? #5
Hi, thank you for your support! CIFAR-{10, 100}-C and ImageNet-C each consist of 75 datasets (15 corruption types × 5 severity levels). The robustness reported in this paper is the average of the accuracies on these 75 corrupted datasets. In particular, I recommend that you measure the robustness as follows:
To avoid confusion: strictly speaking, we do not use the following types of datasets for evaluation:

The batch size is 256 by default, but I believe the robustness is independent of the batch size.
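As an illustration, a minimal sketch of the averaging described above. The function name `robustness` and the 15×5 accuracy table are our own illustration, not the repository's API; the accuracies themselves would come from evaluating the model on each corrupted dataset:

```python
import numpy as np

def robustness(accuracies):
    """Mean accuracy over 15 corruption types x 5 severity levels (75 datasets).

    `accuracies[c][s]` is the top-1 accuracy on the dataset corrupted with
    type `c` at severity `s`. Name and layout are illustrative only.
    """
    acc = np.asarray(accuracies)
    assert acc.shape == (15, 5), "expected 15 corruption types x 5 severities"
    return float(acc.mean())

# Example: if every corrupted dataset scored 0.5, robustness is 0.5.
print(robustness(np.full((15, 5), 0.5)))  # 0.5
```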
Closing this issue based on the comment above. Please feel free to reopen it if the problem still exists.
Sure thing, please close the issue.
Thank you for your constructive feedback. I agree that releasing intermediate results would be helpful, since evaluating pretrained models on 75 datasets can be resource-intensive. I will release robustness sheets as intermediate results for some models and make the pretrained models easily accessible.
Hi,
thank you for this wonderful work on vision transformers and how to understand them. I have a couple of simple questions, for which I apologize.
I tried to reproduce Figure 12 independently of your code base, but I struggle a bit to understand the code. Is it correct that you define robustness as
robustness = mean(accuracy(y_val_true, y_val_pred))
Related to this, do I understand correctly that you compute this accuracy on batches of the validation dataset, each of size 256?
Thanks.
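For what it's worth, when all batches have the same size, averaging per-batch accuracies gives the same result as computing accuracy over the whole validation set, which is consistent with robustness being independent of the batch size. A minimal sketch with hypothetical random labels (all names here are ours):

```python
import numpy as np

# Hypothetical predictions: 1024 samples, 10 classes, batches of 256.
rng = np.random.default_rng(0)
y_true = rng.integers(0, 10, size=1024)
y_pred = rng.integers(0, 10, size=1024)

# Accuracy over the whole set at once.
whole = (y_true == y_pred).mean()

# Mean of per-batch accuracies (equal-size batches: 1024 / 256 = 4).
batch_size = 256
per_batch = [
    (y_true[i:i + batch_size] == y_pred[i:i + batch_size]).mean()
    for i in range(0, len(y_true), batch_size)
]

# Equal-size batches make the two quantities agree.
print(abs(whole - np.mean(per_batch)) < 1e-12)  # True
```

Only a smaller final batch would introduce a (tiny) discrepancy, since the plain mean of per-batch accuracies then weights its samples more heavily.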