Why is it that, using the pre-trained model you provided and without any changes to the code, the test results vary greatly, with fluctuations of up to 10 points? #1
Comments
Hi there, thanks for your question. Can I ask which pretrained model you are using? I don't provide a pretrained model for URT. Do you mean the pretrained backbones provided by SUR?
Yes, I used the pretrained backbones provided by SUR.
For SUR, please refer to their repo for more details: https://github.com/dvornikita/SUR. For URT, our result is the average of three evaluation runs, and I don't observe fluctuations as large as 10 percent. Hope this answers your question.
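As an aside, a minimal sketch of what averaging over three evaluation runs looks like (the accuracy values below are made up for illustration, not actual URT or SUR results):

```python
import numpy as np

# Hypothetical per-run test accuracies (%) from three evaluation runs with
# different random seeds; the values are illustrative, not actual results.
run_accuracies = np.array([66.8, 67.4, 67.1])

mean_acc = run_accuracies.mean()                      # the number that gets reported
spread = run_accuracies.max() - run_accuracies.min()  # run-to-run fluctuation

print(f"mean over runs: {mean_acc:.2f}%, max-min spread: {spread:.2f}%")
```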
Do you have updated Traffic Sign results with the loader issue fixed? Using your repo with the latest Meta-Dataset loader, we get about 50% on Traffic Sign. Could you please confirm that?
Hi, [results table not recovered in extraction; its header read "model \ data / sur-paper / sur-exp / urt"]
I used the ResNet features released in the repo and got the same results as in the paper for all the datasets except Traffic Sign and MNIST. Thanks for releasing the features!
Thanks for raising this issue. Just a kind reminder that, because of a shuffling issue described here: google-research/meta-dataset#54, the results were affected, especially for Traffic Sign, and the updated results have been posted on the OpenReview system.
Yeah, I am aware of the bug; thanks for letting me know, though. I am still using the old, buggy dataloader just to see if I can reproduce the results you got. I am trying to calibrate my Meta-Dataset setup against your code, and I think something is not right with my setup. Would it be possible to release the TFRecords you used? I understand that is a lot of work, but it would be super helpful. Any help would be much appreciated.
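As an aside, one way to compare two Meta-Dataset setups without exchanging the TFRecords themselves is to compare per-shard example counts. A minimal sketch, assuming TensorFlow 2 is installed; the records path is a placeholder, not a path from this repo:

```python
import glob
import os

import tensorflow as tf

# Placeholder path to one converted dataset (e.g. the traffic_sign records).
records_dir = "/path/to/meta_dataset/records/traffic_sign"

counts = {}
for shard in sorted(glob.glob(os.path.join(records_dir, "*.tfrecords"))):
    # Count serialized examples per shard; mismatched counts between two
    # setups usually point to a conversion or download problem.
    counts[os.path.basename(shard)] = sum(1 for _ in tf.data.TFRecordDataset(shard))

print(f"shards: {len(counts)}, total examples: {sum(counts.values())}")
```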
One more question: are the standard deviations mentioned in the paper calculated from 3 different runs, or from the 600 test tasks within a single run? Thank you for your time!
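For reference, the two uncertainty estimates being asked about can be computed as below. This is a minimal sketch with synthetic per-episode accuracies, not the paper's numbers:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic per-episode accuracies (%): 3 runs x 600 test episodes each.
runs = [rng.normal(loc=67.0, scale=12.0, size=600) for _ in range(3)]

# (a) Within a run: 95% confidence interval over the 600 test episodes,
#     the convention commonly used for Meta-Dataset results.
acc = runs[0]
ci95 = 1.96 * acc.std(ddof=1) / np.sqrt(len(acc))
print(f"run 0: {acc.mean():.2f} +/- {ci95:.2f} (95% CI over 600 episodes)")

# (b) Across runs: standard deviation of the per-run mean accuracies.
run_means = np.array([r.mean() for r in runs])
print(f"mean of run means: {run_means.mean():.2f}, "
      f"std over runs: {run_means.std(ddof=1):.2f}")
```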
Hi, thank you for sharing your code. But why is it that, using the pre-trained model you provided and without any changes to the code, the test results vary greatly, with fluctuations of up to 10 points? May I ask how the test results reported in your paper can be taken as final when the performance fluctuates so much? Looking forward to your reply.