In Table 7 of the paper, results show that WiSE-FT with a linear classifier and the ViT-B/16 backbone reaches 73% accuracy on 16-shot ImageNet. It was mentioned that the learning rate was 10e-5 and that training ran for 10 epochs, but even with this information I cannot replicate the result reported in the paper. Could you provide an exact command, or the additional hyperparameters (e.g., batch size, number of warmup steps), so that this result can be replicated?
Thanks a lot for the question. In the paper we write 10^{-5}, which is 1e-5, not 10e-5; hopefully that is the issue! We use the default hyperparameters mentioned (batch size 512) and this aug for training.
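For context, the core of WiSE-FT is a parameter-wise interpolation between the zero-shot and fine-tuned checkpoints, applied the same way whether the fine-tuning used a linear classifier or was end-to-end. A minimal sketch, using plain Python floats standing in for model tensors (`wise_ft` is a hypothetical helper for illustration, not the repo's actual API):

```python
# Sketch of WiSE-FT weight-space ensembling: blend a zero-shot and a
# fine-tuned checkpoint parameter-by-parameter with coefficient alpha.
# alpha = 0 recovers the zero-shot model; alpha = 1 the fine-tuned one.
def wise_ft(theta_zeroshot, theta_finetuned, alpha=0.5):
    assert theta_zeroshot.keys() == theta_finetuned.keys()
    return {
        name: (1 - alpha) * theta_zeroshot[name] + alpha * theta_finetuned[name]
        for name in theta_zeroshot
    }

# Toy example: scalar "weights" stand in for the real state dicts.
zero_shot = {"w": 0.0, "b": 1.0}
fine_tuned = {"w": 2.0, "b": 3.0}
print(wise_ft(zero_shot, fine_tuned, alpha=0.5))  # {'w': 1.0, 'b': 2.0}
```

In practice the same loop runs over `state_dict()` tensors rather than scalars; the interpolation itself has no extra hyperparameters beyond alpha, so the replication question above comes down to the fine-tuning settings (learning rate, batch size, warmup, epochs).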
@mitchellnw
Thanks for your reply!
I have an additional question: is the learning rate for WiSE-FT (linear classifier) the same as that for WiSE-FT (end-to-end) in Table 7? And the same question for Figure 16 vs. Figure 17.