Replicating few-shot results #5

Closed
samuelyu2002 opened this issue Jul 16, 2022 · 3 comments

@samuelyu2002

Table 7 of the paper reports that WiSE-FT with a linear classifier and the ViT-B/16 backbone reaches 73% accuracy on 16-shot ImageNet. The paper mentions a learning rate of 10e-5 and 10 epochs of training, but even with this information I still cannot replicate the result. Could you provide the exact command, or additional hyperparameters (e.g. batch size, number of warmup steps), so that this result can be replicated?

@mitchellnw
Contributor

Thanks a lot for the question. In the paper we write 10^{-5}, which is 1e-5, not 10e-5; hopefully this is the issue! We use the default hyperparameters mentioned, with a batch size of 512, and this aug for training.
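For readers hitting the same mismatch, here is a minimal Python sketch of why the two notations differ. The variable names `lr_paper` and `lr_misread` are illustrative only and do not come from the repository; the point is only that `10e-5` in scientific notation means 10 × 10^{-5}, i.e. 1e-4, which is ten times larger than the intended 10^{-5}.

```python
import math

# Illustrative names, not taken from the WiSE-FT codebase.
lr_paper = 1e-5     # 10^{-5}, the value written in the paper
lr_misread = 10e-5  # 10 x 10^{-5}, which is 1e-4

# The paper's value matches 10 ** -5; the misread value matches 1e-4.
assert math.isclose(lr_paper, 10 ** -5)
assert math.isclose(lr_misread, 1e-4)

# The misread learning rate is about ten times larger than intended.
ratio = lr_misread / lr_paper
print(f"lr_misread is ~{ratio:.0f}x lr_paper")
```

A tenfold larger learning rate is a plausible cause of a replication gap, so it is worth checking this value before adjusting other hyperparameters.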

@samuelyu2002
Author

Thanks!

@guozix

guozix commented Jan 3, 2023

@mitchellnw
Thanks for your reply!
I have an additional question: is the learning rate of WiSE-FT (linear classifier) the same as that of WiSE-FT (end-to-end) in Table 7? And the same question for Figure 16 vs. Figure 17.
