In Table 7 of the paper, results show that WiSE-FT with a linear classifier and the ViT-B/16 backbone reaches 73% accuracy on 16-shot ImageNet. It was mentioned that the learning rate was 10e-5 and that training ran for 10 epochs, but even with this information I cannot replicate the result reported in the paper. Could you provide an exact command, or the additional hyperparameters (e.g., batch size, number of warmup steps), so that this result can be replicated?
Thanks a lot for the question. In the paper we write 10^{-5}, which is 1e-5, not 10e-5; hopefully that is the issue! We use the default hyperparameters mentioned (batch size 512) and this aug for training.
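For context, the core of WiSE-FT is a parameter-wise interpolation between the zero-shot and fine-tuned checkpoints, applied the same way whether the fine-tuning used a linear classifier or was end-to-end. A minimal sketch, using plain Python floats standing in for model tensors (`wise_ft` is a hypothetical helper for illustration, not the repo's actual API):

```python
# Sketch of WiSE-FT weight-space ensembling: blend a zero-shot and a
# fine-tuned checkpoint parameter-by-parameter with coefficient alpha.
# alpha = 0 recovers the zero-shot model; alpha = 1 the fine-tuned one.
def wise_ft(theta_zeroshot, theta_finetuned, alpha=0.5):
    assert theta_zeroshot.keys() == theta_finetuned.keys()
    return {
        name: (1 - alpha) * theta_zeroshot[name] + alpha * theta_finetuned[name]
        for name in theta_zeroshot
    }

# Toy example: scalar "weights" stand in for the real state dicts.
zero_shot = {"w": 0.0, "b": 1.0}
fine_tuned = {"w": 2.0, "b": 3.0}
print(wise_ft(zero_shot, fine_tuned, alpha=0.5))  # {'w': 1.0, 'b': 2.0}
```

In practice the same loop runs over `state_dict()` tensors rather than scalars; the interpolation itself has no extra hyperparameters beyond alpha, so the replication question above comes down to the fine-tuning settings (learning rate, batch size, warmup, epochs).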
@mitchellnw
Thanks for your reply!
I have an additional question: is the learning rate for WiSE-FT (linear classifier) the same as that for WiSE-FT (end-to-end) in Table 7? And the same question for Figure 16 vs. Figure 17.