difference between 'ViT-B-32' and 'ViT-B-32-quickgelu' #444
Also, I am curious whether the final accuracy of the model will be affected if …
@JiachengCheng96 the B/32 was the last model trained with the original QuickGELU (which is actually slower than the native PyTorch GELU that was adopted after the original OpenAI models). There is an accuracy difference, which is why both were released: if you need the full accuracy for zero-shot, the quickgelu variant is needed. It's not a huge difference, though, so if you were fine-tuning I'd use the non-quick version. In any case, the LAION-2B weights are much better, so I'd use those and not worry about this at all :)
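For reference, the two activations can be compared with a minimal, framework-free sketch. This assumes the usual definitions: exact GELU is `0.5 * x * (1 + erf(x / sqrt(2)))`, and QuickGELU (the sigmoid approximation used in the original OpenAI CLIP) is `x * sigmoid(1.702 * x)`; in PyTorch code it would typically be written as `x * torch.sigmoid(1.702 * x)`.

```python
import math

def gelu(x: float) -> float:
    # Exact GELU: 0.5 * x * (1 + erf(x / sqrt(2)))
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))

def quick_gelu(x: float) -> float:
    # QuickGELU: x * sigmoid(1.702 * x), written out with exp()
    return x / (1.0 + math.exp(-1.702 * x))

# The two functions track each other closely but are not identical,
# which is why weights trained with one see a small accuracy drop
# when evaluated with the other.
for x in (-2.0, -0.5, 0.0, 0.5, 2.0):
    print(f"x={x:+.1f}  gelu={gelu(x):+.4f}  quickgelu={quick_gelu(x):+.4f}")
```

The per-point differences are on the order of a few thousandths, small per activation but large enough to shift zero-shot numbers when compounded across every MLP in the transformer.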
Hi, thank you again for your amazing work!
I noticed that there are multiple pretrained ViT-B/32 models: ('ViT-B-32-quickgelu', 'laion400m_e32') and ('ViT-B-32', 'laion400m_e32'). According to the description in the README, it seems 'laion400m_e32' was trained with QuickGELU, and the only difference between ('ViT-B-32-quickgelu', 'laion400m_e32') and ('ViT-B-32', 'laion400m_e32') is the activation function. I am not sure whether my understanding is correct? Thanks again for your work!