Expand the size of CLIP #48

Shuaizhang7 · 2024-04-01T03:05:21Z

Hello, thank you very much for your work. Does scaling up the pre-trained CLIP backbone improve the final results, for example by changing ViT-B-16 to ViT-L-14? After I switched CLIP to ViT-L-14, the results did not improve, so I don't know whether there is a problem with my change or whether a larger backbone simply does not help with this method.

timojl · 2024-04-03T13:04:36Z

Thanks for your interest in our work. I'd expect ViT-L-14 to yield better results, but hyperparameters might need to be changed. In particular, avoiding overfitting could be more challenging due to the larger internal dimension of 1024 in ViT-L (versus 768 in ViT-B).
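
For anyone attempting the same swap, here is a minimal sketch of which shapes change between the two backbones. It assumes the published OpenAI CLIP architecture sizes at 224x224 input; the helper `decoder_input_shapes` and its dictionary keys are hypothetical names for illustration and are not part of this repository. Any layer that was sized for ViT-B (for example, a projection from the visual activations or from the conditional embedding) would likely need to match the new values.

```python
# Published architecture sizes of the OpenAI CLIP checkpoints (224x224 input).
CLIP_SPECS = {
    "ViT-B/16": {"vision_width": 768, "vision_layers": 12, "patch_size": 16, "embed_dim": 512},
    "ViT-L/14": {"vision_width": 1024, "vision_layers": 24, "patch_size": 14, "embed_dim": 768},
}


def decoder_input_shapes(version, image_size=224):
    """Hypothetical helper: shapes a decoder on top of CLIP would have to match."""
    spec = CLIP_SPECS[version]
    grid = image_size // spec["patch_size"]  # patches per side
    return {
        "token_grid": (grid, grid),                   # spatial layout of patch tokens
        "num_patch_tokens": grid * grid,              # excludes the class token
        "activation_dim": spec["vision_width"],       # width of intermediate visual activations
        "conditional_dim": spec["embed_dim"],         # size of the joint image/text embedding
        "max_layer_index": spec["vision_layers"] - 1, # deepest layer that can be tapped
    }


for version in CLIP_SPECS:
    print(version, decoder_input_shapes(version))
```

Besides the wider activations, ViT-L/14 has twice as many transformer layers and a different token grid, so layer indices chosen for ViT-B and any hardcoded grid size may also need revisiting.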

timojl closed this as completed Apr 3, 2024
