Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

🚨 fix(SigLip): remove spurious exclusion of first vision output token #30952

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

transmissions11
Copy link

@transmissions11 transmissions11 commented May 22, 2024

Looks like the first token is spuriously excluded from the average pooling because this code was copy & pasted from modeling_clip, which does prepend a [CLS] token. However SigLip has no such special token, so it would seem we are needlessly excluding a token from being included?

p.s. why not use the pooled_output from the model's map head vs averaging the last hidden states?

@NielsRogge
Copy link
Contributor

There's another PR which suggested to use the pooled output rather than averaging the patch tokens: #30373.

The reason I went with the latter was because of this paper (follow-up paper by the ViT authors which showed that it works better than the CLS token). However averaging patch tokens might work better for ViTs pre-trained using (self-)supervised image classification. For contrastive models like CLIP and SigLIP, perhaps it makes more sense to use the pooled output. I'm happy to change it, although we would need to make it backwards compatible

@amyeroberts amyeroberts changed the title fix(SigLip): remove spurious exclusion of first vision output token 🚨 fix(SigLip): remove spurious exclusion of first vision output token Jun 13, 2024
Copy link
Collaborator

@amyeroberts amyeroberts left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for fixing this!

Only thing we need to do is run the model slow tests. Pushing the following should trigger the necessary CI run:
git commit --allow-empty -m "[run-slow] siglip"

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants