
Learnable group tokens fine-tuning on out-of-domain datasets #52

Open
AhmedBourouis opened this issue Jan 25, 2023 · 1 comment

@AhmedBourouis

Hi! Thank you for the great work and neat implementation.

In section C.4 of the paper, you reported results from testing the pretrained GroupViT model on the COCO dataset, which were quite impressive but not as good as those on PASCAL VOC. This is probably due to a domain shift between PASCAL VOC and COCO in terms of images and classes/text descriptions.

I was wondering if it's possible to fine-tune GroupViT on the COCO dataset (and out-of-domain datasets in general) by freezing the model's weights and training only the learnable group tokens on the new dataset in a few-shot manner.

If so, I'm ready to implement this with your guidance.

@xvjiarui
Contributor

Hi @AhmedBourouis

Thanks for your interest in our work.

Regarding domain shift: GroupViT is trained on neither PASCAL VOC nor COCO, so the domain shift to the two datasets should be roughly the same.

Nevertheless, it is entirely possible to fine-tune only the group tokens. You may need to freeze the text encoder as well.
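
For reference, a minimal PyTorch sketch of this setup might look like the following. It assumes the pretrained model is already loaded as `model` and that the group-token parameters contain `group_token` in their names; the actual attribute names in the GroupViT codebase may differ, so adjust the substring accordingly. Freezing everything except the group tokens also freezes the text encoder, as suggested above.

```python
import torch

def freeze_all_but_group_tokens(model: torch.nn.Module, key: str = 'group_token'):
    """Freeze every parameter except the learnable group tokens.

    Assumes group-token parameters have `key` in their name
    (hypothetical; check the parameter names in your checkpoint).
    """
    for name, param in model.named_parameters():
        param.requires_grad = key in name

# `model` is a pretrained GroupViT instance loaded elsewhere.
freeze_all_but_group_tokens(model)

# Pass only the trainable (group-token) parameters to the optimizer.
optimizer = torch.optim.AdamW(
    (p for p in model.parameters() if p.requires_grad),
    lr=1e-4,
)
```

You can sanity-check which parameters remain trainable by printing `[n for n, p in model.named_parameters() if p.requires_grad]` before training.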
