
Questions on the training of v2l. #2

Closed

WenLinLliu opened this issue Dec 11, 2023 · 2 comments
@WenLinLliu

Very nice work! One question: according to the paper, ClipCap is pre-trained on COCO Captions using a frozen RegionCLIP. However, there may be a domain gap between the COCO images and the datasets used in the paper, especially the stylized ones. Does this gap affect the pre-trained ClipCap? Also, it would be best to provide the complete configuration and the related code for training ClipCap with RegionCLIP.

@sinamalakouti
Owner

@WenLinLliu
Thank you so much!

There may exist domain gaps between COCO images and the datasets used in the paper, especially those stylized images.

This is true: there is a domain gap between COCO and the stylized images, so, as shown in the paper, the model does not produce meaningful captions on the stylized images at first. Our goal is to resolve this by making the vision encoder robust: we further train RegionCLIP with the proposed approach so that it produces similar embeddings for an image and its stylized version, and an arbitrary image-captioning model (in our case ClipCap) can then produce meaningful captions for both.
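For intuition, here is a minimal sketch of that embedding-consistency idea. It is only an illustration: the cosine-consistency loss, the `encoder` argument, and the batch shapes are assumptions for exposition, not the repository's actual training objective.

```python
import torch
import torch.nn.functional as F

def consistency_loss(encoder, images, stylized_images):
    """Pull an image's embedding and its stylized version's embedding
    together, so a downstream captioner sees similar inputs for both.
    (Illustrative sketch only; the paper's actual objective may differ.)"""
    z_orig = F.normalize(encoder(images), dim=-1)            # (B, D)
    z_styl = F.normalize(encoder(stylized_images), dim=-1)   # (B, D)
    # Cosine-distance consistency: 0 when the two embeddings match.
    return (1.0 - (z_orig * z_styl).sum(dim=-1)).mean()
```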

It would be best to provide the complete configuration and the related code for training ClipCap with RegionCLIP.

Thanks for your comment. I have provided the parameters of the v2l module, to be used with the ClipCap code (https://drive.google.com/drive/folders/1agMffWa69paFGYs3s7ADT8VxStCWSTEt).

I have also just added instructions for running the ClipCap code.
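For reference, the released parameters plug into a ClipCap-style mapping network that turns a frozen vision embedding into a prefix for the language model. The sketch below shows that shape of interface; the class name, dimensions, and prefix length are placeholders, not the released configuration.

```python
import torch
import torch.nn as nn

class PrefixMapper(nn.Module):
    """ClipCap-style mapping network: projects a frozen vision embedding
    into a sequence of prefix embeddings for a language model.
    Dimensions and depth here are placeholders, not the released config."""
    def __init__(self, embed_dim=512, lm_dim=768, prefix_len=10):
        super().__init__()
        self.prefix_len = prefix_len
        self.lm_dim = lm_dim
        self.mlp = nn.Sequential(
            nn.Linear(embed_dim, lm_dim * prefix_len // 2),
            nn.Tanh(),
            nn.Linear(lm_dim * prefix_len // 2, lm_dim * prefix_len),
        )

    def forward(self, image_embedding):         # (B, embed_dim)
        prefix = self.mlp(image_embedding)      # (B, lm_dim * prefix_len)
        return prefix.view(-1, self.prefix_len, self.lm_dim)
```

In ClipCap, prefix embeddings of this form are concatenated in front of the caption token embeddings before being fed to the language model, which is how the captioner is conditioned on the image.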

Please feel free to let me know if you have further questions!

@WenLinLliu
Author

Great, thanks for your reply!
