Training time and grid pseudo label extracting time #5

Closed
9115jin opened this issue Mar 29, 2023 · 2 comments

Comments

@9115jin

9115jin commented Mar 29, 2023

Hello, I saw the results of your paper and they were truly outstanding.
I have a few questions.

  1. Could you tell me how long pretraining and fine-tuning take for COCO image-to-text retrieval?
  2. Also, from what I read in your paper, obtaining the grid pseudo labels using CLIP takes around 8 hours. Am I right in understanding that the grid pseudo labels are a corpus extracted to provide positional information through prompts?

Thank you😁!

@FingerRec
Collaborator

Hi 9115jin:

  1. The training time is included in the training logs.
    For example, on 8 NVIDIA A100 GPUs:
    The pretraining time is "Training time 1 day, 2:32:57".
    The fine-tuning time for COCO retrieval is "Training time 6:53:15".

  2. Exactly. Running only the CLIP forward pass to extract the most similar keywords/phrases is very fast.
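
For readers following along, here is a rough sketch of the idea in point 2, not the project's actual extraction code. It assumes each grid cell and each candidate keyword has already been embedded (in practice by CLIP's image and text encoders) and simply picks the closest keyword per cell by cosine similarity. The function names and the toy 3-d vectors are hypothetical stand-ins.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def grid_pseudo_labels(cell_embeddings, keyword_embeddings):
    """Map each grid cell index to the keyword whose embedding is most similar."""
    labels = {}
    for cell_idx, cell_vec in enumerate(cell_embeddings):
        labels[cell_idx] = max(
            keyword_embeddings,
            key=lambda kw: cosine(cell_vec, keyword_embeddings[kw]),
        )
    return labels


# Toy vectors standing in for CLIP features (assumed values, for illustration only).
keywords = {"dog": [1.0, 0.0, 0.0], "grass": [0.0, 1.0, 0.0], "sky": [0.0, 0.0, 1.0]}
cells = [[0.1, 0.2, 0.9], [0.8, 0.1, 0.1], [0.2, 0.9, 0.1]]
labels = grid_pseudo_labels(cells, keywords)
print(labels)
```

Because no training is involved, the cost is one encoder forward pass per image plus a similarity lookup, which fits the "forward pass only" description above.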

@9115jin
Author

9115jin commented Mar 30, 2023

Thank you for your prompt and accurate response!

I'm planning to start researching image-to-text retrieval (TR),
and I believe your PTP-BLIP project will be very helpful.

@9115jin 9115jin closed this as completed Mar 30, 2023