How do we perform pre-training/fine-tuning for a visual question answering task on a custom dataset? #23

kirito-0512 · 2024-02-26T09:04:45Z

I would greatly value your assistance in offering guidance for initiating pre-training/fine-tuning on the Visual Question Answering (VQA) task, specifically in the following aspects:

The necessary format for the required dataset.
Minimum hardware requirements for its execution.

Please note that while this question might be straightforward and potentially addressed by reviewing the model documentation, I am seeking an expert opinion on this matter.

Thank you sincerely.

lorenmt · 2024-02-27T22:40:46Z

Hello, we have released VQA checkpoints in this repo; you can try them out first to see whether they meet your needs. Otherwise, follow the instructions in the documentation, i.e. get the expert labels ready and modify the training config scripts.
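As a rough illustration of preparing a custom VQA dataset before training, here is a minimal sketch that sanity-checks annotation records. The field names (`image`, `question`, `answer`), the file paths, and the `validate_vqa_records` helper are assumptions made for illustration only, not this repo's actual schema; the documentation remains the reference for the required format and for how expert labels are generated.

```python
# Hypothetical field names; check the repo documentation for the real schema.
REQUIRED_KEYS = ("image", "question", "answer")


def validate_vqa_records(records):
    """Return a list of (index, problem) tuples for malformed records."""
    problems = []
    for i, rec in enumerate(records):
        # Report all missing keys for a record at once, then move on.
        missing = [key for key in REQUIRED_KEYS if key not in rec]
        if missing:
            problems.append((i, f"missing keys: {missing}"))
            continue
        # Every field is expected to be a non-empty string.
        for key in REQUIRED_KEYS:
            value = rec[key]
            if not isinstance(value, str) or not value.strip():
                problems.append((i, f"empty or non-string field: {key}"))
    return problems


# Example records with made-up paths; the second one lacks an answer.
records = [
    {"image": "images/0001.jpg", "question": "What color is the car?", "answer": "red"},
    {"image": "images/0002.jpg", "question": "How many dogs are there?"},
]
problems = validate_vqa_records(records)
for index, problem in problems:
    print(f"record {index}: {problem}")
```

Running a check like this before editing the training config can catch incomplete annotations early, whatever the final on-disk format turns out to be.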

kirito-0512 · 2024-03-01T06:46:37Z

Thank you! I will work on it. It would be highly appreciated if you could also provide any additional resources.

kirito-0512 closed this as completed Mar 1, 2024
