
VQA input construction #48

Closed
fangpang20 opened this issue Mar 21, 2022 · 1 comment
@fangpang20
Hi guys,
Thank you for your diligent work. I'm trying to prepare VQA input for single-sample inference.
I'm not sure about the architecture of the VQA model, in particular the "decoder_prompts" and "prefix_tokens" fields in the automatically constructed "sample".
Also, the following sentence in the README description of VQA is unclear to me:
"we transform original VQA training questions with multiple golden answers into multiple training samples."
Do you have any suggestions?

@yangapku yangapku self-assigned this Mar 21, 2022
yangapku commented Mar 21, 2022

We use decoder_prompts and prefix_tokens for better VQA finetuning performance. Specifically, for VQA we have a hyper-parameter option called --prompt-type, which determines whether to prepend the question to the answer in the decoder's input sequence during finetuning & evaluation. The question has already been fed into the encoder; here we consider whether to feed it into the decoder again. If --prompt-type is not none, then decoder_prompts and prefix_tokens record the prepended question used to construct the decoder input sequence during evaluation: decoder_prompts is used for all-candidate evaluation and prefix_tokens is used for beam-search generative evaluation. In our experiments, we found that concatenating the question with the answer in the decoder input sequence improves accuracy somewhat compared with not performing the concatenation.
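To make this concrete, here is a minimal, self-contained sketch (not the actual OFA code; the tokenize helper and the token handling are simplified placeholders) of how the question could be prepended to the decoder sequence when --prompt-type is not none, and how decoder_prompts / prefix_tokens would then be populated:

```python
def tokenize(text):
    # Hypothetical whitespace "tokenizer" standing in for the real BPE/dictionary.
    return text.lower().strip().split()

def build_decoder_inputs(question, answer, prompt_type="prev_output"):
    """Return (decoder_target, decoder_prompt, prefix_tokens) for one sample.

    prompt_type == "none": the decoder only sees/produces the answer.
    otherwise:             the question is prepended to the decoder sequence;
                           the prepended part is recorded so that
                           - decoder_prompt can be used for all-candidate scoring, and
                           - prefix_tokens can seed beam-search generation.
    """
    q_toks = tokenize(question)
    a_toks = tokenize(answer)

    if prompt_type == "none":
        return a_toks, [], []

    # The question tokens are fed to the decoder again (they were already in the
    # encoder input); during finetuning the loss would only cover the answer part.
    decoder_target = q_toks + a_toks
    decoder_prompt = q_toks          # consumed when scoring every candidate answer
    prefix_tokens = q_toks           # forces beam search to start from the question
    return decoder_target, decoder_prompt, prefix_tokens

print(build_decoder_inputs("what color is the cat ?", "black"))
```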

For the other question, note that in the original VQAv2 dataset, most questions are annotated with more than one ground-truth answer. However, OFA is a seq2seq model, which requires each source sequence (image & question) to be paired with exactly one target sequence (ground-truth answer) during training. We therefore split each original sample, where one question is paired with multiple answers, into multiple seq2seq samples, each consisting of the question paired with one of the ground-truth answers.
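For illustration, a toy sketch of this expansion (the field names below are hypothetical, not the actual VQAv2/OFA data schema):

```python
raw_sample = {
    "image_id": "COCO_val2014_000000123456",
    "question": "what color is the cat ?",
    "answers": ["black", "black and white", "dark"],   # multiple golden answers
}

def expand_to_seq2seq(sample):
    # Each ground-truth answer becomes its own (source, target) training pair,
    # since a seq2seq model needs exactly one target sequence per source.
    return [
        {"image_id": sample["image_id"],
         "question": sample["question"],
         "answer": ans}
        for ans in sample["answers"]
    ]

for s in expand_to_seq2seq(raw_sample):
    print(s)
```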
