Additional issues trying to finetune on custom (VQA-like) dataset (VizWiz) #105

Closed
Velcorn opened this issue May 13, 2022 · 10 comments

@Velcorn

Velcorn commented May 13, 2022

Hello, first I'd like to thank you for your amazing work and especially all the detailed answers to issues.

I've been following the different issues on the finetuning on a custom dataset (VizWiz) and produced the .tsv files according to your format. You stated in issue #76 that the trainval_ans2label.pkl file is not used when using beam-search evaluation - is this correct?

I've skipped its creation and training does run for the first epoch. However, upon validation on the valid subset, I get an assertion error in the sequence_generator.py - I've tracked down the error and I can "fix" it by removing the one extra step that is for the EOS marker, but my understanding of how to properly fix that error is limited.

To give some more information on how the .tsv files look, I have attached an image of the train and val subsets.

Thank you very much for any kind of input in advance!
[Image: screenshot of the train and val .tsv files]

@yangapku
Member

Hi, I'm afraid there is a misunderstanding. The trainval_ans2label.pkl file is not used during inference with beam-search evaluation, but it is needed during training. Please prepare it according to your dataset and try again.

@Velcorn
Author

Velcorn commented May 13, 2022

Thank you for the very quick response. I figured that might be the case, thanks for the clarification. Is there maybe a specific resource on how to prepare the file for a custom dataset?

Another minor question: In my created .tsv files, the index starts at 0 for each subset. Is that fine or do I have to add an offset for the other subsets?

@yangapku
Member

yangapku commented May 13, 2022

The trainval_ans2label.pkl file is a pickled Python dict that maps each candidate answer text to a label id. It should be consistent with your dataset; otherwise the overflow problem mentioned in #59 will arise during training whenever your dataset contains an answer that is missing from trainval_ans2label.pkl. You can open our provided trainval_ans2label.pkl with pickle for reference. The label ids (0 to the number of distinct answers minus 1) can be assigned to the candidate answers in any order.
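For readers following along, the mapping described above could be built roughly like this. This is only a minimal sketch, not the project's own preprocessing code: the helper name build_ans2label and the example answers are hypothetical, and the file is written to a temporary directory so the snippet does not touch a real dataset.

```python
import pickle
import tempfile
from pathlib import Path


def build_ans2label(answers):
    """Map each distinct answer string to a label id in 0..N-1."""
    # Sorting only makes the assignment reproducible; any order would do.
    distinct = sorted(set(answers))
    return {ans: idx for idx, ans in enumerate(distinct)}


# Hypothetical candidate answers collected from the train and val subsets.
candidates = ["yes", "no", "unanswerable", "yes", "blue"]
ans2label = build_ans2label(candidates)

# Write the dict with pickle, then read it back the same way one would
# inspect the provided trainval_ans2label.pkl.
out_path = Path(tempfile.mkdtemp()) / "trainval_ans2label.pkl"
with open(out_path, "wb") as f:
    pickle.dump(ans2label, f)

with open(out_path, "rb") as f:
    reloaded = pickle.load(f)

print(reloaded)
```

Duplicates are collapsed before ids are assigned, so the ids stay contiguous from 0 to the number of distinct answers minus 1.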

The sample indices of different subsets may overlap. The most important thing is to keep the indices of the test samples consistent with the original dataset, so that you get the correct evaluation score when submitting to the official evaluation server.

@Velcorn
Author

Velcorn commented May 13, 2022

Thanks again for the extensive answer. So, the contents of the pickle file are the most frequent answers from both the train and validation subset (combined), right? Is there a guideline on how many frequent answers to include?

(Small correction for anyone reading this in the future: the overflow problem is mentioned in issue #59)

@yangapku
Member

Thanks a lot for noticing the wrong issue reference! ❤️ I've corrected my comment above. On VQA, our choice of candidate answer set mainly follows the common practice of previous works (VLMo, UNITER, etc.), which employ a fixed set of 3,129 frequent answers. I would recommend referring to previous works on VizWiz to determine a proper size for the frequent candidate set. Please check that all answers in the pre-processed (possibly filtered) training samples are covered by the pickle dict, to avoid the overflow issue during training.
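The two steps mentioned above (picking a fixed-size set of frequent answers, then checking that the training samples are covered) might look roughly like the sketch below. It is an illustration under stated assumptions, not the authors' pipeline: the sample layout (a dict with an "answers" list), the helper names, and the cutoff k=2 are all hypothetical.

```python
from collections import Counter


def top_k_answers(samples, k):
    """Return the k most frequent answer strings across all samples."""
    counts = Counter(ans for s in samples for ans in s["answers"])
    return [ans for ans, _ in counts.most_common(k)]


def uncovered_answers(samples, ans2label):
    """Return the set of answers that have no label id in ans2label."""
    return {ans for s in samples for ans in s["answers"] if ans not in ans2label}


# Hypothetical pre-processed samples from the combined train and val subsets.
samples = [
    {"question": "what is this?", "answers": ["unanswerable", "bottle"]},
    {"question": "what color?", "answers": ["blue", "unanswerable"]},
    {"question": "is it on?", "answers": ["yes", "unanswerable", "blue"]},
]

# Keep only the k most frequent answers as candidates (k is an assumption).
frequent = top_k_answers(samples, k=2)
ans2label = {ans: idx for idx, ans in enumerate(frequent)}

# Any answer listed here would have to be filtered out of the training
# samples, or added to the dict, before training.
missing = uncovered_answers(samples, ans2label)
print(frequent, missing)
```

If the check reports missing answers, either enlarge the candidate set or drop those answers from the training .tsv so that every remaining answer has a label id.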

@Velcorn
Author

Velcorn commented May 14, 2022

Many thanks for all the help. It seems to be running pretty well now, given the small size of the dataset.

@CCYChongyanChen

Hi Velcorn,
Could you kindly share the code or the .tsv files for the VizWiz-VQA dataset? Our VizWiz group will release a new dataset (VizWiz-Therapy) soon and would like to benchmark this algorithm on it. Thank you so much in advance!

@Velcorn
Author

Velcorn commented Nov 1, 2022

Hey, sorry for the late answer. I've written this script to generate the .pkl and .tsv files from the VizWiz-VQA dataset: https://github.com/Velcorn/OFA/blob/main/dataset/preprocess_vizwiz.py

@CCYChongyanChen

Thank you so much Velcorn for sharing! It helps a lot!

@Velcorn
Author

Velcorn commented Nov 2, 2022

You're welcome. Let me know if you have any questions!
