usydnlp/vdoc

V-Doc : Visual questions answers with Documents

This repository contains the code for the paper V-Doc : Visual questions answers with Documents. The videos can be accessed via this link.

Ding, Y.*, Huang, Z.*, Wang, R., Zhang, Y., Chen, X., Ma, Y., Chung, H., & Han, C. (CVPR 2022)
V-Doc : Visual questions answers with Documents

Dataset in Dataset Storage Module

The datasets we used to train the models are provided at the following links:

PubVQA Dataset for training Mac-Network.

Dataset for training LayoutLMv2 (FUNSD-QA).

Dataset Generation

To run the scene-based question generation code, first fetch the JSON files from the source.

Extract OCR information

python3 ./document_collection.py

After the step above, a new folder called ./input_ocr will be generated.
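
Before moving on to question generation, it can help to confirm that the OCR step produced files. The snippet below is a minimal sketch, not part of the repository: the helper name list_ocr_files is hypothetical, and it assumes ./input_ocr is created relative to the current working directory. It makes no assumption about the file types inside the folder.

```python
from pathlib import Path


def list_ocr_files(folder="./input_ocr"):
    """Return the sorted relative paths of all files under `folder`.

    Hypothetical helper: an empty list means the folder is missing or empty.
    """
    root = Path(folder)
    if not root.is_dir():
        return []
    # Walk the folder recursively and keep regular files only.
    return sorted(str(p.relative_to(root)) for p in root.rglob("*") if p.is_file())


if __name__ == "__main__":
    files = list_ocr_files()
    print(f"{len(files)} file(s) found under ./input_ocr")
```

If the count is zero, the OCR extraction step most likely did not run from the expected directory.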

Generate questions

python3 ./scene_based/pdf_generate_question.py

To limit the number of generated questions, edit line 575 and lines 591-596 of pdf_generate_question.py.

After the steps above, a JSON file will appear under ./output_qa_dataset.
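
To get a quick overview of what was generated, the output folder can be inspected with a few lines of Python. This is a sketch under stated assumptions: the helper name summarize_qa_output is hypothetical, and the schema of the generated file is not documented here, so the code only counts top-level items (list elements or dict keys) rather than interpreting them as questions.

```python
import json
from pathlib import Path


def summarize_qa_output(folder="./output_qa_dataset"):
    """Map each JSON file name in `folder` to its number of top-level items.

    Hypothetical helper: returns an empty dict if the folder does not exist.
    """
    root = Path(folder)
    if not root.is_dir():
        return {}
    summary = {}
    for path in sorted(root.glob("*.json")):
        with path.open(encoding="utf-8") as f:
            data = json.load(f)
        # Assumption: the top level is a list or a dict; anything else counts as one item.
        summary[path.name] = len(data) if isinstance(data, (list, dict)) else 1
    return summary


if __name__ == "__main__":
    for name, count in summarize_qa_output().items():
        print(f"{name}: {count} top-level item(s)")
```

Whether a top-level item corresponds to one question depends on the actual output format, so treat the counts as a rough size check only.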
