HLR/SpartQA_generation

Note:

We suggest researchers use SpaRTUN, an updated version of SpartQA-Auto that contains more relation types and rules (available on GitHub). Another human-generated dataset provided by this group for real-world spatial reasoning question answering is ReSQ.

SpartQA

We take advantage of the ground truth of NLVR images, design CFGs to generate stories, and use spatial reasoning rules to ask and answer spatial reasoning questions. This automatically generated data is called SpartQA-Auto.

Without processing the raw text, which is error-prone, we generate questions by only looking at the stored data.

There are four types of questions (q-types).

(1) FR: find the relation between two objects.

(2) FB: find the block that contains certain object(s).

(3) CO: choose which of two objects mentioned in the question meets certain criteria.

(4) YN: a yes/no question that tests whether a claim about a spatial relationship holds.

All four types of questions are formulated as multiple-choice questions.

Specifically, FB, FR, and CO questions receive a list of candidate answers, while YN questions are answered with "Yes", "No", or "DK" (Don't Know). The "DK" option exists because of the open-world assumption of the stories: if something is not described in the text, it is not considered false.
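For illustration, a single generated sample might be structured roughly like the Python dictionary below. The field names are hypothetical placeholders chosen for readability, not the dataset's actual schema.

    # Hypothetical shape of one generated sample (field names are illustrative only).
    sample = {
        "story": "There are two blocks, A and B. A medium blue square is in block A. ...",
        "questions": [
            {
                "q_type": "FR",  # find the relation between two objects
                "question": "Where is the blue square relative to block B?",
                "candidate_answers": ["left", "right", "above", "below"],
                "answer": ["left"],
            },
            {
                "q_type": "YN",  # yes/no question about a spatial claim
                "question": "Is the blue square to the left of block B?",
                "answer": ["Yes"],  # one of Yes / No / DK
            },
        ],
    }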

For generating the data samples:

  1. Create a Dataset and NLVR directory.

  2. Put the NLVR train.json, test.json, and dev.json in the NLVR directory.

  3. Use the arguments below with Dataset_gen.py to generate the data (a sketch of the corresponding argparse setup follows the examples):

     "--num_image",  help="Number of image, 6660 for train, 1000 for other", type= int, default=1000
    
     "--story_per_image",  help="How many story do you want to create for each image", type= int, default=2
    
     "--num_question",  help="number of question for each question type.", type= int, default=2
    
     "--question_type",  help="name of the question types: all, YN, FB, FR, CO", type= str, default='all'
    
     "--no_save",  help="just testing generation phase", action='store_true', default = False
    
     "--seed_num", help="add seed number for random choices.", type= int, default=None
    
     "--skip_except", help="skip all examples expcept story X", type= int, default=None
    

    for example:

      python3 Dataset_gen.py --name dev --nlvr_data dev
    

    or:

      python3 Dataset_gen.py --name train --nlvr_data train --num_image 6660
    
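For reference, a minimal sketch of how these arguments could be wired up with argparse. The flag names, types, and defaults follow the list above; --name and --nlvr_data appear only in the usage examples, so their exact semantics (output set name and NLVR split) are assumptions.

    import argparse

    # Sketch of the argument setup implied by the README, not the script's exact code.
    parser = argparse.ArgumentParser(description="Generate SpartQA-Auto samples from NLVR scenes")
    parser.add_argument("--num_image", help="Number of images, 6660 for train, 1000 for others", type=int, default=1000)
    parser.add_argument("--story_per_image", help="How many stories to create for each image", type=int, default=2)
    parser.add_argument("--num_question", help="Number of questions for each question type", type=int, default=2)
    parser.add_argument("--question_type", help="Name of the question types: all, YN, FB, FR, CO", type=str, default="all")
    parser.add_argument("--no_save", help="Just test the generation phase", action="store_true")
    parser.add_argument("--seed_num", help="Seed number for random choices", type=int, default=None)
    parser.add_argument("--skip_except", help="Skip all examples except story X", type=int, default=None)
    # --name and --nlvr_data are taken from the usage examples; their meanings are assumptions.
    parser.add_argument("--name", type=str, help="Name of the generated set, e.g. train or dev (assumption)")
    parser.add_argument("--nlvr_data", type=str, help="NLVR split to read: train, dev, or test (assumption)")
    args = parser.parse_args()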

There are three types of annotation for each set.

  1. Annotation: contains the scene graph of each scene.

  2. SpRL: contains the annotations for Spatial Role and Relation extraction.

  3. The main file, which contains the stories, questions, answers, candidate answers, and consistency and contrast sets (if applicable).
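As a rough sketch, the generated JSON files can be loaded as shown below. The file paths are placeholders, since the actual output names depend on the --name argument used at generation time.

    import json

    # Placeholder paths; actual file names depend on the generation arguments.
    with open("Dataset/train.json") as f:
        main_data = json.load(f)     # stories, questions, answers, candidate answers, ...

    with open("Dataset/train_annotation.json") as f:
        scene_graphs = json.load(f)  # scene graph of each scene

    with open("Dataset/train_SpRL.json") as f:
        sprl = json.load(f)          # spatial role and relation annotations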

Download SpartQA_Auto

Download SpartQA_Human

To see the baselines implemented on this dataset, check: https://github.com/HLR/SpartQA-baselines

  @inproceedings{mirzaee-etal-2021-spartqa,
      title = "{SPARTQA}: A Textual Question Answering Benchmark for Spatial Reasoning",
      author = "Mirzaee, Roshanak  and
        Rajaby Faghihi, Hossein  and
        Ning, Qiang  and
        Kordjamshidi, Parisa",
      booktitle = "Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies",
      month = jun,
      year = "2021",
      address = "Online",
      publisher = "Association for Computational Linguistics",
      url = "https://aclanthology.org/2021.naacl-main.364",
      doi = "10.18653/v1/2021.naacl-main.364",
      pages = "4582--4598",
  }
