CGR

Code for our EMNLP Findings 2021 paper,

Exploiting Reasoning Chains for Multi-hop Science Question Answering

Weiwen Xu, Yang Deng, Huihui Zhang, Deng Cai and Wai Lam.

Data Preparation

Our paper reports results on OpenBookQA and ARC-Challenge. Because of licensing restrictions, we do not redistribute the datasets; please download them directly from their respective websites.

Data Annotation

We use this repo as our hypothesis generator and AMR-gs as our AMR parser. Please follow their instructions to generate the hypotheses and the AMRs for the datasets.

Once annotation is done, organize the annotated files in the following directory layout (shown here for OpenBookQA):

- Data/
    - OpenBook/
        - train-complete.jsonl (train/dev/test original datasets)
        - dev-complete.jsonl
        - test-complete.jsonl
        - openbook.txt
        - ARC_Corpus.txt
        - train-hypo.txt (train/dev/test hypotheses)
        - dev-hypo.txt
        - test-hypo.txt
        - train-amr.txt (train/dev/test AMRs)
        - dev-amr.txt
        - test-amr.txt
        - core-amr.txt (core fact AMRs from open-book)
        - comm-amr.txt (common fact AMRs from ARC-Corpus)

Use scripts/clean_corpus.py to remove noisy sentences from the ARC Corpus.
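Before preprocessing, it may help to confirm that every file in the layout listed above is in place. The following is a minimal sketch, not part of this repository: the helper name `missing_files` is hypothetical, and the file names are copied from the listing above.

```python
from pathlib import Path

# File names copied from the directory layout listed in this README.
EXPECTED_FILES = [
    "train-complete.jsonl", "dev-complete.jsonl", "test-complete.jsonl",
    "openbook.txt", "ARC_Corpus.txt",
    "train-hypo.txt", "dev-hypo.txt", "test-hypo.txt",
    "train-amr.txt", "dev-amr.txt", "test-amr.txt",
    "core-amr.txt", "comm-amr.txt",
]


def missing_files(data_dir):
    """Return the expected file names that are absent from data_dir.

    Hypothetical helper for illustration; it is not shipped with the repo.
    """
    root = Path(data_dir)
    return [name for name in EXPECTED_FILES if not (root / name).is_file()]


if __name__ == "__main__":
    missing = missing_files("Data/OpenBook")
    if missing:
        print("Missing files:", ", ".join(missing))
    else:
        print("All expected files are present.")
```

An empty result means the directory matches the layout; otherwise the returned names show which annotation step still has to be completed.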

Preprocessing

Add hypothesis to the original datasets:

bash enhance_hypo.sh

Add AMRs to the hypothesis-enhanced datasets and cache the AMRs of all facts:

bash enhance_AMR.sh

Cache all dense vectors for evidence facts:

bash cache_vector.sh
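The three preprocessing commands above appear to build on one another, so they should be run in the order given. As a rough sketch, assuming the scripts sit in the current working directory, they could be chained so that a failure stops the sequence. The function `run_steps` and its `runner` parameter are hypothetical and not part of this repository.

```python
import subprocess

# Script names and their order are taken from this README.
PREPROCESSING_STEPS = ["enhance_hypo.sh", "enhance_AMR.sh", "cache_vector.sh"]


def run_steps(steps, runner=subprocess.run):
    """Run each script with bash, in order, stopping at the first failure.

    `runner` is a hypothetical hook that defaults to subprocess.run; it
    exists only so the sequence can be exercised without launching bash.
    """
    for script in steps:
        print(f"Running {script} ...")
        # check=True makes subprocess.run raise CalledProcessError on a
        # non-zero exit code, so later steps never run on partial output.
        runner(["bash", script], check=True)
```

Calling `run_steps(PREPROCESSING_STEPS)` from the repository root would be equivalent to typing the three commands one after another, with the difference that an error in an earlier step halts the run.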

Once preprocessing is complete, you will have the following directory:

- Data/
    - begin/
        - obqa/
            - train.jsonl
            - dev.jsonl
            - test.jsonl
            - core.dict (AMR file for core facts)
            - core.npy (vector file for core facts)
            - obqa.dict
            - obqa.npy
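To sanity-check the preprocessed splits, the .jsonl files can be read one line at a time. The sketch below assumes only that each non-empty line holds one JSON object, which is the usual JSON Lines convention. The field names are not documented here, so the code reports them instead of assuming them. `read_jsonl` and `summarize` are hypothetical helpers, not part of this repository.

```python
import json
from pathlib import Path


def read_jsonl(path):
    """Yield one parsed JSON value per non-empty line of a JSON Lines file."""
    with Path(path).open(encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:
                yield json.loads(line)


def summarize(path):
    """Return (record count, sorted field names of the first record).

    Assumes each record is a JSON object; the field names are read from
    the data rather than hard-coded.
    """
    records = list(read_jsonl(path))
    fields = sorted(records[0]) if records else []
    return len(records), fields
```

For example, `summarize("Data/begin/obqa/train.jsonl")` would return the number of training examples together with the keys present in the first one.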

Training

bash finetune.sh

Citation

If you find this work useful, please star this repo and cite our paper as follows:

@article{xu2021exploiting,
  title={Exploiting Reasoning Chains for Multi-hop Science Question Answering},
  author={Xu, Weiwen and Deng, Yang and Zhang, Huihui and Cai, Deng and Lam, Wai},
  journal={arXiv preprint arXiv:2109.02905},
  year={2021}
}
