
Alab-NII/multi-hop-analysis


This repository contains the code for the paper "Analyzing the Effectiveness of the Underlying Reasoning Tasks in Multi-hop Question Answering" (Findings of EACL 2023).

Dataset Information

We use two datasets in our experiments: 2WikiMultihopQA and HotpotQA-small.

We follow the steps in https://github.com/yuwfan/HGN to obtain the .gz data files from the raw data.

How to Run the Code

Set up environment

bash install_packages.sh

Prepare data for training

Training

python3 main.py

Evaluation on the dev file

python3 predictor.py $checkpoint $data_file

python3 postprocess.py $prediction_file $processed_data_file $original_data_file

python3 official_evaluation.py path/to/prediction path/to/gold
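The three evaluation commands above run in sequence (predict, postprocess, evaluate), so they can be chained from a small driver script. The following is a minimal sketch, not part of the repository: the helper names `build_eval_commands`, `format_command`, and `run_all` are hypothetical, and because this README does not say where each script writes its output, every path is supplied by the caller rather than inferred.

```python
import shlex
import subprocess
from typing import List


def build_eval_commands(
    checkpoint: str,
    data_file: str,
    prediction_file: str,
    processed_data_file: str,
    original_data_file: str,
    final_prediction: str,
    gold_file: str,
) -> List[List[str]]:
    """Return the three README commands as argument lists, in run order.

    All paths are caller-supplied assumptions; the scripts' own output
    locations are not documented here.
    """
    return [
        ["python3", "predictor.py", checkpoint, data_file],
        ["python3", "postprocess.py", prediction_file,
         processed_data_file, original_data_file],
        ["python3", "official_evaluation.py", final_prediction, gold_file],
    ]


def format_command(cmd: List[str]) -> str:
    """Render an argument list as a shell-safe string for logging."""
    return " ".join(shlex.quote(arg) for arg in cmd)


def run_all(commands: List[List[str]], dry_run: bool = True) -> None:
    """Print each command; execute it only when dry_run is False."""
    for cmd in commands:
        print(format_command(cmd))
        if not dry_run:
            # check=True stops the pipeline at the first failing step.
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Placeholder paths for illustration only; replace with real files.
    run_all(
        build_eval_commands(
            "path/to/checkpoint", "path/to/data_file",
            "path/to/prediction_file", "path/to/processed_data_file",
            "path/to/original_data_file",
            "path/to/prediction", "path/to/gold",
        ),
        dry_run=True,
    )
```

With `dry_run=True` the script only prints the commands, which is a cheap way to check the paths before starting a long prediction run.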

Reproduce the results

  • Download our checkpoints
  • Run predict_dev_all_settings.sh (Note: to use this script for the 2Wiki test set, comment out line 25, which runs the evaluation)

References

  • Our data preprocessing is based on HGN.
  • We reuse the Example class from the HGN model, updated to work with our datasets.