This repository contains the code for our paper Leveraging Context Information for Natural Question Generation
The code was developed under TensorFlow 1.4.1.
Split-1 was originally released by Du et al. We cannot use it directly because it contains no answer-position information, so we used their provided doclist-xxx.txt files to generate our own data (provided along with this repository). Please note that our paper mistakenly reports their train/dev/test split.
We release our data here
The current input data format for our system is JSON, as demonstrated by the following sample:
[{"text1": "IBM is headquartered in Armonk , NY .",
  "annotation1": {"toks": "IBM is headquartered in Armonk , NY .", "POSs": "NNP VBZ VBN IN NNP , NNP .", "NERs": "ORG O O O LOC O LOC ."},
  "text2": "Where is IBM located ?",
  "annotation2": {"toks": "Where is IBM located ?", "POSs": "WRB VBZ NNP VBN .", "NERs": "O O ORG O O"},
  "text3": "Armonk , NY",
  "annotation3": {"toks": "Armonk , NY", "POSs": "NNP , NNP", "NERs": "LOC O LOC"}}]
where "text1" and "annotation1" correspond to the text and rich annotations for the passage. Similarly, "text2" and "text3" correspond to the question and answer parts, respectively.
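Because the "toks", "POSs", and "NERs" fields are whitespace-separated and must describe the same token sequence, it is worth checking that they stay aligned. The following is a minimal sketch of such a check (the helper name `annotations_aligned` is ours, not part of the repository):

```python
# Hypothetical helper: check that the whitespace-tokenized annotation
# fields of one record are aligned (same token count in toks/POSs/NERs).
def annotations_aligned(annotation):
    toks = annotation["toks"].split()
    poss = annotation["POSs"].split()
    ners = annotation["NERs"].split()
    return len(toks) == len(poss) == len(ners)

sample = {
    "text1": "IBM is headquartered in Armonk , NY .",
    "annotation1": {"toks": "IBM is headquartered in Armonk , NY .",
                    "POSs": "NNP VBZ VBN IN NNP , NNP .",
                    "NERs": "ORG O O O LOC O LOC ."},
}

print(annotations_aligned(sample["annotation1"]))  # True
```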
Please note that the rich annotations are not required by our latest system, so you can simply feed it data samples like:
[{"text1": "IBM is headquartered in Armonk , NY .",
  "text2": "Where is IBM located ?",
  "text3": "Armonk , NY"}]
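A data file in this annotation-free format can be produced with a few lines of standard-library Python; this is a sketch, and the output filename `train_sample.json` is just an example, not one the trainer expects:

```python
import json

# Build one annotation-free record: passage, question, answer.
samples = [{
    "text1": "IBM is headquartered in Armonk , NY .",  # passage
    "text2": "Where is IBM located ?",                 # question
    "text3": "Armonk , NY",                            # answer
}]

# Write the JSON list to disk in the expected shape.
with open("train_sample.json", "w") as f:
    json.dump(samples, f, indent=2)
```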
For model training, simply execute
python NP2P_trainer.py --config_path config.json
where config.json is a JSON file containing all hyperparameters. A sample config file is included in the repository.
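For reference, writing and reading such a config with the standard `json` module looks like the sketch below. The key names here are purely illustrative, not the repository's actual hyperparameter names; consult the sample config file shipped with the repository for those.

```python
import json

# Hypothetical hyperparameters; the real keys are defined by the
# sample config file in the repository, not by this example.
config = {
    "learning_rate": 0.001,
    "batch_size": 32,
    "num_epochs": 10,
}

# Write the config to disk...
with open("config.json", "w") as f:
    json.dump(config, f, indent=2)

# ...and read it back, as a trainer taking --config_path would.
with open("config.json") as f:
    hyperparams = json.load(f)

print(hyperparams["batch_size"])  # 32
```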
For decoding, simply execute
python NP2P_beam_decoder.py --model_prefix xxx --in_path yyy --out_path zzz --mode beam
If you find our work useful, please cite:
@inproceedings{song2018leveraging,
title={Leveraging Context Information for Natural Question Generation},
author={Song, Linfeng and Wang, Zhiguo and Hamza, Wael and Zhang, Yue and Gildea, Daniel},
booktitle={Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies},
pages={569--574},
year={2018}
}