NQG_ASs2s

Implementation of <Improving Neural Question Generation Using Answer Separation> by Yanghoon Kim et al.

Notice (2020.07.18)

The source code still needs some modification.

Updates are planned after mid-September (possibly pre-trained weights, a named-entity replacement function, etc.).

Sorry for the delay.

Model
- Embedding
  - Pretrained GloVe embeddings
  - Randomly initialized embeddings
- Answer-separated seq2seq
  - Answer-separated encoder
  - Answer-separated decoder
    - Keyword-net
    - Retrieval style word generator
- Named Entity Replacement (To be updated)
- Post processing
  - Remove duplicates

Dataset

Processed data provided by Linfeng Song et al.

Extra tools
- Parameter Search

Requirements

- Python 2.7
- numpy
- TensorFlow 1.4
- nltk
- tqdm

Usage

Data preprocessing

# Extract dataset
$ tar -zxvf data/mpqg_data/nqg_data.tgz -C data/mpqg_data

# Process data
$ cd data
$ python process_mpqg_data.py # Several settings can be modified inside the source code (data path, vocab_size, etc)

Download & process GloVe

$ mkdir GloVe # data/GloVe
$ wget http://nlp.stanford.edu/data/glove.840B.300d.zip -P GloVe/
$ unzip GloVe/glove.840B.300d.zip -d GloVe/
$ python process_embedding.py # This will take a couple of minutes
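
For readers who want to see roughly what the embedding step involves, here is a minimal sketch of reading GloVe text lines and keeping only the vectors for words in a vocabulary. It is not the code in process_embedding.py; the function name `load_glove_subset` and its arguments are hypothetical. It splits each line from the right because some tokens in glove.840B.300d contain spaces.

```python
def load_glove_subset(lines, vocab, dim=300):
    """Return {word: [float, ...]} for vocabulary words found in GloVe text lines.

    Hypothetical helper for illustration; not part of this repository.
    """
    wanted = set(vocab)
    vectors = {}
    for line in lines:
        # Split from the right so a token that itself contains spaces stays whole.
        parts = line.rstrip("\n").rsplit(" ", dim)
        if len(parts) != dim + 1:
            continue  # skip malformed or truncated lines
        word = parts[0]
        if word in wanted:
            vectors[word] = [float(x) for x in parts[1:]]
    return vectors


# Usage (the .txt file name is assumed to be what the unzip step produces):
# import io
# with io.open("GloVe/glove.840B.300d.txt", encoding="utf-8") as f:
#     vectors = load_glove_subset(f, ["what", "is", "the"])
```

Words that are missing from the returned dictionary would typically keep a randomly initialized embedding.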

Run a single model

# Train
$ bash run.sh [dataset] train [checkpoint name] [epochs] # define dataset name inside run.sh
# EXAMPLE: $ bash run.sh squad train firstmodel 15

# Test
$ bash run.sh [dataset] pred [checkpoint name] [epochs] # [epochs] is only a placeholder here; enter any number
# EXAMPLE: $ bash run.sh squad pred firstmodel 1

(*Optional) Parameter search (Training)

$ bash search_params.sh [dataset]
# EXAMPLE: $ bash search_params.sh squad

# Tip
# You can refer to the file 'assets/file_generation_for_search_params.ipynb' to automatically generate the contents of search_params.sh and params.py
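
As a rough illustration of how a grid of training runs could be enumerated, the sketch below builds one `bash run.sh ... train ...` command per parameter combination, following the usage shown above. The function `make_search_commands`, the parameter names, and the checkpoint naming scheme are all assumptions; how each combination is passed to params.py is not covered here and is handled by the notebook in this repository.

```python
import itertools


def make_search_commands(dataset, grid, epochs):
    """Return one training command per combination in `grid`.

    `grid` maps a (hypothetical) parameter name to a list of candidate values.
    The checkpoint name encodes the combination so runs do not overwrite each other.
    """
    names = sorted(grid)
    commands = []
    for values in itertools.product(*(grid[name] for name in names)):
        tag = "_".join("{}{}".format(n, v) for n, v in zip(names, values))
        commands.append("bash run.sh {} train {} {}".format(dataset, tag, epochs))
    return commands


# Hypothetical grid: these parameter names are placeholders, not the ones in params.py.
for cmd in make_search_commands("squad", {"hidden": [350, 512], "dropout": [0.3, 0.4]}, 15):
    print(cmd)
```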

(*Optional) Remove duplicates (Post-processing)

$ python remove_duplicates.py --source_file [predicted_file] --out_file [post_processed_file] --ngram [scalar]
# EXAMPLE: $ python remove_duplicates.py --source_file result/predictions.txt --out_file result/predictions.rmv --ngram 4
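
To give an idea of what this kind of post-processing can look like, here is a minimal sketch that collapses an immediately repeated span of tokens. It is not necessarily what remove_duplicates.py does: the function name `collapse_repeats` is hypothetical, and treating the n-gram size as the minimum length of a repeated span is an assumption.

```python
def collapse_repeats(tokens, min_len=4):
    """Drop a span that immediately repeats the preceding output tokens.

    Only repeats of at least `min_len` tokens are removed.
    Hypothetical helper for illustration; not part of this repository.
    """
    min_len = max(1, min_len)
    out = []
    i = 0
    while i < len(tokens):
        # Longest span that could repeat: limited by what was emitted and what is left.
        max_k = min(len(out), len(tokens) - i)
        skipped = False
        for k in range(max_k, min_len - 1, -1):
            if tokens[i:i + k] == out[-k:]:
                i += k  # skip the repeated span entirely
                skipped = True
                break
        if not skipped:
            out.append(tokens[i])
            i += 1
    return out


question = "what is the name of what is the name of the city ?".split()
print(" ".join(collapse_repeats(question, min_len=4)))
# prints: what is the name of the city ?
```

A larger `min_len` is more conservative: short repetitions that occur naturally in questions are left untouched.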

Evaluation

$ python qgevalcap/eval.py -out [output filename] -src [input filename(sentence)] -tgt [target filename(question)]
# EXAMPLE: $ python qgevalcap/eval.py -out result/predictions.txt -src data/processed/mpqg_substitute_a_vocab_include_a/filtered_txt/test_sentence_origin.txt -tgt data/processed/mpqg_substitute_a_vocab_include_a/filtered_txt/test_question.txt
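
Before running the evaluation, it can help to confirm that the prediction and target files line up. The sketch below assumes the evaluation pairs the two files line by line, which is not confirmed here; the helper `check_parallel` is hypothetical and not part of qgevalcap.

```python
def check_parallel(pred_lines, tgt_lines):
    """Return a list of problems; an empty list means the files look aligned.

    Hypothetical helper for illustration; not part of this repository.
    """
    problems = []
    if len(pred_lines) != len(tgt_lines):
        problems.append(
            "line count mismatch: {} predictions vs {} targets".format(
                len(pred_lines), len(tgt_lines)
            )
        )
    for number, line in enumerate(pred_lines, start=1):
        if not line.strip():
            problems.append("empty prediction on line {}".format(number))
    return problems


# Usage (paths taken from the example above):
# import io
# with io.open("result/predictions.txt", encoding="utf-8") as p, \
#      io.open("data/processed/mpqg_substitute_a_vocab_include_a/filtered_txt/test_question.txt",
#              encoding="utf-8") as t:
#     print(check_parallel(p.readlines(), t.readlines()))
```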

Acknowledgment

The data is adapted from MPQG. The evaluation scripts are adapted from NQG.
