Automatic Question Generation with Encoder-Decoder LSTM

This is an implementation of an encoder-decoder LSTM with Bahdanau attention for the automatic question generation task. SQuAD is used for training and testing.
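
For readers unfamiliar with Bahdanau (additive) attention, the following is a minimal pure-Python sketch of a single decoder step. It is an illustration only, not the code in this repository: the function name `additive_attention` and the parameter names `W_q`, `W_k` and `v` are assumptions chosen for the example, and a real model would use a tensor library and learned weights.

```python
import math


def additive_attention(query, keys, W_q, W_k, v):
    """One step of additive attention (illustrative sketch, hypothetical names).

    query: decoder state, list of length d_q
    keys:  encoder states, list of T lists of length d_k
    W_q:   d_a x d_q matrix, W_k: d_a x d_k matrix, v: vector of length d_a
    Returns (weights, context).
    """
    def matvec(matrix, vec):
        return [sum(m * x for m, x in zip(row, vec)) for row in matrix]

    q_proj = matvec(W_q, query)
    scores = []
    for key in keys:
        k_proj = matvec(W_k, key)
        # score = v^T tanh(W_q q + W_k k)
        hidden = [math.tanh(a + b) for a, b in zip(q_proj, k_proj)]
        scores.append(sum(vi * hi for vi, hi in zip(v, hidden)))

    # Softmax over source positions, shifted by the max for numerical stability.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]

    # Context vector: attention-weighted sum of the encoder states.
    dim = len(keys[0])
    context = [sum(w * key[j] for w, key in zip(weights, keys)) for j in range(dim)]
    return weights, context
```

The weights sum to 1 over the source positions, and the context vector is what the decoder would combine with its own state when predicting the next question token.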

How to run

First, download SQuAD from here. Then, download GloVe from here.

The code consists of several modules. Run the following command to see the parameters of any script:

python3 modules/{filename}.py --help

Run the modules in the following order:

Preprocessing the dataset:

python3 modules/squad.py --input path-to-squad-json-file --out desired-output-path --out_format pkl_or_csv
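
As a rough idea of what this preprocessing step involves, here is a minimal sketch that flattens the nested SQuAD JSON layout into one row per question. It is not the implementation in `modules/squad.py`: the function names `load_squad` and `flatten_squad` and the output column names are assumptions made for illustration, and the sample record is hand-written rather than taken from the dataset.

```python
import json


def load_squad(path):
    """Read a SQuAD JSON file from disk (hypothetical helper)."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


def flatten_squad(squad):
    """Turn a parsed SQuAD object into flat (context, question, answer) rows."""
    rows = []
    for article in squad["data"]:
        for paragraph in article["paragraphs"]:
            context = paragraph["context"]
            for qa in paragraph["qas"]:
                # SQuAD 2.0 marks unanswerable questions; they carry no answer text.
                if qa.get("is_impossible", False) or not qa["answers"]:
                    continue
                answer = qa["answers"][0]  # keep only the first reference answer
                rows.append({
                    "context": context,
                    "question": qa["question"],
                    "answer": answer["text"],
                    "answer_start": answer["answer_start"],
                })
    return rows


# Tiny hand-written record in the SQuAD layout (not real dataset content).
example = {
    "data": [{
        "title": "Example",
        "paragraphs": [{
            "context": "The LSTM was introduced in 1997.",
            "qas": [{
                "id": "q1",
                "question": "When was the LSTM introduced?",
                "answers": [{"text": "1997", "answer_start": 27}],
            }],
        }],
    }]
}

rows = flatten_squad(example)
```

Each row could then be written to a pickle or CSV file, matching the two output formats the script accepts.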

Building the tokenizers:

python3 modules/tokenizer.py --input path-to-preprocessed-squad-file

You can customize the tokenization process. Run the script with --help to see the parameters you can adjust.
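
To make the idea of "building a tokenizer" concrete, the sketch below builds a word-level vocabulary with a frequency cutoff and encodes a sentence into ids. It is an assumption about what such a step typically looks like, not the behavior of `modules/tokenizer.py`: the special tokens, the whitespace splitting, and the `min_freq` and `max_size` parameters are all hypothetical.

```python
from collections import Counter

# Hypothetical special tokens; the real module may use different ones.
PAD, UNK, SOS, EOS = "<pad>", "<unk>", "<sos>", "<eos>"


def build_vocab(texts, min_freq=1, max_size=None):
    """Map each sufficiently frequent lowercase word to an integer id."""
    counts = Counter(tok for text in texts for tok in text.lower().split())
    # most_common(None) returns every word, most frequent first.
    words = [w for w, c in counts.most_common(max_size) if c >= min_freq]
    itos = [PAD, UNK, SOS, EOS] + words
    return {word: idx for idx, word in enumerate(itos)}


def encode(text, vocab):
    """Convert text to ids, wrapping it in start and end markers."""
    ids = [vocab.get(tok, vocab[UNK]) for tok in text.lower().split()]
    return [vocab[SOS]] + ids + [vocab[EOS]]


vocab = build_vocab(["what is attention", "what is lstm"], min_freq=2)
ids = encode("what is glove", vocab)
```

With `min_freq=2`, only "what" and "is" enter the vocabulary, so the unseen word "glove" falls back to the `<unk>` id.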

Training:

python3 modules/train.py --train_config path-to-train-config-file

You can set the training parameters (batch size, learning rate, etc.) by editing the train_config.json file.
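
The schema of train_config.json is not described here, so the following is only a sketch of how a JSON training config could be read and merged with defaults. The key names `batch_size`, `learning_rate` and `epochs`, their default values, and the function `parse_train_config` are all assumptions for illustration; check the actual file in the repository for the real keys.

```python
import json

# Hypothetical keys and defaults; the real train_config.json may differ.
DEFAULTS = {"batch_size": 32, "learning_rate": 0.001, "epochs": 10}


def parse_train_config(text):
    """Parse JSON config text and overlay it on the defaults."""
    overrides = json.loads(text)
    unknown = sorted(set(overrides) - set(DEFAULTS))
    if unknown:
        # Fail loudly on typos instead of silently ignoring them.
        raise KeyError(f"unknown config keys: {unknown}")
    config = dict(DEFAULTS)
    config.update(overrides)
    return config


config = parse_train_config('{"batch_size": 64, "learning_rate": 0.0005}')
```

Rejecting unknown keys is a design choice in this sketch: a misspelled parameter name raises an error instead of leaving the default in place unnoticed.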

Inference:

python3 modules/inference.py --input path-to-tokenized-test-data

You can also choose the decoding method and other options through the script's parameters; run it with --help to list them.
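
The decoding methods this script supports are not listed here. As general background, the sketch below contrasts two common choices, greedy decoding and beam search, on a toy next-token table. Everything in it is hypothetical: the `step` function stands in for a trained decoder, and the function names, token names and probabilities are made up for the example.

```python
import math


def greedy_decode(step, start, end, max_len=20):
    """Pick the single most probable next token at every step."""
    seq = [start]
    while len(seq) < max_len and seq[-1] != end:
        probs = step(seq)
        seq.append(max(probs, key=probs.get))
    return seq


def beam_search(step, start, end, beam_size=3, max_len=20):
    """Keep the beam_size best partial sequences by total log-probability."""
    beams = [(0.0, [start])]  # (log-probability, sequence)
    finished = []
    for _ in range(max_len - 1):
        candidates = []
        for score, seq in beams:
            for tok, p in step(seq).items():
                if p > 0:
                    candidates.append((score + math.log(p), seq + [tok]))
        candidates.sort(key=lambda cand: cand[0], reverse=True)
        beams = []
        for cand in candidates[:beam_size]:
            # Sequences that emitted the end token leave the beam.
            (finished if cand[1][-1] == end else beams).append(cand)
        if not beams:
            break
    finished.extend(beams)  # keep hypotheses cut off by max_len
    return max(finished, key=lambda cand: cand[0])[1]


# Toy next-token distribution keyed on the previous token (made-up numbers).
TABLE = {
    "<sos>": {"a": 0.6, "b": 0.4},
    "a": {"x": 0.55, "y": 0.45},
    "b": {"z": 0.9, "x": 0.1},
    "x": {"<eos>": 1.0},
    "y": {"<eos>": 1.0},
    "z": {"<eos>": 1.0},
}


def toy_step(seq):
    return TABLE[seq[-1]]


greedy = greedy_decode(toy_step, "<sos>", "<eos>")
beam = beam_search(toy_step, "<sos>", "<eos>", beam_size=2)
```

On this toy table, greedy decoding commits to "a" first and ends with a sequence of probability 0.33, while beam search with two beams finds the sequence through "b" with probability 0.36. The sketch applies no length normalization, which practical beam search implementations often add.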