This repository contains code for our ACL 2019 paper "Argument Generation with Retrieval, Planning, and Realization".
Note: the main modifications in this repository include using `torch.utils.data.Dataset` for data loading, TensorBoard logging via tensorboardX, and support for newer versions of Python and PyTorch. Dependencies:
- python 3.7
- PyTorch 1.6.0
- numpy 1.15
- tensorboardX 2.1
- tqdm
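As noted above, data loading goes through `torch.utils.data.Dataset`. Below is a minimal sketch of what that wiring, plus tensorboardX logging, might look like; the `ArgumentDataset` class and its fields are illustrative, not this repository's actual API:

```python
import torch
from torch.utils.data import Dataset, DataLoader
from tensorboardX import SummaryWriter

class ArgumentDataset(Dataset):
    """Hypothetical dataset: wraps pre-tensorized (source, target) id pairs."""
    def __init__(self, examples):
        # `examples` is a list of dicts holding source/target token id tensors.
        self.examples = examples

    def __len__(self):
        return len(self.examples)

    def __getitem__(self, idx):
        return self.examples[idx]

# Usage: batches come from a DataLoader, scalar metrics go to ./runs/.
dataset = ArgumentDataset([{"src": torch.tensor([1, 2, 3]),
                            "tgt": torch.tensor([4, 5])}])
loader = DataLoader(dataset, batch_size=16, shuffle=True, collate_fn=lambda b: b)
writer = SummaryWriter(logdir="./runs/demo")
for step, batch in enumerate(loader):
    loss = 0.0  # placeholder for the real training loss on `batch`
    writer.add_scalar("train/loss", loss, step)
writer.close()
```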
The dataset we used is currently hosted on Google Drive and can be accessed via this link.
As described in the paper, we pre-train the encoder and the realization decoder on extra data from ChangeMyView. The pre-trained weights can be downloaded here: encoder; decoder
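Once downloaded, the weights can presumably be loaded with standard PyTorch state-dict mechanics. A hedged sketch, assuming the files are plain `state_dict` checkpoints and that the model exposes `encoder`/`decoder` submodules (the file names and module structure here are hypothetical):

```python
import torch
import torch.nn as nn

# Placeholder modules standing in for the paper's actual encoder/decoder.
model = nn.ModuleDict({
    "encoder": nn.LSTM(input_size=300, hidden_size=512, batch_first=True),
    "decoder": nn.LSTM(input_size=300, hidden_size=512, batch_first=True),
})

# Hypothetical file names; substitute the paths of the downloaded weights.
model["encoder"].load_state_dict(torch.load("encoder.pt", map_location="cpu"))
model["decoder"].load_state_dict(torch.load("decoder.pt", map_location="cpu"))
```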
We assume the data is placed under the `./data/` directory, and the pre-trained GloVe embeddings at `./embeddings/glove.6B.300d.txt`. The following snippet trains the model:
```shell
python train.py \
    --exp-name=demo \
    --batch-size=16 \
    --max-epochs=30 \
    --save-freq=2
```
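The GloVe file referenced above is plain text with one `word v1 ... v300` entry per line; below is a minimal sketch of loading it into an embedding matrix (the `vocab` mapping is a toy stand-in for the repository's real vocabulary):

```python
import numpy as np

def load_glove(path, vocab, dim=300):
    """Fill an embedding matrix for `vocab` (word -> row index) from a GloVe file."""
    emb = np.random.uniform(-0.1, 0.1, size=(len(vocab), dim)).astype(np.float32)
    with open(path, encoding="utf-8") as f:
        for line in f:
            parts = line.rstrip().split(" ")
            word, vec = parts[0], parts[1:]
            if word in vocab and len(vec) == dim:
                emb[vocab[word]] = np.asarray(vec, dtype=np.float32)
    return emb

# Usage with a toy vocabulary:
vocab = {"<pad>": 0, "<unk>": 1, "argument": 2}
emb = load_glove("./embeddings/glove.6B.300d.txt", vocab)
```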
Model checkpoints will be saved to `./checkpoints/[exp-name]/`, and TensorBoard logs to `./runs/[exp-name]/` (viewable by pointing TensorBoard at that directory).
We implement greedy decoding for sentence planning (phrase selection and sentence type prediction), and beam search for word decoding. The following sample script runs decoding with the model checkpoint from `demo` at `epoch_id=30`. Note that when `--use-goldstandard-plan` is specified, the gold-standard sentence plan is used instead of greedy search. Unless `--quiet` is set, intermediate logs are printed to the console.
```shell
python generate.py \
    --epoch-id=30 \
    --exp-name=demo \
    --max-token-per-sentence=30 \
    --beam-size=5 \
    --max-phrase-selection-time=2 \
    --block-ngram-repeat=4 \
    [--use-goldstandard-plan \]
    [--quiet]
```
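To make the word-level beam search concrete, here is a generic sketch of the procedure; the `step` scoring function and token ids are placeholders, not this repository's actual decoder interface:

```python
import math

def beam_search(step, bos, eos, beam_size=5, max_len=30):
    """Generic beam search: `step(prefix)` returns a list of (token, log_prob)."""
    beams = [([bos], 0.0)]   # (token sequence, cumulative log-probability)
    finished = []
    for _ in range(max_len):
        # Expand every live hypothesis by one token.
        candidates = []
        for seq, score in beams:
            for tok, logp in step(seq):
                candidates.append((seq + [tok], score + logp))
        # Keep the `beam_size` best; move completed hypotheses to `finished`.
        beams = []
        for seq, score in sorted(candidates, key=lambda c: c[1], reverse=True)[:beam_size]:
            (finished if seq[-1] == eos else beams).append((seq, score))
        if not beams:
            break
    finished.extend(beams)  # hypotheses cut off by max_len
    return max(finished, key=lambda c: c[1])[0]

# Toy usage: a "model" that prefers token 7 for two steps, then ends (eos=2).
toy = lambda seq: ([(7, math.log(0.6)), (2, math.log(0.3))] if len(seq) < 3
                   else [(2, math.log(0.9))])
print(beam_search(toy, bos=1, eos=2))  # -> [1, 7, 7, 2]
```

The real decoder additionally blocks repeated n-grams (`--block-ngram-repeat`) and caps phrase selections (`--max-phrase-selection-time`), which this sketch omits.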
Xinyu Hua (hua.x [at] northeastern.edu)
See the LICENSE file for details.