This is the code for AAAI2021 paper Converse, Focus and Guess - Towards Multi-Document Driven Dialogue.
- Ubuntu 16.04
- Python >= 3.6.0
- PyTorch >= 1.3.0
We build a benchmark GuessMovie dataset for MD3 task on the base of the dataset WikiMovies (Miller et al. 2016). It includes 16,881 documents with 6 different attributes (i.e. directed_by, release_year, written_by, starred_actors, has_genre, in_language). The dataset can be downloaded from the link, and should be decompressed to the data directory of this repository.
To preprocess, we provide several methods.
python preprocess.py --[vocab/split]
vocab
: splitting words, making vocabulary, word to id and filtering GloVe embeddings.split
: splitting the dataset to two parts. The former is for training Doc-Rep and NLU. And the latter is for training dialogue policy and simulating dialogue.
To obtain attribute-aware document representation, just run the following command.
./train_doc_rep.sh
You can modify the bash script to change directory of input and output.
We use imitation learning to train NLU module separately on the former part of dataset.
python train_nlu.py --out [OUT_INFIX] --train --test
We use reinforce learning to train Policy module on the latter part of dataset.
python run_game.py --in [IN_INFIX] --out [OUT_INFIX] --train --test
IN_INFIX
: directory name of NLU checkpointsOUT_INFIX
: directory name of output checkpoints for NLU and Policy
Testing the dialog with 5k simulations on the latter part of dataset.
python run_game.py --in [IN_INFIX] --test
IN_INFIX
: directory name of NLU and Policy checkpoints
You can change the
agent_type
inconfig/game_config.yaml
to test different agents.
Some tools are in tests
directory.
If you consider our work useful, please cite the paper:
@inproceedings{liu2021converse,
title={Converse, Focus and Guess - Towards Multi-Document Driven Dialogue},
author={Liu, Han and Yuan, Caixia and Wang, Xiaojie and Yang, Yushu and Jiang, Huixing and Wang, Zhongyuan},
booktitle={Thirty-Fifth AAAI Conference on Artificial Intelligence},
year={2021}
}