Multi-Document Driven Dialogue (MD3)

This is the code for AAAI2021 paper Converse, Focus and Guess - Towards Multi-Document Driven Dialogue.

Requirements

Ubuntu 16.04
Python >= 3.6.0
PyTorch >= 1.3.0

Dataset: GuessMovie

We build a benchmark GuessMovie dataset for MD3 task on the base of the dataset WikiMovies (Miller et al. 2016). It includes 16,881 documents with 6 different attributes (i.e. directed_by, release_year, written_by, starred_actors, has_genre, in_language). The dataset can be downloaded from the link, and should be decompressed to the data directory of this repository.

Preprocess

To preprocess, we provide several methods.

python preprocess.py --[vocab/split]

vocab: splitting words, making vocabulary, word to id and filtering GloVe embeddings.
split: splitting the dataset to two parts. The former is for training Doc-Rep and NLU. And the latter is for training dialogue policy and simulating dialogue.

Training Doc-Rep

To obtain attribute-aware document representation, just run the following command.

./train_doc_rep.sh

You can modify the bash script to change directory of input and output.

Training NLU

We use imitation learning to train NLU module separately on the former part of dataset.

python train_nlu.py --out [OUT_INFIX] --train --test

Training Policy

We use reinforce learning to train Policy module on the latter part of dataset.

python run_game.py --in [IN_INFIX] --out [OUT_INFIX] --train --test

IN_INFIX: directory name of NLU checkpoints
OUT_INFIX: directory name of output checkpoints for NLU and Policy

Testing Game

Testing the dialog with 5k simulations on the latter part of dataset.

python run_game.py --in [IN_INFIX] --test

IN_INFIX: directory name of NLU and Policy checkpoints

You can change the agent_type in config/game_config.yaml to test different agents.

Others

Some tools are in tests directory.

Reference

If you consider our work useful, please cite the paper:

@inproceedings{liu2021converse,
  title={Converse, Focus and Guess - Towards Multi-Document Driven Dialogue},
  author={Liu, Han and Yuan, Caixia and Wang, Xiaojie and Yang, Yushu and Jiang, Huixing and Wang, Zhongyuan},
  booktitle={Thirty-Fifth AAAI Conference on Artificial Intelligence},
  year={2021}
}

Name		Name	Last commit message	Last commit date
Latest commit History 5 Commits
config		config
data		data
game		game
imgs		imgs
outputs		outputs
tests		tests
utils		utils
.gitignore		.gitignore
LICENSE		LICENSE
README.md		README.md
build_doc_rep.py		build_doc_rep.py
preprocess.py		preprocess.py
requirements.txt		requirements.txt
run_game.py		run_game.py
train_doc_rep.py		train_doc_rep.py
train_doc_rep.sh		train_doc_rep.sh
train_nlu.py		train_nlu.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Multi-Document Driven Dialogue (MD3)

Requirements

Dataset: GuessMovie

Preprocess

Training Doc-Rep

Training NLU

Training Policy

Testing Game

Others

Reference

About

Releases

Packages

Languages

License

laddie132/MD3

Folders and files

Latest commit

History

Repository files navigation

Multi-Document Driven Dialogue (MD3)

Requirements

Dataset: GuessMovie

Preprocess

Training Doc-Rep

Training NLU

Training Policy

Testing Game

Others

Reference

About

Topics

Resources

License

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages