DIALKI

This repo provides the training and inference code for the DIALKI model from our EMNLP 2021 paper, "DIALKI: Knowledge Identification in Conversational Systems through Dialogue-Document Contextualization".

Setup

The code has been tested on CUDA 11.0+.

1. Run conda env create -f environment.yml, then conda activate dialki.
2. To train on the doc2dial dataset, first create a folder ./dialdoc and put the original data files from here into a subfolder ./dialdoc/raw_data.
3. If you want to train on wow instead, skip step 2. Create a folder ./wow and change the path variable in the ./path file to ./wow.
4. Run bash setup.sh.
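
As a minimal sketch of the doc2dial folder preparation described above, the following creates ./dialdoc/raw_data and warns if no data files have been copied in yet. The variable names data_dir and n_files are illustrative and not part of the repo; the data files themselves still have to be obtained and placed by hand before running setup.sh.

```shell
# Create the folder layout expected for doc2dial.
data_dir="./dialdoc"
mkdir -p "${data_dir}/raw_data"

# Count the files currently in raw_data (tr strips padding that some wc builds add).
n_files=$(find "${data_dir}/raw_data" -type f | wc -l | tr -d ' ')

# Warn if the original data files have not been copied in yet.
if [ "${n_files}" -eq 0 ]; then
  echo "raw_data is empty: copy the original doc2dial files into ${data_dir}/raw_data before running setup.sh"
fi
```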

Data Preparation and Training

We ran the default parameters on 2 NVIDIA Quadro Q6000 GPUs. Each training run took about 18 hours for 20 epochs (the default).

Run bash run.sh dialdoc or bash run.sh wow, depending on which dataset you want to train on.

Some important parameters to change if you do not have enough memory for training:

Setting --adv_loss_weight=0.0 in scripts/train.sh disables the posterior regularization, which saves memory during training at the cost of model performance. --passages_per_question can also be set to a smaller value to save memory. Setting --decision_function=0 disables the knowledge contextualization component.
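
Before editing these values, it can help to see where they are currently set. The sketch below defines a hypothetical helper, show_memory_flags, which is not part of the repo; it only uses grep to list the lines of a script that mention the three flags named above, and it assumes the flags appear literally in scripts/train.sh.

```shell
# Hypothetical helper (not part of the repo): print, with line numbers,
# every line of a training script that mentions one of the flags above.
show_memory_flags() {
  local script="${1:-scripts/train.sh}"
  # -e is needed because the pattern itself starts with dashes.
  grep -n -E -e '--(adv_loss_weight|passages_per_question|decision_function)' "${script}"
}

# Inspect the training script if it is present in the current directory.
if [ -f scripts/train.sh ]; then
  show_memory_flags scripts/train.sh || echo "none of the flags were found in scripts/train.sh"
fi
```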

Inference and Evaluation

After you finish training, run bash run_eval.sh [dataname] [checkpoint_path] [inference_output_path] to run inference. dataname can be either dialdoc or wow. The checkpoint_path can be either the best model from your training or our provided model for each dataset. inference_output_path is where you want the inference results to be saved. The console will print out the evaluation results during inference.
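
The call above can be wrapped with a few sanity checks, as in the sketch below. The function run_dialki_eval is a hypothetical wrapper and not part of the repo; it assumes run_eval.sh is in the current directory and takes the three positional arguments described above, and it only validates the inputs before handing them over.

```shell
# Hypothetical wrapper (not part of the repo) around run_eval.sh.
run_dialki_eval() {
  local dataname="$1" checkpoint="$2" output="$3"

  # dataname must be one of the two supported datasets.
  case "${dataname}" in
    dialdoc|wow) ;;
    *) echo "dataname must be dialdoc or wow, got: ${dataname}" >&2; return 1 ;;
  esac

  # Fail early if the checkpoint path does not exist.
  if [ ! -e "${checkpoint}" ]; then
    echo "checkpoint not found: ${checkpoint}" >&2
    return 1
  fi

  bash run_eval.sh "${dataname}" "${checkpoint}" "${output}"
}
```

It would be called with the same three arguments as the script itself, for example run_dialki_eval dialdoc [checkpoint_path] [inference_output_path].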

Currently, inference runs on the dev set by default. To evaluate on a test set instead (note that you need to contact the dialdoc authors to get their test set), edit scripts/eval.sh and change the --dev_file path.

Cite

@inproceedings{wu-etal-2021-dialki,
    title = "{DIALKI}: Knowledge Identification in Conversational Systems through Dialogue-Document Contextualization",
    author = "Wu, Zeqiu  and
      Lu, Bo-Ru  and
      Hajishirzi, Hannaneh  and
      Ostendorf, Mari",
    booktitle = "Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing",
    month = nov,
    year = "2021",
    address = "Online and Punta Cana, Dominican Republic",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2021.emnlp-main.140",
    pages = "1852--1863",
    abstract = "Identifying relevant knowledge to be used in conversational systems that are grounded in long documents is critical to effective response generation. We introduce a knowledge identification model that leverages the document structure to provide dialogue-contextualized passage encodings and better locate knowledge relevant to the conversation. An auxiliary loss captures the history of dialogue-document connections. We demonstrate the effectiveness of our model on two document-grounded conversational datasets and provide analyses showing generalization to unseen documents and long dialogue contexts.",
}
