Response-Reranking

Code repository for the paper:

Reranking Overgenerated Responses for End-to-End Task-Oriented Dialogue Systems by Songbo Hu, Ivan Vulić, Fangyu Liu, and Anna Korhonen.

This response reranker is a simple yet effective model that aims to select high-quality responses from the lists of candidates initially overgenerated by any end-to-end task-oriented dialogue system.
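The selection step can be sketched in a few lines: score every overgenerated candidate and keep the highest-scoring one. This is only an illustration under stated assumptions. `rerank` and `toy_score` are hypothetical names, the candidate strings are invented, and the scoring function is an arbitrary placeholder, not the trained reranker from the paper.

```python
def rerank(candidates, score_fn):
    """Return the candidates sorted from highest to lowest score."""
    # sorted() is stable, so tied candidates keep their original order.
    return sorted(candidates, key=score_fn, reverse=True)


def toy_score(response):
    # Hypothetical stand-in for a trained reranker: it simply counts
    # delexicalised slot placeholders such as "[value_name]".
    return response.count("[value_")


# Invented candidates, for illustration only.
over_gen = [
    "is there anything else i can help with ?",
    "[value_name] is a [value_pricerange] hotel in the [value_area] .",
    "[value_name] is located in the [value_area] .",
]

ranked = rerank(over_gen, toy_score)
best = ranked[0]
print(best)
```

In practice `score_fn` would wrap a trained model that assigns a quality score to each candidate for the given dialogue context.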

Environment

The code is tested with Python 3.8. First, install PyTorch 1.11.0 from the official website. Then clone this repository and, from its root directory, install the dependencies:

>> git clone git@github.com:cambridgeltl/response_reranking.git
>> pip install -r requirements.txt

Data Preprocessing

Before training and evaluating our reranking models, unzip data.zip in the repository root directory. It contains three files and a folder: 0.7_train.json, 0.7_dev.json, 0.7_test.json, and multi-woz-processed.

>> unzip data.zip

Each JSON file contains overgenerated responses from the MinTL system, stored as a list of entries with the following fields:

"context_text" denotes the lexicalised dialogue context.
"resp_text" and "resp_nodelex" are the ground truth delexicalised and lexicalised responses, respectively, to the given dialogue context.
"resp_gen" is the generated delexicalised response based on greedy search given the dialogue context.
"over_gen" is a list of 20 overgenerated delexicalised responses based on top-p sampling given the dialogue context.
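As a rough illustration of this layout, the sketch below checks that an entry carries the five fields listed above. The sample entry is invented (the real files hold 20 overgenerated candidates per entry), the expected Python types are an assumption, and `check_entry` is a hypothetical helper that is not part of this repository.

```python
import json

# Field names as listed above; the expected types are an assumption.
EXPECTED_FIELDS = {
    "context_text": str,
    "resp_text": str,
    "resp_nodelex": str,
    "resp_gen": str,
    "over_gen": list,
}


def check_entry(entry):
    """Return a list of problems found in one entry (empty if none)."""
    problems = []
    for name, expected_type in EXPECTED_FIELDS.items():
        if name not in entry:
            problems.append(f"missing field: {name}")
        elif not isinstance(entry[name], expected_type):
            problems.append(f"unexpected type for field: {name}")
    return problems


# Invented toy data; real entries come from the unzipped JSON files.
sample_json = """
[{"context_text": "i need a cheap hotel in the north .",
  "resp_text": "[value_name] is a cheap hotel in the north .",
  "resp_nodelex": "worth house is a cheap hotel in the north .",
  "resp_gen": "i found [value_choice] hotels . any preference ?",
  "over_gen": ["[value_name] is in the north .",
               "would you like to book [value_name] ?"]}]
"""

entries = json.loads(sample_json)
for entry in entries:
    print(check_entry(entry), len(entry["over_gen"]))
```

To run the same check on the real data, replace `json.loads(sample_json)` with `json.load` applied to one of the unzipped files.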

We used the preprocessing script (setup.sh) from DAMD to perform delexicalisation and produce the files in multi-woz-processed.

Generating Similarity Scores

To generate the cosine similarity scores, computed with the all-mpnet-v2 encoder, between the overgenerated responses and the ground truth responses:

>> PYTHONPATH=$(pwd) python ./src/generate_similarity_scores.py

To generate the cosine similarity scores, computed with the all-mpnet-v2 encoder, between the greedy search responses and the ground truth responses:

>> PYTHONPATH=$(pwd) python ./src/generate_similarity_scores_greedy.py
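For reference, the cosine similarity between two embedding vectors u and v is their dot product divided by the product of their norms. The sketch below shows only that scoring step on made-up vectors. In the scripts above the vectors come from the sentence encoder; the function and variable names here are hypothetical and do not mirror the repository code.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # convention chosen here for zero vectors
    return dot / (norm_u * norm_v)


# Made-up 3-dimensional "embeddings" standing in for encoder outputs.
ground_truth = [1.0, 0.0, 1.0]
candidates = [[1.0, 0.0, 1.0], [0.0, 1.0, 0.0], [1.0, 1.0, 0.0]]

# One score per candidate: higher means closer to the ground truth.
scores = [cosine_similarity(c, ground_truth) for c in candidates]
print(scores)
```

Scores close to 1 indicate a candidate whose embedding points in nearly the same direction as the ground truth embedding, while scores near 0 indicate unrelated vectors.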

Training

Stage 1 (response selection training):

>> PYTHONPATH=$(pwd) python ./src/train_response_selection_cross_encoder.py

Stage 2 (similarity-based response reranking training):

>> PYTHONPATH=$(pwd) python ./src/train_similarity_reranking.py

Stage 2 (classification-based response reranking training):

>> PYTHONPATH=$(pwd) python ./src/train_classification_reranking.py

Testing

To test the similarity-based response reranking models:

>> PYTHONPATH=$(pwd) python ./src/eval_similarity_reranking.py

To test the classification-based response reranking models:

>> PYTHONPATH=$(pwd) python ./src/eval_classification_reranking.py
