DERP: Deep Evaluation for Response Predictors

Akshay Kumar Gupta, Shantanu Kumar, Surag Nair, Barun Patra

Description

Automatic evaluation of dialogue response generation systems is a fundamentally difficult problem for researchers in the field. Most automatic metrics in common use have been shown to correlate only very weakly, or not at all, with human scoring of a dialogue system. We propose a novel automatic evaluation method that uses a trained deep learning model as the evaluator. We hope that this method addresses the shortcomings of traditional evaluation metrics and aligns better with human scoring.
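
As an illustration of what "correlating with human scoring" means in practice, the sketch below computes the Pearson correlation between an automatic metric's scores and human ratings for the same set of responses. It is a minimal example and is not taken from this repository: the function name `pearson` and the toy score lists are hypothetical, and the numbers are made up purely to show the call.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length score lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length with at least 2 items")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Covariance and variances, each left unnormalised (the 1/n factors cancel).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant scores")
    return cov / math.sqrt(var_x * var_y)


if __name__ == "__main__":
    # Hypothetical toy data: one automatic-metric score and one human rating
    # per generated response. These values are illustrative, not real results.
    metric_scores = [0.12, 0.40, 0.33, 0.08, 0.25]
    human_scores = [4.0, 2.0, 5.0, 3.0, 1.0]
    print(f"Pearson r = {pearson(metric_scores, human_scores):.3f}")
```

A value near 1 would indicate that the metric ranks responses much as humans do, while a value near 0 indicates the weak agreement described above. A rank-based measure such as Spearman correlation could be substituted if only the ordering of responses matters.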