DERP : Deep Evaluation for Response Predictors

Akshay Kumar Gupta, Shantanu Kumar, Surag Nair, Barun Patra

Description

Automatic evaluation of dialogue response generation systems is a fundamentally difficult problem for researchers in the field. Most automatic metrics in current use have been shown to correlate very weakly, or not at all, with human scoring of a dialogue system. We propose a novel automatic evaluation method that uses a trained deep learning model for the task. We hope that this method addresses the issues faced by traditional evaluation systems and aligns better with human scoring.
