# Study of automatic evaluation metrics applied to story generation in relation to human metrics

Project created by Clémence Millet and Vinciane Desbois, 2023

In this project, we studied the performance metrics of text generation algorithms. The notebook accompanying our article is available as project_nlp_similarity.ipynb. The paper is available on OpenReview: https://openreview.net/pdf?id=b-2xX-oOmUn

## Abstract

Automatic story generation is a complex branch of NLP whose evaluation techniques have been less studied than those for summarization or data-to-text generation. In this analysis, we focus on the relevance of existing automatic metrics, both traditional and more recent, for evaluating this type of task. Using a dataset annotated by human evaluators, we compare automatic metrics to human metrics, look for correlations between them, and assess how well automatic metrics predict some human metrics. Our results mainly show a high similarity between all automatic metrics and their difficulty in predicting human metrics, even when combined.
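
As an illustration of the correlation analysis described in the abstract, below is a minimal sketch (not taken from the project notebook) of how automatic metric scores can be compared against human ratings; the score values and the choice of Pearson/Spearman correlation are illustrative assumptions.

```python
# Minimal sketch of a metric-vs-human correlation analysis.
# The score values below are hypothetical placeholders; in the actual study
# they would come from automatic metrics (e.g. BLEU, BERTScore) computed on
# generated stories and from human annotations of those same stories.
from scipy.stats import pearsonr, spearmanr

# Automatic metric scores for a set of generated stories (hypothetical values)
automatic_scores = [0.31, 0.45, 0.28, 0.52, 0.40, 0.36]
# Human ratings of the same stories, e.g. on a 1-5 scale (hypothetical values)
human_ratings = [2.0, 3.5, 2.5, 4.0, 3.0, 2.5]

# Pearson measures linear correlation; Spearman measures rank correlation
pearson_r, pearson_p = pearsonr(automatic_scores, human_ratings)
spearman_r, spearman_p = spearmanr(automatic_scores, human_ratings)

print(f"Pearson r = {pearson_r:.3f} (p = {pearson_p:.3f})")
print(f"Spearman rho = {spearman_r:.3f} (p = {spearman_p:.3f})")
```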