Assessing the quality of natural language generation systems through human annotation is very expensive. Additionally, human annotation campaigns are time-consuming and involve non-reusable human labour. In practice, researchers rely on automatic metrics as a proxy for quality. In the last decade, many string-based metrics (e.g., BLEU) have been introduced. However, such metrics usually rely on exact matches and thus do not robustly handle synonyms. In this paper, we introduce InfoLM, a family of untrained metrics that can be viewed as string-based metrics addressing the aforementioned flaws thanks to a pre-trained masked language model. This family of metrics also makes use of information measures, allowing InfoLM to be adapted to various evaluation criteria. Using direct assessment, we demonstrate that InfoLM achieves statistically significant improvement and over $10$ points of correlation gains in many configurations on both summarization and data2text generation.
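The core idea of the abstract, comparing per-token probability distributions under an information measure rather than requiring exact string matches, can be sketched with a toy KL-divergence score. This is a minimal illustration only: the function names and the uniform toy distributions below are assumptions for the sketch, whereas InfoLM itself derives the distributions from a pre-trained masked language model.

```python
import numpy as np

def kl_divergence(p, q, eps=1e-12):
    """KL divergence between two discrete distributions (with smoothing)."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    return float(np.sum(p * np.log((p + eps) / (q + eps))))

def infolm_style_score(cand_dists, ref_dists):
    """Average divergence between per-position token distributions of a
    candidate and a reference -- a toy stand-in for the masked-LM
    distributions that InfoLM actually compares."""
    return float(np.mean([kl_divergence(p, q)
                          for p, q in zip(cand_dists, ref_dists)]))

# Toy vocabulary of 4 tokens, two positions per sentence.
ref = [np.array([0.7, 0.1, 0.1, 0.1]),
       np.array([0.25, 0.25, 0.25, 0.25])]
cand = [np.array([0.1, 0.7, 0.1, 0.1]),       # disagrees at position 0
        np.array([0.25, 0.25, 0.25, 0.25])]   # agrees at position 1

identical_score = infolm_style_score(ref, ref)   # 0.0: identical distributions
different_score = infolm_style_score(cand, ref)  # > 0: distributions diverge
```

Swapping in other information measures (e.g., Rényi or alpha-divergences) at the `kl_divergence` slot is what lets the metric family be tuned to different evaluation criteria. A full implementation of InfoLM ships with TorchMetrics (`torchmetrics.text.infolm`).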
AkihikoWatanabe
changed the title
InfoLM: A New Metric to Evaluate Summarization & Data2Text Generation, Colombo+, AAAI'22
InfoLM: A New Metric to Evaluate Summarization & Data2Text Generation, Pierre Colombo+, N/A, arXiv'21
Aug 13, 2023
AkihikoWatanabe
changed the title
InfoLM: A New Metric to Evaluate Summarization & Data2Text Generation, Pierre Colombo+, N/A, arXiv'21
InfoLM: A New Metric to Evaluate Summarization & Data2Text Generation, Pierre Colombo+, N/A, AAAI'22
Aug 13, 2023
URL
Affiliations
Abstract
Translation (by gpt-3.5-turbo)
Summary (by gpt-3.5-turbo)