Skip to content
main
Switch branches/tags
Code

Latest commit

 

Git stats

Files

Permalink
Failed to load latest commit information.
Type
Name
Latest commit message
Commit time
fig
Jun 22, 2021

This is the Repo for the paper: BARTScore: Evaluating Generated Text as Text Generation

Updates

  • 2021.06.28 Release online evaluation Demo
  • 2021.06.25 Release online Explainable Leaderboard for Meta-evaluation
  • 2021.06.22 Code will be released soon

Background

There is a recent trend that leverages neural models for automated evaluation in different ways, as shown in Fig.1.

(a) Evaluation as matching task. Unsupervised matching metrics aim to measure the semantic equivalence between the reference and hypothesis by using a token-level matching functions in distributed representation space (e.g. BERT) or discrete string space (e.g. ROUGE).

(b) Evaluation as regression task. Regression-based metrics (e.g. BLEURT) introduce a parameterized regression layer, which would be learned in a supervised fashion to accurately predict human judgments.

(c) Evaluation as ranking task. Ranking-based metrics (e.g. COMET) aim to learn a scoring function that assigns a higher score to better hypotheses than to worse ones.

(d) Evaluation as generation task. In this work, we formulate evaluating generated text as a text generation task from pre-trained language models.

About

BARTScore: Evaluating Generated Text as Text Generation

Resources

Releases

No releases published

Packages

No packages published