Sentence Pairs Evaluation Tool - Direct Assessment

This project has received funding from the European Union’s Horizon 2020 research and innovation programme under grant agreement No 825299.

This tool was built as part of the GoURMET Project to carry out Direct Assessment evaluation of machine translation models and is open sourced under GPL v3. Issues should be raised via the GitHub issues. Code changes can be proposed by opening a pull request.

Contents

  1. What is Direct Assessment?
  2. Admin Guide
  3. Developer Guide
  4. User Guide

What is Direct Assessment?

Direct Assessment is a standard evaluation approach used in academic research to assess translation quality. It differs from automatic metrics such as BLEU in that the evaluation is carried out by a human rather than an algorithm. The goal of Direct Assessment is to evaluate a translation model by asking a human to compare the quality of a machine-translated sentence against a human translation of the same sentence, where the human translation is assumed to be the gold standard. Each evaluation case therefore requires a set of three sentences:

  1. A sentence in the source language
  2. The same sentence translated into the target language by a human
  3. The same sentence translated into the target language by a machine
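Each evaluation case above can be represented as a simple record. The sketch below is only illustrative; the class and field names are assumptions for this explanation, not the data model actually used by the tool.

```python
from dataclasses import dataclass


@dataclass
class SentenceTriple:
    """One Direct Assessment evaluation case (illustrative sketch)."""
    source: str          # 1. sentence in the source language
    human_target: str    # 2. human (gold-standard) translation
    machine_target: str  # 3. machine translation under evaluation


# Example case (hypothetical German-to-English pair):
triple = SentenceTriple(
    source="Das ist ein Test.",
    human_target="This is a test.",
    machine_target="This is test.",
)
```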

The evaluator is shown the human-translated sentence and the machine-translated sentence and asked to rate, on a scale from 0 to 100:

  1. Whether the machine-translated sentence adequately expresses the meaning of the human-translated sentence.
  2. Whether the machine-translated sentence is a well-formed phrase or sentence that is grammatically and idiomatically correct.
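The two ratings above can be sketched as a small validation helper. This is a hypothetical function written for illustration, assuming both criteria are scored independently on the same 0 to 100 scale; it is not part of the tool's actual API.

```python
def record_rating(adequacy: int, fluency: int) -> dict:
    """Validate and package one evaluator's Direct Assessment scores.

    adequacy: 0-100 rating of how well the machine translation expresses
              the meaning of the human translation.
    fluency:  0-100 rating of how grammatically and idiomatically
              well-formed the machine translation is.
    """
    for name, score in (("adequacy", adequacy), ("fluency", fluency)):
        if not 0 <= score <= 100:
            raise ValueError(f"{name} must be between 0 and 100, got {score}")
    return {"adequacy": adequacy, "fluency": fluency}


# A rating of 85/90 is accepted; a score outside 0-100 raises ValueError.
rating = record_rating(adequacy=85, fluency=90)
```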

A more in-depth explanation of Direct Assessment can be found in the papers Continuous Measurement Scales in Human Evaluation of Machine Translation and Is all that Glitters in Machine Translation Quality Estimation really Gold?.
