# BLEU Score

The Bilingual Evaluation Understudy Score, or BLEU for short, is used to evaluate a generated sentence against a reference sentence. 

A perfect match gives a score of 1.0, whereas a perfect mismatch gives a score of 0.0. 

The metric was developed for evaluating the output of automatic machine translation systems and is now routinely used for Neural Machine Translation (NMT) models. Although not perfect, it offers five benefits:
* It is quick and easy to calculate.
* It is easily understandable.
* It is language independent.
* It correlates highly with human evaluation.
* It has been widely adopted (for example, in 'Attention Is All You Need').

The approach works by counting the n-grams in the candidate translation that match n-grams in the reference text, where a 1-gram (or unigram) is a single token and a bigram is a pair of adjacent tokens. The comparison is made regardless of where in the reference the n-gram occurs.
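To make the counting concrete, here is a minimal pure-Python sketch of n-gram matching. The helper names `ngrams` and `count_matches` and the example sentences are assumptions made for illustration; they are not part of NLTK.

```python
def ngrams(tokens, n):
    """Return all contiguous n-grams of a token list as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def count_matches(candidate, reference, n):
    """Count candidate n-grams that occur anywhere in the reference.

    Returns (matches, total candidate n-grams). Position is ignored:
    an n-gram matches if it appears at any point in the reference.
    """
    reference_ngrams = set(ngrams(reference, n))
    candidate_ngrams = ngrams(candidate, n)
    matches = sum(1 for gram in candidate_ngrams if gram in reference_ngrams)
    return matches, len(candidate_ngrams)

# Illustrative sentences (hypothetical data).
candidate = ['the', 'cat', 'sat', 'on', 'the', 'mat']
reference = ['the', 'cat', 'is', 'on', 'the', 'mat']

print(count_matches(candidate, reference, 1))  # (5, 6)
print(count_matches(candidate, reference, 2))  # (3, 5)
```

Five of the six unigrams match (only 'sat' is missing from the reference), but only three of the five bigrams do, which shows how higher-order n-grams penalize local differences more strongly.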

The counting of matching n-grams is modified to take into account how often each n-gram occurs in the reference text, so that a candidate translation is not rewarded for generating an abundance of reasonable words. This is often referred to as modified n-gram precision.
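The clipping idea can be sketched in a few lines of plain Python. The function `clipped_precision` and the example sentences below are assumptions made for illustration, not NLTK's implementation: each candidate n-gram is credited at most as many times as it appears in any single reference.

```python
from collections import Counter

def ngram_counts(tokens, n):
    """Count the contiguous n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def clipped_precision(references, candidate, n):
    """Return (clipped matches, total candidate n-grams)."""
    candidate_counts = ngram_counts(candidate, n)
    # For each n-gram, keep the largest count seen in any one reference.
    max_reference_counts = Counter()
    for reference in references:
        max_reference_counts |= ngram_counts(reference, n)
    # Clip each candidate count to that maximum.
    clipped = sum(min(count, max_reference_counts[gram])
                  for gram, count in candidate_counts.items())
    return clipped, sum(candidate_counts.values())

# A degenerate candidate that repeats one plausible word (hypothetical data).
candidate = ['the'] * 7
references = [
    ['the', 'cat', 'is', 'on', 'the', 'mat'],
    ['there', 'is', 'a', 'cat', 'on', 'the', 'mat'],
]

# Without clipping, every candidate token is found in a reference.
unclipped = sum(1 for token in candidate
                if any(token in reference for reference in references))
print(unclipped, len(candidate))                   # 7 7
print(clipped_precision(references, candidate, 1))  # (2, 7)
```

Without clipping the candidate would score a unigram precision of 7/7. With clipping it scores 2/7, because 'the' appears at most twice in any one reference.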

A perfect score is not achievable in practice, as a translation would have to match the reference exactly, which even human translators cannot do in most cases. The number and quality of the references used to calculate the BLEU score is a key factor, which means that comparing scores across datasets can be troublesome.

In addition to translation, we can use the BLEU score for other language generation problems with deep learning methods such as:

* Language generation.
* Image caption generation.
* Text summarization.
* Speech recognition.


NLTK provides an implementation of the BLEU score, `sentence_bleu`, that can be used to evaluate a generated sentence against one or more references.

In [1]:
from nltk.translate.bleu_score import sentence_bleu
# A list of tokenized reference sentences (one or more).
reference = [['this', 'is', 'a', 'test'], ['this', 'is', 'test']]
# The tokenized candidate sentence to score.
candidate = ['this', 'is', 'a', 'test']
score = sentence_bleu(reference, candidate)
print(score)

1.0
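An imperfect candidate gives a score below 1.0. The sketch below assumes NLTK is installed and passes `weights=(1, 0, 0, 0)` so that only unigrams contribute to the score. The variable names and sentences are illustrative, and NLTK may emit a warning about n-gram orders with zero overlap.

```python
from nltk.translate.bleu_score import sentence_bleu

# One tokenized reference and a shorter candidate that drops a word.
short_reference = [['this', 'is', 'a', 'test']]
short_candidate = ['this', 'is', 'test']

# weights=(1, 0, 0, 0) puts all the weight on unigram precision.
unigram_score = sentence_bleu(short_reference, short_candidate,
                              weights=(1, 0, 0, 0))
print(unigram_score)
```

Every candidate token appears in the reference, yet the score is still below 1.0 because BLEU also penalizes candidates that are shorter than the reference.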
