Encoder:
A network model that reads the input photograph and encodes its content
into a fixed-length vector, which serves as the internal representation.

Decoder:
A network model that reads the encoded photograph and generates the textual
description as output.

# BLEU

BLEU, or the Bilingual Evaluation Understudy, is a score for comparing a candidate translation of text to one or more reference translations. Although developed for translation, it can be used to evaluate text generated for a suite of natural language processing tasks.
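
The quantity at the heart of BLEU is a clipped ("modified") n-gram precision: each candidate n-gram is credited at most as many times as it appears in any single reference. The following is a minimal sketch in plain Python to make that idea concrete. The helper names `ngrams` and `modified_precision` are illustrative assumptions, not part of NLTK, and the sketch leaves out the brevity penalty and any smoothing.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return all contiguous n-grams of a token list as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def modified_precision(references, candidate, n):
    """Clipped n-gram precision of a candidate against one or more references."""
    cand_counts = Counter(ngrams(candidate, n))
    if not cand_counts:
        # Candidate is shorter than n tokens, so there is nothing to match.
        return 0.0
    # For each n-gram, the largest count seen in any single reference.
    max_ref_counts = Counter()
    for ref in references:
        for gram, count in Counter(ngrams(ref, n)).items():
            max_ref_counts[gram] = max(max_ref_counts[gram], count)
    # Clip each candidate count by the reference maximum.
    clipped = sum(min(count, max_ref_counts[gram])
                  for gram, count in cand_counts.items())
    return clipped / sum(cand_counts.values())


references = [['this', 'is', 'small', 'test']]
candidate = ['this', 'is', 'a', 'test']
print(modified_precision(references, candidate, 1))
print(modified_precision(references, candidate, 2))
```

Here three of the four candidate unigrams and one of the three candidate bigrams appear in the reference.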

#### How to calculate BLEU Score using sentence_bleu

In [1]:
from nltk.translate.bleu_score import sentence_bleu
reference = [['this', 'is', 'a', 'test'], ['this', 'is', 'test']]
candidate = [ 'this', 'is', 'a', 'test']
score = sentence_bleu(reference, candidate)
print(score)

1.0


#### How to calculate BLEU Score using corpus_bleu

In [2]:
# two references for one document
from nltk.translate.bleu_score import corpus_bleu
references = [[['this', 'is', 'a', 'test'], ['this', 'is', 'test']]]
candidates = [['this', 'is', 'a', 'test']]
score = corpus_bleu(references, candidates)
print(score)

1.0


#### 1-gram individual BLEU

In [4]:
from nltk.translate.bleu_score import sentence_bleu
reference = [['this', 'is', 'small', 'test']]
candidate = ['this', 'is', 'a', 'test']
score = sentence_bleu(reference, candidate, weights=(1, 0, 0, 0))
print(score)

0.75


#### n-gram individual BLEU

In [5]:
from nltk.translate.bleu_score import sentence_bleu
reference = [['this', 'is', 'a', 'test']]
candidate = ['this', 'is', 'a', 'test']
# an individual n-gram score puts all of the weight on a single n-gram order
print('Individual 1-gram: %f ' % sentence_bleu(reference, candidate, weights=(1, 0, 0, 0)))
print('Individual 2-gram: %f ' % sentence_bleu(reference, candidate, weights=(0, 1, 0, 0)))
print('Individual 3-gram: %f ' % sentence_bleu(reference, candidate, weights=(0, 0, 1, 0)))
print('Individual 4-gram: %f ' % sentence_bleu(reference, candidate, weights=(0, 0, 0, 1)))



 Individual 1-gram: 1.000000 
 Individual 2-gram: 1.000000 
 Individual 3-gram: 1.000000 
 Individual 4-gram: 1.000000 


#### 4-gram cumulative BLEU

In [7]:
from nltk.translate.bleu_score import sentence_bleu
reference = [['this', 'is', 'small', 'test']]
candidate = ['this', 'is', 'a', 'test']
score = sentence_bleu(reference, candidate, weights=(0.25, 0.25, 0.25, 0.25))
print(score)


1.0547686614863434e-154


#### cumulative BLEU scores

In [8]:
from nltk.translate.bleu_score import sentence_bleu
reference = [['this', 'is', 'small', 'test']]
candidate = ['this', 'is', 'a', 'test']
print('Cumulative 1-gram: %f' % sentence_bleu(reference, candidate, weights=(1, 0, 0, 0)))
print('Cumulative 2-gram: %f' % sentence_bleu(reference, candidate, weights=(0.5, 0.5, 0, 0)))
print('Cumulative 3-gram: %f' % sentence_bleu(reference, candidate, weights=(1/3, 1/3, 1/3, 0)))
print('Cumulative 4-gram: %f' % sentence_bleu(reference, candidate, weights=(0.25, 0.25, 0.25, 0.25)))


Cumulative 1-gram: 0.750000
Cumulative 2-gram: 0.500000
Cumulative 3-gram: 0.000000
Cumulative 4-gram: 0.000000
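
The cumulative scores above can be checked by hand: a cumulative BLEU score is the weighted geometric mean of the individual n-gram precisions, multiplied by the brevity penalty (which is 1 here because the candidate and reference have the same length). The sketch below hard-codes the precisions for this example (3 of 4 unigrams, 1 of 3 bigrams, 0 of 2 trigrams, 0 of 1 four-grams match). The function name `combine_precisions` is an illustrative assumption, and returning exactly 0 when a weighted precision is zero is a simplification; a library may handle that case differently, for example with smoothing.

```python
import math


def combine_precisions(precisions, weights, brevity_penalty=1.0):
    """Weighted geometric mean of n-gram precisions, scaled by the brevity penalty."""
    active = [(p, w) for p, w in zip(precisions, weights) if w > 0]
    if any(p == 0 for p, _ in active):
        # A zero precision drives the geometric mean to zero.
        return 0.0
    log_mean = sum(w * math.log(p) for p, w in active)
    return brevity_penalty * math.exp(log_mean)


# Precisions for candidate "this is a test" against reference "this is small test".
precisions = [3 / 4, 1 / 3, 0 / 2, 0 / 1]
weight_sets = [
    (1, 0, 0, 0),
    (0.5, 0.5, 0, 0),
    (1 / 3, 1 / 3, 1 / 3, 0),
    (0.25, 0.25, 0.25, 0.25),
]
for n, weights in enumerate(weight_sets, start=1):
    print('Cumulative %d-gram: %f' % (n, combine_precisions(precisions, weights)))
```

The 2-gram case works out to sqrt(0.75 × 1/3) = sqrt(0.25) = 0.5, which agrees with the printed score.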


# Worked Examples

In [9]:
# perfect match
from nltk.translate.bleu_score import sentence_bleu
reference = [['the', 'quick', 'brown', 'fox', 'jumped', 'over', 'the', 'lazy', 'dog']]
candidate = ['the', 'quick', 'brown', 'fox', 'jumped', 'over', 'the', 'lazy', 'dog']
score = sentence_bleu(reference, candidate)
print(score)

1.0


In [10]:
# one word different
from nltk.translate.bleu_score import sentence_bleu
reference = [['the', 'quick', 'brown', 'fox', 'jumped', 'over', 'the', 'lazy', 'dog']]
candidate = ['the', 'fast', 'brown', 'fox', 'jumped', 'over', 'the', 'lazy', 'dog']
score = sentence_bleu(reference, candidate)
print(score)

0.7506238537503395


In [11]:
# two words different
from nltk.translate.bleu_score import sentence_bleu
reference = [['the', 'quick', 'brown', 'fox', 'jumped', 'over', 'the', 'lazy', 'dog']]
candidate = ['the', 'fast', 'brown', 'fox', 'jumped', 'over', 'the', 'sleepy', 'dog']
score = sentence_bleu(reference, candidate)
print(score)

0.4854917717073234


In [12]:
# all words different
from nltk.translate.bleu_score import sentence_bleu
reference = [['the', 'quick', 'brown', 'fox', 'jumped', 'over', 'the', 'lazy', 'dog']]
candidate = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i']
score = sentence_bleu(reference, candidate)
print(score)

0


In [13]:
# shorter candidate
from nltk.translate.bleu_score import sentence_bleu
reference = [['the', 'quick', 'brown', 'fox', 'jumped', 'over', 'the', 'lazy', 'dog']]
candidate = ['the', 'quick', 'brown', 'fox', 'jumped', 'over', 'the']
score = sentence_bleu(reference, candidate)
print(score)


0.7514772930752859
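
Every n-gram in this shorter candidate also occurs in the reference, so all four precisions equal 1 and the whole score comes from BLEU's brevity penalty. A minimal sketch of the standard penalty follows, assuming a single reference; with several references the usual choice is the reference length closest to the candidate length. The function name `brevity_penalty` is chosen here for illustration.

```python
import math


def brevity_penalty(reference_length, candidate_length):
    """Standard BLEU brevity penalty for a candidate of the given length."""
    if candidate_length == 0:
        return 0.0
    if candidate_length > reference_length:
        # Candidates longer than the reference are not penalized here.
        return 1.0
    return math.exp(1 - reference_length / candidate_length)


# Reference has 9 tokens, the shorter candidate has 7.
print(brevity_penalty(9, 7))
```

For a 9-token reference and a 7-token candidate this gives exp(1 - 9/7), roughly 0.7515, which agrees with the score printed above.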


In [14]:
# longer candidate
from nltk.translate.bleu_score import sentence_bleu
reference = [['the', 'quick', 'brown', 'fox', 'jumped', 'over', 'the', 'lazy', 'dog']]
candidate = ['the', 'quick', 'brown', 'fox', 'jumped', 'over', 'the', 'lazy', 'dog', 'from', 'space']
score = sentence_bleu(reference, candidate)
print(score)


0.7860753021519787
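
A longer candidate gets no brevity penalty, but the two extra words add n-grams that match nothing in the reference, which lowers every precision. The arithmetic can be sketched with exact fractions, with the match counts worked out by hand for this example: 9 of 11 unigrams, 8 of 10 bigrams, 7 of 9 trigrams and 6 of 8 four-grams match.

```python
from fractions import Fraction

# Hand-counted n-gram precisions for the 11-token candidate (n = 1..4).
precisions = [Fraction(9, 11), Fraction(8, 10), Fraction(7, 9), Fraction(6, 8)]

product = Fraction(1)
for p in precisions:
    product *= p

# Equal weights of 0.25 make the score the fourth root of the product;
# the brevity penalty is 1 because the candidate is longer than the reference.
score = float(product) ** 0.25
print(product, score)
```

The product simplifies to 21/55, and its fourth root is roughly 0.786, in line with the score printed above.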


In [15]:
# very short
from nltk.translate.bleu_score import sentence_bleu
reference = [['the', 'quick', 'brown', 'fox', 'jumped', 'over', 'the', 'lazy', 'dog']]
candidate = ['the', 'quick']
score = sentence_bleu(reference, candidate)
print(score)


4.5044474950870215e-156
