# Evaluation

This notebook is dedicated to the evaluation of the data. It uses the widely used BLEU metric, which is described in the original paper [here](https://www.aclweb.org/anthology/P02-1040.pdf).

## Import
The libraries used in this notebook are imported below.

In [93]:
import numpy as np
from nltk.util import ngrams

## Algorithm

In [138]:
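def bleu_text(target_text: list, reference_text: list, n_precision=4, smoothing=0):
    """Compute a BLEU-style score: the geometric mean of the clipped n-gram precisions.

    target_text and reference_text are lists of the same length, where each inner list is a
    tokenized sentence and each target sentence has exactly one reference. n_precision is the
    highest n-gram order included. If smoothing is nonzero, it replaces a match count of zero
    for a given sentence and n-gram order. No brevity penalty is applied."""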
    temp = np.zeros((n_precision, 2)) #each row is an n(-gram), column 0 is counted matches, column 1 is total n-grams
    for i in range(0, len(target_text)):
        for j in range(0, n_precision):
            count = 0
            target = list(ngrams(target_text[i], j+1))
            reference = list(ngrams(reference_text[i], j+1))
            for x in target:
                #to make sure no n-gram is used as a match twice, it is removed from the reference list
                if x in reference:
                    reference.remove(x)
                    count += 1
            #if a smoothing is specified, and for a certain n-gram 0 counts were found, it is smoothed over
            if smoothing != 0 and count == 0:
                count = smoothing
            temp[j] += [count, len(target)]
    #the number of matches is divided by the total number of n-grams
    #note: an n-gram order for which no target sentence has any n-grams causes a division by zero
    result = temp[:,0]/temp[:,1]
    #the final result is the geometric mean of the n-gram precisions
    return result.prod()**(1/len(result))
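
The clipped matching in the function above (removing each matched n-gram from the reference list so it cannot match twice) can also be expressed as a multiset intersection. The following is a minimal sketch of that idea, not a replacement for `bleu_text`: the helper names `ngram_counts`, `clipped_precisions` and `geometric_mean` are hypothetical, n-grams are built with plain `zip` instead of `nltk.util.ngrams`, smoothing is left out, and an n-gram order with no target n-grams is assumed to score 0.0 instead of triggering a division by zero.

```python
from collections import Counter
from math import prod


def ngram_counts(tokens, n):
    """Count the n-grams of order n in a list of tokens (hypothetical helper)."""
    return Counter(zip(*(tokens[k:] for k in range(n))))


def clipped_precisions(target_text, reference_text, n_precision=4):
    """Return the clipped n-gram precision for each order 1..n_precision."""
    matches = [0] * n_precision
    totals = [0] * n_precision
    for target, reference in zip(target_text, reference_text):
        for n in range(1, n_precision + 1):
            target_counts = ngram_counts(target, n)
            reference_counts = ngram_counts(reference, n)
            # Counter intersection keeps the minimum count of each n-gram,
            # so an n-gram matches at most as often as it occurs in the reference
            matches[n - 1] += sum((target_counts & reference_counts).values())
            totals[n - 1] += sum(target_counts.values())
    # Assumption: an order without any target n-grams scores 0.0
    return [m / t if t else 0.0 for m, t in zip(matches, totals)]


def geometric_mean(values):
    """Geometric mean of a non-empty list of numbers."""
    return prod(values) ** (1 / len(values))


target = [["the", "cat", "sat", "on", "the", "mat"]]
reference = [["the", "cat", "is", "on", "the", "mat"]]
precisions = clipped_precisions(target, reference, n_precision=2)
print(precisions)                  # 5 of 6 unigrams and 3 of 5 bigrams match
print(geometric_mean(precisions))  # square root of (5/6 * 3/5) = 0.5
```

In this example five of the six target unigrams and three of the five target bigrams occur in the reference, so the precisions are 5/6 and 3/5, and their geometric mean is the square root of 0.5, roughly 0.707. Counting with `Counter` avoids the repeated list searches of the `remove`-based loop, which may matter for long sentences.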