## CS5803 NLP
### Assignment 1
#### Tanmay Garg, Tanmay Goyal, Tanay Yadav
#### Roll no: CS20BTECH11063, AI20BTECH11021, AI20BTECH11026

### **BLEU Score**

Let $x$ and $y$ be two sentences we wish to compare. Then, we define the modified $n$-gram precision as:

$$p_n = \frac{\sum\limits_{\text{n-gram} \in x\cap y}\min\left(\text{count}_x(\text{n-gram}), \text{count}_y(\text{n-gram})\right)}{\sum\limits_{\text{n-gram} \in x}\text{count}_x(\text{n-gram})}$$

Here, $y$ is the ground truth (the reference) and $x$ is the machine-translated text (the candidate).
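
To make the effect of the clipped count concrete, the following minimal sketch computes the modified unigram precision for a toy pair. The sentences and the helper names `ngram_counts` and `clipped_precision` are illustrative assumptions, not part of the assignment, and `collections.Counter` is used only as a convenience.

```python
from collections import Counter

def ngram_counts(tokens, n):
    """Count every contiguous n-gram (as a tuple) in a list of tokens."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def clipped_precision(x_tokens, y_tokens, n):
    """Modified n-gram precision p_n of candidate x against reference y."""
    count_x = ngram_counts(x_tokens, n)
    count_y = ngram_counts(y_tokens, n)
    # Counter & Counter keeps, for each shared n-gram, the minimum of the two counts
    clipped = sum((count_x & count_y).values())
    total = sum(count_x.values())
    return clipped / total if total else 0.0

# Hypothetical toy pair: the candidate repeats a word that occurs only twice in the reference
x_tokens = "the the the the".split()
y_tokens = "the cat is on the mat".split()

p1 = clipped_precision(x_tokens, y_tokens, 1)
# Unclipped variant for comparison: every candidate token found in the reference counts
unclipped = sum(1 for tok in x_tokens if tok in y_tokens) / len(x_tokens)
print(p1, unclipped)
```

Without the minimum, this degenerate candidate would receive a perfect precision of 1.0; clipping caps the credit for "the" at the two occurrences present in the reference, giving 0.5.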

Define the BLEU Score as:

$$BLEU = BP \times \exp\left(\sum\limits_{n=1}^N w_n \log(p_n)\right)$$
where $N = 4$, $w_n = \frac{1}{N}$, and BP is the Brevity Penalty, which we set to 1.
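
With uniform weights $w_n = \frac{1}{N}$, the exponential of the weighted log sum is the geometric mean of the precisions, so a single zero precision drives the whole score to zero. The sketch below checks this numerically. The precision values and the function name `combine_precisions` are made up for illustration, and returning 0 when any $p_n = 0$ is one common convention assumed here, not something the definition above prescribes.

```python
from math import exp, log, prod

def combine_precisions(precisions, bp=1.0):
    """BP * exp(sum_n w_n * log(p_n)) with uniform weights w_n = 1/N."""
    # Assumption: treat the score as 0 when any precision is 0, since log(0) is undefined
    if any(p == 0 for p in precisions):
        return 0.0
    n = len(precisions)
    return bp * exp(sum(log(p) / n for p in precisions))

# Hypothetical precisions p_1..p_4, chosen only for illustration
precisions = [0.8, 0.6, 0.4, 0.2]

score = combine_precisions(precisions)
geometric_mean = prod(precisions) ** (1 / len(precisions))
print(score, geometric_mean)  # the two values agree up to floating-point rounding

# A candidate with no matching trigrams or 4-grams scores zero overall
print(combine_precisions([0.8, 0.6, 0.0, 0.0]))
```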

1. Implement the BLEU score metric and pre-process the text by lower-casing it and removing all punctuation.

2. Use this implementation to find the BLEU score when $x$ = "The boys were playing happily on the ground." and $y$ = "The boys were playing football on the field."

3. Explain why we take a minimum in the numerator.

4. Use the implementation to find the BLEU score for 5 pairs of sentences and explain the disadvantage of the BLEU score.
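
As a hand check for task 2 (a sketch, assuming the pre-processing in task 1 leaves eight tokens in each sentence), the clipped counts can be worked out directly. The shared n-grams are six unigrams ("the" twice, "boys", "were", "playing", "on"), four bigrams ("the boys", "boys were", "were playing", "on the"), two trigrams ("the boys were", "boys were playing"), and one 4-gram ("the boys were playing"). The candidate $x$ contains 8, 7, 6, and 5 n-grams for $n = 1, 2, 3, 4$ respectively, so:

$$p_1 = \frac{6}{8}, \quad p_2 = \frac{4}{7}, \quad p_3 = \frac{2}{6}, \quad p_4 = \frac{1}{5}$$

$$BLEU = 1 \times \left(\frac{6}{8}\cdot\frac{4}{7}\cdot\frac{2}{6}\cdot\frac{1}{5}\right)^{1/4} = \left(\frac{1}{35}\right)^{1/4} \approx 0.411$$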

In [20]:
import re
from collections import defaultdict
from math import log, exp  
import numpy as np

def preprocess_text(text):
    '''
    Function to preprocess the text by removing punctuations and converting to lower case
    '''
    # [^\w\s] -> ^ means except , \w refers to any alphanumeric character and \s refers to whitespace
    text = re.sub(r'[^\w\s]', '', text).lower()
    return text

def n_gram_dict(text , n):
    '''
    Function to create a dictionary mapping each n-gram (a tuple of tokens) to its count
    '''
    # split() with no argument collapses repeated whitespace, so no empty tokens are produced
    text_list = text.split()
    counts = {}
    # slide a window of size n over the tokens; a repeated n-gram increments its count
    for i in range(n-1 , len(text_list)):
        key = tuple(text_list[i-n+1 : i+1])
        counts[key] = counts.get(key, 0) + 1
    return counts

def modified_ngram_precision(n_gram_dict_x, n_gram_dict_y):
    '''
    Calculates the modified n-gram precision given the n-gram dictionaries for x and y
    '''
    numerator = 0
    denominator = 0
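    for n_gram in n_gram_dict_x.keys():
        denominator += n_gram_dict_x[n_gram]
        if n_gram in n_gram_dict_y.keys():
            # clip the count of each n-gram in x by its count in y
            numerator += min(n_gram_dict_x[n_gram], n_gram_dict_y[n_gram])

    # x has no n-grams of this order when it is shorter than n tokens
    if denominator == 0:
        return 0.0
    return numerator/denominator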
    
def bleu_score(x, y, N = 4):
    '''
    Function to calculate the BLEU score
    '''
    # preprocessing the text
    x = preprocess_text(x)
    y = preprocess_text(y)
    
    BP = 1
    weights = [1/N for i in range(N)] 

    modified_n_gram_list = []

    for i in range(1, N+1):
        # creating the n-gram dictionaries
        n_gram_dict_x = n_gram_dict(x, i)
        n_gram_dict_y = n_gram_dict(y, i)
        modified_n_gram_list.append(modified_ngram_precision(n_gram_dict_x , n_gram_dict_y))
        
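    # log(0) is undefined: if any precision is zero, the geometric mean (and hence BLEU) is 0
    if min(modified_n_gram_list) == 0:
        return 0.0

    score = 0
    for (w , p) in zip(weights , modified_n_gram_list):
        score += w * log(p)
    return BP * exp(score)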

x = "The boys were playing happily on the ground."  
y = "The boys were playing football on the field."

print(bleu_score(x, y , 4))


1
{('the',): 2, ('boys',): 1, ('were',): 1, ('playing',): 1, ('happily',): 1, ('on',): 1, ('ground',): 1}
{('the',): 2, ('boys',): 1, ('were',): 1, ('playing',): 1, ('football',): 1, ('on',): 1, ('field',): 1}
2
{('the', 'boys'): 1, ('boys', 'were'): 1, ('were', 'playing'): 1, ('playing', 'happily'): 1, ('happily', 'on'): 1, ('on', 'the'): 1, ('the', 'ground'): 1}
{('the', 'boys'): 1, ('boys', 'were'): 1, ('were', 'playing'): 1, ('playing', 'football'): 1, ('football', 'on'): 1, ('on', 'the'): 1, ('the', 'field'): 1}
3
{('the', 'boys', 'were'): 1, ('boys', 'were', 'playing'): 1, ('were', 'playing', 'happily'): 1, ('playing', 'happily', 'on'): 1, ('happily', 'on', 'the'): 1, ('on', 'the', 'ground'): 1}
{('the', 'boys', 'were'): 1, ('boys', 'were', 'playing'): 1, ('were', 'playing', 'football'): 1, ('playing', 'football', 'on'): 1, ('football', 'on', 'the'): 1, ('on', 'the', 'field'): 1}
4
{('the', 'boys', 'were', 'playing'): 1, ('boys', 'were', 'playing', 'happily'): 1, ('were', 'playin