# CodeBert Grid Experiment Evaluation

Nice to see you around! Have a seat.
Would you like a drink? Maybe a cigar?

Make sure all required dependencies are installed; they are listed in [environment.yml](./environment.yml).
You can create a conda environment from that file with:

```
conda env create -f environment.yml
conda activate Lampion-Codebert-Evaluation
```

Make sure to run your Jupyter Notebook from that environment! 
Otherwise you are (still) missing the dependencies. 

**OPTIONALLY**, you can update the environment in which your Jupyter notebook is already running: start a new terminal (from Jupyter) and run

```
conda env update --prefix ./env --file environment.yml  --prune
```

In [None]:
import os
import pandas as pd
import matplotlib.pyplot as plt

import nltk
nltk.download("punkt")
# Homebrew Imports (python-file next to this)
import bleu_evaluator as foreign_bleu

## Data-Loading / Preparation

Make sure that your dataset is laid out as described in the [Readme](./README.md), that is:

```
./data
    /GridExp_XY
        /configs
            /reference
                test_0.gold
                test_0.output
                bleu.txt (optional, can be created below)
            /config_0
                config.properties
                test_0.gold
                test_0.output
                bleu.txt (optional, can be created below)
            /config_1
                config.properties
                test_0.gold
                test_0.output
                bleu.txt (optional, can be created below)
    ...
```

where the configs **must** be numbered to be correctly detected. 
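
The numbering matters because the plotting cell further down looks up `config_0`, `config_1`, ... by index, so a gap in the numbers would end in a `KeyError`. A minimal sketch for checking the layout up front could look like this; `find_config_numbers` and `missing_config_numbers` are hypothetical helpers and not part of this notebook or of `bleu_evaluator`:

```python
import os
import re

# Hypothetical helpers (not part of the notebook): they only inspect folder names.
CONFIG_PATTERN = re.compile(r"^config_(\d+)$")

def find_config_numbers(data_directory):
    """Return the sorted numbers of all `config_<n>` folders that contain a test_0.gold."""
    numbers = []
    for root, _dirs, files in os.walk(data_directory):
        match = CONFIG_PATTERN.match(os.path.basename(root))
        if match and "test_0.gold" in files:
            numbers.append(int(match.group(1)))
    return sorted(numbers)

def missing_config_numbers(numbers):
    """Return every number that is absent from the gap-free range 0..max(numbers)."""
    if not numbers:
        return []
    return sorted(set(range(max(numbers) + 1)) - set(numbers))
```

Calling `missing_config_numbers(find_config_numbers("./data/PreliminaryResults"))` should give an empty list for a complete data package.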

In [None]:
# This runs the BLEU scoring on the config folders, creating the bleu.txt files.
# If your data package already includes the bleu.txt files, you don't need to do this.
# Existing bleu.txt files will be overwritten.

#!./metric_runner.sh ./data/PreliminaryResults/

In [None]:
data_directory = "./data/PreliminaryResults"

print(f"looking for results in {data_directory}" )

results={}

for root,dirs,files in os.walk(data_directory):
    for name in files:
        if ".gold" in name:
            directory = os.path.basename(root)
            results[directory]={}
            
            results[directory]["result_file"]=os.path.join(root,"test_0.output")
            results[directory]["gold_file"]=os.path.join(root,"test_0.gold")
            results[directory]["bleu_file"]=os.path.join(root,"bleu.txt")
            if os.path.exists(os.path.join(root,"config.properties")):
                results[directory]["property_file"]=os.path.join(root,"config.properties")

In [None]:
def load_properties(filepath, sep='=', comment_char='#'):
    """
    Read the file passed as parameter as a properties file.
    """
    props = {}
    with open(filepath, "rt") as f:
        for line in f:
            l = line.strip()
            if l and not l.startswith(comment_char):
                key_value = l.split(sep)
                key = key_value[0].strip()
                value = sep.join(key_value[1:]).strip().strip('"') 
                props[key] = value 
    return props

print("reading in property-files")

for key in results.keys():
    if "property_file" in results[key].keys():
        results[f"{key}"]["properties"]=load_properties(results[key]["property_file"])

print("done reading the properties")

In [None]:
print("reading in result-files")

for key in results.keys():
    result_file = results[key]["result_file"]
    results[key]["results"]={}
    # Each line has the form "<index>\t<text>"
    with open(result_file) as f:
        for l in f:
            parts = l.split("\t")
            results[key]["results"][int(parts[0])] = parts[1].strip()

    gold_file = results[key]['gold_file']
    results[key]["gold_results"]={}
    with open(gold_file) as gf:
        for gl in gf:
            parts = gl.split("\t")
            results[key]["gold_results"][int(parts[0])] = parts[1].strip()

print("done reading the result files")
# Comment this in for inspection of results
#results

In [None]:
print("reading in the bleu-scores")

for key in results.keys():
    bleu_file = results[key]["bleu_file"]
    # The score is expected on the first line of bleu.txt
    with open(bleu_file) as f:
        results[key]["bleu"]=float(f.readline())
    
print("done reading the bleu-scores")

#results["config_0"]["bleu"]

## Bleu-Scores

In the following, the BLEU scores are calculated using the foreign library.
While it contains minor changes to standard BLEU, it is the same implementation as used in the original experiment.

The aggregated BLEU scores are stored in the results.
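
For orientation, the following is a minimal sketch of textbook sentence-level BLEU (clipped n-gram precision plus brevity penalty) on a 0 to 100 scale. It is **not** the code in `bleu_evaluator`, which differs in minor ways and may give different numbers; all function names here are made up for illustration, and tokens are assumed to be plain lists of strings.

```python
import math
from collections import Counter

def ngram_counts(tokens, n):
    """Count all n-grams of length n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def modified_precision(reference, hypothesis, n):
    """Share of hypothesis n-grams that also occur in the reference (counts are clipped)."""
    hyp = ngram_counts(hypothesis, n)
    ref = ngram_counts(reference, n)
    total = sum(hyp.values())
    if total == 0:
        return 0.0
    clipped = sum(min(count, ref[gram]) for gram, count in hyp.items())
    return clipped / total

def simple_bleu(reference, hypothesis, max_n=4):
    """Unsmoothed BLEU for one reference/hypothesis pair, scaled to 0..100."""
    precisions = [modified_precision(reference, hypothesis, n) for n in range(1, max_n + 1)]
    if min(precisions) == 0:
        return 0.0  # without smoothing, a single zero precision zeroes the score
    log_mean = sum(math.log(p) for p in precisions) / max_n
    # Brevity penalty: hypotheses shorter than the reference are penalised
    if len(hypothesis) >= len(reference):
        brevity_penalty = 1.0
    else:
        brevity_penalty = math.exp(1 - len(reference) / len(hypothesis))
    return 100 * brevity_penalty * math.exp(log_mean)
```

Comparing a sentence with itself yields 100 as long as it has at least `max_n` tokens, while a hypothesis that shares no 4-gram with the reference scores 0 in this unsmoothed form.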

In [None]:
#Deprecated
"""
print("BLEU-Score of the un-altered Set:")
ref_bleu = foreign_bleu.bleuFromMaps(results["gold"]["results"],results["reference"]["results"])
print(ref_bleu[0])
print("BLEU-Score of the if-true config:")
c0_bleu = foreign_bleu.bleuFromMaps(results["gold"]["results"],results["config_0"]["results"])
print(c0_bleu[0])
print("BLEU-Score of the gold set:")
gold_bleu = foreign_bleu.bleuFromMaps(results["gold"]["results"],results["gold"]["results"])
print(gold_bleu[0],"(Should be 100)")

print("calculating bleu scores for all configs")
for key in results.keys():
    if "config" in key or key == "reference":
        bleu = foreign_bleu.bleuFromMaps(results[key]["gold_results"],results[key]["results"])
        results[key]["bleu"]=bleu
        
results["config_5"]["bleu"]
"""

In [None]:
config_names = {0:"if-true", 1:"add-var(pseudo)", 2:"add-neutral",3:"add-var(random)"}
bleu_plots = {}

num_property_files = len([1 for x in results.keys() if "property_file" in results[x].keys()])

for i in range(num_property_files):
    # Three consecutive configs share one transformation type
    config_type = config_names[i // 3]
    # Every series starts at 0 transformations with the reference score
    if config_type not in bleu_plots:
        bleu_plots[config_type] = {0: results["reference"]["bleu"]}
    c = results[f"config_{i}"]
    bleu_plots[config_type][int(c["properties"]["transformations"])] = c["bleu"]
    
bleu_df = pd.DataFrame.from_dict(bleu_plots)
bleu_df

In [None]:
plt.ylabel("BLEU-Score")
plt.xlabel("# Transformations")

plot = plt.plot(bleu_df,marker="o")

plt.legend(bleu_df.columns)
plt.show()

## Samples
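
The cells in this section print a few samples and look up the gold sentence closest to a generated one by Jaccard distance over n-grams. As a rough illustration of that measure, here is a plain-Python sketch; it assumes pre-split token lists, whereas the notebook tokenizes with `nltk.word_tokenize`, and the function names are made up for this example.

```python
def ngram_set(tokens, n=1):
    """Return the set of n-grams (as tuples) of a token list."""
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}

def jaccard_distance_sets(set_a, set_b):
    """1 - |A intersect B| / |A union B|; 0.0 means identical sets, 1.0 means disjoint sets."""
    union = set_a | set_b
    if not union:
        return 0.0  # assumption: two empty sets are treated as identical
    return 1 - len(set_a & set_b) / len(union)

a = ngram_set("a b c".split())
b = ngram_set("b c d".split())
distance = jaccard_distance_sets(a, b)  # 2 shared unigrams out of 4 distinct ones
```

The measure ignores word order and repetition at `n=1`; larger values of `n` make it stricter about order.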


In [None]:
u = list(results["reference"]["gold_results"].values())[0:10]
v = list(results["reference"]["results"].values())[0:10]
w = list(results["config_0"]["results"].values())[0:10]

In [None]:
# Print gold, reference output and config_0 output side by side
for i in range(10):
    print(i,u[i],v[i],w[i])

In [None]:
print(results["config_2"]["gold_results"][1] )
print(results["config_2"]["results"][1])

In [None]:
def jaccard_wrapper(sentenceA,sentenceB,ngram=1):
    tokensA = nltk.word_tokenize(sentenceA)
    tokensB = nltk.word_tokenize(sentenceB)

    ngA_tokens = set(nltk.ngrams(tokensA, n=ngram))
    ngB_tokens = set(nltk.ngrams(tokensB, n=ngram))
    
    return nltk.jaccard_distance(ngA_tokens, ngB_tokens)

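def closest_jaccard_match(sentence,references,ngram=1):
    bestMatch = ""
    # Jaccard distances lie in [0,1], so 2 is a safe "worse than anything" start value
    bestJacc = 2
    for r in references:
        # Pass ngram on, otherwise the parameter is silently ignored
        jacc = jaccard_wrapper(sentence,r,ngram)
        if jacc < bestJacc:
            bestJacc = jacc
            bestMatch = r
    return (bestMatch,bestJacc)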


In [None]:
goldies = list(results["reference"]["gold_results"].values())

sample_index = 120

sample = results["config_4"]["results"][sample_index]

print(sample)
print(closest_jaccard_match(sample,goldies))