# CodeBert Grid Experiment Evaluation

Nice to see you around! Have a seat.
Would you like a drink? Maybe a cigar?

Make sure all required dependencies are installed; they are listed in the [environment.yml](./environment.yml). 
You can create a conda environment from the yml using 

```
conda env create -f environment.yml
conda activate Lampion-Codebert-Evaluation
```

Make sure to run your Jupyter Notebook from that environment! 
Otherwise you are (still) missing the dependencies. 
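
To confirm that the notebook kernel really sees the dependencies, a quick check along the following lines can help. This is a minimal sketch: `missing_packages` is a hypothetical helper, and the package list is an assumption taken from this notebook's imports, not read from `environment.yml`.

```python
import importlib.util


def missing_packages(names):
    """Return the names from `names` that cannot be imported in this kernel."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Assumed list, based on the imports used in this notebook; adjust as needed.
required = ["pandas", "matplotlib"]
missing = missing_packages(required)
print("missing packages:", missing or "none")
```

If the list is not empty, the notebook is most likely running in a different environment than the one created above.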

**OPTIONALLY** you can use the environment in which your Jupyter notebook is already running: start a new terminal (from Jupyter) and run 

```
conda env update --prefix ./env --file environment.yml  --prune
```

In [None]:
import os
import pandas as pd
import matplotlib.pyplot as plt
# Homebrew Imports (python-file next to this)
import bleu_evaluator as foreign_bleu

## Data-Loading / Preparation

Make sure that your dataset is laid out as described in the [Readme](./README.md), that is 

```
./data
    /GridExp_XY
        reference.gold
        reference.output
        /config_0
            config.properties
            test_0.output
        /config_1
            config.properties
            test_0.output
    ...
```

where the configs **must** be numbered to be correctly detected. 
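
The layout described above can be checked before loading anything. The following is only a sketch under assumptions: `check_layout` is a hypothetical helper that is not part of this notebook, and it assumes that every sub-directory of the experiment folder is meant to be a `config_<number>` directory.

```python
import os
import re

# Config directories are expected to be named config_<number>
CONFIG_DIR = re.compile(r"config_(\d+)")


def check_layout(experiment_dir):
    """Return a list of human-readable problems found in the experiment directory."""
    problems = []
    # The gold and reference files live directly in the experiment directory
    for required in ("reference.gold", "reference.output"):
        if not os.path.isfile(os.path.join(experiment_dir, required)):
            problems.append(f"missing {required}")
    for entry in sorted(os.listdir(experiment_dir)):
        path = os.path.join(experiment_dir, entry)
        if not os.path.isdir(path):
            continue
        if not CONFIG_DIR.fullmatch(entry):
            problems.append(f"unexpected directory name: {entry}")
            continue
        files = os.listdir(path)
        if "config.properties" not in files:
            problems.append(f"{entry}: missing config.properties")
        if not any(name.endswith(".output") for name in files):
            problems.append(f"{entry}: missing .output file")
    return problems
```

Calling `check_layout("./data/GridExp_XY")` returns an empty list when the directory matches the structure shown above.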

In [None]:
data_directory = "./data/PreliminaryResults"

print(f"looking for results in {data_directory}" )

property_files={}
result_files={}

for root,dirs,files in os.walk(data_directory):
    for name in files:
        if "reference.gold" in name:
            result_files["gold"]=os.path.join(root,name)
        elif "reference.output" in name:
            result_files["reference"]=os.path.join(root,name)
        elif "config_" in root and ".output" in name:
            result_files[os.path.basename(root)]=os.path.join(root,name)
        elif "config_" in root and ".properties" in name:
            property_files[os.path.basename(root)]=os.path.join(root,name)

print(f"There were {len(result_files)} result- and {len(property_files)} property-files found.")
# Sanity Checks 
if "gold" not in result_files.keys():
    print("There was no Gold-file for the results found!")
if "reference" not in result_files.keys():
    print("There was no reference-file for the results found!")

In [None]:
# initialize bare results
results = {}
# Fill them with file-paths
for (key,value) in result_files.items():
    results[key] = {}
    results[key]["result_file"] = value

for (key,value) in property_files.items():
    results[key]["property_file"] = value

In [None]:
def load_properties(filepath, sep='=', comment_char='#'):
    """
    Read the file passed as parameter as a properties file.
    """
    props = {}
    with open(filepath, "rt") as f:
        for line in f:
            stripped = line.strip()
            if stripped and not stripped.startswith(comment_char):
                # Split on the first separator only, so values may contain it
                key, _, value = stripped.partition(sep)
                props[key.strip()] = value.strip().strip('"')
    return props


In [None]:
print("reading in property-files")

for key in property_files.keys():
    prop_file = results[key]["property_file"]
    if prop_file:
        results[f"{key}"]["properties"]=load_properties(prop_file)

In [None]:
print("reading in result-files")

for key in results.keys():
    result_file = results[key]["result_file"]
    if result_file:
        results[key]["results"] = {}
        # Each line has the form "<number>\t<text>"; the context manager closes the file
        with open(result_file) as f:
            for line in f:
                fields = line.split("\t")
                num = int(fields[0])
                results[key]["results"][num] = fields[1].strip()

# Uncomment the next line to inspect the results
#results

## Bleu-Scores

In the following, the BLEU-scores are calculated using the foreign library. 
It differs slightly from standard BLEU, but it is the same implementation as used in the original experiment.

The aggregated BLEU-Scores are stored in the results.
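
For orientation, the idea behind BLEU can be sketched in a few lines. This is a simplified, unsmoothed sentence-level variant with whitespace tokenization and a single reference. It is **not** the implementation in `bleu_evaluator`, and the names `ngrams` and `simple_bleu` are made up for this illustration.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count all n-grams of length n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def simple_bleu(reference, hypothesis, max_n=4):
    """Unsmoothed sentence-level BLEU on a 0-100 scale (illustration only)."""
    ref, hyp = reference.split(), hypothesis.split()
    if not hyp:
        return 0.0
    log_precisions = []
    for n in range(1, max_n + 1):
        hyp_counts, ref_counts = ngrams(hyp, n), ngrams(ref, n)
        # Clipped matches: an n-gram counts at most as often as it occurs in the reference
        overlap = sum(min(count, ref_counts[gram]) for gram, count in hyp_counts.items())
        total = sum(hyp_counts.values())
        if overlap == 0 or total == 0:
            return 0.0  # without smoothing, one empty precision zeroes the score
        log_precisions.append(math.log(overlap / total))
    # Brevity penalty punishes hypotheses shorter than the reference
    bp = 1.0 if len(hyp) >= len(ref) else math.exp(1 - len(ref) / len(hyp))
    return 100 * bp * math.exp(sum(log_precisions) / max_n)
```

Comparing a sentence of at least four tokens with itself yields 100 in this sketch, which matches the sanity check on the gold set in the next cell.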

In [None]:
print("BLEU-Score of the un-altered Set:")
ref_bleu = foreign_bleu.bleuFromMaps(results["gold"]["results"],results["reference"]["results"])
print(ref_bleu[0])
print("BLEU-Score of the if-true config:")
c0_bleu = foreign_bleu.bleuFromMaps(results["gold"]["results"],results["config_0"]["results"])
print(c0_bleu[0])
print("BLEU-Score of the gold set:")
gold_bleu = foreign_bleu.bleuFromMaps(results["gold"]["results"],results["gold"]["results"])
print(gold_bleu[0],"(Should be 100)")

In [None]:
%%time
print("calculating bleu scores for all configs")
for key in results.keys():
    if "gold" not in key:
        bleu = foreign_bleu.bleuFromMaps(results["gold"]["results"],results[key]["results"])
        results[key]["bleu"]=bleu

In [None]:
config_names = {0:"if-true", 1:"add-var(pseudo)", 2:"add-neutral",3:"add-var(random)"}
bleu_plots = {}
# Configs are assumed to come in blocks of three per transformation type
for i in range(len(property_files)):
    config_type = config_names[i // 3]
    # Every series starts at 0 transformations with the reference BLEU-score
    series = bleu_plots.setdefault(config_type, {0: results["reference"]["bleu"][0]})
    c = results[f"config_{i}"]
    series[int(c["properties"]["transformations"])] = c["bleu"][0]

In [None]:
# Sort by number of transformations so the plotted lines run left to right
ubi = pd.DataFrame.from_dict(bleu_plots).sort_index()
ubi

In [None]:
plt.ylabel("BLEU-Score")
plt.xlabel("# Transformations")

plot = plt.plot(ubi,marker="o")

plt.legend(ubi.columns)
plt.show()