# Guided-MT Code2Vec Evaluation

This notebook reads the experiment outputs, extracts the data, and plots the results.

Expected layout:

```
.
├── README.md
├── data
│   └── random-MRR-max
│       ├── seed-2880
│       │   ├── data
│       │   │   ├── gen0
│       │   │   │   ├── 3b2459
│       │   │   │   ├── 3b2459.json
│       │   │   │   ├── 447e22
│       │   │   │   ├── 447e22.json
│       │   │   │   ├── 4495c7
│       │   │   │   ├── 4495c7.json
│       │   │   │   ├── 52667b
│       │   │   │   ├── 52667b.json
│       │   │   │   ├── 6855ba
│       │   │   │   ├── 6855ba.json
│       │   │   │   ├── 68ec75
│       │   │   │   ├── 68ec75.json
│       │   │   │   ├── 6cc14d
│       │   │   │   ├── 6cc14d.json
│       │   │   │   ├── 6d6845
│       │   │   │   ├── 6d6845.json
│       │   │   │   ├── 7a2d67
│       │   │   │   ├── 7a2d67.json
│       │   │   │   ├── ed0dd9
│       │   │   │   └── ed0dd9.json
│       │   │   ├── gen1
│       │   │   ├── ...
│       │   │   ├── gen8
│       │   │   ├── ...
│       │   │   ├── generation_0
│       │   │   │   ├── Some.java
│       │   │   │   ├── ...
│       │   │   │   ├── Other.java
│       │   │   │   └── Different.java
│       │   │   └── initialGen
│       │   │       └── 3bf9ce
│       │   └── results.txt
│       ├── seed-5142
│       │   └── results.txt
│       ...
├── evaluation.ipynb
└── requirements.txt
```
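
Before running the notebook it can help to check that the data directory matches this layout. The following is a minimal sketch, not part of the `extract` script: `list_runs` is a hypothetical helper, and it assumes only what the tree shows, namely one directory per experiment containing `seed-*` directories that may hold a `results.txt`.

```python
from pathlib import Path


def list_runs(root: str) -> list:
    """Return (experiment, seed, has_results) for every seed directory under root.

    Hypothetical helper: it only inspects directory names and does not parse anything.
    """
    runs = []
    for experiment_dir in sorted(Path(root).iterdir()):
        if not experiment_dir.is_dir():
            continue  # skip stray files next to the experiment directories
        for seed_dir in sorted(experiment_dir.glob("seed-*")):
            if seed_dir.is_dir():
                has_results = (seed_dir / "results.txt").is_file()
                runs.append((experiment_dir.name, seed_dir.name, has_results))
    return runs
```

Calling `list_runs("./data")` from the notebook directory lists every experiment and seed pair, which makes seeds without a `results.txt` easy to spot.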

## Data Loading

Most of the loading is done by the `extract` script next to this notebook; here we also derive a few high-level variables.

In [None]:
import pandas as pd
import seaborn as sbn
import matplotlib.pyplot as plt
import extract

# Important: specify the directory without a trailing slash!
directory: str = "./data"
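
If the directory comes from user input, the trailing slash can be stripped up front instead of relying on the comment. This is a minimal sketch; `normalize_directory` is a hypothetical helper and not part of `extract`.

```python
def normalize_directory(path: str) -> str:
    """Strip trailing slashes so "./data/" and "./data" are treated alike."""
    stripped = path.rstrip("/")
    # Keep the filesystem root intact: "/" would otherwise become "".
    return stripped if stripped else path[:1]
```

The result can then be passed on, for example `extract.make_df(normalize_directory(directory))`.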

In [None]:
%%time
df = extract.make_df(directory)

In [None]:
all_metrics = ["F1","MRR","EDITDIST","PMRR","REC","PREC"]
all_transformers = extract.get_known_transformers()
all_experiments = set(df["experiment"])
all_seeds = set(df["seed"])

In [None]:
df.head(5)

## Per Experiment Plots

In [None]:
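# Average all numeric columns per experiment and generation (i.e. across seeds).
# numeric_only=True keeps pandas >= 2.0 from failing on non-numeric columns.
broader_grouped_df = df.groupby(["experiment", "generation"]).mean(numeric_only=True).reset_index()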
broader_grouped_df.head(5)
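
To see what this aggregation does, here is a small sketch on a toy frame. The values are made up for illustration, and the column set is an assumption based on the columns used elsewhere in this notebook. The sketch passes `numeric_only=True`, which recent pandas versions need when non-numeric columns are present.

```python
import pandas as pd

# Toy data: two seeds for experiment "a" in generation 0, one row each otherwise.
toy = pd.DataFrame({
    "experiment": ["a", "a", "a", "b"],
    "seed": [1, 2, 1, 1],
    "generation": [0, 0, 1, 0],
    "F1": [0.4, 0.6, 0.7, 0.5],
    "algorithm": ["random", "random", "random", "genetic"],
})

# One row per (experiment, generation); numeric columns are averaged,
# non-numeric columns such as "algorithm" are dropped.
averaged = (
    toy.groupby(["experiment", "generation"])
    .mean(numeric_only=True)
    .reset_index()
)
```

Note that `seed` is numeric and is therefore averaged as well, so it carries no meaning in the grouped frame.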

In [None]:
sbn.relplot(data=df,x="generation",y="F1", hue="algorithm")

In [None]:
sbn.relplot(data=df,x="generation",y="MRR", hue="algorithm")

In [None]:
sbn.relplot(data=broader_grouped_df,x="generation",y="F1", hue="experiment")

In [None]:
sbn.relplot(data=broader_grouped_df,x="generation",y="MRR", hue="experiment")

In [None]:
# Stack F1 (top) and MRR (bottom) on a shared generation axis.
fig, axs = plt.subplots(nrows=2, sharex=True)

# The top plot skips its legend; the bottom plot's legend covers both.
sbn.scatterplot(data=broader_grouped_df, x="generation", y="F1", hue="experiment", ax=axs[0], legend=False)
axs[0].set_ylim([0.3, 0.8])

sbn.scatterplot(data=broader_grouped_df, x="generation", y="MRR", hue="experiment", ax=axs[1])
axs[1].set_ylim([0.3, 0.8])

# Place the shared legend to the right of the figure.
plt.legend(title="Experiment", bbox_to_anchor=(1.05, 2))

In [None]:
# One line plot per metric and experiment; sorting keeps the plot order stable across runs.
for metric in ["F1", "MRR"]:
    for exp in sorted(all_experiments):
        experiment_df = broader_grouped_df[broader_grouped_df["experiment"] == exp]
        sbn.relplot(data=experiment_df, x="generation", y=metric, kind="line")
        plt.title(f"{metric} Score for {exp}")
        plt.show()