# Research Question 3

## Can we assess the quality of generated commit messages by comparing their style?

Strategy: Cluster the style embeddings with k-means, starting from 2 styles, and score each resulting cluster with a quality model.

For each cluster, print summary statistics of the quality scores (mean, standard deviation, min, max, quantiles) to check whether the style embedding carries information about quality.

https://arxiv.org/pdf/2006.00843.pdf
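
The strategy above can be sketched as follows. This is a minimal illustration, not the notebook's actual pipeline: the embeddings and quality scores are random placeholders, and `quality_stats_per_cluster` is a hypothetical helper name introduced here. It assumes one quality score per message is already available from some quality model.

```python
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)

# Placeholder data (assumption): 200 fake 16-dimensional style embeddings
# and one fake quality score per message.
embeddings = rng.normal(size=(200, 16))
quality = rng.uniform(0.0, 1.0, size=200)


def quality_stats_per_cluster(embeddings, quality, n_clusters):
    """Cluster the embeddings with k-means and describe the quality per cluster."""
    labels = KMeans(n_clusters=n_clusters, n_init=10, random_state=0).fit_predict(embeddings)
    scores = pd.Series(quality, name="quality")
    # describe() reports count, mean, std, min, quartiles and max for each cluster.
    return scores.groupby(labels).describe()


# Try several numbers of styles, starting from 2.
for k in range(2, 5):
    print(f"k = {k}")
    print(quality_stats_per_cluster(embeddings, quality, k))
```

If the per-cluster means differ clearly while the within-cluster spread stays small, that would suggest the style clusters separate messages by quality; with the random placeholders used here no such separation is expected.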

In [22]:
import torch
import numpy as np
import pandas as pd
from tqdm import tqdm
from sklearn.metrics.pairwise import euclidean_distances

import warnings
warnings.filterwarnings('ignore')

import sys
sys.path.append("..")
from util.style_model import StyleModel

In [7]:
test_data = pd.read_pickle('../data/04c_Test_Set.pkl')

In [8]:
model = StyleModel()
model.load_state_dict(torch.load('../model/StyleModel.pt'))

<All keys matched successfully>

In [9]:
messages = test_data["message"].tolist()

vectors = []

for message in tqdm(messages):
    vectors.append(model(message).squeeze().detach().numpy())

vectors = np.array(vectors)

100%|██████████| 10684/10684 [02:50<00:00, 62.69it/s]


In [11]:
authors_centroids = {}

# Mean style embedding (centroid) per author.
# Indexing `vectors` with group.index assumes test_data has the default 0..n-1 index.
for author, group in test_data.groupby('author_email'):
    authors_centroids[author] = vectors[group.index].mean(axis=0)
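
The centroid cell above indexes `vectors` with each group's index labels, which matches row positions only when `test_data` has a default 0..n-1 index. A possible variant that does not depend on the index labels uses the positional indices that pandas exposes through `GroupBy.indices`. The toy frame, the toy vectors, and the helper name `author_centroids` are illustrative assumptions, not part of the notebook.

```python
import numpy as np
import pandas as pd

# Toy data (assumption): four messages by two authors, with an index that is
# deliberately not 0..n-1.
toy = pd.DataFrame(
    {"author_email": ["a@x.org", "b@x.org", "a@x.org", "b@x.org"]},
    index=[10, 11, 12, 13],
)
toy_vectors = np.array([[0.0, 0.0], [1.0, 1.0], [2.0, 4.0], [3.0, 5.0]])


def author_centroids(frame, vectors, key="author_email"):
    """Return {author: mean embedding}, using row positions rather than index labels."""
    return {
        author: vectors[positions].mean(axis=0)
        for author, positions in frame.groupby(key).indices.items()
    }


centroids = author_centroids(toy, toy_vectors)
print(centroids)
```
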

In [12]:
# Stack the per-author centroids into one (n_authors, embedding_dim) array.
centroids_array = np.array(list(authors_centroids.values()))

In [66]:
good_message_embedding = model("MINOR Removed unused jQuery.dialog creation in CMSMain.AddFor.js, which causes mem leaks (now uses dedicated pages/add UI) ").detach().numpy()
bad_message_embedding = model("Update files").detach().numpy()
worst_message_embedding = model("12345").detach().numpy()

In [67]:
distances = pd.DataFrame()

# Euclidean distance from every author centroid to each reference message embedding.
distances["Good Message: \"MINOR Removed unused ...\""] = euclidean_distances(centroids_array, good_message_embedding).ravel()
distances["Bad Message: \"Update files\""] = euclidean_distances(centroids_array, bad_message_embedding).ravel()
distances["Worst Message: \"12345\""] = euclidean_distances(centroids_array, worst_message_embedding).ravel()

# Anonymize the authors as "Author 1" ... "Author N".
distances.index = [f'Author {i + 1}' for i in range(len(distances))]

# Styler.set_precision is deprecated since pandas 1.3 in favor of format(precision=...).
distances.style.background_gradient(cmap='coolwarm', axis=None).format(precision=3)

| Author | Good Message: "MINOR Removed unused ..." | Bad Message: "Update files" | Worst Message: "12345" |
|---|---|---|---|
| Author 1 | 0.887 | 1.138 | 1.457 |
| Author 2 | 0.921 | 1.151 | 1.464 |
| Author 3 | 0.944 | 1.0 | 1.347 |
| Author 4 | 0.915 | 1.155 | 1.459 |
| Author 5 | 0.929 | 1.112 | 1.433 |
| Author 6 | 0.734 | 1.147 | 1.452 |
| Author 7 | 0.695 | 1.203 | 1.501 |
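
One way to read the distance table above is to summarize each column and to check whether every author centroid lies closest to the good message and farthest from the worst one. The sketch below copies the values from the table; the short column names `good`, `bad`, and `worst` are shorthand introduced here.

```python
import pandas as pd

# Distances copied from the table above (rows: Author 1 to Author 7).
table = pd.DataFrame(
    {
        "good": [0.887, 0.921, 0.944, 0.915, 0.929, 0.734, 0.695],
        "bad": [1.138, 1.151, 1.000, 1.155, 1.112, 1.147, 1.203],
        "worst": [1.457, 1.464, 1.347, 1.459, 1.433, 1.452, 1.501],
    },
    index=[f"Author {i}" for i in range(1, 8)],
)

# Mean, std, min, max and quartiles of the distances per reference message.
summary = table.describe()
print(summary.round(3))

# True for an author whose centroid is closer to the good message than to the
# bad one, and closer to the bad message than to the worst one.
ordered = (table["good"] < table["bad"]) & (table["bad"] < table["worst"])
print(ordered)
```

With only three hand-picked reference messages, this ordering check is a sanity check on the embedding space rather than a quality metric.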
