# <p><center> **COMMONSENSE KNOWLEDGE BASES: MERONYMIC RELATIONSHIPS ACQUISITION AND THEIR VERIFICATION WITH LARGE LANGUAGE MODELS**</center></p>

---

<p><center> Master's thesis - MS in Language Analysis and Processing</center></p>
<p><center> Julia Fidalgo Mariño</center></p>
<p><center> Supervised by German Rigau</center></p>


### **FIRST OPTION FOR EVALUATING: SENT2VEC**


In [1]:
%%capture
!pip install scikit-learn
!pip install sent2vec
!pip install nltk
!pip install transformers
!pip install torch
!pip install numpy

In [2]:
import nltk
nltk.download('wordnet')

[nltk_data] Downloading package wordnet to /root/nltk_data...


True

In [3]:
import sent2vec.vectorizer

In [4]:
from sent2vec.vectorizer import Vectorizer

# Definitions in this order: WordNet, GPT-4, Llama 3.1, Phi-3
sentences = [
    "a movable organ for flying (one of a pair)",
    "one of the two or more external appendages that are used for flying",
    "a structure that protrudes from an object, typically an aircraft, bird, or insect, to provide lift and support during flight.",
    "a limb of an animal, most typically a bird, which enables it to fly"
]
vectorizer = Vectorizer(pretrained_weights='distilbert-base-uncased')
vectorizer.run(sentences)
vectors = vectorizer.vectors

Initializing Bert distilbert-base-uncased
Vectorization done on cpu


We strongly recommend passing in an `attention_mask` since your input_ids may be padded. See https://huggingface.co/docs/transformers/troubleshooting#incorrect-output-when-padding-tokens-arent-masked.


In [6]:
from scipy import spatial

# Cosine distance (1 - cosine similarity) between the WordNet definition
# (vectors[0]) and each model-generated definition; lower means closer.
dist_1 = spatial.distance.cosine(vectors[0], vectors[1])
dist_2 = spatial.distance.cosine(vectors[0], vectors[2])
dist_3 = spatial.distance.cosine(vectors[0], vectors[3])

print("Distance between WN and GPT4:", dist_1)
print("Distance between WN and LLama3.1:", dist_2)
print("Distance between WN and Phi3:", dist_3)

Distance between WN and GPT4: 0.0323824732462904
Distance between WN and LLama3.1: 0.21206462905566126
Distance between WN and Phi3: 0.05690950923189397
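
As a reading aid for the numbers above, the following is a minimal sketch of the cosine distance that `spatial.distance.cosine` returns, written in plain Python. The helper name `cosine_distance` and the toy vectors are hypothetical; only the three reported distances are taken from the output above.

```python
import math

def cosine_distance(u, v):
    """Return 1 - cos(u, v), i.e. one minus the cosine similarity."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)

# Toy vectors (hypothetical, not real sentence embeddings)
same_direction = cosine_distance([1.0, 2.0], [2.0, 4.0])  # parallel vectors
orthogonal = cosine_distance([1.0, 0.0], [0.0, 1.0])      # unrelated vectors

# Distances reported above, converted back to cosine similarities
reported = {
    "GPT4": 0.0323824732462904,
    "LLama3.1": 0.21206462905566126,
    "Phi3": 0.05690950923189397,
}
similarities = {name: 1.0 - dist for name, dist in reported.items()}
closest = min(reported, key=reported.get)

for name, sim in similarities.items():
    print(f"Similarity between WN and {name}: {sim:.4f}")
print("Closest to the WordNet definition:", closest)
```

A distance near 0 means the two vectors point in almost the same direction, so under this measure the GPT4 definition is the closest to the WordNet one and the LLama3.1 definition the furthest.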


### **SECOND OPTION FOR EVALUATING: SENTENCE TRANSFORMERS**

In [7]:
%%capture
!pip install -U sentence-transformers

In [8]:
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("distilbert-base-uncased")

# Definitions in this order: WordNet, GPT-4, Llama 3.1, Phi-3
sentences = [
    "a movable organ for flying (one of a pair)",
    "one of the two or more external appendages that are used for flying",
    "a structure that protrudes from an object, typically an aircraft, bird, or insect, to provide lift and support during flight.",
    "a limb of an animal, most typically a bird, which enables it to fly"
]

embeddings = model.encode(sentences)
print(embeddings.shape)

similarities = model.similarity(embeddings, embeddings)
print(similarities)



(4, 768)
tensor([[1.0000, 0.8200, 0.7504, 0.7869],
        [0.8200, 1.0000, 0.8392, 0.8537],
        [0.7504, 0.8392, 1.0000, 0.9154],
        [0.7869, 0.8537, 0.9154, 1.0000]])
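
Only the first row of this matrix compares the WordNet definition with the model-generated ones. The sketch below copies the printed values and ranks the three models by that row. The label list is an assumption based on the order of `sentences`, and the variable names are illustrative.

```python
# Similarity matrix copied from the output above; row and column order
# follows `sentences` (assumed: WordNet, GPT-4, Llama 3.1, Phi-3).
labels = ["WordNet", "GPT-4", "Llama 3.1", "Phi-3"]
similarity_matrix = [
    [1.0000, 0.8200, 0.7504, 0.7869],
    [0.8200, 1.0000, 0.8392, 0.8537],
    [0.7504, 0.8392, 1.0000, 0.9154],
    [0.7869, 0.8537, 0.9154, 1.0000],
]

# Row 0 holds the similarity of the WordNet definition to every sentence;
# skip column 0, which is WordNet compared with itself.
wordnet_row = dict(zip(labels[1:], similarity_matrix[0][1:]))

# Sort the models from most to least similar to WordNet
ranking = sorted(wordnet_row, key=wordnet_row.get, reverse=True)
for name in ranking:
    print(f"{name}: {wordnet_row[name]:.4f}")
```

The remaining off-diagonal entries compare the model definitions with each other, for example 0.9154 between the Llama 3.1 and Phi-3 definitions.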


### **METRICS**

In [10]:
from sklearn.metrics import confusion_matrix, precision_score, recall_score, f1_score, accuracy_score

# Gold labels: 106 positive (1) and 44 negative (0) examples
y_true = [1] * 106 + [0] * 44
# Example predictions for Phi-3, adjusted below to reproduce its errors
y_pred = [1] * 93 + [0] * 57

# Mark 11 more positives as missed (24 false negatives in total)
for i in range(11):
    y_pred[i] = 0
# Mark 6 negatives as wrongly accepted (6 false positives)
for i in range(106, 112):
    y_pred[i] = 1

# Rows are true classes, columns are predicted classes, both ordered [0, 1]
conf_matrix = confusion_matrix(y_true, y_pred)
print("Confusion matrix:")
print(conf_matrix)

# Metrics for Phi-3 (positive class = 1)
precision = precision_score(y_true, y_pred)
recall = recall_score(y_true, y_pred)
f1 = f1_score(y_true, y_pred)
accuracy = accuracy_score(y_true, y_pred)

print(f"Precision: {precision:.2f}")
print(f"Recall: {recall:.2f}")
print(f"F1 Score: {f1:.2f}")
print(f"Accuracy: {accuracy:.2f}")


Confusion matrix:
[[38  6]
 [24 82]]
Precision: 0.93
Recall: 0.77
F1 Score: 0.85
Accuracy: 0.80
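
The four scores can be checked by hand from the confusion matrix. The sketch below uses the standard definitions and the counts printed above, read as true negatives, false positives, false negatives and true positives (rows are true classes, columns are predicted classes). The variable names are illustrative.

```python
# Counts read from the confusion matrix above: [[TN, FP], [FN, TP]]
tn, fp = 38, 6
fn, tp = 24, 82

precision = tp / (tp + fp)                          # 82 / 88
recall = tp / (tp + fn)                             # 82 / 106
f1 = 2 * precision * recall / (precision + recall)  # harmonic mean
accuracy = (tp + tn) / (tp + tn + fp + fn)          # 120 / 150

print(f"Precision: {precision:.2f}")
print(f"Recall: {recall:.2f}")
print(f"F1 Score: {f1:.2f}")
print(f"Accuracy: {accuracy:.2f}")
```

Rounded to two decimals, these values agree with the scikit-learn output above.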
