## Metrics used in evaluations

References:
- https://www.geeksforgeeks.org/f1-score-in-machine-learning/
- https://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.MultiLabelBinarizer.html

### Precision
- Emphasizes quality: the fraction of positive predictions that are actually correct
- `TruePositives / (TruePositives + FalsePositives)`

### Recall
- Emphasizes quantity: the fraction of all relevant (actually positive) instances that the model finds
- `TruePositives / (TruePositives + FalseNegatives)`

In [7]:
text1 = "The cat sat on the mat."
text2 = "The cat sat on the big mat."

# Lowercase and split on whitespace; punctuation stays attached to words (e.g. "mat.")
split = lambda t: set(t.lower().split())

word_set1 = split(text1)
word_set2 = split(text2)

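# text1 is treated as the reference (ground truth), text2 as the prediction
true_positives = len(word_set1.intersection(word_set2))  # words present in both texts
false_positives = len(word_set2 - word_set1)  # words only in the prediction
false_negatives = len(word_set1 - word_set2)  # words only in the reference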

precision = true_positives / (true_positives + false_positives)
recall = true_positives / (true_positives + false_negatives)

print(f"Precision: {precision:.2f}")
print(f"Recall: {recall:.2f}")

Precision: 0.83
Recall: 1.00
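
The cell above implicitly treats `text1` as the reference and `text2` as the prediction. As a minimal sketch of why that choice matters, the hypothetical helper `precision_recall` below (not part of the original notebook) repeats the same word-set calculation and shows that swapping the two roles swaps precision and recall. It assumes both texts contain at least one word, so neither denominator is zero.

```python
def precision_recall(reference: str, candidate: str) -> tuple[float, float]:
    """Word-set precision and recall of `candidate` measured against `reference`."""
    ref_words = set(reference.lower().split())
    cand_words = set(candidate.lower().split())
    tp = len(ref_words & cand_words)  # words present in both texts
    fp = len(cand_words - ref_words)  # extra words found only in the candidate
    fn = len(ref_words - cand_words)  # reference words the candidate missed
    return tp / (tp + fp), tp / (tp + fn)


text1 = "The cat sat on the mat."
text2 = "The cat sat on the big mat."

# Same direction as the cell above: text1 is the reference
p_fwd, r_fwd = precision_recall(reference=text1, candidate=text2)
# Roles swapped: text2 is the reference
p_rev, r_rev = precision_recall(reference=text2, candidate=text1)

print(f"text1 as reference: precision={p_fwd:.2f}, recall={r_fwd:.2f}")
print(f"text2 as reference: precision={p_rev:.2f}, recall={r_rev:.2f}")
```

With `text2` as the reference, the missing word "big" counts as a false negative instead of a false positive, so recall drops to 5/6 while precision becomes 1.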


In [16]:
# Calculating precision and recall using sklearn
from sklearn.metrics import precision_score, recall_score
from sklearn.preprocessing import MultiLabelBinarizer

vocabulary = list(word_set1.union(word_set2))
mlb = MultiLabelBinarizer(classes=vocabulary)  # Binarize the texts: presence (1) or absence (0) of words
print(f"Classes: {mlb.classes}")

# One call binarizes both texts against the same fixed vocabulary (one row per text)
binary_text1, binary_text2 = mlb.fit_transform([word_set1, word_set2])
print(f"Binarized text1: {binary_text1}")
print(f"Binarized text2: {binary_text2}")

# sklearn expects (y_true, y_pred): text1 is the reference, text2 the prediction
precision = precision_score(binary_text1, binary_text2)
recall = recall_score(binary_text1, binary_text2)

print(f"Precision: {precision:.2f}")
print(f"Recall: {recall:.2f}")

Classes: ['cat', 'sat', 'big', 'the', 'on', 'mat.']
Binarized text1: [1 1 0 1 1 1]
Binarized text2: [1 1 1 1 1 1]
Precision: 0.83
Recall: 1.00


### F1 score
- The F1 score is the harmonic mean of precision and recall. It balances the two, which is most useful when improving one tends to lower the other.
- `2 * (Precision * Recall) / (Precision + Recall)`

In [21]:
f1 = 2 * (precision * recall) / (precision + recall)
print(f"F1 score: {f1:.2f}")

from sklearn.metrics import f1_score
print(f"F1 score: {f1_score(binary_text1, binary_text2):.2f}")

F1 score: 0.91
F1 score: 0.91
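
To illustrate why F1 uses a harmonic rather than an arithmetic mean, here is a small sketch in plain Python. The helper names `f1_from_counts` and `harmonic_mean` are hypothetical, and returning 0.0 when the denominator is zero is an assumed convention, not something the notebook specifies. It also checks that the count-based form `2*TP / (2*TP + FP + FN)` agrees with the precision/recall form for the example above (TP=5, FP=1, FN=0).

```python
def f1_from_counts(tp: int, fp: int, fn: int) -> float:
    """F1 computed directly from counts; 0.0 if all counts are zero (assumed convention)."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def harmonic_mean(p: float, r: float) -> float:
    """Harmonic mean of precision and recall; 0.0 if both are zero (assumed convention)."""
    return 2 * p * r / (p + r) if (p + r) else 0.0


# Counts from the cat/mat example: 5 shared words, 1 extra word, 0 missed words
f1_counts = f1_from_counts(tp=5, fp=1, fn=0)
f1_pr = harmonic_mean(5 / 6, 1.0)
print(f"F1 from counts: {f1_counts:.2f}, F1 from precision/recall: {f1_pr:.2f}")

# A lopsided, made-up case: perfect recall but very low precision
p, r = 0.10, 1.00
arithmetic = (p + r) / 2
harmonic = harmonic_mean(p, r)
print(f"Arithmetic mean: {arithmetic:.2f}, harmonic mean (F1): {harmonic:.2f}")
```

In the lopsided case the arithmetic mean is 0.55, while F1 stays near the weaker of the two values (about 0.18), so a model cannot score well on F1 by maximizing only one metric.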
