# Implement the classification metrics from scratch

In this exercise you will implement precision, recall, and the F1-measure without using scikit-learn or any other library that already provides these metrics.

Your functions should take as input the predictions made on the test set (y_pred) and the actual class labels of that set (y_test).

At a minimum, you will need the counts of true positives (TP), false positives (FP), and false negatives (FN) to compute the three metrics.
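As a quick illustration of how TP, FP, and FN lead to the three metrics, here is a minimal pure-Python sketch on a made-up toy example. The helper name `confusion_counts`, the `positive` parameter, and the toy labels are assumptions introduced only for this illustration; they are not part of the exercise code.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count TP, FP, and FN for a binary problem (hypothetical helper).

    `positive` is the label treated as the positive class.
    """
    tp = fp = fn = 0
    for actual, predicted in zip(y_true, y_pred):
        if predicted == positive and actual == positive:
            tp += 1  # predicted positive, really positive
        elif predicted == positive and actual != positive:
            fp += 1  # predicted positive, really negative
        elif predicted != positive and actual == positive:
            fn += 1  # predicted negative, really positive
    return tp, fp, fn


# Toy labels, invented for this example
y_true_toy = [1, 0, 1, 1, 0, 1]
y_pred_toy = [1, 1, 1, 0, 0, 1]

tp, fp, fn = confusion_counts(y_true_toy, y_pred_toy)

precision_toy = tp / (tp + fp)  # TP / (TP + FP)
recall_toy = tp / (tp + fn)     # TP / (TP + FN)
f1_toy = 2 * precision_toy * recall_toy / (precision_toy + recall_toy)

print(tp, fp, fn)                          # 3 1 1
print(precision_toy, recall_toy, f1_toy)   # 0.75 0.75 0.75
```

True negatives are not needed for any of the three metrics, which is why the helper does not count them.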

You can use the code below as a starting point, or write your own implementation.

In [24]:
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
import numpy as np

data = load_breast_cancer()
X, y = data.data, data.target

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

def precision(y_true, y_pred):
    """
    Calculate the precision score
    :param y_true: True labels
    :param y_pred: Predicted labels
    :return: Precision score
    """
    # Calculate the true positives and false positives 
    true_positives = np.sum((y_true == 1) & (y_pred == 1))
    false_positives = np.sum((y_true == 0) & (y_pred == 1))
    
    # Calculate the precision: (Precision = TP / (TP + FP))
    precision_score = true_positives / (true_positives + false_positives)
    
    return precision_score

def recall(y_true, y_pred):
    """
    Calculate the recall score
    :param y_true: True labels
    :param y_pred: Predicted labels
    :return: Recall score
    """
    # Calculate the true positives and false negatives
    true_positives = np.sum((y_true == 1) & (y_pred == 1))
    false_negatives = np.sum((y_true == 1) & (y_pred == 0))
    
    # Calculate the recall: (Recall = TP / (TP + FN))
    recall_score = true_positives / (true_positives + false_negatives)
    
    return recall_score

def f1_score(y_true, y_pred):
    """
    Calculate the F1 score
    :param y_true: True labels
    :param y_pred: Predicted labels
    :return: F1 score
    """
    # Calculate the precision and recall
    precision_score = precision(y_true, y_pred)
    recall_score = recall(y_true, y_pred)
    
    # Calculate the F1 score: (F1 = 2 * (Precision * Recall) / (Precision + Recall))
    # The result is stored in `f1` so the local name does not shadow this function
    f1 = 2 * (precision_score * recall_score) / (precision_score + recall_score)
    
    return f1

# Fit a model and make predictions
model = LogisticRegression()
model.fit(X_train, y_train)
y_pred = model.predict(X_test)

# Calculate precision, recall and F1-score
p = precision(y_test, y_pred)
r = recall(y_test, y_pred)
f1 = f1_score(y_test, y_pred)

print("Precision: {:.2f}".format(p))
print("Recall: {:.2f}".format(r))
print("F1-Score: {:.2f}".format(f1))


Precision: 0.97
Recall: 0.93
F1-Score: 0.95
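One edge case the functions above do not handle is a zero denominator, for example when the model never predicts the positive class, so TP + FP = 0. With NumPy integer counts the division then yields `nan` together with a runtime warning rather than raising an exception. The sketch below shows one possible guard. The names `safe_divide` and `precision_safe` are hypothetical, and returning 0.0 in that case is an assumed convention, not something the exercise prescribes.

```python
import numpy as np


def safe_divide(numerator, denominator, default=0.0):
    """Return numerator / denominator, or `default` when the denominator is 0."""
    if denominator == 0:
        return default
    return float(numerator / denominator)


def precision_safe(y_true, y_pred):
    """Precision with a guard for the case TP + FP == 0 (hypothetical variant)."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    tp = np.sum((y_true == 1) & (y_pred == 1))
    fp = np.sum((y_true == 0) & (y_pred == 1))
    return safe_divide(tp, tp + fp)


# A classifier that never predicts the positive class: TP + FP == 0
y_true_demo = np.array([1, 0, 1, 0])
y_pred_demo = np.array([0, 0, 0, 0])
print(precision_safe(y_true_demo, y_pred_demo))  # 0.0
```

The same guard could be applied to recall (TP + FN = 0) and to the F1 score (precision + recall = 0).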


STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
    https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
    https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
  n_iter_i = _check_optimize_result(
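The warning above suggests two remedies: raising `max_iter` or scaling the features. A minimal sketch that applies both is shown here, assuming the same dataset and split as in the cell above. The variable names `scaled_model` and `y_pred_scaled` and the value `max_iter=1000` are choices made for this illustration. The resulting metrics may differ from the ones printed above, because the fitted model is no longer the same.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Same data and split as in the exercise cell
data = load_breast_cancer()
X_train, X_test, y_train, y_test = train_test_split(
    data.data, data.target, test_size=0.2, random_state=0
)

# Standardize the features, then fit logistic regression with a higher
# iteration limit (1000 is an arbitrary value chosen for this sketch)
scaled_model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
scaled_model.fit(X_train, y_train)

# Predictions that can be passed to the precision, recall, and F1 functions
y_pred_scaled = scaled_model.predict(X_test)
```

Putting the scaler inside a pipeline means it is fitted on the training split only, so no information from the test set leaks into the preprocessing step.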
