## Evaluation metrics
Evaluation metrics measure the quality of a model. How to evaluate a model is one of the most important topics in machine learning: once you have built a model, you need to measure how accurately it predicts the expected outcome.

In this lab, we will cover evaluation metrics for classification. There are many ways to measure classification performance. Accuracy and the confusion matrix are among the most popular, and precision, recall, and the F1 score are also widely used.

Let's start with some definitions:

#### Confusion Matrix: 
A confusion matrix is a table that summarizes the performance of a classifier by showing the number of true positive, true negative, false positive, and false negative predictions. It is useful to understand the strengths and weaknesses of a classifier, and also provides information about the distribution of errors made by the classifier:

- **TP (True Positive)** - The model correctly predicted the positive class.
- **TN (True Negative)** - The model correctly predicted the negative class.
- **FP (False Positive)** - The model predicted the positive class, but it's actually negative.
- **FN (False Negative)** - The model predicted the negative class, but it's actually positive.
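
As a minimal sketch of the four definitions above, the counts can be tallied by hand from a list of true labels and a list of predicted labels. The helper name `confusion_counts` and the choice of `1` as the positive class are assumptions made for illustration, and the label lists are example data.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Tally TP, TN, FP, FN for a binary problem (hypothetical helper)."""
    tp = tn = fp = fn = 0
    for actual, predicted in zip(y_true, y_pred):
        if predicted == positive and actual == positive:
            tp += 1  # predicted positive, actually positive
        elif predicted != positive and actual != positive:
            tn += 1  # predicted negative, actually negative
        elif predicted == positive and actual != positive:
            fp += 1  # predicted positive, actually negative
        else:
            fn += 1  # predicted negative, actually positive
    return tp, tn, fp, fn


# Illustrative labels
y_true = [1, 0, 1, 1, 0, 1]
y_pred = [1, 0, 1, 0, 0, 1]

tp, tn, fp, fn = confusion_counts(y_true, y_pred)
print(tp, tn, fp, fn)  # 3 2 0 1
```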

#### Accuracy: 
Accuracy is the ratio of correctly predicted instances to the total number of instances in the dataset, so it measures how often the classifier is correct. It can be misleading on imbalanced datasets, because a classifier can score high accuracy while failing to predict the minority class. In terms of the confusion matrix, accuracy is computed as:

$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}$$
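
To make the imbalance caveat concrete, here is a small sketch with made-up data: 95 negative and 5 positive instances, scored against a trivial classifier that always predicts the negative class. The dataset and the "always negative" classifier are assumptions chosen only to illustrate the point.

```python
# Made-up imbalanced dataset: 95 negatives (0) and 5 positives (1)
y_true = [0] * 95 + [1] * 5
# A trivial classifier that always predicts the negative class
y_pred = [0] * 100

# Accuracy: share of predictions that match the true label
correct = sum(a == p for a, p in zip(y_true, y_pred))
accuracy = correct / len(y_true)

# Recall on the positive (minority) class: TP / (TP + FN)
tp = sum(a == 1 and p == 1 for a, p in zip(y_true, y_pred))
fn = sum(a == 1 and p == 0 for a, p in zip(y_true, y_pred))
recall = tp / (tp + fn)

print(accuracy)  # 0.95
print(recall)    # 0.0
```

The accuracy looks strong even though not a single positive instance is detected, which is why accuracy is usually read together with the other metrics on imbalanced data.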
        


#### Precision: 
Precision is the ratio of correctly predicted positive instances to the total number of instances predicted as positive. It measures the classifier's ability to avoid false positives when it predicts the positive class.

$$\text{Precision} = \frac{TP}{TP + FP}$$

#### Recall (or Sensitivity): 
Recall is the ratio of correctly predicted positive instances to the total number of actual positive instances. It is a measure of the classifier's ability to detect all positive instances.

$$\text{Recall} = \frac{TP}{TP + FN}$$

#### F1 Score: 
The F1 Score is the harmonic mean of precision and recall. It balances precision and recall and gives an overall performance score for the classifier.

$$F_1 = \frac{2 \times \text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}}$$
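
The sketch below illustrates why the harmonic mean is used: when precision and recall differ sharply, F1 stays close to the smaller of the two, whereas a plain arithmetic mean would not. The counts are illustrative, and the second form, 2TP / (2TP + FP + FN), is an algebraically equivalent way of writing the same score.

```python
# Illustrative counts: few false positives, many false negatives
TP, FP, FN = 2, 1, 90

precision = TP / (TP + FP)  # 2/3
recall = TP / (TP + FN)     # 2/92

# Harmonic mean of precision and recall
f1 = 2 * precision * recall / (precision + recall)

# Equivalent form written directly in terms of the counts
f1_counts = 2 * TP / (2 * TP + FP + FN)

# Arithmetic mean, for comparison only
arithmetic_mean = (precision + recall) / 2

print(f1, f1_counts, arithmetic_mean)
```

Here the arithmetic mean is roughly 0.34, while F1 is roughly 0.04, reflecting the very low recall.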

In [None]:
TP, TN, FP, FN = 4, 91, 1, 4
accuracy = (TP + TN)/(TP + TN + FP + FN)
print(accuracy)

In [1]:
TP, TN, FP, FN = 0, 95, 5, 0
accuracy = (TP + TN)/(TP + TN + FP + FN)
print(accuracy)

0.95


In [2]:
TP, FP = 114, 14
precision=TP/(TP+FP)
print(precision)

0.890625


In [3]:
TP, FN = 114, 0
recall=TP/(TP+FN)
print(recall)

1.0


In [4]:
TP, FP, FN = 2.00,  1.00, 90.00

Precision=TP/(TP+FP)
Recall=TP/(TP+FN)

f1_score=2*Precision*Recall/(Precision+Recall)
print(f1_score)


0.042105263157894736


In [5]:
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score,confusion_matrix

y_true = [1, 0, 1, 1, 0, 1]
y_pred = [1, 0, 1, 0, 0, 1]

conf_matrix = confusion_matrix(y_true, y_pred)
acc = accuracy_score(y_true, y_pred)
prec = precision_score(y_true, y_pred)
rec = recall_score(y_true, y_pred)
f1 = f1_score(y_true, y_pred)

print(conf_matrix)
print("Accuracy: ", acc)
print("Precision: ", prec)
print("Recall: ", rec)
print("F1-score: ", f1)

[[2 0]
 [1 3]]
Accuracy:  0.8333333333333334
Precision:  1.0
Recall:  0.75
F1-score:  0.8571428571428571
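
When reading the matrix printed above, it helps to know that scikit-learn puts the true labels on the rows and the predicted labels on the columns, with classes in sorted order. For a binary problem with labels 0 and 1, flattening the matrix therefore yields TN, FP, FN, TP in that order. The following sketch reuses the same labels and recomputes precision and recall from the unpacked counts.

```python
from sklearn.metrics import confusion_matrix

y_true = [1, 0, 1, 1, 0, 1]
y_pred = [1, 0, 1, 0, 0, 1]

# Rows are true labels, columns are predicted labels:
# [[TN, FP],
#  [FN, TP]]
tn, fp, fn, tp = (int(v) for v in confusion_matrix(y_true, y_pred).ravel())
print(tn, fp, fn, tp)  # 2 0 1 3

# Recompute the metrics by hand from the counts
precision = tp / (tp + fp)
recall = tp / (tp + fn)
print(precision, recall)  # 1.0 0.75
```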


In [6]:
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score, confusion_matrix

# True labels of the data
y_true = [0, 1, 0, 1, 1, 0, 1, 1, 0, 0]

# Predicted labels of the data
y_pred = [0, 1, 0, 1, 0, 1, 1, 0, 1, 0]

# Calculate accuracy
accuracy = accuracy_score(y_true, y_pred)
print("Accuracy: ", accuracy)

# Calculate precision
precision = precision_score(y_true, y_pred)
print("Precision: ", precision)

# Calculate recall
recall = recall_score(y_true, y_pred)
print("Recall: ", recall)

# Calculate F1 Score
f1 = f1_score(y_true, y_pred)
print("F1 Score: ", f1)

# Calculate Confusion Matrix
conf_mat = confusion_matrix(y_true, y_pred)
print("Confusion Matrix: \n", conf_mat)

Accuracy:  0.6
Precision:  0.6
Recall:  0.6
F1 Score:  0.6
Confusion Matrix: 
 [[3 2]
 [2 3]]


In [7]:
X_train = [[4,2,1],[3,4,6],[5,6,7],[8,9,7]]
y_train = [1,2,1,2]
X_test = [[4,3,1],[2,4,3],[5,6,1],[5,9,9]]
y_test = [1,2,2,2]

In [8]:
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import precision_score

# Train the logistic regression model
log_reg = LogisticRegression(random_state=0)
log_reg.fit(X_train, y_train)

# Make predictions with the logistic regression model
y_pred_log_reg = log_reg.predict(X_test)

# Calculate the evaluation metrics for logistic regression
prec_log_reg = precision_score(y_test, y_pred_log_reg, average="weighted")

# Print the evaluation metrics for logistic regression
print("Precision: ", prec_log_reg)


Precision:  0.8333333333333334
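
The `average="weighted"` option used above computes the metric separately for each class and then averages the per-class values, weighting each by its support (the number of true instances of that class). The sketch below reproduces that calculation by hand. The predictions in `y_pred` are hypothetical and are not claimed to be the output of the model trained above.

```python
import numpy as np
from sklearn.metrics import precision_score

y_true = [1, 2, 2, 2]
y_pred = [1, 1, 2, 2]  # hypothetical predictions, for illustration only

labels = [1, 2]

# Per-class precision: 0.5 for class 1, 1.0 for class 2
per_class = precision_score(y_true, y_pred, labels=labels, average=None)

# Support: number of true instances of each class
support = np.array([y_true.count(label) for label in labels])

# Support-weighted average of the per-class precisions
manual_weighted = float(np.sum(per_class * support) / support.sum())

# The same value as computed by scikit-learn
sk_weighted = float(precision_score(y_true, y_pred, average="weighted"))

print(manual_weighted, sk_weighted)  # 0.875 0.875
```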


In [9]:
X_train = [[4,2,1],[3,4,6],[5,6,7],[8,9,7]]
y_train = [1,2,1,2]
X_test = [[4,3,1],[2,4,3],[5,6,1],[5,9,9]]
y_test = [1,2,2,2]

In [10]:
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import recall_score

# Train the decision tree model
dt = DecisionTreeClassifier(random_state=0)
dt.fit(X_train, y_train)

# Make predictions with the decision tree model
y_pred_dt = dt.predict(X_test)

# Calculate the evaluation metrics for decision tree
rec_dt = recall_score(y_test, y_pred_dt, average="weighted")

# Print the evaluation metrics for the decision tree
print("Recall: ", rec_dt)

Recall:  0.75


##### Fit a decision tree model and a logistic regression model using the given input and output patterns X and Y, respectively. Once fitted, determine which model performs best in terms of F1 score.

In [12]:
from sklearn.datasets import make_blobs

X_train, y_train = make_blobs(n_samples=100, centers=2,
                  random_state=0, cluster_std=2.3)

X_test, y_test = make_blobs(n_samples=10, centers=2,
                  random_state=0, cluster_std=4.5)

In [13]:
from sklearn.tree import DecisionTreeClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score

# Train the decision tree model
dt = DecisionTreeClassifier(random_state=0)
dt.fit(X_train, y_train)

# Make predictions with the decision tree model
y_pred_dt = dt.predict(X_test)

# F1 score of the decision tree on the test set
f1_dt = f1_score(y_test, y_pred_dt)
print("DT F1 Score:", f1_dt)

# Train the logistic regression model
lr = LogisticRegression(random_state=0)
lr.fit(X_train, y_train)

# Make predictions with the logistic regression model
y_pred_lr = lr.predict(X_test)

# F1 score of the logistic regression model on the test set
f1_lr = f1_score(y_test, y_pred_lr)
print("LR F1 Score:", f1_lr)

DT F1 Score: 0.8
LR F1 Score: 0.7272727272727273
