## Confusion Matrix 

#### Hemant Thapa

In [1]:
import numpy as np
import pandas as pd
from sklearn.metrics import confusion_matrix

- True Positive (TP):

A True Positive occurs when a model or classifier correctly predicts a positive or "yes" class when the true label is indeed positive. In other words, it correctly identifies the presence of a condition or event when it is actually present.

- True Negative (TN):

A True Negative occurs when a model or classifier correctly predicts a negative or "no" class when the true label is indeed negative. It correctly identifies the absence of a condition or event when it is actually absent.

- False Positive (FP):

A False Positive occurs when a model or classifier incorrectly predicts a positive or "yes" class when the true label is actually negative. It means that the model has falsely identified the presence of a condition or event when it is not present. False Positives are also known as Type I errors.

- False Negative (FN):

A False Negative occurs when a model or classifier incorrectly predicts a negative or "no" class when the true label is actually positive. It means that the model has missed a condition or event that is actually present. False Negatives are also known as Type II errors.
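The four definitions above can be checked by hand before reaching for a library. The following is a minimal sketch in plain Python; the toy labels (1 = positive, 0 = negative) and the `counts` dictionary are assumptions made only for illustration.

```python
# Hypothetical toy labels: 1 = positive, 0 = negative.
y_true = [0, 1, 0, 1]
y_pred = [1, 1, 1, 0]

counts = {"TP": 0, "TN": 0, "FP": 0, "FN": 0}
for truth, pred in zip(y_true, y_pred):
    if truth == 1 and pred == 1:
        counts["TP"] += 1   # positive case, predicted positive
    elif truth == 0 and pred == 0:
        counts["TN"] += 1   # negative case, predicted negative
    elif truth == 0 and pred == 1:
        counts["FP"] += 1   # negative case, predicted positive (Type I error)
    else:
        counts["FN"] += 1   # positive case, predicted negative (Type II error)

print(counts)  # {'TP': 1, 'TN': 0, 'FP': 2, 'FN': 1}
```

Every sample falls into exactly one of the four buckets, so the counts always sum to the number of samples.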

#### Example 1

In [2]:
#creating dataset of true values and predicted values
data = {'y_true': [0, 1, 0, 1],
        'y_pred': [1, 1, 1, 0]
       }

In [3]:
#calculating confusion matrix
cm = confusion_matrix(data['y_true'], data['y_pred'])

In [4]:
print("Confusion Matrix: \n")
print(cm)

Confusion Matrix: 

[[0 2]
 [1 1]]


In [5]:
#unpacking the four cells of the 2x2 matrix (rows = true class, columns = predicted class)
tn = cm[0, 0]
fp = cm[0, 1]
fn = cm[1, 0]
tp = cm[1, 1]

In [6]:
print(f"True Positives (TP): {tp}")
print(f"True Negatives (TN): {tn}")

True Positives (TP): 1
True Negatives (TN): 0


True Positive (TP): Correctly identifying a positive case.

True Negative (TN): Correctly identifying a negative case.

In [7]:
print(f"False Positives (FP): {fp}")
print(f"False Negatives (FN): {fn}")

False Positives (FP): 2
False Negatives (FN): 1


False Positive (FP): Predicting positive for a case that is actually negative (Type I error).

False Negative (FN): Predicting negative for a case that is actually positive (Type II error).
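For a binary problem, the four cells can also be unpacked in one step by flattening the matrix with `ravel()`. This is a small sketch that repeats the Example 1 labels so it runs on its own; passing `labels=[0, 1]` is an added precaution (not in the cells above) that fixes the row and column order even if one class were absent from the data.

```python
from sklearn.metrics import confusion_matrix

# Same labels as Example 1, repeated here so the snippet is self-contained.
y_true = [0, 1, 0, 1]
y_pred = [1, 1, 1, 0]

# Row-major flattening of [[TN, FP], [FN, TP]] gives TN, FP, FN, TP.
tn, fp, fn, tp = confusion_matrix(y_true, y_pred, labels=[0, 1]).ravel()
print(tn, fp, fn, tp)  # 0 2 1 1
```

The unpacking order depends on scikit-learn's layout, where rows are true classes and columns are predicted classes.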

#### Example 2

In [8]:
#creating image-label dataset of true classes and predicted classes
image_data = { 'y_true': ["cat", "ant", "cat", "cat", "ant", "bird"],
               'y_pred': ["ant", "ant", "cat", "cat", "ant", "cat"]
             }

In [9]:
#class order used for the rows and columns of the matrix
labels = ["ant", "bird", "cat"]
#confusion matrix (rows = true class, columns = predicted class)
cm = confusion_matrix(image_data['y_true'], image_data['y_pred'], labels=labels)
print("Confusion Matrix: \n")
print(cm)

Confusion Matrix: 

[[2 0 0]
 [0 0 1]
 [1 0 2]]
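A bare array is easy to misread once there are more than two classes. One option, sketched below, is to wrap the matrix in a pandas DataFrame so each row (true class) and column (predicted class) carries its name. The variable name `cm_df` and the axis names `true` and `predicted` are illustrative choices, not part of the original notebook.

```python
import pandas as pd
from sklearn.metrics import confusion_matrix

# Same data as Example 2, repeated so the snippet runs on its own.
y_true = ["cat", "ant", "cat", "cat", "ant", "bird"]
y_pred = ["ant", "ant", "cat", "cat", "ant", "cat"]
labels = ["ant", "bird", "cat"]

cm = confusion_matrix(y_true, y_pred, labels=labels)

# Label the axes: rows are true classes, columns are predicted classes.
cm_df = pd.DataFrame(
    cm,
    index=pd.Index(labels, name="true"),
    columns=pd.Index(labels, name="predicted"),
)
print(cm_df)
```

Reading across the `cat` row, one of the three cats was predicted as `ant` and two were predicted correctly. The `bird` column is all zeros because the model never predicted `bird`.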


In [10]:
#one dictionary of counts per class
metrics = {label: {'TP': 0, 'FP': 0, 'FN': 0, 'TN': 0} for label in labels}

#one-vs-rest counts: row i holds the true class, column i the predicted class
for i, label in enumerate(labels):
    TP = cm[i, i]
    FN = cm[i, :].sum() - TP
    FP = cm[:, i].sum() - TP
    TN = cm.sum() - (FP + FN + TP)
    
    metrics[label]['TP'] = TP
    metrics[label]['FN'] = FN
    metrics[label]['FP'] = FP
    metrics[label]['TN'] = TN
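The same one-vs-rest counts can be computed for all classes at once with NumPy array operations instead of a loop. This is a sketch under the assumption that the matrix is the 3x3 array printed above (hard-coded here so the snippet is self-contained); the lowercase array names are illustrative.

```python
import numpy as np

# Confusion matrix from Example 2 with labels ordered ant, bird, cat.
cm = np.array([[2, 0, 0],
               [0, 0, 1],
               [1, 0, 2]])

tp = np.diag(cm)                 # correct predictions sit on the diagonal
fn = cm.sum(axis=1) - tp         # rest of each row: true class i, predicted as another class
fp = cm.sum(axis=0) - tp         # rest of each column: predicted i, true class is another
tn = cm.sum() - (tp + fp + fn)   # every sample that does not involve class i

print(tp, fp, fn, tn)  # [2 0 2] [1 0 1] [0 1 1] [3 5 2]
```

Each array has one entry per class, in the same order as the labels.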


| True Class (y_true) | Predicted Class (y_pred) | Outcome  |
|---------------------|---------------------------|----------|
|         cat         |            ant            | Incorrect |
|         ant         |            ant            | Correct  |
|         cat         |            cat            | Correct  |
|         cat         |            cat            | Correct  |
|         ant         |            ant            | Correct  |
|         bird        |            cat            | Incorrect |


In [11]:
cm_result = pd.DataFrame(metrics).T

In [12]:
cm_result

| Class | TP | FP | FN | TN |
|-------|----|----|----|----|
| ant   | 2  | 1  | 0  | 3  |
| bird  | 0  | 0  | 1  | 5  |
| cat   | 2  | 1  | 1  | 2  |
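As a cross-check on the per-class counts above, scikit-learn provides `multilabel_confusion_matrix`, which returns one 2x2 matrix per class laid out as `[[TN, FP], [FN, TP]]`. The sketch below repeats the Example 2 data so it runs on its own; the `per_class` dictionary is an illustrative name.

```python
from sklearn.metrics import multilabel_confusion_matrix

y_true = ["cat", "ant", "cat", "cat", "ant", "bird"]
y_pred = ["ant", "ant", "cat", "cat", "ant", "cat"]
labels = ["ant", "bird", "cat"]

# Shape (n_classes, 2, 2); each slice treats one class as "positive" and the rest as "negative".
mcm = multilabel_confusion_matrix(y_true, y_pred, labels=labels)

per_class = {}
for label, ((tn, fp), (fn, tp)) in zip(labels, mcm):
    per_class[label] = {"TP": int(tp), "FP": int(fp), "FN": int(fn), "TN": int(tn)}

print(per_class)
```

The result should agree with the counts derived manually from the 3x3 matrix.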


In [13]:
#integer-encoded labels: cat -> 0, ant -> 1 and bird -> 2 (three classes, not binary)
image_data_binary = { 'y_true': [0, 1, 0, 0, 1, 2],
                      'y_pred': [1, 1, 0, 0, 1, 0]
                     }

In [14]:
cm = confusion_matrix(image_data_binary['y_true'], image_data_binary['y_pred'], labels=[1, 2, 0])

In [15]:
print("Confusion Matrix: \n")
print(cm)

Confusion Matrix: 

[[2 0 0]
 [0 0 1]
 [1 0 2]]
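The matrix above matches the string-labelled one because `labels=[1, 2, 0]` puts the classes in the order ant, bird, cat. The sketch below illustrates that the `labels` argument only permutes rows and columns: it compares the default sorted order `[0, 1, 2]` with the reordered version. The names `cm_sorted`, `cm_reordered`, and `order` are illustrative.

```python
import numpy as np
from sklearn.metrics import confusion_matrix

# Integer-encoded labels: cat -> 0, ant -> 1, bird -> 2.
y_true = [0, 1, 0, 0, 1, 2]
y_pred = [1, 1, 0, 0, 1, 0]

cm_sorted = confusion_matrix(y_true, y_pred, labels=[0, 1, 2])   # cat, ant, bird
order = [1, 2, 0]                                                # ant, bird, cat
cm_reordered = confusion_matrix(y_true, y_pred, labels=order)

# Reordering labels is the same as permuting both rows and columns.
print(np.array_equal(cm_reordered, cm_sorted[np.ix_(order, order)]))  # True
```

Because the counts themselves do not change, any per-class TP, FP, FN, and TN values are the same under either ordering, as long as each row and column is matched to the right class name.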


In [16]:
#map integer codes back to class names; labels sets the row/column order (ant, bird, cat)
label_mapping = {0: "cat", 1: "ant", 2: "bird"}
labels = [1, 2, 0]
cm = confusion_matrix(image_data_binary['y_true'], image_data_binary['y_pred'], labels=labels)

In [17]:
#one dictionary of counts per class, keyed by class name
metrics = {label_mapping[label]: {'TP': 0, 'FP': 0, 'FN': 0, 'TN': 0} for label in labels}
#one-vs-rest counts: row i holds the true class, column i the predicted class
for i, label in enumerate(labels):
    TP = cm[i, i]
    FN = cm[i, :].sum() - TP
    FP = cm[:, i].sum() - TP
    TN = cm.sum() - (TP + FP + FN)

    metrics[label_mapping[label]]['TP'] = TP
    metrics[label_mapping[label]]['FN'] = FN
    metrics[label_mapping[label]]['FP'] = FP
    metrics[label_mapping[label]]['TN'] = TN

| True Class (y_true) | Predicted Class (y_pred) | Outcome  |
|---------------------|---------------------------|----------|
|         cat         |            ant            | Incorrect |
|         ant         |            ant            | Correct  |
|         cat         |            cat            | Correct  |
|         cat         |            cat            | Correct  |
|         ant         |            ant            | Correct  |
|         bird        |            cat            | Incorrect |


In [18]:
cm_result = pd.DataFrame(metrics).T
cm_result

| Class | TP | FP | FN | TN |
|-------|----|----|----|----|
| ant   | 2  | 1  | 0  | 3  |
| bird  | 0  | 0  | 1  | 5  |
| cat   | 2  | 1  | 1  | 2  |


- [Semi-supervised learning on digits with Label Propagation](https://scikit-learn.org/stable/auto_examples/semi_supervised/plot_label_propagation_digits.html#sphx-glr-auto-examples-semi-supervised-plot-label-propagation-digits-py)
- [ConfusionMatrixDisplay documentation](https://scikit-learn.org/stable/modules/generated/sklearn.metrics.ConfusionMatrixDisplay.html)
- [confusion_matrix function documentation](https://scikit-learn.org/stable/modules/generated/sklearn.metrics.confusion_matrix.html)
