# Confusion Matrix For Binary Class Classification

A confusion matrix is a performance measurement for machine learning classification problems where the output can be two or more classes. For a binary problem, it is a table with the 4 different combinations of predicted and actual values.

![image-3.png](attachment:image-3.png)

It is extremely useful for measuring Recall, Precision, Specificity, Accuracy and, most importantly, AUC-ROC curves.

***Let’s understand TP, FP, FN, TN in terms of a pregnancy analogy.***

![image-4.png](attachment:image-4.png)

### True Positive:

**Interpretation**: You predicted positive and it’s true.

    You predicted that a woman is pregnant and she actually is.

### True Negative:

**Interpretation**: You predicted negative and it’s true.

    You predicted that a man is not pregnant and he actually is not.

### False Positive: (Type 1 Error)

**Interpretation**: You predicted positive and it’s false.

    You predicted that a man is pregnant but he actually is not.

### False Negative: (Type 2 Error)

**Interpretation**: You predicted negative and it’s false.

    You predicted that a woman is not pregnant but she actually is.

***Just remember: Positive and Negative describe the predicted value, while True and False describe whether that prediction matches the actual value.***

![image-5.png](attachment:image-5.png)

### How to Calculate a Confusion Matrix for a 2-class classification problem?

Let’s understand the confusion matrix through math.

![image-6.png](attachment:image-6.png)
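
The counting behind the four cells can be sketched in a few lines of Python. The labels below are hypothetical and are not the values from the figure above; `confusion_counts` is an illustrative helper name, with 1 standing for the positive class (pregnant) and 0 for the negative class.

```python
# Hypothetical labels for illustration only (not taken from the figure):
# 1 = positive (pregnant), 0 = negative (not pregnant).
actual    = [1, 0, 1, 1, 0, 0, 1, 0]
predicted = [1, 0, 0, 1, 1, 0, 1, 0]

def confusion_counts(actual, predicted, positive=1):
    """Return (TP, TN, FP, FN) for a binary classification problem."""
    tp = tn = fp = fn = 0
    for a, p in zip(actual, predicted):
        if p == positive and a == positive:
            tp += 1          # predicted positive, and it is true
        elif p != positive and a != positive:
            tn += 1          # predicted negative, and it is true
        elif p == positive and a != positive:
            fp += 1          # predicted positive, but it is false (Type 1 error)
        else:
            fn += 1          # predicted negative, but it is false (Type 2 error)
    return tp, tn, fp, fn

tp, tn, fp, fn = confusion_counts(actual, predicted)
print(f"TP={tp} TN={tn} FP={fp} FN={fn}")  # prints: TP=3 TN=3 FP=1 FN=1
```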

## Recall

![image-7.png](attachment:image-7.png)

The above equation, Recall = TP / (TP + FN), can be read as: out of all the actual positive samples, how many did we predict correctly?

##### Recall should be as high as possible.

## Precision

![image-8.png](attachment:image-8.png)

The above equation, Precision = TP / (TP + FP), can be read as: out of all the samples we predicted as positive, how many are actually positive?

##### Precision should be as high as possible.

## Accuracy

Out of all the samples (positive and negative), how many did we predict correctly? That is, Accuracy = (TP + TN) / (TP + TN + FP + FN). In this case, it will be 4/7.

##### Accuracy should be as high as possible.

## F-measure

![image-9.png](attachment:image-9.png)
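
As a minimal sketch, the four measures in this section can be computed from the confusion matrix counts as follows. It assumes the F-measure is the usual F1-score, the harmonic mean of precision and recall; the counts passed in are hypothetical, and `binary_metrics` is an illustrative name.

```python
def binary_metrics(tp, tn, fp, fn):
    """Compute recall, precision, accuracy and F1 from confusion matrix counts."""
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    # F1 is the harmonic mean of precision and recall.
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return {"recall": recall, "precision": precision,
            "accuracy": accuracy, "f1": f1}

# Hypothetical counts, chosen only so that the four measures differ.
m = binary_metrics(tp=3, tn=4, fp=2, fn=1)
for name, value in m.items():
    print(f"{name}: {value:.2f}")
```

Because F1 is a harmonic mean, it stays low whenever either precision or recall is low, which is why it is often preferred over a simple average of the two.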

# Confusion Matrix for Multi-Class Classification

For simplicity’s sake, let’s consider our multi-class classification problem to be a 3-class classification problem. Say, we have a dataset that has three class labels, namely Apple, Orange and Mango. The following is a possible confusion matrix for these classes.

![image.png](attachment:image.png)

Unlike binary classification, there are no positive or negative classes here, so at first it might seem difficult to find TP, TN, FP and FN. It is actually pretty easy: we find TP, TN, FP and FN for each class individually, treating that class as positive and all the other classes as negative. For example, for class Apple, the values read from the confusion matrix are:

- TP = 7
- TN = (2+3+2+1) = 8
- FP = (8+9) = 17
- FN = (1+3) = 4

Now that we have all the necessary counts for class Apple from the confusion matrix, we can calculate its performance measures. For example, class Apple has

- Precision = 7/(7+17) = 0.29
- Recall = 7/(7+4) = 0.64
- F1-score = 0.40

Similarly, we can calculate the measures for the other classes. Here is a table that shows the values of each measure for each class.

![image-2.png](attachment:image-2.png)
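
The per-class calculation can be sketched in Python. The matrix below is reconstructed from the counts quoted in the text (for example 7, 1+3 and 8+9 for Apple). Its layout, with rows as actual classes and columns as predicted classes, is an assumption and may be transposed relative to the figure; `per_class_counts` is an illustrative helper name.

```python
labels = ["Apple", "Orange", "Mango"]

# Assumed layout: matrix[i][j] = samples whose actual class is labels[i]
# and whose predicted class is labels[j]. Values reconstructed from the text.
matrix = [
    [7, 1, 3],   # actual Apple  (11 samples)
    [8, 2, 2],   # actual Orange (12 samples)
    [9, 3, 1],   # actual Mango  (13 samples)
]

def per_class_counts(matrix, k):
    """Return (TP, TN, FP, FN) for class index k, treating it as the positive class."""
    n = len(matrix)
    total = sum(sum(row) for row in matrix)
    tp = matrix[k][k]
    fn = sum(matrix[k]) - tp                        # actual k, predicted as another class
    fp = sum(matrix[i][k] for i in range(n)) - tp   # predicted k, actually another class
    tn = total - tp - fn - fp                       # everything not involving class k
    return tp, tn, fp, fn

for k, name in enumerate(labels):
    tp, tn, fp, fn = per_class_counts(matrix, k)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    print(f"{name}: TP={tp} TN={tn} FP={fp} FN={fn} "
          f"precision={precision:.2f} recall={recall:.2f} F1={f1:.2f}")
```

For class Apple this reproduces the counts and scores worked out by hand above.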

## Micro F1

This is called the micro-averaged F1-score. It is calculated from the total TP, total FP and total FN of the model. It does not consider each class individually; it calculates the metrics globally. So for our example,

    Total TP = (7+2+1) = 10
    Total FP = (8+9)+(1+3)+(3+2) = 26
    Total FN = (1+3)+(8+2)+(9+3) = 26

Hence,

    Precision = 10/(10+26) = 0.28
    Recall = 10/(10+26) = 0.28

Now we can use the regular formula for F1-score and get the Micro F1-score using the above precision and recall.

## Micro F1 = 0.28

As you can see, when we calculate the metrics globally, all the measures become equal. If you also calculate accuracy (10/36 = 0.28), you will see that, for a problem like this one where every sample has exactly one label,

    Precision = Recall = Micro F1 = Accuracy
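
This equality can be checked with a short sketch. The matrix is reconstructed from the counts in the text, and its layout (rows as actual classes, columns as predicted classes) is an assumption.

```python
# Assumed layout: rows = actual class, columns = predicted class
# (order: Apple, Orange, Mango). Values reconstructed from the text.
matrix = [
    [7, 1, 3],
    [8, 2, 2],
    [9, 3, 1],
]
n = len(matrix)
total = sum(sum(row) for row in matrix)

# Pool the counts over all classes instead of scoring each class separately.
total_tp = sum(matrix[k][k] for k in range(n))
total_fp = sum(sum(matrix[i][k] for i in range(n)) - matrix[k][k] for k in range(n))
total_fn = sum(sum(matrix[k]) - matrix[k][k] for k in range(n))

micro_precision = total_tp / (total_tp + total_fp)
micro_recall = total_tp / (total_tp + total_fn)
micro_f1 = 2 * micro_precision * micro_recall / (micro_precision + micro_recall)
accuracy = total_tp / total

print(f"Total TP={total_tp} FP={total_fp} FN={total_fn}")
print(f"precision={micro_precision:.2f} recall={micro_recall:.2f} "
      f"micro F1={micro_f1:.2f} accuracy={accuracy:.2f}")
```

Every misclassified sample is a false positive for one class and a false negative for another, so the total FP and total FN are always equal here, which is why all four values coincide.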

## Macro F1

This is the macro-averaged F1-score. It calculates the metrics for each class individually and then takes the unweighted mean of the measures. As we have seen in the table “Precision, Recall and F1-score for Each Class”,

    Class Apple F1-score = 0.40
    Class Orange F1-score = 0.22
    Class Mango F1-score = 0.11

Hence,

    Macro F1 = (0.40+0.22+0.11)/3 = 0.24
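
A minimal sketch of the same calculation, starting from per-class TP, FP and FN counts taken from the worked example (the helper name `f1_from_counts` is illustrative):

```python
def f1_from_counts(tp, fp, fn):
    """F1-score written directly in terms of counts: 2TP / (2TP + FP + FN)."""
    return 2 * tp / (2 * tp + fp + fn)

# (TP, FP, FN) per class, as derived in the text.
counts = {
    "Apple":  (7, 17, 4),
    "Orange": (2, 4, 10),
    "Mango":  (1, 5, 12),
}

per_class_f1 = {name: f1_from_counts(*c) for name, c in counts.items()}

# Macro F1: every class contributes equally, regardless of its size.
macro_f1 = sum(per_class_f1.values()) / len(per_class_f1)

for name, score in per_class_f1.items():
    print(f"{name} F1 = {score:.2f}")
print(f"Macro F1 = {macro_f1:.2f}")
```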

## Weighted F1

The last one is the weighted-average F1-score. Unlike Macro F1, it takes a weighted mean of the measures. The weight for each class is the total number of samples of that class. Since we had 11 Apples, 12 Oranges and 13 Mangoes,

    Weighted F1 = ((0.40*11)+(0.22*12)+(0.11*13))/(11+12+13) = 0.24
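
The weighted mean can be sketched the same way. This version uses the unrounded per-class F1-scores, so the result (about 0.234) comes out slightly lower than the figure above, which was obtained from scores already rounded to two decimals. The variable names are illustrative.

```python
# (TP, FP, FN) per class, as derived in the text.
counts = {
    "Apple":  (7, 17, 4),
    "Orange": (2, 4, 10),
    "Mango":  (1, 5, 12),
}

weighted_sum = 0.0
total_samples = 0
for name, (tp, fp, fn) in counts.items():
    f1 = 2 * tp / (2 * tp + fp + fn)   # per-class F1 from counts
    support = tp + fn                   # number of actual samples of this class
    weighted_sum += f1 * support
    total_samples += support

# Weighted F1: each class contributes in proportion to its number of samples.
weighted_f1 = weighted_sum / total_samples
print(f"Weighted F1 = {weighted_f1:.3f}")
```

Because Mango has the most samples and the lowest F1-score, the weighted average is pulled slightly below the Macro F1.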