### Confusion Matrix

**A confusion matrix shows the number of correct and incorrect predictions made by a classification model, compared to the actual outcomes (target values) in the data.** <br>
<br>
_A confusion matrix is an N x N matrix used for evaluating the performance of a classification model, where N is the number of target classes. The matrix compares the actual target values with those predicted by the machine learning model. This shows not only how often the model is right, but also which kinds of errors it makes._
<br>
**Let’s decipher the matrix:** <br>
- The target variable has two values: Positive or Negative
- The columns represent the actual values of the target variable
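- The rows represent the predicted values of the target variable (scikit-learn's `confusion_matrix`, used later in this section, follows the opposite layout: rows are actual values and columns are predicted values)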
- In general, the matrix is N x N, where N is the number of target values (classes); with two classes it is 2 x 2.

<img src='./ConfusionMatrix.png' width=400 height=400>
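
To illustrate the N x N case, here is a minimal pure-Python sketch that builds a 3 x 3 matrix. The function name `build_confusion_matrix`, the class labels, and the sample data are all hypothetical and chosen only for illustration. The sketch uses the layout described above (rows are predicted values, columns are actual values); transposing the result gives the rows-are-actual layout that scikit-learn uses.

```python
def build_confusion_matrix(actual, predicted, labels):
    """Return an N x N list of lists: rows = predicted class, columns = actual class."""
    index = {label: i for i, label in enumerate(labels)}
    n = len(labels)
    matrix = [[0] * n for _ in range(n)]
    for a, p in zip(actual, predicted):
        # one more sample that was actually `a` and was predicted as `p`
        matrix[index[p]][index[a]] += 1
    return matrix


# hypothetical three-class data
labels = ["cat", "dog", "bird"]
actual_3 = ["cat", "dog", "bird", "cat", "dog", "bird", "cat"]
predicted_3 = ["cat", "dog", "cat", "cat", "bird", "bird", "dog"]

matrix_3 = build_confusion_matrix(actual_3, predicted_3, labels)
for label, row in zip(labels, matrix_3):
    print("predicted", label, row)

# correct predictions sit on the diagonal
correct = sum(matrix_3[i][i] for i in range(len(labels)))
print("correct predictions:", correct, "of", len(actual_3))
```

Every off-diagonal cell counts one specific kind of confusion, for example a bird predicted as a cat.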

- **True Positive (TP)** <br>
    * The predicted value matches the actual value
    * The actual value was positive and the model predicted a positive value
- **True Negative (TN)** <br>
    * The predicted value matches the actual value
    * The actual value was negative and the model predicted a negative value
- **False Positive (FP) – Type 1 error** <br>
    * The predicted value does not match the actual value
    * The actual value was negative but the model predicted a positive value
- **False Negative (FN) – Type 2 error** <br>
    * The predicted value does not match the actual value
    * The actual value was positive but the model predicted a negative value
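
The four outcomes can also be counted by hand, which makes the definitions concrete. The sketch below is a minimal pure-Python illustration; the helper name `count_outcomes` and its `positive` parameter are hypothetical, and the two lists are the same `actual` and `predicted` values used in the scikit-learn example in this document.

```python
def count_outcomes(actual, predicted, positive=1):
    """Count TP, TN, FP and FN for a binary problem (hypothetical helper)."""
    tp = tn = fp = fn = 0
    for a, p in zip(actual, predicted):
        if a == positive and p == positive:
            tp += 1  # actual positive, predicted positive
        elif a != positive and p != positive:
            tn += 1  # actual negative, predicted negative
        elif p == positive:
            fp += 1  # actual negative, predicted positive (Type 1 error)
        else:
            fn += 1  # actual positive, predicted negative (Type 2 error)
    return {"TP": tp, "TN": tn, "FP": fp, "FN": fn}


actual = [1, 0, 0, 0, 1, 0, 1, 0, 1, 0]
predicted = [1, 0, 0, 0, 1, 0, 0, 1, 0, 0]

outcomes = count_outcomes(actual, predicted)
print(outcomes)
```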

- Confusion matrix metrics: 
    * **Accuracy:** the proportion of all predictions that were correct. <br> Accuracy = (TP + TN) / (TP + FP + TN + FN)
    * **Positive Predictive Value or Precision:** the proportion of predicted positive cases that are actually positive. <br> Precision = TP / (TP + FP)
    * **Negative Predictive Value:** the proportion of predicted negative cases that are actually negative. <br> NPV = TN / (TN + FN)
    * **Sensitivity or Recall:** the proportion of actual positive cases that are correctly identified. <br> Recall = TP / (TP + FN)
    * **Specificity:** the proportion of actual negative cases that are correctly identified. <br> Specificity = TN / (TN + FP)
    * **F1-score:** the harmonic mean of Precision and Recall. <br> F1-score = 2 / ((1 / Recall) + (1 / Precision)) = 2 × Precision × Recall / (Precision + Recall)
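
As a worked example, the formulas above can be applied to the counts from the scikit-learn example in this document (TP = 2, TN = 5, FP = 1, FN = 2). This is a minimal sketch: the names `safe_div` and `confusion_metrics` are hypothetical, and returning `nan` for a zero denominator is one possible convention, not the only one. F1 is computed as 2TP / (2TP + FP + FN), which is algebraically the same as the harmonic mean of Precision and Recall but avoids dividing by a zero Precision or Recall.

```python
def safe_div(numerator, denominator):
    """Return numerator / denominator, or nan when the denominator is zero (assumed convention)."""
    return numerator / denominator if denominator else float("nan")


def confusion_metrics(tp, tn, fp, fn):
    """Compute the metrics listed above from the four outcome counts."""
    return {
        "accuracy": safe_div(tp + tn, tp + fp + tn + fn),
        "precision": safe_div(tp, tp + fp),
        "npv": safe_div(tn, tn + fn),
        "recall": safe_div(tp, tp + fn),
        "specificity": safe_div(tn, tn + fp),
        # equivalent to 2 / ((1 / recall) + (1 / precision))
        "f1": safe_div(2 * tp, 2 * tp + fp + fn),
    }


metrics = confusion_metrics(tp=2, tn=5, fp=1, fn=2)
for name, value in metrics.items():
    print(f"{name:<12}{value:.2f}")
```

With these counts, accuracy is 0.70 while recall is only 0.50, which shows how accuracy alone can hide a model that misses half of the positive cases.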

### Confusion Matrix using sklearn

In [6]:
# importing modules and libraries
from sklearn.metrics import confusion_matrix
from sklearn.metrics import classification_report
import pandas as pd

In [7]:
# actual values
actual = [1,0,0,0,1,0,1,0,1,0]
# predicted values
predicted = [1,0,0,0,1,0,0,1,0,0]

In [11]:
# confusion matrix (rows = actual classes, columns = predicted classes)
labels = [0, 1]
matrix = confusion_matrix(actual, predicted, labels=labels)
print('Confusion matrix : \n', matrix)
# wrap the same matrix in a labelled DataFrame for readability
pd.DataFrame(matrix,
             columns=["Predicted Class " + str(class_name) for class_name in labels],
             index=["Actual Class " + str(class_name) for class_name in labels])

Confusion matrix : 
 [[5 1]
 [2 2]]


| | Predicted Class 0 | Predicted Class 1 |
|---|---|---|
| Actual Class 0 | 5 | 1 |
| Actual Class 1 | 2 | 2 |


In [16]:
# sklearn flattens the 2 x 2 matrix row by row, so the order is TN, FP, FN, TP
tn, fp, fn, tp = confusion_matrix(actual, predicted, labels=[0, 1]).ravel()
print('Outcome values TP = {}, FN = {}, FP = {} and TN = {} \n'.format(tp, fn, fp, tn))

# classification report with per-class precision, recall and f1-score
report = classification_report(actual, predicted, labels=[0, 1])
print('Classification report : \n', report)

Outcome values TP = 2, FN = 2, FP = 1 and TN = 5 

Classification report : 
              precision    recall  f1-score   support

          0       0.71      0.83      0.77         6
          1       0.67      0.50      0.57         4

avg / total       0.70      0.70      0.69        10
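
To see where the numbers in the report come from, the sketch below recomputes them in plain Python from the confusion matrix printed earlier. The names `cm` and `per_class_scores` are hypothetical. The sketch treats the `avg / total` row as a support-weighted average of the per-class scores, and it assumes every class is predicted at least once, so no denominator is zero.

```python
# confusion matrix from the output above: rows = actual class, columns = predicted class
cm = [[5, 1],
      [2, 2]]


def per_class_scores(cm, k):
    """Return (precision, recall, f1, support) when class k is treated as the positive class."""
    tp = cm[k][k]
    predicted_k = sum(row[k] for row in cm)  # column sum: everything predicted as k
    actual_k = sum(cm[k])                    # row sum: everything that is actually k
    precision = tp / predicted_k
    recall = tp / actual_k
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1, actual_k


rows = [per_class_scores(cm, k) for k in range(len(cm))]
total = sum(r[3] for r in rows)

# support-weighted average of precision, recall and f1
weighted = [sum(r[i] * r[3] for r in rows) / total for i in range(3)]

for k, (p, r, f, s) in enumerate(rows):
    print(f"{k:>11}  {p:.2f}  {r:.2f}  {f:.2f}  {s}")
print("avg / total  {:.2f}  {:.2f}  {:.2f}  {}".format(*weighted, total))
```

The row for class 1 matches the Precision, Recall and F1-score formulas from the first section, while the row for class 0 applies the same formulas with class 0 treated as the positive class.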

