
# Classification Metrics Evaluation Notebook

This notebook demonstrates how to calculate the key evaluation metrics for binary classification models: Accuracy, Sensitivity (Recall), Specificity, Precision, and F-score.

All of these metrics are derived from the confusion matrix, which summarizes a model's performance by comparing the actual labels with the predicted labels.


## Confusion Matrix

For binary classification, a confusion matrix is a 2×2 table whose four cells count the possible outcomes of a prediction:

- **TP (True Positives)**: The number of positive instances correctly predicted as positive.
- **TN (True Negatives)**: The number of negative instances correctly predicted as negative.
- **FP (False Positives)**: The number of negative instances incorrectly predicted as positive.
- **FN (False Negatives)**: The number of positive instances incorrectly predicted as negative.

For this example, we will assume the following arbitrary values:
- TP = 50
- TN = 40
- FP = 10
- FN = 5
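In practice these four counts come from comparing actual and predicted labels rather than being chosen by hand. The following is a minimal sketch of that counting step; the helper name `confusion_counts` and the short label lists are hypothetical, introduced only for illustration, and are not part of the notebook's data.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count TP, TN, FP, FN for binary labels (hypothetical helper)."""
    tp = tn = fp = fn = 0
    for actual, predicted in zip(y_true, y_pred):
        if predicted == positive:
            if actual == positive:
                tp += 1  # positive instance predicted as positive
            else:
                fp += 1  # negative instance predicted as positive
        else:
            if actual == positive:
                fn += 1  # positive instance predicted as negative
            else:
                tn += 1  # negative instance predicted as negative
    return tp, tn, fp, fn


# Made-up labels for illustration only (1 = positive, 0 = negative)
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 1, 0, 1, 0]

tp, tn, fp, fn = confusion_counts(y_true, y_pred)
print(f"TP={tp}, TN={tn}, FP={fp}, FN={fn}")  # TP=3, TN=3, FP=1, FN=1
```

The `positive` parameter is an assumption about how labels are encoded; pass a different value if the positive class is not `1`.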


## Evaluation Metrics

Using the confusion matrix, the metrics are calculated as follows:

- **Accuracy**: Measures the proportion of all predictions that are correct.
  
  $$
  \text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}
  $$
  
- **Sensitivity (Recall)**: Measures the proportion of actual positives correctly identified.
  
  $$
  \text{Sensitivity} = \frac{TP}{TP + FN}
  $$
  
- **Specificity**: Measures the proportion of actual negatives correctly identified.
  
  $$
  \text{Specificity} = \frac{TN}{TN + FP}
  $$
  
- **Precision**: Measures the proportion of predicted positives that are actually positive.
  
  $$
  \text{Precision} = \frac{TP}{TP + FP}
  $$
  
- **F-score (F1-score)**: The harmonic mean of Precision and Sensitivity, providing a balance between the two.
  
  $$
  \text{F-score} = \frac{2 \times \text{Precision} \times \text{Sensitivity}}{\text{Precision} + \text{Sensitivity}}
  $$
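As a worked check, substituting the example values (TP = 50, TN = 40, FP = 10, FN = 5) into the formulas above gives the following. The last line uses an equivalent form of the F-score, obtained by substituting the definitions of Precision and Sensitivity into the harmonic mean and simplifying, which avoids rounding the intermediate values.

$$
\begin{aligned}
\text{Accuracy} &= \frac{50 + 40}{50 + 40 + 10 + 5} = \frac{90}{105} \approx 0.857 \\
\text{Sensitivity} &= \frac{50}{50 + 5} = \frac{50}{55} \approx 0.909 \\
\text{Specificity} &= \frac{40}{40 + 10} = \frac{40}{50} = 0.800 \\
\text{Precision} &= \frac{50}{50 + 10} = \frac{50}{60} \approx 0.833 \\
\text{F-score} &= \frac{2\,TP}{2\,TP + FP + FN} = \frac{100}{100 + 10 + 5} = \frac{100}{115} \approx 0.870
\end{aligned}
$$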


In [1]:
# Define confusion matrix values
TP = 50  # True Positives
TN = 40  # True Negatives
FP = 10  # False Positives
FN = 5   # False Negatives

# Calculate metrics
accuracy = (TP + TN) / (TP + TN + FP + FN)
sensitivity = TP / (TP + FN)
specificity = TN / (TN + FP)
precision = TP / (TP + FP)
f_score = (2 * precision * sensitivity) / (precision + sensitivity)

# Print the results
print("Confusion Matrix:")
print(f"TP (True Positives): {TP}")
print(f"TN (True Negatives): {TN}")
print(f"FP (False Positives): {FP}")
print(f"FN (False Negatives): {FN}\n")

print("Evaluation Metrics:")
print(f"Accuracy: {accuracy:.2f}")
print(f"Sensitivity (Recall): {sensitivity:.2f}")
print(f"Specificity: {specificity:.2f}")
print(f"Precision: {precision:.2f}")
print(f"F-score: {f_score:.2f}")


Confusion Matrix:
TP (True Positives): 50
TN (True Negatives): 40
FP (False Positives): 10
FN (False Negatives): 5

Evaluation Metrics:
Accuracy: 0.86
Sensitivity (Recall): 0.91
Specificity: 0.80
Precision: 0.83
F-score: 0.87


## Discussion of Results

The calculations above show how the values from the confusion matrix can be used to derive meaningful metrics:

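- **Accuracy** (0.86) is the overall proportion of correct predictions. It can be misleading when the classes are imbalanced, because a model that favors the majority class still scores well.
- **Sensitivity (Recall)** (0.91) matters most when missing a positive case is costly. Here the model misses 5 of the 55 actual positives.
- **Specificity** (0.80) matters most when false alarms are costly. Here 10 of the 50 actual negatives are flagged as positive.
- **Precision** (0.83) indicates how reliable a positive prediction is: 50 of the 60 positive predictions are correct.
- **F-score** (0.87) combines Precision and Sensitivity into a single number, which is useful when both false positives and false negatives matter. It does not take true negatives into account.

Which metric to prioritize depends on the relative cost of false positives and false negatives in the application.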
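To illustrate why these metrics should be read together rather than in isolation, the sketch below applies the same formulas to a hypothetical imbalanced dataset. The counts are invented for illustration and do not come from any real model.

```python
# Hypothetical counts for an imbalanced dataset: 10 positives, 990 negatives.
# The model finds only 2 of the 10 positives and raises 5 false alarms.
TP, FN = 2, 8
TN, FP = 985, 5

accuracy = (TP + TN) / (TP + TN + FP + FN)
sensitivity = TP / (TP + FN)
specificity = TN / (TN + FP)
precision = TP / (TP + FP)
f_score = (2 * precision * sensitivity) / (precision + sensitivity)

print(f"Accuracy: {accuracy:.2f}")                 # 0.99
print(f"Sensitivity (Recall): {sensitivity:.2f}")  # 0.20
print(f"Specificity: {specificity:.2f}")           # 0.99
print(f"Precision: {precision:.2f}")               # 0.29
print(f"F-score: {f_score:.2f}")                   # 0.24
```

Accuracy and Specificity are both close to 1 because negatives dominate the data, while Sensitivity, Precision, and F-score reveal that most positive cases are missed.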


## Conclusion

This notebook has provided a step-by-step approach to calculating the key metrics for evaluating classification models. By adjusting the confusion matrix values, you can simulate different scenarios and better understand how these metrics change with model performance.
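When experimenting with different values, some combinations make a denominator zero; for example, a model that never predicts the positive class has TP + FP = 0, so Precision is undefined. The sketch below wraps the formulas in a function that returns `None` in those cases. The names `safe_divide` and `classification_metrics` are hypothetical, and returning `None` is just one possible convention (reporting 0 is another).

```python
def safe_divide(numerator, denominator):
    """Return numerator / denominator, or None when the denominator is zero."""
    return numerator / denominator if denominator else None


def classification_metrics(tp, tn, fp, fn):
    """Compute the five metrics; undefined values are returned as None."""
    precision = safe_divide(tp, tp + fp)
    sensitivity = safe_divide(tp, tp + fn)
    if precision is None or sensitivity is None or precision + sensitivity == 0:
        f_score = None  # harmonic mean is undefined in these cases
    else:
        f_score = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "accuracy": safe_divide(tp + tn, tp + tn + fp + fn),
        "sensitivity": sensitivity,
        "specificity": safe_divide(tn, tn + fp),
        "precision": precision,
        "f_score": f_score,
    }


# Scenarios as (TP, TN, FP, FN); the second one is a made-up edge case.
scenarios = {
    "notebook example": (50, 40, 10, 5),
    "no positive predictions": (0, 50, 0, 55),
}
for name, counts in scenarios.items():
    print(name, classification_metrics(*counts))
```

In the second scenario, Precision and F-score come back as `None`, Sensitivity is 0.0, and Specificity is 1.0.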

Feel free to modify the code to evaluate your own models or datasets.
