# Recall

**Definition:**  
Recall, also known as sensitivity or true positive rate, is the ratio of correctly predicted positive observations to all actual positives. It answers the question: *Of all the actual positive instances, how many did we correctly predict?* Recall is a crucial metric for evaluating the performance of classification models, especially in contexts where missing a positive instance is costly.

**Formula:**

$$
\text{Recall} = \frac{\text{True Positives (TP)}}{\text{True Positives (TP)} + \text{False Negatives (FN)}}
$$

**Key Terms:**
- **True Positives (TP):** Cases where the model correctly predicts the positive class.
- **False Negatives (FN):** Cases where the model fails to predict the positive class (predicted negative but actually positive).
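
As a minimal sketch of the formula above, recall can be computed directly from the two counts. The helper name `recall_from_counts` and the choice to return 0.0 when there are no actual positives are assumptions made for illustration; they are not taken from any library.

```python
def recall_from_counts(tp: int, fn: int) -> float:
    """Recall = TP / (TP + FN), computed from raw counts."""
    actual_positives = tp + fn
    if actual_positives == 0:
        # Recall is undefined when there are no actual positives;
        # returning 0.0 here is an assumed convention.
        return 0.0
    return tp / actual_positives


# Hypothetical counts: 45 positives found, 5 missed.
print(recall_from_counts(45, 5))  # 0.9
```

True negatives and false positives do not appear in the function because they play no part in recall.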

**Importance of Recall:**
Recall is particularly important in applications where it is critical to identify all positive instances. For example:

- **Medical Diagnosis:** In a cancer screening test, a false negative could mean missing a cancer diagnosis, potentially leading to severe health consequences.
- **Fraud Detection:** In financial transactions, failing to detect a fraudulent transaction (false negative) could lead to significant financial loss.

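High recall means the model finds most of the positive instances, which is what matters in applications like these. Recall alone does not show how many negatives were wrongly flagged: a model that labels every instance as positive achieves a recall of 1.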

**Interpretation:**
- **High Recall:** A value close to 1 means the model identifies a large proportion of the actual positive instances.
- **Low Recall:** A value close to 0 means the model misses many actual positive instances, which is costly in critical applications such as medical diagnosis and fraud detection.

**Example:**
Consider a binary classification problem in a medical testing scenario where the task is to identify patients with a specific disease.

Suppose 100 patients actually have the disease, and the model produces the following results for them:
- True Positives (TP): 80 (patients who actually have the disease and were correctly identified)
- False Negatives (FN): 20 (patients who have the disease but were incorrectly identified as not having it)

Using the recall formula, we calculate recall as follows:

$$
\text{Recall} = \frac{TP}{TP + FN} = \frac{80}{80 + 20} = \frac{80}{100} = 0.8
$$

This indicates that the model correctly identifies 80% of actual positive cases.
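
The same arithmetic can be checked by counting over explicit labels. This sketch represents only the 100 patients who actually have the disease (1 means "has the disease" in `y_true` and "predicted positive" in `y_pred`); the list layout and variable names are assumptions for illustration.

```python
# The 100 patients who actually have the disease (label 1).
y_true = [1] * 100
# The model flags 80 of them and misses 20.
y_pred = [1] * 80 + [0] * 20

# Count true positives and false negatives by comparing label pairs.
tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

recall = tp / (tp + fn)
print(f"TP={tp}, FN={fn}, Recall={recall:.2f}")  # TP=80, FN=20, Recall=0.80
```

Healthy patients are left out because true negatives and false positives do not enter the recall formula.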

**Relation to Other Metrics:**
Recall is often discussed alongside other important metrics, such as precision and F1 score:
- **Precision:** Measures the proportion of predicted positives that are actually positive.

$$
\text{Precision} = \frac{TP}{TP + FP}
$$

- **F1 Score:** The harmonic mean of precision and recall, providing a single score that balances both metrics.

$$
\text{F1 Score} = 2 \times \frac{\text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}}
$$
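
To see how the three metrics relate, the sketch below computes them from raw counts. TP and FN are taken from the medical example above; the false-positive count of 40 is a hypothetical value added purely for illustration, since that example does not specify one.

```python
tp, fn = 80, 20  # from the worked medical example
fp = 40          # hypothetical: healthy patients wrongly flagged

precision = tp / (tp + fp)
recall = tp / (tp + fn)
f1 = 2 * precision * recall / (precision + recall)

print(f"Precision: {precision:.3f}")  # 0.667
print(f"Recall:    {recall:.3f}")     # 0.800
print(f"F1 score:  {f1:.3f}")         # 0.727
```

Because the harmonic mean is pulled toward the smaller of the two values, the F1 score here (about 0.727) falls below the arithmetic mean of precision and recall (about 0.733).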

**Conclusion:**
Recall is a valuable metric for evaluating classification models, especially in cases where failing to identify a positive instance can have serious consequences. Understanding recall helps practitioners make informed decisions about model performance and suitability for specific applications. By considering recall alongside precision and other metrics, stakeholders can obtain a more comprehensive view of model performance.


```python
import numpy as np
from sklearn.metrics import recall_score, confusion_matrix

# Ground-truth and predicted labels (1 = positive, 0 = negative)
y_true = np.array([1, 0, 1, 1, 0, 1, 0, 0, 1, 0])
y_pred = np.array([1, 0, 1, 0, 0, 1, 1, 0, 1, 0])

# Recall for the positive class (label 1)
recall = recall_score(y_true, y_pred)

print(f"Predicted Labels: {y_pred}")
print(f"True Labels: {y_true}")
print(f"Recall: {recall:.2f}")

# Rows are true labels, columns are predicted labels
conf_matrix = confusion_matrix(y_true, y_pred)

print("\nConfusion Matrix:")
print(conf_matrix)

TP = conf_matrix[1, 1]  # True Positives: actual 1, predicted 1
FN = conf_matrix[1, 0]  # False Negatives: actual 1, predicted 0

print(f"\nTrue Positives (TP): {TP}")
print(f"False Negatives (FN): {FN}")
```

```text
Predicted Labels: [1 0 1 0 0 1 1 0 1 0]
True Labels: [1 0 1 1 0 1 0 0 1 0]
Recall: 0.80

Confusion Matrix:
[[4 1]
 [1 4]]

True Positives (TP): 4
False Negatives (FN): 1
```
