# Sensitivity, Specificity, Precision, Recall

In this notebook, we'll introduce sensitivity, specificity, precision, and recall. These are key metrics for evaluating the performance of a classification model in machine learning, and all four are computed from the counts of true positives, true negatives, false positives, and false negatives.

## Sensitivity

Sensitivity, also known as True Positive Rate (TPR), measures the proportion of actual positive cases which are correctly identified. In other words, it's the ability of a model to find all the relevant cases within a dataset. The formula to calculate sensitivity is:

$$ \text{Sensitivity} = \frac{\text{True Positives}}{\text{True Positives} + \text{False Negatives}} $$
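The formula can be sketched as a small helper function. The function name and the counts below are hypothetical, chosen only to make the arithmetic easy to follow.

```python
def sensitivity(true_positives: int, false_negatives: int) -> float:
    """Return TP / (TP + FN), the share of actual positives that were caught."""
    actual_positives = true_positives + false_negatives
    if actual_positives == 0:
        # With no actual positives the ratio is undefined.
        raise ValueError("sensitivity is undefined without actual positives")
    return true_positives / actual_positives


# Hypothetical counts: 100 actual positives, of which 80 are found and 20 missed.
example_sensitivity = sensitivity(true_positives=80, false_negatives=20)
print(f"Sensitivity: {example_sensitivity:.2f}")  # prints "Sensitivity: 0.80"
```

Raising an error for the zero-denominator case is one possible design choice; returning `float("nan")` would be another.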

## Specificity

Specificity, also known as True Negative Rate (TNR), measures the proportion of actual negative cases which are correctly identified. It's the ability of the model to find all the negative cases. The formula to calculate specificity is:

$$ \text{Specificity} = \frac{\text{True Negatives}}{\text{True Negatives} + \text{False Positives}} $$
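As a minimal sketch, specificity can also be counted directly from label lists. The labels below are made up for illustration, with 1 assumed to mean positive and 0 negative.

```python
# Hypothetical labels: 1 = positive, 0 = negative.
y_true = [0, 0, 0, 0, 0, 0, 1, 1, 1, 1]
y_pred = [0, 0, 0, 0, 1, 1, 1, 1, 1, 0]

# Only the actual negatives (y_true == 0) enter the specificity calculation.
true_negatives = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
false_positives = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)

example_specificity = true_negatives / (true_negatives + false_positives)
print(f"TN={true_negatives}, FP={false_positives}")    # prints "TN=4, FP=2"
print(f"Specificity: {example_specificity:.2f}")       # prints "Specificity: 0.67"
```

Note that the four actual positives in `y_true` have no influence on the result.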

## Precision

Precision, also known as Positive Predictive Value (PPV), measures the proportion of predicted positive cases that are actually positive. It's the ability of the classifier not to label a negative sample as positive. The formula to calculate precision is:

$$ \text{Precision} = \frac{\text{True Positives}}{\text{True Positives} + \text{False Positives}} $$
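To illustrate how false positives drive precision down, here is a small sketch with hypothetical counts for a classifier that flags positives very liberally. The helper name and the numbers are assumptions made for the example.

```python
def precision(true_positives: int, false_positives: int) -> float:
    """Return TP / (TP + FP), the share of positive predictions that are correct."""
    predicted_positives = true_positives + false_positives
    if predicted_positives == 0:
        # With no positive predictions the ratio is undefined.
        raise ValueError("precision is undefined without positive predictions")
    return true_positives / predicted_positives


# Hypothetical counts: 10 actual positives, 9 of them found (1 missed),
# but 21 negatives were also flagged as positive.
tp, fn, fp = 9, 1, 21

example_precision = precision(tp, fp)
share_of_positives_found = tp / (tp + fn)

print(f"Precision: {example_precision:.2f}")                       # prints "Precision: 0.30"
print(f"Share of positives found: {share_of_positives_found:.2f}")  # prints "Share of positives found: 0.90"
```

In this made-up case the model finds 90% of the actual positives, yet only 30% of its positive predictions are correct.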

## Recall

Recall, also known as Sensitivity, Hit Rate, or True Positive Rate (TPR), measures the proportion of actual positive cases which are correctly identified. Recall and sensitivity are two names for the same quantity, so the formula is identical:

$$ \text{Recall} = \text{Sensitivity} = \frac{\text{True Positives}}{\text{True Positives} + \text{False Negatives}} $$
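As a quick check of this equivalence, the sketch below compares a hand-computed sensitivity with scikit-learn's `recall_score` on a few made-up labels. It also shows that, for binary labels, passing `pos_label=0` makes `recall_score` report the recall of the negative class, which matches the specificity formula.

```python
from sklearn.metrics import confusion_matrix, recall_score

# Hypothetical labels: 1 = positive, 0 = negative.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1]

# For binary 0/1 labels, ravel() yields the counts as tn, fp, fn, tp.
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
manual_sensitivity = tp / (tp + fn)
manual_specificity = tn / (tn + fp)

# recall_score treats class 1 as positive by default.
recall_positive = recall_score(y_true, y_pred)
# Treating class 0 as the "positive" class gives the recall of the negatives.
recall_negative = recall_score(y_true, y_pred, pos_label=0)

print(f"Sensitivity: {manual_sensitivity:.2f}, recall (class 1): {recall_positive:.2f}")
print(f"Specificity: {manual_specificity:.2f}, recall (class 0): {recall_negative:.2f}")
```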

Now, let's take a look at these concepts in a practical way by applying them to a binary classification problem.

In [2]:
# Import necessary libraries
from sklearn.metrics import confusion_matrix, precision_score, recall_score
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression

# Create a simple binary classification dataset
X, y = make_classification(n_samples=1000, n_features=20, n_informative=2, n_redundant=10, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Train a logistic regression model
clf = LogisticRegression(random_state=42).fit(X_train, y_train)

# Make predictions
y_pred = clf.predict(X_test)

# Calculate confusion matrix; for binary 0/1 labels, ravel() returns the counts in the order tn, fp, fn, tp
tn, fp, fn, tp = confusion_matrix(y_test, y_pred).ravel()

# Calculate sensitivity and specificity from the counts, and precision and recall with scikit-learn
# (recall should match the hand-computed sensitivity)
sensitivity = tp / (tp + fn)
specificity = tn / (tn + fp)
precision = precision_score(y_test, y_pred)
recall = recall_score(y_test, y_pred)

# Print the results
print(f'Sensitivity: {sensitivity:.2f}')
print(f'Specificity: {specificity:.2f}')
print(f'Precision: {precision:.2f}')
print(f'Recall: {recall:.2f}')

Sensitivity: 0.85
Specificity: 0.89
Precision: 0.86
Recall: 0.85
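The unpacking `tn, fp, fn, tp = confusion_matrix(...).ravel()` used above depends on how scikit-learn lays out the matrix: rows are actual classes, columns are predicted classes, both in sorted label order. The following sketch makes that layout visible on a few hypothetical labels; passing `labels=[0, 1]` is an optional extra that pins the order explicitly.

```python
from sklearn.metrics import confusion_matrix

# Hypothetical labels: 1 = positive, 0 = negative.
y_true = [0, 0, 0, 1, 1, 1, 1, 1]
y_pred = [0, 0, 1, 1, 1, 1, 1, 0]

# Rows are actual classes, columns are predicted classes:
# [[TN, FP],
#  [FN, TP]]
cm = confusion_matrix(y_true, y_pred, labels=[0, 1])
print(cm)

# ravel() flattens the matrix row by row, which gives tn, fp, fn, tp.
tn, fp, fn, tp = cm.ravel()
print(f"TN={tn}, FP={fp}, FN={fn}, TP={tp}")  # prints "TN=2, FP=1, FN=1, TP=4"
```

If the positive class is not the larger label, or the labels are strings, the unpacking order changes accordingly, so it is worth checking the layout before computing metrics by hand.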
