## Key Concepts of ROC and AUC in Logistic Regression

1. **Logistic Regression for Classification:**
   - A logistic regression model estimates the probability that a sample belongs to the positive class (e.g., 'obese' vs 'not obese').
   - A threshold, often set at 0.5, turns these probabilities into class labels: samples at or above the threshold are classified as positive, the rest as negative.

2. **Creating Confusion Matrices for Different Thresholds:**
   - Changing the classification threshold changes the predictions, and therefore the confusion matrix.
   - From each matrix, we calculate Sensitivity (True Positive Rate, TP / (TP + FN)) and Specificity (TN / (TN + FP), which equals 1 - False Positive Rate).

3. **ROC Curves:**
   - The Receiver Operating Characteristic (ROC) curve plots the True Positive Rate (Sensitivity) against the False Positive Rate (1 - Specificity) for various thresholds.
   - Each point on the ROC curve represents a sensitivity/specificity trade-off for a particular threshold.

4. **Area Under the Curve (AUC):**
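   - The Area Under the Curve (AUC) summarizes the entire ROC curve in a single number that does not depend on any one threshold. It equals the probability that the model scores a randomly chosen positive sample higher than a randomly chosen negative one.
   - A higher AUC indicates better discrimination: 1.0 corresponds to a perfect ranking of positives above negatives, and 0.5 to a model with no discriminative ability (no better than random guessing).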
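The thresholding step in items 1 and 2 can be sketched on a handful of made-up predictions. The labels and scores below are hypothetical values chosen for illustration, and `rates_at` is a helper name introduced here, not part of scikit-learn.

```python
import numpy as np
from sklearn.metrics import confusion_matrix

# Hypothetical labels (1 = positive class) and predicted probabilities.
y_true = np.array([0, 0, 0, 0, 1, 0, 1, 1, 1, 1])
y_score = np.array([0.05, 0.10, 0.20, 0.40, 0.45, 0.60, 0.65, 0.80, 0.90, 0.95])

def rates_at(threshold):
    """Return (sensitivity, specificity) when scores >= threshold count as positive."""
    y_pred = (y_score >= threshold).astype(int)
    # With labels=[0, 1], ravel() yields the cells in the order tn, fp, fn, tp.
    tn, fp, fn, tp = confusion_matrix(y_true, y_pred, labels=[0, 1]).ravel()
    return tp / (tp + fn), tn / (tn + fp)

# One confusion matrix, and so one (sensitivity, specificity) pair, per threshold.
rates = {t: rates_at(t) for t in (0.3, 0.5, 0.7)}
for t, (sens, spec) in rates.items():
    print(f"threshold={t:.1f}  sensitivity={sens:.2f}  "
          f"specificity={spec:.2f}  FPR={1 - spec:.2f}")
```

On this toy data, raising the threshold from 0.3 to 0.7 lowers sensitivity while raising specificity. Each (FPR, sensitivity) pair is one point on the ROC curve.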

Moving the threshold traces out the ROC curve: lowering it catches more true positives but also labels more negatives as positives, while raising it does the reverse. The AUC itself does not change with the threshold, because it summarizes the whole curve. This view is most useful in fields where false positives and false negatives carry different costs, because it shows what each candidate threshold would trade away.
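One way to make the cost argument concrete is to weight the two error types and pick the ROC threshold with the lowest total cost. This is a minimal sketch under stated assumptions: the labels, scores, and cost values are hypothetical, and `cheapest_threshold` is a helper introduced here for illustration. It is one possible selection rule, not a standard procedure.

```python
import numpy as np
from sklearn.metrics import roc_curve

# Hypothetical labels and predicted probabilities, for illustration only.
y_true = np.array([0, 0, 0, 0, 1, 0, 1, 1, 1, 1])
y_score = np.array([0.05, 0.10, 0.20, 0.40, 0.45, 0.60, 0.65, 0.80, 0.90, 0.95])

fpr, tpr, thresholds = roc_curve(y_true, y_score)

n_pos = int(y_true.sum())
n_neg = len(y_true) - n_pos

def cheapest_threshold(cost_fp, cost_fn):
    """Return the roc_curve threshold with the lowest total misclassification cost."""
    false_positives = fpr * n_neg        # FP count at each threshold
    false_negatives = (1 - tpr) * n_pos  # FN count at each threshold
    total_cost = cost_fp * false_positives + cost_fn * false_negatives
    return thresholds[np.argmin(total_cost)]

# Assumed cost ratios: missing a positive is 5x worse, then the reverse.
t_fn_costly = cheapest_threshold(cost_fp=1, cost_fn=5)
t_fp_costly = cheapest_threshold(cost_fp=5, cost_fn=1)
print(f"false negatives costly -> threshold {t_fn_costly:.2f}")
print(f"false positives costly -> threshold {t_fp_costly:.2f}")
```

When false negatives are the expensive error, the rule settles on a lower threshold than when false positives are. This matches the trade-off described above.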


In [None]:
import numpy as np
import matplotlib.pyplot as plt
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_curve, auc

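# Create a synthetic binary dataset. With only 2 features, n_informative and
# n_redundant must be set explicitly: the defaults (2 informative + 2 redundant)
# exceed n_features, and make_classification raises a ValueError.
X, y = make_classification(n_samples=1000, n_features=2, n_informative=2,
                           n_redundant=0, n_classes=2, random_state=42)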

# Logistic Regression Model
model = LogisticRegression()
model.fit(X, y)

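# Predicted probability of the positive class (column 1 of predict_proba).
# These are scored on the training data, so the resulting AUC is optimistic.
y_probs = model.predict_proba(X)[:, 1]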

# Calculate ROC Curve and AUC
fpr, tpr, thresholds = roc_curve(y, y_probs)
roc_auc = auc(fpr, tpr)

# Plot ROC Curve
plt.figure()
plt.plot(fpr, tpr, color='darkorange', lw=2, label=f'ROC curve (area = {roc_auc:.2f})')
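# Diagonal reference line: a model with no discriminative ability (AUC = 0.5)
plt.plot([0, 1], [0, 1], color='navy', lw=2, linestyle='--', label='Chance (area = 0.50)')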
plt.xlim([0.0, 1.0])
plt.ylim([0.0, 1.05])
plt.xlabel('False Positive Rate')
plt.ylabel('True Positive Rate')
plt.title('Receiver Operating Characteristic')
plt.legend(loc="lower right")
plt.show()
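The cell above scores the model on the same samples it was trained on, which tends to flatter the AUC. A hedged variant holds out a test set and compares the two numbers. The 70/30 split, the `stratify` choice, and `n_clusters_per_class=1` are assumptions made for this sketch, not settings taken from the original example.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

# Synthetic data similar to the cell above; n_clusters_per_class=1 is an
# assumption that keeps the two classes roughly linearly separable.
X, y = make_classification(n_samples=1000, n_features=2, n_informative=2,
                           n_redundant=0, n_clusters_per_class=1,
                           n_classes=2, random_state=42)

# Hold out 30% of the samples, preserving the class balance in both parts.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=42)

model = LogisticRegression()
model.fit(X_train, y_train)

# roc_auc_score computes the AUC directly from labels and scores.
train_auc = roc_auc_score(y_train, model.predict_proba(X_train)[:, 1])
test_auc = roc_auc_score(y_test, model.predict_proba(X_test)[:, 1])
print(f"train AUC = {train_auc:.3f}, test AUC = {test_auc:.3f}")
```

The test AUC is the better estimate of how the model would rank unseen samples. On easy synthetic data like this, the gap between the two values may be small.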
