# Confusion Matrix and ROC-AUC

In classification problems, two widely used evaluation tools are the **confusion matrix** and the **ROC curve**, together with its summary statistic, the **AUC**.

---
## 1. Confusion Matrix
- A table that counts predictions by actual class and predicted class, so it shows not only how many predictions were wrong but which kind of error each one was.
- Structure (for binary classification):

|               | Predicted Positive | Predicted Negative |
|---------------|-------------------|-------------------|
| **Actual Positive** | True Positive (TP)   | False Negative (FN)  |
| **Actual Negative** | False Positive (FP)  | True Negative (TN)   |

- Its four counts are the basis for Precision = TP / (TP + FP), Recall = TP / (TP + FN), and F1-Score (the harmonic mean of Precision and Recall).
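
The metrics named above can be computed directly from the cells of the table. The sketch below is a minimal illustration with made-up counts; the helper name `prf_from_counts` and the numbers are assumptions for the example, not results from any dataset.

```python
def prf_from_counts(tp, fp, fn):
    """Return (precision, recall, f1) from confusion-matrix counts.

    Each ratio falls back to 0.0 when its denominator is zero.
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical counts, chosen only for illustration
tp, fp, fn, tn = 40, 10, 5, 45

precision, recall, f1 = prf_from_counts(tp, fp, fn)
accuracy = (tp + tn) / (tp + fp + fn + tn)

print(f"Precision: {precision:.3f}")
print(f"Recall:    {recall:.3f}")
print(f"F1-Score:  {f1:.3f}")
print(f"Accuracy:  {accuracy:.3f}")
```

Note that TN does not appear in Precision, Recall, or F1; it only enters through accuracy and the False Positive Rate.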

---
## 2. ROC Curve (Receiver Operating Characteristic)
- Plots **True Positive Rate (TPR)** vs **False Positive Rate (FPR)** at different thresholds.
- TPR = Recall = TP / (TP + FN).
- FPR = FP / (FP + TN).
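
To make "at different thresholds" concrete, the following sketch builds the ROC points by hand for four toy examples. It is a simplified illustration in plain Python, not how any particular library implements it; the function name `roc_points` and the toy data are assumptions, and the code assumes both classes are present.

```python
def roc_points(y_true, scores):
    """Return [(threshold, fpr, tpr), ...], one entry per distinct score.

    A sample is predicted positive when its score is >= threshold.
    Assumes y_true contains at least one 0 and at least one 1.
    """
    n_pos = sum(y_true)
    n_neg = len(y_true) - n_pos
    points = []
    for t in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(y_true, scores) if s >= t and y == 1)
        fp = sum(1 for y, s in zip(y_true, scores) if s >= t and y == 0)
        points.append((t, fp / n_neg, tp / n_pos))
    return points


# Toy labels and scores, chosen only for illustration
y_true = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]

points = roc_points(y_true, scores)
for threshold, fpr, tpr in points:
    print(f"threshold={threshold:.2f}  FPR={fpr:.2f}  TPR={tpr:.2f}")
```

Lowering the threshold can only add predicted positives, so both FPR and TPR are non-decreasing as the threshold falls; that is why the curve runs from the bottom-left toward the top-right.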

## 3. AUC (Area Under Curve)
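- Summarizes the entire ROC curve as a single number: the area under it.
- Ranges from 0 to 1. A value of 0.5 corresponds to random guessing and 1.0 to a perfect classifier; a value below 0.5 means the model ranks examples worse than random, so inverting its scores would improve it.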
- Higher AUC = better classifier.
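
One way to read the AUC is as the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative one, with ties counted as half. The sketch below computes it that way on toy data; the function name `auc_pairwise` and the data are assumptions for illustration, and the quadratic loop is meant for clarity rather than efficiency.

```python
def auc_pairwise(y_true, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    Assumes y_true contains at least one 0 and at least one 1.
    """
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy labels and scores, chosen only for illustration
y_true = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]

auc_value = auc_pairwise(y_true, scores)
# Reversing the ranking turns every win into a loss, giving 1 - AUC
auc_inverted = auc_pairwise(y_true, [1 - s for s in scores])

print(f"AUC:          {auc_value:.2f}")
print(f"Inverted AUC: {auc_inverted:.2f}")
```

Because the AUC depends only on how the scores are ordered, it does not change under any strictly increasing rescaling of the scores.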

---
Together, these tools provide deeper insights into classification performance than accuracy alone.

In [None]:
# Example: Confusion Matrix and ROC-AUC
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix, roc_curve, roc_auc_score
import matplotlib.pyplot as plt

# Load dataset
X, y = load_breast_cancer(return_X_y=True)

# Train-test split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Train model
model = LogisticRegression(max_iter=500)
model.fit(X_train, y_train)

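# Predictions
# Note: in this dataset label 1 is "benign", so "positive" here means benign.
y_pred = model.predict(X_test)              # hard class labels (0.5 probability cutoff)
y_prob = model.predict_proba(X_test)[:, 1]  # predicted probability of class 1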

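# Confusion Matrix
# scikit-learn puts actual classes on rows and predicted classes on columns,
# with labels in sorted order, so the printed layout is [[TN, FP], [FN, TP]].
cm = confusion_matrix(y_test, y_pred)
print("Confusion Matrix:\n", cm)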

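# ROC Curve: FPR and TPR at each candidate threshold, computed from the
# predicted probabilities rather than the hard labels
fpr, tpr, thresholds = roc_curve(y_test, y_prob)
auc = roc_auc_score(y_test, y_prob)

plt.plot(fpr, tpr, label=f'ROC Curve (AUC = {auc:.2f})')
plt.plot([0, 1], [0, 1], 'k--', label='Random guessing (AUC = 0.50)')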
plt.xlabel('False Positive Rate')
plt.ylabel('True Positive Rate')
plt.title('ROC Curve')
plt.legend()
plt.show()
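
The matrix printed by `confusion_matrix` is laid out differently from the table in section 1: with labels 0 and 1 it is ordered `[[TN, FP], [FN, TP]]`. The sketch below shows one way to unpack the four counts and, if preferred, to reorder the matrix to match the table. The tiny label lists are made up for illustration.

```python
from sklearn.metrics import confusion_matrix

# Hypothetical labels, chosen only for illustration
y_true_demo = [0, 0, 0, 1, 1, 1, 1]
y_pred_demo = [0, 1, 0, 1, 1, 0, 1]

# Default layout for binary labels 0/1: [[TN, FP], [FN, TP]]
cm_demo = confusion_matrix(y_true_demo, y_pred_demo)
tn, fp, fn, tp = cm_demo.ravel()
print(cm_demo)
print(f"TN={tn}, FP={fp}, FN={fn}, TP={tp}")

# Passing labels=[1, 0] puts the positive class first,
# which gives the [[TP, FN], [FP, TN]] layout used in the table
cm_table_order = confusion_matrix(y_true_demo, y_pred_demo, labels=[1, 0])
print(cm_table_order)
```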