# 📅 Day 13: Logistic Regression & ROC Curve

## 🎯 Objective
Learn how logistic regression works for classification problems and how to evaluate it using the ROC Curve and AUC score.

## 🤔 What is Logistic Regression?
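- A statistical model most often used for **binary classification**
- Estimates the **probability** that a sample belongs to class 1 by passing a linear combination of the features through the **sigmoid function**
- The output lies strictly between 0 and 1
- With the default threshold, if `P(class 1) > 0.5`, predict class 1; otherwise predict class 0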
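
The decision rule in the last bullet can be sketched in a few lines. This is only an illustration: the helper `predict_labels` and the probability values are hypothetical, not part of scikit-learn or the dataset used below.

```python
def predict_labels(probabilities, threshold=0.5):
    """Map P(class 1) values to 0/1 labels with the rule p > threshold."""
    return [1 if p > threshold else 0 for p in probabilities]

# Hypothetical probabilities, made up for illustration only.
probs = [0.10, 0.45, 0.50, 0.62, 0.97]

default_labels = predict_labels(probs)                # threshold 0.5
strict_labels = predict_labels(probs, threshold=0.9)  # stricter cut-off

print(default_labels)  # [0, 0, 0, 1, 1]
print(strict_labels)   # [0, 0, 0, 0, 1]
```

Raising the threshold makes the model more reluctant to predict class 1, which is the trade-off the ROC curve in Step 3 visualizes.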

## 📈 Sigmoid Function
Maps the linear score $z = w^\top x + b$ (weights $w$, features $x$, bias $b$) to a probability:
$$ \sigma(z) = \frac{1}{1 + e^{-z}} $$
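
To make the formula concrete, here is a minimal sketch that evaluates it by hand. The weights, bias, and feature vector are hypothetical values chosen for illustration. A production implementation would also guard against overflow for very negative `z`, which this sketch does not.

```python
import math

def sigmoid(z):
    """Logistic function: maps any real z into the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical weights, bias, and feature vector (illustration only).
weights = [0.8, -0.5]
bias = 0.1
x = [2.0, 1.0]

# Linear score: dot product of weights and features, plus the bias.
z = sum(w * xi for w, xi in zip(weights, x)) + bias
p = sigmoid(z)

print(f"z = {z:.2f}, P(class 1) = {p:.3f}")
print(sigmoid(0.0))  # 0.5: a score of zero sits exactly on the decision boundary
```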

## 📦 Step 1 – Load & Prepare Data

In [None]:
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
import pandas as pd

data = load_breast_cancer()
df = pd.DataFrame(data.data, columns=data.feature_names)
df['target'] = data.target

X = df.drop('target', axis=1)
y = df['target']
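# Stratify so both splits keep the same class balance as the full dataset
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, stratify=y, random_state=42)

# Fit the scaler on the training set only, then reuse its statistics on the
# test set so that no test information leaks into training
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)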

## 🤖 Step 2 – Train Logistic Regression Model

In [None]:
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report, accuracy_score

log_model = LogisticRegression(random_state=42)
log_model.fit(X_train_scaled, y_train)
y_pred = log_model.predict(X_test_scaled)

print('Accuracy:', accuracy_score(y_test, y_pred))
print(classification_report(y_test, y_pred))
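
The numbers in the report above come from simple counts of correct and incorrect predictions. As a rough sketch, the helper below recomputes accuracy, precision, recall, and F1 for class 1 by hand. The function `confusion_counts` and the toy labels are hypothetical, used here only so the arithmetic is easy to follow.

```python
def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives, treating 1 as the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp, fp, fn, tn

# Toy labels and predictions, made up for illustration only.
y_true_toy = [1, 1, 1, 1, 0, 0, 0, 1]
y_pred_toy = [1, 1, 0, 1, 0, 1, 0, 1]

tp, fp, fn, tn = confusion_counts(y_true_toy, y_pred_toy)

accuracy = (tp + tn) / len(y_true_toy)
precision = tp / (tp + fp)  # of the predicted positives, how many are right
recall = tp / (tp + fn)     # of the actual positives, how many were found
f1 = 2 * precision * recall / (precision + recall)

print(f"TP={tp} FP={fp} FN={fn} TN={tn}")
print(f"accuracy={accuracy:.2f} precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```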

## 🔍 Step 3 – Evaluate with ROC Curve & AUC

In [None]:
from sklearn.metrics import roc_curve, roc_auc_score
import matplotlib.pyplot as plt

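# Column 1 holds P(class 1); in this dataset class 1 is 'benign'
y_proba = log_model.predict_proba(X_test_scaled)[:, 1]
# roc_curve sweeps the decision threshold and records one (FPR, TPR) pair per value
fpr, tpr, thresholds = roc_curve(y_test, y_proba)
auc_score = roc_auc_score(y_test, y_proba)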

plt.figure(figsize=(8, 6))
plt.plot(fpr, tpr, label=f'ROC Curve (AUC = {auc_score:.2f})')
plt.plot([0, 1], [0, 1], linestyle='--', color='gray', label='Chance (AUC = 0.50)')
plt.xlabel('False Positive Rate')
plt.ylabel('True Positive Rate')
plt.title('ROC Curve')
plt.legend()
plt.grid(True)
plt.show()
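
Each point on the curve above is the (FPR, TPR) pair obtained at one threshold, and AUC can be read as the probability that a randomly chosen positive sample scores higher than a randomly chosen negative one. The sketch below illustrates both ideas on a tiny toy example. The helpers `roc_point` and `pairwise_auc` and the toy data are hypothetical, and the pairwise loop is meant for intuition rather than as a replacement for `roc_auc_score`.

```python
def roc_point(y_true, scores, threshold):
    """Return (FPR, TPR) when predicting class 1 for scores >= threshold."""
    tp = sum(1 for t, s in zip(y_true, scores) if t == 1 and s >= threshold)
    fp = sum(1 for t, s in zip(y_true, scores) if t == 0 and s >= threshold)
    n_pos = sum(y_true)
    n_neg = len(y_true) - n_pos
    return fp / n_neg, tp / n_pos

def pairwise_auc(y_true, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(
        (1.0 if p > n else 0.5 if p == n else 0.0)
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))

# Toy labels and scores, made up for illustration only.
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]

print(roc_point(toy_labels, toy_scores, threshold=0.8))   # (0.0, 0.5)
print(roc_point(toy_labels, toy_scores, threshold=0.35))  # (0.5, 1.0)
print(pairwise_auc(toy_labels, toy_scores))               # 0.75
```

Lowering the threshold moves the point up and to the right: more true positives are caught, at the cost of more false positives.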

## ✅ Summary
- Logistic Regression is a **linear model for classification**.
- It predicts probabilities using the **sigmoid function**.
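- Evaluate classification quality across all thresholds using the **ROC Curve** and the **AUC** score.
- An AUC of **0.5** corresponds to random guessing; the closer AUC is to **1.0**, the better the model ranks positive samples above negative ones.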