# 📅 Day 8: Classification – Logistic Regression

## 🎯 Objective
Understand and implement logistic regression for binary classification problems.

## 🧠 What is Classification?
Classification is a supervised learning task in which the model predicts a discrete category or class (e.g., spam or not spam) rather than a continuous value.

## ⚖️ Logistic Regression
- A statistical model used for binary classification
- Outputs the probability (between 0 and 1) that a sample belongs to the positive class
- Applies the logistic (sigmoid) function to a linear combination of the features to map it into that range

## 🔢 Sigmoid Function
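\[ \sigma(z) = \frac{1}{1 + e^{-z}} \]

Here \(z = w^\top x + b\) is the linear combination of the input features \(x\) with learned weights \(w\) and bias \(b\), and \(\sigma(z)\) is read as the predicted probability of the positive class.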
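The formula above can be sketched in a few lines of NumPy. The function name `sigmoid` and the sample inputs are illustrative choices, not part of the original notebook.

```python
import numpy as np

def sigmoid(z):
    """Logistic function: maps any real number into the open interval (0, 1)."""
    z = np.asarray(z, dtype=float)
    return 1.0 / (1.0 + np.exp(-z))

# Illustrative inputs: a large negative score, zero, and a large positive score
z_values = np.array([-5.0, 0.0, 5.0])
probs = sigmoid(z_values)
print(probs)
```

A score of zero maps to a probability of exactly 0.5, and scores that are equal in magnitude but opposite in sign map to probabilities that sum to 1.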

## 📘 Dataset: Breast Cancer Dataset (from sklearn)
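This is a binary classification problem (malignant vs benign) with 569 samples and 30 numeric features. In scikit-learn's encoding, `0` means malignant and `1` means benign.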

In [None]:
from sklearn.datasets import load_breast_cancer
import pandas as pd

data = load_breast_cancer()
df = pd.DataFrame(data.data, columns=data.feature_names)
df['target'] = data.target
df.head()
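Before modelling, it can help to check how many samples fall in each class. The following is a minimal sketch. The variable names `label_names` and `class_counts` are illustrative and not part of the original notebook.

```python
from sklearn.datasets import load_breast_cancer
import pandas as pd

data = load_breast_cancer()
target = pd.Series(data.target)

# target_names is ordered by label value: index 0 -> 'malignant', index 1 -> 'benign'
label_names = {i: str(name) for i, name in enumerate(data.target_names)}

# Count the samples per class using readable names instead of 0/1
class_counts = target.map(label_names).value_counts()
print(class_counts)
```

The classes are moderately imbalanced, which is worth keeping in mind when reading accuracy later on.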

## ✂️ Step 1: Data Splitting

In [None]:
from sklearn.model_selection import train_test_split

X = df.drop('target', axis=1)
y = df['target']

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
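As a quick sanity check on the split, the sketch below prints the resulting shapes and compares the share of class `1` in each part. It also shows an optional variant that passes `stratify=y`, which the original notebook does not use. It is included here only as an assumption about what one might try when class proportions should match across splits.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split

X, y = load_breast_cancer(return_X_y=True)

# Same settings as the notebook: 80% train, 20% test, fixed seed
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)
print("Train shape:", X_train.shape, "Test shape:", X_test.shape)
print("Share of class 1 (train, test):", y_train.mean(), y_test.mean())

# Optional variant (not in the original): keep class proportions equal in both parts
Xs_train, Xs_test, ys_train, ys_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)
print("Stratified share of class 1 (train, test):", ys_train.mean(), ys_test.mean())
```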


## 🔄 Step 2: Feature Scaling

In [None]:
from sklearn.preprocessing import StandardScaler

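scaler = StandardScaler()
# Fit on the training set only so no test-set statistics leak into training
X_train_scaled = scaler.fit_transform(X_train)
# Reuse the training mean and standard deviation to transform the test set
X_test_scaled = scaler.transform(X_test)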
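To see what the scaler does, the sketch below repeats the split and scaling, then inspects the per-feature mean and standard deviation. The names `train_means`, `train_stds`, and `test_means` are illustrative additions.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)  # learns mean/std from train only
X_test_scaled = scaler.transform(X_test)        # applies the train statistics

# Per-feature statistics after scaling
train_means = X_train_scaled.mean(axis=0)
train_stds = X_train_scaled.std(axis=0)
test_means = X_test_scaled.mean(axis=0)

print("Largest |train mean|:", np.abs(train_means).max())
print("Train std range:", train_stds.min(), train_stds.max())
print("Largest |test mean|:", np.abs(test_means).max())
```

The training features end up with mean 0 and standard deviation 1. The test features are only approximately centred, which is expected because they are scaled with statistics learned from the training set.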


## 🤖 Step 3: Train Logistic Regression Model

In [None]:
from sklearn.linear_model import LogisticRegression

model = LogisticRegression()
model.fit(X_train_scaled, y_train)
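Since logistic regression outputs probabilities, it can be useful to look at them directly rather than only at the predicted labels. The sketch below retrains the same model and, as an illustration, recomputes the class-1 probability by applying the sigmoid to `decision_function`. The 0.5 threshold and the variable names are assumptions made for this example.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

model = LogisticRegression()
model.fit(X_train_scaled, y_train)

# predict_proba returns one column per class, ordered like model.classes_
proba = model.predict_proba(X_test_scaled)
p_benign = proba[:, 1]  # probability of class 1 (benign)

# decision_function returns the linear score z = w.x + b
z = model.decision_function(X_test_scaled)
p_from_sigmoid = 1.0 / (1.0 + np.exp(-z))

# Thresholding the probability at 0.5 gives the predicted labels
labels_at_half = (p_benign >= 0.5).astype(int)
print(p_benign[:5])
print(labels_at_half[:5])
```

Raising or lowering the threshold trades one kind of error for the other, which matters when a missed malignant case is costlier than a false alarm.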


## 📊 Step 4: Evaluate the Model

In [None]:
from sklearn.metrics import accuracy_score, confusion_matrix, classification_report

y_pred = model.predict(X_test_scaled)
print("Accuracy:", accuracy_score(y_test, y_pred))
print("Confusion Matrix:\n", confusion_matrix(y_test, y_pred))
print("Classification Report:\n", classification_report(y_test, y_pred))
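The numbers in the classification report can be derived by hand from the confusion matrix. The sketch below does that for precision and recall, under the assumption that label `1` (benign) is treated as the positive class, which is scikit-learn's default for these metrics. The variable names are illustrative.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix, precision_score, recall_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)
model = LogisticRegression().fit(X_train_scaled, y_train)
y_pred = model.predict(X_test_scaled)

# Rows are true labels, columns are predicted labels, both ordered [0, 1]
tn, fp, fn, tp = confusion_matrix(y_test, y_pred).ravel()

precision_benign = tp / (tp + fp)   # of predicted benign, how many are benign
recall_benign = tp / (tp + fn)      # of true benign, how many were found
recall_malignant = tn / (tn + fp)   # of true malignant, how many were found

print("Precision (benign):", precision_benign)
print("Recall (benign):", recall_benign)
print("Recall (malignant):", recall_malignant)
```

For a medical screening task, recall on the malignant class is often the figure to watch, since a false negative there means a missed cancer.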


## ✅ Summary
- Logistic Regression is a foundational classification algorithm.
- It suits binary targets because it outputs a class probability directly.
- Easy to interpret and use as a baseline model.
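
One way to act on the interpretability point is to look at the learned coefficients. The sketch below is a minimal example. It assumes the features were standardized as in Step 2, so that coefficient magnitudes are roughly comparable across features. The name `coefs` and the choice to show five features are illustrative.

```python
import pandas as pd
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

data = load_breast_cancer()
X_train, X_test, y_train, y_test = train_test_split(
    data.data, data.target, test_size=0.2, random_state=42
)
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)

model = LogisticRegression().fit(X_train_scaled, y_train)

# One coefficient per feature; coef_ has shape (1, n_features) for binary problems
coefs = pd.Series(model.coef_[0], index=data.feature_names)

# Order features by absolute coefficient size, largest first
order = coefs.abs().sort_values(ascending=False).index
coefs = coefs.reindex(order)
print(coefs.head(5))
```

A positive coefficient pushes the prediction toward class `1` (benign) and a negative one toward class `0` (malignant). Correlated features can share or split weight, so the ranking is a rough guide rather than a causal statement.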