# Logistic Regression

**Logistic Regression** is a statistical and machine learning technique used for **binary classification problems**. It predicts the probability that an input belongs to a particular class (e.g., 0 or 1).

---

## 🔸 Key Idea

- Unlike linear regression, which outputs an unbounded real value, logistic regression outputs a **probability** (a value between 0 and 1).
- It uses the **sigmoid (logistic)** function to "squash" a linear combination of the features into that range.

---

## 🔹 Sigmoid Function

The sigmoid function is defined as:

$$\sigma(z) = \frac{1}{1 + e^{-z}}$$

Where:

- $z = w_0 + w_1x_1 + w_2x_2 + \dots + w_nx_n$, where $w_0$ is the intercept (written as the bias $b$ in the hypothesis function)
- $\sigma(z)$ is the predicted probability that the input belongs to class 1.
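
As a quick numerical check of the definition above, the following minimal sketch evaluates the sigmoid at a few points. The sample values in `z` are arbitrary and chosen only for illustration.

```python
import numpy as np

def sigmoid(z):
    """Logistic function: maps any real z into the open interval (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

# Arbitrary sample points (illustrative only)
z = np.array([-5.0, -1.0, 0.0, 1.0, 5.0])
p = sigmoid(z)
print(p)

# sigmoid(0) is 0.5, which is why 0.5 is the usual decision threshold
print(sigmoid(0.0))

# The curve is symmetric around z = 0: sigmoid(-z) == 1 - sigmoid(z)
print(np.allclose(sigmoid(-z), 1.0 - p))
```

Large negative $z$ pushes the output toward 0, and large positive $z$ pushes it toward 1, without ever reaching either bound.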

---

## 🔹 Hypothesis Function

$$\hat{y} = \sigma(w^T x + b)$$

Where:
- $\hat{y}$: Predicted probability
- $x$: Input features
- $w$: Weights
- $b$: Bias
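
A minimal sketch of the hypothesis function applied to a small batch of inputs. The values of `w`, `b`, and `X` are hypothetical and not taken from any trained model.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical parameters and inputs, chosen only for illustration
w = np.array([0.5, -0.25])      # weights, one per feature
b = 0.1                         # bias
X = np.array([[2.0, 4.0],       # each row is one example x
              [0.0, 0.0],
              [-2.0, 1.0]])

z = X @ w + b                   # w^T x + b for every row at once
y_hat = sigmoid(z)              # predicted probability of class 1
print(z)
print(y_hat)
```

Rows with a positive score $z$ receive a probability above 0.5, and rows with a negative score receive one below 0.5.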

---

## 🔹 Cost Function (Binary Cross-Entropy)

The cost function used to train logistic regression is:

$$
J(w) = -\frac{1}{m} \sum_{i=1}^{m} \left[ y^{(i)} \log(\hat{y}^{(i)}) + (1 - y^{(i)}) \log(1 - \hat{y}^{(i)}) \right]
$$

Where:
- $m$: Number of training examples
- $y^{(i)}$: True label (0 or 1)
- $\hat{y}^{(i)}$: Predicted probability
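
The cost formula above could be computed along the following lines. This is a sketch: the function name `binary_cross_entropy` and the `eps` clipping are assumptions added here for numerical safety, not part of the formula itself, and the sample labels and probabilities are made up.

```python
import numpy as np

def binary_cross_entropy(y_true, y_prob, eps=1e-12):
    """Mean binary cross-entropy J over m examples.

    eps is an assumed safeguard (not part of the formula) that keeps
    log() away from probabilities of exactly 0 or 1.
    """
    y_true = np.asarray(y_true, dtype=float)
    y_prob = np.clip(np.asarray(y_prob, dtype=float), eps, 1.0 - eps)
    return -np.mean(y_true * np.log(y_prob)
                    + (1.0 - y_true) * np.log(1.0 - y_prob))

# Made-up labels and predictions for illustration
y = np.array([1, 0, 1, 0])
mostly_right = np.array([0.9, 0.1, 0.8, 0.2])
mostly_wrong = np.array([0.1, 0.9, 0.2, 0.8])

print(binary_cross_entropy(y, mostly_right))  # small cost
print(binary_cross_entropy(y, mostly_wrong))  # much larger cost
```

The cost grows sharply when the model assigns a low probability to the true class, which is what pushes training toward confident, correct predictions.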

---

## 🔹 Gradient Descent

Gradient descent minimizes the cost function by repeatedly moving the weights a small step against the gradient:

$$w := w - \alpha \cdot \frac{\partial J}{\partial w}$$

Where $\alpha$ is the learning rate. The bias $b$ is updated in the same way, using $\partial J / \partial b$.
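
For binary cross-entropy with a sigmoid output, the gradient takes the closed form $\frac{1}{m} X^T(\hat{y} - y)$, which is the expression `calculate_gradient` uses. The sketch below checks that form against finite differences and then applies a single update. The synthetic data, the helper names `cost` and `gradient`, and the step size `h` are assumptions for illustration only.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def cost(w, X, y):
    """Binary cross-entropy J(w); X already contains a column of ones."""
    p = sigmoid(X @ w)
    return -np.mean(y * np.log(p) + (1 - y) * np.log(1 - p))

def gradient(w, X, y):
    """Analytic gradient of J: X^T (y_hat - y) / m."""
    return X.T @ (sigmoid(X @ w) - y) / y.size

# Small synthetic dataset (hypothetical values, for illustration only)
rng = np.random.default_rng(0)
X = np.c_[np.ones((20, 1)), rng.normal(size=(20, 2))]
y = (rng.random(20) < 0.5).astype(float)
w = rng.normal(size=3)

# Central finite differences approximate each partial derivative of J
h = 1e-6
numeric = np.array([
    (cost(w + h * e, X, y) - cost(w - h * e, X, y)) / (2 * h)
    for e in np.eye(3)
])
max_diff = np.max(np.abs(numeric - gradient(w, X, y)))
print(max_diff)  # should be very close to zero

# One gradient descent update with learning rate alpha
alpha = 0.1
w_new = w - alpha * gradient(w, X, y)
print(cost(w, X, y), cost(w_new, X, y))
```

With a suitably small learning rate, each update should lower the cost; a rate that is too large can overshoot and make it rise instead.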

---

## 🔸 When to Use Logistic Regression

- Binary classification tasks (e.g., spam detection, disease prediction)
- When interpretability is important
- When the **log-odds** of the target are approximately **linear** in the features
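
To see where the log-odds condition comes from, solve the hypothesis $\hat{y} = \sigma(z)$ for $z = w^T x + b$. Starting from the sigmoid definition:

$$\hat{y} = \frac{1}{1 + e^{-z}} \quad\Rightarrow\quad 1 - \hat{y} = \frac{e^{-z}}{1 + e^{-z}} \quad\Rightarrow\quad \frac{\hat{y}}{1 - \hat{y}} = e^{z}$$

Taking the logarithm of both sides gives:

$$\log\frac{\hat{y}}{1 - \hat{y}} = w^T x + b$$

So the model assumes that the log-odds of class 1 are a linear function of the features, even though the probability itself is not.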

---


In [16]:
import numpy as np

In [2]:
def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

In [31]:
def calculate_gradient(theta, X, y):
    m = y.size
    return X.T @ (sigmoid(X @ theta) - y) / m

In [33]:
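def gradient_descent(X, y, alpha=0.1, num_iter=100, tol=1e-7):
    # Prepend a column of ones so that theta[0] acts as the bias term
    X_b = np.c_[np.ones((X.shape[0], 1)), X]

    theta = np.zeros(X_b.shape[1])

    for i in range(num_iter):
        grad = calculate_gradient(theta, X_b, y)
        # Scale the step by the learning rate alpha
        theta = theta - alpha * grad

        # Stop early once the gradient is close to zero
        if np.linalg.norm(grad) < tol:
            break
    return theta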

In [23]:
def predict_proba(X, theta):
    X_b = np.c_[np.ones((X.shape[0], 1)), X]
    return sigmoid(X_b @ theta)

In [24]:
def predict(X, theta, threshold=0.5):
    return (predict_proba(X, theta) >= threshold).astype(int)

In [25]:
from sklearn.datasets import load_breast_cancer
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

In [35]:
X, y = load_breast_cancer(return_X_y=True)
# No random_state is set, so the split (and the scores) vary between runs
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)

# Fit the scaler on the training data only, then reuse it on the test data
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

theta_hat = gradient_descent(X_train_scaled, y_train, alpha=0.1)

y_pred_train = predict(X_train_scaled, theta_hat)
y_pred_test = predict(X_test_scaled, theta_hat)

train_acc = accuracy_score(y_train, y_pred_train)
test_acc = accuracy_score(y_test, y_pred_test)


In [37]:
print(train_acc)
print(test_acc)

0.9868131868131869
0.9736842105263158
