# TP2: Logistic Regression with Gradient Descent

By the end of this lab, students will be able to:

- Understand the mathematical foundation of logistic regression.
- Implement the sigmoid function, the cross-entropy cost function, and gradient descent step by step in Python.
- Apply logistic regression to univariate and multivariate classification tasks.
- Visualize the decision boundary for 2D problems.


In [None]:
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd

## Part A- Univariate Logistic Regression

### Step 1- Data Preparation

- Start from a binary classification dataset (e.g., exam score vs. pass/fail).
- Load the data with NumPy or Pandas.
- Task: draw a scatter plot of the data, marking each class differently.


In [None]:
# Example dataset: exam score vs admitted (1) / not admitted (0)
X = np.array([30, 40, 50, 60, 70, 80, 90])
y = np.array([0, 0, 0, 1, 1, 1, 1])

plt.scatter(X, y, c=y, cmap="bwr", edgecolors="k")
plt.xlabel("Exam Score")
plt.ylabel("Admitted (1) / Not Admitted (0)")
plt.title("Training Data")
plt.show()

### Step 2- Sigmoid Function

The sigmoid function is defined as:

$$
\sigma(z) = \frac{1}{1 + e^{-z}}
$$


In [None]:
def sigmoid(z):
    return 1 / (1 + np.exp(-z))


# Test
print(sigmoid(0))  # expected 0.5
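
For large negative inputs, `np.exp(-z)` overflows and NumPy emits a runtime warning, even though the result still rounds to 0. The following is a minimal sketch of one common workaround, assuming array-like input; the name `stable_sigmoid` is hypothetical and not part of the lab.

```python
import numpy as np


def stable_sigmoid(z):
    """Sigmoid that only ever exponentiates non-positive numbers."""
    z = np.atleast_1d(np.asarray(z, dtype=float))
    out = np.empty_like(z)
    pos = z >= 0
    # For z >= 0, exp(-z) is at most 1, so the usual formula is safe.
    out[pos] = 1.0 / (1.0 + np.exp(-z[pos]))
    # For z < 0, use the equivalent form exp(z) / (1 + exp(z)); exp(z) is below 1.
    exp_z = np.exp(z[~pos])
    out[~pos] = exp_z / (1.0 + exp_z)
    return out


print(stable_sigmoid([-1000.0, 0.0, 1000.0]))  # values 0, 0.5 and 1
```

For the small inputs used in this lab, the plain `sigmoid` above behaves the same way.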

### Step 3- Hypothesis Function

The hypothesis function for logistic regression is given by:

$$
h_\theta(x) = \sigma(\theta_0 + \theta_1 x)
$$


In [None]:
def hypothesis(theta0, theta1, x):
    z = theta0 + theta1 * x
    return sigmoid(z)


# Test
print(hypothesis(0, 1, 0))  # expected 0.5

### Step 4- Cost Function (Cross-Entropy)

The cost function for logistic regression is defined as:

$$
J(\theta_0, \theta_1) = -\frac{1}{m} \sum_{i=1}^{m} \left[ y^{(i)} \log(h_\theta(x^{(i)})) + (1 - y^{(i)}) \log(1 - h_\theta(x^{(i)})) \right]
$$


In [None]:
def compute_cost(theta0, theta1, X, y):
    m = len(y)
    predictions = hypothesis(theta0, theta1, X)
    cost = -(1 / m) * np.sum(
        y * np.log(predictions) + (1 - y) * np.log(1 - predictions)
    )
    return cost


# Test
print(compute_cost(0, 0, X, y))  # expected around 0.693
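
Once a prediction rounds to exactly 0 or 1 in floating point, the `np.log` terms produce `-inf` or `nan`. A possible safeguard, sketched here as an assumption rather than as part of the lab, is to clip the predictions before taking the logarithm; `compute_cost_clipped` and `eps` are hypothetical names.

```python
import numpy as np


def sigmoid(z):
    return 1 / (1 + np.exp(-z))


def compute_cost_clipped(theta0, theta1, X, y, eps=1e-12):
    """Cross-entropy cost with predictions clipped to [eps, 1 - eps]."""
    p = sigmoid(theta0 + theta1 * X)
    p = np.clip(p, eps, 1 - eps)  # keeps both log arguments strictly positive
    return -np.mean(y * np.log(p) + (1 - y) * np.log(1 - p))


X = np.array([30, 40, 50, 60, 70, 80, 90])
y = np.array([0, 0, 0, 1, 1, 1, 1])

print(compute_cost_clipped(0, 0, X, y))     # log(2), about 0.693
print(compute_cost_clipped(-120, 2, X, y))  # finite although some predictions saturate
```

The clipping changes the cost only when a prediction is within `eps` of 0 or 1, so the test value at θ = 0 is unaffected.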

### Step 5- Gradient Descent Update Rule

Both parameters are updated simultaneously with the following rule, for $j \in \{0, 1\}$ and with $x_0^{(i)} = 1$ for the intercept term:

$$
\theta_j := \theta_j - \alpha \frac{1}{m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right) x_j^{(i)}
$$


In [None]:
def gradient_descent(X, y, theta0, theta1, alpha, iterations):
    m = len(y)
    cost_history = []

    for _ in range(iterations):
        predictions = hypothesis(theta0, theta1, X)

        error = predictions - y
        theta0 -= alpha * (1 / m) * np.sum(error)
        theta1 -= alpha * (1 / m) * np.sum(error * X)
        cost = compute_cost(theta0, theta1, X, y)
        cost_history.append(cost)

    return theta0, theta1, cost_history
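
A quick way to gain confidence in the update rule is to compare its gradient terms with finite differences of the cost. The sketch below assumes the same univariate model; `analytic_gradient`, `numerical_gradient`, and the step size `h` are hypothetical names introduced only for this check.

```python
import numpy as np


def sigmoid(z):
    return 1 / (1 + np.exp(-z))


def cost(theta0, theta1, X, y):
    p = sigmoid(theta0 + theta1 * X)
    return -np.mean(y * np.log(p) + (1 - y) * np.log(1 - p))


def analytic_gradient(theta0, theta1, X, y):
    """Partial derivatives of J that appear in the update rule."""
    error = sigmoid(theta0 + theta1 * X) - y
    return np.mean(error), np.mean(error * X)


def numerical_gradient(theta0, theta1, X, y, h=1e-6):
    """Central finite differences of the cost in each parameter."""
    d0 = (cost(theta0 + h, theta1, X, y) - cost(theta0 - h, theta1, X, y)) / (2 * h)
    d1 = (cost(theta0, theta1 + h, X, y) - cost(theta0, theta1 - h, X, y)) / (2 * h)
    return d0, d1


X = np.array([30, 40, 50, 60, 70, 80, 90])
y = np.array([0, 0, 0, 1, 1, 1, 1])

# The two pairs should agree to several decimal places.
print(analytic_gradient(-1.0, 0.02, X, y))
print(numerical_gradient(-1.0, 0.02, X, y))
```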

### Step 6- Training and Visualization

- Initialize θ.
- Run gradient descent loop.
- Plot cost over iterations.
- Plot decision boundary on the data.


In [None]:
theta0, theta1, cost_history = gradient_descent(
    X, y, theta0=0, theta1=0, alpha=0.001, iterations=1000
)

print("Final theta0:", theta0)
print("Final theta1:", theta1)

# Plot cost convergence
plt.plot(cost_history)
plt.xlabel("Iteration")
plt.ylabel("Cost J")
plt.title("Cost Function Convergence")
plt.show()

# Plot decision boundary
plt.scatter(X, y, c=y, cmap="bwr", edgecolors="k")
x_vals = np.linspace(min(X), max(X), 100)
plt.plot(x_vals, hypothesis(theta0, theta1, x_vals), color="green")
plt.xlabel("Exam Score")
plt.ylabel("Admitted Probability")
plt.show()
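
The green curve crosses 0.5 where θ0 + θ1·x = 0, so the decision boundary is the score x = −θ0/θ1. With raw scores between 30 and 90 and a learning rate of 0.001, the intercept moves slowly, so 1000 iterations are likely not enough for the curve to settle. One possible remedy, sketched below as an assumption and not as the lab's prescribed method, is to standardize the scores first (the same idea as the normalization in Part B); `mu`, `sigma`, `Xs`, and `boundary_score` are hypothetical names.

```python
import numpy as np


def sigmoid(z):
    return 1 / (1 + np.exp(-z))


X = np.array([30, 40, 50, 60, 70, 80, 90], dtype=float)
y = np.array([0, 0, 0, 1, 1, 1, 1])

# Standardize the scores so both parameters live on a similar scale.
mu, sigma = X.mean(), X.std()
Xs = (X - mu) / sigma

theta0, theta1 = 0.0, 0.0
alpha, iterations = 0.1, 1000  # illustrative values, not tuned
for _ in range(iterations):
    error = sigmoid(theta0 + theta1 * Xs) - y
    # Simultaneous update of both parameters from the same error vector.
    theta0, theta1 = (
        theta0 - alpha * np.mean(error),
        theta1 - alpha * np.mean(error * Xs),
    )

# h = 0.5 exactly where theta0 + theta1 * xs = 0; map that point back to raw scores.
boundary_score = mu + sigma * (-theta0 / theta1)
print("Decision boundary (exam score):", boundary_score)
```

Because this toy dataset is perfectly separable, the parameters keep growing with more iterations; the boundary location is the meaningful quantity here, not the parameter magnitudes.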

---


## Part B- Multivariate Logistic Regression

### Step 1- Data Preparation

- Use a dataset with multiple features (e.g., admission dataset: exam1, exam2 → admitted).
- Normalize features.
- Add bias column.


In [None]:
# Example dataset (Exam1, Exam2 -> Admitted)
data = {
    "exam1": [34, 78, 50, 85, 60, 45, 82, 70],
    "exam2": [78, 45, 60, 85, 75, 52, 43, 95],
    "admitted": [0, 0, 0, 1, 1, 0, 1, 1],
}
df = pd.DataFrame(data)

X = df[["exam1", "exam2"]].values
y = df["admitted"].values

# Feature normalization
X = (X - X.mean(axis=0)) / X.std(axis=0)

# Add intercept column
X = np.c_[np.ones(X.shape[0]), X]

print("X shape:", X.shape)
print("y shape:", y.shape)
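
Any sample scored later has to go through the same transformation as the training data, using the training mean and standard deviation. A minimal sketch, assuming the statistics are kept in hypothetical variables `mu` and `sigma` and wrapped in a hypothetical helper `prepare`:

```python
import numpy as np

# Raw training features from the example above (exam1, exam2).
X_raw = np.array(
    [[34, 78], [78, 45], [50, 60], [85, 85], [60, 75], [45, 52], [82, 43], [70, 95]],
    dtype=float,
)

# Keep the training statistics so new samples get the same transformation.
mu = X_raw.mean(axis=0)
sigma = X_raw.std(axis=0)


def prepare(X_new):
    """Standardize with the training mean/std, then prepend the bias column."""
    X_new = np.atleast_2d(np.asarray(X_new, dtype=float))
    X_norm = (X_new - mu) / sigma
    return np.c_[np.ones(X_norm.shape[0]), X_norm]


print(prepare([65, 70]))  # one made-up new student, shape (1, 3)
```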

### Step 2- Sigmoid Function + Hypothesis Function

The hypothesis function for multivariate logistic regression is given by:

$$
h_\theta(x) = \sigma(\theta^T x)
$$


In [None]:
def sigmoid(z):
    return 1 / (1 + np.exp(-z))


def hypothesis(theta, X):
    z = np.dot(X, theta)
    return sigmoid(z)
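
The formula is written per sample as θᵀx, while the code computes `np.dot(X, theta)` for all rows at once. The small check below, using made-up numbers, illustrates that the two agree when each row of `X` is one sample with a leading 1.

```python
import numpy as np


def sigmoid(z):
    return 1 / (1 + np.exp(-z))


# Three samples with a leading bias column of ones; values are illustrative.
X = np.array([[1.0, -1.2, 0.5], [1.0, 0.3, -0.7], [1.0, 1.1, 1.4]])
theta = np.array([0.2, 1.0, -0.5])

vectorized = sigmoid(X @ theta)  # shape (3,): one probability per row
looped = np.array([sigmoid(sum(t * x for t, x in zip(theta, row))) for row in X])

print(vectorized.shape, np.allclose(vectorized, looped))  # (3,) True
```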

### Step 3- Vectorized Cost Function

The cost function in vectorized form is:

$$
J(\theta) = -\frac{1}{m} \left[ y^T \log(h_\theta(X)) + (1 - y)^T \log(1 - h_\theta(X)) \right]
$$


In [None]:
def compute_cost(theta, X, y):
    m = len(y)
    predictions = hypothesis(theta, X)
    cost = -(1 / m) * np.sum(
        y * np.log(predictions) + (1 - y) * np.log(1 - predictions)
    )
    return cost
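
The formula uses inner products such as $y^T \log(h_\theta(X))$, whereas `compute_cost` uses an element-wise product followed by `np.sum`. For 1-D arrays these are the same quantity, as this short sketch with made-up labels and probabilities suggests.

```python
import numpy as np

# Illustrative labels and predicted probabilities (not from a fitted model).
y = np.array([0, 1, 1, 0])
h = np.array([0.2, 0.9, 0.6, 0.4])
m = len(y)

# Element-wise product and sum, as in compute_cost.
cost_sum = -(1 / m) * np.sum(y * np.log(h) + (1 - y) * np.log(1 - h))
# Inner products, as in the vectorized formula.
cost_dot = -(1 / m) * (y @ np.log(h) + (1 - y) @ np.log(1 - h))

print(cost_sum, cost_dot)
```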

### Step 4- Vectorized Gradient Descent

The gradient descent update rule in vectorized form is:

$$
\theta := \theta - \alpha \frac{1}{m} X^T (h_\theta(X) - y)
$$


In [None]:
def gradient_descent(X, y, theta, alpha, iterations):
    m = len(y)
    cost_history = []

    for _ in range(iterations):
        predictions = hypothesis(theta, X)

        error = predictions - y
        # Rebind instead of using `-=` so the caller's array is not modified in place
        theta = theta - alpha * (1 / m) * np.dot(X.T, error)
        cost = compute_cost(theta, X, y)
        cost_history.append(cost)

    return theta, cost_history
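
A fixed iteration count can be wasteful or too short. One possible variant, given here as a sketch and not as the lab's required implementation, stops once the cost improves by less than a tolerance; `gradient_descent_tol`, `max_iterations`, and `tol` are hypothetical names.

```python
import numpy as np


def sigmoid(z):
    return 1 / (1 + np.exp(-z))


def compute_cost(theta, X, y):
    p = sigmoid(X @ theta)
    return -np.mean(y * np.log(p) + (1 - y) * np.log(1 - p))


def gradient_descent_tol(X, y, theta, alpha, max_iterations, tol=1e-6):
    """Gradient descent that stops once the cost improves by less than tol."""
    m = len(y)
    cost_history = [compute_cost(theta, X, y)]
    for _ in range(max_iterations):
        error = sigmoid(X @ theta) - y
        theta = theta - alpha * (1 / m) * (X.T @ error)
        cost_history.append(compute_cost(theta, X, y))
        if cost_history[-2] - cost_history[-1] < tol:
            break
    return theta, cost_history


# Tiny non-separable toy set with a bias column (illustrative values).
X = np.array([[1.0, -1.0], [1.0, -0.5], [1.0, 0.0], [1.0, 0.5], [1.0, 1.0]])
y = np.array([0, 1, 0, 1, 1])
theta, history = gradient_descent_tol(X, y, np.zeros(2), alpha=0.5, max_iterations=10000)
print(len(history) - 1, "iterations, final cost", history[-1])
```

On perfectly separable data the cost keeps shrinking slowly, so a tolerance like this mainly acts as a practical cut-off.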

### Step 5- Training and Convergence

- Run gradient descent, track cost.
- Plot cost vs iterations.


In [None]:
theta = np.zeros(X.shape[1])  # initialize
theta, cost_history = gradient_descent(X, y, theta, alpha=0.1, iterations=1000)

print("Learned parameters:", theta)

plt.plot(cost_history)
plt.xlabel("Iteration")
plt.ylabel("Cost J")
plt.title("Multivariate Cost Convergence")
plt.show()
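
Besides the cost curve, training accuracy is a simple sanity check on the learned parameters. The helpers `predict` and `accuracy` below are hypothetical additions, shown on made-up inputs; in the lab they could be called with the learned `theta`, `X`, and `y`.

```python
import numpy as np


def sigmoid(z):
    return 1 / (1 + np.exp(-z))


def predict(theta, X, threshold=0.5):
    """Class labels: 1 where the predicted probability reaches the threshold."""
    return (sigmoid(X @ theta) >= threshold).astype(int)


def accuracy(theta, X, y):
    """Fraction of samples whose predicted label matches y."""
    return np.mean(predict(theta, X) == y)


# Illustrative inputs: bias column plus one standardized feature, made-up theta.
X_demo = np.array([[1.0, -1.5], [1.0, -0.5], [1.0, 0.5], [1.0, 1.5]])
y_demo = np.array([0, 0, 1, 1])
theta_demo = np.array([0.0, 2.0])

print(accuracy(theta_demo, X_demo, y_demo))  # 1.0 for this toy input
```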

### Step 6- Decision Boundary & Comparison

- For 2D features, plot the decision boundary (the line separating the two classes).
- Compare the learned parameters with Scikit-Learn’s `LogisticRegression`.


In [None]:
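from sklearn.linear_model import LogisticRegression

# Fit on the same standardized features used above (without the bias column;
# sklearn fits its own intercept), so the scales are comparable.
# Note: LogisticRegression applies L2 regularization by default (C=1.0), so its
# parameters will not match the unregularized gradient-descent theta exactly.
model = LogisticRegression()
model.fit(X[:, 1:], y)

print("Sklearn intercept:", model.intercept_)
print("Sklearn coefficients:", model.coef_)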
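
For two features, the decision boundary is the set of points where θ0 + θ1·x1 + θ2·x2 = 0, which can be solved for x2. The sketch below computes points on that line; `boundary_line` is a hypothetical helper and the parameters are made up, so in the lab the learned `theta` would be passed instead.

```python
import numpy as np


def boundary_line(theta, x1_min, x1_max, num=50):
    """Points (x1, x2) on the line theta0 + theta1*x1 + theta2*x2 = 0."""
    theta0, theta1, theta2 = theta
    x1 = np.linspace(x1_min, x1_max, num)
    x2 = -(theta0 + theta1 * x1) / theta2  # assumes theta2 != 0
    return x1, x2


# Made-up parameters for illustration only.
theta_demo = np.array([0.5, 2.0, 1.0])
x1_line, x2_line = boundary_line(theta_demo, -2.0, 2.0)

# Every returned point should give z = 0, i.e. a predicted probability of 0.5.
z = theta_demo[0] + theta_demo[1] * x1_line + theta_demo[2] * x2_line
print(np.allclose(z, 0.0))  # True
```

The returned arrays can be drawn with `plt.plot(x1_line, x2_line)` on top of a scatter plot of the standardized features `X[:, 1]` and `X[:, 2]`, since `theta` was learned in that standardized space.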