
# **3.4 Logistic Regression**

Logistic regression is a classification model that estimates the probability of a binary outcome using the logistic (sigmoid) function. The data are points $\{(\alpha_i, b_i) : i = 1, \dots, n\}$, where $\alpha_i \in \mathbb{R}^d$ is a feature vector and $b_i \in \{0, 1\}$ is its label. The model assumes that the log-odds (logit) of label 1 is a linear function of the features: $\log \frac{p(\alpha; x)}{1 - p(\alpha; x)} = \alpha^T x$, where $x \in \mathbb{R}^d$ is the parameter vector. Solving for the probability gives $p(\alpha; x) = \sigma(\alpha^T x)$, where $\sigma(t) = \frac{1}{1 + e^{-t}}$ is the sigmoid function.

To fit $x$, we minimize the cross-entropy loss: $L(x; A, b) = -\frac{1}{n} \sum_{i=1}^{n} [b_i \log(\sigma(\alpha_i^T x)) + (1 - b_i) \log(1 - \sigma(\alpha_i^T x))]$, where $A \in \mathbb{R}^{n \times d}$ is the matrix whose rows are $\alpha_i^T$ and $b$ is the vector of labels. Since $\sigma'(t) = \sigma(t)(1 - \sigma(t))$, the gradient is $\nabla_x L(x; A, b) = -\frac{1}{n} \sum_{i=1}^{n} (b_i - \sigma(\alpha_i^T x)) \alpha_i$, so gradient descent with step size $\beta$ updates the parameter $x$ as: $x_{k+1} = x_k + \beta \frac{1}{n} \sum_{i=1}^{n} (b_i - \sigma(\alpha_i^T x_k)) \alpha_i$. Stochastic gradient descent replaces the average over all samples with a single sample: $x_{k+1} = x_k + \beta (b_I - \sigma(\alpha_I^T x_k)) \alpha_I$, where $I$ is an index chosen uniformly at random from $\{1, \dots, n\}$.
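The stochastic update at the end of the paragraph above can be sketched in a few lines of NumPy. This is a minimal illustration, not the notebook's own implementation: the function name `sgd_logistic_regression` and the `seed` parameter are introduced here for the example, and the small dataset is the same one used in the gradient-descent cell.

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def sgd_logistic_regression(A, b, learning_rate=0.1, num_iterations=1000, seed=0):
    """Stochastic gradient descent: each update uses one randomly chosen sample."""
    rng = np.random.default_rng(seed)  # seeded generator for reproducibility (assumption)
    n, d = A.shape
    x = np.zeros(d)
    for _ in range(num_iterations):
        i = rng.integers(n)  # random index I, drawn uniformly from 0..n-1
        residual = b[i] - sigmoid(A[i].dot(x))  # b_I - sigma(alpha_I^T x_k)
        x += learning_rate * residual * A[i]    # x_{k+1} = x_k + beta * residual * alpha_I
    return x

# Same toy dataset as in the gradient-descent example
A = np.array([[0.5, 1.5], [1.0, 2.0], [1.5, 0.5], [3.0, 3.5], [2.0, 4.0]])
b = np.array([0, 0, 0, 1, 1])

x_sgd = sgd_logistic_regression(A, b, learning_rate=0.1, num_iterations=1000)
print("SGD weights:", x_sgd)
```

Each step touches one row of $A$ instead of all $n$, so the iterates are noisier than those of full gradient descent and depend on the random seed.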

In [1]:
import numpy as np

# Sigmoid function
def sigmoid(z):
    return 1 / (1 + np.exp(-z))

# Loss function (Cross-Entropy)
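def compute_loss(A, b, x):
    predictions = sigmoid(A.dot(x))
    # Clip to avoid log(0) when the sigmoid saturates at 0 or 1
    predictions = np.clip(predictions, 1e-12, 1 - 1e-12)
    return -np.mean(b * np.log(predictions) + (1 - b) * np.log(1 - predictions))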

# Gradient of the loss function
def compute_gradient(A, b, x):
    n = len(b)
    predictions = sigmoid(A.dot(x))
    return A.T.dot(predictions - b) / n

# Logistic Regression using Gradient Descent
def logistic_regression(A, b, learning_rate=0.1, num_iterations=1000):
    # Initialize weights (parameters) to zeros
    x = np.zeros(A.shape[1])

    # Gradient Descent
    for i in range(num_iterations):
        # Compute the gradient
        gradient = compute_gradient(A, b, x)

        # Update weights
        x -= learning_rate * gradient

        # Compute and print the loss every 100 iterations
        if i % 100 == 0:
            loss = compute_loss(A, b, x)
            print(f"Iteration {i}: Loss = {loss}")

    return x

# Example usage
# Simple dataset with two features and a binary label
# A is an (n x d) matrix, where n is the number of samples and d is the number of features
A = np.array([
    [0.5, 1.5],
    [1.0, 2.0],
    [1.5, 0.5],
    [3.0, 3.5],
    [2.0, 4.0]
])

# Binary labels (0 or 1)
b = np.array([0, 0, 0, 1, 1])

# Train the logistic regression model
x_final = logistic_regression(A, b, learning_rate=0.1, num_iterations=1000)

print("\nFinal weights (parameters):", x_final)

# Predicting on new data
def predict(A, x):
    return (sigmoid(A.dot(x)) >= 0.5).astype(int)

# Example prediction (note: [1.0, 2.0] also appears in the training set, with label 0)
new_data = np.array([[1.0, 2.0]])
prediction = predict(new_data, x_final)
print("Prediction for new data [1.0, 2.0]:", prediction)


Iteration 0: Loss = 0.6788940656987339
Iteration 100: Loss = 0.656005436447019
Iteration 200: Loss = 0.6553439501617339
Iteration 300: Loss = 0.6552254332660172
Iteration 400: Loss = 0.6552041363260528
Iteration 500: Loss = 0.6552003034518605
Iteration 600: Loss = 0.6551996131496522
Iteration 700: Loss = 0.6551994887874233
Iteration 800: Loss = 0.6551994663798113
Iteration 900: Loss = 0.6551994623421737

Final weights (parameters): [-0.1774323   0.32352887]
Prediction for new data [1.0, 2.0]: [1]
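
The run above predicts class 1 for [1.0, 2.0], although that point appears in the training set with label 0. A contributing factor is that the model $\sigma(\alpha^T x)$ has no intercept, so its decision boundary must pass through the origin: [1.0, 2.0] (label 0) and [2.0, 4.0] (label 1) lie on the same ray from the origin, so $\alpha^T x$ has the same sign for both and they always receive the same predicted class. The sketch below shows one possible remedy and is not part of the original notebook: it appends a constant column of ones so that the last weight acts as an intercept. The helper names `fit` and `loss` and the variable `A_bias` are introduced here for illustration only.

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def fit(A, b, learning_rate=0.1, num_iterations=1000):
    """Batch gradient descent on the cross-entropy loss, starting from zeros."""
    x = np.zeros(A.shape[1])
    for _ in range(num_iterations):
        x -= learning_rate * A.T.dot(sigmoid(A.dot(x)) - b) / len(b)
    return x

def loss(A, b, x):
    p = sigmoid(A.dot(x))
    return -np.mean(b * np.log(p) + (1 - b) * np.log(1 - p))

A = np.array([[0.5, 1.5], [1.0, 2.0], [1.5, 0.5], [3.0, 3.5], [2.0, 4.0]])
b = np.array([0, 0, 0, 1, 1])

# Assumption: add a constant-1 feature so the last weight plays the role of an intercept
A_bias = np.hstack([A, np.ones((A.shape[0], 1))])

x_plain = fit(A, b)      # same model as in the cell above (no intercept)
x_bias = fit(A_bias, b)  # two feature weights plus an intercept

print("Loss without intercept:", loss(A, b, x_plain))
print("Loss with intercept:   ", loss(A_bias, b, x_bias))
print("Predictions with intercept:", (sigmoid(A_bias.dot(x_bias)) >= 0.5).astype(int))
```

Comparing the two printed losses gives a quick check of whether the extra parameter helps on this dataset. Any new point passed to the intercept model needs the same constant 1 appended before prediction.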
