
# Assignment 3: Predicting Customer Purchase using Logistic Regression

## Dataset

| Customer | Time on site (x₁) | Pages viewed (x₂) | Purchase (y) |
|----------|------------------:|-----------------:|-------------:|
| A        | 1                 | 4                | 0            |
| B        | 2                 | 3                | 0            |
| C        | 3                 | 7                | 1            |
| D        | 5                 | 2                | 1            |
| E        | 6                 | 6                | 1            |



In [8]:
import numpy as np

# Dataset
X = np.array([
    [1, 4],
    [2, 3],
    [3, 7],
    [5, 2],
    [6, 6]
])

y = np.array([0, 0, 1, 1, 1])

# Initial parameters
w = np.array([0.8, 0.4])
b = -4.0
eta = 0.1  # learning rate

# Sigmoid function
def sigmoid(z):
    return 1 / (1 + np.exp(-z))

## 1. Compute Probabilities (5 points)

| Customer | x₁ | x₂ | y | z = w·x + b | 𝑦̂ = σ(z) |
|----------|----|----|---|-------------|------------|
| A        | 1  | 4  | 0 | -1.6        | 0.168      |
| B        | 2  | 3  | 0 | -1.2        | 0.231      |
| C        | 3  | 7  | 1 | 1.2         | 0.769      |
| D        | 5  | 2  | 1 | 0.8         | 0.690      |
| E        | 6  | 6  | 1 | 3.2         | 0.961      |

In [9]:
# Compute initial z and probabilities
z = X.dot(w) + b
y_hat = sigmoid(z)
print("Initial z:", z)
print("Initial probabilities:", y_hat)

Initial z: [-1.6 -1.2  1.2  0.8  3.2]
Initial probabilities: [0.16798161 0.23147522 0.76852478 0.68997448 0.96083428]
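
As a worked check of the first table row, customer A has x₁ = 1 and x₂ = 4. With the initial parameters w = (0.8, 0.4) and b = -4.0, the two columns come out as:

$$
z_A = 0.8 \cdot 1 + 0.4 \cdot 4 - 4.0 = -1.6,
\qquad
\hat{y}_A = \sigma(-1.6) = \frac{1}{1 + e^{1.6}} \approx 0.168
$$

The other rows follow the same pattern with their own x₁ and x₂.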


## 2. Compute Average Loss (6 points)

| Customer | y | 𝑦̂ | Loss |
|----------|---|-----|------|
| A        | 0 | 0.168 | 0.184 |
| B        | 0 | 0.231 | 0.263 |
| C        | 1 | 0.769 | 0.263 |
| D        | 1 | 0.690 | 0.371 |
| E        | 1 | 0.961 | 0.040 |

**Average BCE Loss:** 0.224


In [10]:
# Compute BCE loss
loss = - (y*np.log(y_hat) + (1-y)*np.log(1-y_hat))
avg_loss = np.mean(loss)
print("Loss per customer:", loss)
print("Average BCE loss:", avg_loss)

Loss per customer: [0.18390074 0.26328247 0.26328247 0.37110067 0.03995333]
Average BCE loss: 0.2243039349349218
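
The per-customer loss is the binary cross-entropy, -[y ln(ŷ) + (1 - y) ln(1 - ŷ)]. Evaluating it directly works here because no probability is exactly 0 or 1. The following is a minimal sketch of a more defensive variant. The helper name `bce_loss` and the clipping constant `eps` are assumptions made for illustration and are not part of the assignment.

```python
import numpy as np

def bce_loss(y_true, y_prob, eps=1e-12):
    """Average binary cross-entropy with probabilities clipped away from 0 and 1."""
    y_true = np.asarray(y_true, dtype=float)
    # Clipping keeps np.log finite when a probability is exactly 0 or 1.
    p = np.clip(np.asarray(y_prob, dtype=float), eps, 1 - eps)
    per_sample = -(y_true * np.log(p) + (1 - y_true) * np.log(1 - p))
    return per_sample.mean()

# Same data and initial parameters as in the assignment.
X = np.array([[1, 4], [2, 3], [3, 7], [5, 2], [6, 6]])
y = np.array([0, 0, 1, 1, 1])
w, b = np.array([0.8, 0.4]), -4.0

y_hat = 1 / (1 + np.exp(-(X @ w + b)))
avg = bce_loss(y, y_hat)
print(f"Average BCE loss: {avg:.3f}")

# Hypothetical edge case: exact 0/1 probabilities still give a finite loss.
edge = bce_loss([1, 0], [1.0, 0.0])
print("Edge-case loss is finite:", np.isfinite(edge))
```

On the assignment data the clipping changes nothing, so `avg` should agree with the average loss computed above.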


## 3. Update the slope and intercept using Gradient Descent (6 points)

**Gradients** (m₁ and m₂ are the slopes, stored as `w[0]` and `w[1]` in the code):

- ∂L/∂m₁ = (1/N) Σ ((ŷ - y) * x₁) ≈ -0.370
- ∂L/∂m₂ = (1/N) Σ ((ŷ - y) * x₂) ≈ -0.222
- ∂L/∂b  = (1/N) Σ (ŷ - y) ≈ -0.036
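
These expressions follow from the chain rule. For a single customer, write the loss as $\ell = -[\,y \ln \hat{y} + (1 - y)\ln(1 - \hat{y})\,]$ with $\hat{y} = \sigma(z)$ and $z = m_1 x_1 + m_2 x_2 + b$ (the symbol $\ell$ is introduced here only for this derivation):

$$
\frac{\partial \ell}{\partial \hat{y}} = \frac{\hat{y} - y}{\hat{y}(1 - \hat{y})},
\qquad
\frac{d\hat{y}}{dz} = \hat{y}(1 - \hat{y})
\quad\Longrightarrow\quad
\frac{\partial \ell}{\partial z} = \hat{y} - y
$$

$$
\frac{\partial \ell}{\partial m_j} = (\hat{y} - y)\,x_j,
\qquad
\frac{\partial \ell}{\partial b} = \hat{y} - y
$$

Averaging over the N = 5 customers gives the three gradients listed above.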

**Updated Parameters (one step of gradient descent, η = 0.1):**

- m₁_new = m₁ - η * ∂L/∂m₁ ≈ 0.837  
- m₂_new = m₂ - η * ∂L/∂m₂ ≈ 0.422  
- b_new  = b - η * ∂L/∂b  ≈ -3.996  

**Summary Table:**

| Parameter | Gradient | Updated Value |
|-----------|----------|---------------|
| m₁        | -0.370   | 0.837         |
| m₂        | -0.222   | 0.422         |
| b         | -0.036   | -3.996        |

In [11]:
# Step 3: Gradients and parameter update

# Gradients
grad_w = np.mean((y_hat - y)[:, None] * X, axis=0)
grad_b = np.mean(y_hat - y)

# Update parameters
w_new = w - eta * grad_w
b_new = b - eta * grad_b

# Print gradients and updated values
print("Gradients:")
print(f"∂L/∂m1 = {grad_w[0]:.3f}, ∂L/∂m2 = {grad_w[1]:.3f}, ∂L/∂b = {grad_b:.3f}")
print("Updated parameters:")
print(f"m1_new = {w_new[0]:.3f}, m2_new = {w_new[1]:.3f}, b_new = {b_new:.3f}")


Gradients:
∂L/∂m1 = -0.370, ∂L/∂m2 = -0.222, ∂L/∂b = -0.036
Updated parameters:
m1_new = 0.837, m2_new = 0.422, b_new = -3.996
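
One way to gain confidence in the analytic gradients is to compare them with central finite differences of the average loss. The sketch below does that for the initial parameters. The helper `avg_bce` and the step size `h` are assumptions chosen for illustration.

```python
import numpy as np

X = np.array([[1, 4], [2, 3], [3, 7], [5, 2], [6, 6]], dtype=float)
y = np.array([0, 0, 1, 1, 1], dtype=float)
w = np.array([0.8, 0.4])
b = -4.0

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def avg_bce(w, b):
    """Average binary cross-entropy over the five customers."""
    y_hat = sigmoid(X @ w + b)
    return np.mean(-(y * np.log(y_hat) + (1 - y) * np.log(1 - y_hat)))

# Analytic gradients, same formulas as in the assignment.
y_hat = sigmoid(X @ w + b)
grad_w = np.mean((y_hat - y)[:, None] * X, axis=0)
grad_b = np.mean(y_hat - y)

# Central differences: perturb one parameter at a time by +/- h.
h = 1e-6  # assumed step size
num_grad_w = np.zeros_like(w)
for j in range(len(w)):
    step = np.zeros_like(w)
    step[j] = h
    num_grad_w[j] = (avg_bce(w + step, b) - avg_bce(w - step, b)) / (2 * h)
num_grad_b = (avg_bce(w, b + h) - avg_bce(w, b - h)) / (2 * h)

print("Analytic :", grad_w, grad_b)
print("Numerical:", num_grad_w, num_grad_b)
```

If the two printed rows agree to several decimal places, the gradient formulas and their implementation are consistent with the loss.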


## 4. Compute new probabilities using the new slopes and intercept (5 points)

| Customer | x₁ | x₂ | y | z_new = w_new·x + b_new | new 𝑦̂ |
|----------|----|----|---|-------------------------|---------|
| A        | 1  | 4  | 0 | -1.471                  | 0.187   |
| B        | 2  | 3  | 0 | -1.056                  | 0.258   |
| C        | 3  | 7  | 1 | 1.470                   | 0.813   |
| D        | 5  | 2  | 1 | 1.033                   | 0.737   |
| E        | 6  | 6  | 1 | 3.559                   | 0.972   |


In [12]:
# Compute new probabilities
z_new = X.dot(w_new) + b_new
y_hat_new = sigmoid(z_new)
print("New z:", z_new)
print("New probabilities:", y_hat_new)

New z: [-1.47068191 -1.05589     1.46980389  1.03284654  3.55854043]
New probabilities: [0.18683899 0.25809566 0.81302758 0.73746738 0.97230831]
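
To turn these probabilities into purchase predictions, a cutoff is needed. The sketch below assumes a threshold of 0.5, which the assignment does not specify, and the names `threshold`, `y_pred`, and `accuracy` are introduced only for this example.

```python
import numpy as np

X = np.array([[1, 4], [2, 3], [3, 7], [5, 2], [6, 6]])
y = np.array([0, 0, 1, 1, 1])
w, b, eta = np.array([0.8, 0.4]), -4.0, 0.1

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

# One gradient descent step, as in step 3.
y_hat = sigmoid(X @ w + b)
w_new = w - eta * np.mean((y_hat - y)[:, None] * X, axis=0)
b_new = b - eta * np.mean(y_hat - y)

# Classify as "purchase" when the updated probability reaches the cutoff.
threshold = 0.5  # assumed cutoff, not given in the assignment
y_hat_new = sigmoid(X @ w_new + b_new)
y_pred = (y_hat_new >= threshold).astype(int)
accuracy = np.mean(y_pred == y)

print("Predicted labels:", y_pred)
print("Accuracy:", accuracy)
```

With this cutoff, customers A and B fall below 0.5 and C, D, and E fall above it, so all five predictions match the labels in the dataset.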


## 5. Compute new Average Loss (6 points)

| Customer | y | new 𝑦̂ | Loss_new |
|----------|---|---------|----------|
| A        | 0 | 0.187   | 0.207 |
| B        | 0 | 0.258   | 0.299 |
| C        | 1 | 0.813   | 0.207 |
| D        | 1 | 0.737   | 0.305 |
| E        | 1 | 0.972   | 0.028 |

**New Average BCE Loss:** 0.209


In [13]:
# Compute new BCE loss
loss_new = - (y*np.log(y_hat_new) + (1-y)*np.log(1-y_hat_new))
avg_loss_new = np.mean(loss_new)
print("New loss per customer:", loss_new)
print("New average BCE loss:", avg_loss_new)

New loss per customer: [0.20682614 0.29853497 0.20699025 0.30453342 0.02808234]
New average BCE loss: 0.2089934237451554
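
Steps 1 through 5 together make up one iteration of gradient descent, and repeating them is a loop. The following is a rough sketch of that loop; the function name `gradient_descent`, the `n_steps` parameter, and the choice of five steps are assumptions for illustration.

```python
import numpy as np

X = np.array([[1, 4], [2, 3], [3, 7], [5, 2], [6, 6]])
y = np.array([0, 0, 1, 1, 1])

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def gradient_descent(X, y, w, b, eta=0.1, n_steps=5):
    """Run n_steps updates; return final parameters and the loss before each update."""
    history = []
    for _ in range(n_steps):
        y_hat = sigmoid(X @ w + b)
        loss = np.mean(-(y * np.log(y_hat) + (1 - y) * np.log(1 - y_hat)))
        history.append(loss)
        # Same gradient formulas as in step 3.
        grad_w = np.mean((y_hat - y)[:, None] * X, axis=0)
        grad_b = np.mean(y_hat - y)
        w = w - eta * grad_w
        b = b - eta * grad_b
    return w, b, history

w_final, b_final, history = gradient_descent(
    X, y, np.array([0.8, 0.4]), -4.0, eta=0.1, n_steps=5
)
for step, loss in enumerate(history):
    print(f"step {step}: average BCE loss = {loss:.4f}")
print("Final parameters:", w_final, b_final)
```

The first two entries of `history` correspond to the average losses from steps 2 and 5 (about 0.224 and 0.209), and calling the function with `n_steps=1` should reproduce the updated parameters from step 3.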
