## 📘 Logistic Regression Loss Function Derivation

Logistic regression is a binary classification model that predicts the probability that a given input belongs to class 1 (vs. class 0).

---

### 🧮 Step 1: Model Definition

Given an input vector **x** ∈ ℝᵈ and weights **w** ∈ ℝᵈ, the logistic model outputs a probability:

$$
\hat{y} = \sigma(\mathbf{w}^\top \mathbf{x}) = \frac{1}{1 + e^{-\mathbf{w}^\top \mathbf{x}}}
$$

Here, $\sigma(z) = \frac{1}{1 + e^{-z}}$ is the **sigmoid** function, which maps any real-valued score $z$ to a probability in $(0, 1)$.
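
As a rough illustration of the model definition, the sketch below evaluates $\sigma(\mathbf{w}^\top \mathbf{x})$ with the standard library only. The function names `sigmoid` and `predict_proba` are hypothetical, and the branch on the sign of `z` is an assumed numerical-stability trick rather than part of the definition.

```python
import math

def sigmoid(z: float) -> float:
    """Sigmoid 1 / (1 + e^{-z}), evaluated without overflowing exp()."""
    if z >= 0:
        # exp(-z) is in (0, 1] here, so it cannot overflow.
        return 1.0 / (1.0 + math.exp(-z))
    # For z < 0 use the equivalent form e^{z} / (1 + e^{z}).
    ez = math.exp(z)
    return ez / (1.0 + ez)

def predict_proba(w, x) -> float:
    """Predicted probability of class 1 for weights w and input x."""
    z = sum(wi * xi for wi, xi in zip(w, x))  # z = w^T x
    return sigmoid(z)

# Example with made-up numbers: w^T x = 0, so the model is undecided.
p = predict_proba([1.0, -1.0], [2.0, 2.0])
```

The two branches are algebraically identical; splitting them only keeps the argument of `math.exp` non-positive.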

---

### 🧷 Step 2: Binary Cross-Entropy Loss

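For a single data point $(\mathbf{x}, y)$ with true label $y \in \{0, 1\}$, the **binary cross-entropy loss**, i.e. the negative log-likelihood of $y$ under a Bernoulli distribution with parameter $\hat{y}$, is: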

$$
\mathcal{L}(\hat{y}, y) = - \left[ y \log(\hat{y}) + (1 - y) \log(1 - \hat{y}) \right]
$$

Substituting $\hat{y} = \sigma(\mathbf{w}^\top \mathbf{x})$ expresses the loss as a function of the weights:

$$
\mathcal{L}(\mathbf{w}) = - \left[ y \log\left(\sigma(\mathbf{w}^\top \mathbf{x})\right) + (1 - y) \log\left(1 - \sigma(\mathbf{w}^\top \mathbf{x})\right) \right]
$$
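
A minimal sketch of the per-example loss in code may help make the formula concrete. The name `binary_cross_entropy` and the clipping constant `eps` are assumptions introduced here to keep `log` away from zero; they are not part of the derivation.

```python
import math

def binary_cross_entropy(y_hat: float, y: int, eps: float = 1e-12) -> float:
    """Loss -[y log(y_hat) + (1 - y) log(1 - y_hat)] for one data point."""
    # Clip the prediction so that log(0) never occurs (assumed safeguard).
    y_hat = min(max(y_hat, eps), 1.0 - eps)
    return -(y * math.log(y_hat) + (1 - y) * math.log(1.0 - y_hat))

# A confident correct prediction costs little; a confident wrong one costs a lot.
loss_good = binary_cross_entropy(0.9, 1)
loss_bad = binary_cross_entropy(0.1, 1)
```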

---

### 🧮 Step 3: Simplified Expression

Let $z = \mathbf{w}^\top \mathbf{x}$. Then:

$$
\mathcal{L}(z, y) = - \left[ y \log(\sigma(z)) + (1 - y) \log(1 - \sigma(z)) \right]
$$

Recall:

- $\sigma(z) = \frac{1}{1 + e^{-z}}$
- $1 - \sigma(z) = \frac{e^{-z}}{1 + e^{-z}}$

Plug in:

$$
\mathcal{L}(z, y) = - \left[ y \log\left( \frac{1}{1 + e^{-z}} \right) + (1 - y) \log\left( \frac{e^{-z}}{1 + e^{-z}} \right) \right]
$$

Simplify using $\log\frac{1}{a} = -\log a$ and $\log\frac{e^{-z}}{a} = -z - \log a$, with $a = 1 + e^{-z}$:

$$
\mathcal{L}(z, y) = - \left[ y (-\log(1 + e^{-z})) + (1 - y)(-z - \log(1 + e^{-z})) \right]
$$

$$
= y \log(1 + e^{-z}) + (1 - y)(z + \log(1 + e^{-z}))
$$

Since $y + (1 - y) = 1$, the two $\log(1 + e^{-z})$ terms combine into one:

$$
= \log(1 + e^{-z}) + (1 - y)z
$$

✅ **Final compact form of logistic loss:**

$$
\mathcal{L}(z, y) = \log(1 + e^{-z}) + (1 - y)z
$$
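
The algebra above can be sanity-checked numerically. The sketch below compares the compact form against the original cross-entropy expression for a few logits; `softplus`, `logistic_loss`, and `bce_from_logit` are hypothetical helper names, and the `log1p`-based evaluation of $\log(1 + e^{t})$ is an assumed stability trick.

```python
import math

def softplus(t: float) -> float:
    """log(1 + e^{t}), rewritten so that exp() never overflows."""
    return max(t, 0.0) + math.log1p(math.exp(-abs(t)))

def logistic_loss(z: float, y: int) -> float:
    """Compact form: log(1 + e^{-z}) + (1 - y) z."""
    return softplus(-z) + (1 - y) * z

def bce_from_logit(z: float, y: int) -> float:
    """Direct form: -[y log(sigma(z)) + (1 - y) log(1 - sigma(z))]."""
    p = 1.0 / (1.0 + math.exp(-z))
    return -(y * math.log(p) + (1 - y) * math.log(1.0 - p))

# Largest disagreement between the two forms over a small grid of logits.
max_gap = max(
    abs(logistic_loss(z, y) - bce_from_logit(z, y))
    for z in (-3.0, -0.5, 0.0, 2.0)
    for y in (0, 1)
)
```

Unlike the direct form, the compact form never takes the logarithm of a probability, so it stays finite even for logits of very large magnitude.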

Or, more commonly, split by label (for $y = 0$, note that $\log(1 + e^{-z}) + z = \log(1 + e^{z})$):

$$
\mathcal{L}(\mathbf{w}) = \log\left(1 + e^{-\mathbf{w}^\top \mathbf{x}}\right) \quad \text{if } y = 1
$$

$$
\mathcal{L}(\mathbf{w}) = \log\left(1 + e^{\mathbf{w}^\top \mathbf{x}}\right) \quad \text{if } y = 0
$$

---

### ✏️ Gradient (Optional)

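To minimize the loss with gradient-based methods, differentiate the compact form with respect to $z$ and apply the chain rule with $z = \mathbf{w}^\top \mathbf{x}$:

$$
\frac{\partial \mathcal{L}}{\partial z} = -\frac{e^{-z}}{1 + e^{-z}} + (1 - y) = \sigma(z) - y
$$

$$
\nabla_{\mathbf{w}} \mathcal{L} = \left(\sigma(\mathbf{w}^\top \mathbf{x}) - y\right) \mathbf{x}
$$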
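
For the gradient of the per-example loss, the standard closed form is $(\sigma(\mathbf{w}^\top \mathbf{x}) - y)\,\mathbf{x}$. The sketch below checks that expression against central finite differences of the compact loss. All function names, the step size `h`, and the example numbers are assumptions chosen for illustration.

```python
import math

def sigmoid(z: float) -> float:
    """Overflow-safe sigmoid."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def loss(w, x, y: int) -> float:
    """Compact logistic loss log(1 + e^{-z}) + (1 - y) z with z = w^T x."""
    z = sum(wi * xi for wi, xi in zip(w, x))
    return max(-z, 0.0) + math.log1p(math.exp(-abs(z))) + (1 - y) * z

def analytic_grad(w, x, y: int):
    """Closed-form gradient (sigma(z) - y) * x."""
    z = sum(wi * xi for wi, xi in zip(w, x))
    return [(sigmoid(z) - y) * xi for xi in x]

def numerical_grad(w, x, y: int, h: float = 1e-6):
    """Central finite differences, one coordinate of w at a time."""
    grad = []
    for i in range(len(w)):
        w_plus = list(w)
        w_minus = list(w)
        w_plus[i] += h
        w_minus[i] -= h
        grad.append((loss(w_plus, x, y) - loss(w_minus, x, y)) / (2 * h))
    return grad

# Made-up example point; the two gradients should agree closely.
w, x, y = [0.3, -0.7, 1.2], [1.0, 2.0, -0.5], 1
g_exact = analytic_grad(w, x, y)
g_approx = numerical_grad(w, x, y)
```

A check like this is a cheap way to catch sign errors before using the gradient in an optimizer.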


In [3]:
import numpy as np

In [None]:
x=np.random.normal(