## Deriving the Logistic Regression Objective Function Using Maximum Likelihood Estimation (MLE)

### Logistic Regression Model

We use the logistic (sigmoid) function defined as:

\[
\theta(z) = \frac{1}{1 + e^{-z}}
\]

For logistic regression, the model is:

\[
P(Y = 1 \mid X = x) = \theta(w^T x) = \frac{e^{w^T x}}{1 + e^{w^T x}}
\]

\[
P(Y = 0 \mid X = x) = 1 - \theta(w^T x) = \theta(-w^T x)
\]
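
The identity \( 1 - \theta(z) = \theta(-z) \) used in the second equation follows directly from the definition of \( \theta \). As a short worked step:

\[
1 - \theta(z) = 1 - \frac{1}{1 + e^{-z}} = \frac{e^{-z}}{1 + e^{-z}} = \frac{1}{1 + e^{z}} = \theta(-z)
\]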

These two cases can be combined into a single expression:

\[
P(Y = y \mid X = x) = \theta(w^T x)^y \, \theta(-w^T x)^{1 - y}
\]

where \( y \in \{0,1\} \).
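
As a minimal sketch of the model above, the following Python evaluates \( \theta \) and the combined expression for \( P(Y = y \mid X = x) \). The function names (`sigmoid`, `dot`, `prob`) and the example weights and input are hypothetical, chosen only for illustration. The sigmoid switches between the two equivalent forms given above, depending on the sign of \( z \), so that `math.exp` never overflows.

```python
import math


def sigmoid(z):
    """Logistic function theta(z) = 1 / (1 + e^{-z})."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    # For negative z, use the equivalent form e^{z} / (1 + e^{z}).
    ez = math.exp(z)
    return ez / (1.0 + ez)


def dot(w, x):
    """Inner product w^T x of two equal-length sequences."""
    return sum(wj * xj for wj, xj in zip(w, x))


def prob(y, x, w):
    """P(Y = y | X = x) = theta(w^T x)^y * theta(-w^T x)^(1 - y), y in {0, 1}."""
    z = dot(w, x)
    return sigmoid(z) ** y * sigmoid(-z) ** (1 - y)


# Hypothetical weights and input, chosen so that w^T x = 0.
w = [0.5, -1.0]
x = [2.0, 1.0]
p1 = prob(1, x, w)  # theta(0) = 0.5
p0 = prob(0, x, w)  # theta(-0) = 0.5
```

For any `x` and `w`, `prob(0, x, w) + prob(1, x, w)` should equal 1 up to rounding, which is a quick check that the combined expression defines a valid distribution over \( y \).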

---

### Likelihood Function

Assuming the \( n \) training pairs \( (x_i, y_i) \) are independent, the likelihood of the parameters \( w \) factorizes over the examples:

\[
L(w) = P(Y \mid X, w) = \prod_{i=1}^{n} P(y_i \mid x_i, w)
\]

Substituting the logistic model:

\[
L(w) = \prod_{i=1}^{n} \theta(w^T x_i)^{y_i} \, \theta(-w^T x_i)^{1 - y_i}
\]
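
To make the product concrete, here is a small sketch that evaluates \( L(w) \) on a toy data set. The data, the weight vectors, and the function name `likelihood` are hypothetical and serve only as an illustration; the first feature is assumed to be a constant 1 acting as a bias term.

```python
import math


def sigmoid(z):
    """Logistic function theta(z), written to avoid overflow in exp."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def likelihood(w, X, y):
    """L(w) = prod_i theta(w^T x_i)^{y_i} * theta(-w^T x_i)^{1 - y_i}."""
    L = 1.0
    for xi, yi in zip(X, y):
        z = sum(wj * xj for wj, xj in zip(w, xi))
        L *= sigmoid(z) ** yi * sigmoid(-z) ** (1 - yi)
    return L


# Hypothetical toy data: a constant bias feature plus one input feature.
X = [[1.0, -2.0], [1.0, -1.0], [1.0, 1.0], [1.0, 2.0]]
y = [0, 0, 1, 1]

# With w = 0 every factor is theta(0) = 0.5, so L = 0.5 ** 4.
L_zero = likelihood([0.0, 0.0], X, y)

# Weights that assign high probability to the observed labels give a larger L.
L_good = likelihood([0.0, 2.0], X, y)
```

Comparing `L_zero` and `L_good` shows the idea behind maximum likelihood: among candidate weight vectors, prefer the one under which the observed labels are most probable.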

---

### Log-Likelihood

Because the logarithm is strictly increasing, \( L(w) \) and \( \ln L(w) \) are maximized by the same \( w \). We therefore maximize the log-likelihood, which
converts the product into a sum:

\[
\ell(w) = \ln L(w)
\]

\[
\ell(w) = \sum_{i=1}^{n} \left[
y_i \ln \theta(w^T x_i) + (1 - y_i)\ln \theta(-w^T x_i)
\right]
\]

Using the identity \( \theta(-z) = 1 - \theta(z) \), this becomes:

\[
\ell(w) = \sum_{i=1}^{n} \left[
y_i \ln \theta(w^T x_i) + (1 - y_i)\ln (1 - \theta(w^T x_i))
\right]
\]


Maximizing the log-likelihood is equivalent to minimizing the negative log-likelihood:

\[
E(w) = -\ell(w)
\]

\[
E(w) = -\sum_{i=1}^{n} \left[
y_i \ln \theta(w^T x_i) + (1 - y_i)\ln (1 - \theta(w^T x_i))
\right]
\]

\( E(w) \) is the binary cross-entropy loss (also called log loss), summed over the training examples, and it is the standard
objective function used in logistic regression.
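
One way to evaluate \( E(w) \) in code, sketched here under stated assumptions, is to rewrite each term with \( -\ln \theta(z) = \ln(1 + e^{-z}) \) and \( -\ln(1 - \theta(z)) = \ln(1 + e^{z}) \). This avoids taking the logarithm of a probability that has rounded to 0. The helper `softplus`, the function name `neg_log_likelihood`, and the toy data are hypothetical and not part of the derivation above.

```python
import math


def softplus(t):
    """ln(1 + e^t), computed without overflow for large |t|."""
    return max(t, 0.0) + math.log1p(math.exp(-abs(t)))


def neg_log_likelihood(w, X, y):
    """E(w) = -sum_i [ y_i ln theta(z_i) + (1 - y_i) ln(1 - theta(z_i)) ].

    Uses -ln theta(z) = softplus(-z) and -ln(1 - theta(z)) = softplus(z).
    """
    total = 0.0
    for xi, yi in zip(X, y):
        z = sum(wj * xj for wj, xj in zip(w, xi))
        total += yi * softplus(-z) + (1 - yi) * softplus(z)
    return total


# Hypothetical toy data: a constant bias feature plus one input feature.
X = [[1.0, -2.0], [1.0, -1.0], [1.0, 1.0], [1.0, 2.0]]
y = [0, 0, 1, 1]

E_zero = neg_log_likelihood([0.0, 0.0], X, y)        # equals 4 * ln 2
E_good = neg_log_likelihood([0.0, 2.0], X, y)        # smaller: better fit to the labels
E_extreme = neg_log_likelihood([0.0, 1000.0], X, y)  # stays finite for very large |z|
```

A lower value of \( E(w) \) corresponds to a higher likelihood, so `E_good < E_zero` expresses the same preference as comparing the likelihoods directly.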

---

### Final Objective

The maximum-likelihood estimate of \( w \) is the solution of:

\[
\boxed{
\min_{w}
\; -\sum_{i=1}^{n}
\left[
y_i \ln \theta(w^T x_i) + (1 - y_i)\ln (1 - \theta(w^T x_i))
\right]
}
\]
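
The derivation does not prescribe how to solve this minimization. As one possible illustration, the sketch below uses plain gradient descent, relying on the standard result (not derived in this section) that the gradient of the objective is \( \sum_{i=1}^{n} (\theta(w^T x_i) - y_i)\, x_i \). The function names, learning rate, step count, and toy data are all hypothetical choices for the example.

```python
import math


def sigmoid(z):
    """Logistic function theta(z), written to avoid overflow in exp."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def softplus(t):
    """ln(1 + e^t), computed without overflow for large |t|."""
    return max(t, 0.0) + math.log1p(math.exp(-abs(t)))


def objective(w, X, y):
    """The boxed objective: negative log-likelihood E(w)."""
    total = 0.0
    for xi, yi in zip(X, y):
        z = sum(wj * xj for wj, xj in zip(w, xi))
        total += yi * softplus(-z) + (1 - yi) * softplus(z)
    return total


def gradient(w, X, y):
    """Gradient of E(w): sum_i (theta(w^T x_i) - y_i) * x_i."""
    g = [0.0] * len(w)
    for xi, yi in zip(X, y):
        err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi))) - yi
        for j, xj in enumerate(xi):
            g[j] += err * xj
    return g


def fit(X, y, lr=0.1, steps=500):
    """Plain gradient descent starting from w = 0 (lr and steps are illustrative)."""
    w = [0.0] * len(X[0])
    for _ in range(steps):
        g = gradient(w, X, y)
        w = [wj - lr * gj for wj, gj in zip(w, g)]
    return w


# Hypothetical toy data that is not linearly separable, so a finite minimizer exists.
X = [[1.0, -2.0], [1.0, -1.0], [1.0, 0.5], [1.0, 1.0], [1.0, 2.0], [1.0, -0.5]]
y = [0, 0, 0, 1, 1, 1]

w_hat = fit(X, y)
E_start = objective([0.0, 0.0], X, y)
E_end = objective(w_hat, X, y)  # lower than E_start after the descent steps
```

Because the objective is convex in \( w \), a sufficiently small step size moves the iterates toward a global minimizer; in practice, library solvers typically use more sophisticated optimizers than this sketch.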
