Logistic Regression: consistency for cases. Fix display errors
chrisyeh96 committed Apr 1, 2020
1 parent 0068c06 commit a824b92
46 changes: 24 additions & 22 deletions _posts/2018-06-11-logistic-regression.md
@@ -92,47 +92,49 @@ For a given $$x$$, if $$p(y=1) \geq 0.5$$, then we predict $$\hat{y}=1$$. Otherwise, we predict $$\hat{y}=0$$.
In the equation above, if we solve for the score $$z$$, we see that we can interpret the score as the log-odds of $$y=1$$ (a.k.a. the **logits**).

$$
\begin{gather*}
p(y=1) = \sigma(z) = \frac{1}{1 + e^{-z}} \\
1 + e^{-z} = \frac{1}{p(y=1)} \\
e^{-z} = \frac{1}{p(y=1)} - 1 = \frac{1 - p(y=1)}{p(y=1)} \\
e^z = \frac{p(y=1)}{1 - p(y=1)} \\
z = \log \frac{p(y=1)}{1 - p(y=1)}
\end{gather*}
$$
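
As a quick sanity check, the relationship above can be verified numerically: applying the log-odds (logit) transform to $$\sigma(z)$$ recovers the score $$z$$. Below is a minimal sketch assuming NumPy; the helper names `sigmoid` and `logit` are illustrative, not from the post.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def logit(p):
    # log-odds: the inverse of the sigmoid
    return np.log(p / (1.0 - p))

z = np.array([-2.0, 0.0, 3.5])
p = sigmoid(z)
print(np.allclose(logit(p), z))  # True: the score is recovered from p(y=1)
```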

**Loss function**: For a single example $$x$$ with score $$z$$ and label $$y$$, the logistic loss function is

$$
\begin{align*}
\text{loss}
&= -y\log(\sigma(z)) - (1-y)\log(1 - \sigma(z)) \\
&= \begin{cases}
-\log(1 - \sigma(z)) & y = 0 \\
-\log(\sigma(z)) & y = 1
\end{cases} \\
&= \begin{cases}
-\log\left( \frac{e^{-z}}{1 + e^{-z}} \right) & y = 0 \\
\log(1 + e^{-z}) & y = 1
\end{cases} \\
&= \begin{cases}
\log(1 + e^z) & y = 0 \\
\log(1 + e^{-z}) & y = 1
\end{cases}
\end{align*}
$$
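
The final case form above maps directly onto a numerically stable implementation: `np.logaddexp(0, s)` evaluates $$\log(1 + e^{s})$$ without overflow even for large scores. A minimal sketch (the function name `logistic_loss` and the NumPy dependency are assumptions, not from the post):

```python
import numpy as np

def logistic_loss(z, y):
    z = np.asarray(z, dtype=float)
    # loss = log(1 + e^{z}) when y = 0, and log(1 + e^{-z}) when y = 1
    s = np.where(y == 1, -z, z)
    return np.logaddexp(0.0, s)  # log(1 + e^s), computed without overflow

# agrees with the definition -y*log(sigma(z)) - (1-y)*log(1 - sigma(z))
z, y = 2.0, 1
sigma = 1.0 / (1.0 + np.exp(-z))
print(np.isclose(logistic_loss(z, y), -np.log(sigma)))  # True
```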

In the 2nd line of the equation above, it is clear that in the probabilistic interpretation of our model, this loss function is exactly the negative log probability of a single example $$x$$ having true label $$y$$. Thus, minimizing the sum of the loss over our training examples is equivalent to maximizing the log likelihood. We can see this as follows:

$$
\begin{align*}
p(y|x; w,b)
&= \begin{cases}
1 - \sigma(z) & y=0 \\
\sigma(z) & y=1
\end{cases} \\
&= \sigma(z)^y (1-\sigma(z))^{(1-y)} \\
\log p(y|x; w,b)
&= y \log \sigma(z) + (1-y) \log (1-\sigma(z))
\end{align*}
$$
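
Summing the per-example loss and negating the summed log-likelihood give the same number, which can be checked numerically. A minimal sketch under the same assumptions as the snippets above (NumPy, illustrative variable names):

```python
import numpy as np

rng = np.random.default_rng(0)
z = rng.normal(size=100)          # scores w.x + b for 100 examples
y = rng.integers(0, 2, size=100)  # labels in {0, 1}

sigma = 1.0 / (1.0 + np.exp(-z))
log_likelihood = np.sum(y * np.log(sigma) + (1 - y) * np.log(1 - sigma))
total_loss = np.sum(np.logaddexp(0.0, np.where(y == 1, -z, z)))

print(np.isclose(total_loss, -log_likelihood))  # True: minimizing loss maximizes likelihood
```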

@@ -202,8 +204,8 @@ $$
\begin{align*}
\hat{y}_c
&= \frac{e^{(W_c + v) \cdot x + (b_c + d)}}{\sum_i e^{(W_i + v) \cdot x + (b_i + d)}} \\
&= \frac{e^{W_c \cdot x + b_c} e^{v \cdot x + d}}{\sum_i e^{W_i \cdot x + b_i} e^{v \cdot x + d}} \\
&= \frac{e^{W_c \cdot x + b_c}}{\sum_i e^{W_i \cdot x + b_i}}
\end{align*}
$$
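
The shift-invariance derived above is easy to confirm numerically: adding the same vector $$v$$ to every row of $$W$$ and the same scalar $$d$$ to every entry of $$b$$ leaves the softmax output unchanged. A minimal sketch with illustrative shapes and names (not from the post):

```python
import numpy as np

def softmax(scores):
    scores = scores - scores.max()  # itself an application of shift-invariance
    e = np.exp(scores)
    return e / e.sum()

rng = np.random.default_rng(0)
W, b = rng.normal(size=(3, 4)), rng.normal(size=3)  # 3 classes, 4 features
v, d = rng.normal(size=4), rng.normal()             # shared shift applied to every class
x = rng.normal(size=4)

y_hat = softmax(W @ x + b)
y_hat_shifted = softmax((W + v) @ x + (b + d))
print(np.allclose(y_hat, y_hat_shifted))  # True
```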

@@ -255,7 +257,7 @@ $$
&= \begin{cases}
-\log(1 - \sigma(z)) & y = 0 \\
-\log(\sigma(z)) & y = 1
\end{cases}
\end{align*}
$$
