## Logistic Regression

The forward pass of logistic regression for an input $x$ with true label $y$, where $w$ is the weight vector, $b$ is the bias, $\sigma$ is the activation function (sigmoid, $\sigma(z) = \frac{1}{1 + e^{-z}}$) and $\mathcal{L}$ is the loss:

$
  z = w^Tx + b \\
  \hat{y} = a = \sigma(z) \\
  \mathcal{L}(a, y) = -[y \log(a) + (1 - y) \log(1 - a)]
$
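The three equations above can be sketched in NumPy as follows. This is a minimal illustration, not a reference implementation: the function names `sigmoid`, `forward`, and `loss`, as well as the sample values of `w`, `b`, and `x`, are assumptions chosen for the example.

```python
import numpy as np

def sigmoid(z):
    """Sigmoid activation: 1 / (1 + e^(-z))."""
    return 1.0 / (1.0 + np.exp(-z))

def forward(w, b, x):
    """Compute a = sigma(w^T x + b) for one example."""
    z = np.dot(w, x) + b
    return sigmoid(z)

def loss(a, y):
    """Cross-entropy loss for one prediction a and label y."""
    return -(y * np.log(a) + (1 - y) * np.log(1 - a))

# Illustrative values (assumed): zero weights give z = 0, so a = sigma(0) = 0.5
w = np.array([0.0, 0.0])
b = 0.0
x = np.array([1.0, 2.0])

a = forward(w, b, x)
example_loss = loss(a, 1)  # equals -log(0.5) = log(2), about 0.693
print(a, example_loss)
```

With zero weights the prediction is exactly 0.5 regardless of the input, so the loss is $\log 2$ for either label.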

## Logistic Regression Derivatives

Compute the logistic regression derivatives for a single example with two features.

### Forward: 

$
  x_1, w_1, x_2, w_2, b \\
  z = w_1x_1 + w_2x_2 + b \\
  a = \sigma(z) \\
  \mathcal{L}(a,y)
$

### Backward:

$
  da = \frac{d\mathcal{L}(a,y)}{da} = -\frac{y}{a} + \frac{1 - y}{1 - a} \\
  \frac{da}{dz} = \sigma(z)(1 - \sigma(z)) = a(1 - a) \\
  dz = \frac{d\mathcal{L}(a,y)}{dz} = \frac{d\mathcal{L}}{da} \cdot \frac{da}{dz} = \left( -\frac{y}{a} + \frac{1 - y}{1 - a} \right) \cdot a(1 - a) \\
  dz = -y(1 - a) + a(1 - y) = a - y \\
  \\
  dw_1 = \frac{d\mathcal{L}}{dw_1} = x_1dz \\
  dw_2 = \frac{d\mathcal{L}}{dw_2} = x_2dz \\
  db = dz
$
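One way to sanity-check the backward formulas is to compare them with a numerical (central finite difference) derivative of the loss. The sketch below does this for one example; the helper names `loss_single`, `grads_single`, and the sample values are assumptions introduced only for illustration.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def loss_single(w1, w2, b, x1, x2, y):
    """Forward pass and loss for one example with two features."""
    a = sigmoid(w1 * x1 + w2 * x2 + b)
    return -(y * np.log(a) + (1 - y) * np.log(1 - a))

def grads_single(w1, w2, b, x1, x2, y):
    """Analytic gradients: dz = a - y, dw_j = x_j * dz, db = dz."""
    a = sigmoid(w1 * x1 + w2 * x2 + b)
    dz = a - y
    return x1 * dz, x2 * dz, dz

# Arbitrary illustrative values (assumed)
w1, w2, b = 0.3, -0.2, 0.1
x1, x2, y = 1.5, -0.5, 1.0

dw1, dw2, db = grads_single(w1, w2, b, x1, x2, y)

# Central finite differences of the loss with respect to each parameter
eps = 1e-6
num_dw1 = (loss_single(w1 + eps, w2, b, x1, x2, y)
           - loss_single(w1 - eps, w2, b, x1, x2, y)) / (2 * eps)
num_dw2 = (loss_single(w1, w2 + eps, b, x1, x2, y)
           - loss_single(w1, w2 - eps, b, x1, x2, y)) / (2 * eps)
num_db = (loss_single(w1, w2, b + eps, x1, x2, y)
          - loss_single(w1, w2, b - eps, x1, x2, y)) / (2 * eps)

print(dw1, num_dw1)
print(dw2, num_dw2)
print(db, num_db)
```

The analytic and numerical values should agree to several decimal places, which supports the simplification $dz = a - y$.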

### Updating weights

$
  w_1 := w_1 - \alpha dw_1 \\
  w_2 := w_2 - \alpha dw_2 \\
  b := b - \alpha db
$

## Logistic Regression on $m$ examples

### Cost Function

$
\begin{align}
  J(w,b) = \frac{1}{m} \sum\limits_{i=1}^m \mathcal{L}(a^{(i)}, y^{(i)}) \hspace{35pt} & \hspace{35pt} (x^{(i)}, y^{(i)})\\
  a^{(i)} = \hat{y}^{(i)} = \sigma(z^{(i)}) = \sigma(w^Tx^{(i)} + b) & \hspace{35pt} dw_1^{(i)}, dw_2^{(i)}, db^{(i)}
\end{align}
$

$
\frac{d}{dw_1}J(w,b) =  \frac{1}{m} \sum\limits_{i=1}^m \frac{d}{dw_1}\mathcal{L}(a^{(i)}, y^{(i)}) \hspace{35pt} \rightarrow \hspace{35pt} \frac{d}{dw_1}J(w,b) = \frac{1}{m} \sum\limits_{i=1}^m dw_1^{(i)}
$

### Calculating for $m$ examples and $n = 2$ features

$
J = 0,\ \ dw_1 = 0,\ \ dw_2 = 0,\ \ db = 0 \\
\text{For}\ \ i=1\ \ \text{to}\ \ m \\
\hspace{15pt}z^{(i)} = w^Tx^{(i)} + b \\
\hspace{15pt}a^{(i)} = \sigma(z^{(i)}) \\
\hspace{15pt}J\ += - [y^{(i)} \log(a^{(i)}) + (1 - y^{(i)}) \log(1 - a^{(i)})] \\
\hspace{15pt}dz^{(i)} = a^{(i)} - y^{(i)} \\
\hspace{15pt}dw_1\ += x_1^{(i)}dz^{(i)} \\
\hspace{15pt}dw_2\ += x_2^{(i)}dz^{(i)} \\
\hspace{15pt}db\ += dz^{(i)} \\
\text{End for} \\
J\ /=\ m \\
dw_1\ /=\ m \\
dw_2\ /=\ m \\
db\ /=\ m
$
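The accumulation loop can be written almost line for line in Python. This is a sketch under stated assumptions: the function name `cost_and_grads` and the tiny dataset `X`, `Y` are hypothetical, and the accumulators are averaged once, after the loop has visited all $m$ examples.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def cost_and_grads(w1, w2, b, X, Y):
    """Average cost and gradients over m examples with n = 2 features."""
    m = X.shape[0]
    J = dw1 = dw2 = db = 0.0           # accumulators start at zero
    for i in range(m):
        x1, x2 = X[i]
        z = w1 * x1 + w2 * x2 + b      # z^(i) = w^T x^(i) + b
        a = sigmoid(z)                 # a^(i)
        J += -(Y[i] * np.log(a) + (1 - Y[i]) * np.log(1 - a))
        dz = a - Y[i]                  # dz^(i) = a^(i) - y^(i)
        dw1 += x1 * dz
        dw2 += x2 * dz
        db += dz
    # Divide by m only after the loop has finished
    return J / m, dw1 / m, dw2 / m, db / m

# Hypothetical dataset: the label equals the first feature
X = np.array([[0.0, 1.0], [1.0, 0.0], [1.0, 1.0], [0.0, 0.0]])
Y = np.array([0.0, 1.0, 1.0, 0.0])

J, dw1, dw2, db = cost_and_grads(0.0, 0.0, 0.0, X, Y)
print(J, dw1, dw2, db)
```

With all parameters at zero, every prediction is 0.5, so the cost is $\log 2$ and, for this particular dataset, only `dw1` is nonzero (it equals $-0.25$).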

### Updating weights

$
w_1\ := w_1 - \alpha dw_1 \\
w_2\ := w_2 - \alpha dw_2 \\
b\ := b - \alpha db \\
$
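Repeating the gradient computation and the update rule for a number of iterations gives plain gradient descent. The sketch below is illustrative only: the learning rate `alpha = 0.5`, the 200 iterations, the dataset, and the helper `cost_and_grads` are assumptions, and the helper uses NumPy means as a compact equivalent of averaging the per-example terms.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def cost_and_grads(w1, w2, b, X, Y):
    """Cost J and averaged gradients dw1, dw2, db over all examples."""
    A = sigmoid(w1 * X[:, 0] + w2 * X[:, 1] + b)
    J = -np.mean(Y * np.log(A) + (1 - Y) * np.log(1 - A))
    dZ = A - Y
    return J, np.mean(X[:, 0] * dZ), np.mean(X[:, 1] * dZ), np.mean(dZ)

# Hypothetical dataset: the label equals the first feature
X = np.array([[0.0, 1.0], [1.0, 0.0], [1.0, 1.0], [0.0, 0.0]])
Y = np.array([0.0, 1.0, 1.0, 0.0])

alpha = 0.5                 # learning rate (assumed value)
w1, w2, b = 0.0, 0.0, 0.0   # initial parameters
costs = []

for step in range(200):
    J, dw1, dw2, db = cost_and_grads(w1, w2, b, X, Y)
    costs.append(J)
    # Update rule: parameter := parameter - alpha * gradient
    w1 = w1 - alpha * dw1
    w2 = w2 - alpha * dw2
    b = b - alpha * db

print(costs[0], costs[-1])
```

The recorded cost should decrease from its initial value of $\log 2$ as the parameters are updated; how fast it drops depends on the chosen `alpha`.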