
# Logistic regression as a neural network



## Notation
$X =
 \underset {nx \mspace{4mu} \times \mspace{4mu} m}{
 \begin{pmatrix}
  \vdots & \vdots & & \vdots  \\ 
 x^{(1)} & x^{(2)} & \cdots & x^{(m)}  \\
 \vdots & \vdots & & \vdots
 \end{pmatrix}}$

 Each column $x^{(i)}$ is one of the $m$ training examples and has nx features, so X.shape = (nx, m)
 
 $Y = 
 \begin{pmatrix}
y^{(1)} & y^{(2)} & \cdots & y^{(m)}
 \end{pmatrix}$

 $Y \in \mathbb{R}^{1 \times m}$
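As a small illustration of this layout, the sketch below stacks three made-up examples as columns of `X`. The values and the names `x1`, `x2`, `x3` are hypothetical and only serve to show the shapes.

```python
import numpy as np

# Three hypothetical training examples with nx = 2 features each.
x1 = np.array([1.0, 2.0])
x2 = np.array([3.0, 4.0])
x3 = np.array([5.0, 6.0])

# Stack the examples as columns so that X has shape (nx, m).
X = np.column_stack([x1, x2, x3])

# The labels form a single row of shape (1, m).
Y = np.array([[0, 1, 1]])

nx, m = X.shape
print(X.shape, Y.shape)  # (2, 3) (1, 3)
```

Keeping one example per column is what lets a single matrix product process all $m$ examples at once.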

 ## Gradient Descent
 $\hat{y} = \sigma(w^Tx+b), \sigma(z) = \dfrac{1}{1+e^{-z}}$

 $\mathcal{L}(\hat{y}^{(i)},y^{(i)}) = -\left(y^{(i)}\log{\hat{y}^{(i)}} + (1-y^{(i)})\log{(1-\hat{y}^{(i)})}\right)$

 $J(w, b) = \dfrac{1}{m}\sum_{i=1}^{m}\mathcal{L}(\hat{y}^{(i)},y^{(i)}) =$

 $-\dfrac{1}{m}\sum_{i=1}^{m}(y^{(i)}\log{\hat{y}^{(i)}} + (1-y^{(i)})\log{(1-\hat{y}^{(i)})})$
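The cost $J(w, b)$ can be sketched in NumPy as follows. The helper names `sigmoid` and `cost` and the toy data are assumptions made for this example, not part of the original notes.

```python
import numpy as np

def sigmoid(z):
    """Logistic function 1 / (1 + e^{-z}), applied element-wise."""
    return 1.0 / (1.0 + np.exp(-z))

def cost(w, b, X, Y):
    """Average cross-entropy cost J(w, b) over the m columns of X."""
    m = X.shape[1]
    A = sigmoid(np.dot(w.T, X) + b)  # predictions, shape (1, m)
    return -np.sum(Y * np.log(A) + (1 - Y) * np.log(1 - A)) / m

# Toy data: nx = 2 features, m = 3 examples (illustrative values).
X = np.array([[1.0, -2.0, 0.5],
              [0.0, 1.0, -1.0]])
Y = np.array([[1, 0, 1]])
w = np.zeros((2, 1))
b = 0.0

# With all-zero parameters every prediction is 0.5, so J equals log(2).
J = cost(w, b, X, Y)
print(J)
```

Starting from zero parameters gives a handy sanity check: the cost should be $\log 2 \approx 0.693$ regardless of the data.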

 $w := w - \alpha\dfrac{\partial J(w, b)}{\partial w}$

 $b := b - \alpha\dfrac{\partial J(w, b)}{\partial b}$
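
 The partial derivatives in these updates have a simple closed form. As a sketch, write $z = w^Tx + b$ and $a = \hat{y} = \sigma(z)$, and take the per-example term of $J$ to be $\mathcal{L} = -(y\log{a} + (1-y)\log{(1-a)})$. The symbol $a$ is introduced here only for brevity.

 $\dfrac{\partial \mathcal{L}}{\partial a} = -\dfrac{y}{a} + \dfrac{1-y}{1-a}, \qquad \dfrac{da}{dz} = a(1-a)$

 $\dfrac{\partial \mathcal{L}}{\partial z} = \dfrac{\partial \mathcal{L}}{\partial a}\dfrac{da}{dz} = -y(1-a) + (1-y)a = a - y$

 $\dfrac{\partial J(w, b)}{\partial w} = \dfrac{1}{m}\sum_{i=1}^{m}x^{(i)}(a^{(i)} - y^{(i)})$

 $\dfrac{\partial J(w, b)}{\partial b} = \dfrac{1}{m}\sum_{i=1}^{m}(a^{(i)} - y^{(i)})$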
 

## Vectorizing

In [0]:
import numpy as np

# use vectorized operations instead of loops

a = np.random.rand(100)
b = np.random.rand(100)

np.dot(a, b)
np.log(a)
np.abs(a)
np.maximum(a, b)
np.exp(a)
dw = np.zeros((100, 1))  # the shape must be passed as a single tuple

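# single iteration of gradient descent on toy data (sizes and values are illustrative)
def sigmoid(z):
    return 1 / (1 + np.exp(-z))

nx = 3                                    # assumed number of features
X = np.random.rand(nx, 5)                 # one training example per column
m = X.shape[1]
Y = np.random.randint(0, 2, size=(1, m))  # labels, Y = [y1, y2, ..., ym]
w = np.zeros((nx, 1))
b = 0.0                                   # scalar bias (rebinds the demo array b above)
alpha = 0.01                              # learning rate

# forward pass
z = np.dot(w.T, X) + b  # shape (1, m); note the scalar b is broadcast
A = sigmoid(z)
cost = -np.sum(Y*np.log(A) + (1-Y)*np.log(1-A))/m

# backward pass
dz = A - Y
dw = np.dot(X, dz.T) / m  # shape (nx, 1)
db = np.sum(dz)/m

# parameter update
w = w - alpha*dw
b = b - alpha*db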
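
To see that the vectorized gradient matches the per-example sum it replaces, the sketch below computes `dw` and `db` both ways on random data and compares them. The data, the seed, and names such as `dw_loop` and `dw_vec` are assumptions made for this check.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)       # fixed seed, illustrative data only
nx, m = 4, 50
X = rng.random((nx, m))              # one example per column
Y = rng.integers(0, 2, size=(1, m))  # binary labels
w = rng.random((nx, 1))
b = 0.1

# Loop version: accumulate the gradient one example at a time.
dw_loop = np.zeros((nx, 1))
db_loop = 0.0
for i in range(m):
    x_i = X[:, i:i + 1]                  # column i, shape (nx, 1)
    a_i = sigmoid(np.dot(w.T, x_i) + b)  # prediction, shape (1, 1)
    dz_i = a_i - Y[0, i]
    dw_loop += x_i * dz_i
    db_loop += dz_i.item()
dw_loop /= m
db_loop /= m

# Vectorized version: the same quantities without an explicit loop.
A = sigmoid(np.dot(w.T, X) + b)
dZ = A - Y
dw_vec = np.dot(X, dZ.T) / m
db_vec = np.sum(dZ) / m

print(np.allclose(dw_loop, dw_vec), np.isclose(db_loop, db_vec))
```

The two versions agree up to floating-point rounding. The vectorized form is usually much faster for large `m` because the work happens inside NumPy rather than in the Python interpreter.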