- [Binary Classification](#bincls)
	- Use logistic regression algorithm for binary classification
- [Logistic Regression](#logreg)
- [Gradient Descent](#gradecent)
- [Derivatives](#deriv)
- [Computation Graph](#compgraph)
- [Logistic Regression Gradient Descent](#logreg-gradecent)

### Binary Classification

In a binary classification problem, the output is a discrete value: each example belongs to one of two classes.

- The goal is to train a classifier that takes an image, represented by a feature vector 𝑥, as input and predicts whether the corresponding label 𝑦 is 1 or 0.
- Process all m training examples without explicit loops by stacking them into matrices.
- Computation in a neural network: forward and backward propagation steps.

$x^{(i)} \in \mathbb{R}^{n_{x}}$ ($i^{th}$ training example)<br>
$y^{(i)} \in \{0,1\}$ where `i=1,...,m` <br>

Stack the entire training set $\{(x^{(1)},y^{(1)}), ... , (x^{(m)},y^{(m)})\}$ into the matrices $X$ and $Y$:<br>
$X=$
	\begin{bmatrix}
	| & | & & |\\
	x^{(1)} & x^{(2)} & ... & x^{(m)}\\
	| & | & & |\\
	\end{bmatrix} $\in \mathbb{R}^{n_{x}\times m}$<br>
$Y=$
	\begin{bmatrix}
	y^{(1)} & y^{(2)} & ... & y^{(m)}\\
	\end{bmatrix} $\in \mathbb{R}^{1\times m}$

`X.shape = ` $(n_{x},m)$<br>
`Y.shape = ` $(1,m)$

$n_{x}$ = (number of input features) = 64x64x3 = 12,288 for a 64x64 RGB image<br>
$m_{train}$ = (number of training examples)<br>
$m_{test}$ = (number of test examples)<br>
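
As a minimal sketch of the stacking described above, the snippet below flattens a batch of images into the columns of $X$. The arrays `images` and `labels` are hypothetical random stand-ins, not the notebook's real dataset, and `m = 5` is chosen only to keep the example small.

```python
import numpy as np

# Hypothetical stand-in for a real image set: m = 5 random 64x64 RGB images.
rng = np.random.default_rng(0)
m = 5
images = rng.integers(0, 256, size=(m, 64, 64, 3), dtype=np.uint8)
labels = rng.integers(0, 2, size=m)

# Flatten each image into one column, giving X of shape (n_x, m).
X = images.reshape(m, -1).T
# Store the labels as a row vector of shape (1, m).
Y = labels.reshape(1, m)

n_x = X.shape[0]
print(X.shape, Y.shape)  # (12288, 5) (1, 5)
```

Reshaping to `(m, -1)` first and then transposing keeps each image's pixels together in a single column; reshaping directly to `(n_x, m)` would interleave pixels from different images.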


### Logistic Regression

- Logistic regression is a learning algorithm used in supervised learning problems where every output label 𝑦 is either zero or one. The goal of logistic regression is to minimize the error between its predictions and the training data.

<img id="logreg" src="https://i.imgur.com/9iAacgB.png" style="width:550px;height:320px;">

In linear regression, $\hat{y} = w^{T}x+b$, which can take any real value, but a probability must satisfy $0\leq\hat{y}\leq1$.<br>
In logistic regression,
$\hat{y} = \sigma(w^T x+b)$, where the sigmoid function $\sigma$ maps the output into that range.

$\hat{y}= P(y=1|x) = \sigma(w^T x+b) = \sigma(z)= \frac{1}{1+e^{-z}}$ <br>

For `i=1,...,m`: $\hat{y}^{(i)}= \sigma(w^T x^{(i)}+b) = \sigma(z^{(i)})= \frac{1}{1+e^{-z^{(i)}}}$<br>
Given $\{(x^{(1)}, y^{(1)}), ... , (x^{(m)}, y^{(m)})\}$, we want $\hat{y}^{(i)} \approx y^{(i)}$.
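
The prediction formula above can be sketched in NumPy as follows. The helper name `predict_proba` and the toy values of `w`, `b`, and `X` are assumptions made for illustration; the shapes follow the $(n_{x}, m)$ convention used earlier.

```python
import numpy as np

def sigmoid(z):
    """Logistic function 1 / (1 + e^{-z}), applied element-wise."""
    return 1.0 / (1.0 + np.exp(-z))

def predict_proba(w, b, X):
    """Return y_hat of shape (1, m) for w of shape (n_x, 1) and X of shape (n_x, m)."""
    return sigmoid(w.T @ X + b)

# Toy values chosen only for illustration (n_x = 2, m = 3).
w = np.array([[1.0], [-1.0]])
b = 0.0
X = np.array([[0.0, 2.0, -3.0],
              [0.0, 1.0,  1.0]])

y_hat = predict_proba(w, b, X)  # one probability per column of X
```

Because `w.T @ X` handles all m columns at once, no explicit loop over examples is needed.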

To train the parameters $w$ and $b$, we need to define a cost function:
- Loss (for a single example) $L(\hat{y},y)= -(y\log{\hat{y}}+(1-y)\log{(1-\hat{y})})$

- Cost (average over the training set) $J(w, b)= \frac{1}{m} \sum_{i=1}^{m} L(\hat{y}^{(i)},y^{(i)})= -\frac{1}{m} \sum_{i=1}^{m}[y^{(i)}\log{\hat{y}^{(i)}}+(1-y^{(i)})\log{(1-\hat{y}^{(i)})}]$
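
A minimal sketch of the cost formula above, assuming predictions strictly between 0 and 1 so that the logarithms are finite. The function name `cost` and the sample arrays are illustrative, not taken from the notebook.

```python
import numpy as np

def cost(y_hat, Y):
    """Average cross-entropy loss J over m examples; both inputs have shape (1, m)."""
    m = Y.shape[1]
    losses = -(Y * np.log(y_hat) + (1 - Y) * np.log(1 - y_hat))
    return float(np.sum(losses) / m)

# Illustrative labels with one confident-and-right and one confident-and-wrong prediction set.
Y = np.array([[1, 0, 1]])
good = np.array([[0.9, 0.1, 0.8]])
bad = np.array([[0.1, 0.9, 0.2]])

print(cost(good, Y) < cost(bad, Y))  # True
```

Predictions close to the labels give a small cost, while confident wrong predictions are penalized heavily by the logarithm.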


In a logistic regression implementation, the objective is to find parameters $w$ and $b$ that make $\hat{y}$ a good estimate of the probability that $y$ equals 1.


To train a logistic regression model, find the $w$ and $b$ that minimize the overall cost function $J(w,b)$.


<img id="gradecent" src="https://i.imgur.com/m5HTDzl.png" style="width:550px;height:300px;">

- So far you've seen the logistic regression model, the loss function that measures how well you're doing on a single training example, and the cost function that measures how well your parameters $w$ and $b$ are doing on the entire training set.
- Gradient descent is the algorithm used to train, or learn, the parameters $w$ and $b$ on your training set.

- Take iterative steps from the initial position toward the global optimum.
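
The iterative steps can be sketched with the usual update rule $w := w - \alpha \frac{dJ(w)}{dw}$. The text here does not spell out the rule, so the learning rate `alpha`, the step count, and the one-dimensional cost $J(w) = (w-3)^2$ are assumptions chosen for illustration.

```python
def gradient_descent(grad, w0, alpha=0.1, steps=100):
    """Repeat w := w - alpha * dJ/dw, starting from w0."""
    w = w0
    for _ in range(steps):
        w = w - alpha * grad(w)
    return w

# Hypothetical convex cost J(w) = (w - 3)^2, whose derivative is 2 * (w - 3).
w_final = gradient_descent(lambda w: 2.0 * (w - 3.0), w0=0.0)
```

Each step moves `w` against the slope, so on this convex cost the iterate approaches the minimizer at `w = 3`.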


<img src="https://i.imgur.com/2mGFlWu.png" style="width:550px;height:300px;">


<img src="https://i.imgur.com/XSPppwf.png" style="width:550px;height:300px; float: left;">

<img src="https://i.imgur.com/fTmw5jJ.png" style="width:550px;height:300px;">

One step of backward propagation on a computation graph yields the derivative of the final output variable with respect to one of the graph's variables.
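
To make the backward step concrete, here is a small sketch on a hypothetical graph $J = 3(a + bc)$. The function and the input values are chosen for illustration and may differ from the figures above. Each backward line applies the chain rule once.

```python
# Forward pass through the graph: u = b*c, v = a + u, J = 3*v.
a, b, c = 5, 3, 2
u = b * c
v = a + u
J = 3 * v

# Backward pass: walk from J toward the inputs, one chain-rule step at a time.
dJ_dv = 3            # J = 3*v
dJ_da = dJ_dv * 1    # v = a + u, so dv/da = 1
dJ_du = dJ_dv * 1    # v = a + u, so dv/du = 1
dJ_db = dJ_du * c    # u = b*c, so du/db = c
dJ_dc = dJ_du * b    # u = b*c, so du/dc = b

print(J, dJ_da, dJ_db, dJ_dc)  # 33 3 6 9
```

Derivatives computed for later nodes (`dJ_dv`, `dJ_du`) are reused for earlier ones, which is what makes the right-to-left pass efficient.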

- Logistic regression Gradient descent

Want: modify $w$ and $b$ to reduce the loss $L$, so propagate backwards through the computation graph to compute the derivatives of $L$ with respect to $w$ and $b$.
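
A vectorized sketch of this backward pass, under the assumption that the derivatives take the standard form for the sigmoid and cross-entropy loss: $dz = \hat{y} - y$, $dw = x\,dz$, $db = dz$, averaged over the m examples. The function name `propagate`, the toy data, and the learning rate `alpha` are illustrative choices.

```python
import numpy as np

def sigmoid(z):
    """Logistic function, applied element-wise."""
    return 1.0 / (1.0 + np.exp(-z))

def propagate(w, b, X, Y):
    """Return the cost J and the gradients dJ/dw (n_x, 1) and dJ/db (scalar)."""
    m = X.shape[1]
    A = sigmoid(w.T @ X + b)                          # forward pass: y_hat
    cost = float(-np.sum(Y * np.log(A) + (1 - Y) * np.log(1 - A)) / m)
    dZ = A - Y                                        # dL/dz for each example
    dw = X @ dZ.T / m                                 # average of x * dz
    db = float(np.sum(dZ) / m)                        # average of dz
    return cost, dw, db

# Toy data (n_x = 2, m = 4): the label is 1 when the first feature is larger.
X = np.array([[2.0, 1.0, 0.0, 3.0],
              [0.0, 2.0, 1.0, 1.0]])
Y = np.array([[1, 0, 0, 1]])

w = np.zeros((2, 1))
b = 0.0
alpha = 0.1
costs = []
for _ in range(200):
    cost, dw, db = propagate(w, b, X, Y)
    costs.append(cost)
    w = w - alpha * dw                                # gradient descent update
    b = b - alpha * db
```

Recording `costs` makes it easy to confirm that the cost falls from its starting value of $\log 2$ (all predictions equal to 0.5) as training proceeds.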

<img src="https://i.imgur.com/Ntnnqey.png" style="width:550px;height:300px; float: left;">

<img src="https://i.imgur.com/AhMybZm.png" style="width:550px;height:250px; float: left;">

<img src="https://i.imgur.com/2pseXrt.png" style="width:550px;height:280px; float: left;">

In [3]:
# Type the following in 'Python Console' to install dependencies in jupyter
# import sys
# !conda install --yes --prefix {sys.prefix} numpy

import numpy as np
import matplotlib.pyplot as plt
from lr_utils import load_dataset

%matplotlib inline

In [None]:
train_set_x_orig, train_set_y, test_set_x_orig, test_set_y, classes = load_dataset()

In [None]:
# NumPy data types: https://docs.scipy.org/doc/numpy-1.10.1/user/basics.types.html

# Images are 64x64 with 8-bit integer components per channel
print(np.iinfo(np.uint8))

# Check that every training image has shape (64, 64, 3)
for cat in train_set_x_orig:
    assert cat.shape == (64, 64, 3)
