# Overview
## Logistic regression with a neural network mindset
Build the general architecture of a learning algorithm:
- initialize the parameters
- calculate the cost function and its gradient
- use an optimization algorithm (gradient descent)


## Problem statement
Given a dataset:
- a training set of `m_train` images labeled as cat `(y=1)` or non-cat `(y=0)`
- a test set of `m_test` images labeled as cat or non-cat

# 1 Packages

In [2]:
import numpy as np
import copy
import matplotlib.pyplot as plt
import h5py
import scipy
from PIL import Image
from scipy import ndimage



# 2 Pre-processing
Pre-process the images by flattening each one from shape `(num_px, num_px, 3)` into a NumPy array of shape `(num_px * num_px * 3, 1)`, so that every image becomes one column of the data matrix:


In [None]:
# Synthetic stand-in for a real dataset (random pixel values in [0, 255])
# train_set_x_orig.shape: (m_train, num_px, num_px, 3)
train_set_x_orig = np.random.randint(0, 256, size = (20, 64, 64, 3))
# test_set_x_orig.shape: (m_test, num_px, num_px, 3)
test_set_x_orig = np.random.randint(0, 256, size = (10, 64, 64, 3))
m_train = train_set_x_orig.shape[0] # number of training examples
m_test = test_set_x_orig.shape[0] # number of test examples
train_set_x_flatten = train_set_x_orig.reshape(m_train, -1).T # reshape to (num_px * num_px * 3, m_train)
test_set_x_flatten = test_set_x_orig.reshape(m_test, -1).T # reshape to (num_px * num_px * 3, m_test)

print("number of training examples: " + str(m_train))
print("number of test examples: " + str(m_test))
print("train_set_x_orig shape: " + str(train_set_x_orig.shape))
print("train_set_x_flatten shape: " + str(train_set_x_flatten.shape))
print("test_set_x_orig shape: " + str(test_set_x_orig.shape))
print("test_set_x_flatten shape: " + str(test_set_x_flatten.shape))




number of training examples: 20
number of test examples: 10
train_set_x_orig shape: (20, 64, 64, 3)
train_set_x_flatten shape: (12288, 20)
test_set_x_orig shape: (10, 64, 64, 3)
test_set_x_flatten shape: (12288, 10)
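The reshape above is easy to get subtly wrong, so here is a minimal sanity check on a tiny made-up dataset. The names `x_orig`, `x_flatten`, and `x_wrong` are illustrative only. It checks that `reshape(m, -1).T` puts one image in each column, while reshaping directly to `(-1, m)` produces the same shape but mixes pixels from different images.

```python
import numpy as np

# Tiny hypothetical dataset: 2 images of 2x2 pixels with 3 channels
m, num_px = 2, 2
x_orig = np.arange(m * num_px * num_px * 3).reshape(m, num_px, num_px, 3)

# Flatten each image into one column, as in the cell above
x_flatten = x_orig.reshape(m, -1).T

# Column i should contain exactly the pixels of image i
col_matches = [np.array_equal(x_flatten[:, i], x_orig[i].ravel()) for i in range(m)]
print(x_flatten.shape)  # (12, 2)
print(col_matches)      # [True, True]

# Same shape, wrong content: this interleaves pixels from different images
x_wrong = x_orig.reshape(-1, m)
wrong_matches = np.array_equal(x_wrong[:, 0], x_orig[0].ravel())
print(x_wrong.shape)    # (12, 2)
print(wrong_matches)    # False
```

Checking shapes alone would not catch the second variant, which is why the comparison is done on the column contents.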


Assuming each pixel is an RGB value in the range 0 to 255 (inclusive),  
standardize the dataset by dividing every entry by 255 (the maximum pixel value), so that all features lie in [0, 1].

In [6]:
# Standardize the data
train_set_x = train_set_x_flatten / 255.
test_set_x = test_set_x_flatten / 255.
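As a quick check of the scaling step, the sketch below applies the same division to a small random array (a hypothetical stand-in named `x_flatten`) and confirms that the result is floating point and lies in [0, 1].

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for flattened pixel data: integers in [0, 255]
x_flatten = rng.integers(0, 256, size=(12288, 5))

# Same scaling as above: true division turns the integer array into floats
x = x_flatten / 255.0

print(x.dtype)                          # float64
print(x.min() >= 0.0, x.max() <= 1.0)   # True True
```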

# 3 Architecture for learning algorithm
Mathematical expression  
For one example $x^{(i)}$:
$$z^{(i)} = w^T x^{(i)} + b \tag{1}$$
$$\hat{y}^{(i)} = a^{(i)} = \mathrm{sigmoid}(z^{(i)})\tag{2}$$
$$ \mathcal{L}(a^{(i)}, y^{(i)}) =  - y^{(i)}  \log(a^{(i)}) - (1-y^{(i)} )  \log(1-a^{(i)})\tag{3}$$

The cost is then computed by averaging the loss over all $m$ training examples:
$$ J = \frac{1}{m} \sum_{i=1}^m \mathcal{L}(a^{(i)}, y^{(i)})\tag{6}$$
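The expressions above can be evaluated for all examples at once with NumPy. The following is a minimal sketch, not a definitive implementation: `compute_cost` is a hypothetical helper name, and it assumes `w` has shape `(n, 1)`, `X` has shape `(n, m)` with one example per column, and `Y` has shape `(1, m)`.

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def compute_cost(w, b, X, Y):
    """Hypothetical helper: average cross-entropy cost J over the m columns of X."""
    m = X.shape[1]
    Z = np.dot(w.T, X) + b                                # equation (1), all examples at once
    A = sigmoid(Z)                                        # equation (2)
    losses = -(Y * np.log(A) + (1 - Y) * np.log(1 - A))   # equation (3), per example
    return float(np.sum(losses) / m)                      # average over the examples

# Made-up inputs: 3 features, 2 examples
w = np.zeros((3, 1))
b = 0.0
X = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
Y = np.array([[1, 0]])

cost = compute_cost(w, b, X, Y)
print(cost)  # with all-zero parameters every a is 0.5, so J = log(2), about 0.6931
```

With zero parameters the model predicts 0.5 for every example, so the cost equals `log(2)` regardless of the labels, which makes this a convenient first check.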


# 4 Building the algorithm
1. Define model structure
2. Initialize model parameters
3. Loop:
   - Calculate current loss (forward propagation)
   - Calculate current gradient (backward propagation)
   - Update parameters (gradient descent)
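The steps above can be sketched as one compact training loop. This is an illustrative outline under stated assumptions: `train`, `num_iterations`, and `learning_rate` are hypothetical names, the parameters are initialized to zero, and the gradients used are the standard ones for the cross-entropy cost, `dw = X (A - Y)^T / m` and `db = sum(A - Y) / m`.

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def train(X, Y, num_iterations=1000, learning_rate=0.1):
    """Hypothetical sketch of the loop: X is (n, m), Y is (1, m)."""
    n, m = X.shape
    # Steps 1 and 2: model structure (n weights plus a bias), initialized to zero
    w = np.zeros((n, 1))
    b = 0.0
    costs = []
    for _ in range(num_iterations):
        # Forward propagation: activations and current cost
        A = sigmoid(np.dot(w.T, X) + b)
        cost = -np.mean(Y * np.log(A) + (1 - Y) * np.log(1 - A))
        # Backward propagation: gradients of the cost with respect to w and b
        dw = np.dot(X, (A - Y).T) / m
        db = np.sum(A - Y) / m
        # Gradient descent update
        w -= learning_rate * dw
        b -= learning_rate * db
        costs.append(float(cost))
    return w, b, costs

# Made-up, linearly separable toy data with a single feature
X = np.array([[-2.0, -1.0, 1.0, 2.0]])
Y = np.array([[0, 0, 1, 1]])

w, b, costs = train(X, Y)
predictions = (sigmoid(np.dot(w.T, X) + b) > 0.5).astype(int)
print(costs[0] > costs[-1])  # True: the cost decreases during training
print(predictions)           # [[0 0 1 1]]
```

Keeping forward propagation, backward propagation, and the update in separate, clearly marked steps mirrors the structure of the list and makes it simple to factor each one into its own function later.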

## 4.1 Helper functions

### Sigmoid


In [None]:
def sigmoid(z):
    """Compute the sigmoid of z, where z is a scalar or a NumPy array of any size."""
    s = 1 / (1 + np.exp(-z))
    return s
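A short usage example may help confirm the helper behaves as expected. The function is repeated here only so that the snippet runs on its own, and the input values are arbitrary.

```python
import numpy as np

def sigmoid(z):
    s = 1 / (1 + np.exp(-z))
    return s

# Scalar input: sigmoid(0) is exactly 0.5
print(sigmoid(0))  # 0.5

# Array input: the function is applied element-wise and the shape is preserved
values = sigmoid(np.array([0.0, 2.0]))
print(values)      # approximately [0.5, 0.8808]

batch = sigmoid(np.zeros((1, 4)))
print(batch.shape)  # (1, 4)
```

Because `np.exp` works element-wise, the same function handles the `(1, m)` row of pre-activations `z` for a whole dataset without an explicit loop.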