## Understanding the Inner Workings of a Neural Network through a Logistic Regression Binary Classifier

In [None]:
import torch
import h5py
import matplotlib.pyplot as plt
%matplotlib inline
import numpy as np

### 1. Read files from directory

In [None]:
train_dataset = h5py.File('train_catvnoncat.h5', "r")

In [None]:
test_dataset = h5py.File('test_catvnoncat.h5', "r")

**Exercise 1:**
Find the values for:

- m_train (number of training examples)
- m_test (number of test examples)
- num_px (= height = width of a training image)

Remember that train_set_x_orig is a numpy array of shape (m_train, num_px, num_px, 3). For instance, you can access m_train by writing train_set_x_orig.shape[0].
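A minimal sketch of reading these values off the array shape. The arrays below are zero-filled stand-ins with hypothetical names (`demo_train_x`, `demo_test_x`) and made-up sizes; in the notebook you would use the real `train_set_x_orig` and `test_set_x_orig` once they are loaded.

```python
import numpy as np

# Stand-in arrays with layout (m, num_px, num_px, 3); the sizes are illustrative only.
demo_train_x = np.zeros((5, 64, 64, 3), dtype=np.uint8)
demo_test_x = np.zeros((2, 64, 64, 3), dtype=np.uint8)

m_train = demo_train_x.shape[0]   # number of training examples
m_test = demo_test_x.shape[0]     # number of test examples
num_px = demo_train_x.shape[1]    # image height (assumed equal to the width)

print("m_train =", m_train, "m_test =", m_test, "num_px =", num_px)
```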

### Get the training and test set

In [None]:
train_set_x_orig = np.array(train_dataset["train_set_x"][:]) # your train set features
train_set_y_orig = np.array(train_dataset["train_set_y"][:]) # your train set labels

test_set_x_orig = np.array(test_dataset["test_set_x"][:]) # your test set features
test_set_y_orig = np.array(test_dataset["test_set_y"][:]) # your test set labels

classes = np.array(test_dataset["list_classes"][:]) # the list of classes

### Plot an image from the dataset
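One possible way to display an example together with its label is sketched below. The data here is random stand-in data with hypothetical names (`demo_x`, `demo_y`, `demo_classes`); it assumes the class names are stored as byte strings, which is why the label is decoded before printing. Swap in `train_set_x_orig`, `train_set_y_orig`, and `classes` to look at a real image.

```python
import numpy as np
import matplotlib.pyplot as plt

# Stand-in data: 4 random 64x64 RGB images, their 0/1 labels, and class names as bytes.
rng = np.random.default_rng(0)
demo_x = rng.integers(0, 256, size=(4, 64, 64, 3), dtype=np.uint8)
demo_y = np.array([0, 1, 1, 0])
demo_classes = np.array([b"non-cat", b"cat"])

index = 1
# Look up the class name for this example and turn the bytes into a regular string.
label = demo_classes[demo_y[index]].decode("utf-8")

plt.imshow(demo_x[index])
plt.title("y = " + str(demo_y[index]) + " (" + label + ")")
```

With `%matplotlib inline` active, the figure renders when the cell finishes; in a plain script you would also call `plt.show()`.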

### 2. Reshape/flatten a matrix
**Exercise 2:**

Reshape the training and test data sets so that images of size (num_px, num_px, 3) are flattened into single vectors of shape (num_px * num_px * 3, 1).

A trick to flatten a matrix X of shape (a,b,c,d) into a matrix X_flatten of shape ($b*c*d$, a) is to use `X_flatten = X.reshape(X.shape[0], -1).T`.
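As a minimal sketch of this trick, the snippet below flattens a small stand-in array with `reshape(X.shape[0], -1).T` and checks that each column of the result is one flattened example. The array contents and sizes are made up for illustration.

```python
import numpy as np

# Stand-in batch: a = 3 examples, each of shape (b, c, d) = (2, 2, 3).
a, b, c, d = 3, 2, 2, 3
X = np.arange(a * b * c * d).reshape(a, b, c, d)

# Keep the first axis, collapse the others, then transpose so each column is one example.
X_flatten = X.reshape(X.shape[0], -1).T

print(X_flatten.shape)  # (12, 3), that is (b*c*d, a)
# Column 0 holds exactly the values of the first example, in order.
print(np.array_equal(X_flatten[:, 0], X[0].ravel()))
```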


In [2]:
print ("train_set_x_flatten shape: " + str(train_set_x_flatten.shape))
#print ("train_set_y shape: " + str(train_set_y.shape))
print ("test_set_x_flatten shape: " + str(test_set_x_flatten.shape))
#print ("test_set_y shape: " + str(test_set_y.shape))
print ("sanity check after reshaping: " + str(train_set_x_flatten[0:5,0]))

### 3. Scale the dataset
To represent color images, the red, green and blue channels (RGB) must be specified for each pixel, and so the pixel value is actually a vector of three numbers ranging from 0 to 255.

One common preprocessing step in machine learning is to center and standardize your dataset, meaning that you subtract the mean of the whole numpy array from each example, and then divide each example by the standard deviation of the whole numpy array. For picture datasets, however, it is simpler and works almost as well to just divide every value in the dataset by 255 (the maximum value of a pixel channel).

In [None]:
train_set_x = train_set_x_flatten/255.
test_set_x = test_set_x_flatten/255.
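To make the two preprocessing options concrete, here is a small sketch on a hypothetical stand-in array (`demo_flat`), not the real dataset. It shows the division by 255 used in this notebook next to the mean/standard-deviation alternative described above.

```python
import numpy as np

# Stand-in flattened data: 4 pixel-channel values (rows) for 3 images (columns).
demo_flat = np.array([[0, 255, 128],
                      [64, 32, 16],
                      [255, 0, 8],
                      [10, 20, 30]], dtype=np.uint8)

# Option used in this notebook: divide by the maximum channel value, giving values in [0, 1].
scaled = demo_flat / 255.

# General-purpose alternative: subtract the mean and divide by the standard deviation
# of the whole array, giving zero mean and unit standard deviation.
standardized = (demo_flat - demo_flat.mean()) / demo_flat.std()

print(scaled.min(), scaled.max())
print(standardized.mean(), standardized.std())
```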

### 4. Building the parts of our algorithm
The main steps for building a Neural Network are:

1. Define the model structure (such as number of input features)
2. Initialize the model's parameters
3. Loop:
    - Calculate current loss (forward propagation)
    - Calculate current gradient (backward propagation)
    - Update parameters (gradient descent)

You often build 1-3 separately and integrate them into one function we call model().
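The steps above can be sketched end to end on toy data. This is an illustrative outline only: the function name `train_sketch`, the toy inputs, and the default hyperparameters are assumptions for this example, and the gradient expressions are the standard logistic regression ones.

```python
import numpy as np

def train_sketch(X, Y, num_iterations=1000, learning_rate=0.1):
    """Toy logistic regression trained with batch gradient descent.

    X has shape (n_features, m); Y has shape (1, m) with 0/1 labels.
    """
    n_features, m = X.shape
    # Steps 1-2: one weight per input feature plus a bias, all initialized to zero.
    w = np.zeros((n_features, 1))
    b = 0.0
    costs = []
    for _ in range(num_iterations):
        # Step 3a: forward propagation and current cost.
        A = 1 / (1 + np.exp(-(np.dot(w.T, X) + b)))
        cost = -np.mean(Y * np.log(A) + (1 - Y) * np.log(1 - A))
        # Step 3b: backward propagation (gradients of the cost).
        dw = np.dot(X, (A - Y).T) / m
        db = np.sum(A - Y) / m
        # Step 3c: gradient descent update.
        w = w - learning_rate * dw
        b = b - learning_rate * db
        costs.append(cost)
    return w, b, costs

# Hypothetical toy data: one feature, label is 1 when the feature is positive.
X_toy = np.array([[-2.0, -1.0, 1.0, 2.0]])
Y_toy = np.array([[0, 0, 1, 1]])
w_toy, b_toy, costs = train_sketch(X_toy, Y_toy)
print("first cost:", costs[0], "last cost:", costs[-1])
```

The cost should fall from its starting value as the loop runs, which is a quick sanity check that the three sub-steps fit together.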

### 4.1 Helper function: Sigmoid
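A minimal sketch of such a helper, assuming the usual definition $\sigma(z) = \frac{1}{1+e^{-z}}$ applied element-wise. The function name `sigmoid` is a natural choice but is not fixed by this notebook.

```python
import numpy as np

def sigmoid(z):
    """Compute 1 / (1 + exp(-z)) element-wise for a scalar or numpy array z."""
    return 1 / (1 + np.exp(-z))

# sigmoid(0) is exactly 0.5; large positive inputs approach 1, large negative inputs approach 0.
print(sigmoid(np.array([0, 2])))
```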

### 4.2 Initializing parameters

In [None]:
def initialize_with_zeros(dim):
    """
    This function creates a vector of zeros of shape (dim, 1) for w and initializes b to 0.
    
    Argument:
    dim -- size of the w vector we want (or number of parameters in this case)
    
    Returns:
    w -- initialized vector of shape (dim, 1)
    b -- initialized scalar (corresponds to the bias)
    """

    # Weights start as a (dim, 1) column of zeros; the bias starts at 0.
    w = np.zeros((dim, 1))
    b = 0.0

    #check the dimension
    assert(w.shape == (dim, 1) )
    assert(isinstance(b, float) or isinstance(b, int))

    return w, b

In [None]:
#verify
w, b = initialize_with_zeros(3)
print("w " + str(w))

### 4.3 Forward and backward propagation

**Forward Propagation:**
- Get X,
- Compute $A = \sigma(w^T X + b) = (a^{(1)}, a^{(2)}, ..., a^{(m-1)}, a^{(m)})$,
- Cost function: $J = -\frac{1}{m}\sum_{i=1}^{m}\left[y^{(i)}\log(a^{(i)})+(1-y^{(i)})\log(1-a^{(i)})\right]$

**Back Propagation:**
Here are the two formulas you will be using:

- $\frac{\partial J}{\partial w} = \frac{1}{m}X(A-Y)^T$
- $\frac{\partial J}{\partial b} = \frac{1}{m} \sum_{i=1}^m (a^{(i)}-y^{(i)})$
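One way to gain confidence in these two gradient formulas is a numerical check: compare them with central finite differences of the cost. The sketch below does this on random stand-in data; the helper name `cost_and_grads`, the `_chk` variables, and the step size `eps` are assumptions for illustration, not part of the exercise.

```python
import numpy as np

def cost_and_grads(w, b, X, Y):
    """Return the cost J and the analytic gradients dw, db from the formulas above."""
    m = X.shape[1]
    A = 1 / (1 + np.exp(-(np.dot(w.T, X) + b)))
    J = -np.sum(Y * np.log(A) + (1 - Y) * np.log(1 - A)) / m
    dw = np.dot(X, (A - Y).T) / m
    db = np.sum(A - Y) / m
    return J, dw, db

# Random stand-in problem: 3 features, 5 examples.
rng = np.random.default_rng(0)
X_chk = rng.normal(size=(3, 5))
Y_chk = rng.integers(0, 2, size=(1, 5))
w_chk = rng.normal(size=(3, 1))
b_chk = 0.3

J, dw, db = cost_and_grads(w_chk, b_chk, X_chk, Y_chk)

# Central differences: perturb one parameter at a time and measure the change in J.
eps = 1e-6
dw_num = np.zeros_like(w_chk)
for k in range(w_chk.shape[0]):
    step = np.zeros_like(w_chk)
    step[k, 0] = eps
    J_plus = cost_and_grads(w_chk + step, b_chk, X_chk, Y_chk)[0]
    J_minus = cost_and_grads(w_chk - step, b_chk, X_chk, Y_chk)[0]
    dw_num[k, 0] = (J_plus - J_minus) / (2 * eps)
db_num = (cost_and_grads(w_chk, b_chk + eps, X_chk, Y_chk)[0]
          - cost_and_grads(w_chk, b_chk - eps, X_chk, Y_chk)[0]) / (2 * eps)

print("max |dw - dw_num| =", np.max(np.abs(dw - dw_num)))
print("|db - db_num| =", abs(db - db_num))
```

Both differences should be tiny compared with the gradients themselves; a large gap would point to a sign or transpose error in the implementation.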

In [None]:
# GRADED FUNCTION: propagate

def propagate(w, b, X, Y):
    """
    Implement the cost function and its gradient for the propagation explained above

    Arguments:
    w -- weights, a numpy array of size (num_px * num_px * 3, 1)
    b -- bias, a scalar
    X -- data of size (num_px * num_px * 3, number of examples)
    Y -- true "label" vector (containing 0 if non-cat, 1 if cat) of size (1, number of examples)

    Return:
    cost -- negative log-likelihood cost for logistic regression
    dw -- gradient of the loss with respect to w, thus same shape as w
    db -- gradient of the loss with respect to b, thus same shape as b
    
    Tips:
    - Write your code step by step for the propagation. np.log(), np.dot()
    """
    #no of examples
    m = X.shape[1]

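    # Forward propagation: sigmoid activations, then the mean cross-entropy cost
    A = 1 / (1 + np.exp(-(np.dot(w.T, X) + b)))
    J = -np.sum(Y * np.log(A) + (1 - Y) * np.log(1 - A)) / m

    # Back propagation: gradients of the cost with respect to w and b
    dw = np.dot(X, (A - Y).T) / m
    db = np.sum(A - Y) / m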


    #dimension check
    assert(dw.shape == w.shape)
    assert(db.dtype == float)

    grads = {"dw": dw,
             "db": db}
    #print("*****propagate completed ***********")

    return grads, J
    

In [None]:
#check the function
w, b, X, Y = np.array([[1.],[2.]]), 2., np.array([[1.,2.,-1.],[3.,4.,-3.2]]), np.array([[1,0,1]])
grads, cost = propagate(w, b, X, Y)
print ("dw = " + str(grads["dw"]))
print ("db = " + str(grads["db"]))
print ("cost = " + str(cost))

```
dw = [[0.99845601]
 [2.39507239]]
db = 0.001455578136784208
cost = 5.801545319394553
```