## Neural Network Step by Step - Part I

Written by: Timothy (tq.wang93@hotmail.com)

Initial draft: November 2020

Last update: November 2020

<img style="float: center;" src="Figures/NN_Part_I_head.gif" width="100%"> 

---

### Logistic regression from scratch

Please read [my tutorial](https://github.com/Timothy802/Computational-Modeling-of-Word-Semantics/blob/main/Neural_Netword_Step_by_Step_Part_I.pdf) on logistic regression first; it explains the reasoning behind the code below.

### Coding a logistic regression

#### Code: sigmoid function

In [None]:
import numpy as np

# define the sigmoid function: maps any real z into the interval (0, 1)
def sigmoid(z):
    a = 1 / (1 + np.exp(-z))
    return a
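
The formula above can trigger an overflow warning in `np.exp` when `z` is a large negative number. The following is a minimal sketch of one common workaround, not part of the original notebook; the name `sigmoid_stable` is hypothetical.

```python
import numpy as np

def sigmoid_stable(z):
    """Illustrative sigmoid that never exponentiates a positive number."""
    z = np.asarray(z, dtype=float)
    # e = exp(-|z|) always lies in (0, 1], so it cannot overflow
    e = np.exp(-np.abs(z))
    # z >= 0: 1 / (1 + exp(-z));  z < 0: exp(z) / (1 + exp(z))
    return np.where(z >= 0, 1.0 / (1.0 + e), e / (1.0 + e))

z = np.array([-1000.0, -2.0, 0.0, 2.0, 1000.0])
a = sigmoid_stable(z)
print(a)
```

For moderate inputs this agrees with `sigmoid`; the two differ only in how they behave at the extremes.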

#### Code: feed-forward, cost function and gradient descent

- Feed-forward
- Cost function
- Gradient descent
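
For reference, the three steps listed above can be written as the equations below, using the notation of the code: $X$ has one column per example ($m$ columns), $Y$ and $A$ are row vectors of length $m$, and $\alpha$ is the learning rate. The superscript $(i)$ for the $i$-th example is notation assumed here for readability.

$$
\begin{aligned}
Z &= w^{T}X + b, \qquad A = \sigma(Z) = \frac{1}{1 + e^{-Z}} \\
J &= -\frac{1}{m}\sum_{i=1}^{m}\left[\,y^{(i)}\log a^{(i)} + \left(1 - y^{(i)}\right)\log\left(1 - a^{(i)}\right)\right] \\
\frac{\partial J}{\partial w} &= \frac{1}{m}\,X\,(A - Y)^{T}, \qquad \frac{\partial J}{\partial b} = \frac{1}{m}\sum_{i=1}^{m}\left(a^{(i)} - y^{(i)}\right) \\
w &\leftarrow w - \alpha\,\frac{\partial J}{\partial w}, \qquad b \leftarrow b - \alpha\,\frac{\partial J}{\partial b}
\end{aligned}
$$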

In [None]:
# initialize parameters
n_dim = train_data_sta.shape[0] # number of features (rows of the training matrix; columns are examples)
w = np.zeros((n_dim, 1))
b = 0

# propagate
def propagate(w, b, X, Y):
    
    # feed-forward function
    Z = np.dot(w.T, X) + b # np.dot -> matrix multiplication
    A = sigmoid(Z)
    
    # cost function
    m = X.shape[1]
    J = -1/m * np.sum(Y * np.log(A) + (1-Y) * np.log(1-A))
    
    # gradients of the cost w.r.t. w and b (note: averaged over the m examples)
    dw = 1/m * np.dot(X,(A-Y).T)
    db = 1/m * np.sum(A-Y)
    
    grands = {'dw': dw, 'db': db}
    
    return grands, J
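
One way to gain confidence in the analytic gradients is to compare them with central finite differences of the cost. The sketch below is an illustration, not part of the original notebook: it repeats `sigmoid` and `propagate` so that it runs on its own, and the random toy data (3 features, 5 examples) is made up for the check.

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def propagate(w, b, X, Y):
    # same computation as the notebook's propagate
    m = X.shape[1]
    A = sigmoid(np.dot(w.T, X) + b)
    J = -1/m * np.sum(Y * np.log(A) + (1-Y) * np.log(1-A))
    dw = 1/m * np.dot(X, (A-Y).T)
    db = 1/m * np.sum(A-Y)
    return {'dw': dw, 'db': db}, J

# hypothetical toy data: 3 features (rows), 5 examples (columns)
rng = np.random.default_rng(0)
X = rng.normal(size=(3, 5))
Y = rng.integers(0, 2, size=(1, 5)).astype(float)
w = rng.normal(size=(3, 1)) * 0.1
b = 0.2

grads, _ = propagate(w, b, X, Y)

# central differences: (J(theta + eps) - J(theta - eps)) / (2 * eps)
eps = 1e-6
dw_num = np.zeros_like(w)
for k in range(w.shape[0]):
    w_plus, w_minus = w.copy(), w.copy()
    w_plus[k, 0] += eps
    w_minus[k, 0] -= eps
    _, J_plus = propagate(w_plus, b, X, Y)
    _, J_minus = propagate(w_minus, b, X, Y)
    dw_num[k, 0] = (J_plus - J_minus) / (2 * eps)

_, J_plus = propagate(w, b + eps, X, Y)
_, J_minus = propagate(w, b - eps, X, Y)
db_num = (J_plus - J_minus) / (2 * eps)

max_err = max(np.max(np.abs(dw_num - grads['dw'])), abs(db_num - grads['db']))
print('max abs difference:', max_err)
```

If the analytic and numerical gradients disagree by more than a tiny amount, the formulas for `dw` or `db` are the first place to look.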

#### Code: optimization

In [None]:
# Optimization
def optimize(w, b, X, Y, alpha, n_iters):
    costs = []
    for i in range(n_iters):
        grands, J = propagate(w, b, X, Y)
        dw = grands['dw']
        db = grands['db']
        
        w = w - alpha * dw
        b = b - alpha * db
        
        if i % 100 == 0:
            costs.append(J)
            print('Epoch %d: cost = %.4f' % (i+1, J))
     
    grands = {'dw': dw, 'db': db}
    params = {'w': w, 'b': b}
    
    return grands, params, costs
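
To see the update rule at work without the image data, the loop can be run on a tiny hand-made dataset. This is a minimal sketch under assumptions: the 2-feature data, the learning rate of 0.1, and the 200 iterations are all made up for illustration, and the loop inlines the same formulas as `propagate` so that it runs on its own.

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

# hypothetical toy data: 2 features (rows), 8 examples (columns);
# the label is 1 when the two features sum to a positive number
X = np.array([[-2.0, -1.5, -1.0, -0.5, 0.5, 1.0, 1.5, 2.0],
              [-1.0, -2.0, -0.5, -1.5, 1.5, 0.5, 2.0, 1.0]])
Y = np.array([[0, 0, 0, 0, 1, 1, 1, 1]], dtype=float)

m = X.shape[1]
w = np.zeros((2, 1))
b = 0.0
alpha = 0.1
costs = []

for i in range(200):
    # feed-forward and cost
    A = sigmoid(np.dot(w.T, X) + b)
    J = -1/m * np.sum(Y * np.log(A) + (1-Y) * np.log(1-A))
    costs.append(J)
    # gradient step on w and b
    w = w - alpha * 1/m * np.dot(X, (A-Y).T)
    b = b - alpha * 1/m * np.sum(A-Y)

print('first cost: %.4f, last cost: %.4f' % (costs[0], costs[-1]))
```

The first recorded cost equals log(2), about 0.6931, because every prediction is 0.5 while `w` and `b` are still zero; on this toy data the cost then falls with each step.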

#### Code: prediction

In [None]:
# Prediction
def predict(w, b, X_test):
    
    Z = np.dot(w.T, X_test) + b
    A = sigmoid(Z)
    
    # threshold the probabilities at 0.5; the result has shape (1, m)
    Y_pred = (A > 0.5).astype(float)
    
    return Y_pred

#### Code: integrating previous steps

In [None]:
# integrating previous steps
def model(w, b, X_train, X_test, Y_train, Y_test, alpha, n_iters):
    grands, params, costs = optimize(w, b, X_train, Y_train, alpha, n_iters)
    w = params['w']
    b = params['b']
    
    Y_pred_train = predict(w, b, X_train)
    Y_pred_test = predict(w, b, X_test)
    
    print('Train accuracy: %.2f' % np.mean(Y_pred_train == Y_train))
    print('Test accuracy: %.2f' % np.mean(Y_pred_test == Y_test))
    
    dic = {
           'w': w,
           'b': b,
           'costs': costs,
           'y_pred_train': Y_pred_train,
           'y_pred_test': Y_pred_test,
           'alpha': alpha
    }
    
    return dic

#### Code: train and test

- Train and test

In [None]:
dic = model(w, b, 
            train_data_sta, test_data_sta, 
            train_labels_tran, test_labels_tran,
            alpha = 0.005, n_iters = 2000
           )

- Plot the cost as a function of the number of iterations

In [None]:
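import matplotlib.pyplot as plt

# the costs are stored in the dictionary returned by model(), one value per 100 iterations
plt.plot(dic['costs'])
plt.xlabel('iterations (in hundreds)')
plt.ylabel('cost')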

#### Code: predict a picture

In [None]:
index = 1
print('True label: %d' % test_labels_tran[0, index])
print('Pred label: %d' % int(dic['y_pred_test'][0, index]))

# show the picture
plt.imshow(test_data_org[index])