### Multiple Layer Forward Pass Network with Sigmoid Activation 

Below, you'll implement a forward pass through a 4x3x2 network, with sigmoid activation functions for both layers.

Things to do:

- Calculate the input to the hidden layer.
- Calculate the hidden layer output.
- Calculate the input to the output layer.
- Calculate the output of the network

**Perceptron Function:** Receives Two Inputs, applying linear combination function as linear_combination = weight1 . input1 + weight2 . input2

**Activation Function:** In this case, Sigmoid Function with Bias: output as range of probabilities (of success)

**Error** Error is nothing but known output (y) minus computed output (y-hat)

**Error Term** Error minus derivative of sigmoid

**Gradient Descent Step** Gives us revised weights, or 'delta w' -- It is learning rate (eta) multipliedBy error_term  multipliedBy input(x)

**Gradient Descent** After every step computed i.e. delta of weights. Updated weights with delta weights from Gradient Step. We will be follwing MSE (Mean Squared Error) instead of SSE (Sum of Squared Errors) avoid Gradient from Divergering & keep it computationally efficient .. 

In [4]:
import numpy as np
from data_prep import features, targets, features_test, targets_test

np.random.seed(21)

def sigmoid(x):
    """
    Calculate sigmoid
    """
    return 1 / (1 + np.exp(-x))


# Hyperparameters
n_hidden = 2  # number of hidden units
epochs = 900
learnrate = 0.005

n_records, n_features = features.shape
last_loss = None
# Initialize weights
weights_input_hidden = np.random.normal(scale=1 / n_features ** .5,
                                        size=(n_features, n_hidden))
weights_hidden_output = np.random.normal(scale=1 / n_features ** .5,
                                         size=n_hidden)

for e in range(epochs):
    del_w_input_hidden = np.zeros(weights_input_hidden.shape)
    del_w_hidden_output = np.zeros(weights_hidden_output.shape)
    for x, y in zip(features.values, targets):
        ## Forward pass ##
        # TODO: Calculate the output
        hidden_input = np.dot(x, weights_input_hidden)
        hidden_output = sigmoid(hidden_input)

        output = sigmoid(np.dot(hidden_output,
                                weights_hidden_output))

        ## Backward pass ##
        # TODO: Calculate the error
        error = y - output

        # TODO: Calculate error gradient in output unit
        output_error = error * output * (1 - output)

        # TODO: propagate errors to hidden layer
        hidden_error = np.dot(output_error, weights_hidden_output) * \
                       hidden_output * (1 - hidden_output)

        # TODO: Update the change in weights
        del_w_hidden_output += output_error * hidden_output
        del_w_input_hidden += hidden_error * x[:, None]

    # TODO: Update weights
    weights_input_hidden += learnrate * del_w_input_hidden / n_records
    weights_hidden_output += learnrate * del_w_hidden_output / n_records

    # Printing out the mean square error on the training set
    if e % (epochs / 10) == 0:
        hidden_output = sigmoid(np.dot(x, weights_input_hidden))
        out = sigmoid(np.dot(hidden_output,
                             weights_hidden_output))
        loss = np.mean((out - targets) ** 2)

        if last_loss and last_loss < loss:
            print("Train loss: ", loss, "  WARNING - Loss Increasing")
        else:
            print("Train loss: ", loss)
        last_loss = loss

# Calculate accuracy on test data
hidden = sigmoid(np.dot(features_test, weights_input_hidden))
out = sigmoid(np.dot(hidden, weights_hidden_output))
predictions = out > 0.5
accuracy = np.mean(predictions == targets_test)
print("Prediction accuracy: {:.3f}".format(accuracy))

('Train loss: ', 0.27630002065852294)
('Train loss: ', 0.27487280940102665)
('Train loss: ', 0.2734814690053808)
('Train loss: ', 0.27212535119812675)
('Train loss: ', 0.27080379729958337)
('Train loss: ', 0.2695161402601928)
('Train loss: ', 0.2682617065761968)
('Train loss: ', 0.26703981808591765)
('Train loss: ', 0.2658497936485804)
('Train loss: ', 0.26469095070807397)
Prediction accuracy: 0.425
