# Implementing backpropagation

Now we've seen that the error in the output layer is

δ
​k
​​ =(y
​k
​​ −
​y
​^
​​ 
​k
​​ )f
​′
​​ (a
​k
​​ )

and the error in the hidden layer is

For now we'll only consider a simple network with one hidden layer and one output unit. Here's the general algorithm for updating the weights with backpropagation:

Set the weight steps for each layer to zero
The input to hidden weights Δw
​ij
​​ =0
The hidden to output weights ΔW
​j
​​ =0
For each record in the training data:
Make a forward pass through the network, calculating the output 
​y
​^
​​ 
Calculate the error gradient in the output unit, δ
​o
​​ =(y−
​y
​^
​​ )f
​′
​​ (z) where z=∑
​j
​​ W
​j
​​ a
​j
​​ , the input to the output unit.
Propagate the errors to the hidden layer δ
​j
​h
​​ =δ
​o
​​ W
​j
​​ f
​′
​​ (h
​j
​​ )
Update the weight steps,:
ΔW
​j
​​ =ΔW
​j
​​ +δ
​o
​​ a
​j
​​ 
Δw
​ij
​​ =Δw
​ij
​​ +δ
​j
​h
​​ a
​i
​​ 
Update the weights, where η is the learning rate and m is the number of records:
W
​j
​​ =W
​j
​​ +ηΔW
​j
​​ /m
w
​ij
​​ =w
​ij
​​ +ηΔw
​ij
​​ /m
Repeat for e epochs.
Backpropagation exercise
Now you're going to implement the backprop algorithm for a network trained on the graduate school admission data. You should have everything you need from the previous exercises to complete this one.

Your goals here:

Implement the forward pass.
Implement the backpropagation algorithm.
Update the weights.

In [21]:
import numpy as np
from data_prep import features, targets, features_test, targets_test

np.random.seed(21)

def sigmoid(x):
    """
    Calculate sigmoid
    """
    return 1 / (1 + np.exp(-x))


# Hyperparameters
n_hidden = 2  # number of hidden units
epochs = 900
learnrate = 0.005

n_records, n_features = features.shape
last_loss = None
# Initialize weights
weights_input_hidden = np.random.normal(scale=1 / n_features ** .5,
                                        size=(n_features, n_hidden))
weights_hidden_output = np.random.normal(scale=1 / n_features ** .5,
                                         size=n_hidden)

for e in range(epochs):
    del_w_input_hidden = np.zeros(weights_input_hidden.shape)
    del_w_hidden_output = np.zeros(weights_hidden_output.shape)
    for x, y in zip(features.values, targets):
        ## Forward pass ##
        # TODO: Calculate the output
        hidden_input = np.dot(x, weights_input_hidden)
        hidden_output = sigmoid(hidden_input)
        output = sigmoid(np.dot(hidden_output, weights_hidden_output))

        ## Backward pass ##
        # TODO: Calculate the network's prediction error
        error = y - output

        # TODO: Calculate error term for the output unit
        output_error_term = error * output * (1 - output)

        ## propagate errors to hidden layer

        # TODO: Calculate the hidden layer's contribution to the error
        hidden_error = np.dot(output_error_term, weights_hidden_output)
        
        # TODO: Calculate the error term for the hidden layer
        hidden_error_term = hidden_error * hidden_output * (1 - hidden_output)
        
        # TODO: Update the change in weights
        del_w_hidden_output += output_error_term * hidden_output
        del_w_input_hidden += hidden_error_term * x[:, None]

    # TODO: Update weights
    weights_input_hidden += learnrate * del_w_input_hidden / n_records
    weights_hidden_output += learnrate * del_w_hidden_output / n_records

    # Printing out the mean square error on the training set
    if e % (epochs / 10) == 0:
        hidden_output = sigmoid(np.dot(x, weights_input_hidden))
        out = sigmoid(np.dot(hidden_output,
                             weights_hidden_output))
        loss = np.mean((out - targets) ** 2)

        if last_loss and last_loss < loss:
            print("Train loss: ", loss, "  WARNING - Loss Increasing")
        else:
            print("Train loss: ", loss)
        last_loss = loss

# Calculate accuracy on test data
hidden = sigmoid(np.dot(features_test, weights_input_hidden))
out = sigmoid(np.dot(hidden, weights_hidden_output))
predictions = out > 0.5
accuracy = np.mean(predictions == targets_test)
print("Prediction accuracy: {:.3f}".format(accuracy))

Train loss:  0.276300020659
Train loss:  0.274872809401
Train loss:  0.273481469005
Train loss:  0.272125351198
Train loss:  0.2708037973
Train loss:  0.26951614026
Train loss:  0.268261706576
Train loss:  0.267039818086
Train loss:  0.265849793649
Train loss:  0.264690950708
Prediction accuracy: 0.425


In [2]:
import numpy as np
from data_prep import features, targets, features_test, targets_test

np.random.seed(21)

def sigmoid(x):
    """
    Calculate sigmoid
    """
    return 1 / (1 + np.exp(-x))


# Hyperparameters
n_hidden = 2  # number of hidden units
epochs = 900
learnrate = 0.005

n_records, n_features = features.shape
last_loss = None
# Initialize weights
weights_input_hidden = np.random.normal(scale=1 / n_features ** .5,
                                        size=(n_features, n_hidden))
weights_hidden_output = np.random.normal(scale=1 / n_features ** .5,
                                         size=n_hidden)

for e in range(epochs):
    del_w_input_hidden = np.zeros(weights_input_hidden.shape)
    del_w_hidden_output = np.zeros(weights_hidden_output.shape)
    for x, y in zip(features.values, targets):
        ## Forward pass ##
        # TODO: Calculate the output
        hidden_input = None
        hidden_output = None
        output = None

        ## Backward pass ##
        # TODO: Calculate the network's prediction error
        error = None

        # TODO: Calculate error term for the output unit
        output_error_term = None

        ## propagate errors to hidden layer

        # TODO: Calculate the hidden layer's contribution to the error
        hidden_error = None
        
        # TODO: Calculate the error term for the hidden layer
        hidden_error_term = None
        
        # TODO: Update the change in weights
        del_w_hidden_output += 0
        del_w_input_hidden += 0

    # TODO: Update weights
    weights_input_hidden += 0
    weights_hidden_output += 0

    # Printing out the mean square error on the training set
    if e % (epochs / 10) == 0:
        hidden_output = sigmoid(np.dot(x, weights_input_hidden))
        out = sigmoid(np.dot(hidden_output,
                             weights_hidden_output))
        loss = np.mean((out - targets) ** 2)

        if last_loss and last_loss < loss:
            print("Train loss: ", loss, "  WARNING - Loss Increasing")
        else:
            print("Train loss: ", loss)
        last_loss = loss

# Calculate accuracy on test data
hidden = sigmoid(np.dot(features_test, weights_input_hidden))
out = sigmoid(np.dot(hidden, weights_hidden_output))
predictions = out > 0.5
accuracy = np.mean(predictions == targets_test)
print("Prediction accuracy: {:.3f}".format(accuracy))

Train loss:  0.276316082465
Train loss:  0.276316082465
Train loss:  0.276316082465
Train loss:  0.276316082465
Train loss:  0.276316082465
Train loss:  0.276316082465
Train loss:  0.276316082465
Train loss:  0.276316082465
Train loss:  0.276316082465
Train loss:  0.276316082465
Prediction accuracy: 0.300
