As we saw before, a single weight update can be calculated as:


$$ \Delta w_i = \eta \delta x_i $$

where η is the learning rate and δ is the error term:

$$ \delta = (y - \hat{y}) f'(h) = (y - \hat{y}) f'\left(\sum_i w_i x_i\right) $$
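
To see where this update comes from, here is a short derivation. It assumes the error being minimized is the squared error $$ E = \frac{1}{2}(y - \hat{y})^2 $$, which this section does not restate, so treat that choice as an assumption.

$$ \hat{y} = f(h), \qquad h = \sum_i w_i x_i $$

$$ \frac{\partial E}{\partial w_i} = -(y - \hat{y}) f'(h) x_i $$

$$ \Delta w_i = -\eta \frac{\partial E}{\partial w_i} = \eta (y - \hat{y}) f'(h) x_i = \eta \delta x_i $$

The minus sign from the gradient cancels against the minus sign of gradient descent, which is why δ is written with $$ (y - \hat{y}) $$ rather than $$ -(y - \hat{y}) $$.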

Remember, in the above equation $$ (y - \hat{y}) $$ is the output error, and $$ f'(h) $$
is the derivative of the activation function $$ f(h) $$. For the sigmoid function, for example, the derivative is

$$ \sigma'(x) = \sigma(x) (1 - \sigma(x)) $$

We'll call that derivative the output gradient.
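
As a quick sanity check on the sigmoid derivative formula, the sketch below compares it with a central finite-difference estimate. The helper `numerical_derivative`, the step size `eps`, and the sample points are illustrative choices, not part of the original material, and the code uses only the standard library so it runs without NumPy.

```python
import math


def sigmoid(x):
    """Calculate sigmoid for a single float."""
    return 1 / (1 + math.exp(-x))


def sigmoid_prime(x):
    """Analytic derivative: sigma(x) * (1 - sigma(x))."""
    return sigmoid(x) * (1 - sigmoid(x))


def numerical_derivative(f, x, eps=1e-6):
    """Central finite-difference estimate of f'(x) (hypothetical helper)."""
    return (f(x + eps) - f(x - eps)) / (2 * eps)


# Sample points chosen arbitrarily for illustration
sample_points = (-2.0, 0.0, 0.8, 3.0)
differences = []
for x in sample_points:
    analytic = sigmoid_prime(x)
    numeric = numerical_derivative(sigmoid, x)
    differences.append(abs(analytic - numeric))
    print(f"x={x:+.1f}  analytic={analytic:.6f}  numeric={numeric:.6f}")
```

The two columns should agree to several decimal places, and at x = 0 the derivative takes its maximum value of 0.25.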

The example below uses the sigmoid as the activation function $$ f(h) $$.

In [6]:
import numpy as np

def sigmoid(x):
    """
    Calculate sigmoid
    """
    return 1/(1+np.exp(-x))

def sigmoid_prime(x):
    """
    Derivative of the sigmoid function
    """
    return sigmoid(x) * (1 - sigmoid(x))

# learnrate = η
learnrate = 0.5
x = np.array([1, 2, 3, 4])
y = np.array(0.5)

# Initial weights
w = np.array([0.5, -0.5, 0.3, 0.1])

### Calculate one gradient descent step for each weight
### Note: Some steps have been consolidated, so there are
###       fewer variable names than in the above sample code

# TODO: Calculate the node's linear combination of inputs and weights
h = np.dot(x, w)

# TODO: Calculate output of neural network (y_hat) = f(h)
nn_output = sigmoid(h)

# TODO: Calculate error of neural network (y - y_hat)
error = y - nn_output

# TODO: Calculate the error term
#       Remember, this requires the output gradient, which we haven't
#       specifically added a variable for: δ = (y − ŷ) f′(h).
#       Here the activation function is the sigmoid, so
#       f′(h) = σ′(h) = σ(h) * (1 − σ(h)).

error_term = error * sigmoid_prime(h)

# TODO: Calculate change in weights, Δw_i = η δ x_i
#       del_w is a vector holding the change for each weight
del_w = learnrate * error_term * x

print('Neural Network output:')
print(nn_output)
print('Amount of Error:')
print(error)
print('Change in Weights:')
print(del_w)

Neural Network output:
0.689974481128
Amount of Error:
-0.189974481128
Change in Weights:
[-0.02031869 -0.04063738 -0.06095608 -0.08127477]
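
The cell above computes the weight change but does not apply it. As a minimal sketch of what happens next, the code below adds the change to the weights and repeats the step a few times, using the same inputs, target, and learning rate. The loop, the step count of 5, and the `forward` helper are assumptions added for illustration, and plain Python lists stand in for NumPy arrays so the block has no dependencies.

```python
import math


def sigmoid(h):
    """Calculate sigmoid for a single float."""
    return 1 / (1 + math.exp(-h))


learnrate = 0.5
x = [1, 2, 3, 4]
y = 0.5
w = [0.5, -0.5, 0.3, 0.1]  # same initial weights as the cell above


def forward(weights):
    """Network output y_hat = sigmoid(sum_i w_i * x_i) (hypothetical helper)."""
    h = sum(wi * xi for wi, xi in zip(weights, x))
    return sigmoid(h)


errors = []
for step in range(5):  # 5 steps is an arbitrary choice for illustration
    y_hat = forward(w)
    error = y - y_hat
    errors.append(error)

    # Error term: delta = (y - y_hat) * f'(h), with f'(h) = y_hat * (1 - y_hat)
    error_term = error * y_hat * (1 - y_hat)

    # Apply the update: w_i <- w_i + learnrate * delta * x_i
    w = [wi + learnrate * error_term * xi for wi, xi in zip(w, x)]
    print(f"step {step}: error = {error:.6f}")
```

The first printed error matches the value shown above, and with these particular numbers the magnitude of the error shrinks on every step.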


One step of forward pass and backpropagation in a multilayer network

In [1]:
import numpy as np


def sigmoid(x):
    """
    Calculate sigmoid
    """
    return 1 / (1 + np.exp(-x))


x = np.array([0.5, 0.1, -0.2])
target = 0.6
learnrate = 0.5

weights_input_hidden = np.array([[0.5, -0.6],
                                 [0.1, -0.2],
                                 [0.1, 0.7]])

weights_hidden_output = np.array([0.1, -0.3])

## Forward pass
hidden_layer_input = np.dot(x, weights_input_hidden)
hidden_layer_output = sigmoid(hidden_layer_input)

output_layer_in = np.dot(hidden_layer_output, weights_hidden_output)
output = sigmoid(output_layer_in)

## Backward pass
## TODO: Calculate output error
error = target - output

# TODO: Calculate error term for output layer
# The sigmoid derivative is expressed through the output itself:
# f'(h) = output * (1 - output)
output_error_term = error * output * (1 - output)

# TODO: Calculate error term for hidden layer
# With a single output unit, output_error_term is a scalar, so np.dot
# simply scales weights_hidden_output by it.
hidden_error_term = np.dot(output_error_term, weights_hidden_output) * \
                    hidden_layer_output * (1 - hidden_layer_output)
# Equivalent element-wise form for each hidden unit j:
#   hidden_error_term[j] = output_error_term * weights_hidden_output[j]
#                          * hidden_layer_output[j] * (1 - hidden_layer_output[j])

# TODO: Calculate change in weights for hidden layer to output layer
delta_w_h_o = learnrate * output_error_term * hidden_layer_output

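# TODO: Calculate change in weights for input layer to hidden layer
# x[:, None] turns x into a column vector, so the product broadcasts to
# the same 3x2 shape as weights_input_hidden.
delta_w_i_h = learnrate * hidden_error_term * x[:, None]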

print('Change in weights for hidden layer to output layer:')
print(delta_w_h_o)
print('Change in weights for input layer to hidden layer:')
print(delta_w_i_h)


Change in weights for hidden layer to output layer:
[ 0.00804047  0.00555918]
Change in weights for input layer to hidden layer:
[[  1.77005547e-04  -5.11178506e-04]
 [  3.54011093e-05  -1.02235701e-04]
 [ -7.08022187e-05   2.04471402e-04]]
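
One way to gain confidence in the hidden-layer update is a numerical gradient check. The sketch below recomputes `delta_w_i_h` with backpropagation and compares it with a finite-difference estimate of `-learnrate * dE/dw`. It assumes the squared error E = 1/2 (target - output)^2, which this section does not state explicitly. The helpers `forward`, `loss`, and `perturbed` and the step size `eps` are hypothetical names introduced here, and nested lists replace NumPy arrays so the block runs with the standard library alone.

```python
import math


def sigmoid(z):
    """Calculate sigmoid for a single float."""
    return 1 / (1 + math.exp(-z))


x = [0.5, 0.1, -0.2]
target = 0.6
learnrate = 0.5
w_ih = [[0.5, -0.6], [0.1, -0.2], [0.1, 0.7]]  # input -> hidden, shape 3x2
w_ho = [0.1, -0.3]                              # hidden -> output


def forward(weights_ih, weights_ho):
    """Return (hidden activations, network output)."""
    hidden = [sigmoid(sum(x[i] * weights_ih[i][j] for i in range(3)))
              for j in range(2)]
    output = sigmoid(sum(hidden[j] * weights_ho[j] for j in range(2)))
    return hidden, output


def loss(weights_ih, weights_ho):
    """Assumed error function: E = 1/2 * (target - output)^2."""
    _, output = forward(weights_ih, weights_ho)
    return 0.5 * (target - output) ** 2


def perturbed(i, j, delta):
    """Copy of w_ih with delta added to the single weight at (i, j)."""
    copy = [row[:] for row in w_ih]
    copy[i][j] += delta
    return copy


# Backpropagation, following the same formulas as the cell above
hidden, output = forward(w_ih, w_ho)
output_error_term = (target - output) * output * (1 - output)
hidden_error_term = [output_error_term * w_ho[j] * hidden[j] * (1 - hidden[j])
                     for j in range(2)]
delta_w_i_h = [[learnrate * hidden_error_term[j] * x[i] for j in range(2)]
               for i in range(3)]

# Finite-difference estimate of the same update: -learnrate * dE/dw
eps = 1e-6
numeric = [[-learnrate * (loss(perturbed(i, j, eps), w_ho)
                          - loss(perturbed(i, j, -eps), w_ho)) / (2 * eps)
            for j in range(2)]
           for i in range(3)]

max_diff = max(abs(delta_w_i_h[i][j] - numeric[i][j])
               for i in range(3) for j in range(2))
print("largest difference between backprop and numeric update:", max_diff)
```

If the backpropagation formulas are right, the largest difference should be many orders of magnitude smaller than the updates themselves, which are on the order of 1e-4.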
