## Backpropagation Exercise

Implement the code to __compute one backpropagation update step__ for __two sets of weights__. The forward pass is already written; your task is to write the backward pass.

Things to do

1. Compute the __network error__.
  * A: `y - ŷ`, where `ŷ` is the network's prediction <br><br>
1. Compute the __error gradient__ of the __output layer__.
  * A: the error times the derivative of the sigmoid at the output: `error * output * (1 - output)` <br><br>
1. Use __backpropagation__ to __compute__ the __hidden layer error__.
  * A: 
    * the __linear combination__ of `output_error_term` with `weights_hidden_output`, times
    * the __derivative__ of `hidden_layer_output`, which in Python becomes:
      * __Linear combination__: ```np.dot(output_error_term, weights_hidden_output)``` (with a single output unit this is just a scalar times the weight vector)
      * __Derivative__ of the hidden layer output: ```hidden_layer_output*(1 - hidden_layer_output)```
      * ```np.dot(output_error_term, weights_hidden_output)*hidden_layer_output*(1 - hidden_layer_output)```  <br><br>
1. Compute the __weight update step__.
  * A:
    * `delta_w_hidden_out = learnrate * output_error_term * hidden_layer_output`
    * `delta_w_input_hidden = learnrate * hidden_error_term * x[:, None]`
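
The `x[:, None]` in the last step relies on NumPy broadcasting. The sketch below shows what it does, using made-up values (an assumption for illustration, not the numbers from the exercise) so the pattern is easy to read:

```python
import numpy as np

# Illustrative values only; they are NOT the ones used in the exercise.
x = np.array([1.0, 2.0, 3.0])               # 3 inputs, shape (3,)
hidden_error_term = np.array([10.0, 20.0])  # 2 hidden units, shape (2,)

# Adding an axis turns x into a column, shape (3, 1).
x_as_column_vector = x[:, None]

# (3, 1) * (2,) broadcasts to (3, 2):
# entry [i, j] is x[i] * hidden_error_term[j],
# i.e. one value per input-to-hidden weight.
grid = x_as_column_vector * hidden_error_term

print(grid.shape)
print(np.array_equal(grid, np.outer(x, hidden_error_term)))
```

The result has the same shape as `weights_input_hidden`, so it can be used directly as a weight change. `np.outer(x, hidden_error_term)` would be an equivalent way to write the same product.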
  
  
  
## Result of the "Experiment":
1. given the __inputs (dataset) [0.5, 0.1, -0.2]__, with target 0.6 and learning rate 0.5; 
1. we start from the __"random" weights__: 
  * input-to-hidden: [[0.5, -0.6],[0.1, -0.2],[0.1, 0.7]];
  * hidden-to-output: [0.1, -0.3];
1. the network computed the following weight changes:
  * hidden-to-output: [0.00804047  0.00555918]
  * input-to-hidden: [[1.77005547e-04  -5.11178506e-04], [3.54011093e-05  -1.02235701e-04], [-7.08022187e-05   2.04471402e-04]]

In [9]:
import numpy as np


def sigmoid(x):
    return 1 / (1 + np.exp(-x))

# input vector (a single training example)
x = np.array([0.5, 0.1, -0.2])
# target value (y)
target = 0.6
# learning rate (η)
learnrate = 0.5

# input-to-hidden weights: 3 inputs x 2 hidden units
weights_input_hidden = np.array([[0.5, -0.6],
                                 [0.1, -0.2],
                                 [0.1, 0.7]])

# hidden-to-output weights: 2 hidden units -> 1 output
weights_hidden_output = np.array([0.1, -0.3])

## Forward pass (passing to next layer)
hidden_layer_input = np.dot(x, weights_input_hidden)
hidden_layer_output = sigmoid(hidden_layer_input)

output_layer_in = np.dot(hidden_layer_output, weights_hidden_output)
output = sigmoid(output_layer_in)

## Backward pass
# TODO: Calculate output error
# DONE: y - ŷ (y-hat = the network's prediction)
error = target - output

# TODO: Calculate error term for output layer
# DONE: error * derivative of the output sigmoid
output_error_term = error * output * (1 - output)

# TODO: Calculate error term for hidden layer
# DONE: output error term scaled by the hidden-to-output weights,
#       times the derivative of the hidden layer sigmoid
linear_combination_output_error_and_hidden_weights = np.dot(output_error_term, weights_hidden_output)
derivated_output = hidden_layer_output * (1 - hidden_layer_output)
hidden_error_term = linear_combination_output_error_and_hidden_weights * derivated_output

# TODO: Calculate change in weights for hidden layer to output layer
# DONE: General formula: delta_w = n * (y-ŷ) * ƒ'(h) * a
## n = learnrate
## (y-ŷ) = error
## (y-ŷ)*ƒ'(h) = error_term
## a = activation feeding the weight (here: hidden_layer_output)
delta_w_hidden_out = learnrate * output_error_term * hidden_layer_output

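# TODO: Calculate change in weights for input layer to hidden layer
# DONE: same formula, with the inputs x as the activation feeding the weights;
#       x as a (3, 1) column broadcasts against the (2,) error term to give (3, 2)
x_as_column_vector = x[:, None]
delta_w_input_hidden = learnrate * hidden_error_term * x_as_column_vector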

print('Change in weights for hidden layer to output layer:')
print(delta_w_hidden_out)
print('Change in weights for input layer to hidden layer:')
print(delta_w_input_hidden)

Change in weights for hidden layer to output layer:
[ 0.00804047  0.00555918]
Change in weights for input layer to hidden layer:
[[  1.77005547e-04  -5.11178506e-04]
 [  3.54011093e-05  -1.02235701e-04]
 [ -7.08022187e-05   2.04471402e-04]]
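
The exercise stops at printing the weight changes. As a sanity check, the sketch below applies them and reruns the forward pass to see whether the prediction moves toward the target. It assumes the usual update rule of adding each delta to its weights (consistent with `error = target - output`). The `forward` helper and the `new_*` and `error_*` names are hypothetical, introduced here only for illustration.

```python
import numpy as np


def sigmoid(x):
    return 1 / (1 + np.exp(-x))


def forward(x, weights_input_hidden, weights_hidden_output):
    """Hypothetical helper: forward pass of the 3-2-1 network."""
    hidden_layer_output = sigmoid(np.dot(x, weights_input_hidden))
    output = sigmoid(np.dot(hidden_layer_output, weights_hidden_output))
    return hidden_layer_output, output


# Same values as in the exercise
x = np.array([0.5, 0.1, -0.2])
target = 0.6
learnrate = 0.5
weights_input_hidden = np.array([[0.5, -0.6],
                                 [0.1, -0.2],
                                 [0.1, 0.7]])
weights_hidden_output = np.array([0.1, -0.3])

# Forward pass and error before the update
hidden_layer_output, output = forward(x, weights_input_hidden,
                                      weights_hidden_output)
error_before = target - output

# Backward pass, same formulas as in the exercise
output_error_term = error_before * output * (1 - output)
hidden_error_term = (output_error_term * weights_hidden_output
                     * hidden_layer_output * (1 - hidden_layer_output))
delta_w_hidden_out = learnrate * output_error_term * hidden_layer_output
delta_w_input_hidden = learnrate * hidden_error_term * x[:, None]

# Assumed update rule: add each delta to the corresponding weights
new_weights_input_hidden = weights_input_hidden + delta_w_input_hidden
new_weights_hidden_output = weights_hidden_output + delta_w_hidden_out

# Forward pass and error after the update
_, new_output = forward(x, new_weights_input_hidden,
                        new_weights_hidden_output)
error_after = target - new_output

print('Error before update:', error_before)
print('Error after update: ', error_after)
print('Error shrank:', abs(error_after) < abs(error_before))
```

A single step only moves the output slightly toward the target; in practice the update would be repeated over many examples and iterations.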
