## Error Gradient
\begin{align}
\frac{\partial{E}}{\partial{w_{ij}}} = -1 \quad (y_j - \hat{y_j}) \quad f'(\sum_{i} w_{ij}x_i) \quad x_i
\end{align}

## Weight Adjustment
\begin{align}
\Delta{w_{ij}} = \eta \quad (y_j - \hat{y_j}) \quad f'(h_j) \quad x_i
\end{align}

where 
\begin{align}
h_j = \sum_{i} w_{ij}x_i
\end{align}
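
The weight adjustment above can be sketched in NumPy as follows. This is a minimal illustration, not a full training loop: the sigmoid activation, the input, weight and target values, and the learning rate `eta` are all assumptions chosen for the example.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sigmoid_prime(x):
    s = sigmoid(x)
    return s * (1.0 - s)

# Hypothetical example values (not taken from the notebook)
x = np.array([0.5, -1.0, 2.0])            # inputs x_i, shape (3,)
w = np.array([[0.1, -0.2],
              [0.4,  0.3],
              [-0.5, 0.2]])               # weights w_ij, shape (3, 2)
y = np.array([1.0, 0.0])                  # targets y_j, shape (2,)
eta = 0.1                                 # learning rate

h = np.dot(x, w)                          # h_j = sum_i w_ij * x_i
y_hat = sigmoid(h)                        # network output for each neuron j
error_term = (y - y_hat) * sigmoid_prime(h)

# delta_w[i, j] = eta * (y_j - y_hat_j) * f'(h_j) * x_i
delta_w = eta * x[:, None] * error_term[None, :]
```

Indexing `x` with `[:, None]` turns it into a column, so broadcasting produces one adjustment per weight, with the same shape as `w`.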

## Useful pre-calculation per neuron

### For output neuron k
\begin{align}
\delta_k = (y_k - \hat{y_k}) \quad f'(a_k)
\end{align}

where 
\begin{align}
a_k
\end{align}
is the summed input for neuron _k_ and _f'_ is the derivative of the activation function _f_.


When sigmoid is used as _f_, this formula can be simplified further by applying the identity

\begin{align}
sigmoid'(x) = sigmoid(x) \quad (1 - sigmoid(x) )
\end{align}

into 

\begin{align}
\delta_k = (y_k - \hat{y_k}) \quad o_k (1 - o_k)
\end{align}
where _o_ is the output of the neuron, i.e. the sigmoid of its summed input.
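
As a quick check, the general and the simplified form of the output delta can be compared numerically. The summed inputs `a` and targets `y` below are made-up values, and the neuron output is used as the prediction.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Hypothetical summed inputs a_k and targets y_k
a = np.array([0.0, 2.0, -1.0])
y = np.array([1.0, 0.0, 1.0])

o = sigmoid(a)                                    # outputs o_k

# General form: (y_k - o_k) * f'(a_k), with f' written out for sigmoid
delta_general = (y - o) * sigmoid(a) * (1.0 - sigmoid(a))

# Simplified form: reuse the output o_k that was already computed
delta_simplified = (y - o) * o * (1.0 - o)
```

Both arrays agree, but the simplified form needs no second evaluation of the sigmoid.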


### For hidden neurons j
\begin{align}
\delta_j = \sum_{k}{[w_{jk}\delta_k]} \quad f'(h_j)
\end{align}

where 
\begin{align}
h_j
\end{align}

is the summed input for neuron _j_ and the sum runs over the output neurons _k_ that neuron _j_ feeds into.

Again, when sigmoid is used as _f_, this formula can be simplified further into

\begin{align}
\delta_j = \sum_{k}{[w_{jk}\delta_k]} \quad o_j (1 - o_j)
\end{align}
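
The hidden-layer delta can be sketched with a single dot product. The weights `w_jk`, output deltas `delta_k` and hidden outputs `o_j` are hypothetical values for a layer with three hidden and two output neurons.

```python
import numpy as np

# Hypothetical values: 3 hidden neurons j, 2 output neurons k
w_jk = np.array([[0.3, -0.1],
                 [0.2,  0.5],
                 [-0.4, 0.1]])        # weight from hidden j to output k
delta_k = np.array([0.1, -0.2])       # deltas of the output neurons
o_j = np.array([0.5, 0.8, 0.2])       # sigmoid outputs of the hidden neurons

# sum_k w_jk * delta_k for every hidden neuron j, shape (3,)
back_propagated = np.dot(w_jk, delta_k)

# delta_j = sum_k [w_jk * delta_k] * o_j * (1 - o_j)
delta_j = back_propagated * o_j * (1.0 - o_j)
```

Each row of `w_jk` holds the outgoing weights of one hidden neuron, so the dot product yields one back-propagated error per hidden neuron.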

# NumPy basics

In [4]:
import numpy as np
np.array([1.0,2.0,3.0])

array([ 1.,  2.,  3.])

In [3]:
np.array([1.0,2.0,3.0])[:, None]

array([[ 1.],
       [ 2.],
       [ 3.]])

### Dot Product

In [13]:
# create a vec 
# [1, 2, 3]
vec = np.array([1,2,3])

# Create a matrix
# [[1, 2],
# [3, 4],
# [5, 6]]
matrix = np.matrix('1 2; 3 4; 5 6')

np.dot(vec, matrix)

matrix([[22, 28]])

In [18]:
# create a 1-D vec with 2 elements
# [1, 2]
vec = np.array([1,2])

#create a matrix
# ([     [ 10,  20,  30],
#        [100, 200, 300]])
matrix = np.matrix('10 20 30; 100 200 300')
np.dot(vec, matrix)

matrix([[210, 420, 630]])

In [19]:
np.dot(matrix, vec)

ValueError: shapes (2,3) and (2,) not aligned: 3 (dim 1) != 2 (dim 0)
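
The error comes from the inner dimensions: `np.dot(a, b)` needs the last dimension of `a` to match the first dimension of `b`. A minimal sketch of the failure and one possible fix, using a plain `np.array` instead of `np.matrix` (an assumption made here to keep the result shapes simple):

```python
import numpy as np

vec = np.array([1, 2])                    # shape (2,)
mat = np.array([[10, 20, 30],
                [100, 200, 300]])         # shape (2, 3)

left = np.dot(vec, mat)                   # (2,) . (2, 3) -> shape (3,)

try:
    np.dot(mat, vec)                      # (2, 3) . (2,): 3 != 2
    failed = False
except ValueError:
    failed = True

# Transposing the matrix aligns the inner dimensions again
right = np.dot(mat.T, vec)                # (3, 2) . (2,) -> shape (3,)
```

With the transpose, the product from the right gives the same values as `np.dot(vec, mat)`.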

### Multiply Vectors

In [21]:
vecA = np.array([1.0,2.0,3.0])[:, None]
vecA

array([[ 1.],
       [ 2.],
       [ 3.]])

In [22]:
vecB = np.array([10.0,20.0,30.0])[:, None]
vecA * vecB

array([[ 10.],
       [ 40.],
       [ 90.]])

In [23]:
vecC = np.array([[ 2 ]])
vecA * vecC

array([[ 2.],
       [ 4.],
       [ 6.]])

In [25]:
vecA.shape

(3, 1)
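
The element-wise products above rely on broadcasting: dimensions of size 1 are stretched to match the other operand. A small sketch, with a hypothetical row vector `row` that is not part of the cells above, shows a column multiplied by a row:

```python
import numpy as np

vecA = np.array([1.0, 2.0, 3.0])[:, None]   # column, shape (3, 1)
row = np.array([[10.0, 20.0]])              # hypothetical row, shape (1, 2)

# (3, 1) * (1, 2) broadcasts to (3, 2): every element of the column
# is multiplied with every element of the row
product = vecA * row
```

This is the same pattern as multiplying a column of inputs by a row of per-neuron error terms to get one value per weight.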