%%HTML
<link rel="stylesheet" type="text/css" href="../css/custom.css">

# Neural Networks
![footer_logo](../images/logo.png)

# Neural Networks: intuition

![neuron_comparison center half](../images/neuron_comparison.png)


# Neural Networks: intuition

![neuron_comparison center half](../images/neuron.png)


# Neural Networks
* **Neuron (node)**: element of a network where inputs (vector **x**) are combined with weights (vector **w**) and a bias (value *b*), then passed through a non-linear activation function (σ) to produce an output value, 
i.e. $\text{output} = \sigma(w^T x + b)$ 
* **Layer**: a column of neurons that all receive the same inputs.
* **Hidden layers**: intermediate layers between inputs and outputs.
* **Deep neural network**: a neural network with many hidden layers, which lets it model more complicated and subtle decision problems. 
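
The definitions above can be sketched in a few lines of plain Python. This is a minimal illustration, not a reference implementation: the sigmoid activation, the helper names `neuron` and `layer`, and all numeric values are assumptions made for the example.

```python
import math

def sigmoid(z):
    """Logistic activation: squashes any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def neuron(x, w, b, activation=sigmoid):
    """Compute activation(w^T x + b) for a single neuron."""
    z = sum(w_i * x_i for w_i, x_i in zip(w, x)) + b
    return activation(z)

def layer(x, weights, biases, activation=sigmoid):
    """A layer: several neurons that all receive the same inputs x."""
    return [neuron(x, w, b, activation) for w, b in zip(weights, biases)]

# Illustrative values only (not taken from the slides)
x = [1.0, 2.0]
hidden = layer(x, weights=[[0.5, -0.25], [0.1, 0.2]], biases=[0.0, -0.5])
output = neuron(hidden, w=[1.0, -1.0], b=0.0)
```

Here `hidden` plays the role of one hidden layer with two neurons, and `output` is a single output neuron fed by that layer.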


![neuron_comparison center half](../images/neuron.png)


# Neural Networks: training


*Training*: the process of tuning the weights **w** (and biases) in a network by showing it example data: pairs of input data **X** and the target label *y* that the network should predict.  

* Gradient descent: repeatedly stepping against the gradient of the error function, with step size set by the learning rate η, to move towards a minimum
* Backpropagation: an efficient algorithm for computing these gradients using the chain rule



![center](../images/cost_function_gradient.png)

### <center> $ w' = w - \eta \frac{\partial J(w)}{\partial w} $ </center>
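
The update rule above can be tried on a one-dimensional cost function. This is a sketch under stated assumptions: the quadratic cost $J(w) = (w - 3)^2$, the starting point, the learning rate and the number of steps are all hypothetical choices for illustration.

```python
def gradient_descent(grad, w0, eta=0.1, steps=100):
    """Repeat the update w' = w - eta * dJ/dw, starting from w0."""
    w = w0
    for _ in range(steps):
        w = w - eta * grad(w)
    return w

def grad_J(w):
    """Gradient of the hypothetical cost J(w) = (w - 3)^2."""
    return 2.0 * (w - 3.0)

# Starting from w = 0, the iterates move towards the minimum of J at w = 3
w_min = gradient_descent(grad_J, w0=0.0)
```

A learning rate that is too large can overshoot the minimum, while one that is too small makes progress slow, so in practice η usually has to be tuned.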


# Neural Networks: training

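Within each epoch (one full pass over the training data), every sample goes through two steps:

1. *Forward pass*: a data sample is passed forward through the network to produce a prediction
2. *Backward pass*: the error between the prediction and the known target output is propagated backwards from the last layer using the chain rule, which gives the gradient for each weight; the weights are then updated. 

Requirement: every element of the neural network must be differentiable, so that these gradients exist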
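
The two passes can be sketched for the smallest possible network, a single sigmoid neuron. This is a minimal illustration rather than the method used in the exercises: the toy OR dataset, the squared-error loss, the learning rate and the number of epochs are assumptions chosen for the example.

```python
import math

def sigmoid(z):
    """Differentiable activation, as required for backpropagation."""
    return 1.0 / (1.0 + math.exp(-z))

def train(data, eta=0.5, epochs=5000):
    """Train one sigmoid neuron with per-sample gradient descent."""
    w, b = [0.0, 0.0], 0.0
    for _ in range(epochs):  # one epoch = one pass over all samples
        for x, y in data:
            # Forward pass: compute the prediction for this sample
            out = sigmoid(w[0] * x[0] + w[1] * x[1] + b)
            # Backward pass: chain rule for loss = 0.5 * (y - out)^2,
            # using d(sigmoid)/dz = out * (1 - out)
            grad_z = -(y - out) * out * (1.0 - out)
            # Gradient descent update of weights and bias
            w = [w_i - eta * grad_z * x_i for w_i, x_i in zip(w, x)]
            b = b - eta * grad_z
    return w, b

# Toy dataset (logical OR), chosen because a single neuron can represent it
or_data = [([0, 0], 0), ([0, 1], 1), ([1, 0], 1), ([1, 1], 1)]
w, b = train(or_data)
predictions = [round(sigmoid(w[0] * x[0] + w[1] * x[1] + b)) for x, _ in or_data]
```

Updating after every sample is only one option; updating once per batch of samples follows the same forward and backward structure.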


![center](../images/model_diagram.gif)


# Forward pass
### > Feed data through network

![center](../images/forward_pass_0.gif)

<sub>*Ryszard Tadeusiewicz "Sieci neuronowe", Kraków 1992*</sub>

# Forward pass
### > Feed data through network


![center](../images/forward_pass_1.gif)

<sub>*Ryszard Tadeusiewicz "Sieci neuronowe", Kraków 1992*</sub>

# Forward pass
### > Error = truth - output
(or, more generally, the value of an error function comparing truth and output)

![center](../images/forward_pass_2.gif)

<sub>*Ryszard Tadeusiewicz "Sieci neuronowe", Kraków 1992*</sub>
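
The forward pass and the error computation can be written out for a small network. This is a sketch under assumptions: the layout (two inputs, one hidden layer of two neurons, one output neuron, no biases), the sigmoid activation `f` and every weight value are hypothetical and not read off the figures.

```python
import math

def f(e):
    """Activation function; a sigmoid is assumed here."""
    return 1.0 / (1.0 + math.exp(-e))

def forward(x, hidden_weights, output_weights):
    """Feed inputs x through one hidden layer and one output neuron."""
    # Each hidden neuron computes f(e) with e = sum of weight * input
    hidden = [f(sum(w * x_i for w, x_i in zip(ws, x))) for ws in hidden_weights]
    # The output neuron takes the hidden activations as its inputs
    output = f(sum(w * h for w, h in zip(output_weights, hidden)))
    return hidden, output

x = [0.5, -1.0]                              # input sample (x_1, x_2)
truth = 1.0                                  # known target for this sample
hidden_weights = [[0.1, 0.4], [-0.3, 0.2]]   # one row of weights per hidden neuron
output_weights = [0.7, -0.5]

hidden, output = forward(x, hidden_weights, output_weights)
error = truth - output                       # Error = truth - output
```

The hidden activations are returned as well because the backward pass needs them when computing the gradients.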

# Backpropagation
### > Local error contribution
![center](../images/backpropagation_0.gif)

<sub>*Ryszard Tadeusiewicz "Sieci neuronowe", Kraków 1992*</sub>

# Backpropagation
### > Local error contribution


![center](../images/backpropagation_1.gif)

<sub>*Ryszard Tadeusiewicz "Sieci neuronowe", Kraków 1992*</sub>

# Backpropagation

### > We want to find the best weights!

#### Update weights using gradient descent

#### $ w'_{(x_1)1} = w_{(x_1)1} - \eta \frac{\partial \text{loss}}{\partial w_{(x_1)1} }$






#### We find $\frac{\partial \text{loss}}{\partial w_{(x_1)1} }$ using the chain rule

#### $ \frac{\partial \text{loss}}{\partial w_{(x_1)1}} = \frac{\partial \text{loss}}{\partial f_1(e)} \frac{\partial f_1(e)}{\partial e} \frac{\partial e}{\partial w_{(x_1)1}} $, where $e=w_{(x_1)1}x_1 + w_{(x_2)1}x_2 $


![center](../images/forward_pass_0.gif)

<sub>*Ryszard Tadeusiewicz "Sieci neuronowe", Kraków 1992*</sub>

# Backpropagation

### > We want to find the best weights!

#### Update weights using gradient descent

#### $ w'_{(x_1)1} = w_{(x_1)1} - \eta \frac{\partial \text{loss}}{\partial w_{(x_1)1} }$

#### We find $\frac{\partial \text{loss}}{\partial w_{(x_1)1} }$ using the chain rule

#### $ \frac{\partial \text{loss}}{\partial w_{(x_1)1}} = \frac{\partial \text{loss}}{\partial f_1(e)} \frac{\partial f_1(e)}{\partial e} \frac{\partial e}{\partial w_{(x_1)1}} $, where $e=w_{(x_1)1}x_1 + w_{(x_2)1}x_2 $

Thus,

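$ \frac{\partial \text{loss}}{\partial w_{(x_1)1}} = -\delta_1 \frac{\partial f_1(e)}{\partial e} x_1 $, where $\delta_1 = -\frac{\partial \text{loss}}{\partial f_1(e)}$ is the local error at neuron 1 and $\frac{\partial e}{\partial w_{(x_1)1}} = x_1$

![center](../images/forward_pass_0.gif)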

<sub>*Ryszard Tadeusiewicz "Sieci neuronowe", Kraków 1992*</sub>
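
A quick way to sanity-check a chain-rule gradient is to compare it with a finite-difference estimate. The sketch below does this for a single neuron, which is a simplification of the network in the figure: the sigmoid activation, the loss $\frac{1}{2}(\text{truth} - f(e))^2$ and all numeric values are assumptions made for the example.

```python
import math

def f(e):
    """Activation function; a sigmoid is assumed here."""
    return 1.0 / (1.0 + math.exp(-e))

def loss(w, x, truth):
    """Squared-error loss of one neuron with e = w_1 x_1 + w_2 x_2."""
    e = w[0] * x[0] + w[1] * x[1]
    return 0.5 * (truth - f(e)) ** 2

def analytic_grad_w1(w, x, truth):
    """Chain rule: d(loss)/d(w_1) = -delta * df/de * x_1."""
    e = w[0] * x[0] + w[1] * x[1]
    out = f(e)
    delta = truth - out          # local error, equal to -d(loss)/d(f(e))
    df_de = out * (1.0 - out)    # derivative of the sigmoid
    return -delta * df_de * x[0]  # de/dw_1 = x_1

def numeric_grad_w1(w, x, truth, h=1e-6):
    """Central finite difference in w_1 only."""
    up = loss([w[0] + h, w[1]], x, truth)
    down = loss([w[0] - h, w[1]], x, truth)
    return (up - down) / (2.0 * h)

w, x, truth = [0.4, -0.7], [1.5, 2.0], 1.0
analytic = analytic_grad_w1(w, x, truth)
numeric = numeric_grad_w1(w, x, truth)
```

If the two values disagree by much more than rounding error, the derivation or its implementation most likely contains a mistake.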


# Backpropagation
### > Optimize weight with local error


![center](../images/backpropagation_2.gif)

<sub>*Ryszard Tadeusiewicz "Sieci neuronowe", Kraków 1992*</sub>

# Backpropagation
### > Optimize weight with local error


![center](../images/backpropagation_3.gif)

<sub>*Ryszard Tadeusiewicz "Sieci neuronowe", Kraków 1992*</sub>

# Conclusion 

* Intuition for neural networks: the perceptron
* Backpropagation

### [Exercise: gradient descent for XOR perceptron](../exercises/01-01_exercises_xor_perceptron.ipynb)


![footer_logo](../images/logo.png)