# **Assignment 2A**

# **Backpropagation Step by Step**

04 MAR 2019

[![backpropagation](http://hmkcode.github.io/images/ai/backpropagation.png)](http://hmkcode.github.io/images/ai/backpropagation.png)

If you are building your own neural network, you will need to understand how to train it. Backpropagation is a commonly used technique for training neural networks. There are many resources explaining the technique, but this post explains backpropagation with a concrete example in detailed, colorful steps.

## Overview

In this post, we will build a neural network with three layers:

- **Input** layer with two input neurons
- One **hidden** layer with two neurons
- **Output** layer with a single neuron

![nn1](http://hmkcode.github.io/images/ai/nn1.png)

## Weights, weights, weights

Neural network training is about finding weights that minimize prediction error. We usually start training with a set of randomly generated weights. Then, backpropagation is used to update the weights in an attempt to correctly map arbitrary inputs to outputs.

Our initial weights are as follows: **`w1 = 0.19`, `w2 = 0.41`, `w3 = 0.23`, `w4 = 0.14`, `w5 = 0.32` and `w6 = 0.78`**

![bp_weights](https://raw.githubusercontent.com/Surya-prakash-v/MLAI/master/bp_weights.png)

## Dataset

Our dataset has one sample with two inputs and one output.

![dataset](http://hmkcode.github.io/images/ai/bp_dataset.png)

Our single sample is as follows: `inputs=[2, 3]` and `output=[1]`.

![training_sample](http://hmkcode.github.io/images/ai/bp_sample.png)

## Forward Pass

We will use the given weights and inputs to predict the output. Inputs are multiplied by weights; the results are then passed forward to the next layer.

![bp_forward](https://raw.githubusercontent.com/Surya-prakash-v/MLAI/master/bp_forward.png)
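The forward pass above can be reproduced in a few lines of NumPy. This is a minimal sketch that assumes the wiring implied by the weight numbering (`w1`, `w2` feed the first hidden neuron, `w3`, `w4` the second, and `w5`, `w6` the output) and no activation function; the names `W_hidden` and `w_out` are illustrative only.

```python
import numpy as np

inputs = np.array([2.0, 3.0])            # i1, i2

# Each column holds the incoming weights of one hidden neuron (assumed layout).
W_hidden = np.array([[0.19, 0.23],       # w1, w3
                     [0.41, 0.14]])      # w2, w4
w_out = np.array([0.32, 0.78])           # w5, w6

hidden = inputs @ W_hidden               # h1, h2 (approx. 1.61 and 0.88)
prediction = hidden @ w_out              # approx. 1.2016

print(hidden)
print(prediction)
```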

## Calculating Error

Now it’s time to find out how our network performed by calculating the difference between the actual output and the predicted one. Our network output, or **prediction**, is not equal to the **actual output**. We can calculate the difference, or error, as follows.

![bp_error](https://raw.githubusercontent.com/Surya-prakash-v/MLAI/master/bp_error.png)
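As a quick numeric check, the error for the first forward pass can be computed directly. The sketch assumes the halved squared-error form used by the code in Assignment 2B, `error = (prediction - actual)^2 / 2`; the variable names are illustrative.

```python
prediction = 1.2016   # output of the first forward pass
actual = 1.0          # target output of the training sample

# Halved squared error: the 1/2 factor cancels when differentiating.
error = 0.5 * (prediction - actual) ** 2
print(round(error, 8))  # 0.02032128
```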

## Reducing Error

The main goal of training is to reduce the **error**, the difference between the **prediction** and the **actual output**. Since the **actual output** is constant, “not changing”, the only way to reduce the error is to change the **prediction** value. The question now is: how do we change the **prediction** value?

By decomposing **prediction** into its basic elements we can find that **weights** are the variable elements affecting **prediction** value. In other words, in order to change **prediction** value, we need to change **weights** values.

![bp_prediction_elements](http://hmkcode.github.io/images/ai/bp_prediction_elements.png)

> The question now is **how to change/update the weight values so that the error is reduced?**
> The answer is **Backpropagation!**

## **Backpropagation**

**Backpropagation**, short for “backward propagation of errors”, is a mechanism used to update the **weights** using [gradient descent](https://en.wikipedia.org/wiki/Gradient_descent). It calculates the gradient of the error function with respect to the neural network’s weights. The calculation proceeds backwards through the network.

> **Gradient descent** is an iterative optimization algorithm for finding the minimum of a function; in our case we want to minimize the error function. To find a local minimum of a function using gradient descent, one takes steps proportional to the negative of the gradient of the function at the current point.

![bp_update_formula](http://hmkcode.github.io/images/ai/bp_update_formula.png)
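To make the update rule concrete, here is a minimal sketch of gradient descent on a one-variable function. The function `f(x) = (x - 3)**2`, the starting point, and the learning rate are arbitrary choices for illustration and are not part of the network in this post.

```python
def gradient(x):
    # Derivative of f(x) = (x - 3)**2
    return 2 * (x - 3)

x = 0.0               # arbitrary starting point
learning_rate = 0.1   # arbitrary step size

for step in range(50):
    # Step against the gradient, scaled by the learning rate.
    x = x - learning_rate * gradient(x)

print(x)  # close to 3, the minimum of f
```

Each step moves `x` a fraction of the way toward the minimum; the weight updates in the network follow the same pattern, with one partial derivative per weight.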

For example, to update `w6`, we take the current `w6` and subtract the partial derivative of the **error** function with respect to `w6`. In practice, we multiply the derivative by a small positive number, called the **learning rate**, which controls the step size so that the update does not overshoot the minimum of the error function.

![update w6](http://hmkcode.github.io/images/ai/bp_w6_update.png)

The derivative of the error function is evaluated by applying the chain rule as follows:

![finding partial derivative with respect to w6](http://hmkcode.github.io/images/ai/bp_error_function_partial_derivative_w6.png)
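Written out, the chain rule for `w6` looks as follows. This is a sketch that assumes the halved squared error $E = \tfrac{1}{2}(\text{prediction} - \text{actual})^2$ and a linear output $\text{prediction} = h_1 w_5 + h_2 w_6$, matching the code in Assignment 2B; $\Delta$ is used here as shorthand for $\text{prediction} - \text{actual}$.

$$
\frac{\partial E}{\partial w_6}
= \frac{\partial E}{\partial\, \text{prediction}} \cdot \frac{\partial\, \text{prediction}}{\partial w_6}
= (\text{prediction} - \text{actual}) \cdot h_2
= \Delta \, h_2
$$

With the numbers from the first forward pass, $\Delta = 1.2016 - 1 = 0.2016$ and $h_2 = 0.88$, so $\partial E / \partial w_6 = 0.2016 \times 0.88 = 0.177408$.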

So, to update `w6`, we can apply the following formula:

![bp_w6_update_closed_form.png](http://hmkcode.github.io/images/ai/bp_w6_update_closed_form.png)

Similarly, we can derive the update formula for `w5` and any other weights existing between the output and the hidden layer.

![bp_w5_update_closed_form.png](http://hmkcode.github.io/images/ai/bp_w5_update_closed_form.png)

However, when moving backward to update `w1`, `w2`, `w3` and `w4`, which sit between the input and hidden layers, the chain rule gains one more factor. The partial derivative of the error function with respect to `w1`, for example, is as follows.

![finding partial derivative with respect to w1](http://hmkcode.github.io/images/ai/bp_error_function_partial_derivative_w1.png)
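In symbols, the extra factor comes from passing through the hidden neuron. The sketch below assumes $h_1 = i_1 w_1 + i_2 w_2$ and $\text{prediction} = h_1 w_5 + h_2 w_6$, with $\Delta = \text{prediction} - \text{actual}$ as shorthand; these definitions are inferred from the code in Assignment 2B.

$$
\frac{\partial E}{\partial w_1}
= \frac{\partial E}{\partial\, \text{prediction}} \cdot \frac{\partial\, \text{prediction}}{\partial h_1} \cdot \frac{\partial h_1}{\partial w_1}
= \Delta \cdot w_5 \cdot i_1
$$

For the first pass this gives $0.2016 \times 0.32 \times 2 = 0.129024$.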

We can find the update formula for the remaining weights `w2`, `w3` and `w4` in the same way.

In summary, the update formulas for all weights are as follows:

![bp_update_all_weights](http://hmkcode.github.io/images/ai/bp_update_all_weights.png)

We can rewrite the update formulas in matrix form as follows:

![bp_update_all_weights_matrix](http://hmkcode.github.io/images/ai/bp_update_all_weights_matrix.png)
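One way to gain confidence in the derived formulas is a numerical gradient check: compare the analytic gradients with central finite differences of the error. The sketch below assumes the same weight layout and halved squared error as the code in Assignment 2B; the helper `error` and the step size `eps` are illustrative choices.

```python
import numpy as np

inputs = np.array([2.0, 3.0])
actual = 1.0
W_hidden = np.array([[0.19, 0.23], [0.41, 0.14]])  # columns: (w1, w2) and (w3, w4)
w_out = np.array([0.32, 0.78])                      # w5, w6

def error(W_hidden, w_out):
    prediction = (inputs @ W_hidden) @ w_out
    return 0.5 * (prediction - actual) ** 2

# Analytic gradients in matrix form.
hidden = inputs @ W_hidden
delta = hidden @ w_out - actual
grad_out = delta * hidden                      # dE/dw5, dE/dw6
grad_hidden = delta * np.outer(inputs, w_out)  # dE/dw1 ... dE/dw4

# Central finite differences for comparison.
eps = 1e-6
num_out = np.zeros_like(w_out)
for k in range(w_out.size):
    step = np.zeros_like(w_out)
    step[k] = eps
    num_out[k] = (error(W_hidden, w_out + step) - error(W_hidden, w_out - step)) / (2 * eps)

num_hidden = np.zeros_like(W_hidden)
for idx in np.ndindex(W_hidden.shape):
    step = np.zeros_like(W_hidden)
    step[idx] = eps
    num_hidden[idx] = (error(W_hidden + step, w_out) - error(W_hidden - step, w_out)) / (2 * eps)

print(np.allclose(grad_out, num_out), np.allclose(grad_hidden, num_hidden))
```

If both comparisons agree, the matrix expressions `delta * hidden` and `delta * outer(inputs, w_out)` are consistent with the error function.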

## Backward Pass

Using the derived formulas, we can find the new **weights**.

> **Learning rate** is a hyperparameter, which means that we choose its value manually rather than learn it during training. The code in Assignment 2B uses `0.05`.

![bp_new_weights](https://raw.githubusercontent.com/Surya-prakash-v/MLAI/master/bp_new_weights.png)

Now, using the new **weights**, we repeat the forward pass.

![bp_forward_2](https://raw.githubusercontent.com/Surya-prakash-v/MLAI/master/bp_forward_2.png)

We can see that the **prediction `1.07709`** is closer to the **actual output** than the previous prediction **`1.20`**. We can repeat the same process of backward and forward passes until the **error** is close or equal to zero.
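Repeating the forward and backward passes can be written as a short loop. This is a minimal sketch, not the original author's code: the stopping tolerance `1e-6` and the cap of 100 iterations are arbitrary assumptions, and the learning rate `0.05` is taken from Assignment 2B.

```python
import numpy as np

inputs = np.array([2.0, 3.0])
actual = 1.0
W_hidden = np.array([[0.19, 0.23], [0.41, 0.14]])  # columns: (w1, w2) and (w3, w4)
w_out = np.array([0.32, 0.78])                      # w5, w6
learning_rate = 0.05

for iteration in range(100):
    # Forward pass
    hidden = inputs @ W_hidden
    prediction = hidden @ w_out
    delta = prediction - actual
    if abs(delta) < 1e-6:   # assumed stopping tolerance
        break

    # Backward pass: compute both gradients before changing any weight.
    grad_out = delta * hidden
    grad_hidden = delta * np.outer(inputs, w_out)
    w_out = w_out - learning_rate * grad_out
    W_hidden = W_hidden - learning_rate * grad_hidden

print(iteration, prediction)
```

Both gradients are computed from the old weights before either layer is updated, which mirrors the single step shown above.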




-----

# **Assignment 2B**

import numpy as np

inputs = np.array([2,3])
weights = np.array([[0.19,0.23,0.32],[0.41,0.14,0.78]])
actualOutputValue = 1

def forwardPass(weights):
  print("input")
  print(inputs)
  
  #layer1
  hiddenLayerActivations = np.dot(inputs,weights[:,:2])
  print("hidden Layer Activations :")
  print(hiddenLayerActivations)
  
  #layer2
  output = np.dot(hiddenLayerActivations,weights[:,2])
  print("predicted output :" + str(output))
  
  return hiddenLayerActivations,output

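def backPropagation(hiddenLayerActivations,output):
  #note: reads the initial global weights and inputs defined above
  learningRate = 0.05
  #halved squared error between the target and the prediction
  error = ((actualOutputValue-output.item(0))**2)/2
  print(" error : "+ str(error))
  
  #derivative of the error with respect to the prediction
  delta = output.item(0) - actualOutputValue
  print(" delta : "+ str(delta))
  
  newWeights = np.zeros((2,3))
  #layer2: gradient for w5, w6 is delta * hidden activations
  newWeights[:,2] = weights[:,2] - learningRate * delta * hiddenLayerActivations
  #layer1: gradient for w1..w4 is delta * outer(inputs, [w5, w6])
  newWeights[:,:2] = weights[:,:2] - learningRate * delta * np.outer(inputs,weights[:,2])
  
  return newWeights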

#Forward pass 1
print("================================= Forward pass 1 =====================================")
(hiddenLayerActivations,output) = forwardPass(weights)

#back propagation
print("================================ Back propagation ====================================")
newWeights = backPropagation(hiddenLayerActivations,output)
print(" New updated weights :")
print(newWeights)

#Forward pass2
print("================================= Forward pass 2 =====================================")
(hiddenLayerActivations,output) = forwardPass(newWeights)

input
[2 3]
hidden Layer Activations
[1.61 0.88]
predicted output :1.2016
 error : 0.02032128
 delta : 0.2016
 New updated weights :
[[0.1835488 0.2142752 0.3037712]
 [0.4003232 0.1164128 0.7711296]]
input
[2 3]
hidden Layer Activations
[1.5680672 0.7777888]
predicted output :1.0761096212531203
