# Chapter 8

#### Backpropagation

<img src="backPropFormula.png"/>

In [7]:
# We will write code for the backpropagation method.
# Recall that the CHAIN RULE gives the derivative of a nested function as the derivative
# of the outer function multiplied by the derivative of the function inside it.
# We store this result, called the gradient, and multiply it by the local derivative
# with respect to the previous layer's output as we move backward.

# For simplicity, we assume that the gradient we received from the
# next layer is 1, since multiplying by one won't change anything.

# Let's code now

# FORWARD PASS
x = [1.0, -2.0, 3.0]    # inputs
w = [-3.0, -1.0, 2.0]   # weights
b = 1.0                 # bias

xw0 = x[0] * w[0]
xw1 = x[1] * w[1]
xw2 = x[2] * w[2]

# output of dense layer
sum = xw0 + xw1 + xw2 + b

# ReLU function
y = max(sum, 0)

# BACKWARD PASS

# The gradient received from the next layer (assumed to be 1, as noted above)
dvalue = 1.0

# One important thing to note here is that the derivative of the ReLU function is
# 1 if the input is greater than 0, else 0
drelu_dsum = dvalue * (1.0 if sum > 0 else 0.0)
print(drelu_dsum)

# Another important thing to note is that the partial derivative of a sum
# with respect to any one of its terms is always 1, no matter the inputs

# for x0w0 pair
dsum_dmulxw0 = 1
drelu_dmulxw0 = drelu_dsum * dsum_dmulxw0

# for x1w1 pair
dsum_dmulxw1 = 1
drelu_dmulxw1 = drelu_dsum * dsum_dmulxw1

# for x2w2 pair
dsum_dmulxw2 = 1
drelu_dmulxw2 = drelu_dsum * dsum_dmulxw2

# for b
dsum_db = 1
drelu_db = drelu_dsum * dsum_db

print(dsum_dmulxw0, dsum_dmulxw1, dsum_dmulxw2, drelu_db)

# Continuing with the backpropagation
# One more important thing to note here is that the partial derivative of
# a product with respect to one factor is the other factor.
# For example:
# d(x*y)/d(x) = y
# d(x*y)/d(y) = x
# Therefore, the partial derivative of the first weighted input w.r.t. the input equals the weight,
# and its partial derivative w.r.t. the weight equals the input.

# with respect to x values
dmulxw0_dx0 = w[0]
drelu_dx0 = drelu_dmulxw0 * dmulxw0_dx0

dmulxw1_dx1 = w[1]
drelu_dx1 = drelu_dmulxw1 * dmulxw1_dx1

dmulxw2_dx2 = w[2]
drelu_dx2 = drelu_dmulxw2 * dmulxw2_dx2

# with respect to w values (weights)
dmulxw0_dw0 = x[0]
drelu_dw0 = drelu_dmulxw0 * dmulxw0_dw0

dmulxw1_dw1 = x[1]
drelu_dw1 = drelu_dmulxw1 * dmulxw1_dw1

dmulxw2_dw2 = x[2]
drelu_dw2 = drelu_dmulxw2 * dmulxw2_dw2
print(drelu_dx0, drelu_dx1, drelu_dx2, drelu_dw0, drelu_dw1, drelu_dw2)


1.0
1 1 1 1.0
-3.0 -1.0 2.0 1.0 -2.0 3.0
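
The printed gradients can be traced by hand through the chain rule. As a worked example for the first input and the first weight, where $z$ is notation introduced here for the pre-activation value (the variable `sum` in the code):

$$
z = x_0 w_0 + x_1 w_1 + x_2 w_2 + b = (-3) + 2 + 6 + 1 = 6 > 0
$$

$$
\frac{\partial\,\mathrm{ReLU}(z)}{\partial x_0}
= \frac{d\,\mathrm{ReLU}(z)}{dz}\cdot\frac{\partial z}{\partial (x_0 w_0)}\cdot\frac{\partial (x_0 w_0)}{\partial x_0}
= 1 \cdot 1 \cdot w_0 = -3
$$

$$
\frac{\partial\,\mathrm{ReLU}(z)}{\partial w_0}
= \frac{d\,\mathrm{ReLU}(z)}{dz}\cdot\frac{\partial z}{\partial (x_0 w_0)}\cdot\frac{\partial (x_0 w_0)}{\partial w_0}
= 1 \cdot 1 \cdot x_0 = 1
$$

These match the first and fourth values on the last output line (`drelu_dx0` and `drelu_dw0`).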


<img src="./drelu_dz_final1.png" height="300px"/>
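
One way to gain confidence in hand-derived gradients is to compare them against a numerical estimate. The sketch below is not part of the original notebook: the helper names `neuron` and `numeric_grad` and the step size `eps` are assumptions made for illustration. It perturbs each input and weight slightly and measures how the neuron's output changes (a central finite difference).

```python
# Finite-difference check of the analytic gradients.
# A sketch only: `neuron`, `numeric_grad`, and `eps` are hypothetical names/values.

x = [1.0, -2.0, 3.0]    # inputs
w = [-3.0, -1.0, 2.0]   # weights
b = 1.0                 # bias
eps = 1e-6              # small perturbation (arbitrary choice for this sketch)


def neuron(x, w, b):
    """Forward pass of the single neuron: ReLU(x . w + b)."""
    z = x[0] * w[0] + x[1] * w[1] + x[2] * w[2] + b
    return max(z, 0.0)


def numeric_grad(index, wrt):
    """Central difference of neuron() w.r.t. x[index] or w[index]."""
    values = x if wrt == "x" else w
    plus = list(values)
    minus = list(values)
    plus[index] += eps
    minus[index] -= eps
    if wrt == "x":
        return (neuron(plus, w, b) - neuron(minus, w, b)) / (2 * eps)
    return (neuron(x, plus, b) - neuron(x, minus, b)) / (2 * eps)


# Round to suppress floating-point noise from the division by a tiny eps
dx = [round(numeric_grad(i, "x"), 4) for i in range(3)]
dw = [round(numeric_grad(i, "w"), 4) for i in range(3)]
print(dx, dw)
```

The rounded estimates should agree with the values of `drelu_dx0` ... `drelu_dw2` printed earlier. Agreement is expected to be very close here because the pre-activation value is well away from 0, where ReLU has a kink.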

In [None]:
# The above completes our code for backpropagation for a single neuron.
# We can optimize the code a bit
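
The notebook does not show the optimized version, so the following is only one possible simplification, under the assumption that "optimize" means removing the intermediate variables. Because the derivative of the sum with respect to each term is 1, those factors can be dropped and each gradient becomes the ReLU gradient times the other factor of the product. The names `z`, `drelu_dz`, `dx`, `dw`, and `db` are chosen here for illustration (`z` also avoids shadowing Python's built-in `sum`).

```python
# Condensed backward pass for the single neuron (a sketch, not the author's code).

x = [1.0, -2.0, 3.0]    # inputs
w = [-3.0, -1.0, 2.0]   # weights
b = 1.0                 # bias

# Forward pass
z = x[0] * w[0] + x[1] * w[1] + x[2] * w[2] + b
y = max(z, 0.0)

# Backward pass
dvalue = 1.0                                    # gradient from the next layer
drelu_dz = dvalue * (1.0 if z > 0 else 0.0)     # ReLU derivative, chained

# d(z)/d(x_i * w_i) = 1, so each gradient is drelu_dz times the other factor
dx = [drelu_dz * wi for wi in w]    # gradients w.r.t. the inputs
dw = [drelu_dz * xi for xi in x]    # gradients w.r.t. the weights
db = drelu_dz                       # gradient w.r.t. the bias

print(dx, dw, db)
```

This computes the same seven gradients as the step-by-step cell above, just without naming every intermediate derivative.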

