# Lab 5. Vector and Matrix Backpropagation
by Domrachev Ivan, B20-Ro-01

In [1]:
from nn_from_scratch.nodes import SoftMax, ReLU
from nn_from_scratch.neurons import Linear
import numpy as np

## Part 1. Vectorized softmax function

> Note: I had already implemented this functionality in the previous lab assignment, so the corresponding chapter is copied here.

All the logic is implemented in the file [backprop.py](./backprop.py). The abstract class `Node` implements all the shared logic except for the function itself and the computation of its Jacobian matrix.

In [2]:
sm = SoftMax(5)

# Random input values (x itself, and assumed partial derivative)
x_input = np.array([0, 1, 2, 3, 4])
dL_dy = np.array([1, 2, 3, 2, 1])

# Forward call
y_value = sm.forward(x_input)

# Backpropagation
dL_dx = sm.backward(dL_dy)

y_value, dL_dx

(array([0.01165623, 0.03168492, 0.08612854, 0.23412166, 0.63640865]),
 array([-0.00510617,  0.01780491,  0.1345273 ,  0.13156147, -0.27878751]))

The verification was described in the previous lab. Neither the code nor its outputs have changed, so it is omitted here.
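
As a cross-check that does not depend on the `SoftMax` class, here is a minimal NumPy sketch of the same forward and backward pass. The function names `softmax_forward` and `softmax_backward` are hypothetical and are not part of `nn_from_scratch`; the sketch assumes the standard softmax Jacobian $J_{i, j} = y_i (\delta_{i, j} - y_j)$.

```python
import numpy as np


def softmax_forward(x):
    """Softmax of a 1-D array (hypothetical helper, not from nn_from_scratch)."""
    shifted = x - np.max(x)  # subtracting the max leaves the result unchanged
    exp = np.exp(shifted)
    return exp / exp.sum()


def softmax_backward(y, dL_dy):
    """Propagate dL/dy through softmax using the explicit Jacobian."""
    jacobian = np.diag(y) - np.outer(y, y)  # J[i, j] = y_i * (delta_ij - y_j)
    return jacobian.T @ dL_dy  # dL/dx_j = sum_i dL/dy_i * J[i, j]


# Same inputs as in the cell above
x_input = np.array([0, 1, 2, 3, 4])
dL_dy = np.array([1, 2, 3, 2, 1])

y_value = softmax_forward(x_input)
dL_dx = softmax_backward(y_value, dL_dy)
print(y_value)
print(dL_dx)
```

With the inputs from the cell above, this sketch should agree with the printed `y_value` and `dL_dx` up to rounding.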

## Part 2. ReLU

The function ReLU and its partial derivative were implemented according to the lecture. Note that although the derivative $\frac{\partial ReLU}{\partial x} (0)$ is undefined, it is set to $0$ in my implementation.

In [3]:
relu = ReLU(5)

# Random input values (x itself, and assumed partial derivative)
x_input = np.array([0, -1, 2, -1, 4])
dL_dy = np.array([1, 2, 3, 2, 1])

# Forward call
y_value = relu.forward(x_input)

# Backpropagation
dL_dx = relu.backward(dL_dy)

y_value, dL_dx

(array([0, 0, 2, 0, 4]), array([0., 0., 3., 0., 1.]))

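This example shows that `ReLU` behaves as expected: zero and negative inputs are mapped to $0$ and pass no gradient, while positive inputs pass through with their gradient unchanged.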
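
The same behaviour can be sketched in plain NumPy. The helpers `relu_forward` and `relu_backward` below are hypothetical names, not the API of the `ReLU` class; the sketch assumes the convention stated above, i.e. the derivative at $0$ is taken to be $0$.

```python
import numpy as np


def relu_forward(x):
    """Element-wise max(x, 0) (hypothetical helper)."""
    return np.maximum(x, 0)


def relu_backward(x, dL_dy):
    """Pass the gradient only where the input was strictly positive."""
    mask = (x > 0).astype(float)  # derivative: 1 for x > 0, 0 otherwise (including x == 0)
    return mask * dL_dy


# Same inputs as in the cell above
x_input = np.array([0, -1, 2, -1, 4])
dL_dy = np.array([1, 2, 3, 2, 1])

y_value = relu_forward(x_input)
dL_dx = relu_backward(x_input, dL_dy)
print(y_value, dL_dx)
```

Because the Jacobian of an element-wise function is diagonal, multiplying by a mask is equivalent to multiplying by the full Jacobian matrix.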

## Part 3. Linear layer

To implement the linear node $Y = WX$ in my structure, I made the following design decisions:
1. The weight matrix `W` is an internal parameter of the class, accessible via a getter. Its partial derivative is also stored inside the class.
2. Although there are functions that compute the derivatives w.r.t. `W` and `X`, I overrode the function `forward` with dot products from `numpy`, because this approach is more efficient.

You can explore the implementation in the `backprop.py` file.
> *Note. To compute the Jacobians $\frac{\partial Y}{\partial X}$ and $\frac{\partial Y}{\partial W}$, the key observation is the following, where $Y_{i, j} = \sum_l W_{i, l} X_{l, j}$:*
> 1. *$\frac{\partial Y_{i, j}}{\partial X}$ is a zero matrix, except for the $j$-th column, whose entries are:*
> $$\frac{\partial Y_{i, j}}{\partial X_{l, j}} = W_{i, l}$$
> 2. *Similarly, $\frac{\partial Y_{i, j}}{\partial W}$ is a zero matrix, except for the $i$-th row, whose entries are:*
> $$\frac{\partial Y_{i, j}}{\partial W_{i, l}} = X_{l, j}$$
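
The sparsity structure of these Jacobians can be checked with a small standalone sketch. It assumes $Y = WX$ with $Y_{i, j} = \sum_l W_{i, l} X_{l, j}$, so that $\frac{\partial Y_{i, j}}{\partial X_{l, m}} = W_{i, l} \delta_{j, m}$ and $\frac{\partial Y_{i, j}}{\partial W_{l, m}} = \delta_{i, l} X_{m, j}$. The shapes and variable names are illustrative assumptions and need not match how the `Linear` class stores `W` internally.

```python
import numpy as np

rng = np.random.default_rng(0)
n_out, n_in, n_cols = 5, 2, 3  # assumed shapes: W is (5, 2), X is (2, 3), Y is (5, 3)
W = rng.random((n_out, n_in))
X = rng.random((n_in, n_cols))
dL_dY = rng.random((n_out, n_cols))

# 4-D Jacobians, indexed as [i, j, l, m]
# dY[i, j] / dX[l, m] = W[i, l] * delta(j, m)
jac_X = np.einsum("il,jm->ijlm", W, np.eye(n_cols))
# dY[i, j] / dW[l, m] = delta(i, l) * X[m, j]
jac_W = np.einsum("il,mj->ijlm", np.eye(n_out), X)

# Chain rule: contract dL/dY with each Jacobian over the output indices (i, j)
dL_dX_jac = np.einsum("ij,ijlm->lm", dL_dY, jac_X)
dL_dW_jac = np.einsum("ij,ijlm->lm", dL_dY, jac_W)

# Closed-form matrix expressions for the same gradients
dL_dX = W.T @ dL_dY
dL_dW = dL_dY @ X.T

print(np.allclose(dL_dX_jac, dL_dX), np.allclose(dL_dW_jac, dL_dW))
```

The full Jacobians hold `n_out * n_cols * n_in * n_cols` and `n_out * n_cols * n_out * n_in` entries, most of them zero, which is why the matrix-product form is the cheaper way to compute the same gradients.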

In [4]:
input_dim = (2, 3)
output_dim = (5, 3)
# The variable name `relu` is reused from the previous cell; here it holds a Linear layer
relu = Linear(input_dim, output_dim)

# Random input values (x itself, and assumed partial derivative)
x_input = np.random.rand(*input_dim)
dL_dy = np.random.rand(*output_dim)

# Forward call
y_value = relu.forward(x_input)

# Backpropagation
dL_dx = relu.backward(dL_dy)
dL_dw = relu._W_pd

print("Forward output:")
print(y_value)
print("Jacobian dL/dx:")
print(dL_dx)
print("Jacobian dL/dw:")
print(dL_dw)

Forward output:
[[0.69166497 1.05687395 0.66806232]
 [0.66735337 0.96454763 0.65531636]
 [0.75730103 1.06981767 0.73763436]
 [0.68926678 1.05670438 0.67034384]
 [0.74321974 1.12798567 0.72313718]]
Jacobian dL/dx:
[[1.89233349 0.84690493 0.92517105]
 [1.81512837 0.78568075 0.86684344]]
Jacobian dL/dw:
[[1.49979587 0.37961463 0.43178553]
 [1.0411152  0.20627368 0.24631968]
 [2.09897645 0.47136409 0.63882777]
 [1.38170136 0.16064022 0.31876463]
 [0.98934576 0.31704245 0.34003937]]


Also note that computing the backpropagation via Jacobians leads to the same result:

In [5]:
dL_dx_jac = relu.backward(dL_dy, use_jacobian=True)
dL_dw_jac = relu._W_pd

print("Jacobian dL/dx:")
print(dL_dx_jac)

print("Jacobian dL/dw")
print(dL_dw_jac)

Jacobian dL/dx:
[[1.89233349 0.84690493 0.92517105]
 [1.81512837 0.78568075 0.86684344]]
Jacobian dL/dw
[[1.49979587 0.37961463 0.43178553]
 [1.0411152  0.20627368 0.24631968]
 [2.09897645 0.47136409 0.63882777]
 [1.38170136 0.16064022 0.31876463]
 [0.98934576 0.31704245 0.34003937]]
