# Computational Graph

Databases are visualised with ER diagrams and similar tools.

AI models are visualised with a graph as well: the computational graph.

- Deterministic (neural networks, NN)
- Stochastic (Bayesian)

## Partial derivatives

Three methods for computing them:
- Numerical derivatives

$$
\frac{\partial f}{\partial x} = \lim_{\Delta x \rightarrow 0} \frac{f(x + \Delta x) - f(x)}{\Delta x}
$$

- Symbolic derivatives

$$
\frac{d}{dx}\cos(x) = -\sin(x)
$$

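- Automatic derivatives = numerical derivatives + symbolic derivatives: the derivative rule of each elementary operation is known symbolically, and these rules are evaluated numerically at the current values and combined with the chain rule.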
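
The first two methods can be compared on the $\cos$ example. This is a minimal sketch: the helper `numerical_derivative`, the step `dx = 1e-6` and the point `x0 = 1.0` are illustrative choices, not part of the slides. The limit is replaced by a small but finite $\Delta x$, so the numerical value is only an approximation of the symbolic one.

```python
import math

def numerical_derivative(f, x, dx=1e-6):
    """Forward-difference approximation of df/dx with a small, finite dx."""
    return (f(x + dx) - f(x)) / dx

x0 = 1.0  # arbitrary evaluation point (assumption)
numeric = numerical_derivative(math.cos, x0)  # approximate value
symbolic = -math.sin(x0)                      # exact rule for cos

print(f"numerical: {numeric:.6f}")
print(f"symbolic:  {symbolic:.6f}")
```

The two values agree to several decimal places; the remaining gap comes from using a finite $\Delta x$.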

## Automatic differentiation method

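Consider the chain $x \rightarrow y \rightarrow z \rightarrow w$.

Forward differentiation starts from the seed $\frac{\partial x}{\partial x} = 1$ and accumulates the factors from the input towards the output (left to right as written):

$$
\frac{\partial w}{\partial x} = \frac{\partial x}{\partial x}\frac{\partial y}{\partial x}\frac{\partial z}{\partial y}\frac{\partial w}{\partial z}
$$

Backward differentiation starts from the seed $\frac{\partial w}{\partial w} = 1$ and accumulates the factors from the output back towards the input (right to left as written):

$$
\frac{\partial w}{\partial x} = \frac{\partial y}{\partial x}\frac{\partial z}{\partial y}\frac{\partial w}{\partial z}\frac{\partial w}{\partial w}
$$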
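
Both orders give the same number; they differ only in which end of the chain the accumulation starts from. The sketch below makes this concrete. The functions $y = x^2$, $z = \sin(y)$, $w = 3z$ and the point $x = 0.5$ are hypothetical, chosen only for illustration.

```python
import math

# Hypothetical chain x -> y -> z -> w (not from the slides).
x = 0.5
y = x ** 2        # y = x^2
z = math.sin(y)   # z = sin(y)
w = 3.0 * z       # w = 3z

# Local derivative of each step, evaluated at the current values.
dy_dx = 2.0 * x
dz_dy = math.cos(y)
dw_dz = 3.0

# Forward: seed dx/dx = 1, then move from the input towards the output.
acc = 1.0            # dx/dx
acc = acc * dy_dx    # dy/dx
acc = acc * dz_dy    # dz/dx
dw_dx_forward = acc * dw_dz

# Backward: seed dw/dw = 1, then move from the output towards the input.
acc = 1.0            # dw/dw
acc = acc * dw_dz    # dw/dz
acc = acc * dz_dy    # dw/dy
dw_dx_backward = acc * dy_dx

print(dw_dx_forward, dw_dx_backward)
```

In forward mode every intermediate value is a derivative with respect to the input $x$; in backward mode every intermediate value is a derivative of the output $w$.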

Example #1 (slide)

In [1]:
import torch

In [2]:
a = torch.tensor(2., requires_grad=True)
b = torch.tensor(1., requires_grad=True)
c = a + b  # c = 3
d = b + 1  # d = 2
e = c * d  # e = 6

In [3]:
e.backward()

In [4]:
print(f"de/da = {a.grad}")
print(f"de/db = {b.grad}")

de/da = 2.0
de/db = 5.0
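
These two values can be checked by hand with the chain rule, using $c = a + b = 3$ and $d = b + 1 = 2$. Note that $b$ reaches $e$ along two paths (through $c$ and through $d$), so the two contributions are summed:

$$
\frac{\partial e}{\partial a} = \frac{\partial e}{\partial c}\frac{\partial c}{\partial a} = d \cdot 1 = 2
$$

$$
\frac{\partial e}{\partial b} = \frac{\partial e}{\partial c}\frac{\partial c}{\partial b} + \frac{\partial e}{\partial d}\frac{\partial d}{\partial b} = d \cdot 1 + c \cdot 1 = 2 + 3 = 5
$$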


Example #2

<img src="hand-calc.png">

In [5]:
x1 = torch.tensor(2., requires_grad = True)
w1 = torch.tensor(-3., requires_grad = True)
x2 = torch.tensor(4., requires_grad = True)
w2 = torch.tensor(-1., requires_grad = True)

In [6]:
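a = x1 + w1  # a = -1
b = x2 + w1  # b = 1
c = x2 + w2  # c = 3

d = a * b            # d = -1
e = torch.max(b, c)  # e = 3; c is larger, so the gradient flows only through c
y = e + d            # y = 2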

In [7]:
y.backward()

In [8]:
x1.grad, w1.grad, x2.grad, w2.grad

(tensor(1.), tensor(0.), tensor(0.), tensor(1.))
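
As a sanity check, the same gradients can be approximated without autograd by finite differences. This is a sketch in plain Python: `y_fn` and `central_difference` are hypothetical helper names, and the step `h = 1e-6` is an arbitrary choice. It works here because `b` and `c` are not equal at this point, so `max` is differentiable there.

```python
def y_fn(x1, w1, x2, w2):
    """Same computation as the cells above, in plain Python."""
    a = x1 + w1
    b = x2 + w1
    c = x2 + w2
    d = a * b
    e = max(b, c)
    return e + d

def central_difference(f, args, i, h=1e-6):
    """Approximate the partial derivative of f with respect to args[i]."""
    upper = list(args)
    lower = list(args)
    upper[i] += h
    lower[i] -= h
    return (f(*upper) - f(*lower)) / (2 * h)

point = (2.0, -3.0, 4.0, -1.0)  # x1, w1, x2, w2
grads = [central_difference(y_fn, point, i) for i in range(4)]
print(grads)
```

Up to floating-point noise, the four values match `x1.grad, w1.grad, x2.grad, w2.grad` above.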

## Tensor

### Using NumPy

<img src="hand-calc-2.png">

Note: the values written in green in the image are my mistakes; ignore them.

In [9]:
import numpy as np

In [10]:
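x = np.matrix([[1, 1, -1], [0, -1, 1]])  # input, shape (2, 3)
w1 = np.matrix([[1, 0], [1, 2]])         # first weights, shape (2, 2)
w2 = np.matrix([1, -2])                  # second weights, shape (1, 2)
y = np.matrix([1, -2, -1])               # target, shape (1, 3)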

In [11]:
a = w1 @ x                   # shape (2, 3)
b = w2 @ a                   # shape (1, 3)
e = y - b                    # error
cost = np.sum(np.square(e))  # sum of squared errors

In [12]:
a, b, e

(matrix([[ 1,  1, -1],
         [ 1, -1,  1]]),
 matrix([[-1,  3, -3]]),
 matrix([[ 2, -5,  2]]))

In [13]:
cost

33

In [14]:
dcde = 2 * e

In [15]:
dcde

matrix([[  4, -10,   4]])

In [16]:
dcdw2 = -dcde @ a.T  # dc/db = -dc/de because e = y - b

In [17]:
dcdw2

matrix([[ 10, -18]])

In [18]:
dcda = w2.T @ -dcde  # back through b = w2 @ a

In [19]:
dcda

matrix([[ -4,  10,  -4],
        [  8, -20,   8]])

In [20]:
dcdw1 = dcda @ x.T

In [21]:
dcdw1

matrix([[ 10, -14],
        [-20,  28]])

In [22]:
dcdx = w1.T @ dcda

In [23]:
dcdx

matrix([[  4, -10,   4],
        [ 16, -40,  16]])

In [24]:
dcdy = dcde

In [25]:
dcdx, dcdw1, dcdy, dcdw2

(matrix([[  4, -10,   4],
         [ 16, -40,  16]]),
 matrix([[ 10, -14],
         [-20,  28]]),
 matrix([[  4, -10,   4]]),
 matrix([[ 10, -18]]))
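
The matrix products used in the cells above follow from the chain rule applied to $c = \sum e^2$, $e = y - b$, $b = w_2 a$, $a = w_1 x$, with every gradient written in the same shape as the variable it belongs to:

$$
\frac{\partial c}{\partial e} = 2e, \qquad \frac{\partial c}{\partial y} = \frac{\partial c}{\partial e}, \qquad \frac{\partial c}{\partial b} = -\frac{\partial c}{\partial e}
$$

$$
\frac{\partial c}{\partial w_2} = \frac{\partial c}{\partial b}\, a^\top, \qquad \frac{\partial c}{\partial a} = w_2^\top\, \frac{\partial c}{\partial b}
$$

$$
\frac{\partial c}{\partial w_1} = \frac{\partial c}{\partial a}\, x^\top, \qquad \frac{\partial c}{\partial x} = w_1^\top\, \frac{\partial c}{\partial a}
$$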

### Using PyTorch

In [26]:
x = torch.tensor([[1., 1., -1.], [0., -1., 1.]], requires_grad = True)
w1 = torch.tensor([[1., 0.], [1., 2.]], requires_grad = True)
y = torch.tensor([1., -2., -1.], requires_grad = True)
w2 = torch.tensor([1., -2.], requires_grad = True)

In [27]:
a = w1 @ x
b = w2 @ a
e = y - b
c = torch.sum(torch.square(e))

In [28]:
c

tensor(33., grad_fn=<SumBackward0>)

In [29]:
a, b, e

(tensor([[ 1.,  1., -1.],
         [ 1., -1.,  1.]], grad_fn=<MmBackward0>),
 tensor([-1.,  3., -3.], grad_fn=<SqueezeBackward3>),
 tensor([ 2., -5.,  2.], grad_fn=<SubBackward0>))

In [30]:
c.backward()

In [31]:
x.grad, w1.grad, y.grad, w2.grad

(tensor([[  4., -10.,   4.],
         [ 16., -40.,  16.]]),
 tensor([[ 10., -14.],
         [-20.,  28.]]),
 tensor([  4., -10.,   4.]),
 tensor([ 10., -18.]))