## Backpropagation using PyTorch

https://towardsdatascience.com/pytorch-autograd-understanding-the-heart-of-pytorchs-magic-2686cd94ec95

Gradient (Sinhala: අනුක්‍රමණය): https://www.youtube.com/watch?v=tIpKfDc295M
<br> 

Derivative (Sinhala: ව්යුත්පන්න)

<b>Tensors:</b> In simple terms, a tensor is just an n-dimensional array in PyTorch. Tensors have two additional features that set them apart from plain arrays. First, besides the CPU, they can be loaded onto the GPU for faster computation. Second, once <b>.requires_grad = True</b> is set, they start building a backward graph that tracks every operation applied to them, so that gradients can be calculated using what is called a dynamic computation graph (DCG).
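As a rough illustration of both points, the sketch below creates a tensor on whichever device is available and shows that an operation on it records a backward node. The variable names (`device`, `t`, `out`) are made up for this example.

```python
import torch

# Use the GPU if one is available; otherwise fall back to the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# A 2x2 float tensor that autograd is asked to track.
t = torch.tensor([[1.0, 2.0], [3.0, 4.0]], device=device, requires_grad=True)

# Every operation on t is recorded in the dynamic computation graph.
out = (t * 2).sum()

print(t.device)     # cpu or cuda:0, depending on the machine
print(out.grad_fn)  # the backward node recorded for the sum operation
```

The `grad_fn` attribute is how the recorded graph shows up on a result tensor. It is `None` for tensors that were created directly rather than computed.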


<b>torch.Tensor</b> can track its own operation history, so it behaves like the old Variable class (which was merged into Tensor in PyTorch 0.4).

<b>Note: </b>By PyTorch's design, gradients can only be calculated for floating-point (and complex) tensors, which is why the data is created with a float type (for example, a float NumPy array) before it is turned into a gradient-enabled PyTorch tensor.
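A small sketch of this restriction, using hypothetical names (`int_tensor`, `float_tensor`, `error_message`): asking an integer tensor to track gradients raises a `RuntimeError`, while a float copy of the same data works.

```python
import torch

int_tensor = torch.tensor([1, 2, 3])  # dtype is torch.int64

# Integer tensors cannot require gradients, so this call raises.
try:
    int_tensor.requires_grad_(True)
    error_message = None
except RuntimeError as err:
    error_message = str(err)
print(error_message)

# Converting to a floating-point dtype first makes gradient tracking possible.
float_tensor = int_tensor.to(torch.float32).requires_grad_(True)
print(float_tensor.requires_grad)  # True
```

The exact wording of the error message differs between PyTorch versions, so it is printed here rather than compared against a fixed string.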

<b>requires_grad:</b> If this attribute is True, the tensor starts tracking its operation history and forms a backward graph for gradient calculation. For an arbitrary tensor a, it can be switched on in place with a.requires_grad_(True).
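A minimal sketch of the in-place toggle, with illustrative variable names (`before`, `after`, `b`):

```python
import torch

a = torch.ones(3)          # requires_grad defaults to False
before = a.requires_grad

a.requires_grad_(True)     # trailing underscore: modifies a in place
after = a.requires_grad

b = (a * 3).sum()          # b is computed from a, so it is tracked as well
print(before, after, b.requires_grad)  # False True True
```

Any tensor computed from a tracked tensor is tracked too, which is why `b.requires_grad` is True even though it was never set explicitly.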

<b>grad:</b> grad holds the value of the gradient. If requires_grad is False, it holds None. Even if requires_grad is True, it holds None until .backward() is called from some other node. For example, if you call out.backward() for some variable out that involved x in its calculations, then x.grad will hold ∂out/∂x.
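The behaviour of `.grad` can be checked with a short sketch. The tensors `x` and `w` and the expression for `out` are chosen only for illustration.

```python
import torch

x = torch.tensor(3.0, requires_grad=True)
w = torch.tensor(2.0)            # requires_grad is False for w

out = w * x ** 2                 # out = 2 * 3^2 = 18

grad_before = x.grad             # None: backward() has not been called yet
out.backward()

print(grad_before)               # None
print(x.grad)                    # tensor(12.), since d(out)/dx = 2 * w * x
print(w.grad)                    # None, because w does not require gradients
```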

### Backward() function

backward() is the function that actually calculates the gradients. It passes its argument (by default a unit tensor holding 1.0, which only works when the calling tensor is a scalar) through the backward graph, all the way to every leaf node that is traceable from the calling root tensor and requires gradients. The calculated gradients are then stored in the .grad of each of those leaf nodes. Remember, the backward graph is already built dynamically during the forward pass. backward() only uses that existing graph to calculate the gradients and store them in the leaf nodes.
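When the calling tensor is not a scalar, the default argument does not apply and a gradient tensor of the same shape has to be passed in. The sketch below assumes a vector of ones as that argument, which gives the same result as calling `y.sum().backward()`.

```python
import torch

x = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)  # leaf node
y = x ** 2                                              # non-leaf, non-scalar

# y has three elements, so backward() needs an explicit gradient argument
# with the same shape as y. A tensor of ones weights each element equally.
y.backward(gradient=torch.ones_like(y))

print(x.grad)                # tensor([2., 4., 6.]), i.e. 2 * x
print(x.is_leaf, y.is_leaf)  # True False
```

Only the leaf `x` receives a stored gradient. Intermediate results such as `y` are part of the graph but do not keep a `.grad` by default.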

Example: y = x^2, so dy/dx = 2x

In [1]:
import torch

In [2]:
# requires_grad=True tells autograd to track x so its gradient can be calculated
x = torch.tensor(4.0, requires_grad=True)

In [3]:
x

tensor(4., requires_grad=True)

In [4]:
y=x**2
y

tensor(16., grad_fn=<PowBackward0>)

In [5]:
# Backpropagation: dy/dx = 2*x, which is 8 at x = 4
y.backward()

In [6]:
print(x.grad)

tensor(8.)


In [7]:
lst=[[2.,3.,1.],[4.,5.,3.],[7.,6.,4.]]
torch_input = torch.tensor(lst,requires_grad=True)

In [8]:
torch_input

tensor([[2., 3., 1.],
        [4., 5., 3.],
        [7., 6., 4.]], requires_grad=True)

In [9]:
# Hand check: for y = x**3 + x**2, dy/dx = 3*x**2 + 2*x, evaluated at x = 2
3*2**2+2*2

16

In [10]:
# y = x**3 + x**2
# dy/dx = 3*x**2 + 2*x
y = torch_input**3 + torch_input**2

In [11]:
y

tensor([[ 12.,  36.,   2.],
        [ 80., 150.,  36.],
        [392., 252.,  80.]], grad_fn=<AddBackward0>)

In [12]:
z=y.sum()

In [13]:
z

tensor(1040., grad_fn=<SumBackward0>)

In [14]:
z.backward()

In [16]:
torch_input.grad

tensor([[ 16.,  33.,   5.],
        [ 56.,  85.,  33.],
        [161., 120.,  56.]])
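As a sanity check, the gradient computed by autograd can be compared with the hand-derived derivative 3*x**2 + 2*x for every element. This is a self-contained sketch that rebuilds the same input. The name `analytic` is introduced only for this comparison.

```python
import torch

torch_input = torch.tensor(
    [[2., 3., 1.], [4., 5., 3.], [7., 6., 4.]], requires_grad=True
)

# Same computation as above: y = x**3 + x**2, reduced to a scalar with sum().
z = (torch_input ** 3 + torch_input ** 2).sum()
z.backward()

# Hand-derived derivative, computed on a detached copy so it is not tracked.
x = torch_input.detach()
analytic = 3 * x ** 2 + 2 * x

print(torch.allclose(torch_input.grad, analytic))  # True
```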