
PyTorch's `autograd` is a masterpiece. It takes away the burden of calculating many partial derivatives (a.k.a. gradients) by hand.

`torch.autograd` provides classes and functions implementing automatic differentiation of arbitrary scalar-valued functions.

Autograd is PyTorch's automatic differentiation package. It deals with the automatic computation of gradients for computation graphs.

**Why do we need autograd?**

Calculating derivatives is at the core of training deep neural networks. The individual calculations may be simple, but working them out by hand is tedious and error-prone. As a model becomes more complex, it becomes impractical to calculate the gradient of every function by hand.
PyTorch's autograd takes all of this tedious work off our hands.


In [19]:
import torch

In [21]:
# create input tensors
a = torch.tensor([5.], requires_grad=True)  # only tensors of floating-point and complex dtype can require gradients
b = torch.tensor([6.], requires_grad=True)
a, b

(tensor([5.], requires_grad=True), tensor([6.], requires_grad=True))

Setting the flag `requires_grad=True` tells PyTorch to build a computation graph in the background as operations are applied to the tensor. In other words, autograd records the history of the computations performed on these tensors.
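To make the effect of the flag visible, here is a minimal sketch that compares a tracked tensor with an untracked one. The names `c`, `out_tracked` and `out_untracked` are made up for this illustration; `is_leaf`, `requires_grad` and `grad_fn` are standard tensor attributes.

```python
import torch

a = torch.tensor([5.], requires_grad=True)  # tracked by autograd
c = torch.tensor([5.])                      # requires_grad defaults to False

out_tracked = a * 2
out_untracked = c * 2

# A tensor created directly by the user is a leaf and has no grad_fn.
print(a.is_leaf, a.grad_fn)

# A result computed from a tracked tensor remembers the operation that produced it.
print(out_tracked.requires_grad, out_tracked.grad_fn)

# Nothing is recorded when no input requires gradients.
print(out_untracked.requires_grad, out_untracked.grad_fn)
```

The `grad_fn` of a result is the link back into the graph that autograd later walks to compute gradients.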

In [22]:
y = a**3 - b**2
y

tensor([89.], grad_fn=<SubBackward0>)

$dy/da = 3a^2 = 75$

$dy/db = -2b = -12$
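As a quick sanity check on these hand-derived values, the sketch below approximates both derivatives numerically with a central difference in plain Python. This is not how autograd works internally (autograd differentiates the recorded operations exactly). The helper names `f` and `central_difference` and the step size `h` are assumptions made for this example.

```python
def f(a, b):
    # Same function as in the notebook: y = a^3 - b^2
    return a**3 - b**2


def central_difference(func, x, h=1e-4):
    # Symmetric finite-difference approximation of d(func)/dx at x
    return (func(x + h) - func(x - h)) / (2 * h)


a0, b0 = 5.0, 6.0
dy_da = central_difference(lambda a: f(a, b0), a0)  # should be close to 3 * a0**2
dy_db = central_difference(lambda b: f(a0, b), b0)  # should be close to -2 * b0

print(dy_da, dy_db)
```

The two printed numbers should land very close to 75 and -12, matching the formulas above.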

In [15]:
print(a.grad)
print(b.grad)

None
None


`Tensor.backward()` computes the gradient of the current tensor w.r.t. the graph leaves.

So the gradients are backpropagated to the leaf tensors only when `y.backward()` is called. Until then, `a.grad` and `b.grad` are `None`.

In [16]:
y.backward()

We can now access the gradients using the `grad` attribute.

In [17]:
a.grad

tensor([75.])

In [18]:
b.grad

tensor([-12.])
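One detail worth knowing when reading `.grad`: each call to `backward()` adds to whatever is already stored there instead of overwriting it. The sketch below reuses the same `a`, `b` and `y` as above, rebuilds the graph for a second backward pass, and then resets the gradients in place. The variable names `after_first` and `after_second` are made up for this illustration.

```python
import torch

a = torch.tensor([5.], requires_grad=True)
b = torch.tensor([6.], requires_grad=True)

# First backward pass: .grad goes from None to the gradient values.
y = a**3 - b**2
y.backward()
after_first = (a.grad.item(), b.grad.item())
print(after_first)   # (75.0, -12.0)

# Rebuild the graph and call backward again: the new gradients are added.
y = a**3 - b**2
y.backward()
after_second = (a.grad.item(), b.grad.item())
print(after_second)  # (150.0, -24.0)

# Reset the stored gradients in place before the next pass.
a.grad.zero_()
b.grad.zero_()
print(a.grad, b.grad)
```

This is why training loops usually clear the gradients before every backward pass.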