<h4> Autograd in PyTorch</h4>

Autograd is PyTorch’s automatic differentiation engine.
It computes the gradient of a scalar output (typically a loss) with respect to the tensors that produced it.<br>
PyTorch records the operations performed on tensors that require gradients and then applies the chain rule for you.

Key concepts:
- requires_grad
- computation graph
- backward()
- gradients stored in .grad

If a tensor has requires_grad=True, PyTorch:
- builds a computation graph
- tracks all operations
- computes gradients when .backward() is called
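The points above can be checked directly on a small example. This is a minimal sketch; the names x, c and y are illustrative only. It inspects the requires_grad, is_leaf and grad_fn attributes to show which tensors take part in the graph.

```python
import torch

# x is a leaf tensor that autograd should track; c is a plain constant.
x = torch.tensor(2.0, requires_grad=True)
c = torch.tensor(5.0)

# Any result that depends on x is tracked as well.
y = x * c

print(x.requires_grad, c.requires_grad, y.requires_grad)
print(x.is_leaf, y.is_leaf)

# grad_fn points at the graph node that produced a tensor.
# Leaf tensors created by the user have no grad_fn.
print(x.grad_fn is None, y.grad_fn is not None)

y.backward()
print(x.grad)  # dy/dx = c
print(c.grad)  # None, because c does not require gradients
```

Only x receives a gradient here, since c was created without requires_grad=True.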

In [2]:
import torch

y = x²<br>
dy/dx = 2x<br>
At x = 2 → gradient = 4
<br>
The code below computes the same gradient with autograd.

In [1]:
import torch

x = torch.tensor(2.0, requires_grad=True)
y = x ** 2

y.backward()

print(x.grad)

tensor(4.)


In [3]:
# y = x² + 3x + 1
# dy/dx = 2x + 3
# At x=3 → 2(3)+3 = 9

x = torch.tensor(3.0, requires_grad=True)
y = x * x + 3 * x + 1

y.backward()

print(x.grad)

tensor(9.)
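One way to sanity-check an autograd result is to compare it with a numerical estimate. The sketch below is not part of the original notebook: the helper f and the step size h are assumptions chosen for illustration, and float64 is used to keep rounding error small. It compares x.grad with a central finite difference for the same polynomial.

```python
import torch

def f(x):
    # Same polynomial as above: y = x^2 + 3x + 1
    return x * x + 3 * x + 1

x = torch.tensor(3.0, dtype=torch.float64, requires_grad=True)
f(x).backward()
autograd_grad = x.grad.item()

# Central difference: (f(x + h) - f(x - h)) / (2h)
h = 1e-4
with torch.no_grad():  # no graph is needed for the numerical estimate
    numeric_grad = ((f(x + h) - f(x - h)) / (2 * h)).item()

print(autograd_grad, numeric_grad)
```

Both values should agree closely with the analytic result 2(3)+3 = 9.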


Autograd also applies the chain rule across intermediate tensors.<br>
Here is the math:<br>

z = (x+1)²<br>
dz/dx = 2(x+1)<br>
At x=2 → 6

The code for the same problem:

In [4]:
x = torch.tensor(2.0, requires_grad=True)
y = x + 1
z = y * y

z.backward()

print(x.grad)


tensor(6.)
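To see the individual chain rule factors, the gradient at each step can be requested separately. The following is a rough sketch using torch.autograd.grad, which returns gradients instead of storing them in .grad; the variable names dz_dy, dy_dx and dz_dx are illustrative.

```python
import torch

x = torch.tensor(2.0, requires_grad=True)
y = x + 1
z = y * y

# retain_graph=True keeps the graph alive so it can be traversed again.
(dz_dy,) = torch.autograd.grad(z, y, retain_graph=True)  # 2y
(dy_dx,) = torch.autograd.grad(y, x, retain_graph=True)  # 1
(dz_dx,) = torch.autograd.grad(z, x)                     # full chain

print(dz_dy, dy_dx, dz_dx)
print(dz_dy * dy_dx == dz_dx)
```

The product dz/dy · dy/dx matches dz/dx, which is the value z.backward() stores in x.grad.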


When requires_grad=True is set on a tensor x,<br>
PyTorch automatically:
- tracks all operations on x
- stores the computation graph
- enables gradient computation for x

Next, gradient accumulation: each call to backward() adds to .grad instead of overwriting it.

In [None]:
x = torch.tensor(2.0, requires_grad=True)

y1 = x * x
y1.backward()  # d(x^2)/dx = 2x = 4 at x=2, so x.grad is now 4

y2 = x * 3
y2.backward()  # d(3x)/dx = 3, which is added to the existing x.grad

x.grad  # accumulated gradient: 4 + 3 = 7

tensor(7.)

This shows that PyTorch accumulates gradients by default.<br>
To avoid this, reset the gradients to zero before calling backward() again.
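A minimal sketch of that reset, reusing the same two functions. It uses the in-place x.grad.zero_() call; the names first and second are illustrative. In a typical training loop, the optimizer's zero_grad() method plays the same role for all parameters.

```python
import torch

x = torch.tensor(2.0, requires_grad=True)

y1 = x * x
y1.backward()
first = x.grad.clone()   # gradient of x^2 at x=2

# Reset the stored gradient in place before the next backward pass.
x.grad.zero_()

y2 = x * 3
y2.backward()
second = x.grad.clone()  # gradient of 3x only, nothing carried over

print(first, second)
```

With the reset in place, the second value is 3 rather than the accumulated 7.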