## Autograd

The `autograd` package provides automatic differentiation for operations on tensors. If we set a tensor's `requires_grad` attribute to `True`, autograd starts tracking every operation performed on it. Calling `.backward()` on a result then computes all the gradients automatically, and they are accumulated into the `.grad` attribute of the tensor.

To stop a tensor from tracking further operations, we can call `.detach()`, which returns a new tensor that is cut off from the computation history. We can also disable gradient tracking for a whole block of code by wrapping it in a `with torch.no_grad():` context, which saves memory. This is useful for inference, where gradients are not needed.

Another important class is `Function`. Every tensor produced by an operation has a `.grad_fn` attribute that references the `Function` that created it. Tensors created directly by the user have a `grad_fn` of `None`.

In [1]:
import torch

In [2]:
x = torch.ones(2, 2, requires_grad=True)
x

tensor([[1., 1.],
        [1., 1.]], requires_grad=True)

In [3]:
x.grad_fn

In [4]:
x.grad

In [5]:
y = x + 2
y

tensor([[3., 3.],
        [3., 3.]], grad_fn=<AddBackward0>)

In [6]:
y.grad_fn

<AddBackward0 at 0x7f3b8244ddd8>

In [7]:
z = y * y * 3
z

tensor([[27., 27.],
        [27., 27.]], grad_fn=<MulBackward0>)

In [8]:
out = z.mean()
out

tensor(27., grad_fn=<MeanBackward1>)

In [9]:
x.requires_grad

True

In [10]:
# requires_grad_(bool) changes a tensor's requires_grad flag in place;
# the flag defaults to False for tensors we create ourselves
a = torch.rand(2, 2)
a = (a * 3) / (a - 1)
print(a.requires_grad)
a.requires_grad_(True)
print(a.requires_grad)

False
True


In [11]:
b = (a * a).sum()
print(b.grad_fn)

<SumBackward0 object at 0x7f3b823e82b0>
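
The cell above only prints `b.grad_fn`. As a minimal sketch of what calling `backward()` on such a sum would give, the snippet below rebuilds `a` and `b` the same way and checks that the gradient of `(a * a).sum()` with respect to `a` is `2 * a`. The seed is an arbitrary choice added here for reproducibility and is not part of the original notebook.

```python
import torch

torch.manual_seed(0)  # arbitrary seed, only so the run is repeatable

a = torch.rand(2, 2)
a = (a * 3) / (a - 1)
a.requires_grad_(True)      # start tracking operations on a from here on

b = (a * a).sum()           # scalar result, so backward() needs no argument
b.backward()                # fills a.grad with d(b)/d(a)

# d/da sum(a^2) = 2a, compared against a copy detached from the graph
grad_matches = torch.allclose(a.grad, 2 * a.detach())
print(grad_matches)
```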


In [12]:
# let's do backprop now
# out is a scalar, so out.backward() is equivalent to out.backward(torch.tensor(1.))
# it computes d(out)/dx and stores the result in x.grad
out.backward()

In [13]:
x.grad

tensor([[4.5000, 4.5000],
        [4.5000, 4.5000]])
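
The value 4.5 can be checked by hand. The notation here is introduced only for this check: $o$ stands for `out`, and $x_i$, $z_i$ for the four entries of `x` and `z`. Since `z = y * y * 3` with `y = x + 2`, and `out` is the mean of the four entries of `z`:

$$
o = \frac{1}{4} \sum_{i} z_i, \qquad z_i = 3\,(x_i + 2)^2
$$

$$
\frac{\partial o}{\partial x_i} = \frac{1}{4} \cdot 6\,(x_i + 2) = \frac{3}{2}\,(x_i + 2), \qquad \left.\frac{\partial o}{\partial x_i}\right|_{x_i = 1} = \frac{9}{2} = 4.5
$$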

In [14]:
print(x.requires_grad)
print((x ** 2).requires_grad)

with torch.no_grad():
    print((x ** 2).requires_grad)

True
True
False
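
The `.detach()` call described in the introduction is not exercised in the cells above. The following is a minimal sketch of its effect; the variable name `y_detached` is chosen here for illustration.

```python
import torch

x = torch.ones(2, 2, requires_grad=True)
y = x + 2                        # tracked: y has a grad_fn

# detach() returns a new tensor that shares storage with y
# but is cut off from the computation graph
y_detached = y.detach()

print(y.requires_grad)           # True
print(y_detached.requires_grad)  # False
print(y_detached.grad_fn)        # None
```

Because the detached tensor shares memory with the original, in-place changes to one are visible in the other, so it is usually treated as read-only.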
