### Autograd: PyTorch's engine for automatically computing gradients for backpropagation

A simple one-layer neural network

Gradients are required for w and b, so we set requires_grad=True on those tensors

In [1]:
import torch

# input tensor
x = torch.ones(5)

# expected output (target labels)
y = torch.zeros(3)

w = torch.randn(5, 3, requires_grad=True)
b = torch.randn(3, requires_grad=True)
z = torch.matmul(x, w) + b
loss = torch.nn.functional.binary_cross_entropy_with_logits(z, y)

In [2]:
loss

tensor(0.5478, grad_fn=<BinaryCrossEntropyWithLogitsBackward0>)

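Computation graph for this operation: x and w feed a matmul, b is added to give z, and the loss compares z against y with binary cross-entropy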

grad_fn holds a reference to the backward function of the operation that produced a tensor; here, for z and for loss

In [3]:
z.grad_fn


<AddBackward0 at 0x7fce7017e260>

In [4]:
loss.grad_fn

<BinaryCrossEntropyWithLogitsBackward0 at 0x7fce7017ecb0>

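Partial derivatives of the loss with respect to w and b

We need ∂loss/∂w and ∂loss/∂b for the fixed values of x and y above. Calling loss.backward() computes them and stores them in w.grad and b.grad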

In [5]:
loss.backward()
print(w.grad)
print(b.grad)

tensor([[0.0123, 0.2486, 0.0700],
        [0.0123, 0.2486, 0.0700],
        [0.0123, 0.2486, 0.0700],
        [0.0123, 0.2486, 0.0700],
        [0.0123, 0.2486, 0.0700]])
tensor([0.0123, 0.2486, 0.0700])
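The values above can be cross-checked by hand. With the default mean reduction over the 3 outputs, ∂loss/∂z = (sigmoid(z) - y) / 3, so ∂loss/∂b equals that vector and ∂loss/∂w is the outer product of x with it. Because x is all ones, every row of w.grad is identical, which matches the printed output. The following is a minimal sketch of that check; the seed is an arbitrary choice made here for repeatability and is not part of the original notebook.

```python
import torch

torch.manual_seed(0)  # arbitrary seed (assumption), only for repeatability

x = torch.ones(5)
y = torch.zeros(3)
w = torch.randn(5, 3, requires_grad=True)
b = torch.randn(3, requires_grad=True)

z = torch.matmul(x, w) + b
loss = torch.nn.functional.binary_cross_entropy_with_logits(z, y)
loss.backward()

# Closed-form gradients, computed without building a graph
with torch.no_grad():
    dz = (torch.sigmoid(z) - y) / y.numel()  # dloss/dz for mean reduction
    expected_b = dz                          # dz/db is the identity
    expected_w = torch.outer(x, dz)          # dz_j/dw_ij = x_i

print(torch.allclose(b.grad, expected_b, atol=1e-6))
print(torch.allclose(w.grad, expected_w, atol=1e-6))
```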


##### The grad attribute is populated only for leaf nodes of the computational graph that have requires_grad=True
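As a small illustration of the leaf rule, the sketch below inspects is_leaf and uses retain_grad() to keep the gradient of an intermediate tensor. The loss here is a plain sum, chosen only to keep the expected gradient obvious; it is not the loss used in the cells above.

```python
import torch

x = torch.ones(5)
w = torch.randn(5, 3, requires_grad=True)
b = torch.randn(3, requires_grad=True)

z = torch.matmul(x, w) + b  # intermediate (non-leaf) tensor
z.retain_grad()             # ask autograd to keep z's gradient too
loss = z.sum()              # simple scalar loss, for illustration only
loss.backward()

print(w.is_leaf, b.is_leaf, z.is_leaf)
print(z.grad)  # populated only because retain_grad() was called
```

Without the retain_grad() call, z.grad would stay None after backward(), since autograd frees gradients of intermediate tensors by default.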

Disable Gradient Tracking

In [6]:
with torch.no_grad():
    z = torch.matmul(x, w) + b
z.requires_grad

False

In [7]:
z = torch.matmul(x, w) + b
z_det = z.detach()
z_det.requires_grad

False

- This can be used to freeze certain parameters while fine-tuning a pretrained model
- It is also useful when only the forward pass is required
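Both use cases can be sketched as follows. The two nn.Linear layers are hypothetical stand-ins for a pretrained backbone and a newly added head; they are assumptions for illustration, not a model from this notebook.

```python
import torch
from torch import nn

# Hypothetical stand-ins: a "pretrained" backbone and a new trainable head
backbone = nn.Linear(5, 4)
head = nn.Linear(4, 3)

# Freeze the backbone: its parameters will receive no gradients
for p in backbone.parameters():
    p.requires_grad_(False)

x = torch.ones(2, 5)
y = torch.zeros(2, 3)
loss = nn.functional.binary_cross_entropy_with_logits(head(backbone(x)), y)
loss.backward()

print(backbone.weight.grad)    # None, the backbone is frozen
print(head.weight.grad.shape)  # gradients exist only for the head

# Forward pass only: nothing is recorded inside no_grad()
with torch.no_grad():
    pred = head(backbone(x))
print(pred.requires_grad)
```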

- Autograd keeps a record of tensors and all executed operations in a directed acyclic graph (DAG)
- Leaves are the input tensors and roots are the output tensors
- Tracing from roots to leaves, gradients can be computed using the chain rule
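The graph described above can be inspected by following grad_fn.next_functions from the root back towards the leaves. The walk helper below is a hypothetical function written for this note, and next_functions is a low-level attribute whose exact contents may differ between PyTorch versions, so treat this as a rough sketch rather than a stable API.

```python
import torch

a = torch.randn(3, 4, requires_grad=True)
out = (2 * torch.sin(a) + 1).sum()

def walk(node, depth=0, names=None):
    """Depth-first walk from a grad_fn node towards the leaves."""
    if names is None:
        names = []
    name = type(node).__name__
    names.append(name)
    print("  " * depth + name)
    for parent, _ in node.next_functions:
        if parent is not None:  # None marks inputs that need no gradient
            walk(parent, depth + 1, names)
    return names

names = walk(out.grad_fn)
```

The walk starts at the root (the sum) and ends at an AccumulateGrad node, which is where the gradient for the leaf tensor a is written.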

Autograd Test

In [13]:
a = torch.randn(3, 4, requires_grad=True)
a

tensor([[ 1.2586, -2.9369, -0.5570,  1.9444],
        [-1.9207, -0.4076, -0.7817, -0.7956],
        [-0.9108, -1.1626, -1.9424,  0.5715]], requires_grad=True)

In [16]:
b = torch.sin(a)
b

tensor([[ 0.9517, -0.2033, -0.5287,  0.9310],
        [-0.9394, -0.3964, -0.7045, -0.7143],
        [-0.7900, -0.9178, -0.9318,  0.5409]], grad_fn=<SinBackward0>)

In [17]:
c = 2 * b
d = c + 1
d

tensor([[ 2.9033,  0.5934, -0.0573,  2.8621],
        [-0.8788,  0.2073, -0.4090, -0.4286],
        [-0.5800, -0.8357, -0.8635,  2.0818]], grad_fn=<AddBackward0>)

The output passed to backward() must be a scalar unless an explicit gradient argument is supplied, so we reduce d with sum() first

In [18]:
out = d.sum()
out

tensor(4.5950, grad_fn=<SumBackward0>)
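Summing is one way to get a scalar. As a sketch of the alternative, backward() can also be called on the non-scalar tensor directly with a gradient argument of the same shape; passing all ones should give the same result as summing first. The graph is rebuilt for the second call because the first backward() frees its buffers.

```python
import torch

a = torch.randn(3, 4, requires_grad=True)

# Route 1: reduce to a scalar, then backpropagate
d = 2 * torch.sin(a) + 1
d.sum().backward()
grad_from_sum = a.grad.clone()

# Route 2: backpropagate from the non-scalar tensor with explicit weights
a.grad = None                  # reset so the gradients do not accumulate
d2 = 2 * torch.sin(a) + 1      # rebuild the graph
d2.backward(gradient=torch.ones_like(d2))

print(torch.allclose(grad_from_sum, a.grad))
```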

In [19]:
out.backward()
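After this call, a.grad holds ∂out/∂a. Since out = Σ (2·sin(a_ij) + 1), the chain rule gives ∂out/∂a_ij = 2·cos(a_ij). A minimal standalone check of that result, using fresh random values rather than the ones printed above:

```python
import torch

a = torch.randn(3, 4, requires_grad=True)
b = torch.sin(a)
c = 2 * b
d = c + 1
out = d.sum()
out.backward()

# Analytic gradient from the chain rule: d(out)/d(a) = 2 * cos(a)
expected = 2 * torch.cos(a.detach())
print(torch.allclose(a.grad, expected))
```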