## Automatic Differentiation
This example computes the gradient of a loss function with respect to the parameters w and b.

In [2]:
import torch

# Define the input and the target
x = torch.ones(5)  # input tensor
y = torch.zeros(3)  # expected output

# requires_grad=True tells PyTorch's autograd to record operations on this
# tensor so that gradients with respect to it can be computed later.
# Note: calling w.requires_grad_(True) after creation has the same effect.
w = torch.randn(5, 3, requires_grad=True)
b = torch.randn(3, requires_grad=True)
z = torch.matmul(x, w) + b

# Loss function
loss = torch.nn.functional.binary_cross_entropy_with_logits(z, y)

Gradient functions:

In [3]:
print('Gradient function for z =',z.grad_fn)
print('Gradient function for loss =', loss.grad_fn)

Gradient function for z = <AddBackward0 object at 0x7f5d7830f790>
Gradient function for loss = <BinaryCrossEntropyWithLogitsBackward0 object at 0x7f5d7830f610>
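As a small sketch of what grad_fn indicates: tensors created directly by the user (leaf tensors) have no grad_fn, while tensors produced by an operation on a tracked tensor record the backward function that created them. The block below rebuilds the same x, w, b and z as above so that it runs on its own.

```python
import torch

x = torch.ones(5)
w = torch.randn(5, 3, requires_grad=True)
b = torch.randn(3, requires_grad=True)
z = torch.matmul(x, w) + b

# w was created directly, so it is a leaf of the graph and has no grad_fn.
print(w.is_leaf, w.grad_fn)              # True None

# z is the result of an operation on tracked tensors, so it is not a leaf
# and stores a reference to the function used in the backward pass.
print(z.is_leaf, z.grad_fn is not None)  # False True
```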


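Calling loss.backward() differentiates loss with respect to every leaf tensor in its computation graph that has requires_grad=True (here w and b) and stores each result in that tensor's .grad attribute.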

In [4]:
loss.backward()
print(w.grad)
print(b.grad)

tensor([[0.1422, 0.2821, 0.0234],
        [0.1422, 0.2821, 0.0234],
        [0.1422, 0.2821, 0.0234],
        [0.1422, 0.2821, 0.0234],
        [0.1422, 0.2821, 0.0234]])
tensor([0.1422, 0.2821, 0.0234])
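Every row of w.grad above equals b.grad. A likely explanation, sketched below, is that z = x·w + b with x set to all ones, so the gradient with respect to w is the outer product of x and dloss/dz. The check assumes the default reduction='mean' of binary_cross_entropy_with_logits, for which dloss/dz = (sigmoid(z) - y) / N. The seed and the name dloss_dz are illustrative choices, not part of the original notebook.

```python
import torch

torch.manual_seed(0)  # assumption: fixed seed only to make the run repeatable

x = torch.ones(5)
y = torch.zeros(3)
w = torch.randn(5, 3, requires_grad=True)
b = torch.randn(3, requires_grad=True)

z = torch.matmul(x, w) + b
loss = torch.nn.functional.binary_cross_entropy_with_logits(z, y)
loss.backward()

# Hand-derived gradient of the mean-reduced loss with respect to z.
dloss_dz = (torch.sigmoid(z.detach()) - y) / y.numel()

# b enters z additively, so its gradient is dloss/dz itself.
print(torch.allclose(b.grad, dloss_dz, atol=1e-6))

# w enters through x @ w, so its gradient is outer(x, dloss/dz);
# with x all ones, every row is a copy of dloss/dz.
print(torch.allclose(w.grad, torch.outer(x, dloss_dz), atol=1e-6))
```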


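We can also disable gradient tracking. This is useful once the model is trained and we only use it to make predictions, or when we want to freeze some parameters. Running the computation inside torch.no_grad() or calling detach() on a result both give a tensor with requires_grad=False.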

In [6]:
z = torch.matmul(x, w)+b
print(z.requires_grad)

with torch.no_grad():
    z = torch.matmul(x, w)+b
print(z.requires_grad)

z = torch.matmul(x, w)+b
z_det = z.detach()
print(z_det.requires_grad)

True
False
False
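To illustrate the parameter-freezing case, here is a minimal sketch that turns off tracking for w only, using requires_grad_(False), while b stays trainable. It redefines the same tensors as above so it can run independently; freezing w rather than b is an arbitrary choice for the example.

```python
import torch

x = torch.ones(5)
y = torch.zeros(3)
w = torch.randn(5, 3, requires_grad=True)
b = torch.randn(3, requires_grad=True)

# Freeze w in place; b is still tracked by autograd.
w.requires_grad_(False)

z = torch.matmul(x, w) + b
loss = torch.nn.functional.binary_cross_entropy_with_logits(z, y)
loss.backward()

print(w.grad)              # None, because w is no longer tracked
print(b.grad is not None)  # True, because b still receives a gradient
```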
