Source: https://pytorch.org/tutorials/beginner/blitz/autograd_tutorial.html

In [1]:
import torch
from torchvision.models import resnet18, ResNet18_Weights
model = resnet18(weights=ResNet18_Weights.DEFAULT)
data = torch.rand(1, 3, 64, 64)
labels = torch.rand(1, 1000)


Next, we run the input data through the model, passing it through each layer in turn to make a prediction. This is the forward pass.

In [2]:
prediction = model(data) # forward pass

We use the model’s prediction and the corresponding label to calculate the error (loss). The next step is to backpropagate this error through the network. Backward propagation is kicked off when we call .backward() on the error tensor. Autograd then calculates and stores the gradients for each model parameter in the parameter’s .grad attribute.

In [3]:
loss = (prediction - labels).sum()
loss.backward() # backward pass
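
To make the role of .grad concrete, here is a minimal sketch on a much smaller model. The single `torch.nn.Linear` layer and its sizes are assumptions chosen for illustration and stand in for the ResNet above; only the backward() and .grad behaviour is the point.

```python
import torch

torch.manual_seed(0)

# Hypothetical stand-in for the ResNet: one small linear layer.
layer = torch.nn.Linear(4, 2)
x = torch.rand(1, 4)
target = torch.rand(1, 2)

# Before any backward pass, no gradients have been stored.
grads_before = [p.grad for p in layer.parameters()]

loss = (layer(x) - target).sum()
loss.backward()

# After backward(), every parameter has a .grad with the same shape as itself.
shapes_match = all(p.grad.shape == p.shape for p in layer.parameters())

print(grads_before)     # [None, None]
print(shapes_match)     # True
print(layer.bias.grad)  # tensor([1., 1.]) because the loss is a plain sum
```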

Next, we load an optimizer, in this case SGD with a learning rate of 0.01 and momentum of 0.9. We register all the parameters of the model in the optimizer.

In [4]:
optim = torch.optim.SGD(model.parameters(), lr=1e-2, momentum=0.9)

Finally, we call .step() to initiate gradient descent. The optimizer adjusts each parameter by its gradient stored in .grad.

In [5]:
optim.step()  # gradient descent
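
The arithmetic behind a step with momentum can be sketched in plain Python. This assumes the update rule v = momentum * v + grad followed by p = p - lr * v, with no dampening, weight decay, or Nesterov term; the function name `sgd_momentum_step` and the sample values are hypothetical and not part of PyTorch.

```python
def sgd_momentum_step(params, grads, velocity, lr=1e-2, momentum=0.9):
    """One SGD-with-momentum update on plain Python lists (illustrative only)."""
    # Blend the previous velocity with the current gradient.
    new_velocity = [momentum * v + g for v, g in zip(velocity, grads)]
    # Move each parameter against its velocity, scaled by the learning rate.
    new_params = [p - lr * v for p, v in zip(params, new_velocity)]
    return new_params, new_velocity


params = [1.0, -2.0]
grads = [0.5, 0.5]      # pretend these came from .grad
velocity = [0.0, 0.0]   # no history before the first step

params, velocity = sgd_momentum_step(params, grads, velocity)
print(params)  # roughly [0.995, -2.005]

# With the same gradient again, the step grows because velocity has built up.
params, velocity = sgd_momentum_step(params, grads, velocity)
print(params)  # roughly [0.9855, -2.0145]
```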

-----------------------------------------------------------

Differentiation in Autograd

In [36]:
import torch

a = torch.tensor([2., 3.], requires_grad=True)
b = torch.tensor([6., 4.], requires_grad=True)

In [37]:
Q = 3*a**3 - b**2

In [38]:
Q

tensor([-12.,  65.], grad_fn=<SubBackward0>)

In [39]:
external_grad = torch.tensor([1., 1.])
Q.backward(gradient=external_grad)
print(f'a.grad = {a.grad}')
print(9*a**2 == a.grad)
print(f'b.grad = {b.grad}')
print(-2*b == b.grad)

a.grad = tensor([36., 81.])
tensor([True, True])
b.grad = tensor([-12.,  -8.])
tensor([True, True])
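
The analytic derivatives used in the check above (9*a**2 and -2*b) can also be confirmed without autograd. The sketch below uses central finite differences on plain floats; the helper names `Q_scalar` and `central_diff` are made up for this example.

```python
def Q_scalar(a, b):
    """Elementwise Q = 3*a**3 - b**2 on plain floats."""
    return 3 * a**3 - b**2


def central_diff(f, x, eps=1e-6):
    """Numerical derivative of f at x via a symmetric difference."""
    return (f(x + eps) - f(x - eps)) / (2 * eps)


a_vals, b_vals = [2.0, 3.0], [6.0, 4.0]
numeric = []
for a_i, b_i in zip(a_vals, b_vals):
    dQ_da = central_diff(lambda t: Q_scalar(t, b_i), a_i)
    dQ_db = central_diff(lambda t: Q_scalar(a_i, t), b_i)
    numeric.append((dQ_da, dQ_db))

# Should be close to (36, -12) and (81, -8), matching a.grad and b.grad.
for dQ_da, dQ_db in numeric:
    print(round(dQ_da, 3), round(dQ_db, 3))
```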


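a.grad holds the gradient of Q with respect to a. Gradients accumulate: each backward() call adds to .grad instead of overwriting it.
--> Before computing the gradient of another function (e.g. Q1) with respect to a, we need to zero out a.grad first.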
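
A small sketch of the accumulation behaviour, using a simplified function 3*a**3 reduced with .sum() so that no external gradient is needed. The variable names `first`, `accumulated`, and `fresh` are only for illustration.

```python
import torch

a = torch.tensor([2., 3.], requires_grad=True)

# First backward pass: a.grad holds 9*a**2.
(3 * a**3).sum().backward()
first = a.grad.clone()

# Second backward pass without zeroing: the new gradient is added on top.
(3 * a**3).sum().backward()
accumulated = a.grad.clone()

# Zero in place, then run again to get a clean gradient.
a.grad.zero_()
(3 * a**3).sum().backward()
fresh = a.grad.clone()

print(first)        # tensor([36., 81.])
print(accumulated)  # tensor([ 72., 162.])
print(fresh)        # tensor([36., 81.])
```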

In [44]:
# .grad accumulates across backward() calls, so zero it in place before reusing a and b
a.grad.zero_()
b.grad.zero_()
# Rebuild Q: backward() frees the graph by default, so the old Q cannot be backpropagated again
Q = 3*a**3 - b**2
external_grad = torch.tensor([1., 2.])  # weight applied to each element of Q
Q.backward(gradient=external_grad)

In [45]:
# check if collected gradients are correct
print(f'a.grad = {a.grad}')
print(9*a**2 == a.grad)
print(f'b.grad = {b.grad}')
print(-2*b == b.grad)

a.grad = tensor([ 36., 162.])
tensor([ True, False])
b.grad = tensor([-12., -16.])
tensor([ True, False])
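
The `gradient` argument acts as a weight vector v: because Q is computed elementwise, each entry of a.grad is v[i] * 9*a[i]**2 and each entry of b.grad is v[i] * (-2*b[i]). The plain-Python sketch below works this out for v = [1., 2.], assuming a single backward pass starting from zeroed gradients.

```python
a = [2.0, 3.0]
b = [6.0, 4.0]
v = [1.0, 2.0]  # the weights passed as `gradient`

# Q is elementwise, so its Jacobian is diagonal and the
# vector-Jacobian product reduces to an elementwise multiply.
a_grad = [vi * 9 * ai**2 for vi, ai in zip(v, a)]
b_grad = [vi * -2 * bi for vi, bi in zip(v, b)]

print(a_grad)  # [36.0, 162.0]
print(b_grad)  # [-12.0, -16.0]
```

A result that is exactly double these values would suggest backward() ran twice without the gradients being zeroed in between.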


In [139]:
a1 = torch.tensor([2., 3.], requires_grad=True)
b1 = torch.tensor([6., 4.], requires_grad=True)

In [140]:
Q1 = 3*a1.detach()**3 - b1**2
# Q1.detach()
Q1

tensor([-12.,  65.], grad_fn=<SubBackward0>)

In [141]:
Q2 = 3*a1**3 - b1**2

In [134]:
print(a1.grad)

None


In [135]:
a1

tensor([2., 3.], requires_grad=True)

In [136]:
# .grad stays None until backward() has run at least once, so zeroing it here fails
a1.grad.data.zero_()
b1.grad.data.zero_()

AttributeError: 'NoneType' object has no attribute 'data'
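
One way to avoid this error is to check for None before zeroing. The helper `zero_grads` below is hypothetical (it is not a PyTorch function) and is only a sketch of that guard.

```python
import torch


def zero_grads(*tensors):
    """Zero .grad in place where it exists; skip tensors whose .grad is still None.

    Hypothetical helper for illustration, not part of PyTorch.
    """
    for t in tensors:
        if t.grad is not None:
            t.grad.zero_()


a1 = torch.tensor([2., 3.], requires_grad=True)
b1 = torch.tensor([6., 4.], requires_grad=True)

zero_grads(a1, b1)  # no error: both grads are still None
print(a1.grad)      # None

(3 * a1**3 - b1**2).sum().backward()
zero_grads(a1, b1)  # now the grads exist and are reset
print(a1.grad, b1.grad)  # tensor([0., 0.]) tensor([0., 0.])
```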

In [143]:
external_grad = torch.tensor([1., 1.])  # float weights, matching the dtype of Q1 and Q2
Q1.backward(gradient=external_grad)

In [144]:
print(f'a1.grad = {a1.grad}')
print(9*a1**2 == a1.grad)
print(f'b1.grad = {b1.grad}')
print(-2*b1 == b1.grad)

a1.grad = None
False
b1.grad = tensor([-12.,  -8.])
tensor([True, True])


In [145]:
Q2.backward(gradient=external_grad)

In [146]:
print(f'a1.grad = {a1.grad}')
print(9*a1**2 == a1.grad)
print(f'b1.grad = {b1.grad}')
print(-2*b1 == b1.grad)

a1.grad = tensor([36., 81.])
tensor([True, True])
b1.grad = tensor([-24., -16.])
tensor([False, False])
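
b1.grad no longer equals -2*b1 here because both Q1 and Q2 depend on b1, so the two backward passes were summed into the same .grad. a1 only received a gradient from Q2, since Q1 used a1.detach(). The plain-Python arithmetic below reproduces the b1 numbers; the variable names are illustrative.

```python
b1_vals = [6.0, 4.0]

# Each backward pass contributes -2*b1 (external_grad is all ones).
per_pass = [-2 * x for x in b1_vals]

after_q1 = list(per_pass)                                   # after Q1.backward
after_q2 = [g1 + g2 for g1, g2 in zip(after_q1, per_pass)]  # after Q2.backward

print(after_q1)  # [-12.0, -8.0]
print(after_q2)  # [-24.0, -16.0]
```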


In [147]:
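# Call .detach() inline, inside the definition of Q1: Q1 = 3*a1.detach()**3 - b1**2
# a1.detach() returns a new tensor that shares a1's data but is cut off from the graph;
# a1 itself is unchanged. Reassigning (a1 = a1.detach()) would drop gradient tracking for a1 everywhere.
a1.detach()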

tensor([2., 3.])
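
The following sketch checks what .detach() does and does not change. The name `d` for the detached tensor is an assumption for this example, and `torch.ones_like(Q1)` is used in place of a hand-written external gradient.

```python
import torch

a1 = torch.tensor([2., 3.], requires_grad=True)
b1 = torch.tensor([6., 4.], requires_grad=True)

d = a1.detach()
print(d.requires_grad)                # False: d is outside the graph
print(a1.requires_grad)               # True: a1 itself is unchanged
print(d.data_ptr() == a1.data_ptr())  # True: both share the same storage

# Using the detached tensor in Q1 blocks the gradient path back to a1.
Q1 = 3 * d**3 - b1**2
Q1.backward(gradient=torch.ones_like(Q1))
print(a1.grad)  # None
print(b1.grad)  # tensor([-12.,  -8.])
```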