## **torch.autograd**
<hr>
Automatic differentiation: a method that computes derivatives by applying the chain rule to the sequence of elementary operations that make up a computation.
<hr>
An automatic differentiation engine applies the chain rule for you, so gradients do not have to be derived and coded by hand.
<hr>
torch.autograd is PyTorch's automatic differentiation engine.

Training a neural network occurs in two steps:
1. Forward Propagation: the network makes its best guess about the correct output by passing the input through each of its layers
2. Backward Propagation: the network computes the gradient of the error with respect to each parameter by propagating the error backwards from the output and applying the chain rule
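
The two steps can be written out by hand for a model with a single weight. This is only a sketch of what autograd automates: the input, target, and starting weight are hypothetical values chosen to keep the arithmetic easy to follow.

```python
# Hand-written forward and backward pass for a one-weight model: pred = w * x
x, target = 2.0, 10.0   # hypothetical input and desired output
w = 3.0                 # hypothetical starting weight

# Forward propagation: the model makes its guess and we measure the error
pred = w * x                      # 6.0
loss = (pred - target) ** 2       # 16.0

# Backward propagation: chain rule, from the loss back to the weight
dloss_dpred = 2 * (pred - target)     # -8.0
dpred_dw = x                          # 2.0
dloss_dw = dloss_dpred * dpred_dw     # -16.0

# Sanity check with a central finite difference
eps = 1e-6
numeric = (((w + eps) * x - target) ** 2 - ((w - eps) * x - target) ** 2) / (2 * eps)
print(dloss_dw, round(numeric, 4))
```

The gradient is negative, so increasing w would reduce the loss. In a real network the same chain of local derivatives runs through every layer, which is the bookkeeping torch.autograd takes over.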

In [1]:
import torch
from torchvision.models import resnet18, ResNet18_Weights

In [2]:
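model = resnet18(weights=ResNet18_Weights.DEFAULT)   # pretrained ResNet-18
data = torch.rand(1, 3, 64, 64)     # one random image: 3 channels, 64x64 pixels
labels = torch.rand(1, 1000)        # random target, one value per output class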



In [None]:
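prediction = model(data)               # forward pass
loss = (prediction - labels).sum()     # reduce the error to a single scalar
loss.backward()                        # backward pass: autograd stores each parameter's gradient in .grad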

**optimizer**: an algorithm that updates the model parameters to minimize the loss \\

**SGD** (stochastic gradient descent): a simple optimizer that moves each parameter a small step against the gradient of the loss, computed on a batch of data \\

**learning rate**: determines how big a step we take in the direction opposite to the gradient, which is the direction that reduces the loss \\

**momentum**: helps accelerate updates by incorporating a fraction of the previous update, which smooths the descent process
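
To make the roles of the learning rate and momentum concrete, here is a minimal plain-Python sketch of one common formulation of the update. The helper name sgd_momentum_step and the toy objective f(p) = p**2 are hypothetical, and the rule shown (velocity = momentum * velocity + grad, then param = param - lr * velocity) is meant to mirror torch.optim.SGD with its default settings rather than reproduce its implementation.

```python
def sgd_momentum_step(param, grad, velocity, lr=1e-2, momentum=0.9):
    """One update: fold the new gradient into the running velocity, then step against it."""
    velocity = momentum * velocity + grad   # keep a fraction of the previous update
    param = param - lr * velocity           # learning rate scales the step size
    return param, velocity


# Toy problem: minimize f(p) = p**2, whose gradient is 2 * p
p, v = 1.0, 0.0
history = []
for _ in range(3):
    g = 2 * p
    p, v = sgd_momentum_step(p, g, v)
    history.append(p)

print(history)   # each value is closer to the minimum at 0 than the last
```

Because the velocity carries over between steps, the second and third updates are larger than the learning rate times the gradient alone would give, which is the accelerating effect described above.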


In [3]:
optimizer = torch.optim.SGD(model.parameters(), lr=1e-2, momentum=0.9)

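optimizer.step() performs one gradient descent update: it adjusts each parameter using the gradient that loss.backward() stored in that parameter's .grad attribute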

In [None]:
optimizer.step()
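
The effect of a single step can be checked on a small model. The sketch below uses a tiny torch.nn.Linear layer as a hypothetical stand-in for the ResNet so that it runs quickly, and it calls zero_grad() first because gradients accumulate in .grad across backward calls. It relies on the first momentum step reducing to plain gradient descent, since there is no earlier update to blend in yet.

```python
import torch

torch.manual_seed(0)
layer = torch.nn.Linear(3, 1)    # hypothetical stand-in for the ResNet
opt = torch.optim.SGD(layer.parameters(), lr=1e-2, momentum=0.9)

x = torch.rand(4, 3)             # random inputs
y = torch.rand(4, 1)             # random targets

opt.zero_grad()                          # clear gradients left over from earlier passes
loss = (layer(x) - y).pow(2).mean()      # forward pass and scalar loss
loss.backward()                          # fill .grad for weight and bias

before = layer.weight.detach().clone()
grad = layer.weight.grad.clone()
opt.step()                               # update the parameters in place
after = layer.weight.detach().clone()

# On the first step the update is just: weight - lr * grad
expected = before - 1e-2 * grad
print(torch.allclose(after, expected))
```

In a training loop these calls repeat for every batch: zero_grad(), forward pass, backward(), step().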

requires_grad is False by default for tensors we create directly. If we want autograd to track operations on a tensor, we need to create it with requires_grad=True
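
A short sketch of how the flag behaves; the tensor names and values are illustrative only. A result is tracked only if at least one of its inputs is tracked, and only tracked leaf tensors receive a gradient.

```python
import torch

x = torch.tensor([1., 2.])                       # default: not tracked
w = torch.tensor([3., 4.], requires_grad=True)   # tracked by autograd

untracked = x * 2           # built only from untracked tensors
tracked = (w * x).sum()     # depends on w, so autograd records it

print(x.requires_grad, untracked.requires_grad)              # False False
print(tracked.requires_grad, tracked.grad_fn is not None)    # True True

tracked.backward()
print(w.grad)   # tensor([1., 2.]), since d(w*x)/dw = x
print(x.grad)   # None: x was never tracked
```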

In [14]:
a = torch.tensor([2., 3.], requires_grad=True)
b = torch.tensor([6., 4.], requires_grad=True)

# Q plays the role of the error; we want to know how changes in a and b affect it
Q = 3*a**3 - b**2

# Q is a vector, so backward() needs a gradient argument of the same shape;
# a vector of ones gives the same result as calling Q.sum().backward()
external_grad = torch.tensor([1., 1.])
Q.backward(gradient=external_grad)   # stores dQ/da in a.grad and dQ/db in b.grad

Check that the collected gradients match the analytic derivatives, 9a^2 for a and -2b for b

In [16]:
print(9*a**2 == a.grad)
print(-2*b == b.grad)

tensor([True, True])
tensor([True, True])
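
As a worked check, differentiating Q = 3a^3 - b^2 element by element and substituting a = (2, 3) and b = (6, 4) from the cell above gives:

$$
\frac{\partial Q}{\partial a} = 9a^2 = (36,\ 81), \qquad \frac{\partial Q}{\partial b} = -2b = (-12,\ -8)
$$

These are the values a.grad and b.grad should hold after Q.backward(), which is why both comparisons print True.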
