# Module 2 - A Gentle Introduction to TORCH.AUTOGRAD
`torch.autograd` is PyTorch's automatic differentiation engine that powers neural network training.

## Background 
Neural networks (NNs) are a collection of nested functions that are executed on some input data. These functions are defined by parameters (consisting of weights and biases), which in PyTorch are stored in tensors.
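To make the "nested functions" picture concrete, here is a minimal sketch of a two-layer network written by hand. The names `W1`, `b1`, `W2`, `b2`, `layer1`, and `layer2` and the layer sizes are illustrative assumptions, not part of the original module; real models usually build the same structure from `torch.nn` layers.

```python
import torch

torch.manual_seed(0)

# Parameters (weights and biases) are stored in tensors.
# requires_grad=True asks autograd to track operations on them.
W1 = torch.randn(4, 3, requires_grad=True)  # hypothetical layer-1 weights
b1 = torch.zeros(4, requires_grad=True)     # hypothetical layer-1 biases
W2 = torch.randn(2, 4, requires_grad=True)  # hypothetical layer-2 weights
b2 = torch.zeros(2, requires_grad=True)     # hypothetical layer-2 biases

def layer1(x):
    # First function: affine map followed by a ReLU nonlinearity.
    return torch.relu(W1 @ x + b1)

def layer2(h):
    # Second function: affine map on the output of layer1.
    return W2 @ h + b2

x = torch.rand(3)            # some input
output = layer2(layer1(x))   # the network is the nested call
```

The output depends on every parameter tensor through the nested call, which is what lets autograd later compute a gradient for each of them.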

Training an NN happens in two steps:

1. Forward propagation: In forward prop, the NN makes its best guess about the correct output. It runs the input data through each of its functions to make this guess.
2. Backward propagation: In backprop, the NN adjusts its parameters proportionally to the error of its guess. It does this by traversing backwards from the output, collecting the derivatives of the error with respect to the parameters of the functions (gradients), and optimizing the parameters using gradient descent.
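The two steps can be sketched on a toy model with a single parameter, small enough to check the gradient by hand. The model `prediction = w * x`, the squared-error loss, and the learning rate `lr` are assumptions chosen for illustration only.

```python
import torch

# Hypothetical one-parameter model: prediction = w * x
w = torch.tensor(2.0, requires_grad=True)
x = torch.tensor(3.0)
y = torch.tensor(9.0)  # the "correct" output

# Forward propagation: the model makes its guess and we measure the error.
prediction = w * x              # 2 * 3 = 6
loss = (prediction - y) ** 2    # (6 - 9)^2 = 9

# Backward propagation: autograd computes d(loss)/dw and stores it in w.grad.
loss.backward()

# Hand-derived gradient for comparison:
# d(loss)/dw = 2 * (w * x - y) * x = 2 * (6 - 9) * 3 = -18
manual_grad = 2 * (w.detach() * x - y) * x

# One gradient descent update, done manually outside of autograd tracking.
lr = 0.01
with torch.no_grad():
    w_new = w - lr * w.grad     # 2 - 0.01 * (-18) = 2.18
```

The update moves `w` from 2.0 toward 3.0, the value at which `w * x` would equal `y`.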

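import torch
from torchvision.models import resnet18, ResNet18_Weights

# Load a pretrained ResNet-18 model from torchvision.
model = resnet18(weights=ResNet18_Weights.DEFAULT)
# Random tensor standing in for a single image: 3 channels, 64x64 pixels.
data = torch.rand(1, 3, 64, 64)
# Random values standing in for the label, shaped like the model's (1, 1000) output.
labels = torch.rand(1, 1000)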

# Run the input data through each of the model's layers to make a prediction.
# This is the forward pass.
prediction = model(data)

# Use the prediction and the corresponding label to calculate the error (loss),
# then backpropagate that error through the network. Backprop is kicked off by
# calling .backward() on the loss tensor. Autograd then calculates the gradient
# for each model parameter and stores it in the parameter's .grad attribute.
loss = (prediction - labels).sum()
loss.backward()

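# Create an SGD optimizer (learning rate 0.01, momentum 0.9) over all model parameters.
optim = torch.optim.SGD(model.parameters(), lr=1e-2, momentum=0.9)
# Run gradient descent: each parameter is adjusted using the gradient in its .grad.
optim.step()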
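To see what `backward()` and `step()` actually change, the sketch below repeats the same training step on a much smaller model and inspects one parameter along the way. The `torch.nn.Linear(4, 2)` model and the tensor shapes are stand-ins assumed for illustration, so that the example runs without downloading pretrained weights.

```python
import torch

torch.manual_seed(0)

# Hypothetical small stand-in for resnet18.
model = torch.nn.Linear(4, 2)
data = torch.rand(1, 4)
labels = torch.rand(1, 2)
optim = torch.optim.SGD(model.parameters(), lr=1e-2, momentum=0.9)

# Before any backward pass, no gradient has been stored yet.
grad_before = model.weight.grad          # None

# Forward pass, loss, and backward pass, as in the cell above.
loss = (model(data) - labels).sum()
loss.backward()
grad_after = model.weight.grad.clone()   # now populated by autograd

# step() updates the parameters in place using the stored gradients.
weight_before = model.weight.detach().clone()
optim.step()
weight_after = model.weight.detach().clone()

# On the very first step the momentum buffer equals the gradient,
# so the update reduces to: weight -= lr * grad.
expected = weight_before - 1e-2 * grad_after
```

Gradients accumulate in `.grad` across calls to `backward()`, so a training loop typically calls `optim.zero_grad()` before each new backward pass.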