`torch.autograd` is *PyTorch's automatic differentiation engine* that powers neural network training. In this section, you will get a conceptual understanding of how autograd helps a neural network train.

## Background

Neural Networks:
- A collection of nested functions that are executed on some input data.
    - These functions are defined by parameters (consisting of weights and biases), which in PyTorch are stored in tensors.
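
To make the "nested functions" idea concrete, here is a minimal sketch of a two-layer network written as plain functions. The names `f1`, `f2`, `W1`, `b1`, `W2`, `b2` and the layer sizes are illustrative assumptions, not part of the original notebook.

```python
import torch

torch.manual_seed(0)

# Parameters (weights and biases) are stored in tensors.
W1, b1 = torch.randn(4, 3), torch.randn(4)   # inner function: 3 inputs -> 4 hidden units
W2, b2 = torch.randn(2, 4), torch.randn(2)   # outer function: 4 hidden units -> 2 outputs

def f1(x):
    # Inner function: affine map followed by a ReLU non-linearity
    return torch.relu(W1 @ x + b1)

def f2(h):
    # Outer function: affine map
    return W2 @ h + b2

x = torch.rand(3)
y = f2(f1(x))      # the network is the nested call f2(f1(x))
print(y.shape)
```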

### Training a Neural Network occurs in 2 steps:

**Forward Propagation:**
- The NN makes its best guess about the correct output
- It runs the input data through each of its functions to make this guess

**Backward Propagation:**
- The NN adjusts its parameters in proportion to the error in its guess
    - It does this by traversing backwards from the output
        - Collecting the derivatives of the error with respect to the parameters of the functions (the *gradients*)
        - Optimizing the parameters using gradient descent
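
The two steps can be sketched on a toy problem with a single parameter. This is only an illustration under assumed values: the data (`y = 2x`), the starting weight, and the learning rate `lr` are all made up for the example.

```python
import torch

# Toy data following y = 2x; w is the single parameter to learn.
x = torch.tensor([1.0, 2.0, 3.0])
y = torch.tensor([2.0, 4.0, 6.0])
w = torch.tensor(0.0, requires_grad=True)
lr = 0.05  # assumed learning rate

# Forward propagation: make a guess and measure the error
pred = w * x
loss = ((pred - y) ** 2).mean()

# Backward propagation: collect d(loss)/dw, then take one gradient descent step
loss.backward()
with torch.no_grad():
    w -= lr * w.grad

# The error after the update is smaller than before
new_loss = ((w.detach() * x - y) ** 2).mean()
print(loss.item(), new_loss.item())
```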

## Usage in PyTorch 

In [3]:
import torch, torchvision

In [4]:
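# Load a pretrained ResNet-18 (`pretrained=True` is deprecated since torchvision 0.13)
model = torchvision.models.resnet18(weights=torchvision.models.ResNet18_Weights.DEFAULT)
data = torch.rand(1, 3, 64, 64)   # one random 3-channel 64x64 image
labels = torch.rand(1, 1000)      # random target with one value per output class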

### Forward Pass:
Run the input data through each of the model's layers to make a prediction.

In [5]:
prediction = model(data)

After the forward pass:
1. Use the model's prediction and the corresponding label to calculate the error (`loss`)
2. Backpropagate this error through the network by calling `.backward()` on the error tensor
3. Autograd then calculates and stores the gradient for each model parameter in the parameter's `.grad` attribute

In [6]:
loss = (prediction - labels).sum()
loss.backward() # backward pass
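
To see where the gradients end up, the sketch below inspects `.grad` before and after a backward pass. It uses a small `nn.Linear` layer (named `tiny_model` here, a hypothetical stand-in for the ResNet) so that it runs quickly and needs no download.

```python
import torch
from torch import nn

tiny_model = nn.Linear(4, 2)      # hypothetical stand-in for the ResNet
inputs = torch.rand(1, 4)
targets = torch.rand(1, 2)

print(tiny_model.weight.grad)     # None: no backward pass has run yet

loss = (tiny_model(inputs) - targets).sum()
loss.backward()                   # fills .grad on every parameter

for name, param in tiny_model.named_parameters():
    # Each gradient has the same shape as the parameter it belongs to
    print(name, tuple(param.shape), tuple(param.grad.shape))
```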

In [7]:
# Load an optimizer: SGD with a learning rate of 0.01 and momentum of 0.9
optim = torch.optim.SGD(model.parameters(), lr = 1e-2, momentum = 0.9)

In [8]:
# Run one gradient descent step: each parameter is adjusted using the gradient stored in its `.grad`
optim.step()
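
A sketch of one full update, again on a hypothetical small `nn.Linear` stand-in (`tiny_model`) rather than the ResNet. Because autograd accumulates into `.grad` across backward passes, a training loop typically calls `zero_grad()` before each one; that call is an addition here and does not appear in the cells above.

```python
import torch
from torch import nn

torch.manual_seed(0)
tiny_model = nn.Linear(4, 2)      # hypothetical stand-in for the ResNet
optimizer = torch.optim.SGD(tiny_model.parameters(), lr=1e-2, momentum=0.9)

inputs, targets = torch.rand(1, 4), torch.rand(1, 2)
before = tiny_model.bias.detach().clone()

optimizer.zero_grad()             # clear any gradients from earlier passes
loss = (tiny_model(inputs) - targets).sum()
loss.backward()                   # compute fresh gradients
optimizer.step()                  # update the parameters in place

after = tiny_model.bias.detach()
# The bias gradient is 1 for this loss, so the first step moves each bias by -lr
print(after - before)
```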

## Differentiation in Autograd

In [9]:
a = torch.tensor([2., 3.], requires_grad = True)
b = torch.tensor([6., 4.], requires_grad = True)

In [10]:
Q = 3*a**3 - b**2 # Q = 3a^3 - b^2
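
For reference, the gradients autograd is expected to produce can be derived by hand. Treating the operations as elementwise, as in the code:

$$
Q = 3a^3 - b^2, \qquad \frac{\partial Q}{\partial a} = 9a^2, \qquad \frac{\partial Q}{\partial b} = -2b
$$

For $a = (2, 3)$ and $b = (6, 4)$ this gives $\partial Q / \partial a = (36, 81)$ and $\partial Q / \partial b = (-12, -8)$.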

In [11]:
# Q is a vector, so backward() needs an explicit gradient with the same shape as Q
external_grad = torch.tensor([1., 1.])
Q.backward(gradient = external_grad)
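
Passing a gradient of ones is equivalent to first reducing `Q` to a scalar with `.sum()` and then calling `.backward()` with no arguments. A minimal sketch of that alternative, using fresh tensors (`a_alt`, `b_alt`, names chosen for this example) so that it does not accumulate into the gradients computed above:

```python
import torch

a_alt = torch.tensor([2., 3.], requires_grad=True)
b_alt = torch.tensor([6., 4.], requires_grad=True)

Q_alt = 3*a_alt**3 - b_alt**2
Q_alt.sum().backward()   # scalar output, so no explicit gradient argument is needed

print(a_alt.grad)        # 9*a^2 evaluated at a = (2, 3)
print(b_alt.grad)        # -2*b  evaluated at b = (6, 4)
```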

In [12]:
# Check that the gradients match the analytic derivatives dQ/da = 9a^2 and dQ/db = -2b
print(9*a**2 == a.grad)
print(-2*b == b.grad)

tensor([True, True])
tensor([True, True])
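
The exact `==` comparison works here because the values are small integers stored as floats. For less tidy values, a tolerance-based check such as `torch.allclose` is usually safer. A sketch on fresh tensors (the `_chk` names are introduced only for this example):

```python
import torch

a_chk = torch.tensor([2., 3.], requires_grad=True)
b_chk = torch.tensor([6., 4.], requires_grad=True)

Q_chk = 3*a_chk**3 - b_chk**2
Q_chk.backward(gradient=torch.ones_like(Q_chk))

# Compare against the analytic derivatives within a small tolerance
a_ok = torch.allclose(a_chk.grad, 9*a_chk.detach()**2)
b_ok = torch.allclose(b_chk.grad, -2*b_chk.detach())
print(a_ok, b_ok)
```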
