# Points

- Backpropagation chain-rule: https://drive.google.com/file/d/1tUJUn0EJN8oOXlsWF2OiN_ZfWUUc0y-G/view?usp=sharing

- For every operation we perform on tensors that require gradients, PyTorch builds a computational graph for us. Each node applies an operation (function) to its inputs and produces an output.
- At each node we can compute local gradients and later combine them with the chain rule to get the final gradient.
- At the very end of all our operations we compute a loss. We then want the gradient of that loss w.r.t. our parameter x from the beginning.

  https://drive.google.com/file/d/1Q1UciIHYq6eyLxt91N7q01h7nq4CXrbG/view?usp=sharing
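
To make the graph idea concrete, here is a small sketch that inspects the nodes PyTorch records. It assumes the same one-parameter model used in the PyTorch example further down (`y_hat = w * x` with a squared-error loss); the intermediate name `s` is introduced here only for illustration.

```python
import torch

x = torch.tensor(1.0)
y = torch.tensor(2.0)
w = torch.tensor(1.0, requires_grad=True)  # leaf tensor: created by us, not by an operation

y_hat = w * x   # multiplication node
s = y_hat - y   # subtraction node ('s' is an illustrative name)
loss = s ** 2   # power node

# every tensor produced by an operation remembers the function that created it
for name, t in [("w", w), ("y_hat", y_hat), ("s", s), ("loss", loss)]:
    print(name, t.requires_grad, t.grad_fn)

# next_functions links a node to the nodes that produced its inputs
print(loss.grad_fn.next_functions)
```

`w.grad_fn` is `None` because `w` is a leaf, while each derived tensor carries a `grad_fn` that the backward pass later walks through in reverse.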


The whole concept of backpropagation consists of three steps:
  - Forward pass: compute loss
  - Compute local gradients
  - Backward pass: compute dLoss/dWeights using the chain rule

Here is a concrete example, using linear regression:
  - Setup: https://drive.google.com/file/d/1-rSR2uqkcoJMFqnOBfWu3UBW91GgoFMQ/view?usp=sharing
  - 3-step process: https://drive.google.com/file/d/1uIUaKPeEb1p_DRaxLN4nw_dlPJ8oOyuB/view?usp=share_link
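
As a written-out version of the three steps, here is the chain rule for a one-weight linear regression. This is a sketch that assumes the model $\hat{y} = w x$ and the squared-error loss used in the PyTorch example further down; the intermediate symbol $s$ is introduced here for illustration and may not match the notation in the linked figures.

$$
\hat{y} = w x, \qquad s = \hat{y} - y, \qquad \text{loss} = s^2
$$

$$
\frac{\partial\, \text{loss}}{\partial s} = 2s, \qquad
\frac{\partial s}{\partial \hat{y}} = 1, \qquad
\frac{\partial \hat{y}}{\partial w} = x
$$

$$
\frac{\partial\, \text{loss}}{\partial w}
= \frac{\partial\, \text{loss}}{\partial s} \cdot \frac{\partial s}{\partial \hat{y}} \cdot \frac{\partial \hat{y}}{\partial w}
= 2\,(w x - y)\, x
$$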
  
Here is a numerical example:
  - Forward pass: https://drive.google.com/file/d/1kPpteFPz5gJjfcmNsLpUawwiAgyopP1Q/view?usp=sharing
  - Backward pass: https://drive.google.com/file/d/1W6DGUAbFQncCXOfejPoVW5jI2TbFbtDn/view?usp=share_link
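
The same three steps can also be done by hand in plain Python. The sketch below uses the values from the PyTorch example that follows (x = 1, y = 2, initial w = 1); the linked figures may use different numbers, and the variable names are chosen here for illustration.

```python
# values taken from the PyTorch example: x = 1, y = 2, initial w = 1
x, y, w = 1.0, 2.0, 1.0

# 1) forward pass: compute the loss
y_hat = w * x
s = y_hat - y
loss = s ** 2

# 2) local gradients at each node
dloss_ds = 2 * s     # d(s^2)/ds
ds_dy_hat = 1.0      # d(y_hat - y)/dy_hat
dy_hat_dw = x        # d(w * x)/dw

# 3) backward pass: multiply the local gradients (chain rule)
dloss_dw = dloss_ds * ds_dy_hat * dy_hat_dw

print(loss, dloss_dw)  # 1.0 -2.0
```

These are the same loss and gradient that PyTorch reports below.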
  


# Numerical Example using PyTorch

In [3]:
import torch

In [4]:
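# one training sample: input x and target y
x = torch.tensor(1.0)
y = torch.tensor(2.0)

# w is the parameter we want to optimize, so PyTorch must track its gradient
w = torch.tensor(1.0, requires_grad=True)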

# do forward pass and compute the loss
y_hat = w * x
loss = (y_hat - y)**2

print(loss)

tensor(1., grad_fn=<PowBackward0>)


In [5]:
# now we want to do the backward pass
# PyTorch automatically computes the local gradients and the backward pass for us

# we just need to call .backward(); afterwards 'w' holds its gradient in the .grad attribute
loss.backward()
print(w.grad)

tensor(-2.)


In [None]:
# update our weights and do the next forward and backward passes (a few iterations)
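
One way this last cell could be filled in is a plain gradient-descent loop. This is a minimal sketch, not the notebook's own code: `learning_rate` and `n_iters` are assumed values chosen for illustration.

```python
import torch

x = torch.tensor(1.0)
y = torch.tensor(2.0)
w = torch.tensor(1.0, requires_grad=True)

learning_rate = 0.1  # assumed value, not from the original notebook
n_iters = 20         # assumed value, not from the original notebook

for i in range(n_iters):
    # forward pass: prediction and loss
    y_hat = w * x
    loss = (y_hat - y) ** 2

    # backward pass: fills w.grad with dloss/dw
    loss.backward()

    # the update itself must not be recorded in the graph
    with torch.no_grad():
        w -= learning_rate * w.grad

    # gradients accumulate across .backward() calls, so reset them
    w.grad.zero_()

print(w.item(), loss.item())
```

With these settings `w` moves toward 2, the value at which `w * x` equals `y`. Forgetting `w.grad.zero_()` would add each new gradient to the old one and distort the updates.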