In [1]:
import torch
from torch import nn
import torchviz

## PyTorch Autograd: Backpropagation

During the forward pass, autograd builds a **computation graph** eagerly, that is, as each operation executes. The graph is represented with **Nodes and Edges**.

- Whenever an operation (e.g., `mul`) is called on a tensor that *requires grad*, PyTorch creates a **Node** in the computation graph.
- A Node can store *references to Tensors* and *other things that it needs for the backward computation* (we’ll see an example of this later).
- When it is time to compute gradients, gradient values are passed backward through the autograd graph, starting from the output.
- Each Node defines a backward formula, which maps the gradient with respect to the Node's output to the gradients with respect to its inputs.
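
The points above can be seen on a very small graph. The following is a minimal sketch, not part of the original example: the tensors `a`, `b`, and `c` are hypothetical names chosen only for illustration. It shows that the result of an operation carries a `grad_fn` Node, while a tensor created directly by the user does not.

```python
import torch

# Hypothetical tensors, used only for this illustration.
a = torch.tensor(2.0, requires_grad=True)  # created by the user, tracked by autograd
b = torch.tensor(3.0)                      # not tracked (requires_grad is False)

c = a * b  # mul on a tensor that requires grad creates a Node

# A user-created tensor has no grad_fn; the result of an operation does.
print(a.grad_fn)
print(type(c.grad_fn).__name__)

# Edges: the Nodes that c's Node passes gradients on to.
print(c.grad_fn.next_functions)

# Backward pass: the Node's backward formula gives dc/da = b.
c.backward()
print(a.grad)
```

Since `c = a * b`, the backward formula of the multiplication Node yields `dc/da = b`, so `a.grad` should equal the value of `b`.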

## Simple Linear Regression Example

Consider the linear regression problem
$$
    t = Ax + B,
$$
whose solution, for the data points defined below, is $A = 2$, $B = 3$.
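
As a quick sanity check, the stated solution can be verified against the two data points used in this example. This is only a sketch; `A_true` and `B_true` are hypothetical names introduced here and are not used elsewhere in the notebook.

```python
import torch

# The two data points used in this example.
x = torch.tensor([1., 2.])
t = torch.tensor([5., 7.])

# Hypothetical names for the stated solution.
A_true, B_true = 2.0, 3.0

# 2 * 1 + 3 = 5 and 2 * 2 + 3 = 7, so the line passes through both points.
fits_exactly = torch.equal(A_true * x + B_true, t)
print(fits_exactly)  # True
```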

In [2]:
x = torch.tensor([1., 2.])
t = torch.tensor([5., 7.])

# Initialize A and B randomly; requires_grad=True tells autograd to track them
A = torch.rand((1,), requires_grad=True)
B = torch.rand((1,), requires_grad=True)
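
Before any operation is run, it can help to inspect what autograd knows about these tensors. The sketch below repeats the setup so that it runs on its own; the fixed seed is an assumption added only for reproducibility.

```python
import torch

torch.manual_seed(0)  # assumption: fixed seed, only so the sketch is reproducible

x = torch.tensor([1., 2.])
A = torch.rand((1,), requires_grad=True)
B = torch.rand((1,), requires_grad=True)

# A and B are leaf tensors: created by the user, not by an operation.
print(A.is_leaf, B.is_leaf)

# Leaf tensors have no grad_fn, and no gradient until backward() is called.
print(A.grad_fn, A.grad)

# The data tensor x is not tracked, so no gradient is computed for it.
print(x.requires_grad)
```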

The forward pass is as follows.

In [3]:
# Forward pass: predict t from x
scaled = A * x
t_hat = scaled + B

# Squared-error loss, summed over the data points
diff = t - t_hat
sqdiff = diff ** 2
loss = sqdiff.sum()
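
Each line of the forward pass adds one Node to the graph, and calling `backward()` on the loss walks those Nodes in reverse. The sketch below repeats the forward pass so that it runs on its own (the fixed seed is an assumption). It then compares autograd's result with the gradients derived by hand for this loss: $\partial L / \partial A = -2 \sum_i (t_i - \hat{t}_i)\, x_i$ and $\partial L / \partial B = -2 \sum_i (t_i - \hat{t}_i)$.

```python
import torch

torch.manual_seed(0)  # assumption: fixed seed, only so the sketch is reproducible

x = torch.tensor([1., 2.])
t = torch.tensor([5., 7.])
A = torch.rand((1,), requires_grad=True)
B = torch.rand((1,), requires_grad=True)

# Same forward pass as above.
scaled = A * x
t_hat = scaled + B
diff = t - t_hat
sqdiff = diff ** 2
loss = sqdiff.sum()

# Every intermediate tensor points to the Node that produced it.
for name, tensor in [("scaled", scaled), ("t_hat", t_hat), ("diff", diff),
                     ("sqdiff", sqdiff), ("loss", loss)]:
    print(name, type(tensor.grad_fn).__name__)

# Backward pass: fills in A.grad and B.grad.
loss.backward()

# Hand-derived gradients of the summed squared error, for comparison.
with torch.no_grad():
    residual = t - (A * x + B)
    grad_A_manual = (-2 * residual * x).sum()
    grad_B_manual = (-2 * residual).sum()

grads_match = (torch.allclose(A.grad, grad_A_manual)
               and torch.allclose(B.grad, grad_B_manual))
print(grads_match)
```

If the two sets of gradients agree, the backward formulas stored in the graph reproduce the chain rule applied by hand.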