In [2]:
# Step 1. Perform standard imports
import torch

In [3]:
# Step 2. Create a tensor with requires_grad set to True
# This tells autograd to track every operation performed on the tensor.
x = torch.tensor(2.0, requires_grad=True)

In [4]:
# Step 3. Define a function
y = 2*x**4 + x**3 + 3*x**2 + 5*x + 1

print(y)

tensor(63., grad_fn=<AddBackward0>)


Since $y$ was created as the result of an operation, it has an associated gradient function, accessible as <tt>y.grad_fn</tt>.<br>
The value of $y$ is computed as:<br>

$\quad y=2(2)^4+(2)^3+3(2)^2+5(2)+1 = 32+8+12+10+1 = 63$

This is the value of $y$ when $x=2$.

#### Step 4. Backprop

In [5]:
y.backward()

In [6]:
# Step 5. Display the resulting gradient
print(x.grad)

tensor(93.)


Note that <tt>x.grad</tt> is an attribute of tensor $x$, so we don't use parentheses. Differentiating the polynomial gives $y'=8x^3+3x^2+6x+5$, and evaluating it at $x=2$ yields<br>

$\quad y'=8(2)^3+3(2)^2+6(2)+5 = 64+12+12+5 = 93$

This is the slope of the polynomial at the point $(2,63)$.
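As a quick sanity check, we can compare the autograd result against the hand-derived derivative. This is only a minimal sketch: the helper names `poly` and `poly_derivative` are not part of the original notebook and are introduced here purely for illustration.

```python
import torch

def poly(x):
    # Same polynomial as in Step 3 (hypothetical helper name)
    return 2*x**4 + x**3 + 3*x**2 + 5*x + 1

def poly_derivative(x):
    # Derivative worked out by hand: y' = 8x^3 + 3x^2 + 6x + 5
    return 8*x**3 + 3*x**2 + 6*x + 5

x = torch.tensor(2.0, requires_grad=True)
y = poly(x)
y.backward()

autograd_slope = x.grad.item()         # slope computed by autograd
analytic_slope = poly_derivative(2.0)  # slope computed from the formula

print(autograd_slope, analytic_slope)
```

Both values should agree at 93, matching the slope shown above.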



## Back-propagation on multiple steps
Now let's do something more complex, involving layers $y$ and $z$ between $x$ and our output layer $out$.


In [7]:
# 1. Create a tensor
x = torch.tensor([[1.,2,3],[3,2,1]], requires_grad=True)
print(x)

tensor([[1., 2., 3.],
        [3., 2., 1.]], requires_grad=True)


In [9]:
# 2. Create the first layer with y = 3x+2
y = 3*x + 2
print(y)

tensor([[ 5.,  8., 11.],
        [11.,  8.,  5.]], grad_fn=<AddBackward0>)


#### 3. Create the second layer with $z = 2y^2$

In [10]:
z = 2*y**2
print(z)

tensor([[ 50., 128., 242.],
        [242., 128.,  50.]], grad_fn=<MulBackward0>)


In [11]:
# 4. Set the output to be the matrix mean
out = z.mean()
print(out)

tensor(140., grad_fn=<MeanBackward0>)


In [12]:
# 5. Now perform back-propagation to find the gradient of out w.r.t. x
out.backward()
print(x.grad)

tensor([[10., 16., 22.],
        [22., 16., 10.]])
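To see where these numbers come from, we can apply the chain rule by hand. With six elements in $x$, $out = \frac{1}{6}\sum z_i$, $z = 2y^2$ and $y = 3x+2$, so each element of the gradient is $\frac{1}{6}\cdot 4y \cdot 3 = 2y = 6x+4$. The sketch below rebuilds the same layers and compares autograd's answer to that formula; the names `data` and `expected` are illustrative and do not appear in the original notebook.

```python
import torch

# Plain (untracked) copy of the input values, used for the hand calculation
data = torch.tensor([[1., 2, 3], [3, 2, 1]])

# Same layers as in the cells above
x = torch.tensor([[1., 2, 3], [3, 2, 1]], requires_grad=True)
y = 3*x + 2
z = 2*y**2
out = z.mean()
out.backward()

# Chain rule by hand: d(out)/dx = (1/6) * 4*y * 3 = 2*y = 6*x + 4
expected = 6*data + 4

print(expected)
print(torch.allclose(x.grad, expected))
```

For example, the first element gives $6(1)+4 = 10$, which agrees with the top-left entry of <tt>x.grad</tt>.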


## Turn off tracking
There may be times when we don't want or need to track the computational history.

You can reset a tensor's <tt>requires_grad</tt> attribute in-place using `.requires_grad_(True)` (or False) as needed.

When performing evaluations, it's often helpful to wrap a set of operations in a `with torch.no_grad():` block, which disables tracking for everything computed inside it.

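A less-used method is to call `.detach()` on a tensor. This returns a new tensor that shares the same data but is cut off from the computational graph, so operations on the detached tensor are not tracked, while the original tensor is left unchanged. This can be handy when cloning a tensor.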
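The three approaches can be compared side by side. The following is a minimal sketch; the variable names `a` through `e` are illustrative only. In each case we check the result's <tt>requires_grad</tt> attribute to see whether the operation was tracked.

```python
import torch

x = torch.tensor(2.0, requires_grad=True)

# 1. Toggle tracking in place on the tensor itself
x.requires_grad_(False)
a = x * 3
print(a.requires_grad)   # False: x was not being tracked
x.requires_grad_(True)   # switch tracking back on

# 2. Temporarily disable tracking for a block of operations
with torch.no_grad():
    b = x * 3
print(b.requires_grad)   # False: computed inside no_grad

# 3. Work with a detached tensor that shares x's data
c = x.detach()
d = c * 3
print(d.requires_grad)   # False: c is cut off from the graph
print(x.requires_grad)   # True: the original tensor is unchanged

# Outside of these cases, operations on x are tracked as usual
e = x * 3
print(e.requires_grad)   # True
```

Note that `torch.no_grad()` only affects operations inside the `with` block, whereas `.requires_grad_()` changes the tensor until it is switched back.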