PyTorch Gradients
This section covers how the PyTorch autograd package computes the gradients used in gradient descent. Tools include:

torch.autograd.backward()
torch.autograd.grad()
Before continuing in this section, be sure to watch the theory lectures to understand the following concepts:

Error functions (step and sigmoid)
One-hot encoding
Maximum likelihood
Cross entropy (including multi-class cross entropy)
Back propagation (backprop)
Additional Resources:
PyTorch Notes: Autograd mechanics
Autograd - Automatic Differentiation
In previous sections we created tensors and performed a variety of operations on them, but we did nothing to store the sequence of operations or to compute the derivative of the resulting function.

In this section we'll introduce the concept of the dynamic computational graph, which is made up of all the Tensor objects in the network together with the Functions used to create them. Only the input Tensors we create ourselves have no associated Function object (their .grad_fn attribute is None).
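As a minimal sketch of the distinction between tensors we create and tensors produced by operations, the example below inspects the .is_leaf and .grad_fn attributes. The names a and b are illustrative only.

```python
import torch

# A leaf tensor: created directly by the user, so no Function produced it
a = torch.tensor(3.0, requires_grad=True)

# A non-leaf tensor: the result of an operation on a tracked tensor
b = a * 2

print(a.is_leaf, a.grad_fn)  # leaf tensor, grad_fn is None
print(b.is_leaf, b.grad_fn)  # not a leaf, grad_fn records the multiplication
```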

The PyTorch autograd package provides automatic differentiation for all operations on Tensors. It can do this because every tensor produced by an operation keeps a reference to that operation in its .grad_fn attribute. When a Tensor's .requires_grad attribute is set to True, autograd starts to track all operations on it. Once the computation is finished you can call .backward() on the result and have all the gradients computed automatically. The gradient for each tracked input tensor is accumulated into its .grad attribute.
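Because gradients are accumulated rather than overwritten, calling .backward() twice without resetting adds the two results together. The following is a small sketch of that behaviour, using an illustrative tensor w and the function 3w, neither of which appears elsewhere in this section.

```python
import torch

w = torch.tensor(1.0, requires_grad=True)

# First backward pass: d(3w)/dw = 3
(3 * w).backward()
first = w.grad.clone()

# Second backward pass without resetting: 3 is added to the stored 3
(3 * w).backward()
second = w.grad.clone()

# Reset the stored gradient in place before any further pass
w.grad.zero_()

print(first, second, w.grad)
```

This is why training loops typically zero the gradients before each backward pass.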

Let's see this in practice.

Back-propagation on one step
We'll start by applying a single polynomial function $y = f(x)$ to tensor $x$. Then we'll backprop and print the gradient $\frac{dy}{dx}$.

Function: $y = 2x^4 + x^3 + 3x^2 + 5x + 1$
Derivative: $y' = 8x^3 + 3x^2 + 6x + 5$
Step 1. Perform standard imports

In [1]:
import torch

In [2]:
# Create a scalar tensor and ask autograd to track operations on it
x = torch.tensor(2.0, requires_grad=True)

In [3]:
# Define the polynomial y = f(x)
y = 2*x**4 + x**3 + 3*x**2 + 5*x + 1

In [4]:
print(y)

tensor(63., grad_fn=<AddBackward0>)


In [5]:
# Backpropagate: compute dy/dx and store it in x.grad
y.backward()

In [6]:
x.grad 

tensor(93.)
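This matches the hand-derived derivative: at x = 2, y' = 8(8) + 3(4) + 6(2) + 5 = 93. As a rough sketch, the same value can be obtained with torch.autograd.grad(), which returns the gradient instead of storing it in x.grad. The names dy_dx and expected are illustrative.

```python
import torch

x = torch.tensor(2.0, requires_grad=True)
y = 2*x**4 + x**3 + 3*x**2 + 5*x + 1

# torch.autograd.grad returns a tuple with one gradient per input
# and does not write to x.grad
(dy_dx,) = torch.autograd.grad(y, x)

# Hand-derived derivative evaluated at the same point, outside the graph
x0 = x.detach()
expected = 8*x0**3 + 3*x0**2 + 6*x0 + 5

print(dy_dx, expected)
print(x.grad)  # still None, since backward() was never called here
```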

In [8]:
# Create a 2x3 input tensor for a computation with several steps
x = torch.tensor([[1., 2., 3.], [3., 2., 1.]], requires_grad=True)

In [9]:
print(x)

tensor([[1., 2., 3.],
        [3., 2., 1.]], requires_grad=True)


In [10]:
# First operation: an affine function of x
y = 3*x + 2

In [11]:
print(y)

tensor([[ 5.,  8., 11.],
        [11.,  8.,  5.]], grad_fn=<AddBackward0>)


In [12]:
# Second operation: square y and scale by 2
z = 2*y**2

In [13]:
print(z)

tensor([[ 50., 128., 242.],
        [242., 128.,  50.]], grad_fn=<MulBackward0>)


In [14]:
# Reduce to a scalar so that backward() needs no gradient argument
out = z.mean()

In [15]:
print(out)

tensor(140., grad_fn=<MeanBackward0>)


In [16]:
# Backpropagate from the scalar output through z and y to x
out.backward()

In [17]:
print(x.grad)

tensor([[10., 16., 22.],
        [22., 16., 10.]])
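These values follow from the chain rule. With out equal to the mean of 2y² over six elements and y = 3x + 2, the partial derivative for each element is (1/6) · 4y · 3 = 2y. The sketch below repeats the computation and compares x.grad against that hand-derived result; the name manual is illustrative.

```python
import torch

x = torch.tensor([[1., 2., 3.], [3., 2., 1.]], requires_grad=True)
y = 3*x + 2
out = (2*y**2).mean()
out.backward()

# Chain rule by hand: d(out)/dx = (1/6) * 4*y * 3 = 2*y
manual = 2 * y.detach()

# Compare with a tolerance, since the intermediate 1/6 is not exact in float32
print(torch.allclose(x.grad, manual))
```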
