# Autograd: automatic differentiation

The ``autograd`` package provides automatic differentiation for all operations
on Tensors. It is a define-by-run framework, which means that your backprop is
defined by how your code is run, and that every single iteration can be
different.
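To make "define-by-run" concrete, here is a toy sketch in plain Python. It is not how PyTorch implements autograd: the `Scalar` class and the `run` helper are hypothetical names made up for illustration. Each operation records its inputs as it executes, so the graph that backprop walks is whatever the Python code happened to build on that run.

```python
class Scalar:
    """Toy scalar that remembers how it was computed (illustrative only)."""

    def __init__(self, value, parents=()):
        self.value = value
        self.grad = 0.0
        # Each entry is (input_node, d(self)/d(input_node)).
        self.parents = parents

    def __sub__(self, other):
        other = other if isinstance(other, Scalar) else Scalar(other)
        return Scalar(self.value - other.value, ((self, 1.0), (other, -1.0)))

    def __mul__(self, other):
        other = other if isinstance(other, Scalar) else Scalar(other)
        return Scalar(self.value * other.value,
                      ((self, other.value), (other, self.value)))

    def backward(self, upstream=1.0):
        # Accumulate the incoming gradient, then apply the chain rule
        # along every recorded edge (simple, but not efficient).
        self.grad += upstream
        for node, local in self.parents:
            node.backward(upstream * local)


def run(x_value, steps):
    """Build a graph whose depth depends on `steps`, then backprop through it."""
    x = Scalar(x_value)
    y = x - 2
    for _ in range(steps):  # ordinary Python control flow shapes the graph
        y = y * 2
    y.backward()
    return x.grad


print(run(3.0, 0))  # no doubling recorded on this run
print(run(3.0, 3))  # three doublings recorded on this run
```

The same function yields a different gradient on each call because a different graph was recorded, which is what "every single iteration can be different" means above.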

In [1]:
import torch


Create a tensor:

In [2]:
# Create a 2x2 tensor with gradient-accumulation capabilities
x = torch.tensor([[1, 2], [3, 4]], requires_grad=True, dtype=torch.float32)
print(x)

tensor([[1., 2.],
        [3., 4.]], requires_grad=True)


Do an operation on the tensor:

In [3]:
# Subtract 2 from every element
y = x - 2
print(y)

tensor([[-1.,  0.],
        [ 1.,  2.]], grad_fn=<SubBackward0>)


``y`` was created as a result of an operation, so it has a ``grad_fn``.



In [4]:
print(y.grad_fn)

<SubBackward0 object at 0x00000204F4C2D7B8>


In [5]:
# What's happening here?
print(x.grad_fn)

None


In [6]:
# Let's dig further...
y.grad_fn

<SubBackward0 at 0x204f4c2d3c8>

In [7]:
y.grad_fn.next_functions[0][0]

<AccumulateGrad at 0x204f63532b0>

In [8]:
y.grad_fn.next_functions[0][0].variable

tensor([[1., 2.],
        [3., 4.]], requires_grad=True)

In [9]:
# Do more operations on y
z = y * y * 3
a = z.mean()  # average

print(z)
print(a)

tensor([[ 3.,  0.],
        [ 3., 12.]], grad_fn=<MulBackward0>)
tensor(4.5000, grad_fn=<MeanBackward1>)


In [16]:
!pip3 install torchviz


In [10]:
# Let's visualise the computational graph! (thanks @szagoruyko)
from torchviz import make_dot


In [None]:
make_dot(a)

## Gradients

Let's backprop now. Because `a` is a scalar, `a.backward()` is equivalent to `a.backward(torch.tensor(1.0))`.

In [None]:
# Backprop
a.backward()

Print gradients $\frac{\text{d}a}{\text{d}x}$.




In [None]:
# Compute it by hand BEFORE executing this
print(x.grad)
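To check your hand computation: from the cells above, $a = \frac{1}{4}\sum_i 3\,(x_i - 2)^2$, so $\frac{\partial a}{\partial x_i} = \frac{3}{2}(x_i - 2)$. The sketch below compares that formula against a central finite difference in plain Python, using the flattened values of `x` from the first cell. The helper names `a_of` and `numerical_grad` are hypothetical and not part of PyTorch.

```python
def a_of(x):
    """a = mean(3 * (x - 2)**2) over a flat list of values."""
    return sum(3.0 * (v - 2.0) ** 2 for v in x) / len(x)


def numerical_grad(f, x, eps=1e-6):
    """Central finite-difference estimate of df/dx_k for every k."""
    grads = []
    for k in range(len(x)):
        up = x[:k] + [x[k] + eps] + x[k + 1:]
        down = x[:k] + [x[k] - eps] + x[k + 1:]
        grads.append((f(up) - f(down)) / (2 * eps))
    return grads


x_flat = [1.0, 2.0, 3.0, 4.0]                   # x from the first cell, flattened
analytic = [1.5 * (v - 2.0) for v in x_flat]    # 3/2 * (x_i - 2)
numeric = numerical_grad(a_of, x_flat)

for a_k, n_k in zip(analytic, numeric):
    print(f"analytic={a_k:+.4f}  numeric={n_k:+.4f}")
```

Reshaped to 2x2, the analytic values are what `x.grad` should hold after `a.backward()`.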

You can do many crazy things with autograd!
> With Great *Flexibility* Comes Great Responsibility

In [None]:
# Dynamic graphs!
x = torch.randn(3, requires_grad=True)

y = x * 2
i = 0
while y.detach().norm() < 1000:  # detach() replaces the legacy .data attribute
    y = y * 2
    i += 1
print(y)

In [None]:
# y is not a scalar, so backward() needs a gradient argument with the same shape as y
gradients = torch.tensor([0.1, 1.0, 0.0001])
y.backward(gradients)

print(x.grad)

In [None]:
# BEFORE executing this, can you tell what you would expect it to print?
print(i)
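One way to reason about the loop above: after `i` extra doublings, `y = x * 2**(i + 1)`, so `y.backward(gradients)` should leave `gradients * 2**(i + 1)` in `x.grad`. The plain-Python sketch below replays the loop without PyTorch. The starting values in `x` are a hypothetical fixed stand-in for `torch.randn(3)`, so the `i` you get in the notebook will generally differ.

```python
import math


def doubling_loop(x):
    """Mirror the notebook loop: double y until its Euclidean norm reaches 1000."""
    y = [2.0 * v for v in x]
    i = 0
    while math.sqrt(sum(v * v for v in y)) < 1000:
        y = [2.0 * v for v in y]
        i += 1
    return y, i


x = [0.5, -1.0, 2.0]             # assumed values, standing in for torch.randn(3)
weights = [0.1, 1.0, 0.0001]     # the `gradients` tensor from the cell above

y, i = doubling_loop(x)
scale = 2.0 ** (i + 1)           # dy_k/dx_k for every component k
expected_grad = [w * scale for w in weights]

print(i, scale)
print(expected_grad)
```

Because the number of doublings depends on the data, the gradient scale does too, which is the point of the dynamic-graph example.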

## Inference

In [None]:
# n sets the length of the tensors below
n = 3

In [None]:
# Both x and w allow gradient accumulation
x = torch.arange(1., n + 1, requires_grad=True)
w = torch.ones(n, requires_grad=True)
z = w @ x
z.backward()
print(x.grad, w.grad, sep='\n')

In [None]:
# Only w allows gradient accumulation
x = torch.arange(1., n + 1)
w = torch.ones(n, requires_grad=True)
z = w @ x
z.backward()
print(x.grad, w.grad, sep='\n')
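For reference when reading the output of the two cells above, the gradients of the dot product follow directly from its definition:

$$
z = w^\top x = \sum_{k=1}^{n} w_k x_k
\quad\Longrightarrow\quad
\frac{\partial z}{\partial w_k} = x_k,
\qquad
\frac{\partial z}{\partial x_k} = w_k .
$$

With $n = 3$, $x = (1, 2, 3)$ and $w = (1, 1, 1)$, `w.grad` should therefore equal `x` and, in the first cell, `x.grad` should equal `w`. In the second cell `x` was created without `requires_grad=True`, so autograd never populates it and `x.grad` is `None`.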

In [None]:
x = torch.arange(1., n + 1)
w = torch.ones(n, requires_grad=True)

# Nothing computed inside this context is tracked, so z gets no grad_fn even though w requires grad
with torch.no_grad():
    z = w @ x

try:
    z.backward()  # PyTorch will throw an error here, since z has no grad accum.
except RuntimeError as e:
    print('RuntimeError!!! >:[')
    print(e)

## More stuff

Documentation of the automatic differentiation package is at
http://pytorch.org/docs/autograd.