# Autograd: Automatic Differentiation
* The `autograd` package provides automatic differentiation for all operations on tensors.
* Set `requires_grad` to `True` on a `torch.Tensor`, and all operations on it will be tracked.
* Call `.backward()` when the computation is finished to have all gradients computed automatically; they accumulate in each tracked tensor's `.grad` attribute.
* To stop a tensor from tracking history, call `.detach()` to detach it from computation history.
* To prevent tracking history/using memory, wrap the code block in `with torch.no_grad():`
* `Function` class:
    * `Tensor` and `Function` are interconnected in an acyclic graph that encodes the computation history.
    * Each `Tensor` has a `.grad_fn` attribute if it is created by a `Function`. `.grad_fn is None` if the `Tensor` was created by the user.
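
As a minimal sketch of the graph described above, the snippet below inspects `.grad_fn` on a user-created tensor and on the result of an operation. It assumes the `is_leaf` and `next_functions` attributes, which are not mentioned in these notes, behave as in recent PyTorch versions.

```python
import torch

# Leaf tensor created by the user: it has no grad_fn.
x = torch.ones(2, 2, requires_grad=True)
# Result of an operation: grad_fn records the Function that produced it.
y = x + 2

print(x.grad_fn)                 # None, because x was not produced by an operation
print(x.is_leaf)                 # True
print(type(y.grad_fn).__name__)  # name of the backward node for the addition

# next_functions lists the nodes that feed this one as (node, input_index) pairs.
# Inputs that need no gradient (here, presumably the constant 2) have no node.
for node, index in y.grad_fn.next_functions:
    print(node, index)
```

Following `next_functions` from a result back to its leaf tensors is one way to see the acyclic graph that `backward()` traverses.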

In [1]:
import torch

In [2]:
x = torch.ones(2, 2, requires_grad=True)
x

tensor([[1., 1.],
        [1., 1.]], requires_grad=True)

In [3]:
y = x + 2
y

tensor([[3., 3.],
        [3., 3.]], grad_fn=<AddBackward0>)

`y` was created as a result of an operation, so it has a `.grad_fn`.

In [4]:
y.grad_fn

<AddBackward0 at 0x10adfe700>

In [5]:
z = y * y * 3
out = z.mean()

print(z, out)

tensor([[27., 27.],
        [27., 27.]], grad_fn=<MulBackward0>) tensor(27., grad_fn=<MeanBackward0>)
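
The cell above builds `out` but never backpropagates through it. The sketch below rebuilds the same tensors so that it is self-contained, calls `backward()` on the scalar `out`, and reads the gradient from `x.grad`.

```python
import torch

x = torch.ones(2, 2, requires_grad=True)
y = x + 2
z = y * y * 3
out = z.mean()

# out is a scalar, so backward() needs no argument.
out.backward()

# out = (1/4) * sum(3 * (x + 2)^2), so d(out)/dx = 1.5 * (x + 2),
# which is 4.5 for every element because x is all ones.
print(x.grad)
```

Each entry of `x.grad` is 4.5, matching the hand-derived value in the comment.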


`.requires_grad_()` changes an existing Tensor's `requires_grad` flag in-place. A tensor's flag defaults to `False` if it is not given at creation; the argument of `.requires_grad_()` itself defaults to `True`.

In [7]:
a = torch.randn(3, 3)
a = ((a * 4) / (a - 1))
print(a.requires_grad)
a.requires_grad_(True)
print(a.requires_grad)
b = (a * a).sum()
print(b.grad_fn)

False
True
<SumBackward0 object at 0x1316a0ca0>


## Gradients
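`torch.autograd` is an engine for computing vector-Jacobian products. For a vector-valued function `y = f(x)` with Jacobian `J` and a vector `v`, it computes `v^T J`. If `v` is the gradient of a scalar function `l = g(y)` with respect to `y`, then by the chain rule this product is the gradient of `l` with respect to `x`.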
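
Written out, the product looks as follows. The symbols $m$, $n$, $l$, and $v$ are notation introduced here for illustration: $\vec{y} = f(\vec{x})$ maps $n$ inputs to $m$ outputs, and $l = g(\vec{y})$ is a scalar function of the outputs.

```latex
\[
J =
\begin{pmatrix}
\frac{\partial y_1}{\partial x_1} & \cdots & \frac{\partial y_1}{\partial x_n} \\
\vdots & \ddots & \vdots \\
\frac{\partial y_m}{\partial x_1} & \cdots & \frac{\partial y_m}{\partial x_n}
\end{pmatrix},
\qquad
v =
\begin{pmatrix}
\frac{\partial l}{\partial y_1} \\ \vdots \\ \frac{\partial l}{\partial y_m}
\end{pmatrix}
\]
\[
J^{T} v =
\begin{pmatrix}
\sum_{i=1}^{m} \frac{\partial l}{\partial y_i} \frac{\partial y_i}{\partial x_1} \\
\vdots \\
\sum_{i=1}^{m} \frac{\partial l}{\partial y_i} \frac{\partial y_i}{\partial x_n}
\end{pmatrix}
=
\begin{pmatrix}
\frac{\partial l}{\partial x_1} \\ \vdots \\ \frac{\partial l}{\partial x_n}
\end{pmatrix}
\]
```

The column vector $J^{T} v$ is the transpose of the row vector $v^{T} J$, so both carry the same entries.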

In [10]:
x = torch.randn(4, requires_grad=True)

y = x * 2

# Keep doubling y until its norm reaches 1000; detach() keeps the check out of the graph.
while y.detach().norm() < 1000:
    y = y * 2

print(y)

tensor([-580.3675, 1696.1810, -281.3878, -480.0916], grad_fn=<MulBackward0>)


Here `y` is no longer a scalar, and `backward()` does not compute the full Jacobian directly. If you want the vector-Jacobian product, pass the vector to `backward()` as an argument.

In [12]:
v = torch.tensor([0.1, 1.0, 0.0001, 0.5], dtype=torch.float)
y.backward(v)

x.grad

tensor([2.0480e+02, 2.0480e+03, 2.0480e-01, 1.0240e+03])
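
To check what `backward(v)` computes, the sketch below compares `x.grad` with an explicit `v @ J`. It makes two assumptions that are not part of the notes above: the doubling loop is replaced by a fixed factor of 2048 (the factor implied by the output shown, wrapped in a hypothetical function `f`), and the Jacobian comes from `torch.autograd.functional.jacobian`, a helper available in recent PyTorch versions.

```python
import torch
from torch.autograd.functional import jacobian


def f(x):
    # Hypothetical stand-in for the doubling loop: a fixed scaling of x.
    return x * 2048


x = torch.randn(4, requires_grad=True)
v = torch.tensor([0.1, 1.0, 0.0001, 0.5], dtype=torch.float)

# Vector-Jacobian product through backward(v).
y = f(x)
y.backward(v)

# Full 4 x 4 Jacobian, built explicitly only for comparison.
J = jacobian(f, x.detach())

# backward(v) stored v^T J in x.grad.
print(torch.allclose(x.grad, v @ J))
```

Because `f` scales each element independently, `J` is diagonal and `x.grad` is simply `v * 2048`, which agrees with the values printed above.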

Ways to stop `autograd` from tracking history on tensors:
1. `with torch.no_grad():`
2. `.detach()`

In [14]:
print(x.requires_grad)
print((x ** 2).requires_grad)

with torch.no_grad():
    print((x ** 2).requires_grad)

True
True
False
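
One common use of `torch.no_grad()` is updating a tensor in place without recording the update in the graph. The following is a sketch under assumed names: `w`, `loss`, and the learning rate `lr` are hypothetical and do not appear in the notes above.

```python
import torch

w = torch.tensor([1.0, -2.0], requires_grad=True)
loss = (w ** 2).sum()
loss.backward()  # w.grad = 2 * w

lr = 0.1  # hypothetical learning rate
with torch.no_grad():
    # The in-place update is not tracked by autograd.
    w -= lr * w.grad

# Gradients accumulate across backward() calls, so clear them before the next step.
w.grad.zero_()

print(w)
print(w.requires_grad)  # tracking resumes once the block is left
```

The flag on `w` is untouched: `no_grad` only pauses tracking for the operations executed inside the block.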


In [13]:
print(x.requires_grad)
y = x.detach()
print(y.requires_grad)
print(x.eq(y).all())

True
False
tensor(True)
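
The sketch below illustrates two properties of `.detach()` that the cell above does not show: the detached tensor shares storage with the original, and autograd treats it as a constant. The names `d` and `out` are chosen here for illustration.

```python
import torch

x = torch.ones(3, requires_grad=True)
d = x.detach()

# detach() returns a tensor on the same storage, not a copy.
print(d.data_ptr() == x.data_ptr())

# Gradients flow only through the tracked factor x; d acts as a constant.
out = (x * d).sum()
out.backward()

# d(out)/dx = d, so the gradient is all ones here (it would be 2 * x without detach).
print(x.grad)
```

Because the storage is shared, modifying `d` in place also changes `x`; call `.clone()` on the detached tensor if an independent copy is needed.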
