# Item: Automatic Differentiation

#### References

1. **Autograd: Automatic Differentiation:** https://pytorch.org/tutorials/beginner/blitz/autograd_tutorial.html

## Automatic Differentiation

The ```autograd``` package provides automatic differentiation for all operations on Tensors. It is a define-by-run framework, which means that your backprop is defined by how your code is run, and that every single iteration can be different.

Generally speaking, ```torch.autograd``` is an engine for computing vector-Jacobian products.
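
To make the vector-Jacobian product concrete, here is a minimal sketch. The tensor values and the names ```v``` and ```expected``` are assumptions chosen only for illustration. For a non-scalar output, ```.backward()``` takes the vector ```v``` and stores the product of ```v``` with the Jacobian in ```x.grad```.

```python
import torch

# Hypothetical input values, chosen only for illustration.
x = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)
y = x ** 2  # elementwise, so the Jacobian dy/dx is diag(2 * x)

# y is not a scalar, so backward() needs the vector v of the product v^T J.
v = torch.tensor([1.0, 0.1, 0.01])
y.backward(v)

# For this diagonal Jacobian, v^T J reduces to v * 2 * x.
expected = v * 2 * x.detach()
print(x.grad)
print(torch.allclose(x.grad, expected))
```

When the output is a scalar, as in the ```out.backward()``` call used later in these notes, the vector can be omitted.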

### Tensor

```torch.Tensor``` is the central class of the package. If you set its ```.requires_grad``` attribute to ```True```, autograd starts tracking all operations on it. When you finish your computation, you can call ```.backward()``` to have all the gradients computed automatically. The gradient for this tensor is accumulated into its ```.grad``` attribute.
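
The word "accumulated" matters in practice: repeated ```.backward()``` calls add to ```.grad``` rather than overwrite it. The sketch below illustrates this with a hypothetical tensor ```w``` and a toy loss, neither of which comes from the tutorial.

```python
import torch

# Hypothetical parameter and toy loss, used only to show accumulation.
w = torch.tensor([1.0, 2.0], requires_grad=True)

loss = (w * 3).sum()
loss.backward()
first = w.grad.clone()   # gradient after one backward pass

loss = (w * 3).sum()     # rebuild the graph before calling backward again
loss.backward()
second = w.grad.clone()  # the second pass was added to the first

print(first)
print(second)

# Reset the accumulated gradient in place before the next iteration.
w.grad.zero_()
print(w.grad)
```

This is why training loops usually clear gradients between iterations.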

---

**Remark**

Tracking the gradients of a Tensor requires memory. To stop a tensor from tracking history, you can call ```.detach()```, which detaches it from the computation history and prevents future computation on it from being tracked.

To prevent tracking history, you can also wrap the code block in ```with torch.no_grad():```. This is particularly helpful when evaluating a model, because the model may have trainable parameters with ```requires_grad=True``` for which we don’t need the gradients.

---
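
The two ways of disabling tracking described in the remark above can be sketched as follows. The tensor names ```a```, ```b```, ```c``` and ```d``` are illustrative assumptions.

```python
import torch

a = torch.ones(2, 2, requires_grad=True)

# detach() returns a new tensor that shares data with a but has no history.
b = a.detach()
print(b.requires_grad)  # False

# Operations inside no_grad() are not recorded, even though a requires grad.
with torch.no_grad():
    c = a * 2
print(c.requires_grad)  # False

# The same operation outside the block is tracked as usual.
d = a * 2
print(d.requires_grad)  # True
```

Note that ```.detach()``` leaves the original tensor ```a``` unchanged; only the returned tensor is cut off from the graph.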

### Function

Tensor and Function are interconnected and build up an acyclic graph that encodes a complete history of computation. Each tensor has a ```.grad_fn``` attribute that references the Function that created the Tensor.

---

**Remark** 

Tensors created directly by the user are the exception: they were not produced by any Function, so their ```.grad_fn``` is ```None```.

---
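
As a small illustration of the remark above, the sketch below compares a user-created tensor with the result of an operation. The names ```u``` and ```s``` are assumptions, and ```.is_leaf``` is shown only as a convenient way to tell the two cases apart.

```python
import torch

u = torch.ones(2, 2, requires_grad=True)  # created by the user
s = u + 3                                 # created by an operation

print(u.grad_fn)  # None: no Function produced this tensor
print(u.is_leaf)  # True

print(s.grad_fn)  # the backward node recorded for the addition
print(s.is_leaf)  # False
```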

In [10]:
import torch

In [11]:
x = torch.ones(2, 2, requires_grad = True)
x

tensor([[1., 1.],
        [1., 1.]], requires_grad=True)

In [12]:
# let's see the gradients (None until backward() has been called)
print(x.grad)

None


In [13]:
# Do a tensor operation
y = x + 3
y

tensor([[4., 4.],
        [4., 4.]], grad_fn=<AddBackward0>)

In [14]:
# Do more operations on y (note: out is the mean of x, so z is not part of its graph)
z = y * y * 3
out = x.mean()
out

tensor(1., grad_fn=<MeanBackward1>)

In [15]:
out.backward()
x.grad

tensor([[0.2500, 0.2500],
        [0.2500, 0.2500]])
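
In the cells above, ```out``` is the mean of ```x```, so each of the four entries receives a gradient of 1/4 and ```z``` plays no role. As a hedged variation (not part of the original run), the sketch below takes the mean over ```z``` instead, so the gradient flows back through both ```z``` and ```y```. The names ```x2```, ```y2```, ```z2``` and ```out2``` are assumptions used to avoid overwriting the tensors above.

```python
import torch

x2 = torch.ones(2, 2, requires_grad=True)
y2 = x2 + 3
z2 = y2 * y2 * 3
out2 = z2.mean()  # mean of 3 * (x2 + 3)^2 over four entries

out2.backward()

# Analytically, d(out2)/d(x2) = (1/4) * 6 * (x2 + 3) = 1.5 * (x2 + 3).
analytic = 1.5 * (x2.detach() + 3)
print(out2)
print(x2.grad)
print(torch.allclose(x2.grad, analytic))
```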