[Link To Tutorial](https://pytorch.org/tutorials/beginner/blitz/autograd_tutorial.html)

## Overview

* The `autograd` package provides automatic differentiation for all operations on Tensors.


## Tensor

* `torch.Tensor` is the central class of the package. If you set its `.requires_grad` attribute to `True`, autograd starts tracking all operations on it.

* When you finish your computation, you can call `.backward()` to have all the gradients computed automatically.

* The gradient for this tensor is accumulated into its `.grad` attribute.

* To stop a tensor from tracking history, call `.detach()`. It returns a new tensor that is detached from the computation history, so future computations on it are not tracked.

* To prevent tracking history (and using memory), you can also wrap the code block in `with torch.no_grad():`. This is particularly helpful when evaluating a model, because the model may have trainable parameters with `requires_grad=True` for which we don't need the gradients.
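
The evaluation case in the last bullet can be sketched as follows. The single `torch.nn.Linear` layer and the all-ones input are hypothetical stand-ins for a real model and dataset, chosen only to keep the example small.

```python
import torch

# Hypothetical stand-in for a trained model: one linear layer.
model = torch.nn.Linear(3, 1)
inputs = torch.ones(4, 3)

# The parameters are trainable, so a normal forward pass is tracked.
tracked = model(inputs)

# Inside no_grad() the same forward pass records no history.
with torch.no_grad():
    untracked = model(inputs)

print(model.weight.requires_grad)  # True
print(tracked.requires_grad)       # True
print(untracked.requires_grad)     # False
```

The parameters keep `requires_grad=True` throughout; only the result computed inside the `with` block is left untracked.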

In [1]:
## necessary modules
import torch

In [2]:
## version
print(f"PyTorch: {torch.__version__}")

PyTorch: 1.7.0


In [3]:
## create a tensor and set requires_grad to True
x = torch.ones(2, 2, requires_grad=True)

x

tensor([[1., 1.],
        [1., 1.]], requires_grad=True)

In [4]:
## do a tensor operation
y = x + 2

y

tensor([[3., 3.],
        [3., 3.]], grad_fn=<AddBackward0>)

In [5]:
## y was created as a result of an operation, so it has a grad_fn.
y.grad_fn

<AddBackward0 at 0x7f62a02bff70>

In [6]:
## do more operations
z = y * y * 3

out = z.mean()

z, out

(tensor([[27., 27.],
         [27., 27.]], grad_fn=<MulBackward0>),
 tensor(27., grad_fn=<MeanBackward0>))

In [7]:
## requires_grad defaults to False; requires_grad_() changes the flag in-place
a = torch.randn(2, 2)

a = ((a * 3) / (a - 1))

print(a.requires_grad)

a.requires_grad_(True)

print(a.requires_grad)

b = a**2 * 3

print(b.requires_grad)
print(b.grad_fn)

False
True
True
<MulBackward0 object at 0x7f62a02c63a0>


## Gradients

In [8]:
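## out is a scalar, so no argument is needed; retain_graph=True keeps the graph for the z.backward() call below
out.backward(retain_graph=True)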

In [9]:
x.grad

tensor([[4.5000, 4.5000],
        [4.5000, 4.5000]])
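
The value 4.5 can be checked by hand. Writing $o$ for `out` and $z_i$ for the elements of `z` (symbols introduced here only for the derivation), the cells above define:

$$
o = \frac{1}{4}\sum_i z_i, \qquad z_i = 3(x_i + 2)^2
$$

Differentiating with respect to one element of `x` and evaluating at $x_i = 1$ gives:

$$
\frac{\partial o}{\partial x_i} = \frac{3}{2}(x_i + 2), \qquad \left.\frac{\partial o}{\partial x_i}\right|_{x_i = 1} = \frac{9}{2} = 4.5
$$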

In [10]:
## z is not a scalar, so backward() needs a gradient argument of the same shape
temp_vector = torch.randn(2, 2)

z.backward(temp_vector)

In [11]:
## gradients accumulate: the earlier 4.5 plus the new contribution from z.backward()
x.grad

tensor([[-10.8913,  -1.1485],
        [  4.1017,  10.3016]])
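
A minimal sketch of what the non-scalar `backward()` call computes, run on a fresh `x` so that no earlier gradient is mixed in. The fixed tensor `v` is an assumption used in place of the random `temp_vector` to make the result reproducible. Because `z` is computed elementwise, the vector-Jacobian product reduces to `v * 6 * (x + 2)`.

```python
import torch

x = torch.ones(2, 2, requires_grad=True)
z = (x + 2) * (x + 2) * 3      # elementwise, so dz_i/dx_i = 6 * (x_i + 2)

# For a non-scalar z, backward() takes a tensor v of the same shape
# and computes the vector-Jacobian product.
v = torch.tensor([[1.0, 0.5], [-1.0, 2.0]])   # arbitrary illustrative weights
z.backward(v)

expected = v * 6 * (x.detach() + 2)           # equals 18 * v at x = 1
print(x.grad)
print(torch.allclose(x.grad, expected))       # True
```

In the notebook cells above, `x.grad` already held 4.5 from `out.backward()`, so the printed values there are 4.5 plus 18 times the corresponding entry of `temp_vector`.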

## Stop Autograd From Tracking History

In [12]:
## requires_grad_

x = torch.tensor([1, 2, 3], dtype=torch.float, requires_grad=True)

print(x.requires_grad)

x.requires_grad_(False)

print(x.requires_grad)

True
False


In [13]:
## detach

x = torch.tensor([1, 2, 3], dtype=float, requires_grad=True)

print(x.requires_grad)

y = x.detach()

print(y.requires_grad)

True
False


In [14]:
## torch.no_grad()

x = torch.tensor([1, 2, 3], dtype=float, requires_grad=True)

print(x, x.requires_grad)

with torch.no_grad():
    y = x * 2
    
    print(y, y.requires_grad)

tensor([1., 2., 3.], dtype=torch.float64, requires_grad=True) True
tensor([2., 4., 6.], dtype=torch.float64) False


## Loop Example

In [17]:
## without zeroing, gradients accumulate across backward() calls
weight = torch.ones(4, requires_grad=True)

for epoch in range(3):
    model_output = (weight * 3).sum()
    
    model_output.backward()
    
    print(weight.grad)

tensor([3., 3., 3., 3.])
tensor([6., 6., 6., 6.])
tensor([9., 9., 9., 9.])


In [23]:
## zeroing the gradient after each iteration prevents accumulation
weight = torch.ones(4, requires_grad=True)

for epoch in range(3):
    model_output = (weight * 3).sum()
    
    model_output.backward()
    
    print(weight.grad)
    
    weight.grad.zero_()

tensor([3., 3., 3., 3.])
tensor([3., 3., 3., 3.])
tensor([3., 3., 3., 3.])
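
In a training loop the same reset is usually done through an optimizer rather than by calling `weight.grad.zero_()` directly. The sketch below assumes `torch.optim.SGD` with a learning rate of 0.01; both choices are illustrative and not part of the notebook above.

```python
import torch

weight = torch.ones(4, requires_grad=True)
optimizer = torch.optim.SGD([weight], lr=0.01)   # lr chosen only for illustration

grads = []
for epoch in range(3):
    model_output = (weight * 3).sum()
    model_output.backward()
    grads.append(weight.grad.clone())   # copy before the gradient is cleared
    optimizer.step()                    # weight <- weight - lr * grad
    optimizer.zero_grad()               # clear gradients before the next iteration

print(grads)
print(weight)
```

Each recorded gradient is `tensor([3., 3., 3., 3.])`, matching the manually zeroed loop, because `optimizer.zero_grad()` clears the gradient of every parameter handed to the optimizer.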
