In [1]:
import torch

Essentially, autograd keeps track of all the operations performed on a `Tensor` in order to compute derivatives. Each `Tensor` has a `grad_fn` attribute that references the PyTorch `Function` that created it.
* Note that for user-created `Tensors` (for example, `x = torch.ones(2, 2)`), `grad_fn` is `None`, since no operation produced them.  
Every tracked operation is stored in what is called the `computation graph`.
* The graph is what keeps track of the operations done to each `Tensor`.
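
As a small sketch of the point about `grad_fn`, the snippet below compares a tensor created directly with one produced by an operation. The names `leaf` and `result` are illustrative only and are not used elsewhere in this notebook.

```python
import torch

# A tensor created directly by the user is a "leaf": no Function produced it.
leaf = torch.ones(2, 2, requires_grad=True)
print(leaf.grad_fn)   # None
print(leaf.is_leaf)   # True

# A tensor produced by an operation records the Function that created it.
result = leaf + 2
print(result.grad_fn is not None)  # True
print(result.is_leaf)              # False
```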

<img src="https://colah.github.io/posts/2015-08-Backprop/img/tree-eval-derivs.png">
Example of a computation graph. Each variable has a tree-like path in which you are able to compute derivatives.

In [27]:
# requires_grad=True tells autograd to track the operations performed on this Tensor
x = torch.ones(2, 2, requires_grad=True)
x

tensor([[1., 1.],
        [1., 1.]], requires_grad=True)

In [28]:
y = x + 2
y

tensor([[3., 3.],
        [3., 3.]], grad_fn=<AddBackward0>)

In [29]:
# Shows that y was created as a result of an operation (Function), so it has a grad_fn property
y.grad_fn

<AddBackward0 at 0x138d5735a88>

In [30]:
z = y * y * 3
# retain_grad() makes autograd keep the gradient of this non-leaf Tensor after .backward()
z.retain_grad()
out = z.mean()

# Shows that z was created from a multiplication Function
print(z)

# Shows that out was created from a mean Function
print(out)

tensor([[27., 27.],
        [27., 27.]], grad_fn=<MulBackward0>)
tensor(27., grad_fn=<MeanBackward0>)


In [10]:
# The requires_grad property is what determines if a Tensor is added to the graph

# Originally, a is not part of the computation graph
a = torch.randn(2, 2)
a = ((a * 3) / (a - 1))
print("Before a was added, requires_grad =", a.requires_grad)

# Here, we change that and add it to the graph to track all operations
a.requires_grad_(True)
print("After a was added, requires_grad =", a.requires_grad)
b = (a * a).sum()
print(b.grad_fn)

Before a was added, requires_grad = False
After a was added, requires_grad = True
<SumBackward0 object at 0x00000138D56AA588>


### Gradients

In [25]:
# Recall our out variable
# out.backward() computes the derivatives of out with respect to the tracked Tensors
out.backward()
print(z.grad)
print(x.grad)

tensor([[0.2500, 0.2500],
        [0.2500, 0.2500]])
tensor([[4.5000, 4.5000],
        [4.5000, 4.5000]])
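
These values can be checked by hand. Since `out` is the mean of four elements of `z`, each entry of `z.grad` is 1/4. Since `z = 3 * (x + 2) ** 2`, the chain rule gives `d(out)/dx = (1/4) * 6 * (x + 2) = 1.5 * (x + 2)`, which is 4.5 at `x = 1`. The sketch below rebuilds the same graph from scratch and compares autograd's result with these formulas; `expected_z_grad` and `expected_x_grad` are names introduced here for illustration.

```python
import torch

# Rebuild the same computation as above so the snippet runs on its own.
x = torch.ones(2, 2, requires_grad=True)
y = x + 2
z = y * y * 3
z.retain_grad()          # keep the gradient of the non-leaf tensor z
out = z.mean()
out.backward()

# Hand-derived gradients: d(out)/dz = 1/4 and d(out)/dx = 1.5 * (x + 2).
expected_z_grad = torch.full((2, 2), 0.25)
expected_x_grad = 1.5 * (x.detach() + 2)

print(torch.allclose(z.grad, expected_z_grad))  # True
print(torch.allclose(x.grad, expected_x_grad))  # True
```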


In [26]:
# To stop autograd from tracking history on Tensors,
# wrap the code that you don't want tracked in a "with torch.no_grad():" block

print(x.requires_grad)
print((x ** 2).requires_grad)

with torch.no_grad():
    print((x ** 2).requires_grad)

True
True
False
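
A short sketch of what this means for the graph: a result computed inside the `torch.no_grad()` block has no `grad_fn`, while the same expression outside the block does, and `x` itself is unchanged. The names `untracked` and `tracked` are illustrative only.

```python
import torch

x = torch.ones(2, 2, requires_grad=True)

# Inside no_grad, the operation is not recorded in the graph.
with torch.no_grad():
    untracked = x ** 2

# Outside the block, tracking resumes as usual.
tracked = x ** 2

print(untracked.grad_fn)            # None
print(tracked.grad_fn is not None)  # True
print(x.requires_grad)              # True: x itself is not modified
```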


In [31]:
# You can also use .detach() to get a new Tensor with the same content that does not require gradients
print(x.requires_grad)
y = x.detach()
print(y.requires_grad)

# Just checks to see if x and y are equal
print(x.eq(y).all())

True
False
tensor(True)
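
One detail worth keeping in mind is that `.detach()` returns a tensor that shares storage with the original, so it is not an independent copy. The sketch below illustrates this with an in-place write through the detached tensor; `same_storage` is a name introduced here for illustration.

```python
import torch

x = torch.ones(2, 2, requires_grad=True)
y = x.detach()

# Both tensors point at the same underlying memory.
same_storage = x.data_ptr() == y.data_ptr()
print(same_storage)    # True

# An in-place write through y is therefore visible through x as well.
y[0, 0] = 5.0
print(x[0, 0].item())  # 5.0
```

If an independent copy is wanted, `x.detach().clone()` is one way to get it.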
