# 02_Autograd
In this notebook, we will see how to compute the derivatives (gradients) of tensors.

In [None]:
from __future__ import print_function
import torch

## Auto Graph & Gradient Computation

The `autograd` package provides automatic differentiation for all operations on Tensors.
It is a define-by-run framework, which means that your backprop is defined by how your code is run, and that every single iteration can be different.
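
To make "define-by-run" concrete, here is a minimal sketch in which the number of recorded operations depends on the input value. The helper `forward` and the threshold of 10 are hypothetical, chosen only for illustration.

```python
import torch

def forward(x):
    # Hypothetical helper: the loop length depends on the data,
    # so the graph recorded by autograd can differ from call to call.
    y = x
    steps = 0
    while y.norm() < 10:
        y = y * 2
        steps += 1
    return y.sum(), steps

x = torch.tensor([1.0], requires_grad=True)
out, steps = forward(x)
out.backward()
print(steps, x.grad)    # the gradient equals 2 ** steps

x2 = torch.tensor([4.0], requires_grad=True)
out2, steps2 = forward(x2)
out2.backward()
print(steps2, x2.grad)  # fewer doublings, so a smaller gradient
```

Each call builds its own graph, and `.backward()` differentiates exactly the operations that were executed in that call.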

If you set the `.requires_grad` attribute of a tensor to `True`, autograd starts to track all operations on it.
When you finish your computation you can call `.backward()` and have all the gradients computed automatically.
The gradient for this tensor will be accumulated into its `.grad` attribute.
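
"Accumulated" means that each call to `.backward()` adds to `.grad` rather than overwriting it. A small sketch of that behaviour, with variable names (`w`, `loss`, `first`, `second`) chosen only for this example:

```python
import torch

w = torch.tensor([1.0], requires_grad=True)

loss = (3 * w).sum()
loss.backward()
first = w.grad.clone()   # d(3w)/dw = 3

loss = (3 * w).sum()
loss.backward()
second = w.grad.clone()  # 3 + 3 = 6: added to the previous value

w.grad.zero_()           # reset in place before the next backward pass
print(first, second, w.grad)
```

This is why training loops typically zero the gradients before each backward pass.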

One more class is central to the autograd implementation: `Function`.

`Tensor` and `Function` are interconnected and build up an acyclic graph that encodes a complete history of the computation. Each tensor has a `.grad_fn` attribute that references the `Function` that created it (except for tensors created by the user, whose `grad_fn` is `None`).
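
As a quick illustration of the difference between user-created tensors and tensors produced by operations, the sketch below inspects `.grad_fn` together with `.is_leaf` (an attribute not mentioned above, used here only to make the distinction visible).

```python
import torch

a = torch.ones(2, requires_grad=True)  # created by the user: a leaf of the graph
b = a * 3                              # created by an operation
c = b.sum()

print(a.grad_fn, a.is_leaf)  # no grad_fn, and a is a leaf
print(b.grad_fn, b.is_leaf)  # a multiplication node, and b is not a leaf
print(c.grad_fn)             # a sum node
```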

In [None]:
x = torch.ones(1)  # create a tensor with requires_grad=False (default)
print(x.requires_grad)

y = torch.ones(1)  # another tensor with requires_grad=False
z = x + y

print(z.requires_grad)

# autograd does not track this computation, so calling z.backward()
# would raise a RuntimeError. Uncomment the next line to verify!
# z.backward()

w = torch.ones(1, requires_grad=True)
print(w.requires_grad)

# add to the previous result that has requires_grad=False
total = w + z
# the total sum now requires grad!
print(total.requires_grad)

# no computation is wasted on gradients for x, y and z, which don't require grad
print(x.grad is None and y.grad is None and z.grad is None)

# you can also manually enable gradients for a tensor, but use this with caution!
x = torch.ones(1)
print(x.requires_grad)
x.requires_grad_(True)
print(x.requires_grad)

In [None]:
# create graph

x = torch.tensor([3], dtype=torch.float, requires_grad=True)

y = 2 * x + 3

print(x, y)
print(x.requires_grad, y.requires_grad)

In [None]:
print(y.grad_fn, y.grad_fn.next_functions[0][0], y.grad_fn.next_functions[0][0].next_functions[0][0], sep='\n')
print(y.grad_fn.next_functions[0][0].next_functions[0][0].next_functions)

In [None]:
y.backward()  # calculate dy / dx  == d(2*x + 3) / dx  == 2

In [None]:
print(x, x.grad)  # dy / dx is stored in x.grad

To stop tracking history for a tensor, you can call `.detach()`. It returns a new tensor that shares the same data but is detached from the computation history, so future computation on the returned tensor is not tracked.

In [None]:
z = x.detach()
print(z, z.requires_grad)
print(z.grad)
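
The following sketch, with variable names chosen only for illustration, shows that `.detach()` leaves the original tensor untouched: computation on `x` is still tracked, while computation on the detached tensor is not, and both share the same underlying storage.

```python
import torch

x = torch.tensor([3.0], requires_grad=True)
z = x.detach()

y = 2 * x + 3   # still tracked: x itself is unchanged by detach()
v = 2 * z + 3   # not tracked: z carries no history

print(y.requires_grad, v.requires_grad)
print(z.data_ptr() == x.data_ptr())  # same underlying storage
```

Because the storage is shared, in-place changes to `z` are also visible through `x`.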

To prevent tracking history (and using memory), you can also wrap a code block in `with torch.no_grad():`. This is particularly helpful when evaluating a model: the model may have trainable parameters with `requires_grad=True`, but we don't need their gradients during evaluation.

In [None]:
x = torch.zeros(1, requires_grad=True)
with torch.no_grad():
    y = x * 2
print(y.requires_grad)
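
The model-evaluation case mentioned above can be sketched as follows. The single `torch.nn.Linear` layer is a hypothetical stand-in for a trained model, and the input is dummy data.

```python
import torch

# Hypothetical stand-in for a trained model: a single linear layer.
model = torch.nn.Linear(3, 1)
inputs = torch.ones(4, 3)

with torch.no_grad():
    preds = model(inputs)

print(model.weight.requires_grad)  # parameters remain trainable
print(preds.requires_grad)         # the output is not tracked
print(preds.grad_fn)               # no graph was recorded
```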