# PyTorch 102 - Autograd

The `autograd` package provides automatic differentiation for all operations on Tensors.  
It is a define-by-run framework, which means that your backprop is defined by how your code is run, and that every single iteration can be different.
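To make "define-by-run" concrete, here is a minimal sketch in which ordinary Python control flow decides what graph gets recorded. The helper `forward` and its `n_steps` argument are hypothetical names chosen for illustration, not part of PyTorch.

```python
import torch

def forward(x, n_steps):
    # Plain Python control flow decides how many multiplications are recorded.
    y = x
    for _ in range(n_steps):
        y = y * 2
    return y.sum()

x = torch.ones(3, requires_grad=True)

# First run: the graph contains two multiplications, so d(out)/dx = 2**2 = 4.
forward(x, 2).backward()
grad_two_steps = x.grad.clone()

# Gradients accumulate in x.grad, so reset it before the second run.
x.grad = None

# Second run: the graph contains three multiplications, so d(out)/dx = 2**3 = 8.
forward(x, 3).backward()
grad_three_steps = x.grad.clone()

print(grad_two_steps)    # tensor([4., 4., 4.])
print(grad_three_steps)  # tensor([8., 8., 8.])
```

The same function yields a different backward pass on each call because the graph is rebuilt every time the code runs.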

## What the heck is Autograd?

Here is some info for later.

## Example

In [2]:
import torch

Let's create a tensor with `requires_grad=True` so that autograd tracks the computation on it.

In [4]:
x = torch.ones(2, 2, requires_grad=True)
print(x)

tensor([[1., 1.],
        [1., 1.]], requires_grad=True)


Now let's do a tensor operation:

In [5]:
y = x + 2
print(y)

tensor([[3., 3.],
        [3., 3.]], grad_fn=<AddBackward0>)


In this case `y` was created as the result of an operation, so it has a `grad_fn`.

In [6]:
print(y.grad_fn)

<AddBackward0 object at 0x7f0a39be8250>


Let's do more operations on `y`:

In [12]:
z = y * y * 3
out = z.mean()

print(z, out)

tensor([[27., 27.],
        [27., 27.]], grad_fn=<MulBackward0>) tensor(27., grad_fn=<MeanBackward0>)
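The `grad_fn` objects printed above are linked together into the backward graph. The following sketch rebuilds the same tensors and walks that chain from `out` back towards `x`. It relies on the `is_leaf` and `next_functions` attributes; following only the first input of each node is a simplification made for this example.

```python
import torch

x = torch.ones(2, 2, requires_grad=True)
y = x + 2
z = y * y * 3
out = z.mean()

# x was created by the user, so it is a leaf and has no grad_fn.
print(x.is_leaf, x.grad_fn)
# y and out were produced by operations, so they are not leaves.
print(y.is_leaf, out.is_leaf)

# Walk backwards from `out`, following the first input of every node.
node = out.grad_fn
names = []
while node is not None:
    names.append(type(node).__name__)
    next_fns = getattr(node, "next_functions", ())
    node = next_fns[0][0] if next_fns else None

print(names)
```

The list starts at the mean, passes through the multiplications and the addition, and ends at the node that accumulates the gradient into `x`.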


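`.requires_grad_( ... )` changes an existing Tensor's `requires_grad` flag in-place.  
A Tensor's `requires_grad` flag defaults to `False` if it is not given at creation, and calling `.requires_grad_()` without an argument sets it to `True`.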

In [10]:
# Create a tensor and perform some operations
a = torch.randn(2, 2)
a = ((a * 3) / (a - 1))

# Check if requires grad
print(a.requires_grad)
print(a.grad_fn)  # No grad_fn if requires_grad == False

False
None


In [11]:
# Change the grad flag of the Tensor to true
a.requires_grad_(True)

# Check if requires grad
print(a.requires_grad)
b = (a * a).sum()

# Get the grad_fn
print(b.grad_fn)

True
<SumBackward0 object at 0x7f0a3a1634d0>


## Gradients

Now let's backprop.  
Since `out` contains a single scalar, `out.backward()` is equivalent to `out.backward(torch.tensor(1.))`.

In [13]:
out.backward()

Print the gradients d(out)/dx:

In [14]:
print(x.grad)

tensor([[4.5000, 4.5000],
        [4.5000, 4.5000]])
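The value 4.5 can be checked by hand. With `out = mean(3 * (x + 2)^2)` over four elements, each partial derivative is `d(out)/dx_i = (1/4) * 6 * (x_i + 2) = 1.5 * (x_i + 2)`, which is 4.5 at `x_i = 1`. The sketch below rebuilds the tensors from scratch, compares autograd's result with this formula, and also confirms that passing an explicit gradient of 1 to `backward` gives the same answer. The names `expected`, `x2` and `out2` are introduced here for illustration only.

```python
import torch

x = torch.ones(2, 2, requires_grad=True)
y = x + 2
out = (y * y * 3).mean()
out.backward()

# Analytic gradient: d(out)/dx_i = 1.5 * (x_i + 2)
expected = 1.5 * (x.detach() + 2)
print(torch.allclose(x.grad, expected))  # True

# Same computation on a fresh tensor, with an explicit gradient of 1.
x2 = torch.ones(2, 2, requires_grad=True)
out2 = ((x2 + 2) * (x2 + 2) * 3).mean()
out2.backward(torch.tensor(1.))
print(torch.equal(x.grad, x2.grad))  # True
```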


Now let's take a look at an example of vector-Jacobian product:

In [15]:
x = torch.randn(3, requires_grad=True)

y = x * 2
while y.data.norm() < 1000:
    y = y * 2

print(y)

tensor([-1037.7939,  -193.1925,   -91.6601], grad_fn=<MulBackward0>)


In this case `y` is no longer a scalar. A backward pass does not compute the full Jacobian directly,  
but if we just want the vector-Jacobian product, we can simply pass the vector to `backward` as an argument:

In [16]:
v = torch.tensor([0.1, 1.0, 0.0001], dtype=torch.float)
y.backward(v)

print(x.grad)

tensor([5.1200e+01, 5.1200e+02, 5.1200e-02])
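To see that `backward(v)` really returns the vector-Jacobian product, the sketch below compares it with an explicitly built Jacobian. It assumes a fixed function `f(x) = 512 * x` as a deterministic stand-in for the doubling loop (the printed gradient above corresponds to a factor of 512), and it uses `torch.autograd.functional.jacobian` to materialise the full matrix, which is only practical for small inputs.

```python
import torch
from torch.autograd.functional import jacobian

def f(x):
    # Hypothetical stand-in for the doubling loop: y = 512 * x.
    return x * 512

x = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)
v = torch.tensor([0.1, 1.0, 0.0001])

# Vector-Jacobian product via backward: x.grad = v^T J.
f(x).backward(v)

# Full 3x3 Jacobian, followed by the explicit product v^T J.
J = jacobian(f, x.detach())
print(torch.allclose(x.grad, v @ J))  # True
```

Because `f` is linear, `J` is 512 times the identity, so `x.grad` is simply `512 * v`.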


You can also stop autograd from tracking history on Tensors that have  
`requires_grad=True`, either by wrapping the code block in `with torch.no_grad():`

In [17]:
print(x.requires_grad)
print((x ** 2).requires_grad)

with torch.no_grad():
    print((x ** 2).requires_grad)

True
True
False


Or by using `.detach()` to get a new Tensor with the same content but that does not require gradients:

In [19]:
print(x.requires_grad)

y = x.detach()
print(y.requires_grad)
print(x.eq(y).all())

True
False
tensor(True)
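One detail worth keeping in mind: `.detach()` does not copy the data. The sketch below shows that the detached tensor shares memory with the original, and that chaining `.clone()` gives an independent copy. The variable `c` is a name introduced only for this example.

```python
import torch

x = torch.ones(3, requires_grad=True)
y = x.detach()

# Both tensors point at the same underlying memory.
print(y.data_ptr() == x.data_ptr())  # True

# An in-place change to the detached tensor is visible through x as well.
y[0] = 5.0
print(x[0].item())  # 5.0

# detach() followed by clone() gives an independent copy.
c = x.detach().clone()
c[1] = -1.0
print(x[1].item())  # 1.0
```

If the detached tensor is going to be modified, cloning it first avoids silently changing values that autograd may still need.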
