### How to Compute Gradients Using PyTorch

Official guide:
- https://pytorch.org/tutorials/beginner/introyt/autogradyt_tutorial.html

To compute gradients in PyTorch, first create the input tensor with `requires_grad=True` so that operations on it are tracked. Then call `torch.autograd.grad` to obtain the gradient of an output with respect to that input.

### 1. `requires_grad`

In [4]:
import torch
a = torch.tensor([1., 2., 3.])
b = torch.tensor([4., 5., 6.], requires_grad=True)

print("sin(a):", torch.sin(a))
print("sin(b):", torch.sin(b))

c = torch.sin(torch.sin(b).sum()**2)
print("c:", c)

sin(a): tensor([0.8415, 0.9093, 0.1411])
sin(b): tensor([-0.7568, -0.9589, -0.2794], grad_fn=<SinBackward0>)
c: tensor(-0.7440, grad_fn=<SinBackward0>)
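The difference between the two outputs above can also be checked programmatically. The following is a minimal sketch, reusing the same `a` and `b`; the names `sin_a`, `sin_b`, and `mixed` are introduced here only for illustration.

```python
import torch

a = torch.tensor([1., 2., 3.])                       # not tracked
b = torch.tensor([4., 5., 6.], requires_grad=True)   # tracked

sin_a = torch.sin(a)
sin_b = torch.sin(b)

# A result is tracked only if at least one of its inputs is tracked.
print(sin_a.requires_grad, sin_a.grad_fn)            # False None
print(sin_b.requires_grad, sin_b.grad_fn is not None)  # True True

# Mixing a tracked and an untracked tensor yields a tracked result.
mixed = a + b
print(mixed.requires_grad)                           # True
```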


The `grad_fn` stored with each tensor lets you walk the computation all the way back to its inputs through its `next_functions` property. Drilling down on this property for `c`, as shown below, reveals the gradient functions of all the prior tensors. Note that `b.grad_fn` is reported as `None`, indicating that `b` is an input to the function with no history of its own.

In [22]:
print('c:')
print(c.grad_fn)
print(c.grad_fn.next_functions)
print(c.grad_fn.next_functions[0][0].next_functions)
print(c.grad_fn.next_functions[0][0].next_functions[0][0].next_functions)
print(c.grad_fn.next_functions[0][0].next_functions[0][0].next_functions[0][0].next_functions)
print(c.grad_fn.next_functions[0][0].next_functions[0][0].next_functions[0][0].next_functions[0][0].next_functions)

print("b.grad_fn:", b.grad_fn)

c:
<SinBackward0 object at 0x7fe798a1b160>
((<PowBackward0 object at 0x7fe798a1ad70>, 0),)
((<SumBackward0 object at 0x7fe63967d3f0>, 0),)
((<SinBackward0 object at 0x7fe63967d960>, 0),)
((<AccumulateGrad object at 0x7fe63967d3f0>, 0),)
()
b.grad_fn: None
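The chained `next_functions` lookups above get unwieldy quickly. As a rough sketch, the same walk can be written as a small recursive helper; `walk_graph` is a hypothetical name introduced here and is not part of PyTorch.

```python
import torch

def walk_graph(fn, depth=0, out=None):
    """Collect (depth, class name) for every backward node reachable from fn."""
    if out is None:
        out = []
    if fn is None:  # inputs that do not require grad appear as None
        return out
    out.append((depth, type(fn).__name__))
    for next_fn, _ in fn.next_functions:
        walk_graph(next_fn, depth + 1, out)
    return out

b = torch.tensor([4., 5., 6.], requires_grad=True)
c = torch.sin(torch.sin(b).sum() ** 2)

nodes = walk_graph(c.grad_fn)
for depth, name in nodes:
    print("  " * depth + name)
```

The printed chain should mirror the output above, ending at the `AccumulateGrad` node that belongs to the leaf tensor `b`. The exact class names may differ slightly between PyTorch versions.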


### 2. `torch.autograd.grad`

In [25]:
from torch.autograd import grad

d = torch.sum(b**2)

grad(d, b)

(tensor([ 8., 10., 12.]),)

In [26]:
e = b**2

grad(e, b, grad_outputs=torch.ones_like(e))

(tensor([ 8., 10., 12.]),)
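When the output is not a scalar, `grad_outputs` supplies a weight for each output element, and the result is the weighted sum of the per-element gradients (a vector-Jacobian product). Passing `torch.ones_like(e)` therefore gives the same result as differentiating `e.sum()`. The sketch below uses an arbitrary weight vector `v`, chosen here only for illustration, and compares the result against the equivalent scalar formulation.

```python
import torch
from torch.autograd import grad

b = torch.tensor([4., 5., 6.], requires_grad=True)
e = b ** 2

# One weight per element of e; the result is sum_i v[i] * d e[i] / d b.
v = torch.tensor([1., 0., 2.])
(vjp,) = grad(e, b, grad_outputs=v)

# Equivalent scalar formulation on a fresh graph
# (the first grad call frees the graph of e by default).
e2 = b ** 2
(ref,) = grad((v * e2).sum(), b)

print(vjp)  # tensor([ 8.,  0., 24.])
print(ref)  # tensor([ 8.,  0., 24.])
```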

### 3. Second Derivative

In [29]:
x = torch.tensor(2.).requires_grad_()
y = torch.tensor(3.).requires_grad_()

z = x * x * y

grad_x = torch.autograd.grad(outputs=z, inputs=x)

print(grad_x[0])

tensor(12.)


In [None]:
grad_xx = torch.autograd.grad(outputs=grad_x, inputs=x)
# RuntimeError: element 0 of tensors does not require grad and does not have a grad_fn

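The call fails because, by default (`create_graph=False`), `torch.autograd.grad` does not record the operations of the backward pass, so the returned gradient is a plain tensor with no `grad_fn`. Passing `create_graph=True` builds a graph for the gradient computation itself, which makes the result differentiable again.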

In [36]:
x = torch.tensor(2.).requires_grad_()
y = torch.tensor(3.).requires_grad_()

z = x * x * y

grad_x = torch.autograd.grad(outputs=z, inputs=x, create_graph=True)

print(grad_x)

(tensor(12., grad_fn=<AddBackward0>),)


In [37]:
grad_xx = torch.autograd.grad(outputs=grad_x, inputs=x)
print(grad_xx)

(tensor(6.),)
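The result can be checked by hand: for z = x²y, dz/dx = 2xy = 12 and d²z/dx² = 2y = 6. The same pattern extends to mixed partial derivatives. The sketch below requests gradients with respect to both `x` and `y` in one call; the variable names are chosen here for illustration.

```python
import torch

x = torch.tensor(2.).requires_grad_()
y = torch.tensor(3.).requires_grad_()
z = x * x * y

# First derivatives, keeping the graph so they can be differentiated again.
dz_dx, dz_dy = torch.autograd.grad(z, (x, y), create_graph=True)

# Differentiate dz/dx = 2xy once more with respect to x and y.
d2z_dxx, d2z_dxy = torch.autograd.grad(dz_dx, (x, y))

print(dz_dx.item(), dz_dy.item())      # 12.0 4.0
print(d2z_dxx.item(), d2z_dxy.item())  # 6.0 4.0
```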
