# Autograd: Automatic Differentiation

### torch.Tensor 
|Attributes and Functions|Description|
|:---|:---|
|__.requires_grad__|If set to __True__, autograd tracks all operations on the tensor.|
|__.backward()__|Once the computation is finished, calling __.backward()__ computes all the gradients automatically.|
|__.grad__|The gradient for the tensor is accumulated into its __.grad__ attribute.|
|__.detach()__|To stop a tensor from tracking history, call __.detach()__. It returns a tensor that is detached from the computation history, so future computation on it is not tracked.|
|__with torch.no_grad():__|To prevent tracking, wrap the code block in __with torch.no_grad():__. This is helpful when evaluating a model whose parameters have __requires_grad=True__ but whose gradients are not needed.|
|__.grad_fn__|Every tensor has a __.grad_fn__ attribute that references the function that created it. For tensors created directly by the user, __grad_fn__ is __None__.|
|__backward()__|If the tensor is a scalar (holds a single element), __.backward()__ needs no arguments. If it has more elements, a __gradient__ argument must be given: a tensor of matching shape.|
|__.requires_grad_(True)__|Changes an existing tensor's __requires_grad__ flag in place. Calling __.requires_grad_()__ with no argument sets the flag to __True__; a tensor created without specifying __requires_grad__ has the flag set to __False__.|
|__y.data.norm() vs y.norm()__|Both compute the norm, but the first goes through __.data__ and is not tracked by autograd (its result has no __grad_fn__), while the second is tracked.|
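
The table notes that gradients are accumulated into __.grad__. A minimal sketch of what that means in practice, using a hypothetical tensor `w` that is not part of the notebook: calling __.backward()__ twice without resetting adds the second gradient to the first.

```python
import torch

# Hypothetical leaf tensor used only for this illustration.
w = torch.ones(3, requires_grad=True)

# First backward pass: d(sum(2*w))/dw is 2 for every element.
(w * 2).sum().backward()
first = w.grad.clone()

# Second backward pass without zeroing: the new gradient is added to .grad.
(w * 2).sum().backward()
second = w.grad.clone()

# Reset the accumulated gradient in place before the next pass.
w.grad.zero_()
cleared = w.grad.clone()

print(first)    # tensor([2., 2., 2.])
print(second)   # tensor([4., 4., 4.])
print(cleared)  # tensor([0., 0., 0.])
```

This is why training loops usually clear gradients between iterations.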

In [71]:
import torch

In [72]:
x = torch.ones(2,2, requires_grad=True)
print(x)

tensor([[1., 1.],
        [1., 1.]], requires_grad=True)


In [73]:
print(x.requires_grad)

True


In [74]:
y=x+2
print(y)

tensor([[3., 3.],
        [3., 3.]], grad_fn=<AddBackward0>)


In [75]:
print(y.grad_fn)

<AddBackward0 object at 0x7f6b724e37b8>


In [76]:
z = y*y*3
out = z.mean()
print(z, out)

tensor([[27., 27.],
        [27., 27.]], grad_fn=<MulBackward0>) tensor(27., grad_fn=<MeanBackward1>)


In [64]:
out.backward()

In [57]:
y.detach_()

tensor([[3., 3.],
        [3., 3.]])

In [48]:
x.grad

tensor([[4.5000, 4.5000],
        [4.5000, 4.5000]])

In [65]:
y.requires_grad

True

In [66]:
a = torch.randn(2,2)
a = ((a*3)/(a-1))
print(a.requires_grad)
a.requires_grad_(True)
print(a.requires_grad)
b = (a*a).sum()
print(b.grad_fn)

False
True
<SumBackward0 object at 0x7f6b725ddc18>


In [69]:
a.requires_grad

True

In [77]:
out.backward()

In [78]:
print(x.grad)

tensor([[4.5000, 4.5000],
        [4.5000, 4.5000]])
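
The value 4.5 can be checked by hand. A short derivation, assuming the definitions from the cells above (y = x + 2, z = 3y², and out as the mean of z over its four elements):

$$
out = \frac{1}{4}\sum_{i} z_i, \qquad z_i = 3\,(x_i + 2)^2
$$

$$
\frac{\partial\, out}{\partial x_i} = \frac{3}{2}\,(x_i + 2), \qquad \left.\frac{\partial\, out}{\partial x_i}\right|_{x_i = 1} = \frac{9}{2} = 4.5
$$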


In [88]:
x = torch.randn(3, requires_grad=True)
y=x*2
while y.data.norm() < 1000:
    y = y*2
print(y)

tensor([-621.3907, 1412.7545, -535.0473], grad_fn=<MulBackward0>)


In [86]:
y.norm()

tensor(1966.8599, grad_fn=<NormBackward0>)

In this case __y__ is no longer a scalar, so __y.backward()__ cannot be called without arguments. torch.autograd does not compute the full Jacobian directly, but if only the vector-Jacobian product is needed, pass the vector to __backward__ as an argument.
The vector holds the gradient of a scalar loss __l__ with respect to each element of __y__: (__dl/dy1__, __dl/dy2__, __dl/dy3__, ...).
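
The vector-Jacobian product described above can be checked on a small case. The sketch below is an illustration under its own assumptions: the function `f` and the input values are made up for the example, and `torch.autograd.functional.jacobian` is used only to build the full Jacobian for comparison.

```python
import torch
from torch.autograd.functional import jacobian

def f(x):
    # Hypothetical non-scalar function: elementwise 3 * x**2.
    return 3 * x ** 2

x = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)
v = torch.tensor([0.1, 1.0, 0.0001])

y = f(x)
y.backward(v)  # accumulates the vector-Jacobian product J^T v into x.grad

# Full 3x3 Jacobian, computed separately for comparison only.
J = jacobian(f, x.detach())
expected = J.T @ v

print(x.grad)
print(expected)
```

Both printed tensors should agree, which shows that __backward(v)__ produces the product without the Jacobian being formed explicitly.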

In [91]:
v = torch.tensor([0.1, 1.0, 0.0001], dtype=torch.float)
y.backward(v)

In [92]:
print(x.grad)

tensor([5.1200e+01, 5.1200e+02, 5.1200e-02])
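
The printed gradient can be read as a vector-Jacobian product. Assuming the loop above doubled the values enough times that y = 512x in this run (which the output suggests, since 51.2 = 0.1 × 512), the Jacobian is a scaled identity:

$$
y = 2^{9} x = 512\,x, \qquad J = \frac{\partial y}{\partial x} = 512\,I
$$

$$
J^{T} v = 512\,v = (51.2,\ 512,\ 0.0512)
$$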


In [97]:
print(x.requires_grad)
print((x**2).requires_grad)
with torch.no_grad():
    print((x**2).requires_grad)

True
True
False
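
For comparison with __torch.no_grad()__, here is a minimal sketch of __.detach()__ from the table at the top. The tensor names are chosen for the example only.

```python
import torch

x = torch.randn(3, requires_grad=True)

# detach() returns a tensor with the same values that is cut off from the graph.
y = x.detach()

print(x.requires_grad)    # True
print(y.requires_grad)    # False
print(y.grad_fn)          # None
print(torch.equal(x, y))  # True
```

The original tensor `x` is unchanged and is still tracked; only the returned tensor `y` is excluded from gradient computation.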
