<a href="https://colab.research.google.com/github/vikash0837/PyTorch/blob/master/Pytorch_tutorial_2(AUTOGRAD_AUTOMATIC_DIFFERENTIATION).ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# **Learning PyTorch Tutorial Basics**
[Source](https://pytorch.org/tutorials/beginner/blitz/autograd_tutorial.html)

# **AUTOGRAD: AUTOMATIC DIFFERENTIATION**




*   `autograd` is central to all neural networks in PyTorch
*   `autograd` provides automatic differentiation for all operations on Tensors
*   It is a define-by-run framework, which means that your backprop is defined by how your code is run, and that every single iteration can be different.
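
The define-by-run idea can be illustrated with a small sketch. The helper `scaled_output` and its `n_doublings` parameter are hypothetical names chosen for this example; the point is only that ordinary Python control flow decides which graph gets built on each call.

```python
import torch

def scaled_output(x, n_doublings):
    # Hypothetical helper: the number of multiplications is decided at run time
    # by ordinary Python control flow, so each call can build a different graph.
    y = x
    for _ in range(n_doublings):
        y = y * 2
    return y.sum()

x = torch.ones(3, requires_grad=True)

# First pass: one doubling, so d(out)/dx = 2 for every element.
scaled_output(x, 1).backward()
grad_one = x.grad.clone()

# Reset the accumulated gradient, then run a different graph on the same input.
x.grad.zero_()
scaled_output(x, 3).backward()
grad_three = x.grad.clone()

print(grad_one)    # tensor([2., 2., 2.])
print(grad_three)  # tensor([8., 8., 8.])
```

The backward pass follows whatever operations were actually executed, which is why the two calls give different gradients for the same `x`.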






## **Tensor**


*   `torch.Tensor` is the central class of the package
*   If you set its attribute `.requires_grad` to `True`, it starts to track all operations on it
* When you finish your computation, you can call `.backward()` and have all the gradients computed automatically
* The gradient for this tensor will be accumulated into its `.grad` attribute.
* To stop a tensor from tracking history, you can call `.detach()` to detach it from the computation history, and to prevent future computation from being tracked.

* To prevent tracking history (and using memory), you can also wrap the code block in `with torch.no_grad():`. This is particularly helpful when evaluating a model, because the model may have trainable parameters with `requires_grad=True` for which we don’t need the gradients.

* There’s one more class that is very important for the autograd implementation: `Function`.
* `Tensor` and `Function` are interconnected and build up an acyclic graph that encodes a complete history of computation.
* Each tensor has a `.grad_fn` attribute that references the `Function` that created it (except for tensors created by the user, whose `grad_fn` is `None`).
* To compute the derivatives, call `.backward()` on a `Tensor`. If the tensor is a scalar (i.e. it holds a single element), you don’t need to pass any arguments to `backward()`; if it has more elements, you need to pass a `gradient` argument that is a tensor of matching shape.
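
Two of the points above, that user-created tensors have no `grad_fn` and that gradients are accumulated into `.grad`, can be checked with a minimal sketch. The tensor `w` and the variable names are illustrative only.

```python
import torch

w = torch.tensor([1.0, 2.0], requires_grad=True)

# A user-created (leaf) tensor has no grad_fn; a result of an operation does.
leaf_has_no_fn = w.grad_fn is None
result_has_fn = (w * 3).grad_fn is not None

# First backward pass: d(sum(w * 3))/dw = 3 for each element.
(w * 3).sum().backward()
first = w.grad.clone()

# Second backward pass without resetting: the new gradient is added to the old one.
(w * 3).sum().backward()
accumulated = w.grad.clone()

# Zeroing .grad in place gives a clean slate for the next pass.
w.grad.zero_()
(w * 3).sum().backward()
after_reset = w.grad.clone()

print(first, accumulated, after_reset)
# tensor([3., 3.]) tensor([6., 6.]) tensor([3., 3.])
```

Because of this accumulation, code that calls `.backward()` repeatedly usually needs to reset the gradients between passes.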


In [0]:
import torch

In [7]:
# Create a tensor and set requires_grad=True to track computation with it
x = torch.ones(2, 2, requires_grad=True)
print(x)

tensor([[1., 1.],
        [1., 1.]], requires_grad=True)


In [8]:
# Do a tensor operation:
y = x + 2
print(y)

tensor([[3., 3.],
        [3., 3.]], grad_fn=<AddBackward0>)


In [9]:
# y was created as a result of an operation, so it has a grad_fn.
print(y.grad_fn)

<AddBackward0 object at 0x7f4eadf80470>


In [10]:
# Do more operations on y
z = y*y*3
out = z.mean()
print(z, out)

tensor([[27., 27.],
        [27., 27.]], grad_fn=<MulBackward0>) tensor(27., grad_fn=<MeanBackward0>)


Note: `.requires_grad_( ... )` changes an existing Tensor’s `requires_grad` flag in-place. A newly created tensor’s `requires_grad` flag is `False` unless you set it, and calling `.requires_grad_()` with no argument sets it to `True`.

In [11]:
a = torch.randn(2, 2)
a = ((a * 3) / (a - 1))
print(a.requires_grad)
a.requires_grad_(True)
print(a.requires_grad)
b = (a * a).sum()
print(b.grad_fn)

False
True
<SumBackward0 object at 0x7f4eadf84358>


In [12]:
print(a)
print(b)

tensor([[ 1.7722, 46.1572],
        [-2.5183, -7.0657]], requires_grad=True)
tensor(2189.8940, grad_fn=<SumBackward0>)


## **Gradients**


> Let’s backprop now. Because `out` contains a single scalar, `out.backward()` is equivalent to `out.backward(torch.tensor(1.))`.



In [0]:
out.backward()

In [14]:
# Print gradients d(out)/dx
print(x.grad)

tensor([[4.5000, 4.5000],
        [4.5000, 4.5000]])
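
As a check on the `4.5` above, here is a short derivation from the cells that define `y`, `z`, and `out`, writing $o$ for `out` and $x_i$ for the elements of `x`:

$$
o = \frac{1}{4}\sum_i z_i, \qquad z_i = 3\,(x_i + 2)^2
$$

$$
\frac{\partial o}{\partial x_i} = \frac{1}{4}\cdot 6\,(x_i + 2) = \frac{3}{2}\,(x_i + 2), \qquad \left.\frac{\partial o}{\partial x_i}\right|_{x_i = 1} = \frac{9}{2} = 4.5
$$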


In [25]:
# Let's take a look at an example of a vector-Jacobian product:
x = torch.randn(3, requires_grad=True)
print(x)
y = x * 2
print(y)
i = 0
# Keep doubling y until its norm reaches 10; detach() reads the values without tracking.
while y.detach().norm() < 10:
    print("iteration:", i)
    y = y * 2
    i += 1

print(y)

tensor([-0.7681,  2.7217,  0.7993], requires_grad=True)
tensor([-1.5363,  5.4435,  1.5987], grad_fn=<MulBackward0>)
iteration: 0
tensor([-3.0725, 10.8870,  3.1974], grad_fn=<MulBackward0>)


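## Generally speaking, `torch.autograd` is an engine for computing vector-Jacobian products

Because `y` is no longer a scalar, `backward()` needs a vector `v` of the same shape as `y`; autograd then stores the vector-Jacobian product $v^T J$ in `x.grad`.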

In [26]:
v = torch.tensor([0.1, 1.0, 0.0001], dtype=torch.float)
y.backward(v)

print(x.grad)

tensor([4.0000e-01, 4.0000e+00, 4.0000e-04])
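
In the run shown above the loop executed once, so `y = 4 * x`, the Jacobian is 4 times the identity, and `x.grad` is simply `4 * v`. The sketch below checks this against an explicitly built Jacobian. The function `f` and the input values are illustrative stand-ins for the random tensor above, and `torch.autograd.functional.jacobian` is used here only for comparison.

```python
import torch
from torch.autograd.functional import jacobian

def f(x):
    # Same computation as the cell above with a single loop iteration: y = 4 * x.
    return x * 2 * 2

x = torch.tensor([-0.5, 1.5, 0.25], requires_grad=True)  # arbitrary illustrative values
v = torch.tensor([0.1, 1.0, 0.0001])

# autograd computes v^T J directly, without materializing J.
y = f(x)
y.backward(v)

# Build the full 3x3 Jacobian explicitly and form the same product by hand.
J = jacobian(f, x.detach())
vjp = v @ J

print(x.grad)  # matches 4 * v
print(vjp)
```

Building the full Jacobian is only practical for tiny examples like this one; `backward(v)` avoids it.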


In [23]:
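# Stop autograd from tracking history on tensors with requires_grad=True
# by wrapping the code block in torch.no_grad():
print(x.requires_grad)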
print((x ** 2).requires_grad)

with torch.no_grad():
    print((x ** 2).requires_grad)

True
True
False
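
The evaluation use case mentioned earlier can be sketched as follows. The `nn.Linear` layer and the random batch are hypothetical stand-ins for a real trained model and real evaluation data.

```python
import torch
import torch.nn as nn

model = nn.Linear(4, 2)     # hypothetical stand-in for a trained model
inputs = torch.randn(8, 4)  # hypothetical evaluation batch

# The parameters have requires_grad=True, so a normal forward pass is tracked.
tracked = model(inputs)

# Inside no_grad the same forward pass records no graph.
with torch.no_grad():
    untracked = model(inputs)

print(tracked.requires_grad, tracked.grad_fn is not None)  # True True
print(untracked.requires_grad, untracked.grad_fn is None)  # False True
```

The parameters themselves still have `requires_grad=True` after the block; only the operations run inside it go untracked.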


In [24]:
# Or by using .detach() to get a new Tensor with the same content but that does not require gradients:
print(x.requires_grad)
y = x.detach()
print(y.requires_grad)
print(x.eq(y).all())

True
False
tensor(True)
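
One detail worth knowing about `.detach()`: the returned tensor shares storage with the original, so in-place changes to it are visible through the original as well. The sketch below uses illustrative values and adds `.clone()` as one way to get an independent copy.

```python
import torch

x = torch.ones(3, requires_grad=True)

y = x.detach()               # same underlying data, no gradient tracking
y_copy = x.detach().clone()  # independent copy of the data

shares_memory = x.data_ptr() == y.data_ptr()

# Writing into the detached tensor in place also changes what x holds.
y[0] = 5.0

print(shares_memory)  # True
print(x)              # first element is now 5.
print(y_copy)         # tensor([1., 1., 1.])
```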
