
# Autograd: automatic differentiation

The ``autograd`` package provides automatic differentiation for all operations
on Tensors. It is a define-by-run framework, which means that your backprop is
defined by how your code is run, and that every single iteration can be
different.
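
As a minimal sketch of what "define-by-run" means in practice, the hypothetical helper `forward` below uses an ordinary Python loop, so the recorded graph (and therefore the gradient) depends on how many times the loop body actually ran. The names `forward`, `t`, and `steps` are illustrative and not part of the original notebook.

```python
import torch

def forward(t, steps):
    # Ordinary Python control flow decides which operations get recorded.
    out = t
    for _ in range(steps):
        out = out * t
    return out

t = torch.tensor(2.0, requires_grad=True)

# First run: out = t**3, so d(out)/dt = 3 * t**2 = 12
forward(t, steps=2).backward()
grad_cubic = t.grad.clone()

t.grad.zero_()  # clear the accumulated gradient before the next run

# Second run: out = t**4, so d(out)/dt = 4 * t**3 = 32
forward(t, steps=3).backward()
grad_quartic = t.grad.clone()

print(grad_cubic, grad_quartic)
```

The same Python function produced two different graphs, one per call, without any separate graph-definition step.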

In [1]:
# Import package
import torch

In [21]:
x = torch.tensor([[2, 1], [2, 1], [3, 1]], requires_grad=True, dtype=torch.float32)

In [22]:
print("[INFO]")
print("The size of x is {}".format(x.size()))
print("The number of dimensions of x is {}".format(x.dim()))

[INFO]
The size of x is torch.Size([3, 2])
The number of dimensions of x is 2


In [23]:
print(x)

tensor([[2., 1.],
        [2., 1.],
        [3., 1.]], requires_grad=True)


In [24]:
y = x + 3

In [25]:
print(y.grad_fn)

<AddBackward0 object at 0x7f6c47468e10>


There is one more class that is central to the autograd implementation: Function. Tensors and Functions are interconnected and build up an acyclic graph that encodes the complete history of a computation. Each tensor has a .grad_fn attribute that references the Function that created it (tensors created by the user have None as .grad_fn).

In [26]:
print(x.grad_fn)

None


In [27]:
y.grad_fn.next_functions[0][0].variable

tensor([[2., 1.],
        [2., 1.],
        [3., 1.]], requires_grad=True)
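
The single `next_functions` lookup above can be generalized into a walk over the whole graph. This is only a sketch: `list_graph_nodes`, `p`, and `q` are hypothetical names introduced here, and the exact node class names (such as the numeric suffix on a backward node) may vary between PyTorch versions.

```python
import torch

def list_graph_nodes(fn, depth=0, out=None):
    """Collect (depth, node type name) pairs by following next_functions."""
    if out is None:
        out = []
    if fn is None:
        # Inputs that need no gradient (e.g. the constant 3) appear as None.
        return out
    out.append((depth, type(fn).__name__))
    for next_fn, _ in fn.next_functions:
        list_graph_nodes(next_fn, depth + 1, out)
    return out

p = torch.tensor([1.0, 2.0], requires_grad=True)
q = (p + 3).mean()

nodes = list_graph_nodes(q.grad_fn)
for depth, name in nodes:
    print("  " * depth + name)
```

The walk starts at the node that produced `q`, passes through the node for the addition, and ends at the node that accumulates the gradient into the user-created tensor `p`.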

In autograd, if any input Tensor of an operation has requires_grad=True, the computation is tracked. After the backward pass is computed, the gradient with respect to that tensor is accumulated into its .grad attribute.
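
"Accumulated" is meant literally: a second backward pass adds to `.grad` rather than overwriting it. The short sketch below illustrates this with a hypothetical tensor `v` that is not part of the original notebook.

```python
import torch

v = torch.tensor([1.0, 2.0], requires_grad=True)

# First backward pass: d(sum(2 * v))/dv = [2, 2]
(2 * v).sum().backward()
first = v.grad.clone()

# Second backward pass on a fresh graph: the new gradient is added to .grad
(2 * v).sum().backward()
second = v.grad.clone()

# Reset .grad before the next pass if accumulation is not wanted
v.grad = None
print(first, second, v.grad)
```

This is why training loops usually clear gradients before each backward pass.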

In [29]:
z = 4 * (y ** 2)
a = z.mean()

In [32]:
# Visualize the computational graph with the torchviz package (not preinstalled on Google Colab; install it first)
import torchviz as tv
tv.make_dot(a)

ModuleNotFoundError: ignored

## Gradients
* Calling backward() on a scalar computes its partial derivatives with respect to every tensor in the graph that has requires_grad=True.

In [33]:
a.backward()

In [37]:
# da/dx, computed element by element
x.grad

tensor([[6.6667, 5.3333],
        [6.6667, 5.3333],
        [8.0000, 5.3333]])
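
These values can be checked by hand. Since a is the mean over 6 elements of 4 * (x + 3) ** 2, each partial derivative is 8 * (x + 3) / 6 = 4 * (x + 3) / 3. The sketch below rebuilds the same computation under new, illustrative names (`x_check`, `a_check`, `expected`) and compares autograd's result with that closed form.

```python
import torch

x_check = torch.tensor([[2, 1], [2, 1], [3, 1]],
                       dtype=torch.float32, requires_grad=True)
a_check = (4 * (x_check + 3) ** 2).mean()
a_check.backward()

# Closed form: da/dx = 8 * (x + 3) / 6 = 4 * (x + 3) / 3
expected = 4 * (x_check.detach() + 3) / 3
print(torch.allclose(x_check.grad, expected))
```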

## Inference

In [38]:
# n sets the length of the tensors below
n = 3
# Both x and w allow gradient accumulation
x = torch.arange(1., n + 1, requires_grad=True)
w = torch.ones(n, requires_grad=True)
z = w @ x
z.backward()
print(x.grad, w.grad, sep='\n')

tensor([1., 1., 1.])
tensor([1., 2., 3.])
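
The two printed gradients follow from writing the product out. As a short derivation, with $n = 3$ as in the cell above:

$$
z = w \cdot x = \sum_{i=1}^{n} w_i x_i, \qquad \frac{\partial z}{\partial w_i} = x_i, \qquad \frac{\partial z}{\partial x_i} = w_i
$$

So x.grad equals w, which is [1, 1, 1], and w.grad equals x, which is [1, 2, 3].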


In [39]:
# Only w allows gradient accumulation
x = torch.arange(1., n + 1)
w = torch.ones(n, requires_grad=True)
z = w @ x
z.backward()
print(x.grad, w.grad, sep='\n')

None
tensor([1., 2., 3.])


In [40]:
x = torch.arange(1., n + 1)
w = torch.ones(n, requires_grad=True)

# Inside this context no operation is tracked, so the results do not take part in gradient accumulation
with torch.no_grad():
    z = w @ x

try:
    z.backward()  # raises RuntimeError, since z does not require grad and has no grad_fn
except RuntimeError as e:
    print('RuntimeError!!! >:[')
    print(e)

RuntimeError!!! >:[
element 0 of tensors does not require grad and does not have a grad_fn
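
One way to avoid hitting this error is to inspect `requires_grad` and `grad_fn` before calling backward(). The sketch below also shows `detach()`, which cuts a single tensor out of the graph without a context manager. The variable names with the `_nd` suffix and the `z_*` names are illustrative only.

```python
import torch

x_nd = torch.arange(1., 4.)
w_nd = torch.ones(3, requires_grad=True)

with torch.no_grad():
    z_inside = w_nd @ x_nd        # not tracked

z_outside = w_nd @ x_nd           # tracked as usual
z_detached = z_outside.detach()   # same value, cut off from the graph

print(z_inside.requires_grad, z_inside.grad_fn)
print(z_outside.requires_grad, z_detached.requires_grad)
```

Only `z_outside` can be passed to backward() here; the other two carry the same value but no history.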


## More stuff

* The reference for this code is listed below; it covers more features and contains more examples.

Documentation of the automatic differentiation package is at
http://pytorch.org/docs/autograd.