### Variables
A `Variable`:
1. Wraps a Tensor.
2. Records the operations applied to it, so gradients can be accumulated in its `.grad` attribute.

In [2]:
import torch
from torch.autograd import Variable

In [5]:
a = Variable(torch.ones(2, 2), requires_grad=True)
a

Variable containing:
 1  1
 1  1
[torch.FloatTensor of size 2x2]

In [8]:
# A plain Tensor, not a Variable
torch.ones(2, 2)


 1  1
 1  1
[torch.FloatTensor of size 2x2]

In [12]:
# Variables support the same operations as Tensors
b = Variable(torch.ones(2, 2), requires_grad=True)
print(a + b)
# Equivalent functional form
print(torch.add(a, b))


Variable containing:
 2  2
 2  2
[torch.FloatTensor of size 2x2]

Variable containing:
 2  2
 2  2
[torch.FloatTensor of size 2x2]
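
The "accumulation of gradients" mentioned at the top of this section can be sketched as follows. This is a minimal illustration, not part of the original notebook: the name `w` is made up for the example, and it assumes a PyTorch version where `Variable` is still importable (from PyTorch 0.4 onward it returns a plain Tensor, so printed output looks different from the cells above).

```python
import torch
from torch.autograd import Variable

# Illustrative Variable; the name `w` is an assumption, not from the notebook.
w = Variable(torch.ones(2, 2), requires_grad=True)

# First backward pass: d(sum(2 * w))/dw is 2 for every element.
(w * 2).sum().backward()
first = w.grad.data.clone()

# Second backward pass: the new gradient is added to w.grad, not written over it.
(w * 2).sum().backward()
second = w.grad.data.clone()

print(first)   # every entry is 2
print(second)  # every entry is 4

# Reset the accumulated gradient in place before the next pass.
w.grad.data.zero_()
```

Because gradients add up across calls to `backward()`, training loops usually zero them before each new pass.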



### Gradients

#### Why are we using requires_grad?
 - Setting `requires_grad=True` tells autograd to track operations on the Variable, so that gradients with respect to it can be computed.
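
As a rough sketch of the difference the flag makes, the snippet below builds one Variable with `requires_grad=True` and one without, then backpropagates through a product of the two. The names `tracked`, `untracked`, and `loss` are illustrative assumptions, not from the original notebook.

```python
import torch
from torch.autograd import Variable

tracked = Variable(torch.ones(2, 2), requires_grad=True)
untracked = Variable(torch.ones(2, 2))  # requires_grad defaults to False

# loss = sum(3 * tracked * untracked), so d(loss)/d(tracked) = 3 * untracked
loss = (tracked * untracked * 3).sum()
loss.backward()

print(tracked.grad)    # every entry is 3
print(untracked.grad)  # None: no gradient was computed for this Variable
```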
 

#### Autograd: automatic differentiation

Central to all neural networks in PyTorch is the ``autograd`` package.
Let’s first briefly visit this, and we will then go to training our
first neural network.


The ``autograd`` package provides automatic differentiation for all operations
on Tensors. It is a define-by-run framework, which means that your backprop is
defined by how your code is run, and that every single iteration can be
different.
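
To make "define-by-run" concrete, here is a small sketch in which the number of operations recorded depends on the data, so the backward pass differs from one call to the next. The helper `grad_after_doubling` and its `limit` parameter are hypothetical names introduced only for this illustration.

```python
import torch
from torch.autograd import Variable

def grad_after_doubling(limit):
    """Double x until its sum reaches `limit`, then backpropagate.

    The loop count is decided at run time, so the recorded graph
    (and therefore the gradient) changes with `limit`.
    """
    x = Variable(torch.ones(2), requires_grad=True)
    y = x
    steps = 0
    while y.data.sum() < limit:
        y = y * 2
        steps += 1
    y.sum().backward()
    return steps, x.grad.data.tolist()

print(grad_after_doubling(3.0))   # (1, [2.0, 2.0])
print(grad_after_doubling(10.0))  # (3, [8.0, 8.0])
```

Each doubling multiplies the gradient by 2, so a graph with `steps` doublings yields a gradient of `2 ** steps` for every element.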

In [1]:
import torch
from torch.autograd import Variable

In [2]:
# Create a Variable and track operations on it
x = Variable(torch.ones(2, 2), requires_grad=True)
print(x)

Variable containing:
 1  1
 1  1
[torch.FloatTensor of size 2x2]



In [5]:
# Apply an operation to the Variable
y = x + 2
print(y)

Variable containing:
 3  3
 3  3
[torch.FloatTensor of size 2x2]



``y`` was created as a result of an operation, so it has a ``grad_fn``.

In [6]:
print(y.grad_fn)

<torch.autograd.function.AddConstantBackward object at 0x1028c5a98>
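
For contrast, a Variable created directly by the user has no `grad_fn`. The sketch below rebuilds `x` and `y` as in the cells above so that it runs on its own; the exact class name printed for `y.grad_fn` depends on the PyTorch version, so the example only checks that it is set.

```python
import torch
from torch.autograd import Variable

x = Variable(torch.ones(2, 2), requires_grad=True)
y = x + 2

print(x.grad_fn)              # None: x was created by the user, not by an operation
print(y.grad_fn is not None)  # True: y remembers the operation that produced it
print(x.is_leaf, y.is_leaf)   # True False
```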


In [7]:
z = y * y * 3
out = z.mean()

print(z, out)

Variable containing:
 27  27
 27  27
[torch.FloatTensor of size 2x2]
 Variable containing:
 27
[torch.FloatTensor of size 1]



#### Gradients in action

In [8]:
out.backward()

Print the gradients d(out)/dx:

In [59]:
print(x.grad)

Variable containing:
 4.5000  4.5000
 4.5000  4.5000
[torch.FloatTensor of size 2x2]



You should get a 2x2 matrix filled with ``4.5``. Let’s call the ``out``
*Variable* “$o$”.
We have
$o = \frac{1}{4}\sum_i z_i$,
$z_i = 3(x_i+2)^2$ and $z_i\bigr\rvert_{x_i=1} = 27$.
By the chain rule,
$\frac{\partial o}{\partial x_i} = \frac{1}{4}\cdot 6(x_i+2) = \frac{3}{2}(x_i+2)$, hence
$\frac{\partial o}{\partial x_i}\bigr\rvert_{x_i=1} = \frac{9}{2} = 4.5$.
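
The derivation can also be checked numerically without autograd. The following is a plain-Python sketch that compares the closed-form gradient $\frac{3}{2}(x_i+2)$ with a central-difference estimate; the helper names `out_fn` and `numeric_grad` and the step size `eps` are choices made for this illustration.

```python
def out_fn(xs):
    """o = mean of 3 * (x_i + 2)^2 over all entries."""
    return sum(3 * (x + 2) ** 2 for x in xs) / len(xs)

def numeric_grad(f, xs, eps=1e-6):
    """Central-difference estimate of df/dx_i for each i."""
    grads = []
    for i in range(len(xs)):
        up = list(xs)
        down = list(xs)
        up[i] += eps
        down[i] -= eps
        grads.append((f(up) - f(down)) / (2 * eps))
    return grads

xs = [1.0, 1.0, 1.0, 1.0]  # the 2x2 matrix of ones, flattened
analytic = [1.5 * (x + 2) for x in xs]
numeric = numeric_grad(out_fn, xs)

print(out_fn(xs))  # 27.0
print(analytic)    # [4.5, 4.5, 4.5, 4.5]
print(numeric)     # approximately 4.5 in every position
```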

