# PyTorch Fundamentals
## 2. Tensors with Variables and Gradients

### Variables and Gradients

- A Variable wraps a Tensor
- Allows accumulation of gradients

In [0]:
import torch
from torch.autograd import Variable

The `Variable` import is only needed to create variables in PyTorch versions earlier than 0.4.0.

In [2]:
a = Variable(torch.ones((2,2)), requires_grad=True)
a

tensor([[1., 1.],
        [1., 1.]], requires_grad=True)

In [3]:
# Not a Variable
torch.ones((2,2))

tensor([[1., 1.],
        [1., 1.]])

The `Variable` API has been deprecated since PyTorch 0.4.0:

Variables are no longer necessary to use autograd with tensors. Autograd automatically supports tensors with `requires_grad` set to `True`.

In [4]:
print(torch.__version__)

1.3.1+cu100


In [5]:
a = torch.ones((2, 2), requires_grad=True)
a

tensor([[1., 1.],
        [1., 1.]], requires_grad=True)

In [6]:
a.requires_grad

True

In [0]:
# Not a variable
no_gradient = torch.ones(2, 2)

In [8]:
no_gradient.requires_grad

False
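
The flag can also be switched on after a tensor has been created. The snippet below is a minimal sketch using the in-place method `requires_grad_()`; the name `t` is only illustrative and is not part of the original notebook.

```python
import torch

# A plain tensor does not track gradients by default.
t = torch.ones(2, 2)
print(t.requires_grad)  # False

# requires_grad_() flips the flag in place (it defaults to True).
t.requires_grad_()
print(t.requires_grad)  # True
```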

### Manually and Automatically Calculating Gradients

**What exactly is `requires_grad`?**
- It tells autograd to record operations on the tensor so that gradients can be calculated w.r.t. it

$$y_i = 5(x_i+1)^2$$

In [0]:
import numpy as np

x = torch.from_numpy(np.array([1.0,2.0]))
x.requires_grad=True

$$y_i\bigr\rvert_{x_i=1} = 5(1 + 1)^2 = 5(2)^2 = 5(4) = 20$$

In [10]:
y = 5 * (x + 1) ** 2
y

tensor([20., 45.], dtype=torch.float64, grad_fn=<MulBackward0>)

**`backward()` can be called without arguments only on a scalar (i.e. a 1-element tensor); otherwise a `gradient` argument must be supplied**
- Let's reduce y to a scalar then...
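
For comparison, here is a hedged sketch of the other option: calling `backward()` directly on the non-scalar `y` and passing a `gradient` tensor. The weights of `0.5` are an assumption chosen so that the result matches reducing `y` with half the sum; the notebook itself takes the scalar route.

```python
import torch

x = torch.tensor([1.0, 2.0], requires_grad=True)
y = 5 * (x + 1) ** 2  # non-scalar output with two elements

# y is not a scalar, so backward() needs a gradient tensor of the same shape.
# Each entry weights the corresponding y_i; 0.5 mirrors o = (1/2) * sum(y).
y.backward(gradient=torch.full_like(y, 0.5))

print(x.grad)  # tensor([10., 15.])
```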

$$o = \frac{1}{2}\sum_i y_i$$

In [11]:
o = (1/2) * torch.sum(y)
o

tensor(32.5000, dtype=torch.float64, grad_fn=<MulBackward0>)

<center> <b>Recap `y` equation</b>: $y_i = 5(x_i+1)^2$ </center>

<center> <b>Recap `o` equation</b>: $o = \frac{1}{2}\sum_i y_i$ </center>

<center> <b>Substitute `y` into `o` equation</b>: $o = \frac{1}{2} \sum_i 5(x_i+1)^2$ </center>

$$\frac{\partial o}{\partial x_i} = \frac{1}{2}[10(x_i+1)]$$

$$\frac{\partial o}{\partial x_i}\bigr\rvert_{x_i=1} = \frac{1}{2}[10(1 + 1)] = \frac{10}{2}(2) = 10$$

In [0]:
o.backward()

In [13]:
x.grad

tensor([10., 15.], dtype=torch.float64)
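
The hand-derived formula $\frac{\partial o}{\partial x_i} = 5(x_i+1)$ can be checked against autograd. The following is a small self-contained sketch; `manual_grad` is an illustrative name that does not appear in the original notebook.

```python
import torch

x = torch.tensor([1.0, 2.0], dtype=torch.float64, requires_grad=True)
o = 0.5 * torch.sum(5 * (x + 1) ** 2)
o.backward()

# Manual gradient from the derivation: (1/2) * 10 * (x_i + 1) = 5 * (x_i + 1).
# detach() gives a view of x that is not tracked by autograd.
manual_grad = 5 * (x.detach() + 1)

print(manual_grad)                          # tensor([10., 15.], dtype=torch.float64)
print(torch.allclose(x.grad, manual_grad))  # True
```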

In [14]:
x.requires_grad

True

In [15]:
y.requires_grad

True

In [16]:
o.requires_grad

True
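
The "accumulation of gradients" mentioned at the start of this section means that each `backward()` call adds to `.grad` rather than overwriting it. The sketch below illustrates this with the same equation; the names `first` and `second` are illustrative only.

```python
import torch

x = torch.tensor([1.0, 2.0], requires_grad=True)

# First pass: the gradient of o = (1/2) * sum(5 * (x + 1)^2) is 5 * (x + 1).
o = 0.5 * torch.sum(5 * (x + 1) ** 2)
o.backward()
first = x.grad.clone()
print(first)   # tensor([10., 15.])

# Second pass on a freshly built graph: the new gradient is added to x.grad.
o = 0.5 * torch.sum(5 * (x + 1) ** 2)
o.backward()
second = x.grad.clone()
print(second)  # tensor([20., 30.])

# Reset the accumulated gradient in place before the next pass.
x.grad.zero_()
print(x.grad)  # tensor([0., 0.])
```

This is why training loops typically zero the gradients before each backward pass.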

---
# Summary
- Tensor with Gradients
    - A tensor with `requires_grad=True` accumulates gradients (before PyTorch 0.4.0 this required wrapping it in a `Variable`)
- Gradients
    - Define original equation
    - Substitute equation with `x` values
    - Reduce to a scalar output `o` by taking half the sum of `y` (equal to the mean for two elements)
    - Calculate gradients with `o.backward()`
    - Then access gradients of the `x` variable through `x.grad`