# PyTorch Fundamentals

## Tensors with Gradients

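### Creating Tensors with Gradients

* Setting `requires_grad=True` tells autograd to track operations on a tensor (before PyTorch 0.4 this required wrapping the tensor in a `Variable`)
* Gradients accumulate in the tensor's `.grad` attribute each time `backward()` is called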

In [1]:
import torch

a = torch.ones((2, 2), requires_grad=True)
a

tensor([[1., 1.],
        [1., 1.]], requires_grad=True)

In [2]:
a.requires_grad

True

In [3]:
# Not tracked by autograd: requires_grad defaults to False
no_gradient = torch.ones(2, 2)

no_gradient.requires_grad

False
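
To make "accumulation of gradients" concrete, here is a minimal sketch. The names `w` and `loss` and the function `3 * w` are illustrative assumptions, not part of the original notebook. It shows that a second `backward()` call adds to `.grad` rather than overwriting it, and that `.grad` has to be reset explicitly.

```python
import torch

# Hypothetical tensor and loss, used only for this illustration.
w = torch.ones(2, requires_grad=True)

loss = (3 * w).sum()
loss.backward()
first_grad = w.grad.clone()    # d(loss)/dw_i = 3

loss = (3 * w).sum()           # rebuild the graph before calling backward again
loss.backward()
second_grad = w.grad.clone()   # accumulated: 3 + 3 = 6

w.grad.zero_()                 # reset the accumulated gradient in place

print(first_grad)              # tensor([3., 3.])
print(second_grad)             # tensor([6., 6.])
print(w.grad)                  # tensor([0., 0.])
```

This accumulation is why training loops usually clear gradients before each backward pass.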

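### Behaves like an ordinary tensor

Arithmetic works as usual. The results additionally carry a `grad_fn` that records the operation that produced them.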

In [5]:
b = torch.ones((2, 2), requires_grad=True)
print(a + b)
print(torch.add(a, b))

tensor([[2., 2.],
        [2., 2.]], grad_fn=<AddBackward0>)
tensor([[2., 2.],
        [2., 2.]], grad_fn=<AddBackward0>)


In [6]:
print(a * b)
print(torch.mul(a, b))

tensor([[1., 1.],
        [1., 1.]], grad_fn=<MulBackward0>)
tensor([[1., 1.],
        [1., 1.]], grad_fn=<MulBackward0>)



# Manually and Automatically Calculating Gradients

### What exactly is `requires_grad`?

Setting `requires_grad=True` lets autograd compute gradients with respect to that tensor. As a running example, take:
$$y_i = 5(x_i+1)^2$$

In [7]:
x = torch.ones(2, requires_grad=True)
x

tensor([1., 1.], requires_grad=True)


$$y_i\bigr\rvert_{x_i=1} = 5(1 + 1)^2 = 5(2)^2 = 5(4) = 20$$

In [8]:
y = 5 * (x + 1) ** 2
y

tensor([20., 20.], grad_fn=<MulBackward0>)

### `backward()` should be called only on a scalar (i.e. a 1-element tensor) or with an explicit `gradient` argument

`y` has two elements, so reduce it to a scalar first:
$$o = \frac{1}{2}\sum_i y_i$$

In [9]:
o = (1/2) * torch.sum(y)
o

tensor(20., grad_fn=<MulBackward0>)


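Manually, differentiating $o$ with the chain rule gives:
$$\frac{\partial o}{\partial x_i} = \frac{1}{2}[10(x_i+1)]$$
$$\frac{\partial o}{\partial x_i}\bigr\rvert_{x_i=1} = \frac{1}{2}[10(1 + 1)] = \frac{10}{2}(2) = 10$$

Automatically, calling `o.backward()` makes autograd compute the same value and store it in `x.grad`: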

In [10]:
o.backward()

In [11]:
x.grad

tensor([10., 10.])
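
The other option named in the heading, calling `backward()` on a non-scalar with a `gradient` argument, can be sketched as follows. The names `x_vec` and `y_vec` are hypothetical stand-ins for a fresh copy of `x` and `y`. The tensor passed as `gradient` holds the weights $\partial o / \partial y_i = \frac{1}{2}$, so the result should match the scalar route.

```python
import torch

# Fresh tensors so this sketch does not add to the x.grad computed earlier.
x_vec = torch.ones(2, requires_grad=True)
y_vec = 5 * (x_vec + 1) ** 2          # two elements, not a scalar

# Supply d(o)/d(y_i) = 1/2 directly instead of reducing y_vec to a scalar.
y_vec.backward(gradient=torch.full_like(y_vec, 0.5))

print(x_vec.grad)                     # tensor([10., 10.])
```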

Because `y` and `o` are computed from `x`, they require gradients as well:

In [12]:
print(x.requires_grad)
print(y.requires_grad)
print(o.requires_grad)

True
True
True
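
As a sanity check on the manual derivation, the hand-derived formula $\frac{\partial o}{\partial x_i} = 5(x_i + 1)$ can be compared against autograd at several input values. This is a sketch under assumptions: `x_check`, `o_check`, `auto_grad`, and `manual_grad` are hypothetical names, and `torch.autograd.grad` is used in place of `backward()` so that the gradient is returned directly instead of being accumulated in `.grad`.

```python
import torch

# Illustrative inputs; any values would do.
x_check = torch.tensor([0.0, 1.0, 2.0], requires_grad=True)
o_check = 0.5 * torch.sum(5 * (x_check + 1) ** 2)

# Automatic: returns a tuple with one gradient per requested input.
(auto_grad,) = torch.autograd.grad(o_check, x_check)

# Manual: (1/2) * 10 * (x_i + 1) = 5 * (x_i + 1)
manual_grad = 5 * (x_check.detach() + 1)

print(auto_grad)     # tensor([ 5., 10., 15.])
print(manual_grad)   # tensor([ 5., 10., 15.])
```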



# Summary

### Tensor with Gradients
* A tensor created with `requires_grad=True` accumulates gradients in `.grad`

### Gradients
* Define the original equation
* Substitute the `x` values into the equation
* Reduce to a scalar output `o` (here half the sum, which is the mean of the two elements)
* Calculate gradients with `o.backward()`
* Access the gradients of `x` through `x.grad`