# **Auto gradients of tensor objects**

As we saw in the previous chapter, differentiation and calculating gradients play a critical role in updating the weights of a neural network. PyTorch's tensor objects come with built-in functionality to calculate gradients.


*   In this section, we will see how to calculate the gradients of a tensor object using PyTorch:



# Define a tensor object and specify that gradients are to be calculated for it

In [3]:
import numpy as np
import torch
x = torch.tensor([[2.,-1.],[1.,1.]],requires_grad=True)
print(x)

tensor([[ 2., -1.],
        [ 1.,  1.]], requires_grad=True)


The requires_grad parameter specifies that gradients are to be calculated for the tensor object.
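As a small sketch of what this flag changes (the names `a`, `b`, and `default_flag` are illustrative and not part of the notebook): a tensor does not track gradients unless asked to, and tracking can also be switched on in place with `requires_grad_()`.

```python
import torch

# A tensor created without the flag does not track gradients.
a = torch.tensor([[2., -1.], [1., 1.]])
default_flag = a.requires_grad

# Passing requires_grad=True at creation turns tracking on.
b = torch.tensor([[2., -1.], [1., 1.]], requires_grad=True)

# Tracking can also be enabled in place on an existing floating-point tensor.
a.requires_grad_()

print(default_flag, b.requires_grad, a.requires_grad)
```

Only floating-point (and complex) tensors can require gradients, which is why the values above are written with a trailing decimal point.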

# Next, define how the output is calculated, which in this case is the sum of the squares of all inputs:

In [4]:
out = x.pow(2).sum()

In [5]:
out

tensor(7., grad_fn=<SumBackward0>)

We know that the gradient of the preceding function with respect to x is 2*x. Let's validate this using the built-in functions provided by PyTorch. First, compute the expected gradient directly:

In [9]:
2*x

tensor([[ 4., -2.],
        [ 2.,  2.]], grad_fn=<MulBackward0>)

# The gradient of a value can be calculated by calling the backward() method on that value. In our case, we calculate the gradient (the change in out, the output, for a small change in x, the input) as follows:

In [10]:
out.backward()

# We are now in a position to obtain the gradient of out with respect to x, as follows:

In [12]:
x.grad

tensor([[ 4., -2.],
        [ 2.,  2.]])

Notice that the gradients obtained above match the expected values, which are two times the value of x.
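The same comparison can be made programmatically rather than by eye. The following is a minimal, self-contained sketch that repeats the steps above and checks the result with `torch.allclose`; the names `expected` and `matches` are illustrative.

```python
import torch

# Same tensor and output as in the cells above.
x = torch.tensor([[2., -1.], [1., 1.]], requires_grad=True)
out = x.pow(2).sum()

# Populate x.grad with d(out)/dx.
out.backward()

# Analytical gradient of sum(x**2) is 2*x; detach() keeps this
# comparison value out of the autograd graph.
expected = 2 * x.detach()
matches = torch.allclose(x.grad, expected)

print(matches)
```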

# **Computing gradients for the same case that was presented in the Chain_rule.ipynb notebook in the previous chapter**

In [13]:
x = np.array([[1, 1]])
y = np.array([[0]])
x, y = [torch.tensor(i).float() for i in [x, y]]

In [14]:
W = [
    np.array([[-0.0053, 0.3793],
              [-0.5820, -0.5204],
              [-0.2723, 0.1896]], dtype=np.float32).T, 
    np.array([-0.0140, 0.5607, -0.0628], dtype=np.float32), 
    np.array([[ 0.1528, -0.1745, -0.1135]], dtype=np.float32).T, 
    np.array([-0.5516], dtype=np.float32)
]
W = [torch.tensor(i, requires_grad=True) for i in W]

In [15]:
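def feed_forward(inputs, outputs, weights):
    # NOTE: weights[1] (the hidden bias) is added to weights[0] before the matmul;
    # the usual form is torch.matmul(inputs, weights[0]) + weights[1], which gives
    # different values from the outputs recorded in this notebook.
    pre_hidden = torch.matmul(inputs, weights[0] + weights[1])
    hidden = 1/(1+torch.exp(-pre_hidden))  # sigmoid activation
    out = torch.matmul(hidden, weights[2]) + weights[3]  # linear output layer
    mean_squared_error = torch.mean(torch.square(out - outputs))
    return mean_squared_error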

In [16]:
loss = feed_forward(x,y,W)
loss

tensor(0.3613, grad_fn=<MeanBackward0>)

In [17]:
loss.backward()

In [18]:
print([w.grad for w in W])

[tensor([[-0.0446,  0.0524,  0.0337],
        [-0.0446,  0.0524,  0.0337]]), tensor([-0.0891,  0.1049,  0.0675]), tensor([[-0.7040],
        [-0.6068],
        [-0.5387]]), tensor([-1.2021])]
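One way to gain confidence in gradients reported by autograd is to compare a single entry against a numerical estimate. The sketch below does this for the output bias using a central finite difference. It is an illustration under stated assumptions: `loss_fn` is a hypothetical helper that applies the hidden bias after the matrix product (`inputs @ W + b`), `eps` is an arbitrary small step, and everything is held in float64 to keep rounding error low, so its numbers are not expected to reproduce the outputs printed above.

```python
import torch

def loss_fn(inputs, targets, weights):
    # Hypothetical helper: one sigmoid hidden layer, then a linear output,
    # scored with mean squared error.
    hidden = torch.sigmoid(inputs @ weights[0] + weights[1])
    pred = hidden @ weights[2] + weights[3]
    return torch.mean((pred - targets) ** 2)

f64 = torch.float64
inputs = torch.tensor([[1., 1.]], dtype=f64)
targets = torch.tensor([[0.]], dtype=f64)
weights = [
    torch.tensor([[-0.0053, -0.5820, -0.2723],
                  [0.3793, -0.5204, 0.1896]], dtype=f64, requires_grad=True),
    torch.tensor([-0.0140, 0.5607, -0.0628], dtype=f64, requires_grad=True),
    torch.tensor([[0.1528], [-0.1745], [-0.1135]], dtype=f64, requires_grad=True),
    torch.tensor([-0.5516], dtype=f64, requires_grad=True),
]

# Gradient of the loss with respect to the output bias, from autograd.
loss_fn(inputs, targets, weights).backward()
autograd_value = weights[3].grad.item()

# Central finite difference for the same quantity.
eps = 1e-6  # assumed step size for the numerical estimate
with torch.no_grad():
    weights[3] += eps
    loss_plus = loss_fn(inputs, targets, weights).item()
    weights[3] -= 2 * eps
    loss_minus = loss_fn(inputs, targets, weights).item()
    weights[3] += eps  # restore the original value
numeric_value = (loss_plus - loss_minus) / (2 * eps)

print(autograd_value, numeric_value)
```

The two printed numbers should agree to several decimal places; a large gap would point to a mistake in the forward function rather than in autograd.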


In [19]:
# Gradient descent step with an implicit learning rate of 1
updated_w = [w - w.grad for w in W]

In [20]:
updated_w

[tensor([[ 0.0393, -0.6344, -0.3060],
         [ 0.4239, -0.5728,  0.1559]], grad_fn=<SubBackward0>),
 tensor([ 0.0751,  0.4558, -0.1303], grad_fn=<SubBackward0>),
 tensor([[0.8568],
         [0.4323],
         [0.4252]], grad_fn=<SubBackward0>),
 tensor([0.6505], grad_fn=<SubBackward0>)]
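The updated tensors above carry `grad_fn=<SubBackward0>` because the subtraction itself is recorded by autograd. A commonly used alternative, sketched here on a toy tensor rather than on `W`, is to update the weights in place inside `torch.no_grad()` and then reset the stored gradient. The learning rate `lr` and the tensor `w` are hypothetical values chosen only for illustration.

```python
import torch

# Toy parameter and loss; the gradient of sum(w**2) is 2*w.
w = torch.tensor([1.0, -2.0], requires_grad=True)
loss = (w ** 2).sum()
loss.backward()

lr = 0.1  # hypothetical learning rate
with torch.no_grad():
    # In-place update that is not recorded in the autograd graph,
    # so w stays a leaf tensor that still requires gradients.
    w -= lr * w.grad

# Clear the stored gradient so the next backward() call starts from zero
# instead of adding to the old values.
w.grad.zero_()

print(w)
```

Keeping the update out of the graph means `w` can be reused directly in the next forward pass, whereas the list comprehension above produces new non-leaf tensors.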