<a href="https://colab.research.google.com/github/desaiankitb/pytorch-basics/blob/main/tutorial-pytorch-org/00_4_Automatic_Differentiation.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Automatic Differentiation

## Automatic Differentiation with `torch.autograd`

- When training neural networks, the most frequently used algorithm is **backpropagation**. In this algorithm, parameters (model weights) are adjusted according to the **gradient** of the loss function with respect to each parameter.

- To compute those gradients, PyTorch has a built-in differentiation engine called `torch.autograd`. It supports automatic computation of gradients for any computational graph.

- Consider the simplest one-layer neural network, with input `x`, parameters `w` and `b`, and some loss function. It can be defined in PyTorch in the following manner:

In [1]:
import torch

x = torch.ones(5)   # input tensor
y = torch.zeros(3)  # expected output
w = torch.randn(5, 3, requires_grad=True)  # weights: track gradients
b = torch.randn(3, requires_grad=True)     # bias: track gradients
z = torch.matmul(x, w) + b                 # logits of the linear layer
loss = torch.nn.functional.binary_cross_entropy_with_logits(z, y)
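
As a rough sketch of how the gradients described above are obtained and then used, the snippet below rebuilds the same network, calls `loss.backward()`, and applies one hand-written gradient descent step. The fixed seed, the learning rate `lr`, and the manual update are illustrative assumptions, not part of the original example; in practice an optimizer from `torch.optim` would usually perform the update.

```python
import torch

torch.manual_seed(0)  # assumption: fixed seed so the run is repeatable

x = torch.ones(5)   # input tensor
y = torch.zeros(3)  # expected output
w = torch.randn(5, 3, requires_grad=True)
b = torch.randn(3, requires_grad=True)

z = torch.matmul(x, w) + b
loss = torch.nn.functional.binary_cross_entropy_with_logits(z, y)

# Backpropagate: fills .grad on each leaf tensor created with requires_grad=True
loss.backward()
print(w.grad.shape)  # torch.Size([5, 3])
print(b.grad.shape)  # torch.Size([3])

# One manual gradient descent step (lr is a hypothetical learning rate).
# torch.no_grad() keeps the update itself out of the computational graph.
lr = 0.1
with torch.no_grad():
    w -= lr * w.grad
    b -= lr * b.grad

# Recompute the loss with the updated parameters
new_loss = torch.nn.functional.binary_cross_entropy_with_logits(
    torch.matmul(x, w) + b, y
)
print(loss.item(), new_loss.item())
```

Because `x` is all ones here, every row of `w.grad` equals `b.grad`, which is a quick sanity check on the result.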

## Tensors, Functions and Computational graph

This code defines the following **computational graph**:

![Diagram showing a computational graph with two parameters 'w' and 'b' to compute the gradients of loss.](https://drive.google.com/uc?export=view&id=1_hwyhjaJhQah0YS44R7jfrkHSYEyk_Mu)

- Note: [reference to use image in colab from drive](https://stackoverflow.com/questions/50670920/how-to-insert-an-inline-image-in-google-colaboratory-from-google-drive)

- In this network, `w` and `b` are **parameters**, which we need to optimize. Thus, we need to be able to compute the gradients of the loss function with respect to those variables. To do that, we set the `requires_grad` property of those tensors.

> **Note:** You can set the value of `requires_grad` when creating a tensor, or later by using the `x.requires_grad_(True)` method.
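
The note above can be illustrated with a minimal sketch. The tensor name `v` and the expression `(v * 2).sum()` are hypothetical and chosen only for the demonstration.

```python
import torch

# Created without gradient tracking (the default)
v = torch.randn(3)
print(v.requires_grad)  # False

# Enable tracking afterwards; the trailing underscore marks an in-place method
v.requires_grad_(True)
print(v.requires_grad)  # True

# Results computed from v are now part of a computational graph
out = (v * 2).sum()
print(out.grad_fn is not None)  # True
```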

- A function that we apply to tensors to construct a computational graph is in fact an object of class `Function`. This object knows how to compute the function in the *forward direction*, and also how to compute its derivative during the *backward propagation* step. A reference to the backward propagation function is stored in the `grad_fn` property of a tensor. You can find more information about `Function` [in the documentation](https://pytorch.org/docs/stable/autograd.html#function).

In [2]:
print("Gradient function for z =", z.grad_fn)
print("Gradient function for loss = ", loss.grad_fn)

Gradient function for z = <AddBackward0 object at 0x7f344d42d850>
Gradient function for loss =  <BinaryCrossEntropyWithLogitsBackward object at 0x7f344d42d890>
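
To see how these `grad_fn` objects link together into a graph, one option is to follow their `next_functions` attribute back toward the leaf tensors. The helper `walk` below is a hypothetical name written for this sketch, and the intermediate node names it prints may differ between PyTorch versions.

```python
import torch

x = torch.ones(5)
w = torch.randn(5, 3, requires_grad=True)
b = torch.randn(3, requires_grad=True)
z = torch.matmul(x, w) + b


def walk(fn, depth=0):
    """Return the backward graph as a list of (depth, node name) pairs."""
    if fn is None:  # inputs that do not require gradients (such as x) show up as None
        return []
    nodes = [(depth, type(fn).__name__)]
    # next_functions holds (node, input index) pairs for the inputs of this node
    for parent, _ in fn.next_functions:
        nodes.extend(walk(parent, depth + 1))
    return nodes


for depth, name in walk(z.grad_fn):
    print("  " * depth + name)
```

The root of the printed tree is the `AddBackward0` node shown above, and the leaves are `AccumulateGrad` nodes, which are where the gradients for `w` and `b` are accumulated into `.grad`.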
