
### PyTorch Autograd

PyTorch Autograd is the automatic differentiation engine that powers neural network training within the PyTorch framework. It facilitates the computation of gradients for tensor operations, which is crucial for optimizing machine learning models using algorithms like gradient descent.

Neural networks (NNs) are a collection of nested functions that are executed on some input data. These functions are defined by parameters (consisting of weights and biases), which in PyTorch are stored in tensors.

Training a NN happens in two steps:

* **Forward Propagation:** In forward prop, the NN makes its best guess about the correct output. It runs the input data through each of its functions to make this guess.

* **Backward Propagation:** In backprop, the NN adjusts its parameters proportionate to the error in its guess. It does this by traversing backwards from the output, collecting the derivatives of the error with respect to the parameters of the functions (gradients), and optimizing the parameters using gradient descent.
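
The two steps above can be sketched as a single training iteration for a one-parameter linear model. This is only a minimal sketch: the data point, the learning rate `lr`, and the names `x_in`, `target`, `w`, `b`, and `loss` are illustrative assumptions, not part of the original notebook.

```python
import torch

# Hypothetical toy data: one input and the output we want the model to produce.
x_in = torch.tensor(2.0)
target = torch.tensor(10.0)

# Parameters (a weight and a bias) stored in tensors that track gradients.
w = torch.tensor(1.0, requires_grad=True)
b = torch.tensor(0.0, requires_grad=True)
lr = 0.1  # assumed learning rate

# Forward propagation: compute the model's guess and its squared error.
pred = w * x_in + b
loss = (pred - target) ** 2

# Backward propagation: compute d(loss)/dw and d(loss)/db.
loss.backward()

# Gradient descent step, done under no_grad() so the update itself
# is not recorded in the computational graph.
with torch.no_grad():
    w -= lr * w.grad
    b -= lr * b.grad

print(w.item(), b.item())  # approximately 4.2 and 1.6
```

With these numbers the gradients are -32 for `w` and -16 for `b`, so one step moves the parameters to roughly 4.2 and 1.6.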

In [1]:
import torch

In [3]:
x = torch.tensor(3.0, requires_grad=True)
y = x**2
print(x)
print(y)
# Compute dy/dx by backpropagating from y
y.backward()
# The gradient is stored in x.grad
print(x.grad)  # dy/dx = 2*x = 6

tensor(3., requires_grad=True)
tensor(9., grad_fn=<PowBackward0>)
tensor(6.)
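
One way to sanity-check the value autograd reports is a numerical approximation of the derivative. The sketch below uses plain Python and a central difference; the helper name `central_difference` and the step size `h` are assumptions chosen for illustration.

```python
def f(x):
    return x ** 2

def central_difference(func, x, h=1e-5):
    # Symmetric difference quotient: (f(x + h) - f(x - h)) / (2h)
    return (func(x + h) - func(x - h)) / (2 * h)

approx = central_difference(f, 3.0)
print(approx)  # close to 6.0, matching x.grad above
```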


In [5]:
import math

# Analytic derivative of z = sin(x**2), used to check autograd's result
def dz_dx(x):
  return 2*x*math.cos(x**2)
print(dz_dx(3))
x = torch.tensor(3.0, requires_grad=True)
print(x)
y = x**2
print(y)
z = torch.sin(y)
print(z)
# Compute dz/dx by backpropagating from z through y to x
z.backward()
# The gradient should match dz_dx(3) above
print(x.grad)


-5.466781571308061
tensor(3., requires_grad=True)
tensor(9., grad_fn=<PowBackward0>)
tensor(0.4121, grad_fn=<SinBackward0>)
tensor(-5.4668)
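
The hand-written `dz_dx` follows from the chain rule applied to $z = \sin(y)$ with $y = x^2$, which is the same rule autograd applies when it walks back from `z` to `x`:

$$
\frac{dz}{dx} = \frac{dz}{dy}\cdot\frac{dy}{dx} = \cos(y)\cdot 2x = 2x\cos(x^2),
\qquad
\left.\frac{dz}{dx}\right|_{x=3} = 6\cos(9) \approx -5.4668
$$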


In [7]:
# Clearing the gradient: zero_() resets x.grad to zero in place
x = torch.tensor(3.0, requires_grad=True)
y = x**2
z = torch.sin(y)
z.backward() # Calculate the gradient before accessing x.grad
print(x.grad)
x.grad.zero_()
print(x.grad)
print(x)

tensor(-5.4668)
tensor(0.)
tensor(3., requires_grad=True)
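
Clearing matters because `.backward()` adds to whatever is already stored in `.grad` rather than overwriting it. The following sketch shows the accumulation and the effect of `zero_()`; the variable names `first`, `accumulated`, and `after_reset` are only illustrative.

```python
import torch

x = torch.tensor(3.0, requires_grad=True)

# First backward pass: dy/dx = 2*x = 6
y = x ** 2
y.backward()
first = x.grad.item()

# Second backward pass on a fresh graph, without zeroing x.grad:
# the new gradient (6) is added to the stored one.
y = x ** 2
y.backward()
accumulated = x.grad.item()

# Zero the gradient in place, then run a third pass.
x.grad.zero_()
y = x ** 2
y.backward()
after_reset = x.grad.item()

print(first, accumulated, after_reset)  # 6.0 12.0 6.0
```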


In [13]:
# Turning off gradient tracking on x with requires_grad_(False)
x = torch.tensor(3.0, requires_grad=True)
print(x)
y = x**2
z = torch.sin(y)
z.backward() # Calculate the gradient before accessing x.grad
print(x.grad)
x.requires_grad_(False)
print(x)

tensor(3., requires_grad=True)
tensor(-5.4668)
tensor(3.)


In [8]:
# disable gradient tracking
x = torch.tensor(3.0, requires_grad=True)
y = x**2
print(y)
y.backward()
print(x.grad)
# Three ways to disable gradient tracking:
# Option 1: requires_grad_(False)
# Option 2: detach()
# Option 3: torch.no_grad()
x.requires_grad_(False)
print(x.grad)
y = x**2
print(y)
try:
  y.backward()
  print(x.grad)
except Exception as e:
  print(e)


tensor(9., grad_fn=<PowBackward0>)
tensor(6.)
tensor(6.)
tensor(9.)
element 0 of tensors does not require grad and does not have a grad_fn


In [14]:
x = torch.tensor(3.0, requires_grad=True)
z = x.detach()
print(z)
y = z**2
try:
  y.backward()
  print(x.grad)
except Exception as e:
  print(e)

tensor(3.)
element 0 of tensors does not require grad and does not have a grad_fn


In [15]:
x = torch.tensor(3.0, requires_grad=True)
print(x)
with torch.no_grad():
  y = x**2
  print(y)
try:
  y.backward()
  print(x.grad)
except Exception as e:
  print(e)

tensor(3., requires_grad=True)
tensor(9.)
element 0 of tensors does not require grad and does not have a grad_fn
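
The three options differ in what they affect. As a rough side-by-side sketch (the names `a`, `d`, `inside`, and `outside` are illustrative): `requires_grad_(False)` changes the tensor itself, `detach()` returns a separate untracked tensor, and `torch.no_grad()` only affects operations run inside its block.

```python
import torch

x = torch.tensor(3.0, requires_grad=True)

# Option 1: requires_grad_(False) flips the flag on the tensor itself, in place.
a = torch.tensor(3.0, requires_grad=True)
a.requires_grad_(False)

# Option 2: detach() returns a new tensor that shares x's data
# but is cut off from the graph; x itself is unchanged.
d = x.detach()

# Option 3: no_grad() disables recording only inside the block.
with torch.no_grad():
    inside = x ** 2
outside = x ** 2

print(a.requires_grad)                              # False
print(d.requires_grad, x.requires_grad)             # False True
print(inside.requires_grad, outside.requires_grad)  # False True
```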


In [16]:
# Check whether a GPU is available and pick the device accordingly
print(torch.cuda.is_available())
device = "cuda" if torch.cuda.is_available() else "cpu"
print(device)
print(torch.cuda.is_available())

True
cuda
True


In [23]:
# Create a new tensor on the selected device (the GPU, if available)
x = torch.tensor([3.0,7.0], device=device, requires_grad=True)
print(x)
y = x**2
print(y)
z = torch.sin(y)
print(z)
w = torch.exp(z)
print(w)
v = 1/w
print(v)
# backward() needs a scalar output, so reduce v with sum() first
t = v.sum()
t.backward()
# Show the gradient dt/dx, one entry per element of x
print(x.grad)

tensor([3., 7.], device='cuda:0', requires_grad=True)
tensor([ 9., 49.], device='cuda:0', grad_fn=<PowBackward0>)
tensor([ 0.4121, -0.9538], device='cuda:0', grad_fn=<SinBackward0>)
tensor([1.5100, 0.3853], device='cuda:0', grad_fn=<ExpBackward0>)
tensor([0.6622, 2.5954], device='cuda:0', grad_fn=<MulBackward0>)
tensor([  3.6204, -10.9223], device='cuda:0')
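
The `sum()` is there because `.backward()` without arguments only works on a scalar. An equivalent route, sketched below on the CPU for portability, is to pass a `gradient` tensor of ones to `backward()`. The comparison against the analytic derivative (stored in `expected`) is an added check, not part of the original cell.

```python
import torch

x = torch.tensor([3.0, 7.0], requires_grad=True)  # CPU tensor, assumed for portability
v = 1 / torch.exp(torch.sin(x ** 2))

# v has two elements, so backward() needs an explicit gradient argument.
# A vector of ones weights each element equally, which matches v.sum().backward().
v.backward(gradient=torch.ones_like(v))

# Analytic derivative of exp(-sin(x^2)): -2x * cos(x^2) * exp(-sin(x^2))
with torch.no_grad():
    expected = -2 * x * torch.cos(x ** 2) * torch.exp(-torch.sin(x ** 2))

print(x.grad)
print(expected)
```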


### Computational Graph
Conceptually, autograd keeps a record of the data (tensors) and all executed operations (along with the resulting new tensors) in a **directed acyclic graph (DAG)** consisting of `Function` objects. In this DAG, the leaves are the input tensors and the roots are the output tensors. By tracing this graph from roots to leaves, autograd can compute the gradients automatically using the chain rule.

In a forward pass, autograd does two things simultaneously:

* run the requested operation to compute a resulting tensor, and
* maintain the operation’s gradient function in the DAG.
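
The graph recorded during the forward pass can be inspected directly. The sketch below reuses the `sin(x**2)` example and looks at `is_leaf`, `grad_fn`, and `grad_fn.next_functions`; the exact node names (such as `SinBackward0`) can vary between PyTorch versions, so treat them as indicative.

```python
import torch

x = torch.tensor(3.0, requires_grad=True)
y = x ** 2
z = torch.sin(y)

# Leaves are tensors created by the user; they have no grad_fn.
print(x.is_leaf, x.grad_fn)  # True None

# Results of operations are non-leaf tensors that point at their backward function.
print(z.is_leaf, z.grad_fn.name())

# Walk one step from the root toward the leaves: the node that produced y.
parent_fn = z.grad_fn.next_functions[0][0]
print(parent_fn.name())
```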

The backward pass kicks off when `.backward()` is called on the DAG root. `autograd` then:

* computes the gradients from each `.grad_fn`,
* accumulates them in the respective tensor’s `.grad` attribute, and
* using the chain rule, propagates all the way to the leaf tensors.
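
To watch the chain rule propagate through an intermediate tensor, one option is `retain_grad()`, which asks autograd to keep the gradient of a non-leaf tensor (by default only leaf tensors have `.grad` populated). A minimal sketch, again using the `sin(x**2)` example:

```python
import torch

x = torch.tensor(3.0, requires_grad=True)
y = x ** 2
y.retain_grad()  # keep the gradient of this intermediate (non-leaf) tensor
z = torch.sin(y)
z.backward()

# dz/dy = cos(y), computed by z's grad_fn
print(y.grad)
# dz/dx = dz/dy * dy/dx = cos(y) * 2x, accumulated in the leaf's .grad
print(x.grad)
```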