In [1]:
import torch

In [2]:
torch.__version__

'2.6.0+cu124'

Autograd is a core component of PyTorch that provides automatic differentiation for tensor operations. It enables gradient computation, which is essential for training machine learning models using optimization algorithms like gradient descent.

In [4]:
def dy_dx(x):
  # analytic derivative of y = x ** 2
  return 2 * x


dy_dx(3)

6

In deep learning, a gradient is the rate of change of a function (such as the loss) with respect to its parameters (weights and biases). Optimizers such as gradient descent, SGD, and Adam use it to adjust those parameters iteratively, minimizing the loss and improving model accuracy.
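
To make the role of the gradient concrete, here is a minimal sketch of gradient descent on y = x ** 2, reusing the analytic derivative `dy_dx` from the cell above. The helper `gradient_descent`, the learning rate, the step count, and the starting point are illustrative assumptions, not values from this notebook.

```python
def dy_dx(x):
    # Analytic derivative of y = x ** 2
    return 2 * x


def gradient_descent(x0, learning_rate=0.1, steps=50):
    # Hypothetical helper: repeatedly step against the gradient
    x = x0
    for _ in range(steps):
        x = x - learning_rate * dy_dx(x)
    return x


x_min = gradient_descent(3.0)
print(x_min)  # a value very close to 0, the minimizer of x ** 2
```

Each step multiplies x by (1 - 2 * learning_rate), so with these assumed settings the iterate shrinks steadily toward 0.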

In [10]:
x = torch.tensor(3.0, requires_grad=True)  # tells PyTorch to track operations on x so its gradient can be computed; requires_grad defaults to False
y = x ** 2

x

tensor(3., requires_grad=True)

In [9]:
y

tensor(9., grad_fn=<PowBackward0>)

In [15]:
y.backward()  # PyTorch now computes the derivative dy/dx automatically

In [16]:
x.grad  # attribute holding dy/dx computed by backward(), so the derivative does not have to be coded by hand

tensor(6.)

Conceptually, autograd keeps a record of data (tensors) and all executed operations (along with the resulting new tensors) in a directed acyclic graph (DAG) consisting of Function objects.


In this DAG, the leaves are the input tensors and the roots are the output tensors.
By tracing the graph from roots to leaves, autograd computes the gradients automatically using the chain rule.


In a forward pass, autograd does two things simultaneously:
1. run the requested operation to compute a resulting tensor
2. maintain the operation's gradient function in the DAG.


The backward pass kicks off when .backward() is called on the DAG root. Autograd then:
1. computes the gradients from each .grad_fn
2. accumulates them in the respective tensor's .grad attribute
3. using the chain rule, goes all the way to the leaf tensors.
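
The forward and backward steps above can be observed on a small two-operation graph. The sketch below is an illustration only: the function z = sin(x ** 2) and the names `u`, `z`, and `expected` are assumptions chosen for the example, not part of this notebook.

```python
import torch

x = torch.tensor(3.0, requires_grad=True)  # leaf tensor (created by the user)
u = x ** 2                                 # intermediate node, records a grad_fn
z = torch.sin(u)                           # root of the graph

# Leaves are created directly; results of tracked operations are not leaves
print(x.is_leaf, u.is_leaf)

# Each non-leaf tensor points to the function that produced it,
# and that function links back to the functions of its inputs
print(z.grad_fn)
print(z.grad_fn.next_functions)

# Backward pass: walk from the root to the leaf, applying the chain rule
z.backward()

# Chain rule by hand: dz/dx = cos(x ** 2) * 2x
expected = 2 * x.detach() * torch.cos(x.detach() ** 2)
print(x.grad, expected)
```

The two printed values on the last line should agree, since autograd applies the same chain rule that the hand-written expression encodes.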

In [17]:
# Inputs
x = torch.tensor(6.7)  # Input feature
y = torch.tensor(0.0)  # True label (binary)

w = torch.tensor(1.0)  # Weight
b = torch.tensor(0.0)  # Bias

In [21]:
## Loss function
# Binary cross-entropy loss

def binary_cross_entropy_loss(prediction, target):
  epsilon = 1e-8  # small constant intended to keep predictions away from exactly 0 and 1, avoiding log(0)
  prediction = torch.clamp(prediction, epsilon, 1 - epsilon)
  loss = -(target * torch.log(prediction) + (1 - target) * torch.log(1 - prediction))
  return loss

In [23]:
## forward pass

z = w * x + b  # weighted sum
y_prediction = torch.sigmoid(z)  # predicted probability using sigmoid activation function

# compute binary cross-entropy loss
loss = binary_cross_entropy_loss(y_prediction, y)
print('loss : ',loss)

loss :  tensor(6.7012)
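
As a sanity check, the same forward pass can be reproduced with plain Python arithmetic. This is a minimal sketch that uses the `math` module instead of PyTorch and skips the clamping step, which does not change the result for these inputs.

```python
import math

# Same values as the notebook cells above
x, y = 6.7, 0.0   # input feature, true label
w, b = 1.0, 0.0   # weight, bias

z = w * x + b                               # weighted sum
y_prediction = 1.0 / (1.0 + math.exp(-z))   # sigmoid activation

# Binary cross-entropy; with y = 0 only the second term contributes
loss = -(y * math.log(y_prediction) + (1 - y) * math.log(1 - y_prediction))
print(round(loss, 4))  # 6.7012
```

The loss is large because the model assigns a probability close to 1 to the positive class while the true label is 0.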
