<h2 style="text-align:center;color:#0F4C81;">
PyTorch Autograd Engine
</h2>

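PyTorch's **autograd** engine performs reverse-mode automatic differentiation: it records the operations applied to tensors and uses that record to compute gradients, which is what makes gradient-based training of neural networks practical.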

### **1 Enabling Gradient Tracking**
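Autograd tracks operations only on tensors whose `requires_grad` flag is set. Set it at creation with `requires_grad=True`, or afterwards with the in-place method `.requires_grad_()`. Only floating-point and complex tensors can require gradients.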

In [1]:
import torch

# Creating a tensor with gradient tracking
x = torch.tensor([2.0], requires_grad=True)
print("Tensor:", x)
print("Requires Grad:", x.requires_grad)

Tensor: tensor([2.], requires_grad=True)
Requires Grad: True
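
As a small illustration of the flag's behavior, the sketch below switches tracking on for an existing tensor and shows that an integer tensor is rejected. The names `w` and `int_error` are introduced only for this example.

```python
import torch

# A tensor created without gradient tracking (the default)
w = torch.tensor([1.0, 2.0])
print("Before:", w.requires_grad)

# Switch tracking on in place
w.requires_grad_()
print("After:", w.requires_grad)

# Integer tensors cannot track gradients, so this raises a RuntimeError
try:
    torch.tensor([1, 2], requires_grad=True)
    int_error = None
except RuntimeError as err:
    int_error = err
print("Integer tensor rejected:", int_error is not None)
```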


### **2 Computing Gradients**
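Every operation on a tensor with `requires_grad=True` is recorded in a computation graph. Calling `.backward()` on the result walks that graph in reverse and stores the gradient in the `.grad` attribute of each leaf tensor.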

In [2]:
# Define a function y = x^2
y = x ** 2
print("Function Output:", y)

# Compute gradients (dy/dx)
y.backward()

# Gradient of y with respect to x
print("Gradient (dy/dx):", x.grad)

Function Output: tensor([4.], grad_fn=<PowBackward0>)
Gradient (dy/dx): tensor([4.])


Since $y = x^2$, the derivative is $dy/dx = 2x$. At $x = 2$ this gives $dy/dx = 4$, which matches the printed gradient.
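
The same check works for a function built from several operations. The sketch below uses $y = 3x^2 + 2x$, a function chosen only for illustration, and compares the autograd result with the hand-derived $dy/dx = 6x + 2$.

```python
import torch

x = torch.tensor([2.0], requires_grad=True)

# y = 3x^2 + 2x, built from several recorded operations
y = 3 * x ** 2 + 2 * x
y.backward()

# Hand-derived derivative: dy/dx = 6x + 2, which is 14 at x = 2
analytic = 6 * x.detach() + 2
matches = torch.allclose(x.grad, analytic)

print("Autograd:", x.grad)
print("Analytic:", analytic)
print("Match:", matches)
```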

### **3 Computing Gradients for Vectors**

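Calling `backward()` without arguments works only when the output has a single element. For a non-scalar output, pass a `gradient` tensor with the same shape as the output; autograd then computes the vector-Jacobian product. A tensor of ones gives the same result as calling `backward()` on the sum of the outputs.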

In [3]:
# Creating a vector with requires_grad=True
x = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)

# Function: y = x^3
y = x ** 3

# Compute gradients using a gradient vector of ones
y.backward(torch.tensor([1.0, 1.0, 1.0]))  # Equivalent to y.sum().backward()

# Gradients
print("Gradient (dy/dx):", x.grad)

Gradient (dy/dx): tensor([ 3., 12., 27.])


For $y = x^3$, the derivative is $dy/dx = 3x^2$, so at $x = [1,2,3]$, we expect $[3, 12, 27]$.
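
To make the role of the gradient vector concrete, the sketch below repeats the computation with non-uniform weights. The vector `v` is an arbitrary choice for illustration. Because each output here depends only on the matching input, the result is `v` multiplied elementwise by $3x^2$, and it equals the gradient of the weighted sum `(v * y).sum()`.

```python
import torch

x = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)
v = torch.tensor([1.0, 0.5, 0.0])  # arbitrary weights, for illustration only

# Vector-Jacobian product: v * 3x^2 = [3, 6, 0]
y = x ** 3
y.backward(v)
vjp_grad = x.grad.clone()

# Same result from the gradient of the weighted sum
x.grad = None  # discard the previous gradient
y = x ** 3
(v * y).sum().backward()
weighted_sum_grad = x.grad.clone()

print("backward(v):          ", vjp_grad)
print("(v * y).sum() version:", weighted_sum_grad)
```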

### **4 Stopping Gradient Computation**
Gradient tracking costs memory and compute, so disable it when gradients are not needed, for example during inference or evaluation.

- Using `torch.no_grad()`:

In [4]:
x = torch.tensor([5.0], requires_grad=True)

with torch.no_grad():
    y = x * 2
    print("No Grad Computation:", y.requires_grad)  # False

No Grad Computation: False


- Using `.detach()`:

In [5]:
y = x.detach()
print("Detached Tensor Requires Grad:", y.requires_grad)  # False

Detached Tensor Requires Grad: False
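
The two mechanisms above can be sketched together. The example below shows three effects: a detached branch contributes nothing to the gradient, a detached tensor shares storage with its source, and results computed under `torch.no_grad()` carry no graph. The function $z = x^2 + 3x$ and the variable names are chosen only for illustration.

```python
import torch

x = torch.tensor([5.0], requires_grad=True)

# Only the x^2 term is tracked; the detached term acts as a constant
z = x ** 2 + 3 * x.detach()
z.backward()
print("Gradient:", x.grad)  # 2x = 10, not 2x + 3 = 13

# detach() returns a view of the same storage, not a copy
d = x.detach()
shares_storage = d.data_ptr() == x.data_ptr()
print("Shares storage:", shares_storage)

# Results computed under no_grad have no grad_fn
with torch.no_grad():
    w = x * 2
print("grad_fn under no_grad:", w.grad_fn)
```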


### **5 Zeroing Gradients**
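`backward()` adds to the existing `.grad` rather than overwriting it, so reset gradients before each new backward pass unless accumulation is intended. In the loop below the gradient is zeroed at the end of every iteration, so each pass reports the same value.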

In [6]:
x = torch.tensor([3.0], requires_grad=True)

for _ in range(3):
    y = x ** 2
    y.backward()
    print("Accumulated Gradient:", x.grad)
    
    # Reset gradients
    x.grad.zero_()
    print("Gradient After Zeroing:", x.grad)

Accumulated Gradient: tensor([6.])
Gradient After Zeroing: tensor([0.])
Accumulated Gradient: tensor([6.])
Gradient After Zeroing: tensor([0.])
Accumulated Gradient: tensor([6.])
Gradient After Zeroing: tensor([0.])
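
For contrast, here is a sketch of the same loop without the reset, showing the accumulation that zeroing prevents. The list `history` is introduced only to collect the values.

```python
import torch

x = torch.tensor([3.0], requires_grad=True)

history = []
for _ in range(3):
    y = x ** 2
    y.backward()  # adds 2x = 6 to x.grad on every pass
    history.append(x.grad.item())

print("Without zeroing:", history)  # [6.0, 12.0, 18.0]

# Setting .grad to None is another way to discard accumulated gradients
x.grad = None
```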


### **6 Higher-Order Gradients**
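PyTorch can differentiate a gradient again. Passing `create_graph=True` to `torch.autograd.grad` records the gradient computation itself in the graph, so the result can be differentiated a second time.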

In [7]:
x = torch.tensor([2.0], requires_grad=True)

# Function: y = x^3
y = x ** 3

# First derivative: dy/dx
grad1 = torch.autograd.grad(y, x, create_graph=True)[0]

# Second derivative: d^2y/dx^2
grad2 = torch.autograd.grad(grad1, x)[0]

print("First Derivative (dy/dx):", grad1)
print("Second Derivative (d²y/dx²):", grad2)

First Derivative (dy/dx): tensor([12.], grad_fn=<MulBackward0>)
Second Derivative (d²y/dx²): tensor([12.])


For $y = x^3$, $dy/dx = 3x^2$ and $d^2y/dx^2 = 6x$. At $x = 2$ these evaluate to $3 \cdot 2^2 = 12$ and $6 \cdot 2 = 12$, matching the output.
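
The same pattern extends to higher orders, as long as every call except the last keeps `create_graph=True`. The sketch below takes three derivatives of $y = x^4$, a function chosen only for illustration, giving $4x^3$, $12x^2$, and $24x$.

```python
import torch

x = torch.tensor([2.0], requires_grad=True)
y = x ** 4

# Keep the graph for every derivative that will be differentiated again
d1 = torch.autograd.grad(y, x, create_graph=True)[0]   # 4x^3  = 32
d2 = torch.autograd.grad(d1, x, create_graph=True)[0]  # 12x^2 = 48
d3 = torch.autograd.grad(d2, x)[0]                     # 24x   = 48

print("First derivative: ", d1.item())
print("Second derivative:", d2.item())
print("Third derivative: ", d3.item())
```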