# Automatic Differentiation with torch.autograd

Tensor: The primary data structure in PyTorch is the Tensor. It is similar to a NumPy array, but it can also run on accelerators such as GPUs and record the operations applied to it for automatic differentiation.

Requires Grad: When creating a tensor, you can specify requires_grad=True to indicate that you want PyTorch to track operations on this tensor. This is necessary if you want to compute gradients with respect to this tensor during backpropagation.
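
As a minimal sketch of the flag described above (the tensor names are illustrative only), the snippet below shows a tensor that is not tracked by default, one that is tracked from creation, and how tracking propagates to the result of an operation.

```python
import torch

# A tensor created without requires_grad is not tracked by autograd.
a = torch.ones(3)
print(a.requires_grad)

# Request tracking when the tensor is created...
w = torch.randn(3, requires_grad=True)
print(w.requires_grad)

# ...or switch it on afterwards, in place, with requires_grad_().
a.requires_grad_(True)
print(a.requires_grad)

# The result of an operation requires grad if any of its inputs does.
c = torch.zeros(3)  # not tracked
out = w * c
print(out.requires_grad)
```

Tensors that act as fixed data, such as inputs and labels, can usually be left untracked, which keeps the graph smaller.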

Computation Graph: PyTorch constructs a computation graph dynamically as operations are performed on tensors. Nodes in this graph represent the operations, and edges represent the tensors flowing between operations. This graph is used to compute gradients.

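Backward Pass: To compute gradients, you call the backward() method on a tensor that represents the output of a computation, typically a scalar loss. PyTorch then traverses the computation graph backwards from this tensor, applying the chain rule, and stores the results in the .grad attribute of each leaf tensor that was created with requires_grad=True. Intermediate tensors do not keep a .grad by default.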
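
The following sketch illustrates two details of the backward pass that are easy to miss: gradients land on leaf tensors, and repeated calls to backward() accumulate into .grad rather than overwrite it. The names g1 and g2 are hypothetical helpers used only to keep copies of the gradient for inspection.

```python
import torch

x = torch.tensor(2.0, requires_grad=True)  # leaf tensor created by the user
y = x * 3                                  # intermediate (non-leaf) tensor
z = y ** 2                                 # z = (3x)^2 = 9x^2

z.backward()
g1 = x.grad.clone()  # dz/dx = 18x = 36 at x = 2
print(x.is_leaf, y.is_leaf)
print(g1)

# A second backward pass through a fresh graph ADDS to x.grad.
z2 = (x * 3) ** 2
z2.backward()
g2 = x.grad.clone()  # 36 + 36 = 72
print(g2)

# Reset the accumulated gradient before the next step.
x.grad.zero_()
print(x.grad)
```

This accumulation is why training loops usually zero the gradients once per iteration.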


In [1]:
import torch

x = torch.ones(5)  # input tensor
y = torch.zeros(3)  # expected output
w = torch.randn(5, 3, requires_grad=True)
b = torch.randn(3, requires_grad=True)
z = torch.matmul(x, w)+b
loss = torch.nn.functional.binary_cross_entropy_with_logits(z, y)

print(f"Gradient function for z = {z.grad_fn}")
print(f"Gradient function for loss = {loss.grad_fn}")

loss.backward()
print("w.grad", w.grad)
print("b.grad", b.grad)

Gradient function for z = <AddBackward0 object at 0x1210a18d0>
Gradient function for loss = <BinaryCrossEntropyWithLogitsBackward0 object at 0x1214a2ce0>
w.grad tensor([[0.2996, 0.3159, 0.1794],
        [0.2996, 0.3159, 0.1794],
        [0.2996, 0.3159, 0.1794],
        [0.2996, 0.3159, 0.1794],
        [0.2996, 0.3159, 0.1794]])
b.grad tensor([0.2996, 0.3159, 0.1794])


In [2]:
# Disabling Gradient Tracking
z = torch.matmul(x, w)+b
print("requires grad:", z.requires_grad)

with torch.no_grad():
    z = torch.matmul(x, w)+b
print("requires grad:", z.requires_grad)

requires grad: True
requires grad: False
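
One common reason to disable tracking is a manual parameter update, which should not itself become part of the graph. The sketch below assumes a toy loss and a hypothetical learning rate lr; it is not taken from the cells above.

```python
import torch

x = torch.ones(5)
w = torch.randn(5, 3, requires_grad=True)

# Toy loss: the sum of all outputs, so d(loss)/dw is 1 for every entry of w.
loss = torch.matmul(x, w).sum()
loss.backward()

lr = 0.1  # hypothetical learning rate
w_before = w.detach().clone()

# Update the parameter in place without recording the update in the graph.
with torch.no_grad():
    w -= lr * w.grad

# Clear the gradient so the next backward pass starts from zero.
w.grad.zero_()

print(w.requires_grad)  # w is still a tracked leaf tensor
```

Evaluating a trained model is another typical use, since no gradients are needed there.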


In [3]:
# Another way to achieve the same result is to use the detach() method on the tensor:
z = torch.matmul(x, w)+b
z_det = z.detach()
print(z_det.requires_grad)

False
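
To see what detaching means for gradients, the sketch below (with illustrative names) uses a detached copy inside a larger expression. The detached tensor keeps its value but is treated as a constant during the backward pass.

```python
import torch

x = torch.tensor(2.0, requires_grad=True)
y = x * x           # tracked: dy/dx = 2x
y_det = y.detach()  # same value (4.0), but cut off from the graph

# y contributes 2x to the gradient; y_det is a constant factor of x.
out = y + y_det * x
out.backward()

# d(out)/dx = 2x + y_det = 4 + 4 = 8
# (without detach() it would be 2x + 3x^2 = 16)
print(x.grad)
print(y_det.requires_grad, y_det.grad_fn)
```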


In [4]:
# Create tensors with requires_grad=True to track computations
x = torch.tensor(2.0, requires_grad=True)
y = torch.tensor(3.0, requires_grad=True)

# Perform operations on tensors
z = x * y + y ** 2

# Print the value of z (printing a tensor shows its value, not its graph)
print(f"Value of z: {z}")

# Perform backward pass to compute gradients
z.backward()

# Gradients are stored in the .grad attribute of tensors
print(f"Gradient of x: {x.grad}")  # dz/dx = y = 3
print(f"Gradient of y: {y.grad}")  # dz/dy = x + 2y = 8


Value of z: 15.0
Gradient of x: 3.0
Gradient of y: 8.0


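A gradient measures how much a function's output changes in response to a small change in its input. For a function of several inputs, it collects the partial derivative with respect to each input.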
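
This idea can be checked numerically. The sketch below compares the gradients reported by autograd for z = x * y + y ** 2 with central finite differences, which nudge each input by a small step eps. The helper f and the step size are assumptions chosen for illustration, and float64 is used to keep rounding error small.

```python
import torch

def f(x, y):
    # Same function as in the cell above: z = x*y + y^2
    return x * y + y ** 2

x = torch.tensor(2.0, dtype=torch.float64, requires_grad=True)
y = torch.tensor(3.0, dtype=torch.float64, requires_grad=True)

# Gradients from autograd.
f(x, y).backward()

# Gradients from central differences: (f(v + eps) - f(v - eps)) / (2 * eps).
eps = 1e-6
with torch.no_grad():
    dfdx = (f(x + eps, y) - f(x - eps, y)) / (2 * eps)
    dfdy = (f(x, y + eps) - f(x, y - eps)) / (2 * eps)

print("autograd:", x.grad.item(), y.grad.item())
print("numeric: ", dfdx.item(), dfdy.item())
```

The two rows should agree closely, with any difference coming from floating-point rounding.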

A computation graph is a directed graph that records the sequence of operations performed to compute a function. In deep learning, it represents the operations that transform the input data, layer by layer, into the output. Autograd traverses this graph in reverse to apply the chain rule.
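
Printing a tensor shows only its value, but the recorded graph can be inspected through grad_fn. The sketch below uses a hypothetical helper, walk, that follows each node's next_functions and collects the node class names. The exact names depend on the PyTorch version, so no output is listed here.

```python
import torch

def walk(fn, depth=0, lines=None):
    """Collect the class names of backward nodes reachable from fn, depth first."""
    if lines is None:
        lines = []
    if fn is None:  # inputs that do not require grad have no node
        return lines
    lines.append("  " * depth + type(fn).__name__)
    for parent, _ in fn.next_functions:
        walk(parent, depth + 1, lines)
    return lines

x = torch.tensor(2.0, requires_grad=True)
y = torch.tensor(3.0, requires_grad=True)
z = x * y + y ** 2

graph = walk(z.grad_fn)
print("\n".join(graph))
```

The root node corresponds to the final addition, and the leaves of the printed tree are the accumulation nodes that write into x.grad and y.grad.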
