PyTorch is a library for building and training neural networks. At its core is automatic differentiation, which computes the derivatives needed to optimize the parameters of complex models. The same machinery is useful outside neural networks, for example in curve fitting. This article walks through the core concepts with practical examples.

### Understanding PyTorch's Differentiation Mechanism

PyTorch's differentiation mechanism is built on the `torch.autograd.Function` class. Each function defines a `forward` method, which computes outputs from inputs, and a `backward` method, which receives the gradient of the loss with respect to the outputs and returns the gradient with respect to each input, following the chain rule of calculus. Chaining these `backward` calls through the computation graph lets PyTorch compute gradients automatically, which simplifies the optimization of complex functions.

In [1]:
import torch

# Example demonstrating forward and backward passes in PyTorch's differentiation mechanism
class ExampleFunction(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x, y):
        # Forward pass computes the output
        result = x + y
        ctx.save_for_backward(x, y)
        return result

    @staticmethod
    def backward(ctx, grad_output):
        # Backward pass computes gradients
        x, y = ctx.saved_tensors
        grad_x = grad_y = None
        if ctx.needs_input_grad[0]:
            grad_x = grad_output
        if ctx.needs_input_grad[1]:
            grad_y = grad_output
        return grad_x, grad_y
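
In the addition example the saved tensors are never actually needed, because the gradient of a sum with respect to each input is simply `grad_output`. The following minimal sketch uses a hypothetical `Multiply` function (not part of the original example) to show a case where `ctx.saved_tensors` matters, and checks the hand-written `backward` against finite differences with `torch.autograd.gradcheck`. Double precision is assumed because `gradcheck` expects it.

```python
import torch

class Multiply(torch.autograd.Function):
    """Hypothetical example: z = x * y with a hand-written backward."""

    @staticmethod
    def forward(ctx, x, y):
        # Both inputs are needed later: dz/dx = y and dz/dy = x
        ctx.save_for_backward(x, y)
        return x * y

    @staticmethod
    def backward(ctx, grad_output):
        x, y = ctx.saved_tensors
        # Chain rule: incoming gradient times the local derivative
        return grad_output * y, grad_output * x

x = torch.tensor(2.0, dtype=torch.float64, requires_grad=True)
y = torch.tensor(3.0, dtype=torch.float64, requires_grad=True)

z = Multiply.apply(x, y)
z.backward()
gx, gy = x.grad.item(), y.grad.item()
print(gx, gy)  # dz/dx = y = 3.0, dz/dy = x = 2.0

# Compare the analytic backward with numerical finite differences
ok = torch.autograd.gradcheck(Multiply.apply, (x, y))
print(ok)
```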

To compute derivatives in a neural network, we generally call `backward` on the tensor representing the loss. PyTorch then backtracks through the graph, starting from the node given by the `grad_fn` of the loss.

As described above, `backward` is called recursively throughout the graph as we backtrack. When we reach a leaf node, its `grad_fn` is `None`, so backtracking stops along that path.

In [2]:
# Example usage
x = torch.tensor(2.0, requires_grad=True)
y = torch.tensor(3.0, requires_grad=True)
z = ExampleFunction.apply(x, y)
z.backward()
print(x.grad)  # Output: tensor(1.)
print(y.grad)  # Output: tensor(1.)

tensor(1.)
tensor(1.)
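
The distinction between leaf nodes and intermediate nodes can be inspected directly. The sketch below is an illustration with made-up values, using the tensor attributes `is_leaf` and `grad_fn`: tensors created by the user are leaves with no `grad_fn`, while tensors produced by operations carry the `grad_fn` that `backward` follows.

```python
import torch

x = torch.tensor(2.0, requires_grad=True)  # leaf: created by the user
y = torch.tensor(3.0, requires_grad=True)  # leaf: created by the user
z = x * y                                  # intermediate: result of an operation
loss = z + 1.0

leaf_flags = (x.is_leaf, y.is_leaf, z.is_leaf)
print(leaf_flags)                 # (True, True, False)
print(x.grad_fn)                  # None, so backtracking stops here
print(loss.grad_fn is not None)   # True, the starting node for backward

loss.backward()
# d(loss)/dx = y and d(loss)/dy = x
print(x.grad.item(), y.grad.item())
```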


### Handling Vector-Valued Tensors

PyTorch raises an error if `backward` is called without arguments on a non-scalar tensor. By default, `backward` can only be called on a scalar-valued tensor. When a tensor is vector- or matrix-valued, as with multiple outputs or multi-class classification, you must either reduce it to a scalar loss first (for example by summing or averaging) or pass a `gradient` argument of the same shape as the tensor.

In [5]:
import torch 

a = torch.randn((3,3), requires_grad = True)

w1 = torch.randn((3,3), requires_grad = True)
w2 = torch.randn((3,3), requires_grad = True)
w3 = torch.randn((3,3), requires_grad = True)
w4 = torch.randn((3,3), requires_grad = True)

b = w1*a 
c = w2*a

d = w3*b + w4*c 

L = (10 - d)

print(L.shape)

try:
    L.backward()
except RuntimeError:
    # backward() without arguments requires a scalar output
    print("\n Can't call backward on vector valued tensor.")

torch.Size([3, 3])

 Can't call backward on vector valued tensor.
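
Two common ways around this error are sketched below on a smaller, simplified version of the example (a single weight matrix `w`, chosen here for brevity): reducing the tensor to a scalar with `sum()`, or passing an explicit `gradient` argument to `backward`. With a gradient of all ones, the two approaches should give the same result.

```python
import torch

torch.manual_seed(0)
a = torch.randn(3, 3, requires_grad=True)
w = torch.randn(3, 3, requires_grad=True)

# Option 1: reduce the matrix-valued tensor to a scalar first
L = 10 - w * a
L.sum().backward()
grad_from_sum = a.grad.clone()

# Reset the accumulated gradients and rebuild the graph
a.grad = None
w.grad = None

# Option 2: pass the upstream gradient explicitly (same shape as L)
L = 10 - w * a
L.backward(gradient=torch.ones_like(L))
grad_from_arg = a.grad.clone()

same = torch.allclose(grad_from_sum, grad_from_arg)
print(same)  # True
# Both equal dL/da = -w, element by element
print(torch.allclose(grad_from_sum, -w.detach()))  # True
```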


### Exploring Automatic Differentiation

A key advantage of PyTorch over NumPy is its automatic differentiation functionality, which is very useful in optimization applications such as fitting the parameters of a neural network. Let's try to understand it with an example.

Consider a composite function written as a chain of two functions, $ g(u(x)) $. When the functions are built from tensor operations and the input tensor is created with `requires_grad=True`, PyTorch records the operations and applies the chain rule to compute the exact derivative. The example below uses a quadratic function $ u(x) = x^2 $ and a linear function $ g(u) = -u $, and evaluates the derivative of $ g $ with respect to $ x $ at $ x = 1 $.
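
Written out by hand for the functions used in the example that follows, $ u(x) = x^2 $ and $ g(u) = -u $, the chain rule gives:

$$
\frac{dg}{dx} = \frac{dg}{du} \cdot \frac{du}{dx} = (-1)(2x) = -2x, \qquad \left. \frac{dg}{dx} \right|_{x=1} = -2
$$

This is the value that `torch.autograd.grad` reports in the cell output.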

In [7]:
import torch

# Example demonstrating automatic differentiation in PyTorch
x = torch.tensor(1.0, requires_grad=True)

def u(x):
    return x * x

def g(u):
    return -u

dgdx = torch.autograd.grad(g(u(x)), x)[0]
print(dgdx)  # Output: tensor(-2.)

tensor(-2.)
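
As a further check, the same derivative can be evaluated at several points at once and compared with the hand-derived result $ -2x $. The sketch below assumes a few arbitrary sample points and sums the outputs so that the differentiated quantity is a scalar; because each output depends only on its own input, the gradient of the sum is the elementwise derivative.

```python
import torch

def u(x):
    return x * x

def g(u):
    return -u

# Arbitrary sample points, chosen only for illustration
xs = torch.tensor([-2.0, 0.5, 3.0], requires_grad=True)

# Summing gives a scalar whose gradient holds dg/dx at every point
total = g(u(xs)).sum()
(autograd_grad,) = torch.autograd.grad(total, xs)

analytic_grad = -2 * xs.detach()
matches = torch.allclose(autograd_grad, analytic_grad)
print(autograd_grad.tolist())  # [4.0, -1.0, -6.0]
print(matches)                 # True
```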


### Curve Fitting

Automatic differentiation is also well suited to curve fitting, where we estimate a function from sampled data points. Suppose we have samples from a curve $ f(x) = 5x^2 + 3 $ and we want to approximate $ f(x) $ with a parametric function $ g(x, w) = w_0x^2 + w_1x + w_2 $. We minimize the mean squared error between $ g $ and the samples with a gradient-based optimizer; the code below uses Adam with a learning rate of 0.1 and draws a fresh random batch of 100 points at each step. After training, $ w $ should be close to the true parameters $ (5, 0, 3) $.

In [9]:
import torch

# Example demonstrating curve fitting with PyTorch
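# Create the parameter directly as a leaf tensor; wrapping an existing tensor
# in torch.tensor(...) copies it and triggers a UserWarning
w = torch.randn(3, 1, requires_grad=True)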
opt = torch.optim.Adam([w], 0.1)

def model(x):
    f = torch.stack([x * x, x, torch.ones_like(x)], 1)
    yhat = torch.squeeze(f @ w, 1)
    return yhat

def compute_loss(y, yhat):
    loss = torch.nn.functional.mse_loss(yhat, y)
    return loss

def generate_data():
    x = torch.rand(100) * 20 - 10
    y = 5 * x * x + 3
    return x, y

def train_step():
    x, y = generate_data()

    yhat = model(x)
    loss = compute_loss(y, yhat)

    opt.zero_grad()
    loss.backward()
    opt.step()

for _ in range(1000):
    train_step()

print(w.detach().numpy())




[[ 4.9902859e+00]
 [-5.0420949e-04]
 [ 3.5755317e+00]]
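
In the run shown above, the constant term comes out near 3.58 rather than 3. One way to judge how far trained parameters are from the best possible fit is to compare them with a closed-form least-squares solution. The sketch below is a different technique from the iterative training above and is included only as a reference point: it assumes the same design matrix (columns $ x^2 $, $ x $, $ 1 $), a fixed seed, and double precision, and it uses `torch.linalg.lstsq`.

```python
import torch

torch.manual_seed(0)  # assumption: fixed seed for reproducibility

# Same sampling scheme as generate_data(), in double precision
x = torch.rand(100, dtype=torch.float64) * 20 - 10
y = 5 * x * x + 3

# Design matrix with one column per parameter: x^2, x, 1
A = torch.stack([x * x, x, torch.ones_like(x)], dim=1)

# Closed-form least-squares fit of A @ w = y
w_ref = torch.linalg.lstsq(A, y.unsqueeze(1)).solution

print(w_ref.squeeze(1))  # close to [5, 0, 3] since the data is noise-free
```

If the trained `w` is still some distance from this reference, running more steps or adjusting the learning rate may close the gap, although the original run does not establish which of these is responsible.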


PyTorch combines automatic differentiation with tensor operations and built-in optimizers, which makes it useful for optimization tasks ranging from neural network training to curve fitting. The examples above covered custom `torch.autograd.Function` classes, the scalar requirement of `backward`, chain-rule differentiation with `torch.autograd.grad`, and fitting a quadratic curve with the Adam optimizer.