#### Autograd
- Autograd is PyTorch's automatic differentiation engine. It computes the derivatives needed to train neural networks.
- It is a core component of PyTorch that provides automatic differentiation for tensor operations. It enables gradient computation, which is essential for training machine learning models with optimization algorithms such as gradient descent.

- Autograd solves the problem of differentiating nested functions by applying the chain rule.
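
As a minimal sketch of what differentiating a nested function involves, the snippet below applies the chain rule by hand to z = sin(x**2) and compares the result with a central finite difference. The helper names (`f`, `chain_rule_derivative`, `numerical_derivative`) and the step size `h` are illustrative assumptions, not part of PyTorch.

```python
import math

def f(x):
    # Nested function: z = sin(y) with y = x**2
    return math.sin(x ** 2)

def chain_rule_derivative(x):
    # dz/dx = dz/dy * dy/dx = cos(x**2) * 2x
    y = x ** 2
    dz_dy = math.cos(y)
    dy_dx = 2 * x
    return dz_dy * dy_dx

def numerical_derivative(func, x, h=1e-6):
    # Central finite difference, used only as a sanity check
    return (func(x + h) - func(x - h)) / (2 * h)

x = 4.0
analytic = chain_rule_derivative(x)
numeric = numerical_derivative(f, x)
print(f"chain rule: {analytic:.4f}, finite difference: {numeric:.4f}")
```

Autograd automates the bookkeeping done in `chain_rule_derivative`: it records each operation and multiplies the local derivatives together for you.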

In [1]:
import torch
import math

In [2]:
# Hand-written derivative of y = x**2, i.e. dy/dx = 2*x
def dy_dx(x):
    return 2 * x

x = 55
print(f"dy/dx at x={x}: {dy_dx(x)}")

dy/dx at x=55: 110


#### Training Process of a simple Neural Network:
1. Forward Pass - Compute the output of the network given an input.
2. Calculate Loss - Calculate the loss function to quantify the error.
3. Backward Pass - Compute the gradients (partial derivatives) of the loss with respect to each parameter (weights, biases).
4. Update Parameters - Adjust the parameters using an optimization algorithm (e.g., gradient descent).

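- A neural network behaves like a nested function: each layer's output is the next layer's input, so the loss is a composition of functions of the parameters.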
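
The four steps above can be sketched as a short loop. This is a minimal illustration under stated assumptions, not a recommended training setup: the toy data (targets following y = 3 * x), the single parameter `w`, the mean squared error loss, and the learning rate are all hypothetical choices made for the example.

```python
import torch

# Hypothetical toy data: targets follow y = 3 * x
x = torch.tensor([1.0, 2.0, 3.0])
y = torch.tensor([3.0, 6.0, 9.0])

# Single trainable parameter, starting from an arbitrary guess
w = torch.tensor(0.0, requires_grad=True)
learning_rate = 0.05

for step in range(100):
    y_pred = w * x                      # 1. forward pass
    loss = ((y_pred - y) ** 2).mean()   # 2. calculate loss (mean squared error)
    loss.backward()                     # 3. backward pass: fills w.grad
    with torch.no_grad():               # 4. update the parameter (gradient descent)
        w -= learning_rate * w.grad
    w.grad.zero_()                      # clear the gradient before the next step

print(f"learned w: {w.item():.4f}, final loss: {loss.item():.6f}")
```

The update runs under `torch.no_grad()` so that the parameter change itself is not recorded in the graph, and the gradient is zeroed because `backward()` accumulates into `.grad`.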

In [3]:
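# Autograd example 1
# Leaf tensor/node: input x. Root tensor/node: output y = f(x).

x = torch.tensor([4], dtype=torch.float32, requires_grad=True)
# y = f(x) = x**2
y = x**2
print("x:", x, "y:", y)

# Compute dy/dx; retain_graph=True keeps the graph alive after this call
y.backward(retain_graph=True)
# dy/dx at x=4 is 2*4 = 8
print(x.grad)

# Stop tracking gradients for x
x.requires_grad_(False)
# In-place edits are allowed now that x no longer requires grad
x[0] = 10
print("x:", x)
# y.backward()  # would fail here: x, which the graph saved, was modified in place
# x.grad still holds the value from the earlier backward call
print(x.grad)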



x: tensor([4.], requires_grad=True) y: tensor([16.], grad_fn=<PowBackward0>)
tensor([8.])
x: tensor([10.])
tensor([8.])


In [4]:
# Autograd example 2
# Leaf tensor/node: input x. Root tensor/node: output z = f(y). Intermediate tensor/node: y = f(x).
# In a computational graph, gradients are not stored in .grad for intermediate nodes by default. (pytorch.org/tutorials/beginner/basics/autogradqs_tutorial.html)

x = torch.tensor([4], requires_grad=True, dtype=torch.float32)
# y = f(x) = x**2
y = x**2
# z = f(y) = sin(y)
z = torch.sin(y)

print("x:", x, "y:", y, "z:", z)

# Compute dz/dx
z.backward()
# dz/dx at x=4 is cos(16) * 2*4
print(x.grad)



x: tensor([4.], requires_grad=True) y: tensor([16.], grad_fn=<PowBackward0>) z: tensor([-0.2879], grad_fn=<SinBackward0>)
tensor([-7.6613])
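
If the gradient at the intermediate node is also needed, one option is `Tensor.retain_grad()`, which asks autograd to keep `.grad` on a non-leaf tensor. The sketch below repeats example 2 with that single extra call; the values checked are the chain-rule factors cos(16) and cos(16) * 2 * 4.

```python
import torch

x = torch.tensor([4.0], requires_grad=True)
y = x ** 2
y.retain_grad()        # keep dz/dy on the intermediate tensor y
z = torch.sin(y)
z.backward()

print("dz/dy:", y.grad)   # cos(16)
print("dz/dx:", x.grad)   # cos(16) * 2 * 4
```

Without the `retain_grad()` call, `y.grad` would be `None` after `backward()`, while `x.grad` would be unchanged.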


#### Autograd example 3
##### Dataset of students' CGPA and whether each student was placed (1) or not (0)
| cgpa (x) | is_placed (y) |
| ----- | ----- |
| 6.7 | 0 |
| 7.6 | 1 |
| 8.5 | 1 |

1. Linear Transformation (z):
<center> $z = w \cdot x + b$ where `w` = weight & `b` = bias </center>


2. Activation (Sigmoid function) ($\hat{y}$ OR $y_{pred}$):
<center> $\hat{y} = y_{pred} = \sigma(z) = \frac{1}{1 + e^{-z}}$ </center>

3. Loss Function (Binary Cross-Entropy Loss) (L):
<center> $L = -[y \ln(y_{pred}) + (1 - y) \ln(1 - y_{pred})]$ </center>
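
The gradients that the backward pass should produce for this setup can be derived by hand with the chain rule. This is a sketch that writes $y_{pred}$ for $\hat{y}$ and uses only the three definitions above:

<center> $\frac{\partial L}{\partial y_{pred}} = -\frac{y}{y_{pred}} + \frac{1 - y}{1 - y_{pred}}, \quad \frac{\partial y_{pred}}{\partial z} = y_{pred}(1 - y_{pred})$ </center>

<center> $\frac{\partial L}{\partial z} = \frac{\partial L}{\partial y_{pred}} \cdot \frac{\partial y_{pred}}{\partial z} = y_{pred} - y$ </center>

<center> $\frac{\partial L}{\partial w} = (y_{pred} - y) \, x, \quad \frac{\partial L}{\partial b} = y_{pred} - y$ </center>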

In [37]:
# sample input
x = torch.tensor(6.7, dtype=torch.float32)
# expected output
y = torch.tensor(0, dtype=torch.float32)

# assumed initial weight and bias
w = torch.tensor(1, dtype=torch.float32, requires_grad=True)
b = torch.tensor(0, dtype=torch.float32, requires_grad=True)

In [38]:
z = w * x + b
# predicted output
y_pred = torch.sigmoid(z)

print(z)
print(y_pred)
print(f"{y_pred}")

tensor(6.7000, grad_fn=<AddBackward0>)
tensor(0.9988, grad_fn=<SigmoidBackward0>)
0.998770534992218


In [39]:
# binary cross entropy loss
def binary_cross_entropy_loss(pred, target):
    # Clamp predictions away from 0 and 1 so log() never receives 0.
    # 1e-7 is used because 1 - 1e-8 rounds to exactly 1.0 in float32.
    eps = 1e-7
    prediction = torch.clamp(pred, min=eps, max=1 - eps)
    return -(target * torch.log(prediction) + (1 - target) * torch.log(1 - prediction))

loss = binary_cross_entropy_loss(y_pred, y)
print(loss)
print(f"Loss: {loss}")

tensor(6.7012, grad_fn=<NegBackward0>)
Loss: 6.701176166534424


In [40]:
# Backward pass: fills w.grad and b.grad with dL/dw and dL/db
loss.backward()

print(w.grad, f"{w.grad}")
print(b.grad, f"{b.grad}")

tensor(6.6918) 6.6917619705200195
tensor(0.9988) 0.9987704753875732
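
As a sanity check on these numbers, the same gradients can be computed by hand in plain Python. The sketch below relies on the standard closed form for a sigmoid followed by binary cross-entropy, dL/dz = y_pred - y, so that dL/dw = (y_pred - y) * x and dL/db = y_pred - y. The variable names `dL_dw` and `dL_db` are illustrative, and the clamp is omitted because it is inactive at this prediction.

```python
import math

# Same sample, target, and initial parameters as in the cells above
x, y = 6.7, 0.0
w, b = 1.0, 0.0

z = w * x + b
y_pred = 1.0 / (1.0 + math.exp(-z))   # sigmoid

dL_dw = (y_pred - y) * x
dL_db = y_pred - y

print(f"dL/dw: {dL_dw:.4f}")
print(f"dL/db: {dL_db:.4f}")
```

Small differences in the trailing digits relative to the autograd output are expected, since Python floats are 64-bit while the tensors above are float32.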
