# Part 2: Autograd - The Engine of Learning 🚂

In Part 1, we saw that Tensors are just n-dimensional arrays.

In this notebook, we unlock their true power: **Automatic Differentiation (Autograd)**.

**Why do we need this?**
To train a neural network, we need to know how to adjust its weights to minimize error. That means calculating gradients: the derivative of the error with respect to each weight. PyTorch does this calculus for us automatically!

In [None]:
import torch

## 1. Tracking History

By default, tensors don't track their history. We have to tell them to.

In [None]:
# A standard tensor
x = torch.tensor(2.0)

# A tensor we want to optimize (like a weight in a neural net)
w = torch.tensor(3.0, requires_grad=True)

print(f"x requires_grad: {x.requires_grad}")
print(f"w requires_grad: {w.requires_grad}")
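Tracking can also be switched on after a tensor has been created. Here is a minimal sketch using the in-place method `requires_grad_()`; the variable name `bias` is just an illustration and does not appear elsewhere in this notebook.

```python
import torch

# A tensor created without gradient tracking (the default)
bias = torch.tensor(1.5)
print(f"Before: {bias.requires_grad}")

# requires_grad_() flips the flag in place on the existing tensor
bias.requires_grad_()
print(f"After: {bias.requires_grad}")
```

Note that only floating-point (and complex) tensors can require gradients, which is why the examples here use values like `3.0` rather than `3`.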

## 2. Computational Graph

Any operation involving `w` is recorded in a dynamic computational graph, which PyTorch builds on the fly as the code runs.

In [None]:
# Let's do some math: y = x * w + 2
# y = 2 * 3 + 2 = 8

y = x * w + 2

print("Result y:", y)
print("Function that created y:", y.grad_fn)

See that `AddBackward0`? PyTorch knows that the last operation that produced `y` was an addition.

In [None]:
# Let's go deeper: z = y^2
# z = 8^2 = 64

z = y ** 2
print("Result z:", z)
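To see the whole recorded graph rather than just the last step, we can follow `grad_fn.next_functions` backwards from `z`. This is a rough sketch for exploration: the helper `walk` is a hypothetical name written for this example, not a PyTorch function.

```python
import torch

x = torch.tensor(2.0)
w = torch.tensor(3.0, requires_grad=True)
y = x * w + 2
z = y ** 2

def walk(fn, depth=0, names=None):
    """Print each backward node, indented by its depth in the graph."""
    if names is None:
        names = []
    if fn is None:
        # Inputs that do not require gradients (like x) show up as None
        return names
    name = type(fn).__name__
    names.append(name)
    print("  " * depth + name)
    for next_fn, _ in fn.next_functions:
        walk(next_fn, depth + 1, names)
    return names

node_names = walk(z.grad_fn)
```

The printed chain runs from the power, through the addition and the multiplication, down to an `AccumulateGrad` node, which is where the gradient for `w` will eventually be stored.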

## 3. Backpropagation (The Magic)

Now we want to know: **How much does `z` change if we change `w`?**

Mathematically, we want $\frac{dz}{dw}$.

We could do this by hand (chain rule):
1. $z = y^2 \rightarrow \frac{dz}{dy} = 2y = 16$
2. $y = xw + 2 \rightarrow \frac{dy}{dw} = x = 2$
3. $\frac{dz}{dw} = \frac{dz}{dy} \cdot \frac{dy}{dw} = 16 \cdot 2 = 32$
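As a sanity check on the hand calculation above, we can approximate the same derivative numerically with a central finite difference. This sketch uses plain Python floats and no PyTorch; the function name `z_of_w` and the step size `h` are choices made for this example.

```python
def z_of_w(w, x=2.0):
    """The same computation as in the notebook: z = (x * w + 2) ** 2."""
    y = x * w + 2
    return y ** 2

w0 = 3.0
h = 1e-6  # small step; an arbitrary choice for this sketch

# Central difference: (z(w + h) - z(w - h)) / (2h)
numeric_grad = (z_of_w(w0 + h) - z_of_w(w0 - h)) / (2 * h)

# Chain rule by hand: dz/dy * dy/dw = 2y * x
analytic_grad = 2 * (2.0 * w0 + 2) * 2.0

print(f"Numeric:  {numeric_grad:.4f}")
print(f"Analytic: {analytic_grad:.4f}")
```

Both values should agree to several decimal places, which confirms the chain-rule result of 32.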

Or... we can just call `.backward()`!

In [None]:
# Compute gradients
z.backward()

# Check the gradient of w
print(f"The gradient dz/dw is: {w.grad}")

Boom! It matches our manual calculation (32). 

**Note:** `x.grad` will be None because we didn't set `requires_grad=True` for it.
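The note above is easy to verify. Here is a small self-contained sketch that rebuilds the same computation and inspects both `.grad` attributes:

```python
import torch

x = torch.tensor(2.0)                      # not tracked
w = torch.tensor(3.0, requires_grad=True)  # tracked

z = (x * w + 2) ** 2
z.backward()

print(f"w.grad: {w.grad}")  # gradient was computed for w
print(f"x.grad: {x.grad}")  # no gradient was computed for x
```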

## 4. Stopping Autograd

When we are just using the model to make predictions (inference), we don't need to track gradients. Recording the graph anyway wastes memory and compute.

We use `torch.no_grad()` to temporarily disable it.

In [None]:
print(f"Before context: {w.requires_grad}")

with torch.no_grad():
    y_new = w * 5
    print(f"Inside context: {y_new.requires_grad}")
    
print(f"After context: {w.requires_grad}")
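To make the difference concrete, here is a sketch that runs the same operation inside and outside the context and compares the results. The names `tracked` and `untracked` are illustrative only.

```python
import torch

w = torch.tensor(3.0, requires_grad=True)

# Outside the context: the multiplication is recorded in the graph
tracked = w * 5

# Inside the context: same math, but nothing is recorded
with torch.no_grad():
    untracked = w * 5

print(f"tracked:   value={tracked.item()}, grad_fn={tracked.grad_fn}")
print(f"untracked: value={untracked.item()}, grad_fn={untracked.grad_fn}")
```

The values are identical; the only difference is that `untracked` has no `grad_fn`, so calling `.backward()` through it would not be possible.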

## 🧠 Summary

1. **`requires_grad=True`** enables gradient tracking.
2. **`.backward()`** computes gradients automatically.
3. **`.grad`** stores the computed gradient.
4. **`torch.no_grad()`** disables tracking for efficiency.

Next up: **Linear Regression** - Building our first actual model!