<div style="text-align:left;">
  <a href="https://code213.tech/" target="_blank">
    <img src="code213.PNG" alt="code213">
  </a>
  <p><em>prepared by Latreche Sara</em></p>
</div>

# Autograd in PyTorch  

One of PyTorch’s most powerful features is **autograd**, its automatic differentiation engine.  

Autograd enables PyTorch to:  
- **Record operations** performed on tensors with `requires_grad=True`.  
- **Build a computational graph dynamically**, so we don’t need to define it manually.  
- **Compute gradients automatically** using the `.backward()` method.  
- **Expose gradients** through the `.grad` attribute, which **backpropagation** in neural networks relies on.  

In this notebook, we will explore:  
- How to enable gradients with `requires_grad`.  
- How PyTorch builds the computational graph.  
- How to compute gradients with `.backward()`.  
- How gradients are stored and reset.  
- Practical examples of autograd in action.  


## Table of Contents  

- [1 - Enabling Autograd](#1)  
- [2 - Building a Computational Graph](#2)  
- [3 - Computing Gradients with `.backward()`](#3)  
- [4 - Accessing and Resetting Gradients](#4)  
- [5 - Example: Backpropagation in Action](#5)  
- [6 - Practice Exercises](#6)  


<a name='1'></a>
## 1 - Enabling Autograd  

In PyTorch, the **autograd engine** tracks operations performed on tensors so it can automatically compute gradients.  
By default, tensors are created with `requires_grad=False`.  
To enable gradient tracking, we set:  

```python
x = torch.tensor(2.0, requires_grad=True)
```

```python
import torch

# Tensor without autograd
a = torch.tensor(2.0)
print("requires_grad for a:", a.requires_grad)

# Tensor with autograd enabled
b = torch.tensor(2.0, requires_grad=True)
print("requires_grad for b:", b.requires_grad)
```

```
requires_grad for a: False
requires_grad for b: True
```
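
As a small follow-up sketch (the names `w`, `out`, and `frozen` are illustrative and not part of the notebook), tracking can also be switched on after creation with `requires_grad_()`. It then propagates to any result computed from a tracked tensor, and it can be suspended inside a `torch.no_grad()` block:

```python
import torch

w = torch.tensor(2.0)          # created without tracking
w.requires_grad_(True)         # switch tracking on in place

out = w * 3                    # results computed from tracked tensors are tracked too
print(out.requires_grad)

with torch.no_grad():          # operations in this block are not recorded
    frozen = w * 3
print(frozen.requires_grad)
```

Suspending tracking this way is common for evaluation code, where gradients are not needed.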


<a name='2'></a>
## 2 - Building a Computational Graph  

When we perform operations on tensors with `requires_grad=True`, PyTorch dynamically builds a **computational graph**.  
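- Leaf nodes are the input tensors you create, such as `x`.  
- Interior nodes are the operations that produced each result; every result tensor points to its operation through `grad_fn`.  
- Edges record which inputs each operation consumed, and PyTorch walks them in reverse to compute gradients during backpropagation.  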

The graph is built **dynamically**, meaning it is created as operations are executed, and discarded after `.backward()` (unless `retain_graph=True` is specified).  
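
The discard behaviour described above can be observed directly. The following is a minimal sketch, assuming a one-operation graph; the flag `second_pass_failed` is a hypothetical name used only for illustration:

```python
import torch

x = torch.tensor(3.0, requires_grad=True)
y = x**2

y.backward()                   # first pass frees the values saved in the graph
try:
    y.backward()               # second pass over the same, already freed graph
    second_pass_failed = False
except RuntimeError:
    second_pass_failed = True
print("second backward failed:", second_pass_failed)

# With retain_graph=True the graph survives the first pass.
x.grad = None                  # start from a clean gradient
z = x**2
z.backward(retain_graph=True)
z.backward()                   # allowed; the two gradients are summed
print(x.grad.item())
```

Retaining the graph costs memory, so it is usually only worth it when the same graph really has to be traversed more than once.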

```python
import torch

x = torch.tensor(3.0, requires_grad=True)
y = x**2 + 2*x + 1   # A simple quadratic function
print(y)
# x is a leaf node (user-created tensor).
# y is a result tensor computed from x.
# Autograd has built a graph that knows how to compute dy/dx.
```

```
tensor(16., grad_fn=<AddBackward0>)
```
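
The leaf and result roles can be checked on the tensors themselves. This is a short sketch using the `is_leaf` and `grad_fn` attributes, with the same quadratic as above:

```python
import torch

x = torch.tensor(3.0, requires_grad=True)
y = x**2 + 2*x + 1

print(x.is_leaf, x.grad_fn)    # a user-created leaf has no grad_fn
print(y.is_leaf)               # y was produced by an operation
print(y.grad_fn.name())        # the last operation recorded for y
```

The name reported for `y` corresponds to the final addition, which matches the `grad_fn` shown when `y` is printed.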


<a name='3'></a>
## 3 - Computing Gradients with `.backward()`  

Once a computational graph is built, we can compute derivatives using the `.backward()` method.  
PyTorch applies the **chain rule of calculus** to propagate gradients back through the graph.  


```python
import torch

# Create a tensor with autograd enabled
x = torch.tensor(3.0, requires_grad=True)

# Define a function y = x^2 + 2x + 1
y = x**2 + 2*x + 1
print("y:", y.item())

# Compute dy/dx
y.backward()

# Gradient stored in x.grad
print("dy/dx at x=3:", x.grad.item())
```

```
y: 16.0
dy/dx at x=3: 8.0
```
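
To see the chain rule at work on a composed function, the sketch below compares the autograd result with a derivative worked out by hand. The function $y = (x^2 + 1)^3$ and the names `u` and `manual` are illustrative choices, not part of the notebook:

```python
import torch

x = torch.tensor(1.0, requires_grad=True)
u = x**2 + 1                   # inner function u(x)
y = u**3                       # outer function y(u)
y.backward()

# Chain rule by hand: dy/dx = dy/du * du/dx = 3*u^2 * 2*x
manual = 3 * u.item()**2 * 2 * x.item()
print("autograd:", x.grad.item())
print("by hand: ", manual)
```

At $x = 1$ the inner value is $u = 2$, so both lines should report $3 \cdot 4 \cdot 2 = 24$.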


<a name='4'></a>
## 4 - Accessing and Resetting Gradients  

After calling `.backward()`, the computed gradients are stored in the `.grad` attribute of each leaf tensor.  
However, **gradients accumulate by default** in PyTorch. This means if you call `.backward()` multiple times without clearing them, the values will add up.  

To avoid this, we must **reset gradients** after each iteration, usually with:  
```python
x.grad.zero_()
```

```python
import torch

# Create a tensor with gradient tracking
x = torch.tensor(2.0, requires_grad=True)

# First function: y1 = x^2
y1 = x**2
y1.backward()
print("Gradient after first backward:", x.grad.item())

# Without resetting, compute gradient again with y2 = 3x
y2 = 3*x
y2.backward()
print("Gradient after second backward (accumulated):", x.grad.item())

# Reset gradients to zero
x.grad.zero_()
print("Gradient after reset:", x.grad.item())

# Recompute y2 = 3x
y2 = 3*x
y2.backward()
print("Gradient after second backward (fresh):", x.grad.item())
```

```
Gradient after first backward: 4.0
Gradient after second backward (accumulated): 7.0
Gradient after reset: 0.0
Gradient after second backward (fresh): 3.0
```
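
Resetting matters most inside an iterative loop. The following is a minimal sketch of hand-written gradient descent, assuming an illustrative loss $(w - 3)^2$, a step size `lr = 0.1`, and 50 iterations, none of which come from the notebook:

```python
import torch

w = torch.tensor(0.0, requires_grad=True)
lr = 0.1                       # illustrative step size

for _ in range(50):
    loss = (w - 3.0)**2        # minimized at w = 3
    loss.backward()
    with torch.no_grad():      # the update itself must not be recorded
        w -= lr * w.grad
    w.grad.zero_()             # clear before the next iteration

print(w.item())                # close to 3
```

Without the `zero_()` call, each step would use the sum of all earlier gradients and the updates would overshoot. When training with a `torch.optim` optimizer, `optimizer.zero_grad()` plays the same role for all of its parameters.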


<a name='5'></a>
## 5 - Example: Backpropagation in Action  

Autograd handles **functions of several inputs**: one `.backward()` call on a scalar result fills in `.grad` for every leaf tensor that contributed to it.  
Let’s see an example where gradients are computed with respect to two variables.  

### Example  

We define the function:  

$$
z = x \cdot y + y^2
$$  

and compute gradients with respect to $x$ and $y$. 

```python
import torch

# Create tensors with autograd enabled
x = torch.tensor(2.0, requires_grad=True)
y = torch.tensor(3.0, requires_grad=True)

# Define a function z = x*y + y^2
z = x * y + y**2
print("z:", z.item())

# Backpropagation
z.backward()

# Gradients
print("dz/dx at (x=2,y=3):", x.grad.item())  # ∂z/∂x = y = 3
print("dz/dy at (x=2,y=3):", y.grad.item())  # ∂z/∂y = x + 2y = 2 + 6 = 8
```

```
z: 15.0
dz/dx at (x=2,y=3): 3.0
dz/dy at (x=2,y=3): 8.0
```
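
The example above ends in a single scalar `z`. When the result has several elements, `.backward()` without arguments raises an error, because autograd needs to know how the outputs are weighted. A minimal sketch of two common ways around this, using an illustrative vector input:

```python
import torch

x = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)
y = x**2                       # three outputs, one per element

# Option 1: reduce to a scalar, then backpropagate.
y.sum().backward()
print(x.grad)                  # equals 2*x

# Option 2: pass a weight for each output explicitly.
x.grad = None                  # discard the gradient from option 1
y2 = x**2
y2.backward(torch.ones_like(y2))
print(x.grad)                  # same result as option 1
```

Weights of all ones make option 2 equivalent to summing the outputs first, which is why both options give the same gradient.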


<a name='6'></a>
## 6 - Practice Exercises  

Now it’s your turn! Try the following exercises to reinforce your understanding of **Autograd**.  

---

### **Exercise 1: Single Variable Gradient**  
Define the function:  

$$
y = 3x^3 - 4x
$$  

- Create a tensor $x = 2.0$ with `requires_grad=True`.  
- Compute $y$ and use `.backward()` to get $\frac{dy}{dx}$.  
- Verify by hand:  

$$
\frac{dy}{dx} = 9x^2 - 4
$$  

---

### **Exercise 2: Two Variables Function**  
Define the function:  

$$
z = x^2 \cdot y + 5y
$$  

- Create $x = 1.0$, $y = 2.0$ with gradients enabled.  
- Compute $\frac{\partial z}{\partial x}$ and $\frac{\partial z}{\partial y}$.  

---

### **Exercise 3: Gradient Reset**  
- Create $x = 3.0$ with `requires_grad=True`.  
- Define $y = 2x^2$.  
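- Call `.backward()` **twice** without resetting gradients, recomputing $y$ before each call (a second call on the same graph fails unless `retain_graph=True` was passed).  
- Observe what happens to `x.grad`.  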
- Then reset gradients with `x.grad.zero_()` and recompute.  

---

### **Challenge (Optional)**  
Define a function of three variables:  

$$
f(x,y,z) = x \cdot y + y \cdot z + x \cdot z
$$  

- Compute $\frac{\partial f}{\partial x}$, $\frac{\partial f}{\partial y}$, and $\frac{\partial f}{\partial z}$ using autograd.  
