---
## ReLU Activation Function

### What is ReLU?

**ReLU** (Rectified Linear Unit) is one of the most widely used activation functions in modern neural networks.

**Formula:**
$$\text{ReLU}(x) = \max(0, x)$$

**In simple terms:**
- If input > 0: output = input (pass through)
- If input ≤ 0: output = 0 (zero out)

**Advantages:**
1. ✅ Computationally efficient (simple max operation)
2. ✅ Helps mitigate the vanishing gradient problem (the gradient is exactly 1 for every positive input)
3. ✅ Introduces non-linearity (crucial for learning complex patterns)
4. ✅ Sparse activation (every non-positive input produces an exact zero)
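
Advantages 2 and 4 can be checked numerically. The sketch below is an illustration, not part of the original notebook: it compares ReLU's gradient with that of sigmoid (chosen here only as a familiar saturating activation) on a few hand-picked positive inputs, then measures how many outputs are exactly zero for standard-normal inputs. The input values, the seed, and the sample size are assumptions made for the demo.

```python
import torch
import torch.nn.functional as F

# Gradient comparison on a few positive inputs (values chosen for illustration).
x_relu = torch.tensor([0.5, 2.0, 5.0], requires_grad=True)
x_sig = torch.tensor([0.5, 2.0, 5.0], requires_grad=True)

F.relu(x_relu).sum().backward()
torch.sigmoid(x_sig).sum().backward()

relu_grad = x_relu.grad      # 1 for every positive input
sigmoid_grad = x_sig.grad    # at most 0.25, and it shrinks as the input grows

print("ReLU gradient:   ", relu_grad)
print("Sigmoid gradient:", sigmoid_grad)

# Sparsity: fraction of exact zeros after ReLU on standard-normal inputs.
torch.manual_seed(0)
z = torch.randn(10_000)
zero_fraction = (F.relu(z) == 0).float().mean().item()
print("Fraction of zeros:", zero_fraction)
```

Roughly half of the outputs are zero here because about half of the standard-normal samples are negative; in a trained network the fraction depends on the weights and the data.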

**Why non-linearity matters:**
Without activation functions, stacking multiple linear layers would be equivalent to a single linear layer, because a composition of linear maps is itself linear. Activation functions allow networks to learn complex, non-linear relationships.
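
The collapse described above can be verified directly. This is a minimal sketch under assumed layer sizes (4→3→2) and a fixed seed: it builds two `nn.Linear` layers, folds them into a single weight matrix `W = W2 W1` and bias `b = W2 b1 + b2`, and compares the results. The variable names are illustrative.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
layer1 = nn.Linear(4, 3)
layer2 = nn.Linear(3, 2)
x = torch.randn(8, 4)  # batch of 8 samples, 4 features each

with torch.no_grad():
    # Two stacked linear layers with no activation in between.
    stacked = layer2(layer1(x))

    # The same mapping written as ONE linear layer.
    W = layer2.weight @ layer1.weight               # shape (2, 4)
    b = layer2.weight @ layer1.bias + layer2.bias   # shape (2,)
    collapsed = x @ W.T + b

    # Inserting ReLU between the layers breaks this equivalence in general.
    with_relu = layer2(torch.relu(layer1(x)))

print("Stacked == single linear layer:", torch.allclose(stacked, collapsed, atol=1e-5))
print("With ReLU == single linear layer:", torch.allclose(with_relu, collapsed, atol=1e-5))
```

The first comparison holds for any weights. The second is expected to fail whenever at least one hidden pre-activation is negative, which is the typical case.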

### Example 3.1: Basic ReLU on a Vector (Functional API)

In [None]:
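import torch                      # tensors (torch.tensor, torch.randn, ...)
import torch.nn as nn             # layer classes such as nn.ReLU, used in later examples
import torch.nn.functional as F   # functional API such as F.relu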

# Example input tensor
x = torch.tensor([-2.0, -0.5, 0.0, 0.5, 2.0])
relu_output = F.relu(x)

print("Input: ", x)
print("ReLU Output: ", relu_output)
print("\n⚡ Notice: negative values → 0, positive values unchanged")

### Example 3.2: Basic ReLU on a Vector (Module API)

**Two ways to use ReLU:**
1. `F.relu(x)`: Functional API (a plain function call, typically used inside a custom `forward` method)
2. `nn.ReLU()`: Module API (a layer object, for use in `nn.Sequential` or as an attribute of a custom module)

In [None]:
x = torch.tensor([-2.0, -0.5, 0.0, 0.5, 2.0])
relu = nn.ReLU()

output = relu(x)
print("Input: ", x)
print("After ReLU: ", output)
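
To make the difference between the two APIs concrete, here is a small sketch in which the same computation is written both ways. `TinyNet` is a hypothetical class name invented for this illustration, and the layer sizes are arbitrary. The `nn.Sequential` version reuses the same linear layer so that both paths share weights.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class TinyNet(nn.Module):  # hypothetical example module
    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(4, 3)

    def forward(self, x):
        # Functional API: call F.relu directly on the layer output.
        return F.relu(self.fc(x))


torch.manual_seed(0)
net = TinyNet()

# Module API: nn.ReLU() is a layer object placed inside nn.Sequential.
# net.fc is reused here, so both models share the same weights.
seq = nn.Sequential(net.fc, nn.ReLU())

x = torch.randn(2, 4)
same = torch.equal(net(x), seq(x))
print("Same output from both APIs:", same)
```

ReLU has no learnable parameters, so the choice between the two is mostly about code organization.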

### Example 3.3: ReLU on a Matrix

ReLU operates **element-wise** on matrices (and tensors of any dimension).

In [None]:
x = torch.tensor([[-1.0, 2.0], [0.0, -3.5]])
output = nn.ReLU()(x)

print("Input:\n", x)
print("\nOutput:\n", output)
print("\n⚡ Element-wise: negatives→0, positives unchanged")
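
The same element-wise behavior holds for higher-dimensional tensors. As a quick check, assuming an arbitrary 3-D shape of `(2, 3, 4)`, the sketch below confirms that the output shape matches the input shape and that the result agrees with clamping every element at zero via `torch.clamp`.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
x = torch.randn(2, 3, 4)   # e.g. 2 samples, each a 3x4 grid of values
out = F.relu(x)

same_shape = out.shape == x.shape
matches_clamp = torch.equal(out, torch.clamp(x, min=0))

print("Shape preserved:", same_shape)
print("Equals clamp(x, min=0):", matches_clamp)
```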

### Example 3.4: ReLU in a Neural Network

In practice, ReLU is placed between linear layers to introduce non-linearity.

**Network flow:** Input → Linear(4→3) → ReLU → Linear(3→2) → Output

In [None]:
model = nn.Sequential(
    nn.Linear(4, 3),
    nn.ReLU(),
    nn.Linear(3, 2)
)

# One random sample with 4 features (shape: batch_size x in_features)
x = torch.randn(1, 4)
output = model(x)  # shape (1, 2)

print("Input:", x)
print("Output:", output)
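
To see what ReLU does inside the network, the layers of an `nn.Sequential` can be applied one at a time using integer indexing. The sketch below rebuilds the same 4→3→2 architecture with a fixed seed (an assumption added for reproducibility) and inspects the hidden values before and after the activation.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
model = nn.Sequential(
    nn.Linear(4, 3),
    nn.ReLU(),
    nn.Linear(3, 2),
)
x = torch.randn(1, 4)

with torch.no_grad():
    pre_activation = model[0](x)        # output of Linear(4→3), may be negative
    hidden = model[1](pre_activation)   # after ReLU, never negative
    output = model[2](hidden)           # output of Linear(3→2)
    full = model(x)                     # whole model in one call

print("Before ReLU:", pre_activation)
print("After ReLU: ", hidden)
print("Step-by-step matches model(x):", torch.allclose(output, full))
```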

### Example 3.5: Visualizing the ReLU Function

The plot produced by the code below shows:
- **Left side (x < 0)**: flat at 0
- **Right side (x ≥ 0)**: diagonal line (identity)
- **At x = 0**: there's a "kink" where the slope changes abruptly from 0 to 1

In [None]:
import matplotlib.pyplot as plt

x = torch.linspace(-5, 5, 100)
y = F.relu(x)

plt.plot(x.numpy(), y.numpy(), linewidth=2)
plt.title("ReLU Activation Function")
plt.xlabel("Input")
plt.ylabel("Output")
plt.grid(True)
plt.axhline(y=0, color='k', linewidth=0.5)
plt.axvline(x=0, color='k', linewidth=0.5)
plt.show()
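
The slopes on either side of the kink can be read off with autograd. This is a small sketch using three hand-picked inputs. Note that ReLU has no true derivative at exactly x = 0; the 0 that PyTorch reports there is a convention.

```python
import torch
import torch.nn.functional as F

# One input on each side of the kink, plus the kink itself.
x = torch.tensor([-1.0, 0.0, 1.0], requires_grad=True)

# Summing gives a scalar, so backward() fills x.grad with d ReLU(x_i) / d x_i.
F.relu(x).sum().backward()

print("Inputs:   ", x.detach())
print("Gradients:", x.grad)  # tensor([0., 0., 1.])
```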