## 📚 Gradient Descent: What is it?

Gradient descent is an optimization algorithm that minimizes a loss function $J(\theta)$ by iteratively updating the model parameters in the direction opposite to the gradient, which is the direction of steepest local decrease:

$$
\theta = \theta - \eta \cdot \nabla_{\theta} J(\theta)
$$

Where:

* $\theta$: model parameters (e.g., weights, bias)
* $\eta$: learning rate (step size), a positive scalar that controls how far each update moves
* $\nabla_{\theta} J(\theta)$: gradient of the loss function with respect to $\theta$
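
To make the update rule concrete, here is a minimal sketch on a toy one-parameter loss. The loss $J(\theta) = \theta^2$, the starting value, and the helper name `grad_J` are assumptions chosen for illustration only. With this loss the gradient is $2\theta$, so each step multiplies $\theta$ by $(1 - 2\eta)$.

```python
def grad_J(theta):
    # Gradient of the assumed toy loss J(theta) = theta**2
    return 2.0 * theta

theta = 4.0  # arbitrary starting point (assumption)
eta = 0.1    # learning rate

for step in range(3):
    # theta <- theta - eta * gradient
    theta = theta - eta * grad_J(theta)
    print(f"step {step}: theta = {theta:.4f}")

# step 0: theta = 3.2000
# step 1: theta = 2.5600
# step 2: theta = 2.0480
```

Each step shrinks $\theta$ by a factor of 0.8, moving it toward the minimum at $\theta = 0$.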

---

## ✅ Example: Gradient Descent for Linear Regression

### 🔹 Problem:

Fit a line $y = wx + b$ to some data points using gradient descent.
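
The gradients used in the implementation follow from the mean squared error loss. As a short derivation, writing $\hat{y}_i = w x_i + b$ for the prediction and $n$ for the number of data points (notation assumed here):

$$
J(w, b) = \frac{1}{n} \sum_{i=1}^{n} \left( y_i - \hat{y}_i \right)^2
$$

$$
\frac{\partial J}{\partial w} = -\frac{2}{n} \sum_{i=1}^{n} x_i \left( y_i - \hat{y}_i \right),
\qquad
\frac{\partial J}{\partial b} = -\frac{2}{n} \sum_{i=1}^{n} \left( y_i - \hat{y}_i \right)
$$

These two expressions correspond to `dw` and `db` in the code.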

---

### 🧱 Step-by-Step Implementation

```python
import numpy as np

# 1. Generate dummy data
X = np.array([1, 2, 3, 4, 5])
y = np.array([2, 4, 6, 8, 10])  # true function is y = 2x

# 2. Initialize weights
w = 0.0
b = 0.0

# 3. Set hyperparameters
learning_rate = 0.01
epochs = 10

# 4. Gradient Descent Loop
for epoch in range(epochs):
    # Forward pass: compute prediction
    y_pred = w * X + b

    # Compute loss (MSE)
    loss = np.mean((y - y_pred) ** 2)

    # Compute gradients
    dw = -2 * np.mean((y - y_pred) * X)
    db = -2 * np.mean(y - y_pred)

    # Update weights
    w -= learning_rate * dw
    b -= learning_rate * db

    # Print every 2 epochs
    if epoch % 2 == 0:
        print(f"Epoch {epoch}: Loss = {loss:.4f}, w = {w:.4f}, b = {b:.4f}")
```

```text
Epoch 0: Loss = 44.0000, w = 0.4400, b = 0.1200
Epoch 2: Loss = 14.9735, w = 1.0326, b = 0.2804
Epoch 4: Loss = 5.1157, w = 1.3783, b = 0.3725
Epoch 6: Loss = 1.7676, w = 1.5802, b = 0.4247
Epoch 8: Loss = 0.6302, w = 1.6982, b = 0.4538
```
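
After 10 epochs the parameters are still some distance from the true values $w = 2$, $b = 0$. As a rough check, the sketch below reruns the same update rule for more epochs and compares the result with NumPy's closed-form least-squares fit. The helper name `fit_line` and the choice of 5000 epochs are assumptions made for illustration, not part of the original example.

```python
import numpy as np

X = np.array([1, 2, 3, 4, 5], dtype=float)
y = np.array([2, 4, 6, 8, 10], dtype=float)

def fit_line(X, y, learning_rate=0.01, epochs=5000):
    """Hypothetical helper: the same MSE gradient descent loop, run for longer."""
    w, b = 0.0, 0.0
    for _ in range(epochs):
        error = y - (w * X + b)        # residuals at the current parameters
        dw = -2 * np.mean(error * X)   # dJ/dw
        db = -2 * np.mean(error)       # dJ/db
        w -= learning_rate * dw
        b -= learning_rate * db
    return w, b

w, b = fit_line(X, y)

# Closed-form least-squares line for comparison: returns [slope, intercept]
slope, intercept = np.polyfit(X, y, 1)

print(f"gradient descent: w = {w:.4f}, b = {b:.4f}")
print(f"np.polyfit:       w = {slope:.4f}, b = {intercept:.4f}")
```

With enough epochs, both approaches should agree closely on $w \approx 2$ and $b \approx 0$. The bias converges more slowly than the weight here, which is why 10 epochs are not enough at this learning rate.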
