# 03: Calculus for Machine Learning
This lesson covers the calculus behind how machine learning models learn, with a focus on optimization by gradient descent.

## 🎯 Objectives
- Understand derivatives and gradients
- Apply the chain rule to multivariable functions
- Use calculus to differentiate and minimize loss functions
- Visualize gradient descent

## 📐 Derivatives & Cost Functions

In [None]:
import numpy as np
import matplotlib.pyplot as plt

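# Quadratic cost function with its minimum at x = 3
def f(x):
    return (x - 3)**2

# Analytic derivative of f: negative left of the minimum, zero at it, positive to the right
def df(x):
    return 2 * (x - 3)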

x = np.linspace(-1, 7, 100)
y = f(x)
dy = df(x)

plt.plot(x, y, label='f(x)')
plt.plot(x, dy, label="f'(x)", linestyle='--')
# Mark the minimum at x = 3, where the derivative crosses zero
plt.axvline(3, color='gray', linestyle=':', label='minimum (x = 3)')
plt.legend()
plt.title("Function and Derivative")
plt.grid(True)
plt.show()
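
One way to sanity-check an analytic derivative such as `df` is to compare it with a finite-difference estimate. The sketch below redefines `f` and `df` so that it runs on its own. The helper `numerical_derivative`, the step size `h=1e-5`, and the test points are illustrative assumptions, not part of the lesson's original code.

```python
import numpy as np

def f(x):
    return (x - 3) ** 2

def df(x):
    return 2 * (x - 3)

def numerical_derivative(func, x, h=1e-5):
    """Central-difference estimate of func'(x) (hypothetical helper)."""
    return (func(x + h) - func(x - h)) / (2 * h)

xs = np.array([0.0, 3.0, 5.0])          # arbitrary test points
exact = df(xs)                          # analytic derivative
approx = numerical_derivative(f, xs)    # finite-difference estimate
max_error = np.max(np.abs(approx - exact))

print("analytic:", exact)
print("numeric: ", approx)
print("largest absolute difference:", max_error)
```

If the two disagree by more than rounding error, the analytic derivative most likely contains a mistake.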

## 🔗 Chain Rule in Action

In [None]:
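# Example: f(g(x)) = (2x + 1)^2, with g(x) = 2x + 1 and f(g) = g^2
# Chain rule: df/dx = df/dg * dg/dx = 2g * 2
x = 2
g = 2*x + 1       # inner function: g(2) = 5
f_val = g**2      # outer function value; named f_val so it does not overwrite f(x) defined earlier
df_dg = 2 * g     # derivative of the outer function with respect to g
dg_dx = 2         # derivative of the inner function with respect to x
df_dx = df_dg * dg_dx
print("df/dx at x=2:", df_dx)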
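
The same rule extends to functions of several variables, where the partial derivatives are collected into a gradient. As a minimal sketch, assume a one-sample squared-error loss L(w, b) = (w*x + b - y)^2. The names `loss` and `grad_loss`, the data point, and the parameter values are hypothetical and chosen only for illustration. With the inner function u = w*x + b - y, the chain rule gives dL/dw = 2u * x and dL/db = 2u.

```python
import numpy as np

x_sample, y_sample = 2.0, 7.0   # made-up data point (assumption)

def loss(w, b):
    u = w * x_sample + b - y_sample   # inner function (the residual)
    return u ** 2                     # outer function

def grad_loss(w, b):
    """Gradient [dL/dw, dL/db] obtained with the chain rule."""
    u = w * x_sample + b - y_sample
    return np.array([2 * u * x_sample, 2 * u])

w, b = 1.0, 0.5                       # arbitrary parameter values
analytic = grad_loss(w, b)

# Finite-difference check of each partial derivative
h = 1e-6
numeric = np.array([
    (loss(w + h, b) - loss(w - h, b)) / (2 * h),
    (loss(w, b + h) - loss(w, b - h)) / (2 * h),
])

print("chain-rule gradient:", analytic)
print("numeric gradient:   ", numeric)
```

Each component of the gradient says how fast the loss changes when only that parameter moves, and the vector as a whole points in the direction of steepest increase.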

## 🧗 Gradient Descent Demo

In [None]:
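# Simple gradient descent on f(x) = (x - 3)^2, using df from the first cell
x = 0.0               # starting point
learning_rate = 0.1   # step size
history = [x]         # record the starting point so iteration 0 appears in the plot

for _ in range(20):
    grad = df(x)                # slope at the current x
    x -= learning_rate * grad   # step against the gradient
    history.append(x)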

plt.plot(history, marker='o')
plt.title("Gradient Descent Progression")
plt.xlabel("Iteration")
plt.ylabel("x value")
plt.grid(True)
plt.show()
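
The effect of the learning rate can be seen by rerunning the same loop with different step sizes. For this particular cost function, each update multiplies the distance from the minimum by (1 - 2 * learning_rate), so small rates converge slowly and rates above 1 move away from the minimum. The sketch below is self-contained; the helper `run_gradient_descent` and the three rates are illustrative assumptions.

```python
def df(x):
    return 2 * (x - 3)

def run_gradient_descent(learning_rate, x0=0.0, steps=20):
    """Return every x visited, starting from x0 (hypothetical helper)."""
    x = x0
    path = [x]
    for _ in range(steps):
        x -= learning_rate * df(x)
        path.append(x)
    return path

# Too small, reasonable, and too large for this function
results = {lr: run_gradient_descent(lr) for lr in (0.01, 0.1, 1.1)}

for lr, path in results.items():
    distance = abs(path[-1] - 3)
    print(f"learning_rate={lr}: final x = {path[-1]:.4f}, distance from minimum = {distance:.4f}")
```

With these settings, the smallest rate is still far from x = 3 after 20 steps, the middle rate ends close to it, and the largest rate overshoots further on every step.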

## ✅ Summary Quiz
1. What does a derivative represent?
2. What does the gradient tell us in multiple dimensions?
3. Why is the learning rate important in gradient descent?