# Calculus Made Simple: A Middle Schooler's Guide

**What is this notebook about?**

Everything around you is *changing*. Your phone battery is draining. A basketball is arcing through the air. Your follower count is (hopefully) going up. Calculus is just a set of tools for understanding **how things change** and **finding the best answer** to a problem.

You don't need to know anything fancy to start. If you can multiply, divide, and read a graph, you're ready. We'll build up every idea from scratch.

> **Audience:** Middle schoolers (or anyone who wants a jargon-free introduction!)  
> **What you'll need:** Python with numpy and matplotlib (already set up here).

In [None]:
# Tools we'll use — just two libraries
import numpy as np                # for math stuff
import matplotlib.pyplot as plt   # for drawing graphs

print("All set! Let's learn about how things change.")

## What is a Rate of Change?

Imagine you check your phone battery a few times during the day. You write down the numbers:

| Time (hours) | Battery % |
|:---:|:---:|
| 0 | 100 |
| 1 | 85 |
| 2 | 72 |
| 3 | 61 |
| 4 | 52 |
| 5 | 45 |

A natural question: **"How fast is my battery dying?"**

That question — "how fast is something changing?" — is the *entire* starting point of calculus. The answer is called a **rate of change**. Let's compute it.

In [None]:
# Your phone battery over time
hours = [0, 1, 2, 3, 4, 5]
battery = [100, 85, 72, 61, 52, 45]

# How fast is it draining each hour?
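for i in range(1, len(hours)):
    time_gap = hours[i] - hours[i-1]                       # how much time passed
    drain_rate = (battery[i] - battery[i-1]) / time_gap    # change divided by time
    print(f"Hour {hours[i]}: Battery dropped by {abs(drain_rate):.0f}% per hour")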
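
A rate of change is always "how much it changed" divided by "how long that took," so the same idea works over any stretch of time, not just one hour. Here's a small sketch of that. The helper name `average_rate` is made up for this example, and the readings are copied from the table above.

```python
# Battery readings copied from the table above
hours = [0, 1, 2, 3, 4, 5]
battery = [100, 85, 72, 61, 52, 45]

def average_rate(values, times, start, end):
    """Change in value divided by change in time, between two readings."""
    return (values[end] - values[start]) / (times[end] - times[start])

# Whole day: from the first reading (index 0) to the last (index 5)
whole_day = average_rate(battery, hours, 0, 5)
# Just the first hour, and just the last hour
first_hour = average_rate(battery, hours, 0, 1)
last_hour = average_rate(battery, hours, 4, 5)

print(f"Whole day:  {whole_day:.1f}% per hour")
print(f"First hour: {first_hour:.1f}% per hour")
print(f"Last hour:  {last_hour:.1f}% per hour")
```

The negative signs mean the battery is going down. Notice the drain is faster in the first hour than in the last one, so a single "average" hides part of the story.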

### Making it Precise

The numbers above tell us the *average* drain per hour. But what if we want the **exact** speed at one specific moment?

Think about a car:
- The **odometer** tells you how far you've gone in total — that's like the battery readings above.
- The **speedometer** tells you how fast you're going *right now* — that's what we want!

The trick to getting an exact speed: **zoom in really, really close.** Instead of looking at a whole hour, look at a tiny, tiny fraction of a second. The closer you zoom in, the more exact your speed reading becomes.
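
Here's a rough sketch of that zooming-in idea. It assumes a made-up car whose distance traveled is t² meters after t seconds (that rule isn't from anywhere in particular, it's just easy to compute). Watch what happens to the average speed as the time gap shrinks.

```python
# Hypothetical car: distance traveled (meters) after t seconds
def distance(t):
    return t ** 2

t = 3.0  # the moment we care about
gaps = [1.0, 0.1, 0.01, 0.001]  # shorter and shorter stretches of time

average_speeds = []
for gap in gaps:
    # distance covered during the gap, divided by the length of the gap
    avg = (distance(t + gap) - distance(t)) / gap
    average_speeds.append(avg)
    print(f"gap = {gap:<6} average speed = {avg:.3f} m/s")
```

The averages creep closer and closer to 6 m/s. That number is the speedometer reading at t = 3.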

Let's write a little function that does that "zoom in" trick for us.

In [None]:
import numpy as np

def how_fast_is_it_changing(rule, x, tiny_step=0.0001):
    """Estimate the rate of change at a point by zooming in really close.

    Compares a point just after x with a point just before x.
    """
    return (rule(x + tiny_step) - rule(x - tiny_step)) / (2 * tiny_step)

# Example: A ball thrown in the air
# Height = -5t² + 20t  (it goes up then comes down)
ball_height = lambda t: -5*t**2 + 20*t

times = [0, 1, 2, 3, 4]
for t in times:
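    speed = how_fast_is_it_changing(ball_height, t)
    if abs(speed) < 1e-6:
        speed = 0.0  # zooming in leaves tiny rounding leftovers; treat them as exactly zero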
    print(f"At t={t}s: height={ball_height(t):.1f}m, speed={speed:.1f} m/s")
    if speed > 0:
        print("  ↑ Ball is going UP")
    elif speed < 0:
        print("  ↓ Ball is going DOWN")
    else:
        print("  ★ Ball is at the TOP! Speed is zero!")

### The Mathematician's Name for This

Mathematicians call the exact rate of change at a single point the **derivative**. That's it — just a fancy name for "how fast is it changing right now."

- When you say "the ball is going 10 m/s upward at t=1," you've found the **derivative** at t=1.
- When you say "my battery is draining at 15% per hour right at this moment," that's also a derivative.

From now on we'll use both names — "rate of change" and "derivative" — but they mean the same thing.

Let's draw the ball's flight path and mark the speed (derivative) at a few points. The red dashed lines are like little speedometer needles — they show which direction the ball is moving and how fast.

In [None]:
import matplotlib.pyplot as plt

t = np.linspace(0, 4, 100)
height = -5*t**2 + 20*t

plt.figure(figsize=(10, 6))
plt.plot(t, height, 'b-', linewidth=2, label='Ball Height')

# Show "speedometer" at a few points
for t0 in [0.5, 2.0, 3.5]:
    h0 = -5*t0**2 + 20*t0
    speed = how_fast_is_it_changing(ball_height, t0)
    # Draw short tangent line
    dt = 0.5
    plt.plot([t0-dt, t0+dt], [h0-speed*dt, h0+speed*dt], 'r--', linewidth=2)
    plt.plot(t0, h0, 'ro', markersize=8)
    plt.annotate(f'Speed: {speed:.1f} m/s', (t0, h0), 
                textcoords="offset points", xytext=(10, 10), fontsize=10)

plt.xlabel('Time (seconds)', fontsize=12)
plt.ylabel('Height (meters)', fontsize=12)
plt.title('Ball Thrown in the Air — with "Speedometer" Readings', fontsize=14)
plt.legend(fontsize=12)
plt.grid(True, alpha=0.3)
plt.tight_layout()
plt.show()

### Quick Recipes for Common Rates of Change

Before we can understand more advanced stuff, let's learn some shortcuts. Mathematicians figured out simple "recipes" so you don't have to zoom in every single time.

| If the original rule is... | The rate of change (derivative) is... | Why it makes sense |
|:---|:---|:---|
| t squared (t²) | 2 times t (2t) | The higher t is, the faster it grows |
| t cubed (t³) | 3 times t squared (3t²) | Cubic things speed up even faster |
| 5t + 3 | Just 5 | Straight lines have a constant speed |
| A plain number (like 7) | 0 | Constants don't change! |

**The pattern (called the Power Rule):** If the rule is t raised to some power n, the derivative is n times t raised to (n minus 1). In short:

> t to the n --> n times t to the (n-1)

Let's check that these recipes actually give the same answers as our "zoom in" method.

In [None]:
# Let's verify our "recipes" actually work!
rules_and_recipes = [
    ("t²",     lambda t: t**2,   lambda t: 2*t,     "2t"),
    ("t³",     lambda t: t**3,   lambda t: 3*t**2,  "3t²"),
    ("5t + 3", lambda t: 5*t+3,  lambda t: 5.0,     "5"),
]

t = 2.0
print(f"Checking at t = {t}:")
print(f"{'Rule':<10} {'Recipe':<8} {'Recipe says':<12} {'Zoom-in says':<12} {'Match?'}")
print("-" * 55)
for name, fn, recipe, recipe_name in rules_and_recipes:
    recipe_val = recipe(t)
    zoom_val = how_fast_is_it_changing(fn, t)
    match = "Yes!" if abs(recipe_val - zoom_val) < 0.01 else "No"
    print(f"{name:<10} {recipe_name:<8} {recipe_val:<12.4f} {zoom_val:<12.4f} {match}")
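
The Power Rule isn't limited to the rows in the table. Here's a sketch that tries it for a few more powers. The helper `power_rule` is a name invented for this example, and the zoom-in function is repeated so the sketch runs on its own.

```python
def how_fast_is_it_changing(rule, x, tiny_step=0.0001):
    """Estimate the rate of change by comparing points just either side of x."""
    return (rule(x + tiny_step) - rule(x - tiny_step)) / (2 * tiny_step)

def power_rule(n, t):
    """The recipe: the rate of change of t**n is n * t**(n - 1)."""
    return n * t ** (n - 1)

t = 2.0
results = []
for n in [2, 3, 4, 5]:
    recipe_val = power_rule(n, t)
    # n=n pins the current power inside the lambda
    zoom_val = how_fast_is_it_changing(lambda s, n=n: s ** n, t)
    results.append((n, recipe_val, zoom_val))
    print(f"t^{n}: recipe says {recipe_val:.4f}, zoom-in says {zoom_val:.4f}")
```

The two columns should agree to several decimal places. The tiny leftover difference comes from the zoom-in step not being infinitely small.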

## Chained Changes: When One Thing Affects Another

Before we can understand this next idea, let's make sure we're comfortable with something from everyday life: **chain reactions.**

Think about social media:
1. Posting at the right **time** gets you more **likes**.
2. More **likes** get you more **followers**.
3. More **followers** lead to bigger **sponsorship deals**.

So changing *one* thing (posting time) has a ripple effect all the way to your income. The question is: **if I post one hour earlier, how much more money do I make?**

To answer that, you just follow the chain and **multiply the rates at each step**. That's called the **chain rule**, and it's one of the most useful ideas in all of calculus.

In [None]:
# Chain of effects: posting time -> likes -> followers -> monthly income
# Each step has a "multiplier" (rate of change)

# Step 1: Each hour earlier you post, you get 50 more likes
likes_per_hour = 50

# Step 2: Every 100 likes, you get 3 new followers
followers_per_like = 3 / 100  # = 0.03

# Step 3: Every follower is worth $0.10/month
income_per_follower = 0.10

# The chain rule says: multiply all the rates!
total_effect = likes_per_hour * followers_per_like * income_per_follower
print("Chain of Effects:")
print(f"  Posting 1 hour earlier -> {likes_per_hour} more likes")
print(f"  {likes_per_hour} more likes -> {likes_per_hour * followers_per_like:.1f} new followers")
print(f"  {likes_per_hour * followers_per_like:.1f} followers -> ${likes_per_hour * followers_per_like * income_per_follower:.2f}/month more income")
print(f"\nTotal: posting 1 hour earlier = ${total_effect:.2f}/month more income")
print(f"\nThe Chain Rule: {likes_per_hour} x {followers_per_like} x {income_per_follower} = {total_effect:.2f}")

### The Chain Rule in One Sentence

> **When effects chain together, you MULTIPLY the rates. That's the chain rule!**

It works for two steps, three steps, or a hundred steps. Just multiply each rate of change together, like links in a chain.
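
To see "any number of links" in action, here's a sketch that puts the rates in a list and multiplies them all at once. The first three rates are the ones from the social media example; the fourth link (months per year) is added here just for illustration.

```python
import math

# One rate of change per link in the chain
rates = [
    50,    # likes gained per hour earlier
    0.03,  # followers gained per like
    0.10,  # dollars per month per follower
    12,    # months per year (an extra link added for this sketch)
]

# Chain rule: multiply every rate together
total_effect = math.prod(rates)
print(f"Posting 1 hour earlier = ${total_effect:.2f}/year more income")
```

Adding a link to the chain just means adding one more number to the list.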

Now let's see what this looks like with actual math functions, not just social media numbers.

In [None]:
# Mathematical version: y = (3x + 2)²
# Think of it as two steps:
#   Step 1: inner = 3x + 2  (multiply by 3 and add 2)
#   Step 2: y = inner²       (square the result)

# Rate of change of step 1: 3 (it's always 3)
# Rate of change of step 2: 2 * inner (recipe for squaring)
# Chain rule: multiply them! -> 2 * (3x + 2) * 3 = 6(3x + 2)

def y(x):
    return (3*x + 2)**2

def y_rate_of_change(x):
    return 6 * (3*x + 2)  # chain rule result

x_val = 1.0
print(f"Function: y = (3x + 2)²  at x = {x_val}")
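chain_val = y_rate_of_change(x_val)
zoom_val = how_fast_is_it_changing(y, x_val)
print(f"Chain rule says rate of change = 6(3*{x_val:.0f} + 2) = {chain_val}")
print(f"Zooming in confirms: {zoom_val:.4f}")
print("They match!" if abs(chain_val - zoom_val) < 0.01 else "They don't match. Check the math!")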
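
The same trick should work with three steps stacked up. In this sketch the first two steps are the ones above, and the third step (multiply by 5 and add 1) is a made-up extra link. The zoom-in function is repeated so the sketch runs on its own.

```python
def how_fast_is_it_changing(rule, x, tiny_step=0.0001):
    """Estimate the rate of change by comparing points just either side of x."""
    return (rule(x + tiny_step) - rule(x - tiny_step)) / (2 * tiny_step)

def step1(x):
    return 3 * x + 2      # rate of change: 3

def step2(inner):
    return inner ** 2     # rate of change: 2 * inner

def step3(squared):
    return 5 * squared + 1  # rate of change: 5 (made-up third link)

def whole_chain(x):
    return step3(step2(step1(x)))

x_val = 1.0
inner = step1(x_val)
# Chain rule: multiply the rate of each step
chain_val = 3 * (2 * inner) * 5
zoom_val = how_fast_is_it_changing(whole_chain, x_val)

print(f"Chain rule says: {chain_val:.4f}")
print(f"Zooming in says: {zoom_val:.4f}")
```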

## Finding the Best Answer: Trial and Error, But Smart

Now for the really cool part. We know how to measure how fast things change (derivatives). Can we use that to **find the best setting for something?**

Imagine you're blindfolded on a hilly field and you want to find the **lowest point** (maybe there's a treasure there!). You can't see, but you CAN feel which way the ground slopes under your feet.

Here's your strategy:
1. Feel the slope under your feet.
2. If it slopes downhill to the left, take a step left.
3. If it slopes downhill to the right, take a step right.
4. Repeat until the ground feels flat — you've found the bottom!

This strategy has a fancy name: **gradient descent**. "Gradient" means slope, and "descent" means going down. So it just means "follow the slope downhill."

This is the same basic idea behind how many AI systems learn. Seriously: when your phone's autocorrect gets a little smarter, a method like this is usually what did the teaching.

In [None]:
# Finding the best score in a game
# Score depends on your strategy setting: score = (setting - 7)²
# Lower is better! (like golf)
# Best setting is 7, but we don't know that -- let's find it!

score_function = lambda setting: (setting - 7)**2
score_slope = lambda setting: 2 * (setting - 7)

# Start with a guess
current_setting = 0
step_size = 0.3  # How much we adjust each time
history = [(current_setting, score_function(current_setting))]

print("Finding the best game setting...")
print(f"{'Try':<5} {'Setting':<10} {'Score':<10} {'Slope':<10} {'Action'}")
print("-" * 50)

for attempt in range(15):
    score = score_function(current_setting)
    slope = score_slope(current_setting)
    
    if slope < 0:
        action = "-> Move UP (slope says go right)"
    elif slope > 0:
        action = "<- Move DOWN (slope says go left)"
    else:
        action = "FOUND IT!"
    
    print(f"{attempt+1:<5} {current_setting:<10.2f} {score:<10.2f} {slope:<10.2f} {action}")
    
    # Smart adjustment: move opposite to the slope
    current_setting = current_setting - step_size * slope
    history.append((current_setting, score_function(current_setting)))

print(f"\nBest setting found: {current_setting:.2f} (actual best: 7)")

In [None]:
settings = np.linspace(-2, 14, 100)
scores = [(s - 7)**2 for s in settings]

fig, ax = plt.subplots(figsize=(10, 6))
ax.plot(settings, scores, 'b-', linewidth=2, label='Score (lower = better)')

# Plot the path
history_settings = [h[0] for h in history]
history_scores = [h[1] for h in history]
ax.plot(history_settings, history_scores, 'ro-', markersize=6, label='Our guesses')

# Number the first few
for i in range(min(5, len(history))):
    ax.annotate(f'Try {i+1}', (history_settings[i], history_scores[i]),
               textcoords="offset points", xytext=(5, 10), fontsize=9)

ax.set_xlabel('Strategy Setting', fontsize=12)
ax.set_ylabel('Score (lower is better)', fontsize=12)
ax.set_title('Finding the Best Setting by Following the Slope', fontsize=14)
ax.legend(fontsize=11)
ax.grid(True, alpha=0.3)
plt.tight_layout()
plt.show()
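
What if you don't know the recipe for the slope? One option, sketched below, is to feel the slope with the zoom-in function instead. This is only an illustration under the same assumptions as before (same score rule, same starting guess, same step size), with the zoom-in function repeated so the sketch runs on its own.

```python
def how_fast_is_it_changing(rule, x, tiny_step=0.0001):
    """Estimate the rate of change by comparing points just either side of x."""
    return (rule(x + tiny_step) - rule(x - tiny_step)) / (2 * tiny_step)

def score_function(setting):
    return (setting - 7) ** 2  # lower is better; the best setting is 7

setting = 0.0     # starting guess
step_size = 0.3   # how much we adjust each time

for attempt in range(15):
    # Feel the slope by zooming in, instead of using the 2 * (setting - 7) recipe
    slope = how_fast_is_it_changing(score_function, setting)
    # Move opposite to the slope
    setting = setting - step_size * slope

print(f"Best setting found: {setting:.2f} (actual best: 7)")
```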

### How Big Should Your Steps Be?

There's one important detail: **how big of a step should you take each time?** This is called the **step size** (or "learning rate" in AI).

Think of it like adjusting the volume on your phone:

- **Too cautious** (tiny step size): You turn the knob one tiny click at a time. You'll eventually get to the right volume, but it takes forever.
- **Just right** (medium step size): You make reasonable adjustments. You find the right volume quickly.
- **Too wild** (huge step size): You slam the knob back and forth, overshooting the right volume every time. If the steps are wild enough, you never settle on it at all!

Let's see this in action with our game setting example.

In [None]:
fig, axes = plt.subplots(1, 3, figsize=(15, 4))

step_sizes = [0.01, 0.3, 0.95]
labels = ['Too Cautious (0.01)', 'Just Right (0.3)', 'Too Wild (0.95)']

for ax, step_size, label in zip(axes, step_sizes, labels):
    x = 0.0
    path = [x]
    for _ in range(20):
        slope = 2 * (x - 7)
        x = x - step_size * slope
        path.append(x)
    
    t_vals = np.linspace(-2, 14, 100)
    ax.plot(t_vals, (t_vals-7)**2, 'b-', alpha=0.3)
    ax.plot(path, [(p-7)**2 for p in path], 'ro-', markersize=4)
    ax.set_title(label, fontsize=11)
    ax.set_ylim(-5, 100)
    ax.grid(True, alpha=0.3)

plt.suptitle('Step Size Matters!', fontsize=14, y=1.02)
plt.tight_layout()
plt.show()
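
If you'd rather see numbers than pictures, here's a sketch that measures how far each step size ends up from the best setting (7) after 20 steps. The helper `distance_from_best` is a name made up for this example, and the step size 1.05 is an extra value that isn't in the plots above.

```python
def distance_from_best(step_size, start=0.0, best=7.0, steps=20):
    """Run gradient descent on (x - best)**2 and report how far we end up from best."""
    x = start
    for _ in range(steps):
        slope = 2 * (x - best)
        x = x - step_size * slope
    return abs(x - best)

step_sizes = [0.01, 0.3, 0.95, 1.05]  # 1.05 is an extra, even wilder value
distances = {s: distance_from_best(s) for s in step_sizes}

for s, d in distances.items():
    print(f"step size {s:<5} -> {d:.4f} away from the best setting after 20 steps")
```

With 0.95 the guesses bounce from side to side but slowly close in. With 1.05 each bounce lands farther away than the last, so the guesses never settle.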

## Why This Matters for AI

Everything we've done in this notebook is a small-scale version of what happens inside many AI systems:

1. **Derivatives (rates of change)** tell the AI how its mistakes change when it tweaks its settings.
2. **The chain rule** lets the AI figure out how a tweak to any one of its settings ripples through hundreds of steps to affect the final answer. This is called **backpropagation** in AI, but it's just the chain rule used over and over!
3. **Gradient descent** is how the AI actually learns. It starts with random settings, checks how bad its answers are, uses the slope to figure out which way to adjust, and repeats millions of times.

When your phone's autocorrect gets better at predicting what you'll type next, that improvement usually comes from training with **gradient descent**: following the slope to find better and better settings, one small step at a time.

When a photo filter identifies your face, it learned to do that by using **the chain rule** to trace back through its layers and figure out which settings to change.

You now know the core ideas behind all of it.
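
To make that concrete, here's a toy sketch of "learning." It is nowhere near a real AI system: the model has just one setting (`weight`), and the data is made up to follow the hidden rule y = 3x. Still, it uses all three ideas: a rate of change, the chain rule, and gradient descent.

```python
# Made-up training data that follows the hidden rule y = 3 * x
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 6.0, 9.0, 12.0]

weight = 0.0          # the one "setting" the toy model can adjust
learning_rate = 0.01  # step size

for step in range(200):
    # For one example, the mistake is (weight * x - y) squared.
    # Chain rule: rate of the squaring step is 2 * (weight * x - y),
    #             rate of the inner step (as weight changes) is x.
    # Multiply them, then average over all the examples to get the slope.
    slope = sum(2 * (weight * x - y) * x for x, y in zip(xs, ys)) / len(xs)
    # Gradient descent: move opposite to the slope
    weight = weight - learning_rate * slope

print(f"Learned weight: {weight:.3f} (hidden rule used 3)")
```

The model never gets told the answer is 3. It finds it by following the slope downhill.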

## What You Learned Today

Here's a quick cheat-sheet connecting the simple words we used to the official math words:

| What we called it | Official math name | What it means |
|:---|:---|:---|
| Rate of change | **Derivative** | How fast something is changing at one moment |
| Zooming in really close | **Taking a limit** | Making the time gap smaller and smaller |
| Recipes (like t squared -> 2t) | **Differentiation rules** | Shortcuts so you don't have to zoom in every time |
| Multiplying rates in a chain | **Chain rule** | How to handle functions inside functions |
| Following the slope downhill | **Gradient descent** | A method for finding the lowest point |
| Step size | **Learning rate** | How much you adjust each time |
| The slope at a point | **Gradient** | The direction and steepness at that point |

**The big picture:** Calculus is about change. Derivatives measure change. The chain rule handles chains of change. Gradient descent *uses* change to find the best answer. And AI uses all three, millions of times, to learn.

## Transformation Notes (Appendix)

This notebook was adapted for a **Middle Schooler** audience using the following principles:

**Language choices:**
- Replaced all formal notation (limits, sigma, epsilon) with plain English
- Used "rate of change" before introducing the word "derivative"
- Used "zooming in" instead of "taking a limit"
- Used "recipe" instead of "differentiation rule"
- Used "slope" instead of "gradient" (then connected the terms at the end)

**Examples chosen for relatability:**
- Phone battery draining (everyone has a phone)
- Throwing a ball in the air (familiar from PE class)
- Social media likes/followers/income (relatable ambition)
- Game settings optimization (gamers understand tuning)
- Phone volume knob (intuitive step-size analogy)

**Scaffolding strategy:**
- Every section starts with a concrete example before any formula
- The "zoom in" function is built once and reused, so students see it working before trusting it
- Recipes are verified numerically so students don't have to take them on faith
- Chain rule is introduced with real-world multipliers before abstract functions
- Gradient descent is framed as a game before connecting it to AI

**What was NOT included (by design):**
- Formal limit notation
- Sigma/integral notation
- Proofs or formal derivations
- Epsilon-delta definitions
- Multi-variable calculus (kept to single-variable throughout)