# 12.b: Finding Linear Patterns

> Data! Data! Data! I can't make bricks without clay.
>
> — [Arthur Conan Doyle](https://en.wikipedia.org/wiki/Arthur_Conan_Doyle)

## 🎯 Learning Objectives

By the end of this notebook, you will be able to:
- Analyze the differences of a constant function (degree 0).
- Use the Method of First Differences to identify linear patterns (degree 1).
- Visualize a sequence and its differences using plots.

## 📚 Prerequisites

This notebook builds on concepts from the previous lesson. Before you begin, make sure you are comfortable with:
- Concepts from [Notebook 12.a: Functions, Sequences, and Plots](https://colab.research.google.com/github/sguy/programming-and-problem-solving/blob/main/notebooks/12.a-functions-sequences-and-plots.ipynb), including passing functions as arguments and basic plotting.

*Estimated Time: 30 minutes*

---

[Return to Table of Contents](https://colab.research.google.com/github/sguy/programming-and-problem-solving/blob/main/notebooks/table-of-contents.ipynb)

## Introduction: From Data to Rules

In the last notebook, we started with a rule (a Python function) and used it to generate a sequence of data, which we then plotted. We acted like engineers building a model.

Now, we're going to put on our detective hats and do the reverse. We will start with a sequence of data and try to uncover the hidden mathematical rule that created it. This is a fundamental skill in science, finance, and engineering, where the goal is to find a model that explains the data you observe.

Our primary tool will be the **Method of Differences**. By calculating the difference between consecutive data points, we can reveal the underlying pattern.

### What is the Method of Differences?

It sounds complicated, but the idea is simple. You start with a sequence of numbers, which we can call `T`. You subtract each term from the one after it to create a new, shorter sequence, which we call the first difference, $d_1$. Then you do the same thing to $d_1$ to get the second difference, $d_2$, and so on, until every entry in the newest sequence is the same.

Let's look at an example. Here is a sequence generated by the function $T(n) = n^3$.

| n | T(n) = n³ | $d_1$ | $d_2$ | $d_3$ |
|:-:|:---:|:---:|:---:|:---:|
| 0 | 0 | | | |
| 1 | 1 | 1 | | |
| 2 | 8 | 7 | 6 | |
| 3 | 27 | 19 | 12 | 6 |
| 4 | 64 | 37 | 18 | 6 |
| 5 | 125 | 61 | 24 | 6 |

Notice how we had to go all the way to the **third difference** ($d_3$) to find a constant pattern! The first and second differences were not constant.
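
The table can be reproduced in a few lines of Python. This is only a minimal sketch: `diffs` is a hypothetical helper written for this illustration (it is not one of the notebook's tools), and it returns a plain, shorter list with no padding.

```python
# Hypothetical helper for illustration: subtract each term from the next one.
def diffs(seq):
    return [b - a for a, b in zip(seq, seq[1:])]

cubes = [n**3 for n in range(6)]  # T(n) = n^3 for n = 0..5
d1 = diffs(cubes)
d2 = diffs(d1)
d3 = diffs(d2)

print(cubes)  # [0, 1, 8, 27, 64, 125]
print(d1)     # [1, 7, 19, 37, 61]
print(d2)     # [6, 12, 18, 24]
print(d3)     # [6, 6, 6]
```

Each round of differencing produces a list one item shorter, which is why the table's columns start one row lower each time.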

This raises the key question that we will answer in this notebook: **What does the number of steps it takes to find a constant difference tell us about the original sequence?**

## A Clue for Our Detective: The Polynomial's "Degree"

The secret to the Method of Differences lies in a concept from algebra called the **degree** of a polynomial. A polynomial is just an expression with variables (like `x`), numbers, and whole number exponents. For example:
$$y = 4x^2 - 5x + 10$$
The **degree** is simply the highest exponent of the variable. In the example above, the exponents are 2 (from $4x^2$), 1 (from $-5x^1$), and 0 (from $10x^0$), so the degree is **2**.

Why is this our big clue? Because a polynomial's degree tells us exactly how many times we need to calculate the differences to find a constant pattern. It's the key that unlocks the secret of the sequence!

- A **degree 0** polynomial (like `y = 15`) will have a constant sequence to begin with.
- A **degree 1** polynomial (like `y = 10x - 2`) will have a constant *first* difference.
- A **degree 2** polynomial (like `y = 4x^2 - 5x + 10`) will have a constant *second* difference. (We'll see this in the next notebook!)
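
One way to check the three claims above is to count how many rounds of differencing it takes before every entry is equal. The sketch below assumes evenly spaced inputs (`x = 0, 1, 2, ...`); `diffs` and `steps_to_constant` are hypothetical names used only for this illustration.

```python
# Hypothetical helper for illustration: subtract each term from the next one.
def diffs(seq):
    return [b - a for a, b in zip(seq, seq[1:])]

def steps_to_constant(seq):
    """Count the rounds of differencing needed until all entries are equal."""
    steps = 0
    while len(set(seq)) > 1:  # more than one distinct value: not constant yet
        seq = diffs(seq)
        steps += 1
    return steps

xs = range(6)
print(steps_to_constant([15 for x in xs]))                     # 0
print(steps_to_constant([10 * x - 2 for x in xs]))             # 1
print(steps_to_constant([4 * x**2 - 5 * x + 10 for x in xs]))  # 2
```

A word of caution: this only means something when there are enough data points. Any sequence eventually shrinks to a single number if you keep differencing, and a single number always looks "constant".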

### ✅ Check Your Understanding

What is the degree of each of the following polynomials?

- a) $y = 5x^3 + 2x^2 - 7$
- b) $y = 10x - 2$
- c) $y = 15$
- d) $y = x^4 + x$

<details><summary>Click for the answers</summary>

- a) **Degree 3** (the highest exponent is 3)
- b) **Degree 1** (since $10x$ is the same as $10x^1$)
- c) **Degree 0** (since $15$ is the same as $15x^0$)
- d) **Degree 4** (the highest exponent is 4)

</details>

## 🛠️ Our Detective Kit: The Tools

Now that we understand the theory, let's assemble the tools for our investigation. Below is the code for the helper functions we'll use. **You don't need to understand *how* this code works right now.** The important thing is to read the comment above each function to understand *what* it does for us. We can treat them as 'black boxes' and come back to study them later if we're curious.

In [None]:
import numpy as np
import matplotlib.pyplot as plt

# A helper to generate a list of y-values from a function and a list of x-values.
def get_function_values(func, domain):
    codomain = []
    for x in domain:
        codomain.append(func(x))
    return codomain

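# A helper to calculate the differences between consecutive items in a list.
# The result starts with nan so that it has the same length as the input.
def calculate_differences(sequence):
    # With fewer than two items there is nothing to subtract.
    if len(sequence) < 2:
        return [np.nan] * len(sequence)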

    differences = [np.nan]
    for i in range(1, len(sequence)):
        # Check if either value is nan before appending
        if np.isnan(sequence[i]) or np.isnan(sequence[i-1]):
            differences.append(np.nan)
        else:
            differences.append(sequence[i] - sequence[i-1])
    return differences
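
For the curious: NumPy already has a differencing function, `np.diff`, which returns one fewer item than it is given. The sketch below pads its result with a leading `nan`, which appears to be what `calculate_differences` does as well. The reading that the padding exists to keep each difference lined up with the data point it belongs to is an assumption, not something the helper's comments state.

```python
import numpy as np

sequence = [3, 7, 11, 15]  # made-up data for illustration

# np.diff returns the consecutive differences: one fewer item than the input.
raw = np.diff(sequence)

# Prepend nan so the padded list has the same length as the original.
padded = [np.nan] + raw.tolist()

print(len(sequence), len(raw), len(padded))  # 4 3 4
print(padded)                                # [nan, 4, 4, 4]
```

Keeping the lengths equal means a sequence and its differences can be plotted against the same x-values.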

## Case Study: A Cell Phone Plan

Let's use a modern, concrete example. Imagine a cell phone plan that has two parts:

1. A **$25 per month** base fee that you have to pay no matter what.

2. An extra charge of **$10 per gigabyte (GB)** for any data you use.

First, let's consider the simplest case: what if you don't use any data? Your bill is always $25. This is a **constant function (degree 0)**.

In [None]:
# A sequence representing a $25 bill for 5 months
y_constant_bills = [25, 25, 25, 25, 25]

# Calculate the first difference
d1_constant = calculate_differences(y_constant_bills)

print(f"Original Sequence: {y_constant_bills}")
print(f"First Difference:  {d1_constant}")

As expected, the first difference is `[nan, 0, 0, 0, 0]`. The leading `nan` marks the first month, which has no earlier bill to compare against. Since the value isn't changing, the rate of change is zero.

> 💡 A sequence is **constant (degree 0)** if and only if every one of its first differences is zero.

### The Linear Case (Degree 1)

Now let's look at the more interesting case where you use a different amount of data each month. The rule is `Cost = $10 * GB + $25`. This is a linear function.

In [None]:
# 1. Define the rule as a function
def calculate_bill(gb_used):
    return 10 * gb_used + 25

# 2. Define the domain (GBs used each month)
x_gb_used = [0, 1, 2, 3, 4, 5]

# 3. Generate the sequence of bills
y_linear_bills = get_function_values(calculate_bill, x_gb_used)

# 4. Calculate the first difference
d1_linear_bills = calculate_differences(y_linear_bills)

print(f"GBs Used: {x_gb_used}")
print(f"Monthly Bills: {y_linear_bills}")
print(f"First Difference:  {d1_linear_bills}")

### 🐍 New Python Tool: Subplot Visualization

A powerful way to compare a sequence to its difference is to plot them together. We can do this using `matplotlib`'s `subplots` feature, which lets us create a figure with multiple plots stacked vertically. This is useful for seeing the relationship between a function and its rate of change.

In [None]:
# Let's visualize the linear sequence and its differences
fig, (ax1, ax2) = plt.subplots(2, 1, figsize=(8, 6), sharex=True)
fig.suptitle('Analysis of a Linear Function (Cell Phone Bill)', fontsize=16)

# Plot Original Sequence
ax1.plot(x_gb_used, y_linear_bills, 'o-')
ax1.set_title("Original Sequence (Monthly Bill)")
ax1.set_ylabel("Bill Amount ($)")
ax1.grid(True)

# Plot First Difference
ax2.plot(x_gb_used, d1_linear_bills, 'o-', color='r')
ax2.set_title("First Difference (Cost per GB)")
ax2.set_ylabel("Change in Bill ($)")
ax2.set_xlabel("Gigabytes Used")
ax2.grid(True)

plt.show()

This time, the plot of the first difference is a horizontal line at `10` (the leading `nan` is simply not drawn). This constant value is the **rate of change**, or the slope of the line. In our cell phone example, it represents the **$10 cost per GB**.

> 💡 A sequence is **linear (degree 1)** if and only if its first difference is a non-zero constant (assuming the inputs are evenly spaced, as they are here).
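
Once the first difference turns out to be constant, the detective work can be finished by reading the rule straight off the data. The sketch below assumes the inputs start at 0 and go up by 1 each step, as in the cell phone example; `diffs` is a hypothetical helper used only for this illustration.

```python
# Hypothetical helper for illustration: subtract each term from the next one.
def diffs(seq):
    return [b - a for a, b in zip(seq, seq[1:])]

# Bills for 0, 1, 2, 3, 4, 5 GB (inputs assumed to start at 0 and step by 1)
bills = [25, 35, 45, 55, 65, 75]
d1 = diffs(bills)

if len(set(d1)) == 1 and d1[0] != 0:
    slope = d1[0]         # the constant difference is the rate of change
    intercept = bills[0]  # the value at 0 GB is the base fee
    print(f"Cost = {slope} * GB + {intercept}")  # Cost = 10 * GB + 25
else:
    print("The first difference is not a non-zero constant.")
```

The constant difference gives the slope, and the first term (the bill at 0 GB) gives the starting value, which together recover `Cost = 10 * GB + 25`.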

### Callback: Revisiting Our Thrown Ball

Now for the exciting part. In the last notebook, we plotted the *height* of a thrown ball, which produced a curve (a quadratic pattern). What do you think we'll get if we look at the first difference of that sequence? 

The first difference of height with respect to time is **velocity**. Because our measurements are one time unit apart, each difference is the ball's *average* velocity over that interval. Let's analyze it and see what we find.

In [None]:
# This is the height data from notebook 12.a for t = 0, 1, 2, 3, 4, 5
ball_heights = [10, 55, 90, 115, 130, 135]

# The first difference of height is average velocity
ball_velocities = calculate_differences(ball_heights)

print(f"Ball Heights: {ball_heights}")
print(f"Ball Velocities (d1): {ball_velocities}")

# Now let's find the difference of the velocities
d1_of_velocity = calculate_differences(ball_velocities)
print(f"d1 of Velocity: {d1_of_velocity}")

Amazing! The first difference of the `ball_heights` is `[nan, 45, 35, 25, 15, 5]`. This is the average velocity, and it is clearly **not constant**.

However, when we take the first difference of the **velocity sequence**, we get `[nan, nan, -10, -10, -10, -10]`. A constant! (There are two `nan` values now because the first velocity was itself `nan`.) This shows that the velocity of the ball is a **linear function**. It is a perfect example of the Principle of Degree Reduction, which says that taking a first difference lowers a polynomial's degree by one: we started with a degree-2 quadratic function for height, and its first difference (velocity) is a degree-1 linear function.
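
For readers who want to see why the degree drops, here is a short worked derivation for a general quadratic. The symbols $a$, $b$, and $c$ are placeholders introduced here for illustration. Start with

$$T(n) = an^2 + bn + c$$

Subtracting the previous term from each term gives the first difference:

$$d_1(n) = T(n) - T(n-1) = \left(an^2 + bn + c\right) - \left(a(n-1)^2 + b(n-1) + c\right) = 2an - a + b$$

The $n^2$ terms cancel, so $d_1$ is linear in $n$. Differencing once more leaves a constant:

$$d_2(n) = d_1(n) - d_1(n-1) = 2a$$

The heights listed above happen to fit $a = -5$ and $b = 50$, which gives $d_1(n) = -10n + 55$ (that is, 45, 35, 25, 15, 5) and $d_2 = -10$, matching the printed results.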

### ✅ Check Your Understanding

You are given a sequence of numbers. After calculating its first differences, you get `[nan, 4, 4, 4, 4]`. What can you conclude about the original sequence?

- a) It is a constant sequence.
- b) It is a linear sequence.
- c) It is a quadratic sequence.
- d) You cannot determine the pattern from the first differences.

<details><summary>Click for the answer</summary>
The answer is **b) It is a linear sequence**. A constant, non-zero first difference indicates a linear pattern.
</details>

### 🎯 Mini-Challenge: The Leaky Bucket

A bucket is leaking water at a steady rate. The volume of water (in liters) is measured every minute. The data is recorded in the `water_volume` list below.

Your task is to:
1. Calculate the first differences for this sequence.
2. Based on the result, print whether the leak is linear or not.
3. Explain what the constant difference represents in this context.

In [None]:
water_volume = [10.0, 9.5, 9.0, 8.5, 8.0]

# 1. Calculate the first differences
d1_water = [] # YOUR CODE HERE

print(f"First differences: {d1_water}")

# 2. Check if the result is constant and print your conclusion
# YOUR CODE HERE

# 3. Explain what the difference means in a print statement
# YOUR CODE HERE

<details><summary>Click to see a possible solution</summary>

```python
water_volume = [10.0, 9.5, 9.0, 8.5, 8.0]

# 1. Calculate the first differences
d1_water = calculate_differences(water_volume)
print(f"First differences: {d1_water}")
# Expected output: First differences: [nan, -0.5, -0.5, -0.5, -0.5]

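# 2. Check if the result is constant and print your conclusion
# Skip the leading nan, then see whether only one distinct value remains.
if len(set(d1_water[1:])) == 1:
    print("The leak is linear because the first difference is constant.")
else:
    print("The leak is not linear because the first difference changes.")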

# 3. Explain what the difference means in a print statement
print("The constant difference of -0.5 means the bucket is losing 0.5 liters per minute.")
```

</details>
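
A practical caution when testing for a constant difference: decimal measurements are stored as approximations, so differences that "should" be equal can differ by a tiny amount. The leaky bucket data avoids this because steps of 0.5 are stored exactly, but steps of 0.1 are not. The sketch below compares values with a small tolerance using the standard library's `math.isclose`; the names `diffs` and `is_roughly_constant`, the sample readings, and the tolerance value are all assumptions made for this illustration.

```python
import math

# Hypothetical helper for illustration: subtract each term from the next one.
def diffs(seq):
    return [b - a for a, b in zip(seq, seq[1:])]

def is_roughly_constant(values, tolerance=1e-9):
    """Return True if every value is within `tolerance` of the first one."""
    return all(math.isclose(v, values[0], abs_tol=tolerance) for v in values)

readings = [1.0, 0.9, 0.8, 0.7]  # made-up data that drops by 0.1 each step
d1 = diffs(readings)

print(len(set(d1)) == 1)        # False: exact comparison is fooled by rounding
print(is_roughly_constant(d1))  # True: the differences agree within tolerance
```

The tolerance should be much smaller than the differences you care about, but larger than the rounding error in the data.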

## 🎉 Well Done!

In this notebook, we used the Method of First Differences to test if a sequence of data was generated by a constant or linear rule. By calculating and plotting the differences, we can quickly identify the nature of the underlying function.

### Key Takeaways
- The first difference of a **constant** sequence is **zero**.
- The first difference of a **linear** sequence is a **non-zero constant**.
- This constant difference is the function's **rate of change** (or slope).

### Next Up: Notebook 12.c: Cracking the Quadratic Code 🚀

But what happens if the first difference *isn't* constant? In our next notebook, [Notebook 12.c: Cracking the Quadratic Code](https://colab.research.google.com/github/sguy/programming-and-problem-solving/blob/main/notebooks/12.c-cracking-the-quadratic-code.ipynb), we'll become data detectives again and see what happens when we take the difference of the differences.

---
[Return to Table of Contents](https://colab.research.google.com/github/sguy/programming-and-problem-solving/blob/main/notebooks/table-of-contents.ipynb)