# Mathematical induction 

## Overview 

__Summary:__ This lesson is a review of _proof by mathematical induction_, a technique for proving that a mathematical proposition involving recursion is always true. We discuss the idea of a _predicate_, the _induction hypothesis_ and _induction step_ in a mathematical induction proof, and two variations on the basic induction technique called _strong induction_ and _structural induction_.

---

## Introduction

Consider the following sequence of integers: 

$$1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, \dots$$

This sequence is known as the _Fibonacci sequence_ (named for its inventor, who [came up with the idea when modeling the breeding of rabbits](http://science.jrank.org/pages/2705/Fibonacci-Sequence-History.html)). The Fibonacci sequence has a distinctive pattern:

+ The first two sequence terms, let's call them $F_1$ and $F_2$, are both equal to $1$. 
+ Each of the third and subsequent sequence terms ($F_3, F_4, F_5, \dots$) is equal to the sum of the previous two terms. For example, $F_3 = F_2 + F_1$; $F_4 = F_3 + F_2$; and so on.

As a _recurrence relation_ we can write a _recursive definition_ of the Fibonacci sequence like this:

$$F_n = F_{n-1} + F_{n-2} \ (\text{for} \ n \geq 3) \qquad F_1 = 1, F_2 = 1$$

The Fibonacci sequence is inherently recursive because __each part (except the beginning) is built out of previously-computed parts__. Here is some Python code that finds $F_n$ for any positive integer $n$, using recursion: 

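```python
# Download this file using the icon in the upper-right of the screen 
# and put it in your Jupyter installation if you'd like to play with 
# this code. 
# Warning: It slows down a lot for larger values of n. 

def fib(n):
    """Return the n-th Fibonacci number F_n for a positive integer n."""
    if n == 1 or n == 2:      # The two starting values F_1 and F_2
        return 1
    else:                     # The recurrence F_n = F_{n-1} + F_{n-2}
        return fib(n-1) + fib(n-2)
```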
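The slowdown mentioned in the warning comes from recomputing the same terms over and over. As a side note that is not part of the original lesson, here is a minimal sketch of the same recursive definition with caching added. The name `fib_memo` is made up for this illustration; `lru_cache` is from Python's standard `functools` module.

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def fib_memo(n):
    """Return F_n for a positive integer n, remembering terms already computed."""
    if n == 1 or n == 2:          # Same starting values as before
        return 1
    return fib_memo(n - 1) + fib_memo(n - 2)   # Same recurrence as before

print([fib_memo(n) for n in range(1, 12)])
# [1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89]
```

The definition is still exactly the recurrence above; only the bookkeeping changed, so each term is computed once.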

The Fibonacci numbers have a lot of really interesting properties and patterns. In fact [there is an entire scholarly journal](http://www.fq.math.ca/) devoted solely to the study of the Fibonacci numbers! 

Let's suppose that I wanted to give you a quick oral exam to see whether you could _prove_ to me that you really understood how the Fibonacci sequence works. Your grade on this exam is dependent on whether you can convince me, beyond all doubt, that you know how to generate the elements of this sequence. And suppose I am really skeptical and hard to convince. 

One way that _wouldn't_ work is if you just recited the first few numbers in the sequence -- that is, you just said that the Fibonacci numbers are 1, 1, 2, 3, 5 "and so on". This doesn't  prove, conclusively, that you understand the Fibonacci numbers because _anybody_ could just memorize the first five elements of the sequence. This doesn't show me that you understand the essence of how this sequence is constructed. 

Another way that wouldn't work is if you recited a _longer_ list of numbers in the sequence -- that is, if you said that the Fibonacci numbers are 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89 "and so on". While this is harder to do than reciting the first five numbers in the sequence, it's still just memorization, and anybody with good memory could do it without having a clue of how the Fibonacci sequence works. 

In fact, _no amount of examples is enough_ to prove to me that you understand the sequence if all you are doing is giving examples. Because, how do I know that you understand the math, and not just that you have an excellent memory? 

A third way that wouldn't work would be for you to list _all_ the Fibonacci numbers. This won't work because there are infinitely many of them, and we'd both be dead before you really even got started. 

What _would_ demonstrate sufficient evidence that you understand the Fibonacci numbers? Well, what would have to happen is that you would __explain how ANY Fibonacci number regardless of its location in the sequence is built__. That is, proof that you understand the Fibonacci sequence would require not a demonstration of examples but an _explanation of why_ the examples are what they are, or perhaps you could _show me how_ to find any Fibonacci number. (Now this sounds like MTH 325, where the primary emphases are on _explaining why_ and _showing how_.) 

Here's how I, as an instructor, might structure this hypothetical exam in a way that would convince me utterly that you know how to generate this sequence. It would go in two parts:

1. First, I'd ask you to give me a list of the first three Fibonacci numbers. This would show me that you know how the sequence begins. 
2. Then, I'd pick a random positive integer, let's call it $k$, and give you a list of the first $k$ Fibonacci numbers; and then I'd ask you to explain how to get the _next_ one (that is, the $(k+1)$-st one). 

Why would this convince me, the ultimate skeptic? Well, the first part would show me that you know how to _begin_ listing the Fibonacci numbers --- how to start the listing process. And the second part, where the real meat of this argument resides, would show me that you're not just memorizing because it would show me that you could start _anywhere_ in the sequence and go from the previously-computed sequence to the _next element_. Putting those two parts together, I know you could list from memory the numbers $F_1$, $F_2$, and $F_3$; then using the second part I know you could compute $F_4$. Then using the second part I know you could compute $F_5$. Then using the second part I know you could compute $F_6$. And so on. You don't have to _actually list_ those numbers; you've _shown how to make the list_ from an arbitrary starting point, and that amounts to the same thing. 

Therefore given enough time, you could build any Fibonacci number in the list, and it doesn't depend on memorizing. 
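The two parts of the exam can be sketched in code. This is only an illustration; the function names `next_fibonacci` and `extend_to` are hypothetical and chosen for this sketch. Part 1 supplies the starting list, and Part 2 is a rule that turns _any_ list of the first $k$ terms into the $(k+1)$-st term.

```python
def next_fibonacci(known):
    """Part 2 of the exam: given the first k Fibonacci numbers (k >= 2),
    return the (k+1)-st one."""
    if len(known) < 2:
        raise ValueError("need at least the first two terms")
    return known[-1] + known[-2]

def extend_to(n):
    """Build the first n Fibonacci numbers from the two exam parts."""
    sequence = [1, 1, 2]                  # Part 1: how the sequence begins
    while len(sequence) < n:              # Part 2, applied again and again
        sequence.append(next_fibonacci(sequence))
    return sequence[:n]

print(extend_to(11))
# [1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89]
```

Nothing in `next_fibonacci` depends on where in the sequence we are, which is exactly why it convinces the skeptic.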

## Why induction is important for computer scientists 

The above example is typical of how you, as a computer scientist, will need to _reason about computing_ both now and in the future. For example, you will be called upon in your career to write code or design algorithms that claim to accomplish a certain task -- say, for example, quickly searching a database to find a record. You can write the code, and it may _appear_ to work based on a few examples that you run, but to make this code really workable in the real world there are tough questions you have to answer such as 

+ How do you know that your program _always_ produces the correct output _any time_ the correct input is put in? 
+ How do you know that your program _always_ terminates? 
+ How do you know that your program _always_ runs efficiently? 

Just like how merely reciting a list of Fibonacci numbers does not mean that you understand the Fibonacci numbers, merely producing a list of correct outputs -- even a massive one -- does not prove that your program _always_ works. How do we know that the _next_ example, the one you _didn't_ try, doesn't crash the program? Something more is needed in order to make that knowledge certain. That "something" is a __proof__, which was the subject of [the lesson for Learning Target P.1](https://goo.gl/YAlpsO).

Proofs that pertain to _recursive structures_ like the Fibonacci numbers --- in which the structure is built out of previously-computed or "smaller" versions of that structure --- have a special place in computer science, and almost always they involve the concept of __proof by mathematical induction__. Keep in mind: 

>__Mathematical induction is typically used to prove things about recursive structures.__

Before we get into mathematical induction, we need to review one piece of logic that plays a central role. 


## Predicates

In the [lesson for Learning Target P.1](https://goo.gl/YAlpsO) we reviewed the idea of a _proposition_:

>Definition: A __proposition__ is a complete, well-formed sentence that has a definite truth value of _True_ or _False_.

Not all logical expressions we choose to work with are propositions exactly. For example, consider this sentence: 

>The number $n$ is even. 

This is not a proposition because although it's a well-formed complete sentence, it does not have a definite truth value. Is $n$ even? We don't know, because we don't know what $n$ is. In fact this sentence is true for _some_ values of $n$ but false for others. The truth value depends on the value of $n$. The sentence contains a __variable__ ($n$) and since we lack information about that variable, we cannot conclude that the statement is true or that it's false. We say that the variable is _unquantified_. 

However, if we _quantified_ the variable --- that is, we gave complete information about the value or values of $n$ --- then we _could_ determine whether that statement above is true. Here's an example of where we quantify the variable:

>If $n = 25$, the number $n$ is even. 

Here, we've specified the scope of the values of $n$ --- we've "quantified" the variable. (In fact we are really just plugging in a single value for it.) The resulting statement is definitely false! The fact that we can say so, means that the quantified statement is now really a proposition. 

Here's another example of quantifying the variable: 

>If $n = F_{21}$ (the 21st Fibonacci number), then $n$ is even. 

This also completely specifies the scope of the variable. Is the statement now true or false? Well, we'd need to go figure out what $F_{21}$ equals. A [quick check](https://www.mathsisfun.com/numbers/fibonacci-sequence.html) says that $F_{21} = 10946$. That's even, so the statement in this case is true. Because we could state its truth value, it means the statement is a proposition now.

So the original statement is true for some $n$'s and false for others. That means the statement "The number $n$ is even", is sort of like a _function_: We plug in a value of $n$, and we get a _Boolean_ as the output --- either `True` or `False`. We could actually write code for this: 

```python
def my_statement(n):
    """The predicate "The number n is even", written as a function."""
    if n % 2 == 0:   # <-- This is the "mod" operator. 
        return True
    else:
        return False
```

Statements like "The number $n$ is even" --- which attain true/false values once a value of the variable is plugged in --- have a special name:

>Definition: A __predicate__ is a complete, well-formed sentence that contains one or more _variables_ and that takes on a definite True or False value once each variable has been assigned a value or range of values.

Another way of thinking about a predicate, is that it is a function from a set into the set {True, False}. That is, it's a function where you plug something in, and get a Boolean value out. 

Here are some examples of predicates other than the one above: 

+ __The number $\sqrt{x}$ is a rational number.__ The variable here is $x$. This statement is true for some values of $x$ (like $x = 9$ because $\sqrt{9} = 3$ which is rational) but false for others (like $x = 10$ because $\sqrt{10}$ is irrational). 
+ __The number $n!$ is greater than $2^n$.__ The variable here is $n$. This statement is true for some values of $n$ (example: $n = 5$, because $5! = 120$ and $2^5 = 32$) but not true for other values of $n$ (example: $n = 3$, because $3! = 6$ and $2^3 = 8$). 
+ __The Fibonacci number $F_{3n}$ is even.__ The variable here is $n$. This statement is true for some values of $n$; for example if $n=5$ then we would look at $F_{15}$, which is an even number (610). It's also true for other examples, like $n = 6$ (look at $F_{18}$) and $n=7$ (look at $F_{21}$). 
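Each of these three predicates can be written as a function that returns a Boolean, just like `my_statement`. The sketch below makes some assumptions that are not in the text: the function names are invented for illustration, and `sqrt_is_rational` only handles nonnegative _integer_ inputs, where $\sqrt{x}$ is rational exactly when $x$ is a perfect square.

```python
from math import factorial, isqrt

def fib(n):
    """Return F_n for a positive integer n (iterative, so it stays fast)."""
    a, b = 1, 1
    for _ in range(n - 1):
        a, b = b, a + b
    return a

def sqrt_is_rational(x):
    """Assumes x is a nonnegative integer: True exactly when x is a perfect square."""
    root = isqrt(x)               # integer part of the square root
    return root * root == x

def factorial_beats_power(n):
    """The predicate "n! is greater than 2^n"."""
    return factorial(n) > 2**n

def fib_3n_is_even(n):
    """The predicate "F_{3n} is even"."""
    return fib(3 * n) % 2 == 0

print(sqrt_is_rational(9), sqrt_is_rational(10))                 # True False
print(factorial_beats_power(5), factorial_beats_power(3))        # True False
print(fib_3n_is_even(5), fib_3n_is_even(6), fib_3n_is_even(7))   # True True True
```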

About that third predicate: In fact this predicate _appears_ to be true _for all_ positive integers $n$. We can make a simple table and record some of the results: 

| n  | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 
|:--:|:--:|:--:|:--:|:--:|:--:|:--:|:--:|:--:|
|$F_{3n}$ | $F_3 = 2$ | $F_6 = 8$ | $F_9 = 34$ | $F_{12} = 144$ | $F_{15} = 610$ | $F_{18} = 2584$ | $F_{21} = 10946$ | $F_{24} = 46368$ | 

Of course this is just a list of eight examples, and so it's not definitive, but it certainly does suggest that $F_{3n}$ is even _every time_ $n$ is a positive integer. 

We can phrase this as a conjecture (= a mathematically educated guess based on evidence): 

>__Conjecture:__ For all positive integers $n$, the Fibonacci number $F_{3n}$ is even.

Notice that __this is just the predicate, with a quantifier in front of it__. By quantifying the variable over a range of values (in this case, "all positive integers") we are transforming a predicate --- which does not have a definite truth value --- into a _proposition_ which _does_ have a definite truth value. So we can quantify variables in a predicate not only by plugging in specific numbers but also by specifying a range of variable values over which we think the predicate is true. 
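In code, quantifying a predicate over a range looks like wrapping it in Python's built-in `all` (or `any`): the result is a single Boolean rather than a function. The sketch below can only check a _finite_ range (here $1$ through $200$, an arbitrary choice), so it gives evidence for the conjecture and nothing more. The names `P` and `is_even` are illustrative.

```python
def fib(n):
    """Return F_n for a positive integer n."""
    a, b = 1, 1
    for _ in range(n - 1):
        a, b = b, a + b
    return a

def P(n):
    """The predicate "F_{3n} is even"."""
    return fib(3 * n) % 2 == 0

def is_even(n):
    """The predicate "n is even"."""
    return n % 2 == 0

# A predicate plus a quantifier over a range has one definite truth value.
print(all(P(n) for n in range(1, 201)))         # True
print(all(is_even(n) for n in range(1, 201)))   # False
print(any(is_even(n) for n in range(1, 201)))   # True
```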

But, is this conjecture actually true? If so, how do we explain why? Now we get to the heart of the lesson. 

## Proof by mathematical induction

__Mathematical induction__ is a technique for proving statements that 

1. Are about phenomena that are recursive in nature, and 
2. Can be phrased as a claim that a predicate is true over a range of values. 

These two items can be used as cues for deciding to prove a statement using mathematical induction. For example, "For all positive integers $n$, the Fibonacci number $F_{3n}$ is even" is a proposition that would be a good candidate for proof by mathematical induction because (1) we've seen that the Fibonacci numbers are recursive in nature, and (2) it's claiming that a predicate is true over a range of values. On the other hand, a proposition like this one:

>Proposition: For all positive integers $a,b$ and $c$, if $a$ divides $b$ and $b$ divides $c$, then $a$ divides $c$. 

...would _not_ be a likely candidate for proof by mathematical induction mainly because integer divisibility does not have any obvious connection to recursion. (However there _is_ a predicate involved here: the statement "if $a$ divides $b$ and $b$ divides $c$, then $a$ divides $c$".) 

In a situation where you are going to use a proof by mathematical induction, __the first order of business is to identify what the predicate is, and what the variable is__. What statement is being claimed to be true over the range of values given? If the variable is "$n$", we often use $P(n)$ to denote the predicate --- which is a way of using the notion that a predicate is actually a function that sends values of $n$ into the set {True, False}. Here are some examples: 

| Proposition | Variable | The predicate $P(n)$ | 
| :---- | :----: | :--- | 
| For all positive integers $n$, the Fibonacci number $F_{3n}$ is even. | $n$ | $P(n) = $ "The Fibonacci number $F_{3n}$ is even" | 
| For each natural number $a \geq 4$, $a! > 2^a$. | $a$ | $P(a)=$ "$a! > 2^a$" | 
| For all positive integers $k$, $1+2+3+\cdots+k = \frac{k(k+1)}{2}$. | $k$ | $P(k) =$ " $1+2+3+\cdots+k = \frac{k(k+1)}{2}$" | 

Looking at the third column note that 

>__The predicate is not a formula. The predicate is a _statement_, either an English sentence or an equation or an inequality__. 

For example in the third example above it would be wrong to say that $P(k)$ is the expression $1+2+3+\cdots+k$. Instead, $P(k)$ is the _statement_ that the left side ($1+2+3+\cdots+k$) equals the proposed right side ($\frac{k(k+1)}{2}$). 

If it seems weird that something called $P(k)$ would equal not a number or an expression, but rather a _statement_, consider the following code snippet, which has something to do with the second example above:

```python
from math import factorial

def P(a):
    """The predicate "a! > 2^a": returns True or False, not a number."""
    return (factorial(a) > 2**a)
```

What happens when we plug in integer values for $a$? What does $P(a)$ return? Run the code block below to find out: 

```python
for a in range(1, 10): 
    print(P(a))
```

```
False
False
False
True
True
True
True
True
True
```


So $P(a)$ is not a formula per se. It is neither $a!$ nor $2^a$. Instead, $P(a)$ is a _statement about whether $a!$ is greater than $2^a$._ It returns `True` if that inequality holds and `False` if it doesn't. This is the case for every predicate. 
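The same pattern applies to the third example in the table. Here is a minimal sketch of that predicate as a function, named `P_sum` purely for illustration: it is not the expression $1+2+3+\cdots+k$ but the _comparison_ of that expression with $\frac{k(k+1)}{2}$.

```python
def P_sum(k):
    """The predicate "1 + 2 + ... + k equals k(k+1)/2" for a positive integer k."""
    left_side = sum(range(1, k + 1))     # 1 + 2 + ... + k, added up directly
    right_side = k * (k + 1) // 2        # the proposed closed form
    return left_side == right_side       # a Boolean, not a number

print(sum(range(1, 11)))                      # 55
print(P_sum(10))                              # True
print(all(P_sum(k) for k in range(1, 101)))   # True
```

As before, checking the first hundred values is evidence, not a proof.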

Now let's talk about the __framework__ for a proof by mathematical induction. Induction proofs are nice to work with because they have a very well-defined framework.

A proof by mathematical induction has __three distinct steps__: 

1. __Base case:__ Show via an example that the predicate is true in the smallest possible case. 
2. __Induction hypothesis:__ Assume that the predicate is true for some arbitrary positive integer, say $k$. 
3. __Inductive step:__ Having assumed the induction hypothesis, prove that the predicate is true for the _next_ positive integer, that is $k+1$. 

In a minute, we'll consider _why_ an induction proof structured along these lines actually works. That is, why should we consider an argument with this structure to be convincing and correct? But first, let's consider an example of a __framework__ for an induction proof. This framework would consist of the following: 

1. __A statement of what needs to be shown for the base case.__
2. __A clear, unambiguous, fleshed-out statement of the induction hypothesis.__ 
3. __A clear, unambiguous, fleshed-out statement of the inductive step.__ 

For example, here is a completed framework for a proof by induction for the proposition "For every natural number $n$, the Fibonacci number $F_{3n}$ is even". 

__Step 0: Identify the predicate.__ Before the framework really begins, we need to identify the predicate and the name of the variable. As we saw above, the predicate is "the Fibonacci number $F_{3n}$ is even" (basically it's the proposition minus the quantifier) and the variable is called $n$. 

__Step 1: Write out the base case.__ We need to demonstrate that the proposition is true in the smallest possible case. Well, the smallest natural number is $n=1$. So we would need to show that 

>_Base case_: Show that $F_3$ is even. 

(Question: Why doesn't this say "Show that $F_1$ is even"?) 

__Step 2: State the induction hypothesis.__ This is merely restating the predicate for an arbitrary natural number, along with making it clear that we are _assuming_ this: 

>_Induction hypothesis_: Assume that for some natural number $k$, the Fibonacci number $F_{3k}$ is even. 

(Questions: Why does it say "for some natural number" (singular) instead of "for all natural numbers" (plural)? Why are we using natural numbers at all? And why does the last part say $F_{3k}$ instead of $F_{3n}$?) 

__Step 3: State the inductive step.__ This is merely stating the predicate for the next value of the variable, along with making it clear that we are going to _prove_ this: 

>_Inductive step_: Prove that the Fibonacci number $F_{3(k+1)}$ is even. 

(Question: Why does this say $F_{3(k+1)}$ and not $F_{3k+1}$? Note the difference in parentheses.) 


Please note the following important fact: 

>__Once you identify the predicate in the proposition, writing a framework for an induction proof is stupidly easy.__

This is because the base case, inductive hypothesis, and inductive step _are all just instances of the predicate_. On the flip side, if you attempt an induction proof without taking the time to clearly identify the predicate, it often ends badly. 

### Aside: Why does induction work? 

Remember the oral exam from earlier, where I wanted to be convinced you knew how to generate the entire Fibonacci sequence without actually trying to generate it all? The key components were 

- Your ability to start the sequence 
- Your ability to go from an arbitrary point in the sequence to the next step 

Those two parts were sufficient to prove to me that you know the Fibonacci sequence. This is how an induction proof works. The base case shows you can start the process. Then, if you assume that you have continued the process up to a certain point, you then prove you can get to the next point. Those two things put together mean you can, in theory, demonstrate the truth of the proposition _for all_ values of the variable that are claimed to produce truth. 

For more information, see the videos that are included in this assignment separately. 

## A completed proof 

Here is a complete proof of the proposition that "For every natural number $n$, the Fibonacci number $F_{3n}$ is even". See if you can identify the waypoints that are provided by the framework, and question any step you don't understand. 


>__Proof:__ For the base case, we will show that the predicate is true when $n=1$. That is, we show that $F_3$ is even. By the definition of the Fibonacci sequence, $F_3 = F_2 + F_1 = 1 + 1 = 2$, and this is even, so the base case is established. 

>Now for the inductive hypothesis, assume that $F_{3k}$ is even for some positive integer $k$. We want to show that $F_{3(k+1)}$ is even. That is, we want to show that $F_{3k+3}$ is even. By the definition of the Fibonacci sequence, $F_{3k+3}$ is the sum of the previous two Fibonacci numbers: 
$$F_{3k+3} = F_{3k+2} + F_{3k+1}$$
Using the same logic, we can split up $F_{3k+2}$:
$$F_{3k+3} = (F_{3k+1} + F_{3k}) + F_{3k+1}$$
This equals $2 F_{3k+1} + F_{3k}$. By the inductive hypothesis we have assumed that $F_{3k}$ is even. Note that $2 F_{3k+1}$ is even because it is a multiple of 2. And we know* that the sum of two even integers is another even integer. Therefore $F_{3k+3}$ is even, which is what we wanted to show. 

$\ast$ _In the context of MTH 225 and MTH 325, we usually prove that the sum of two even integers is an even integer as a basic exercise; and it's OK to assume the results of such exercises in later proofs, just like you reuse code from earlier projects in new projects._ 
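The algebraic heart of the proof is the identity $F_{3k+3} = 2F_{3k+1} + F_{3k}$. As a sanity check (not a substitute for the proof), here is a small sketch that compares both sides for the first several values of $k$. The helper `fib` is an iterative version written for this illustration.

```python
def fib(n):
    """Return F_n for a positive integer n."""
    a, b = 1, 1
    for _ in range(n - 1):
        a, b = b, a + b
    return a

for k in range(1, 8):
    left_side = fib(3 * k + 3)                    # F_{3(k+1)}
    right_side = 2 * fib(3 * k + 1) + fib(3 * k)  # 2 F_{3k+1} + F_{3k}
    # Print k, both sides, whether they agree, and whether F_{3k+3} is even.
    print(k, left_side, right_side, left_side == right_side, left_side % 2 == 0)
```

Every row should show the two sides agreeing, which is what the two applications of the recurrence in the proof guarantee.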

## Two variations on mathematical induction

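All of the examples you saw above involve propositions that are claimed to be true over some subset of the natural numbers, and in each one we could get from step $k$ to step $k+1$ directly. But sometimes we encounter propositions that seem amenable to induction where this basic technique doesn't quite fit, either because knowing the predicate at $k$ alone isn't enough to reach $k+1$, or because the objects involved aren't numbers at all. There are two variations on the basic induction theme that address these situations.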

### Strong Induction

The flow of an induction proof went like this: 

1. __Base case:__ Show that the predicate is true in the smallest possible case. 
2. __Induction hypothesis:__ Assume that the predicate is true for some arbitrary positive integer, say $k$. 
3. __Inductive step:__ Having assumed the induction hypothesis, prove that the predicate is true for the _next_ positive integer, that is $k+1$. 

Sometimes Step 3 can be an issue because it involves going from an arbitrary step to __the next step__ (from $k$ to $k+1$), and this sometimes gets us into trouble. For example consider this proposition: 

>__Proposition:__ For all positive integers $n \geq 2$, the number $n$ can be written as the product of one or more prime numbers. 

Remember that a _prime number_ is an integer greater than $1$ whose only positive divisors are $1$ and itself, like $7$ or $23$; on the other hand numbers like $8$ and $24$ are not prime because they can be factored into products of smaller numbers ($8 = 2 \times 2 \times 2$ and $24 = 8 \times 3$).

Everybody "knows" that positive integers can be factored into a product of prime numbers. For example $1998 = 2 \times 3 \times 3 \times 3 \times 37$. You learn this in elementary school, but you've probably never _proved_ it. We can actually prove this with induction, but we have to do things a little differently. 

First let's set up the framework. The predicate $P(n)$ in this case is the statement "the number $n$ can be written as the product of one or more prime numbers". We believe this predicate is true for all integers $n \geq 2$. The framework would look like: 

1. (_Base case_) Show that the number $2$ can be written as a product of prime numbers. 
2. (_Inductive hypothesis_) Assume that for some positive integer $k \geq 2$, the number $k$ can be written as a product of prime numbers. 
3. (_Inductive step_) Prove that $k+1$ can be written as a product of prime numbers. 

The base case is easy because $2$ is a prime number itself, and therefore a "product" of prime numbers. Where we run into difficulty here is going from the inductive hypothesis to the inductive step. If we assume $k$ can be factored into a product of primes, how on Earth are we to conclude anything about the factorization of $k+1$? __There is no natural connection between the prime factorization of $k$ and $k+1$.__ To see this better, look at a few examples: 

| k | Factorization of $k$ | Factorization of $k+1$ | 
|:--: | :---- | :------- | 
| 3 | $3$ (prime) | $2 \times 2$ | 
| 10 | $2 \times 5$ | $11$ (prime) | 
| 2000 | $2^4 \times 5^3$ |  $3 \times 23 \times 29$ | 

Again, __there's no obvious connection between the truth of $P(k)$ and the truth of $P(k+1)$.__ So induction seems not to fit. 
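To explore more rows of this table yourself, here is a minimal sketch of a trial-division factorizer. The function name `prime_factors` is an illustrative choice, and trial division is used only because it is short, not because it is efficient.

```python
def prime_factors(n):
    """Return the prime factors of an integer n >= 2, with repetition,
    in increasing order (found by trial division)."""
    factors = []
    divisor = 2
    while divisor * divisor <= n:
        while n % divisor == 0:       # divide out each prime as often as possible
            factors.append(divisor)
            n //= divisor
        divisor += 1
    if n > 1:                         # whatever is left over is itself prime
        factors.append(n)
    return factors

for k in (3, 10, 2000):
    print(k, prime_factors(k), prime_factors(k + 1))
# 3 [3] [2, 2]
# 10 [2, 5] [11]
# 2000 [2, 2, 2, 2, 5, 5, 5] [3, 23, 29]
```

Try other values of `k`: the factors of $k$ and of $k+1$ never overlap, which is why knowing one tells you so little about the other.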

The variation we introduce here is the following change to the framework, and the change happens in the induction hypothesis: 

1. (_Base case_) Show that the number $2$ can be written as a product of prime numbers. 
2. (_New Inductive hypothesis_) Assume that for some positive integer $k \geq 2$, __every integer $2, 3, 4, \dots, k$__ can be written as a product of prime numbers. 
3. (_Inductive step_) Prove that $k+1$ can be written as a product of prime numbers. 

Do you see the difference? Instead of assuming that the predicate $P(n)$ is true for _just one_ integer $k$, we are assuming that $P(n)$ is true _for all values less than or equal to $k$_. We assume the truth of the predicate for a _range_ of values ending at $k$, not just for the single value $k$. 

This form of induction has a name: 

>Definition: __Strong induction__ is a method of proof which shares the same framework as mathematical induction except in the inductive hypothesis, we assume that the predicate is true for all values of the variable less than or equal to an arbitrary value $k$. 

To avoid confusion, the mathematical induction we learned above is sometimes called "weak induction" to distinguish it from strong induction. 

We gave the framework for a proof by strong induction earlier for our proposition about prime factorizations. Here is a completed proof. Can you spot where strong induction was used? 

>__Proof:__ For the base case, we need to show that $2$ is a product of prime numbers. Since $2$ is itself prime, it is therefore a product of primes. 

>Now assume that for some integer $k \geq 2$, every integer $2,3,4,\dots,k$ can be written as a product of prime numbers. We need to show that $k+1$ can be written as a product of prime numbers. We proceed by considering two possible cases that cover all possibilities: Either $k+1$ is prime itself, or it isn't prime. 

>Case 1: If $k+1$ is prime itself, then it is a product of primes, and we're done. 

>Case 2: If $k+1$ is not prime itself, then it can be factored into a product of two integers (which are not necessarily prime), say $k+1 = ab$ with $a,b > 1$. Now, both $a$ and $b$ have to be less than $k+1$ because they both divide $k+1$ and are bigger than 1. This means that both $a$ and $b$ are less than or equal to $k$. The induction hypothesis now states that both $a$ and $b$ are a product of prime numbers, say $a = p_1p_2p_3\cdots p_i$ and $b = q_1q_2q_3\cdots q_j$. Therefore $ab$ is a product of prime numbers, $ab = p_1p_2p_3\cdots p_i \cdot q_1q_2q_3\cdots q_j$, and that's what we wanted to show. 

By using strong induction, we didn't need to start at $k+1$ and go back _just one step_ to $k$. We could go back as many steps as we needed and use the induction hypothesis. 
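The proof translates almost line for line into a recursive function, which is one way to see why strong induction is the right tool. In the sketch below (function names are hypothetical), the recursive calls are made on $a$ and $b$, which can be _any_ smaller integers from $2$ up to $k$, not just the immediately preceding one.

```python
def smallest_proper_divisor(n):
    """Return the smallest d with 1 < d < n that divides n, or None if n is prime."""
    d = 2
    while d * d <= n:
        if n % d == 0:
            return d
        d += 1
    return None

def factor(n):
    """Write an integer n >= 2 as a list of primes whose product is n."""
    a = smallest_proper_divisor(n)
    if a is None:                  # Case 1: n is prime, so it is its own factorization
        return [n]
    b = n // a                     # Case 2: n = a * b with 1 < a, b < n
    return factor(a) + factor(b)   # "induction hypothesis" applied to a and to b

print(factor(1998))
# [2, 3, 3, 3, 37]
```

The recursion terminates for the same reason the proof works: every call is on a strictly smaller integer that is still at least $2$.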


### Structural Induction

Yet another form of induction is useful when the thing we are working with is _something other than numbers_, like a set or a graph or a tree that has a recursive flavor but is not numerical. Here is an example. 

Consider the set $S$ that is defined recursively as follows: 

+ __Basis step:__ The set $S$ contains the strings `a` and `b`.
+ __Recursion rule:__ If $\lambda$ is a string in $S$, then $a + \lambda + b$ is also in the set (where `+` means concatenation of strings). 

So we are given that `a` and `b` belong to $S$. By the recursion rule this also means that the following strings are in $S$: 

+ `aab`, because I can take `a` (which is in $S$ by the basis step) and concatenate `a` on the left and `b` on the right. 
+ `abb`, because I can take `b` (which is also in $S$ by the basis step) and concatenate `a` on the left and `b` on the right. 
+ `aaabb` because I can take `aab` from earlier and concatenate `a` on the left and `b` on the right. 
+ `aabbb` for the same reason except I use `abb` from the second point above. 
+ Also the strings `aaaabbb`, `aaabbbb`, `aaaaabbbb`, `aaaabbbbb`, and so on. 
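The recursive definition of $S$ can be run directly. Here is a minimal sketch, with the hypothetical name `generate_S`, that starts from the basis step and applies the recursion rule a fixed number of times. It can only ever list a finite piece of $S$.

```python
def generate_S(rounds):
    """Return the strings of S that need at most `rounds` applications
    of the recursion rule, in the order they are built."""
    current = ["a", "b"]                  # basis step
    everything = list(current)
    for _ in range(rounds):
        # recursion rule: wrap each newest string as a + (string) + b
        current = ["a" + s + "b" for s in current]
        everything.extend(current)
    return everything

print(generate_S(2))
# ['a', 'b', 'aab', 'abb', 'aaabb', 'aabbb']
```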

This set is apparently infinite and note that it has absolutely nothing to do with integers. Here's something that appears to be true about this recursively-defined set: 

>__Conjecture:__ For each string $x \in S$, the length of $x$ is odd. 

Because this is a conjecture about a recursively defined structure, an appropriate proof technique would seem to be induction. It's even simple to identify the predicate: __"The length of $x$ is odd."__ But, both weak and strong induction have to do with predicates over the _natural numbers_ --- and that's not the case here. So what do we do? 

What we do, is invent a new form of induction. This is a technique called __structural induction__ because it is induction on _structures_ instead of numbers. 

>Definition: __Structural induction__ is a proof technique that applies to recursively-defined structures whose framework consists of the following: 
>1. (_Base case_) Show that the predicate is true for all objects specified in the basis step of the recursive definition. 
>2. (_Inductive hypothesis_) Assume that the predicate is true for some arbitrary object that has already been constructed. 
>3. (_Inductive step_) Prove that when the recursive step of the definition is applied to the object from the inductive hypothesis, the predicate is true for the result. 

This is the same philosophy as strong and weak induction: (1) Establish that the beginning of the process works. Then (2) assume that the process has worked up to some arbitrary point. Then (3) prove that the process continues to work if you generate a new step. Structural induction decouples this idea from numbers, because we don't always work with numbers. 

Here's a structural induction framework, along with a completed proof, for the proposition above about the odd length of strings in the set $S$. First of all, remember that the predicate $P(x)$ is the statement "The length of $x$ is odd". 

+ __Base case:__ Show that $P(x)$ is true for all the objects specified in the basis step of the definition of $S$. Those elements are the strings `a` and `b`, so the base case amounts to __showing that `a` and `b` have odd length__. (This is easy, right?) 
+ __Inductive hypothesis:__ Assume that $P(x)$ is true for some arbitrary object $x$ that has already been constructed. So in other words, suppose that the length of $x$ is odd, where $x$ is some element of $S$ that has already been constructed. 
+ __Inductive step:__ Prove that when we apply the recursive step of the definition to $x$, we get a new string whose length is odd. 

From here, a completed proof is shockingly simple: 

>__Proof:__ For the base case, we need to show that `a` and `b` have odd length. Well, their length is 1, so this is obvious. 
>
>Now suppose $x \in S$ has odd length, say $2k+1$ where $k$ is an integer. We want to show that applying the recursive step to $x$ results in a new string of odd length. Applying the recursive step to $x$ results in the string $a + x + b$ (where `+` means concatenation). Since `a` and `b` have length 1, the length of the resulting string is $1 + (2k+1) + 1$ (where this time `+` means addition) and that equals $2k+3$, which is also odd. This is what we wanted to prove. 

(Review: An integer is odd if it can be written as $2k+1$ for some integer $k$. The number $2k+3$ is also odd because it can be written as $(2k+2) + 1$.) 
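The two halves of this proof can be spot-checked in code. In this sketch (names are illustrative), `apply_rule` is the recursive step and `has_odd_length` is the predicate $P(x)$. The loop confirms on a handful of strings what the proof shows for all of them: the rule adds exactly 2 to the length, so oddness is preserved.

```python
def apply_rule(x):
    """The recursion rule for S: concatenate a on the left and b on the right."""
    return "a" + x + "b"

def has_odd_length(x):
    """The predicate P(x): "the length of x is odd"."""
    return len(x) % 2 == 1

for start in ("a", "b"):                       # the objects from the basis step
    x = start
    assert has_odd_length(x)                   # base case
    for _ in range(10):
        new_string = apply_rule(x)
        assert len(new_string) == len(x) + 2   # 1 + (2k+1) + 1 = 2k+3
        assert has_odd_length(new_string)      # inductive step, on this example
        x = new_string

print("All checks passed.")
```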


## Recap

+ __(Weak) induction__ is a proof technique that works well for proving propositions that claim that a certain predicate is true over a range of positive integers.
+ __Induction is best used when there is a naturally recursive nature__ to the thing we are proving something about. 
+ The __framework__ for a (weak) induction proof is: (1) State the base case, (2) State the inductive hypothesis, and (3) State the inductive step. 
+ __You have to identify the predicate being addressed before setting up the framework__, or it becomes very hard to do anything right. 
+ __Strong induction__ is used when induction seems appropriate, but proving $P(k+1)$ from just $P(k)$ is difficult. In the inductive hypothesis we assume not just that $P(k)$ is true but that $P(i)$ is true for all $i \leq k$. 
+ __Structural induction__ is used when proving things about recursively-defined _structures_ rather than proving things about numbers --- things like sets, graphs, and trees.  


### Postscript: An anecdote about my nephew

I have a nephew who recently completed his M.S. degree in Computer Science at a major American university. He's a smart guy but really struggled in his first year because _his undergrad courses never dealt much with proof by induction_. I asked him as I was putting the course together what he wished he had known better before starting graduate school and he gave me three answers: (1) induction, (2) induction, and (3) induction. 

Mathematical induction is the primary method of proof used by computer scientists because it is the flip side of recursion. Did you notice in the induction proof above that we used a recursive step to revert the proposition back to a simpler version of itself? This natural relationship between induction and recursion is one of the main unifying themes of the entire MTH 325 course. We will use induction ~~repeatedly~~ incessantly throughout the course to prove useful properties about relations, graphs, and trees -- anywhere recursive structures show up. Which is a lot. 



### Activities

1. Consider the proposition: For all positive integers $n$, 
$$F_2 + F_4 + F_6 + \cdots + F_{2n} = F_{2n+1} - 1$$
There are three text areas at the submission form: one for you to state clearly what you would do to establish the base case, one to state the induction hypothesis and another to state what you would prove in the inductive step. Fill in the blanks appropriately. Make sure in the inductive hypothesis to give the _quantifier_ that is required. Notation note: Enter your subscripts in LaTeX form -- that is, `F_{2n}` for $F_{2n}$.

2. Consider the proposition: __For all positive integers $n$, a set that has $n$ elements has $2^n$ subsets.__ There are three text areas at the submission form: one for you to state clearly what you would do to establish the base case, one to state the induction hypothesis and another to state what you would prove in the inductive step. Fill in the blanks appropriately. Make sure in the inductive hypothesis to give the _quantifier_ that is required. 

