# Lesson: Identifying and Analyzing Proofs

__MTH 325: Discrete Structures for CS 2__, Winter 2017

## Overview 

In this lesson we discuss: 

+ How to read a proof, identify the technique being used, and examine its overall logical flow; and 
+ How to perform a _critical analysis_ of a proof to test for correctness. 

This lesson addresses the following Learning Targets: 

+ **P.3**: I can identify the parts of a proof, including the technique used and the assumptions being made.
+ **P.4**: I can perform a critical analysis of a written proof and provide a detailed explanation of the steps used in the proof.


## Recap of proof techniques

So far, we've learned four different kinds of proof techniques. The first three are generally used for proving propositions that involve conditional statements: _direct proof_, _proof by contrapositive_, and _proof by contradiction_. If we are proving the conditional statement $P \rightarrow Q$, then the framework of assumptions and conclusions for each of these looks like this: 

| Proof method | Assumptions made | Conclusions to draw | 
|:-----------  | :--------------- | :------------------ |
| Direct       | Assume $P$       | Prove $Q$           |
| Contrapositive | Assume $\neg Q$ | Prove $\neg P$     |
| Contradiction | Assume $P$ and $\neg Q$ | Arrive at a contradiction | 
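
The three frameworks in the table prove the same statement, and a brute-force truth table makes this easy to see. The sketch below is only an illustration: the helper names `implies` and `frameworks_agree` are made up for this example, and checking four rows of a truth table is a sanity check on the frameworks, not a proof technique in itself.

```python
from itertools import product


def implies(p: bool, q: bool) -> bool:
    """Truth value of the conditional p -> q."""
    return (not p) or q


def frameworks_agree() -> bool:
    """Check every truth assignment of P and Q (hypothetical helper)."""
    for p, q in product([True, False], repeat=2):
        direct = implies(p, q)                  # assume P, prove Q
        contrapositive = implies(not q, not p)  # assume not Q, prove not P
        # Contradiction: assuming "P and not Q" must turn out to be impossible.
        contradiction = not (p and not q)
        if not (direct == contrapositive == contradiction):
            return False
    return True


print(frameworks_agree())
```

All three columns match in every row, which is why a proposition proved with one framework could, in principle, be proved with either of the others.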

The fourth method is mathematical induction, which is often used not for conditional statements but for propositions involving a claim that a certain predicate (say, $P$) is true over a certain range of values of the variable (say, $x$). Induction itself can be broken down into three types: _weak_ induction, _strong_ induction, and _structural_ induction. No matter what the type, induction proofs always have the same framework of assumptions and conclusions: 

1. Establish the base case. 
2. Assume the induction hypothesis. 
3. Prove the inductive step. 

The specifics of those three pieces of the framework depend on the type. Here we suppose the predicate is called $P$. 

| Type of induction | Base case | Induction Hypothesis | Inductive Step | 
|:----------------  | :-------- | :------------------- | :------------- |
| Weak              | Show  $P$ is true in smallest case | Assume $P(k)$ is true for some $k \in \mathbb{N}$ | Prove $P(k+1)$ is true | 
| Strong              | Show  $P$ is true in smallest case | Assume $P(1), P(2), \dots, P(k)$ are all true for some $k \in \mathbb{N}$ | Prove $P(k+1)$ is true | 
| Structural              | Show  $P$ is true for each object in the basis step | Assume $P$ is true for some already-constructed object | Prove $P$ is true if that object undergoes the recursive step| 

Note that weak and strong induction are used for proofs of predicates whose variables are natural numbers; structural induction is specifically for proofs about things that are _not_ natural numbers, but rather recursively-defined objects like sets, strings, graphs, trees, etc. 
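
To make "recursively-defined object" concrete, here is a small sketch of one such object and a predicate you might prove about it by structural induction. Everything here is an assumption made for illustration: the `Tree` class, the `LEAF` constant, and the `join` function model a full binary tree (a single leaf in the basis step, two existing trees hung under a new root in the recursive step), and the predicate is "the number of leaves is one more than the number of internal nodes".

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Tree:
    """A full binary tree: either a single leaf or a root with two subtrees."""
    left: Optional["Tree"] = None
    right: Optional["Tree"] = None


LEAF = Tree()  # basis step: a single node is a full binary tree


def join(a: Tree, b: Tree) -> Tree:
    """Recursive step: hang two already-constructed trees under a new root."""
    return Tree(a, b)


def count_leaves(t: Tree) -> int:
    return 1 if t.left is None else count_leaves(t.left) + count_leaves(t.right)


def count_internal(t: Tree) -> int:
    return 0 if t.left is None else 1 + count_internal(t.left) + count_internal(t.right)


def property_holds(t: Tree) -> bool:
    """The predicate P: leaves = internal nodes + 1."""
    return count_leaves(t) == count_internal(t) + 1


small = join(LEAF, LEAF)
bigger = join(small, join(small, LEAF))
print(property_holds(LEAF), property_holds(small), property_holds(bigger))
```

Checking a few trees like this is not a proof. A structural induction proof would show the predicate holds for `LEAF` and then show that it survives every application of `join`.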

## Who cares about frameworks? 

We care about frameworks, for two reasons: 

1. Writing a proof that is correct, complete, and convincing is complex and hard. As with all complex forms of communication, having a detailed outline (a framework) helps you get your thoughts across. 
2. A proof that has been written using a framework is easier for the reader to understand. 

It's that second point that we're focusing on in this lesson: _Reading_ a proof and x-raying it to find its framework, and then using that framework to _critically analyze_ the proof. 

## X-raying a proof to find the framework

Here is a written proof. Skim it, and see if you can (1) identify what the author is assuming, (2) determine what the author was trying to prove, and then (3) tell what the method was that the author was using. 

Notice two things before you do this. First, we are _not_ stating the proposition itself; if this proof has a good framework, you should be able to tell what the proposition was, just by the framework. Second, you may not understand the words in the proof but this doesn't prevent you from finding the framework. 

>**Proof:** Suppose that $G$ is a 2-colorable graph whose nodes are colored either "red" or "blue". Since there are no edges between nodes with the same color, any cycle in $G$ must alternate between red nodes and blue nodes. In order to complete the cycle, therefore, the number of nodes in the cycle that are red must be the same as the number that are blue. Therefore the cycle has even length.

Again: **before reading further, answer the three questions above. What was the author assuming? What was the author proving? And what method does the proof use?**

---

Here are the answers: 

1. The assumptions in this proof are clearly laid out because they follow the word "suppose" (which is a synonym for "assume"). The author is assuming here that $G$ is a 2-colorable graph (whatever that means) and the nodes (whatever those are) are colored red or blue. (The second part of this, that the nodes are colored red or blue, is not actually an assumption --- this is just an additional step of assigning actual colors to the nodes. Really the only assumption being made is that $G$ is a 2-colorable graph.) 
2. A bunch of words ensues, but the conclusion is clearly stated at the end: The cycle (whatever that is) has even length. 
3. OK, so what method was used? We can rule some things out first. This was clearly _not_ an induction proof because there was no base case, no inductive hypothesis, no inductive step --- no recursion. This was also clearly not a contradiction proof, because the argument doesn't seem to arrive at a contradiction. It arrives at some kind of _conclusion_ but there is no apparent contradiction of a known fact. That leaves either direct proof or contrapositive. 

In fact it could be either direct proof or contrapositive because these are really the same --- proof by contrapositive _is_ direct proof, just using the contrapositive of the statement being proved. So this could either be: 

1. Direct proof of the statement "If $G$ is a 2-colorable graph, then all cycles in $G$ have even length." Or, 
2. Contrapositive proof of the statement "If $G$ has a cycle of odd length, then $G$ is not 2-colorable."

Both of those are "right answers" but the first one is simpler. 

This is actually a proposition in graph theory that we are going to visit later in the course when we learn Learning Target G.6, which involves the powerful idea of a _vertex coloring_ of a graph. 
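
If you'd like a preview of what the proposition says, the following sketch tries to 2-color a graph so that no edge joins two nodes of the same color. The function names `two_coloring` and `cycle_graph`, and the choice to store a graph as a dictionary of neighbor lists, are assumptions made for this illustration and not the course's official notation.

```python
from collections import deque


def two_coloring(adjacency):
    """Try to color every node "red" or "blue" so that no edge joins
    two nodes of the same color. Return the coloring, or None if
    no such coloring exists."""
    color = {}
    for start in adjacency:
        if start in color:
            continue
        color[start] = "red"
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for neighbor in adjacency[node]:
                if neighbor not in color:
                    # Give the neighbor the opposite color.
                    color[neighbor] = "blue" if color[node] == "red" else "red"
                    queue.append(neighbor)
                elif color[neighbor] == color[node]:
                    return None  # an edge joins two same-colored nodes
    return color


def cycle_graph(n):
    """A single cycle on the nodes 0, 1, ..., n-1 (assumes n >= 3)."""
    return {i: [(i - 1) % n, (i + 1) % n] for i in range(n)}


print(two_coloring(cycle_graph(4)) is not None)  # even cycle
print(two_coloring(cycle_graph(5)) is not None)  # odd cycle
```

The cycle of length 4 can be 2-colored and the cycle of length 5 cannot, which matches the contrapositive form of the proposition: a graph with an odd cycle is not 2-colorable.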

---

Here's another example. Again, see if you can find the framework of the proof and tell what the method was that the author was using. This time it will be a bit more helpful to state the proposition going along with it. 

>**Proposition:** For all natural numbers $n$, $1 + 2 + 3 + \cdots + n = \dfrac{n(n+1)}{2}$. 

>**Proof:** For $n=1$, the left hand side of the statement is just $1$; and the right hand side is $\frac{1(1+1)}{2}$ which is also $1$. Therefore the proposition holds when $n=1$. 
>
>Now assume that for some $k \in \mathbb{N}$, 
$$1 + 2 + 3 + \cdots + k = \dfrac{k(k+1)}{2}$$
>Begin with the expression $1 + 2 + 3 + \cdots + k + (k+1)$. We have: 
$$
\begin{align}
1 + 2 + 3 + \cdots + k + (k+1) &= (1 + 2 + 3 + \cdots + k) + (k+1) \\
&= \frac{k(k+1)}{2} + (k+1) \\
&= \frac{k(k+1)}{2} + \frac{2(k+1)}{2} \\
&= \frac{k^2 + k + 2k + 2}{2} \\
&= \frac{k^2 + 3k + 2}{2} \\
&= \frac{(k+1)(k+2)}{2}
\end{align}
$$
>Therefore we see that $1 + 2 + 3 + \cdots + k + (k+1) = \dfrac{(k+1)((k+1) + 1)}{2}$ so the proposition is proven. 

What technique was the author using here and what was the framework? 

----

It's pretty clear that the proof above is an induction proof. Induction proofs have an unmistakable framework of assumptions and conclusions: there's a base case, an inductive hypothesis, and an inductive step. Another hallmark of an induction proof is that _recursion always shows up somewhere_. Here, it was in the second line in the big equation stack above where we replaced $1+2+\cdots+k$ with $\frac{k(k+1)}{2}$. 

What kind of induction is this? It has to do with positive integers, so it's not structural. To distinguish between _weak_ versus _strong_ induction, we need to look at the induction hypothesis: Was something assumed _just for a single value_ or _for a range of values_? It was just a single value here, so this is _weak induction_. 
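
The link between weak induction and recursion can be sketched in code. The function names `triangular` and `closed_form` below are made up for this example. The recursive function has the same shape as the proof: a base case for $n = 1$, and a step that builds the answer for $n$ from the answer for $n - 1$.

```python
def triangular(n: int) -> int:
    """Compute 1 + 2 + ... + n recursively (assumes n >= 1)."""
    if n == 1:
        return 1                    # mirrors the base case
    return triangular(n - 1) + n    # mirrors the inductive step


def closed_form(n: int) -> int:
    """The formula from the proposition, n(n+1)/2."""
    return n * (n + 1) // 2


# Spot-check the proposition for the first few natural numbers.
print(all(triangular(n) == closed_form(n) for n in range(1, 51)))
```

A check like this only covers the values of $n$ that were actually tested. The induction proof is what covers _all_ natural numbers.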

Being able to identify the framework and method of a written proof helps you to understand the proof itself, but it's no guarantee that you will understand it on first reading. We have to read proofs carefully, using the framework as a guide or a map, and __question everything we see__ to ensure that each step in the argument is clear, complete, and convincing. We expect this of written work (unfortunately, many math books and articles fall way short of being clear or complete or convincing!) and it's also what we demand of our own work. 

## What it means for a proof to be "correct" 

The purpose of writing a proof is to convince someone that the pattern that you've discovered in some mathematical phenomenon is really true. The purpose of _reading_ a proof is to hold the author's explanation up to the light, considering it "guilty until proven innocent". That is, we accept nothing less than absolute clarity and correctness in determining whether a proof is "correct" or not. 

This is a lot like running a program through a compiler. The compiler has no mercy. It doesn't care about you or your stupid code because it doesn't care about _anything_. It is a mindless machine whose sole purpose is to ensure that _no code makes it out alive unless it has no syntactical errors_. A proof is in many ways like a computer program. It's an attempt to "explain why" or "show how", except with a proof, _the reader is the compiler_. Maybe we care a little bit about the proof and the author, and we're certainly not mindless machines. But, we have similar standards: **No proof survives unless and until it has no major errors**. 

What are the kinds of errors we can introduce in a proof? There are four: 

1. **Computation errors.** These occur when a mathematical computation (calculus, algebra, arithmetic, etc.) is incorrectly carried out, either by hand or on a computer. For example: Solving the equation $3x = 9$ to get $x = 2$ is a computation error. 
2. **Logic errors.** These happen when a conclusion is drawn erroneously from a set of information, _or_ when the writer adds an unwarranted assumption into the argument. For example, suppose you're trying to explain how to solve the equation $x^2 = 9$. There are two logical errors that can happen. First, you can go straight from $x^2 = 9$ to conclude that $x = 3$ and then stop. That's a logical error because you didn't _compute_ anything wrong, but the _conclusion_ is wrong because you didn't consider the other possibility, $x = -3$. Second, you can start with $x^2 = 9$ and then say, "Let's assume that $x$ is positive" and then conclude $x=3$. This isn't wrong in itself, but what allowed you to assume $x > 0$? Nothing, that's what. It's an unwarranted assumption. 
3. **Syntax errors.** These are failures in the grammar of a language, whether the English language, or mathematical definitions, or the notational language of mathematics. For example, using an incomplete sentence in a proof is a syntax error because incomplete sentences are incomplete thoughts. Mis-stating the definition of "even number" is a syntax error. Switching variables mid-solution (for example, solving $3t = 9$ to get $x = 3$) is a syntax error. Mismatching parentheses is a syntax error, both in mathematics and in coding. 
4. **Semantic errors.** These happen when the _grammar_ of a language is used correctly but the resulting statements are nonsensical or meaningless. For example, the statement "Colorless green ideas sleep furiously" is correct English syntax but has no meaning. It's shockingly easy to fall into semantic errors in mathematics, and it often happens by applying properties or operations to things that are "the wrong data type". For example, taking the square root of a set is a semantic error. Writing $|x| = 6$ and then dividing both sides by $x$ to get $|\hspace{0.2in} | = 6/x$ is a semantic error. Saying something like "We will solve the formula by plugging the problem into the number" is a semantic error. 
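
The two equation examples above can be checked mechanically by substituting candidate values back into the equation. This is a minimal sketch; the helper name `solutions` and the choice to search only the integers from $-10$ to $10$ are assumptions made for illustration.

```python
def solutions(equation, candidates):
    """Return every candidate value that actually satisfies the equation."""
    return [x for x in candidates if equation(x)]


candidates = range(-10, 11)

# Computation error check: x = 2 is not among the solutions of 3x = 9.
linear = solutions(lambda x: 3 * x == 9, candidates)
print(linear)      # [3]

# Logic error check: stopping at x = 3 misses a solution of x^2 = 9.
quadratic = solutions(lambda x: x ** 2 == 9, candidates)
print(quadratic)   # [-3, 3]
```

Substituting an answer back in catches a computation error right away. Catching the logic error takes a different question: not "does my answer work?" but "have I found _every_ answer that works?"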

## Critical analysis of proofs 

A _critical analysis_ of a proof is a thorough investigation of a written proof to determine answers to the following questions: 

1. Is the proposition that is being proved actually true in the first place? 
2. Does the proof contain any significant computation, logical, syntax, or semantic errors that invalidate the argument? 
3. If there are no significant errors, are there ways to improve the clarity, correctness, or convincing-ness of the proof? 

The process of proof analysis can be summarized in this flowchart. Use this when reading other people's proofs --- and especially when reading your own: 

![ProofAnalysis.png](ProofAnalysis.png)

The only way to practice critical analysis is to do it, so we will save examples for class. But, here are some tips: 

First: **Critical analysis always begins with identifying the technique and the framework of the proof.** For example, consider the following modification of the proof from above: 

>**Proposition:** For all natural numbers $n$, $1 + 2 + 3 + \cdots + n = \dfrac{n(n+1)}{2}$. 

>**Proof:** Assume that for some $k \in \mathbb{N}$, 
$$1 + 2 + 3 + \cdots + k = \dfrac{k(k+1)}{2}$$
>Begin with the expression $1 + 2 + 3 + \cdots + k + (k+1)$. We have: 
$$\begin{align}
1 + 2 + 3 + \cdots + k + (k+1) &= (1 + 2 + 3 + \cdots + k) + (k+1) \\
&= \frac{k(k+1)}{2} + (k+1) \\
&= \frac{k(k+1)}{2} + \frac{2(k+1)}{2} \\
&= \frac{k^2 + k + 2k + 2}{2} \\
&= \frac{k^2 + 3k + 2}{2} \\
&= \frac{(k+1)(k+2)}{2}
\end{align}$$
>Therefore we see that $1 + 2 + 3 + \cdots + k + (k+1) = \dfrac{(k+1)((k+1) + 1)}{2}$ so the proposition is proven. 

As before, we identify this proof as a proof by induction (weak) and so we expect the usual framework of base case, induction hypothesis, and inductive step. But when you begin to read the proof with the framework in mind, something jumps out at you that's seriously wrong. Can you tell what it is? Think about it. 

----

Seriously, think about it. 

---

The problem with this proposed proof is: It's an induction proof but it has no base case! That's a killer mistake (a logical error of epic proportions, because the logic of the induction proof is not complete). We _might_ have noticed the missing base case even if we had not thought about the framework first. But we are a lot more likely to catch it if we _do_ think about the framework. 
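
To see why a missing base case is fatal, consider a deliberately _false_ claim invented for this illustration: $1 + 2 + \cdots + n = \frac{n(n+1)}{2} + 1$. The sketch below (with made-up function names) shows that the inductive step still goes through for this false formula. Only the base case exposes it.

```python
def false_formula(n: int) -> int:
    """A deliberately wrong closed form: n(n+1)/2 + 1."""
    return n * (n + 1) // 2 + 1


def inductive_step_holds(k: int) -> bool:
    """If the sum up to k equalled false_formula(k), would adding (k+1)
    give false_formula(k+1)?"""
    return false_formula(k) + (k + 1) == false_formula(k + 1)


def base_case_holds() -> bool:
    """For n = 1 the left-hand side of the claim is just 1."""
    return 1 == false_formula(1)


print(all(inductive_step_holds(k) for k in range(1, 101)))  # True
print(base_case_holds())                                    # False
```

The inductive step only says "if the claim holds for $k$, then it holds for $k+1$". Without a base case there is nothing for that chain of implications to start from, so a false formula can pass the step and still be wrong for every $n$.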

Second, **critical analysis of a proof involves focused, deliberate skepticism**. You have to be on the _lookout_ for errors and just assume that they are there, and it's your job to find them. If this seems depressing or negative, just keep in mind that you're doing the author a favor by giving "tough love" and only letting the best ideas see the light of day. Also remember that sometimes the author is _you_, and your work is a Challenge Problem or something more important than this, and you _want_ to find the errors _before_ they get submitted. 

Third, and related, **applying critical analysis to your own work will dramatically improve its quality just by paying focused attention to it**. Too often, schools train students to churn through _as much_ work as they can _in as little time as possible_. The result is that many smart students hit a wall when they get to college and are asked to do _big things_ that take _a lot of time_. Critical analysis is a step toward reversing that situation by having you _go slow_ and _go deep_ on work. It's refreshing. 