Previously, we saw that the runtime of `dup1` is "like a parabola" and the runtime of `dup2` is "like a line".

In this video, we're going to be more careful about what we mean when we say a function is "like" a parabola and "like" a line.

## Intuitive Simplification 1 - Consider Only the Worst Case

**Justification**: When comparing algorithms, we often care only about the worst case. We're effectively focusing on the case where there are no duplicates, because this is where there's a performance difference.

## Intuitive Order of Growth Identification

Consider an algorithm with the operation counts in the table below. What do we expect the order of growth of its runtime to be?

1. $N$ (linear)
2. $N^2$ (quadratic)
3. $N^3$ (cubic)
4. $N^6$ (sextic)

| operation | count |
| --- | --- |
| less than (`<`) | $100N^2 + 3N$ |
| greater than (`>`) | $2N^3 + 1$ |
| and (`&&`) | 5,000 |

In other words, if we plotted total runtime vs. $N$, what shape would we expect?

**Ans**: 3. Cubic. Why?

Suppose:
* `<` takes $\alpha$ nanoseconds
* `>` takes $\beta$ nanoseconds
* `&&` takes $\gamma$ nanoseconds

Then the total runtime will be:
$$ \alpha (100N^2 + 3N) + \beta(2N^3 + 1) + 5000 \gamma$$

For very large $N$, the $2 \beta N^3$ term is much larger than the others.

Mathematically, if we divide the total runtime by $N^3$ and let $N$ go to infinity, the only term that survives is $2 \beta$.
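As a quick numerical check of this argument, here is a minimal sketch that evaluates the runtime formula above and divides it by $N^3$. The values chosen for $\alpha$, $\beta$, and $\gamma$ (1, 2, and 3 nanoseconds) and the class name `GrowthRatio` are made up for illustration; any positive constants would show the same trend.

```java
class GrowthRatio {
    // Hypothetical per-operation costs in nanoseconds (illustrative values only).
    static final double ALPHA = 1.0; // cost of one `<`
    static final double BETA = 2.0;  // cost of one `>`
    static final double GAMMA = 3.0; // cost of one `&&`

    /** Total runtime: alpha(100N^2 + 3N) + beta(2N^3 + 1) + 5000 gamma. */
    static double runtime(double n) {
        return ALPHA * (100 * n * n + 3 * n)
             + BETA * (2 * n * n * n + 1)
             + 5000 * GAMMA;
    }

    /** Runtime divided by N^3; this approaches 2 * BETA as N grows. */
    static double ratio(double n) {
        return runtime(n) / (n * n * n);
    }

    public static void main(String[] args) {
        // Print the ratio for N = 10, 100, ..., 10^7.
        for (double n = 10; n <= 1e7; n *= 10) {
            System.out.printf("N = %.0f, runtime / N^3 = %.6f%n", n, ratio(n));
        }
    }
}
```

For small $N$ the quadratic and constant terms still matter, but as $N$ grows the printed ratio settles toward $2\beta$, which is 4 with the values assumed here.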

## Intuitive Simplification 2 - Restrict Attention to One Operation

Pick some representative operation to act as a proxy for the overall runtime.

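Consider `dup1`:

```java
/** Returns true if A contains any duplicate, comparing every pair of elements. */
public static boolean dup1(int[] A) {
    for (int i = 0; i < A.length; i += 1) {
        // Compare A[i] against every element that comes after it.
        for (int j = i + 1; j < A.length; j += 1) {
            if (A[i] == A[j]) {
                return true;
            }
        }
    }
    return false;
}
```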

| operation | symbolic count|
| --- | --- |
| `i = 0` | 1 |
| `j = i + 1` | 1 to N|
| less than (`<`) | 2 to $\frac{N^2 + 3N + 2}{2}$|
| increment (`+=1`) | 0 to $\frac{N^2 + N}{2}$|
| equals (`==`) | 1 to $\frac{N^2 - N}{2}$ |
| array accesses| 2 to $N^2 - N$ |
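The worst-case entries in the table above can be checked empirically. The sketch below is an instrumented copy of `dup1` that counts loop increments and `==` comparisons. The class name `Dup1Counter`, the counter fields, and the helper `distinct` are hypothetical names introduced for this illustration only.

```java
class Dup1Counter {
    static long increments;   // number of loop `+= 1` operations executed
    static long equalsChecks; // number of `==` comparisons executed

    /** Same logic as dup1, with counters added (the counter updates themselves are not counted). */
    static boolean dup1(int[] A) {
        increments = 0;
        equalsChecks = 0;
        for (int i = 0; i < A.length; i += 1, increments += 1) {
            for (int j = i + 1; j < A.length; j += 1, increments += 1) {
                equalsChecks += 1;
                if (A[i] == A[j]) {
                    return true;
                }
            }
        }
        return false;
    }

    /** Worst-case input: n distinct values, so dup1 never returns early. */
    static int[] distinct(int n) {
        int[] a = new int[n];
        for (int k = 0; k < n; k += 1) {
            a[k] = k;
        }
        return a;
    }

    public static void main(String[] args) {
        int n = 100;
        dup1(distinct(n));
        System.out.println("increments = " + increments
            + ", formula (N^2 + N)/2 = " + ((long) n * n + n) / 2);
        System.out.println("equals     = " + equalsChecks
            + ", formula (N^2 - N)/2 = " + ((long) n * n - n) / 2);
    }
}
```

On an input with no duplicates, both counters should match the upper bounds in the table. On an input whose first two elements are equal, they should match the lower bounds (0 increments and 1 comparison).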

Since our goal is to show that the runtime is "like" a parabola, a good choice of representative operation is the increment (`+=1`), because its worst-case count contains the quadratic term.

**The choice of representative operation is sometimes called the "cost model".**

A bad choice would be `j = i + 1` or `i = 0`, since neither of their counts contains a quadratic term.

![](images/cost.png)

## Intuitive Simplification 3 - Eliminate Low Order Terms

Simply ignore lower order terms.
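As a short worked check using the worst-case increment count for `dup1`, compare the full count with its leading term alone:

$$ \frac{(N^2 + N)/2}{N^2/2} = 1 + \frac{1}{N} \to 1 \quad \text{as } N \to \infty $$

For $N = 1000$, the lower-order term changes the count by only 0.1%, so dropping it does not change the shape of the curve.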

![](images/ignore.png)

## Intuitive Simplification 4 - Eliminate Multiplicative Constants

Why? The constants have no real meaning here: we already threw away information when we chose a single proxy operation.

![](images/multi.png)

## Simplification Summary

![](images/simple.png)

## Repeating the Process for `dup2`

Now try to do the same thing for `dup2`. 

![](images/dup2.png)
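To check an answer, here is a minimal sketch that counts `==` comparisons for `dup2`. The code for `dup2` does not appear in this section, so this assumes it is the version that scans a sorted array and compares each element with its neighbor; the class name `Dup2Counter` and the counter field are hypothetical.

```java
class Dup2Counter {
    static long equalsChecks; // number of `==` comparisons executed

    /** Assumed dup2: A is sorted, so any duplicates must be adjacent. */
    static boolean dup2(int[] A) {
        equalsChecks = 0;
        for (int i = 0; i < A.length - 1; i += 1) {
            equalsChecks += 1;
            if (A[i] == A[i + 1]) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Worst case: sorted input with no duplicates.
        int n = 100;
        int[] a = new int[n];
        for (int k = 0; k < n; k += 1) {
            a[k] = k;
        }
        dup2(a);
        System.out.println("equals = " + equalsChecks + ", N - 1 = " + (n - 1));
    }
}
```

Under this assumption, the worst-case count of `==` is $N - 1$. Dropping the lower-order term gives $N$, so the order of growth is linear.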

## Summary of Our Analysis Process

Our process:

* Construct a table of exact counts of all possible operations.
* Convert the table into a worst-case order of growth using the four simplifications.

![](images/summary.png)

By applying the simplifications from the outset, we can avoid building the table at all!
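For instance, here is a sketch of that shortcut for `dup1`, taking `==` as the cost model and considering only the worst case (no duplicates). For each value of `i`, the inner loop performs $N - 1 - i$ comparisons, so the count is:

$$ \sum_{i=0}^{N-1} (N - 1 - i) = \frac{N(N-1)}{2} = \frac{N^2 - N}{2} $$

Dropping the lower-order term and the constant factor of $\frac{1}{2}$ leaves $N^2$, the same order of growth as before, without counting any other operation.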