# Analysis of Algorithms

This analysis of algorithms focuses on **running time**: how many discrete operations does a computation perform to produce a result?

Reasons to analyze algorithms:

- Predict performance
- Compare algorithms
- Provide guarantees
- Understand theoretical basis

The primary practical reason is to avoid performance bugs.

To find out whether your program can handle a large practical input, use the **scientific method** to understand its performance:

- **Observe** some feature of the natural world
- **Hypothesize** a model that is consistent with the observations
- **Predict** events using the hypothesis
- **Verify** the predictions by making further observations
- **Validate** by repeating until the hypothesis and observations agree

Experiments must be **reproducible** and hypotheses must be **falsifiable**.

## Observations

The first step is to make some observations about the running time of a program.

1. Run empirical analysis: time how long a program takes to run
2. Analyze the data: plot the running time $T(N)$ versus input size $N$
    - Usually plot $\lg(T(N))$ vs $\lg N$ (a log-log plot) and check the slope
    - Regression analysis fits a straight line through the data. If the points fall on a line, the running time follows a **power law** $T(N) = aN^b$, where $b$ is the slope. Example:

$$
y = mx + c \\
\lg(T(N)) = b \; \lg(N) + c \\
T(N) = a \; N^b \text{ where } a=2^c
$$

Most algorithms have some form of the power law involved when you're analyzing them.
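
The log-log fit described above can be sketched in a few lines of Python. This is a minimal illustration, not a full regression tool: the helper name `fit_power_law` is made up for this example, and the "measurements" are synthetic values generated from an assumed $T(N) = 10^{-9} N^3$ rather than real timings.

```python
import math

def fit_power_law(sizes, times):
    """Least-squares fit of lg(T) = b * lg(N) + c; returns (a, b) with a = 2**c."""
    xs = [math.log2(n) for n in sizes]
    ys = [math.log2(t) for t in times]
    k = len(xs)
    mean_x = sum(xs) / k
    mean_y = sum(ys) / k
    # The slope of the regression line is the estimated exponent b.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    # The intercept c gives the constant a = 2**c.
    c = mean_y - b * mean_x
    return 2 ** c, b

# Synthetic data (assumption): generated from T(N) = 1e-9 * N^3, not measured.
sizes = [250, 500, 1000, 2000, 4000]
times = [1e-9 * n ** 3 for n in sizes]

a, b = fit_power_law(sizes, times)
print("a = {:.3g}, b = {:.3f}".format(a, b))
```

With real timings the points will not sit exactly on a line, so the fitted $b$ is only an estimate of the exponent.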

The system you run the experiments on makes a difference in some areas. System-independent effects are the **algorithm** and the **input data**, which determine the exponent $b$ in the power law (the slope of the line on a log-log plot). System-dependent effects include **hardware** (CPU, memory, cache), **software** (compiler, interpreter, garbage collector), and the **system** (operating system, network, other applications). Together with the algorithm and input data, these determine the constant $a$ in the power law.


## Mathematical Models

Observing the behavior of an algorithm helps to predict performance, but to understand what the algorithm is doing you need mathematical models.

The **total running time** is the sum of cost $\times$ frequency over all operations. You need to analyze the program to determine the set of operations. The cost depends on the machine, the compiler, etc. The frequency depends on the algorithm and input data.

In the early days, computer manuals would list the time it took to perform certain operations (integer addition, float addition, etc.). On modern machines, you would run experiments if you really wanted to know.

One simplification is to pick a basic operation as a proxy for running time and count only that. For example, count the array accesses in a nested loop that checks every pair of integers in an array to see whether any pair sums to zero.
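
As a rough sketch of this idea, the brute-force pair check can be instrumented to count array accesses. The function name `count_zero_pairs` and the sample input are made up for illustration.

```python
def count_zero_pairs(a):
    """Brute-force 2-sum: returns (pairs summing to zero, array accesses made)."""
    n = len(a)
    pairs = 0
    accesses = 0
    for i in range(n):
        for j in range(i + 1, n):
            accesses += 2  # one read of a[i] and one read of a[j]
            if a[i] + a[j] == 0:
                pairs += 1
    return pairs, accesses

sample = [30, -40, -20, -10, 40, 0, 10, 5]  # hypothetical input
pairs, accesses = count_zero_pairs(sample)
n = len(sample)
print("pairs = {}, accesses = {}, N(N-1) = {}".format(pairs, accesses, n * (n - 1)))
```

The inner body runs once per pair, $N(N-1)/2$ times, and reads two array entries each time, so the access count is $N(N-1)$, which is quadratic in $N$.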

Another simplification is to use **tilde notation**, which ignores the lower-order terms in the cost formulas you derive. For example, $\frac{1}{6} N^3 + 20N + 16$ simplifies to $\sim \frac{1}{6}N^3$.
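
Dropping the lower-order terms is justified because $f(N) \sim g(N)$ means the ratio $f(N)/g(N)$ approaches 1 as $N$ grows. A quick numerical check of the example formula (the function names `exact_cost` and `tilde_cost` are made up for this sketch):

```python
def exact_cost(n):
    """The full cost formula from the example: N^3/6 + 20N + 16."""
    return n ** 3 / 6 + 20 * n + 16

def tilde_cost(n):
    """The tilde approximation: leading term only."""
    return n ** 3 / 6

# The ratio should move toward 1 as N grows.
for n in [10, 100, 1000, 10000]:
    print(n, exact_cost(n) / tilde_cost(n))
```

For small $N$ the lower-order terms still dominate (at $N = 10$ the exact cost is more than twice the approximation), so the tilde form is only a good model for large inputs.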

In principle, accurate mathematical models are available. In practice, formulas can be complicated, advanced mathematics might be required, and exact models are best left for the experts. **Approximate models are usually good enough.**


## Order of Growth Classifications

Only a small set of functions describes the order of growth of typical algorithms: $1$, $\log N$, $N$, $N \log N$, $N^2$, $N^3$, and $2^N$. These classifications come from patterns in the code. For example, code with no loops runs in constant time.

| Order of Growth | Name | Typical Code Framework | Description | Example |
| - | - | - | - | - |
| 1 | Constant | $a = b + c$ | Statement | Add two numbers |
| $\log N$ | Logarithmic | `while (N > 1) { N = N / 2; ... }` | Divide in half | Binary search |
| $N$ | Linear | for loop | Loop | Find the maximum |
| $N \log N$ | Linearithmic | mergesort | Divide and conquer | Mergesort |
| $N^2$ | Quadratic | nested for loops | Double loop | Check all pairs |
| $N^3$ | Cubic | triple nested for loops | Triple loop | Check all triples |
| $2^N$ | Exponential | Combinatorial search | Exhaustive search | Check all subsets |
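
To make the code frameworks in the table concrete, here is a small sketch that counts how many times the innermost statement runs for several of the patterns. The function names are made up for illustration, and the linearithmic (mergesort) row is left out to keep the example short.

```python
def halving_steps(n):
    """Logarithmic: divide n in half until it reaches 1."""
    steps = 0
    while n > 1:
        n //= 2
        steps += 1
    return steps

def linear_steps(n):
    """Linear: a single loop over n items."""
    steps = 0
    for _ in range(n):
        steps += 1
    return steps

def pair_steps(n):
    """Quadratic: a double loop over all pairs i < j."""
    steps = 0
    for i in range(n):
        for j in range(i + 1, n):
            steps += 1
    return steps

def triple_steps(n):
    """Cubic: a triple loop over all triples i < j < k."""
    steps = 0
    for i in range(n):
        for j in range(i + 1, n):
            for k in range(j + 1, n):
                steps += 1
    return steps

def subset_steps(n):
    """Exponential: visit every subset, encoded as a bit mask."""
    steps = 0
    for mask in range(2 ** n):
        steps += 1
    return steps

for n in (8, 16):
    print(n, halving_steps(n), linear_steps(n), pair_steps(n),
          triple_steps(n), subset_steps(n))
```

The pair and triple loops run about $N^2/2$ and $N^3/6$ times, which have the same order of growth as $N^2$ and $N^3$.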


## Binary Search Example

```python
def binarySearch(a, key):
    """Return the index of key in the sorted list a, or -1 if it is absent."""
    lo, hi = 0, len(a) - 1
    while lo <= hi:
        # Midpoint of the remaining interval a[lo..hi]
        mid = lo + (hi - lo) // 2
        if key < a[mid]:
            hi = mid - 1   # key can only be in the left half
        elif key > a[mid]:
            lo = mid + 1   # key can only be in the right half
        else:
            return mid     # found
    return -1
```

```python
a = [6, 13, 14, 25, 33, 43, 51, 53, 64, 72, 84, 93, 95, 96, 97]
key_1 = 33
print("Index of {} is {}".format(key_1, binarySearch(a, key_1)))
key_2 = 34
print("Index of {} is {}".format(key_2, binarySearch(a, key_2)))
```

```
Index of 33 is 4
Index of 34 is -1
```
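
To connect this example to the logarithmic row of the order-of-growth table, the sketch below repeats the same loop but also counts how many times `a[mid]` is examined. The function name `binary_search_probes` is made up for this illustration. Each probe at least halves the remaining interval, so the count should never exceed $1 + \lfloor \lg N \rfloor$.

```python
import math

def binary_search_probes(a, key):
    """Same loop as binary search, but returns (index, number of a[mid] probes)."""
    lo, hi = 0, len(a) - 1
    probes = 0
    while lo <= hi:
        mid = lo + (hi - lo) // 2
        probes += 1  # one examination of a[mid] per iteration
        if key < a[mid]:
            hi = mid - 1
        elif key > a[mid]:
            lo = mid + 1
        else:
            return mid, probes
    return -1, probes

data = [6, 13, 14, 25, 33, 43, 51, 53, 64, 72, 84, 93, 95, 96, 97]
bound = 1 + math.floor(math.log2(len(data)))

# Try every key from 0 to 99, covering both hits and misses.
worst = max(binary_search_probes(data, key)[1] for key in range(100))
print("N = {}, worst probes = {}, bound = {}".format(len(data), worst, bound))
```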


## Theory of Algorithms

There are three types of analyses:

1. **Best case:** lower bound on cost, determined by the "easiest" input, and provides a goal for all inputs
2. **Worst case:** upper bound on cost, determined by "most difficult" input, and provides a guarantee for all inputs
3. **Average case:** expected cost for random input; requires a model for "random" input, and provides a way to predict performance
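
The three cases can be seen by counting compares in insertion sort on different inputs. This is a minimal sketch: the function name `insertion_sort_compares`, the size $N = 200$, and the use of uniformly random permutations as the "random" input model are all assumptions made for illustration.

```python
import random

def insertion_sort_compares(items):
    """Sort a copy of items with insertion sort and return the number of compares."""
    a = list(items)
    compares = 0
    for i in range(1, len(a)):
        j = i
        while j > 0:
            compares += 1
            if a[j] < a[j - 1]:
                a[j], a[j - 1] = a[j - 1], a[j]
                j -= 1
            else:
                break
    return compares

n = 200
best = insertion_sort_compares(range(n))          # already sorted input
worst = insertion_sort_compares(range(n, 0, -1))  # reverse-sorted input

# Average over random permutations (seeded so the run is repeatable).
rng = random.Random(42)
trials = [insertion_sort_compares(rng.sample(range(n), n)) for _ in range(50)]
average = sum(trials) / len(trials)

print("best = {}, worst = {}, average = {:.0f}".format(best, worst, average))
print("N^2/2 = {}, N^2/4 = {}".format(n * n // 2, n * n // 4))
```

Sorted input needs $N - 1$ compares and reverse-sorted input needs $N(N-1)/2$, while random permutations land near $N^2/4$.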

What if the actual data doesn't match your input model? You need to understand the input to process it effectively. One approach is to design for the worst case; the other is to randomize and rely on a probabilistic guarantee.

Goals:

- Establish "difficulty" of a problem
- Develop "optimal" algorithm

Approach:

- Suppress details in analysis: analyze to "within a constant factor"
- Eliminate variability in the input model by focusing on the worst case

Optimal algorithm:

- Performance guarantee (to within a constant factor) for any input
- No algorithm can provide a better performance guarantee

Commonly-used notations in the theory of algorithms:

| Notation | Provides | Example | Used To |
| - | - | - | - |
| Big Theta | Asymptotic order of growth | $\Theta (N^2)$ | Classify algorithms |
| Big Oh | $\Theta (N^2)$ and smaller | $O(N^2)$ | Develop upper bounds |
| Big Omega | $\Theta (N^2)$ and larger | $\Omega(N^2)$ | Develop lower bounds |

The algorithm design approach has been successful over the past several decades. Methodology:

- Develop an algorithm
- Prove a lower bound
- Is there a gap between the upper and lower bound?
- Lower the upper bound (discover a new algorithm)
- Raise the lower bound (more difficult)

A common mistake is to interpret big-Oh as an approximate model of how long an algorithm runs; it only provides an upper bound.

## Analysis Summary

Empirical analysis:

- Execute program to perform experiments
- Assume power law and formulate a hypothesis for running time
- Model enables you to **make predictions**
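
One simple way to run such experiments is to double the input size repeatedly and look at the ratio of successive costs: under a power law $T(N) = aN^b$, the ratio $T(2N)/T(N)$ approaches $2^b$. The sketch below is an illustration only. The names `doubling_exponents`, `seconds_for`, and `pair_loop` are made up, and the workload is a deliberately quadratic loop.

```python
import math
import time

def doubling_exponents(cost, n0, rounds):
    """Estimate b in T(N) = a * N**b from successive doublings of N.

    cost(n) returns a measured cost for input size n
    (seconds or an operation count).
    """
    exponents = []
    n = n0
    prev = cost(n)
    for _ in range(rounds):
        n *= 2
        cur = cost(n)
        exponents.append(math.log2(cur / prev))  # lg of the doubling ratio
        prev = cur
    return exponents

def seconds_for(fn, n):
    """Wall-clock seconds taken by fn(n)."""
    start = time.perf_counter()
    fn(n)
    return time.perf_counter() - start

def pair_loop(n):
    """Hypothetical quadratic workload; returns the number of inner iterations."""
    total = 0
    for i in range(n):
        for j in range(i + 1, n):
            total += 1
    return total

# Deterministic version: use the operation count as the cost.
count_exponents = doubling_exponents(pair_loop, 100, 4)
print("from counts:", count_exponents)

# Timing version: results vary from run to run and machine to machine.
time_exponents = doubling_exponents(lambda n: seconds_for(pair_loop, n), 200, 3)
print("from timings:", time_exponents)
```

The count-based estimates settle near 2. The timing-based ones are noisier, which is why experiments should be repeated before trusting a hypothesis.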

Mathematical analysis:

- Analyze algorithm to count frequency of operations
- Use tilde notation to simplify analysis
- Model enables you to **explain behavior**

Scientific method:

- Mathematical model is independent of a particular system; applies to machines not yet built
- Empirical analysis is necessary to validate mathematical models and to make predictions


## Sorting Algorithms Summary

| Algorithm | Inplace? | Stable? | Worst-Case | Average-Case | Best-Case | Comments |
| - | - | - | - | - | - | - |
| Selection | X |  | $N^2 / 2$ | $N^2 / 2$ | $N^2 / 2$ | $N$ exchanges |
| Insertion | X | X | $N^2 / 2$ | $N^2 / 4$ | $N$ | Use for small $N$ or partially ordered |
| Shell | X |  | ? | ? | $N$ | Tight code, subquadratic |
| Quick | X |  | $N^2 / 2$ | $2N \ln N$ | $N \lg N$ | $N \lg N$ probabilistic guarantee; fastest in practice |
| 3-Way Quick | X |  | $N^2 / 2$ | $2 N \ln N$ | $N$ | Improves quicksort in presence of duplicate keys |
| Merge |  | X | $N \lg N$ | $N \lg N$ | $N \lg N$ | $N \lg N$ guarantee, stable |
| Heap | X |  | $2N \lg N$ | $2N \lg N$ | $N \lg N$ | $N \lg N$ guarantee, in-place |
| ??? | X | X | $N \lg N$ | $N \lg N$ | $N \lg N$ | Holy sorting grail |
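
The "Stable?" column says whether records with equal keys keep their original relative order. A small sketch with simplified versions of insertion sort and selection sort shows the difference. The function names and the tiny record list are made up for illustration.

```python
def insertion_sort(records, key):
    """Insertion sort on a copy of records; only moves an item past strictly larger keys."""
    a = list(records)
    for i in range(1, len(a)):
        j = i
        while j > 0 and key(a[j]) < key(a[j - 1]):
            a[j], a[j - 1] = a[j - 1], a[j]
            j -= 1
    return a

def selection_sort(records, key):
    """Selection sort on a copy of records; long-distance swaps can reorder equal keys."""
    a = list(records)
    for i in range(len(a)):
        m = i
        for j in range(i + 1, len(a)):
            if key(a[j]) < key(a[m]):
                m = j
        a[i], a[m] = a[m], a[i]
    return a

def first(record):
    """Sort key: the number in each (number, label) pair."""
    return record[0]

# Two records share key 2; "a" comes before "b" in the input.
records = [(2, "a"), (2, "b"), (1, "c")]

stable_result = insertion_sort(records, first)
unstable_result = selection_sort(records, first)
print(stable_result)    # [(1, 'c'), (2, 'a'), (2, 'b')]
print(unstable_result)  # [(1, 'c'), (2, 'b'), (2, 'a')]
```

Selection sort swaps `(2, "a")` to the end when it moves the minimum into place, which puts it behind `(2, "b")`.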

## Search Tree Summary

The worst case (WC) is after $N$ inserts, and the average case (AC) is after $N$ random inserts.

The height of any red-black BST on $n$ keys (regardless of the order of insertion) is guaranteed to be between $\log_2 n$ and $2 \log_2 n$.

| Implementation | WC Search | WC Insert | WC Delete | AC Search | AC Insert | AC Delete | Ordered Iteration? |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Sequential Search (unordered list) | $N$ | $N$ | $N$ | $N/2$ | $N$ | $N/2$ | No |
| Binary Search (ordered array) | $\lg N$ | $N$ | $N$ | $\lg N$ | $N/2$ | $N/2$ | Yes |
| Binary Search Tree (BST) | $N$ | $N$ | $N$ | $1.39 \lg N$ | $1.39 \lg N$ | ? | Yes |
| 2-3 Tree | $c \lg N$ | $c \lg N$ | $c \lg N$ | $c \lg N$ | $c \lg N$ | $c \lg N$ | Yes |
| Red-Black BST | $2 \lg N$ | $2 \lg N$ | $2 \lg N$ | $1.00 \lg N$\* | $1.00 \lg N$\* | $1.00 \lg N$\* | Yes |
| Hash: Separate Chaining | $\lg N$\*\* | $\lg N$\*\* | $\lg N$\*\* | 3-5\*\* | 3-5\*\* | 3-5\*\* | No |
| Hash: Linear Probing | $\lg N$\*\* | $\lg N$\*\* | $\lg N$\*\* | 3-5\*\* | 3-5\*\* | 3-5\*\* | No |

\* Exact coefficient unknown but extremely close to 1  
\*\* Under uniform hashing assumption