# Chapter 03: Algorithm Analysis

* A **data structure** is a systematic way of organizing and accessing data.
* An **algorithm** is a step-by-step procedure for performing some task in a finite amount of time.

* Limitations of Experimental Analysis
  1. Experimental running times are not an objective measure for comparing algorithms, since they rely heavily on the hardware and software environments in which the experiments are run.
  2. Experiments can be done only on a limited set of test inputs, so they say nothing about the running time on inputs not included in the experiment.
  3. Experimental analysis requires a complete implementation of the algorithm before its performance can be measured.

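Therefore, we want a way of analyzing the efficiency of algorithms that overcomes the limitations above: one that is independent of the hardware and software environment, takes all possible inputs into account, and can be applied to a high-level description of the algorithm without implementing it.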
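
An experimental measurement of the kind discussed above can be sketched as follows. This is a minimal illustration using `time.perf_counter` from the standard library; the helper name `time_call` and the choice of the built-in `sum` as the function under test are assumptions made for the example, not part of the chapter.

```python
import time

def time_call(func, *args, repeats=5):
    """Return the best wall-clock time (in seconds) over several runs of func(*args)."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()          # high-resolution clock
        func(*args)
        elapsed = time.perf_counter() - start
        best = min(best, elapsed)
    return best

if __name__ == "__main__":
    # The printed times depend on the machine and interpreter in use,
    # which is exactly the first limitation listed above.
    for n in (10_000, 100_000, 1_000_000):
        data = list(range(n))
        print(n, time_call(sum, data))
```

Only the three tested input sizes are covered, and nothing can be measured until the function exists, which mirrors the second and third limitations.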

1. Counting Primitive Operations
  
  We define a set of **primitive operations** such as the following:
  * Assigning an identifier to an object
  * Determining the object associated with an identifier
  * Performing an arithmetic operation
  * Comparing two numbers
  * Accessing a single element of a Python `list` by index
  * Calling a function (excluding operations executed within the function)
  * Returning from a function
  
  Formally, a primitive operation corresponds to a low-level instruction with an execution time that is constant. Instead of trying to determine the specific execution time of each primitive operation, we simply count how many primitive operations are executed and use this number, $t$, as a measure of the running time of the algorithm. The implicit assumption is that the running times of different primitive operations are fairly similar, so the number $t$ of primitive operations an algorithm performs will be proportional to the actual running time of that algorithm.
  
2. Measuring Operations as a Function of Input Size

  To capture the order of growth of an algorithm's running time, we will associate, with each algorithm, a function $f(n)$ that characterizes the number of primitive operations that are performed as a function of the input size $n$.
  
3. Focusing on the Worst-Case Input

  An algorithm may run faster on some inputs than it does on others of the same size. Thus, we may wish to express the running time of an algorithm as the function of the input size obtained by taking the average over all possible inputs of the same size. Unfortunately, such an **average-case** analysis is typically quite challenging. It requires us to define a probability distribution on the set of inputs, which is often a difficult task.
  
  An average-case analysis usually requires that we calculate expected running times based on a given input distribution, which usually involves sophisticated probability theory. Therefore, we will characterize running times in terms of the **worst case**, as a function of the input size, $n$, of the algorithm.
  
  Worst-case analysis is much easier than average-case analysis, as it requires only the ability to identify the worst-case input, which is often simple.
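
The three ideas above can be illustrated with a small sketch. The function `find_index_counted` below is a hypothetical instrumented linear search, not code from the chapter: it counts only the element comparisons (one kind of primitive operation), so that the count can be examined as a function of the input size $n$ for a best-case and a worst-case input.

```python
def find_index_counted(data, target):
    """Linear search that also reports how many comparisons it made.

    Returns (index, comparisons); index is -1 if target is absent.
    """
    comparisons = 0
    for i in range(len(data)):
        comparisons += 1                  # one comparison of two values
        if data[i] == target:
            return i, comparisons
    return -1, comparisons

for n in (10, 100, 1000):
    data = list(range(n))
    _, best = find_index_counted(data, 0)      # target is the first element
    _, worst = find_index_counted(data, -1)    # target is absent
    print(n, best, worst)
```

The best case makes a single comparison regardless of $n$, while the worst case makes $n$ comparisons, so the worst-case count grows linearly with the input size.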

## 3.2 The Seven Functions Used in This Book

1. The Constant Function
    
    $$f(n) = c$$

2. The Logarithm Function

    $$f(n) = \log_b n$$
    
3. The Linear Function

    $$f(n) = n$$

4. The N-Log-N Function

    $$f(n) = n \log n$$

5. The Quadratic Function

    $$f(n) = n^2$$

6. The Cubic Function and Other Polynomials

    $$f(n) = n^3$$
    
    $$f(n) = a_0 + a_1n + a_2n^2 + a_3n^3 + \cdots + a_dn^d$$
    
7. The Exponential Function

    $$f(n) = b^n$$
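
To get a feel for how differently these seven functions grow, the sketch below evaluates each of them at a few small values of $n$. The constant $c = 1$ and the base $b = 2$ for both the logarithm and the exponential are assumptions made for this illustration, and `growth_row` is a hypothetical helper name.

```python
import math

# The seven functions, with c = 1 and base b = 2 (assumed for illustration).
FUNCTIONS = [
    ("1", lambda n: 1),
    ("log n", lambda n: math.log2(n)),
    ("n", lambda n: n),
    ("n log n", lambda n: n * math.log2(n)),
    ("n^2", lambda n: n ** 2),
    ("n^3", lambda n: n ** 3),
    ("2^n", lambda n: 2 ** n),
]

def growth_row(n):
    """Return the values of the seven functions at n, in the order listed above."""
    return [f(n) for _, f in FUNCTIONS]

print([name for name, _ in FUNCTIONS])
for n in (8, 16, 32, 64):
    print(n, growth_row(n))
```

For $n = 8$ the cubic value (512) still exceeds the exponential one (256), but by $n = 16$ the exponential has already overtaken it (65536 versus 4096).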
    

## 3.3 Asymptotic Analysis

### 3.3.1 The "Big-O" Notation
Let $f(n)$ and $g(n)$ be functions mapping positive integers to positive real numbers. We say that $f(n)$ is $O(g(n))$ if there is a real constant $c > 0$ and an integer constant $n_0 \geq 1$ such that

$$f(n) \leq cg(n), \quad \text{for} \quad n \geq n_0$$

This definition is often referred to as the "big-O" notation, for it is sometimes pronounced as "$f(n)$ is **big-O** of $g(n)$."

The big-O notation allows us to say that a function $f(n)$ is "less than or equal to" another function $g(n)$ up to 
a constant factor and in the **asymptotic** sense as $n$ grows toward infinity.
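
As a worked instance of the definition, take $f(n) = 8n + 5$ and $g(n) = n$. Since $5 \leq 5n$ for every $n \geq 1$, we have $8n + 5 \leq 13n$, so $c = 13$ and $n_0 = 1$ witness that $8n + 5$ is $O(n)$. The sketch below spot-checks candidate witnesses over a finite range. The helper `holds_up_to` is a hypothetical name, and a finite check like this is only a sanity test, not a proof.

```python
def holds_up_to(f, g, c, n0, limit):
    """Check f(n) <= c * g(n) for every integer n with n0 <= n <= limit."""
    return all(f(n) <= c * g(n) for n in range(n0, limit + 1))

def f(n):
    return 8 * n + 5

def g(n):
    return n

print(holds_up_to(f, g, c=13, n0=1, limit=10_000))   # witnesses from the argument above
print(holds_up_to(f, g, c=9, n0=5, limit=10_000))    # a different valid choice of witnesses
print(holds_up_to(f, g, c=8, n0=1, limit=10_000))    # fails: 8n + 5 > 8n for every n
```

The constants are not unique: any pair $(c, n_0)$ that satisfies the inequality from $n_0$ onward is enough.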

### 3.3.2 Comparative Analysis
Suppose two algorithms solving the same problem are available: an algorithm $A$, which has a running time of $O(n)$, 
and an algorithm $B$, which has a running time of $O(n^2)$. Which algorithm is better? We know that $n$ is $O(n^2)$, which implies that algorithm $A$ is 
**asymptotically better** than algorithm $B$, although for small values of $n$, $B$ may have a lower running time than $A$.

#### Some Words of Caution
A few words of caution about asymptotic notation are in order at this point. First, note that the use of the big-O and related notations can be somewhat misleading should the constant factors they "hide" be very large.
For example, while it is true that the function $10^{100}n$ is $O(n)$, if this is the running time of an algorithm being compared to one whose running time is $10n\log n$, we should prefer the $O(n\log n)$-time algorithm, even though the linear-time algorithm is asymptotically faster.
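
The comparison above can be checked numerically. The sketch below assumes a base-2 logarithm (the text does not fix the base) and uses hypothetical function names. With that assumption, $10n\log_2 n$ exceeds $10^{100}n$ only when $\log_2 n > 10^{99}$, which is far beyond any input size that could arise in practice.

```python
import math

def linear_with_huge_constant(n):
    return 10**100 * n                # O(n), but with an enormous hidden constant

def n_log_n(n):
    return 10 * n * math.log2(n)      # O(n log n), with a modest constant

for n in (10, 10**6, 10**12):
    # Prints True when the O(n log n) cost is the smaller of the two.
    print(n, n_log_n(n) < linear_with_huge_constant(n))
```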

### 3.3.3 Examples of Algorithm Analysis

#### Prefix Averages
Given a sequence $S$ consisting of $n$ numbers, we want to compute a sequence $A$ such that $A[j]$ is the average of elements $S[0], \ldots, S[j]$, for $j=0,\ldots,n-1$, that is,

$$A[j] = \frac{\sum_{i=0}^j S[i]}{j + 1}$$


##### A Quadratic-Time Algorithm

```python
def prefix_quadratic(S):

    n = len(S)                      #  O(1)
    A = [0] * n                     #  O(n)
    for j in range(n):              #  O(n)
        total = 0
        for i in range(j + 1):      #  O(n^2)
            total += S[i]           #  O(n^2)
        A[j] = total / (j+1)        #  O(n)
    return A
```

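Therefore, the running time of `prefix_quadratic` is $O(n^2)$: over all iterations of the outer loop, the body of the inner loop executes $1 + 2 + \cdots + n = n(n+1)/2$ times.

##### A Linear-Time Algorithm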



```python
def prefix_linear(S):

    n = len(S)                      #  O(1)
    A = [0] * n                     #  O(n)
    total = 0
    for j in range(n):              #  O(n)
        total += S[j]               #  O(n)
        A[j] = total / (j+1)        #  O(n)
    return A
```

Therefore, the running time of `prefix_linear` is $O(n)$.
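
The same running total can also be expressed with `itertools.accumulate` from the standard library. The function `prefix_accumulate` below is an alternative sketch, not a version given in the chapter; it is checked against `prefix_by_definition`, a hypothetical helper that transcribes the definition of $A[j]$ directly.

```python
from itertools import accumulate

def prefix_accumulate(S):
    """Return the list of prefix averages of S using running sums, in O(n) time."""
    return [total / (j + 1) for j, total in enumerate(accumulate(S))]

def prefix_by_definition(S):
    """Direct transcription of the definition; O(n^2) because each prefix is re-summed."""
    return [sum(S[:j + 1]) / (j + 1) for j in range(len(S))]

S = [1, 2, 3, 4]
print(prefix_accumulate(S))                              # [1.0, 1.5, 2.0, 2.5]
print(prefix_accumulate(S) == prefix_by_definition(S))   # True
```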


#### Three-Way Set Disjointness
Suppose we are given three sequences of numbers, $A, B$ and $C$. We will assume that no individual sequence contains duplicate values, but that there may be some numbers that are in two or three of the sequences.
The **three-way set disjointness** problem is to determine if the intersection of the three sequences is empty, namely, that there is no element $x$ such that $x \in A, x \in B$ and $x \in C$.

```python
def disjoint1(A, B, C):
    """Return True if there is no element common to all three lists."""
    for a in A:
        for b in B:
            for c in C:
                if a == b == c:
                    return False
    return True
```

If each of the original sets has size $n$, then the worst-case running time of this function is $O(n^3)$.

```python
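def disjoint2(A, B, C):
    """Return True if there is no element common to all three lists."""
    for a in A:
        for b in B:
            if a == b:                  # only check C if we found a match from A and B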
                for c in C:
                    if a == c:
                        return False
    return True
```

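In the improved version, it is not simply that we save time if we get lucky. We claim that the *worst-case* running time for `disjoint2` is $O(n^2)$. There are $n^2$ pairs $(a, b)$ to consider, but since $A$ and $B$ contain no duplicates, at most $n$ of those pairs can have $a$ equal to $b$. The innermost loop, over $C$, is therefore entered at most $n$ times, and each time it performs at most $n$ iterations, for $O(n^2)$ time in total.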
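
The bound can be observed by counting equality tests on an input designed to keep `disjoint2` busy. The function `disjoint2_counted` below is a hypothetical instrumented copy written for this illustration; the input has every element of $A$ matched in $B$ and nothing shared with $C$, so there is no early return.

```python
def disjoint2_counted(A, B, C):
    """Same logic as disjoint2, but also count equality tests. Returns (result, tests)."""
    tests = 0
    for a in A:
        for b in B:
            tests += 1                    # the a == b test
            if a == b:
                for c in C:
                    tests += 1            # the a == c test
                    if a == c:
                        return False, tests
    return True, tests

n = 10
A = list(range(n))
B = list(range(n))            # every element of A also appears in B
C = list(range(n, 2 * n))     # C shares nothing with A, so no early return
print(disjoint2_counted(A, B, C))   # (True, 200)
```

For this input the count is $n^2$ tests of `a == b` plus $n \cdot n$ tests of `a == c`, that is $2n^2$ in total, rather than the $n^3$ tests a triple loop would perform.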


#### Element Uniqueness
A problem that is closely related to the three-way set disjointness problem is the **element uniqueness problem**. In the former, we are given three collections and we presumed that there were no duplicates within a single collection. In the element uniqueness problem, we are given a single sequence $S$ with $n$ elements and asked whether all elements of that collection are distinct from each other.

```python
def unique1(S):
    for j in range(len(S)):
        for k in range(j+1, len(S)):
            if S[j] == S[k]:
                return False
    return True
```

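This approach runs in $O(n^2)$ time in the worst case, since the two loops examine up to $(n-1) + (n-2) + \cdots + 1 = n(n-1)/2$ pairs of indices.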
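
The worst case for `unique1` is a sequence whose elements are all distinct, because the function then never returns early. The sketch below uses `unique1_counted`, a hypothetical instrumented copy written for this illustration, to compare the number of comparisons on such inputs with the number of index pairs $j < k$.

```python
def unique1_counted(S):
    """Same logic as unique1, but also count comparisons. Returns (result, comparisons)."""
    comparisons = 0
    for j in range(len(S)):
        for k in range(j + 1, len(S)):
            comparisons += 1
            if S[j] == S[k]:
                return False, comparisons
    return True, comparisons

for n in (4, 8, 16):
    result, comparisons = unique1_counted(list(range(n)))   # all distinct: worst case
    print(n, comparisons, n * (n - 1) // 2)                  # the two counts agree
```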


#### Using Sorting as a Problem-Solving Tool

```python
def unique2(S):
    temp = sorted(S)                    # sorted copy of S; costs O(n log n)
    for j in range(1, len(temp)):       # after sorting, duplicates are adjacent
        if temp[j-1] == temp[j]:
            return False
    return True
```

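Once the sequence is sorted, any duplicate elements are guaranteed to be adjacent, so a single pass over `temp` suffices. The built-in `sorted` guarantees a worst-case running time of $O(n \log n)$ and the subsequent loop runs in $O(n)$ time, so `unique2` runs in $O(n \log n)$ time overall.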