# Data Structures and Algorithms in Python - Ch.3: Algorithm Analysis
### AJ Zerouali, 2023/09/13

## 0) Introduction

This notebook is the first in a series of notes on data structures and algorithms, following the book of the same name by Goodrich, Tamassia and Goldwasser. Other than this book, I am also following two Udemy courses:
* "Python for Data Structures, Algorithms, and Interviews!" by Jose Portilla (introductory).
* "The Complete Data Structures and Algorithms Course in Python" by Elshad Karimov (Leetcode oriented).

This is part of my interview prep.

#### To do (23/08/28):
Write about the following:
- Code fragment 3.1 (p.123) to find the max of an array in linear time.
- Code fragment 3.4 (p.133): Trick to reduce prefix averages from $O(n^2)$ to $O(n)$. (DONE)
- Code fragment 3.6 (p.135): Reducing verifications on 3 sets from $O(n^3)$ to $O(n^2)$.
- Code fragment 3.8 (p.136): Sorting arrays to solve other problems, reduction from $O(n^2)$ to $O(n\log(n))$.
- Record binary search as an example of $O(\log(n))$ algorithm.

## 1) Definitions

This section provides the mathematical definitions of the notions used for the asymptotic analysis of the running time of an algorithm. The reference for this part is section 3.3 of Goodrich-Tamassia-Goldwasser. I will be skipping a lot of what I consider to be obvious.

### 1.a - Big-Oh notation

Here is our central definition.

#### Definition: Big-Oh notation

Let $f,g:\mathbb{N}\to\mathbb{R}$ be real-valued functions on integers. We say that *f is $O\left(g(n)\right)$* if there exists a real constant $c>0$ and an integer $n_0$ such that:
$$f(n)\le c\cdot g(n), \ \ \forall n>n_0.$$

To put this definition into context, the integer $n$ typically represents the size of a data structure, $f(n)$ is typically the running time of a given algorithm for an input of size $n$, and $g(n)$ could be a monomial, $a^n$ for some $a>1$, $n!$, $\log(n)$, or $n\log(n)$.

It is important to emphasize that this notion:
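1) Expresses the ***asymptotic*** behavior of an algorithm as $n$ grows. It is an upper bound on the growth of $f(n)$, and is most often applied to the worst-case running time.
2) Describes the growth of $f(n)$ only up to a constant factor, so that only the dominant term of $f(n)$ matters when choosing $g(n)$.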
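
As a concrete instance of the definition, take $f(n)=8n+5$ and $g(n)=n$. Since $8n+5\le 9n$ whenever $n\ge 5$, the constants $c=9$ and $n_0=5$ show that $f$ is $O(n)$. The sketch below checks such an inequality numerically over a finite range. The helper name `satisfies_big_oh_bound` and its parameters are hypothetical, and a finite check like this is only a sanity test, not a proof.

```python
def satisfies_big_oh_bound(f, g, c, n0, n_max=10_000):
    """Check f(n) <= c*g(n) for every integer n with n0 < n <= n_max."""
    return all(f(n) <= c * g(n) for n in range(n0 + 1, n_max + 1))


def f(n):
    return 8 * n + 5


def g(n):
    return n


# c = 9 and n0 = 5 work, because 8n + 5 <= 9n exactly when n >= 5.
print(satisfies_big_oh_bound(f, g, c=9, n0=5))
# c = 8 fails for every n, since 8n + 5 > 8n.
print(satisfies_big_oh_bound(f, g, c=8, n0=5))
```

Note that the pair $(c, n_0)$ is not unique: any larger $c$ or larger $n_0$ would work just as well.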

Many things will be clarified with the concrete examples given in the next section. For now, we continue with definitions.

### 1.b - Big-Omega and Big-Theta

Big-Oh is the most used notation for time complexity, given that we are often interested in the worst-case execution time. The next definitions complement this notion with other asymptotic measures.

#### Definition: Big-Omega notation

Let $f,g:\mathbb{N}\to\mathbb{R}$ be real-valued functions on integers. We say that *f is $\Omega\left(g(n)\right)$* if there exists a real constant $c>0$ and an integer $n_0$ such that:
$$f(n)\ge c\cdot g(n), \ \ \forall n>n_0.$$

Saying that *$f(n)$ is $O(g(n))$* is essentially saying that as $n\to\infty$, $f$ grows no faster than $g$, up to a constant factor. In contrast, big-Omega expresses the idea that **$f$ grows at least as fast as $g$**, again up to a constant factor, as $n$ grows to infinity.

#### Definition: Big-Theta notation

Let $f,g:\mathbb{N}\to\mathbb{R}$ be real-valued functions on integers. We say that *f is $\Theta\left(g(n)\right)$* if there exist real constants $c,C>0$ and an integer $n_0$ such that:
$$c\cdot g(n)\le f(n)\le C\cdot g(n), \ \ \forall n>n_0.$$

Saying that *$f$ is big-Theta of $g$* is the same as saying that **$f$ is both $O(g(n))$ and $\Omega(g(n))$**.
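
A short worked example may help here (the function below is chosen purely for illustration). Take $f(n)=3n\log_2(n)+2n$ and $g(n)=n\log_2(n)$. For $n\ge 2$ we have $\log_2(n)\ge 1$, hence $2n\le 2n\log_2(n)$, and therefore:
$$3\,n\log_2(n)\le f(n)\le 5\,n\log_2(n), \ \ \forall n>1.$$
The definition is thus satisfied with $c=3$, $C=5$ and $n_0=1$, so $f$ is $\Theta(n\log(n))$.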

**Comments:** To conclude this section, here are some important complements to keep in mind from Goodrich-Tamassia-Goldwasser.
- Part 3.3.2 gives some general rules of thumb for the comparative analysis of algorithm running times. There are several tables for illustration purposes, and an important takeaway is the following classification of growth rates, from best to worst:
$$O(1), \  O(\log(n)), \  O(n), \ O(n\log(n)), \  O(n^2), \cdots, \ O(a^n).$$
- See the comments on pp.129-130 on very large constants (e.g. $f(n)=10^{100} n$) and exponential running time.
- Portilla provides the following visual resource: https://www.bigocheatsheet.com/.
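
To get a feeling for how quickly these growth rates separate, here is a small sketch that evaluates each function at a few input sizes. The function name `growth_values` is hypothetical, and the base $a=2$ for the exponential and the base-2 logarithm are arbitrary choices made for illustration.

```python
import math


def growth_values(n):
    """Return the value of each common growth function at n, slowest first."""
    return {
        "1": 1,
        "log n": math.log2(n),
        "n": n,
        "n log n": n * math.log2(n),
        "n^2": n ** 2,
        "2^n": 2 ** n,
    }


for n in (8, 64, 512):
    row = growth_values(n)
    # Three significant digits are enough to compare orders of magnitude
    print(n, {name: f"{value:.3g}" for name, value in row.items()})
```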

## 2) Algorithm analysis examples

We illustrate the definitions above with concrete implementations. Most of the upcoming examples will deal with *arrays* in the sense of theoretical computer science (i.e. basic containers holding the memory addresses of objects).

An important point that we do not explain here is the following: If $f(n)=\sum_{j=0}^k a_j n^j$, we will say that $f$ is $O(n^k)$ (take the dominant term and drop the constant, as per the definition). Also, it is true that $f$ is $O(n^m)$ for any $m>k$, but to be able to compare the asymptotic running times of algorithms, we conventionally use the highest order/fastest growing term in the expression of $f$.
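
For completeness, the polynomial claim above follows from a one-line estimate (a sketch, with the coefficients $a_j$ as in the text). For $n\ge 1$ we have $n^j\le n^k$ whenever $j\le k$, so:
$$f(n)=\sum_{j=0}^k a_j n^j \le \sum_{j=0}^k |a_j|\, n^k = \left(\sum_{j=0}^k |a_j|\right) n^k, \ \ \forall n\ge 1.$$
Hence the definition of big-Oh holds with $c=\sum_{j=0}^k |a_j|$ and $n_0=0$, assuming at least one coefficient is nonzero so that $c>0$. For instance, $5n^2+3n+7\le 15n^2$ for all $n\ge 1$.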

### 2.a - Constant-time operations

All of the following operations are $O(1)$:
- Assigning a value to a variable.
- Elementary arithmetic operations.
- Printing the value of a variable.


### 2.b - $O(n)$ Example: Finding the max value in an array
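
A minimal sketch of this example, based on the linear scan mentioned in the to-do list above. It is not a verbatim copy of the book's code fragment; the name `find_max` and the choice to raise an error on empty input are illustrative assumptions. The loop body runs once per element and does a constant amount of work each time, so the running time is $O(n)$.

```python
def find_max(arr):
    """Return the largest element of a non-empty array."""
    if len(arr) == 0:
        raise ValueError("find_max() requires a non-empty array")

    # Start with the first element as the current maximum
    biggest = arr[0]

    # One comparison per element: n comparisons in total
    for value in arr:
        if value > biggest:
            biggest = value

    return biggest


print(find_max([3, 9, 2, 7]))  # 9
```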



### 2.c - $O(\log(n))$ Example: Binary search

Binary search is a crucial example to know, and one of the best illustrations of $O(\log(n))$ complexity.

Here we consider an array that is sorted and indexable, and we want to write a function that searches for a value in said array and returns its index. The obvious solution, which loops over all indices in the array, has $O(n)$ complexity.
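
For reference, the obvious linear solution could look like the sketch below. The name `linear_search` and the convention of returning `None` for a missing value are assumptions made for illustration. In the worst case (target absent or in the last position), every element is inspected once.

```python
def linear_search(arr, target):
    """Return the index of target in arr, or None if it is absent."""
    for idx, value in enumerate(arr):
        if value == target:
            return idx
    return None


print(linear_search([2, 4, 6, 8], 6))  # 2
print(linear_search([2, 4, 6, 8], 5))  # None
```

This version does not use the fact that the array is sorted, which is exactly the information binary search exploits.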

The trick of binary search is to reduce the size of the search pool at every iteration. To do this, one introduces three markers:
- The lower bound index *low*;
- The upper-bound index *high*;
- The midpoint *mid = (low+high)//2*.

At each step, the input value is compared to the array element at index *mid*. Here is the implementation:


In [None]:
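def binary_search(arr, target):
    '''
        Binary search in a sorted array

        :param arr: Sorted (ascending) indexable array
        :param target: Value to search for
        :return: Index of target in arr, or None if target is not found
    '''
    # Initialization of low and high indices
    low = 0
    high = len(arr) - 1

    # Main loop (at each step, either low is
    # increased or high is decreased)
    while low <= high:

        # Get midpoint idx
        mid = (low + high)//2

        # Case where we found target
        if target == arr[mid]:
            return mid

        # Increase low if target > arr[mid]
        elif target > arr[mid]:
            low = mid + 1

        # Decrease high if target < arr[mid]
        else:
            high = mid - 1

    # If low > high was reached then the search was unsuccessful.
    # Return None rather than False, since False == 0 could be
    # mistaken for index 0.
    return None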

For this algorithm, the worst case occurs when the target is absent from the array (or is only found on the very last iteration), so that the loop runs until the search interval is exhausted. To see why the worst-case time complexity is $O(\log(n))$ for an array of length $n$, notice that each iteration cuts the size of the search interval between *low* and *high* at least in half. If $T$ is the number of iterations required for the interval to shrink to a single element, then $T$ approximately satisfies $1 = \frac{n}{2^T}$, meaning that $T\approx \log_2(n)$. Up to constants depending on the other (constant-time) operations performed at each iteration, $T$ is essentially the execution time of binary search in the worst case, from which we get that the complexity is $O(\log(n))$.
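
The iteration count can also be checked empirically. The sketch below is a self-contained variant of binary search that additionally counts loop iterations (the name `binary_search_steps` is hypothetical). It searches for a value larger than every element, which forces the loop to run until the search interval is empty, and prints the count next to `n.bit_length()`, which equals $\lfloor\log_2(n)\rfloor+1$ for $n\ge 1$.

```python
def binary_search_steps(arr, target):
    """Return (index or None, number of loop iterations) for a sorted arr."""
    low, high = 0, len(arr) - 1
    steps = 0

    while low <= high:
        steps += 1
        mid = (low + high) // 2
        if arr[mid] == target:
            return mid, steps
        elif arr[mid] < target:
            low = mid + 1
        else:
            high = mid - 1

    return None, steps


for n in (1, 10, 1_000, 1_000_000):
    arr = list(range(n))
    # n itself is not in arr and is larger than every element
    _, steps = binary_search_steps(arr, n)
    print(n, steps, n.bit_length())
```

Going from a thousand to a million elements only doubles the number of iterations, which is the practical meaning of logarithmic growth.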

### 2.d - From $O(n^2)$ to $O(n)$ Example: Prefix averages

A prefix average of a finite sequence $\{a_i\}_{i=0}^{n-1}$ is a mean of the form:
$$S_j = \tfrac{1}{j+1}\sum_{i = 0}^{j}a_i, \ \ j=0,\cdots, (n-1).$$

In this example, we want to write a function that takes an array of numbers as input and outputs the array of prefix averages. The naive implementation recomputes each sum from scratch, which costs $1+2+\cdots+n=\tfrac{n(n+1)}{2}$ additions and gives an $O(n^2)$ time complexity, but this can be reduced to linear complexity. To see this, we note the recurrence:
$$(j+1)S_j+a_{j+1} = (j+2)S_{j+1}, \ \ j=0,\cdots, (n-2),$$
so that $S_{j+1}$ can be computed from $S_j$ and $a_{j+1}$:
$$S_{j+1} = \tfrac{1}{(j+2)} \left( (j+1)S_j+a_{j+1}\right), \ \ j=0,\cdots, (n-2).$$
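
To make the comparison concrete, here is a sketch of both approaches, taking $S_j$ to be the average of $a_0,\cdots,a_j$. The function names are hypothetical. The first version recomputes every sum from scratch, the second applies the recurrence above and does a constant amount of work per index.

```python
def prefix_averages_quadratic(arr):
    """Compute every prefix average from scratch: O(n^2) overall."""
    averages = []
    for j in range(len(arr)):
        total = 0.0
        # Re-summing a_0, ..., a_j costs j + 1 additions
        for i in range(j + 1):
            total += arr[i]
        averages.append(total / (j + 1))
    return averages


def prefix_averages_recurrence(arr):
    """Compute S_{j+1} from S_j and a_{j+1}: O(n) overall."""
    n = len(arr)
    if n == 0:
        return []
    averages = [0.0] * n
    averages[0] = float(arr[0])
    for j in range(n - 1):
        averages[j + 1] = ((j + 1) * averages[j] + arr[j + 1]) / (j + 2)
    return averages


data = [1, 2, 3, 4]
print(prefix_averages_quadratic(data))   # [1.0, 1.5, 2.0, 2.5]
print(prefix_averages_recurrence(data))  # [1.0, 1.5, 2.0, 2.5]
```

With floating-point inputs the two versions may differ by rounding error, since they perform the arithmetic in a different order.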

A further simplification is to simply keep track of the current sum of elements in the array, and divide it by the number of elements.

In the implementation below, we initialize our output list by filling it with a known number of zeros. We do this to avoid the use of the *append()* method.

In [None]:
def get_prefix_averages(arr):
    '''
        Function to compute prefix averages
        
        :param arr: Array of numbers [a_0, ..., a_(n-1)]
        :return avg_arr: Array whose i-th index is the average of [a_0, ..., a_i]
    '''
    N = len(arr)
    avg_arr = [0]*N
    current_sum = 0.0
    
    for i in range(N):
        
        current_sum += arr[i]
        avg_arr[i] = current_sum/(i+1)
        
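    # Alternative (equivalent for non-empty input): treat index 0
    # separately and start the loop at 1. Note that current_sum
    # must still be updated at each step:
    #
    # avg_arr[0] = arr[0]
    # current_sum = arr[0]
    # for i in range(1, N):
    #     current_sum += arr[i]
    #     avg_arr[i] = current_sum/(i+1)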
    
    return avg_arr
        

### 2.e - From $O(n^3)$ to $O(n^2)$ Example: Three-way set disjointness
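
Pending a full write-up, here is a sketch of the idea under the following assumptions: we are given three sequences `A`, `B`, `C` of length at most $n$, none of which contains duplicate values, and we want to know whether no value belongs to all three. The function names are hypothetical and the code is not a verbatim copy of the book's fragments. The first version tests all triples, which takes $O(n^3)$ time. The second only scans `C` when a matching pair `a == b` has been found. Since each sequence has distinct values, there are at most $n$ such pairs, so the total cost drops to $O(n^2)$.

```python
def disjoint_cubic(A, B, C):
    """Return True if no value belongs to A, B and C at once (O(n^3))."""
    for a in A:
        for b in B:
            for c in C:
                if a == b == c:
                    return False
    return True


def disjoint_quadratic(A, B, C):
    """Same result, but C is only scanned for pairs with a == b (O(n^2))."""
    for a in A:
        for b in B:
            # At most n matching pairs when each sequence has distinct values
            if a == b:
                for c in C:
                    if a == c:
                        return False
    return True


print(disjoint_quadratic([1, 2, 3], [3, 4, 5], [5, 6, 7]))  # True
print(disjoint_quadratic([1, 2, 3], [3, 4, 5], [3, 8, 9]))  # False
```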



### 2.f - From $O(n^2)$ to $O(n\log(n))$ Example: Sorting as a problem solving tool
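
Pending a full write-up, here is a sketch using element uniqueness as the illustrative problem (an assumption on my part, since the heading does not name a specific problem): decide whether all values in an array are distinct. The function names are hypothetical. Comparing all pairs takes $O(n^2)$ time. After sorting, any duplicates must sit next to each other, so a single scan of adjacent pairs suffices and the $O(n\log(n))$ sort dominates the cost.

```python
def unique_quadratic(arr):
    """Return True if no value appears twice, comparing all pairs (O(n^2))."""
    for j in range(len(arr)):
        for k in range(j + 1, len(arr)):
            if arr[j] == arr[k]:
                return False
    return True


def unique_sorting(arr):
    """Return True if no value appears twice, using a sorted copy (O(n log n))."""
    ordered = sorted(arr)                 # O(n log n), leaves arr untouched
    for j in range(1, len(ordered)):      # O(n) scan of adjacent pairs
        if ordered[j - 1] == ordered[j]:
            return False
    return True


print(unique_sorting([4, 1, 3, 2]))  # True
print(unique_sorting([4, 1, 3, 1]))  # False
```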