# 605.621 - Foundations of Algorithms

## Assignment 02

Sabbir Ahmed

February 14, 2021

### Question 1

\[20 pts, divide-and-conquer\]

Find and explain a divide-and-conquer solution to a problem (suitable) that we have not
discussed in class or presented in the book. (For example, at a restaurant, friends
searching an empty table in different parts of the dining area.)

### Answer

An example of a divide-and-conquer solution to an everyday problem is gathering items at the grocery store. If I go to the store by myself, I have to find every item on my grocery list by walking through each aisle that holds one of them. If I bring another person to help, we can divide the grocery list so that each of us only walks through the aisles on our side of the store; e.g. I gather items from the produce section while the other person gathers items from the poultry section. The more people I bring with me, the smaller each person's portion of the list becomes and the faster all the items can be gathered. Once every portion has been gathered, the items are combined and checked out together.
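
The split described above can be sketched in a few lines. This is only an illustration: the function `shopping_time`, the per-item walking times, and the assumption that shoppers work fully in parallel are all hypothetical and not part of the assignment.

```python
def shopping_time(items, shoppers):
    """Estimate elapsed minutes when `shoppers` people split the list `items`.

    `items` is a list of (name, minutes) pairs; the minutes are made-up
    walking times used only for illustration.
    """
    # Base case: a single shopper (or a single item) is handled sequentially.
    if shoppers <= 1 or len(items) <= 1:
        return sum(minutes for _, minutes in items)

    # Divide: split the list and the shoppers into two groups.
    mid = len(items) // 2
    left = shopping_time(items[:mid], shoppers // 2)
    right = shopping_time(items[mid:], shoppers - shoppers // 2)

    # Combine: both groups meet at the checkout, so the slower group
    # determines the elapsed time (assumes the groups work in parallel).
    return max(left, right)


grocery_list = [("apples", 3), ("kale", 2), ("chicken", 4), ("eggs", 1)]
for people in (1, 2, 4):
    print(people, "shopper(s):", shopping_time(grocery_list, people), "minutes")
```

The combine step is what distinguishes this from simply working faster: each group solves a smaller copy of the same problem, and the results are merged at the checkout.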

-----------------------------------------

### Question 2

\[20 pts, recurrence\]

(a) Show that the recurrence $T(n)=T(7n/10)+n$ has an upper bound complexity $T(n)=O(n)$

(b) Find and show an upper bound complexity for the recurrence $T(n)=2T(n/2)+n^4$

### Answer

(a) Using the Master method's form, $T(n)=aT(n/b)+f(n)$,

$T(n)=T(7n/10)+n \Rightarrow a=1, b=10/7, f(n)=n^1$

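According to case 3 of the Master method, if $f(n) = \Omega(n^c)$ for some $c > \log_b a$, and $a f(n/b) \le k f(n)$ for some constant $k < 1$ and all sufficiently large $n$, then $T(n)=\Theta(f(n))$

$\log_b a = \log_{10/7} 1 = 0$

$f(n) = n^1 \Rightarrow c = 1$

$\Rightarrow c > \log_b a$

Regularity condition: $a f(n/b) = 7n/10 \le k n$ with $k = 7/10 < 1$

Therefore, $T(n)=\Theta(n)$, which in particular gives the upper bound $T(n)=O(n)$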
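
As a cross-check that does not rely on the Master method, the recurrence can also be unrolled directly. The derivation below assumes $T(n) = \Theta(1)$ for $n \le 1$, a base case the question does not state.

$$
\begin{aligned}
T(n) &= n + T\left(\tfrac{7}{10}n\right) \\
     &= n + \tfrac{7}{10}n + T\left(\left(\tfrac{7}{10}\right)^2 n\right) \\
     &\le n \sum_{i=0}^{\infty} \left(\tfrac{7}{10}\right)^i + \Theta(1) \\
     &= \frac{n}{1 - 7/10} + \Theta(1) = \tfrac{10}{3}\,n + \Theta(1) = O(n)
\end{aligned}
$$

The costs shrink geometrically from one level to the next, so the top-level cost $n$ dominates the sum.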

(b) Using the Master method's form, $T(n)=aT(n/b)+f(n)$,

$T(n)=2T(n/2)+n^4 \Rightarrow a=2, b=2, f(n)=n^4$

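According to case 3 of the Master method, if $f(n) = \Omega(n^c)$ for some $c > \log_b a$, and $a f(n/b) \le k f(n)$ for some constant $k < 1$ and all sufficiently large $n$, then $T(n)=\Theta(f(n))$

$\log_b a = \log_{2} 2 = 1$

$f(n) = n^4 \Rightarrow c = 4$

$\Rightarrow c > \log_b a$

Regularity condition: $a f(n/b) = 2(n/2)^4 = n^4/8 \le k n^4$ with $k = 1/8 < 1$

Therefore, $T(n)=\Theta(n^4)$, so $O(n^4)$ is an upper bound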
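
The bound can be sanity-checked numerically. The sketch below evaluates the recurrence exactly for powers of two, assuming a base case $T(1) = 1$ that the question does not specify. Unrolling gives $T(n) = n^4 \sum_{i=0}^{\log_2 n - 1} 8^{-i} + n$, so the ratio $T(n)/n^4$ should settle near $8/7$ if $T(n)=\Theta(n^4)$.

```python
def T(n):
    """Evaluate T(n) = 2*T(n/2) + n^4 exactly for n a power of two.

    The base case T(1) = 1 is an assumption made for this check.
    """
    if n == 1:
        return 1
    return 2 * T(n // 2) + n**4


# Print the ratio T(n) / n^4 for growing n; a bounded, converging ratio
# is consistent with T(n) = Theta(n^4).
for k in (4, 8, 16, 20):
    n = 2**k
    print(f"n = 2^{k:<2}  T(n)/n^4 = {T(n) / n**4:.6f}")
```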

-----------------------------------------

### Question 3

\[20 pts, binary search\]

Given the binary (i.e. 2-ary) search algorithm as in the following, write the 4-ary search
function. Empirically show that 4-ary search is faster with a sufficiently high input array size.
```python
def bsearch(A, l, r, key): # i.e. bsearch(A,0,len(A)-1,key)
    if l <= r:
        N_2 = (l+r)//2
        if A[N_2] == key:
            return N_2
        elif A[N_2] > key:
            return bsearch(A, l, N_2-1, key)
        else:
            return bsearch(A, N_2+1, r, key)
    else:
        return -1
```

### Answer

```python
def bsearch(A, l, r, key):  # i.e. bsearch(A, 0, len(A)-1, key)
    if l <= r:
        N_2 = (l + r) // 2
        if A[N_2] == key:
            return N_2
        elif A[N_2] > key:
            return bsearch(A, l, N_2 - 1, key)
        else:
            return bsearch(A, N_2 + 1, r, key)
    else:
        return -1
```

```python
def qsearch(A, l, r, key):
    if l <= r:
        q = (r - l) // 4  # quarter of the current range
        p1 = l + q  # first quarter pivot

        # key found at the first pivot
        if A[p1] == key:
            return p1

        # key lies in the first quarter
        elif A[p1] > key:
            return qsearch(A, l, p1 - 1, key)

        # key lies beyond the first quarter
        else:

            p2 = p1 + q  # second quarter pivot

            # key found at the second pivot
            if A[p2] == key:
                return p2

            # key lies in the second quarter
            elif A[p2] > key:
                return qsearch(A, p1 + 1, p2 - 1, key)

            # key lies beyond the second quarter
            else:

                p3 = p2 + q  # third quarter pivot

                # key found at the third pivot
                if A[p3] == key:
                    return p3

                # key lies in the fourth (last) quarter
                elif A[p3] < key:
                    return qsearch(A, p3 + 1, r, key)

                # key lies in the third quarter
                else:
                    return qsearch(A, p2 + 1, p3 - 1, key)
    else:
        return -1
```


```python
from random import sample, seed
import time
seed(time.time())

def gen_input_lists(input_list_len, linear=True):
    """
    Generate an input list of length `input_list_len`

    Args:
        `input_list_len` <int>: length of the input list
        `linear` <bool:True>:   flag to decide the type of input list generated.
                                If True, generate an input list of linearly ascending integers.
                                If False, generate an input list of random ascending integers.

    Returns:
        <list>: input list of length `input_list_len`
    """
    return list(range(input_list_len)) if linear \
        else sorted(sample(range(input_list_len * 10), input_list_len))

def gen_rand_keys(input_list_len, input_list):
    """
    Generate a list of `input_list_len`/2 random keys to gauge the performances of the 2 searches

    Args:
        `input_list_len` <int>: length of the input list
        `input_list` <list(int)>: input list of `input_list_len` integers

    Returns:
        <list>: a list of `input_list_len`/2 random keys
    """
    return sample(input_list, input_list_len // 2)


def time_search(search_func, input_list_len, input_list, rand_keys):
    """
    Loop through the keys generated in `rand_keys` and search them with the chosen search function
    to return the total time

    Args:
        `search_func` <function:[bsearch, qsearch]>: the search function to time
        `input_list_len` <int>: length of the input list
        `input_list` <list(int)>: input list of `input_list_len` ascending integers
        `rand_keys` <list(int)>: list of random keys

    Returns:
        `total_time` <float>: the total time in seconds it took to search for every key
    """
    total_time = 0
    for rand_key in rand_keys:
        begin = time.time()
        search_func(input_list, 0, input_list_len - 1, rand_key)
        total_time += (time.time() - begin)

    return total_time
```
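
Timing each search individually with `time.time()` measures intervals close to the clock's resolution, which may contribute to run-to-run noise. A minimal alternative sketch, not the harness used for the tables in this answer, times the whole batch with `time.perf_counter()` and keeps the best of a few repeats. The helper `time_batch` and the stand-in `bisect_search` (built on the standard library's `bisect_left`) are hypothetical names introduced only for this example.

```python
import time
from bisect import bisect_left


def bisect_search(data, key):
    """Stand-in search built on bisect_left; returns the index of key or -1."""
    i = bisect_left(data, key)
    return i if i < len(data) and data[i] == key else -1


def time_batch(search_func, data, keys, repeats=3):
    """Return the best-of-`repeats` time in seconds to search for every key."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()  # monotonic, high-resolution clock
        for key in keys:
            search_func(data, key)
        best = min(best, time.perf_counter() - start)
    return best


data = list(range(100_000))
keys = data[::7]
print(f"{time_batch(bisect_search, data, keys):.6f} seconds")
```

To time the searches from this answer the same way, a wrapper such as `lambda A, k: qsearch(A, 0, len(A) - 1, k)` could be passed as `search_func`.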

#### Computing total runtimes with input lists of linearly increasing integers

```python
print(f"{'Length':<4} {'2-ary time':11} {'4-ary time':12}")
print("-" * 32)
for i in range(1, 8):

    input_list_len = 10**i  # scale input list length exponentially

    # generate the input list and the random keys
    input_list = gen_input_lists(input_list_len)
    rand_keys = gen_rand_keys(input_list_len, input_list)

    # compute the total times for bsearch(A, l, r, key) and qsearch(A, l, r, key)
    bsearch_time = time_search(bsearch, input_list_len, input_list, rand_keys)
    qsearch_time = time_search(qsearch, input_list_len, input_list, rand_keys)
    print(f"10^{i:<2} {bsearch_time:10.7f} {qsearch_time:11.7f}")
```

```
Length 2-ary time  4-ary time
--------------------------------
10^1   0.0000570   0.0000594
10^2   0.0003664   0.0003524
10^3   0.0066559   0.0069966
10^4   0.1218367   0.0431719
10^5   0.2811115   0.2392831
10^6   3.6670234   3.4766536
10^7  48.2982547  44.6696570
```


#### Computing total runtimes with input lists of random increasing integers

```python
print(f"{'Length':<4} {'2-ary time':11} {'4-ary time':12}")
print("-" * 32)
for i in range(1, 8):

    input_list_len = 10**i  # scale input list length exponentially

    # generate the input list and the random keys
    input_list = gen_input_lists(input_list_len, linear=False)
    rand_keys = gen_rand_keys(input_list_len, input_list)

    # compute the total times for bsearch(A, l, r, key) and qsearch(A, l, r, key)
    bsearch_time = time_search(bsearch, input_list_len, input_list, rand_keys)
    qsearch_time = time_search(qsearch, input_list_len, input_list, rand_keys)
    print(f"10^{i:<2} {bsearch_time:10.7f} {qsearch_time:11.7f}")
```

```
Length 2-ary time  4-ary time
--------------------------------
10^1   0.0000179   0.0000088
10^2   0.0001783   0.0001037
10^3   0.0016456   0.0014770
10^4   0.0246756   0.0192897
10^5   0.2975643   0.3028965
10^6   3.9130564   3.7128305
10^7  53.3614299  51.7778995
```


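In both tables the 4-ary search is faster than the 2-ary search at the two largest sizes, $10^6$ and $10^7$, by roughly 3% to 8%. Below $10^6$ the results are mixed: the 2-ary search is marginally faster at $10^1$ and $10^3$ on the linear lists and at $10^5$ on the random lists. The advantage of the 4-ary search therefore only becomes consistent once the input list is sufficiently large.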

The 4-ary search used above is a modified version of my original implementation:

```python
def qsearch(A, l, r, key):
    if l <= r:
        q = (r - l) // 4  # compute the quarter key
        p1 = l + q  # first quarter key
        p2 = p1 + q  # second quarter key
        p3 = p2 + q  # third quarter key

        # if value is found in any of the quarter keys, return the key
        if A[p1] == key:
            return p1

        elif A[p2] == key:
            return p2

        elif A[p3] == key:
            return p3

        # search the first quarter
        elif A[p1] > key:
            return qsearch(A, l, p1 - 1, key)

        # search the second quarter
        elif A[p2] > key:
            return qsearch(A, p1 + 1, p2 - 1, key)

        # search the last quarter
        elif A[p3] < key:
            return qsearch(A, p3 + 1, r, key)

        # search the third quarter
        else:
            return qsearch(A, p2 + 1, p3 - 1, key)
    else:
        return -1
```

However, the total runtimes of this version were very inconsistent. I realized afterwards that it could be optimized further by computing `p2` and `p3` only inside the nested branches that need them, so a search that is resolved at `p1` never pays for the extra arithmetic and comparisons.
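
One way to see why the measured gain is modest is to count pivot inspections instead of wall-clock time. The 4-ary search needs about half as many levels as the 2-ary search, but it may inspect up to three pivots per level. The sketch below uses iterative re-implementations written only for this count (`probes_binary` and `probes_quaternary` are hypothetical names, not the functions timed above); they follow the same pivot rule, `q = (r - l) // 4`.

```python
def probes_binary(A, key):
    """Iterative 2-ary search; returns (index or -1, pivots inspected)."""
    l, r, probes = 0, len(A) - 1, 0
    while l <= r:
        mid = (l + r) // 2
        probes += 1
        if A[mid] == key:
            return mid, probes
        if A[mid] > key:
            r = mid - 1
        else:
            l = mid + 1
    return -1, probes


def probes_quaternary(A, key):
    """Iterative 4-ary search; returns (index or -1, pivots inspected)."""
    l, r, probes = 0, len(A) - 1, 0
    while l <= r:
        q = (r - l) // 4
        # The three pivots are fixed from the range at the start of the level.
        for p in (l + q, l + 2 * q, l + 3 * q):
            probes += 1
            if A[p] == key:
                return p, probes
            if A[p] > key:
                r = p - 1  # key can only lie left of this pivot
                break
            l = p + 1  # key can only lie right of this pivot
    return -1, probes


A = list(range(0, 200_000, 2))
keys = A[::97]
avg2 = sum(probes_binary(A, k)[1] for k in keys) / len(keys)
avg4 = sum(probes_quaternary(A, k)[1] for k in keys) / len(keys)
print(f"average pivots inspected: 2-ary {avg2:.2f}, 4-ary {avg4:.2f}")
```

If the two averages come out close, the speed-up in the tables would come mostly from making fewer recursive calls rather than from inspecting fewer elements. That is one plausible reading, not something the timings alone establish.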

-----------------------------------------

### Question 4

\[40 pts, divide-and-conquer\]

Solve problem 4-5 (page 109), chip testing problem.

Given Problem 4-5:

**4-5 Chip testing**

Professor Diogenes has $n$ supposedly identical integrated-circuit chips that in principle are capable of testing each other. The professor’s test jig accommodates two chips at a time. When the jig is loaded, each chip tests the other and reports whether it is good or bad. A good chip always reports accurately whether the other chip is good or bad, but the professor cannot trust the answer of a bad chip. Thus, the four possible outcomes of a test are as follows:

|Chip A says|Chip B says|Conclusion|
|:----------|:----------|:----------|
|B is good|A is good|both are good, or both are bad|
|B is good|A is bad|at least one is bad|
|B is bad|A is good|at least one is bad|
|B is bad|A is bad|at least one is bad|

(a) Show that if more than $n/2$ chips are bad, the professor cannot necessarily determine which chips are good using any strategy based on this kind of pairwise test. Assume that the bad chips can conspire to fool the professor.

(b) Consider the problem of finding a single good chip from among $n$ chips, assuming that more than $n/2$ of the chips are good. Show that $\lfloor n/2 \rfloor$ pairwise tests are sufficient to reduce the problem to one of nearly half the size.

(c) Show that the good chips can be identified with $\Theta(n)$ pairwise tests, assuming that more than $n/2$ of the chips are good. Give and solve the recurrence that describes the number of tests.

### Answer

(a) Let $b$ denote the number of bad chips, so that $n-b$ chips are good. If $b \ge n/2$, then $b \ge n-b$, so there is at least one bad chip for every good chip. Each of the $n-b$ good chips truthfully reports the other good chips as good and every bad chip as bad. The bad chips can conspire to mirror this behavior: a group of $n-b$ bad chips reports each other as good and every chip outside the group as bad, while any remaining bad chips report every chip as bad. The good group and the conspiring group then produce exactly symmetric test results, so no strategy based on pairwise tests lets the professor tell which of the two groups is actually good.

(b) Let $g$ denote the number of good chips and $n-g$ denote the number of bad chips. Given $g > n/2$, the good chips strictly outnumber the bad chips.

Pair the chips arbitrarily and run one test per pair, for $\lfloor n/2 \rfloor$ tests in total. For each pair in which both chips report "good", keep one chip of the pair (either one) in a subset $G$, and discard both chips of every other pair. Every discarded pair contains at least one bad chip, and every kept pair is either both good or both bad, so keeping one chip from each kept pair preserves the assumption that there are more good chips than bad ones. $G$ now contains at most half of the original chips.

If $n$ is odd, one chip is left unpaired. Keep it in $G$ if the number of chips kept from the pairs is even, and discard it otherwise; in both cases $G$ has at most $\lceil n/2 \rceil$ chips and still contains more good chips than bad ones. The same procedure is then applied recursively to the chips left in $G$.

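(c) If more than $n/2$ chips are good, we can find one good chip by applying the reduction above recursively until a single chip remains. Each round uses $\lfloor n/2 \rfloor$ tests and leaves at most $\lceil n/2 \rceil$ chips, so the number of tests $T(n)$ needed to find one good chip satisfies the following recurrence:

$T(n) \le T(\lceil n/2 \rceil) + \lfloor n/2 \rfloor$

Using the Master method, from $T(n) \le T(\lceil n/2 \rceil) + \lfloor n/2 \rfloor$ we get $a=1, b=2, f(n)=\lfloor n/2 \rfloor = \Theta(n)$

According to case 3, if $f(n) = \Omega(n^c)$ for some $c > \log_b a$, and $a f(n/b) \le k f(n)$ for some constant $k < 1$, then $T(n)=\Theta(f(n))$

$\log_b a = \log_{2} 1 = 0$

$f(n) = \Theta(n^1) \Rightarrow c = 1$

$\Rightarrow 1 > 0$

Regularity condition: $a f(n/b) = \lfloor n/4 \rfloor \le \tfrac{1}{2} \lfloor n/2 \rfloor$, so $k = 1/2 < 1$

Therefore, $T(n)=O(n)$. To find the other good chips, we use the known good chip to test each of the remaining chips, which takes $n-1$ further tests whose results are accurate. The total number of tests is $T(n)+(n-1)=O(n)+(n-1)=\Theta(n)$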