# CSPB 3104 Assignment 4:

***
# Instructions

This assignment is to be completed as a python3 notebook.  When you upload, please upload the completed notebook (ipynb file).

The questions  provided  below will ask you to either write code or 
write answers in the form of markdown.

 Markdown syntax guide is here: [click here](https://github.com/adam-p/markdown-here/wiki/Markdown-Cheatsheet)

Using markdown you can typeset formulae using latex.
This way you can write nice readable answers with formulae like thus:

The algorithm runs in time $\Theta\left(n^{2.1\log_2(\log_2( n \log^*(n)))}\right)$, 
where $\log^*(n)$ is the inverse _Ackermann_ function.

__Double click anywhere on this box to find out how your instructor typeset it. Press Shift+Enter to go back.__

***

## Question 1



 Your professor has the brilliant idea of using heaps to select the pivot in the quicksort algorithm as follows:
   - Heapify the array $a$.
   - Choose a leaf element at random (i.e., an element in $A[\lceil \frac{n}{2} \rceil], \ldots, A[n]$) and use it as a pivot.
   - Apply Lomuto's partitioning. 

 If this scheme is used in quicksort, what is the __worst case__ complexity of the resulting algorithm?







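Heapifying an array takes $O(n)$ time, choosing a random leaf takes $O(1)$ time, and Lomuto's partitioning takes $\Theta(n)$ time, so each call of quicksort does $\Theta(n)$ work outside its recursive calls. Heapifying does not protect against a bad pivot: in a max-heap the smallest element is always a leaf (and in a min-heap the largest element is always a leaf), so the random choice can return an extreme element of the array. When that happens, partitioning leaves $n-1$ elements on one side and none on the other, and the recurrence becomes $T(n) = T(n-1) + \Theta(n)$, which solves to $\Theta(n^2)$. The recursion can never be deeper than $n$ levels with $O(n)$ work per level, so the __worst case__ complexity of the resulting algorithm is $\Theta(n^2)$, the same as plain quicksort.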
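
The leaf property used in this argument can be checked with a small experiment. This is only a sketch: it assumes Python's `heapq` module, which builds a min-heap, so the element to look for among the leaves is the maximum (for a max-heap the mirror statement holds for the minimum). The helper name `max_is_leaf` is made up for this illustration.

```python
import heapq
import random

def max_is_leaf(values):
    """Return True if the maximum of `values` sits in a leaf slot after heapify."""
    heap = list(values)
    heapq.heapify(heap)            # heapq builds a min-heap in O(n)
    n = len(heap)
    first_leaf = n // 2            # 0-based: slots n//2 .. n-1 have no children
    return heap.index(max(heap)) >= first_leaf

random.seed(0)
# Distinct values, so the position of the maximum is unambiguous.
trials = [random.sample(range(1000), 15) for _ in range(200)]
print(all(max_is_leaf(t) for t in trials))
```

Since an extreme element is always among the leaves, a random leaf can be the most unbalanced pivot that Lomuto's partitioning can receive.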

---
## Question 2a: Move Negatives To Left

 You are given an input array $a$ with negative and positive numbers. Write an algorithm that partitions the array so that the negative numbers are moved to the left hand side and positive numbers to the right hand side. The relative ordering between the negative numbers must not be altered, but you may alter the ordering amongst the positive numbers.



 Input: array a with positive and negative numbers. Size = n. 

 Output: partitioned array a, index j such that $a[0], \ldots, a[j-1]$ are negative and $a[j], \ldots, a[n-1]$ are positive.

 Note since arrays are passed by reference in python, you just need to return j

 Constraints: must be done in place. Relative ordering between negative elements unchanged.

 Example: 

 Input array a = [-2, 3, -1, 4, 5, -3, -4, -1, -2, 5]

 Output array a = [-2, -1, -3, -4, -1, -2, 3, 5, 5, 4]

 Output       j = 6


In [63]:
def move_negatives_to_left(a):
    j = neg_rec(a, 0, -1)
    return j

def neg_rec(a, i, j):
    if i == len(a):
        return j + 1
    else:
        if a[i] < 0:
            j += 1
            swap(a, i, j)
        return neg_rec(a, i+1, j)

def swap(A, i, j):
    temp = A[i]
    A[i] = A[j]
    A[j] = temp

# arr = [-2, 3, -1, 4, 5, -3, -4, -1, -2, 5]
# print(arr)
# move_negatives_to_left(arr)
# print(arr)


__2(b):__ Give the running time of your solution and briefly explain the logic by clearly writing down
the loop invariants that hold during the operation of your algorithm and why these invariants lead to the correct result.

The running time is $\Theta(n)$: the recursion makes exactly one call per array index ($n + 1$ calls in total), and each call does constant work (one comparison and at most one swap). Note that $j$ is the index of the last negative number placed so far, so $j + 1$ is the number of negatives found, which is the value returned.

Loop Invariants (these hold at the start of every recursive call, which plays the role of a loop iteration):
- $a[0], \ldots, a[j]$ are all negative and appear in the same relative order as in the input array. This is the left side of the array.
- $a[j+1], \ldots, a[i-1]$ are all positive. This is the right side of the part examined so far.
- $a[i], \ldots, a[n-1]$ have not been examined yet; each of them could end up on the left or on the right.

The invariants lead to the correct result because every step preserves them. When $a[i]$ is negative, $j$ is incremented and $a[i]$ is swapped with $a[j]$, which is either $a[i]$ itself or the first element of the positive block. The negative number therefore lands directly after the negatives found earlier, so their relative order is preserved, while the displaced positive number moves to position $i$ (this may reorder the positives, which is allowed). The base case $i = n$ means no element is left unexamined, so the first two invariants state that $a[0], \ldots, a[j]$ are negative and $a[j+1], \ldots, a[n-1]$ are positive, and returning $j + 1$ gives the required index.
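
The invariants can also be checked mechanically. The sketch below is an iterative variant of the recursive solution (the name `move_negatives_iterative` is introduced here only for illustration) that asserts the first two invariants at the start of every iteration. It assumes the array contains no zeros, as in the test harness, and the assertions make it quadratic, so they are for checking only.

```python
def move_negatives_iterative(a):
    """Iterative variant of the recursive solution, with invariant checks."""
    j = -1                                  # a[0..j] holds the negatives seen so far
    for i in range(len(a)):
        # Invariants: a[0..j] negative, a[j+1..i-1] positive, a[i..] unexamined.
        assert all(x < 0 for x in a[:j + 1])
        assert all(x > 0 for x in a[j + 1:i])
        if a[i] < 0:
            j += 1
            a[i], a[j] = a[j], a[i]         # place the negative right after the block
    return j + 1                            # count of negatives = first positive index

arr = [-2, 3, -1, 4, 5, -3, -4, -1, -2, 5]
j = move_negatives_iterative(arr)
print(j, arr)
```

The order of the positive numbers in the result may differ from the example output in Question 2a, which the problem statement permits.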

---
## Question 3: Median of Median Selection.

 In the class, we analyzed an approach for pivot selection that used median of 5 medians.  Here we explore what happens
with median of 3 medians.

 1. Divide the input array $a$ of size n into $\frac{n}{3}$ groups of $3$ elements each.
 2. Calculate the median of each group of 3 to create a new array $\hat{a}$ of these medians.
 3. Recursively apply the algorithm to find the median of $\hat{a}$. Let it be $m$.
 4. Use $m$ as the pivot element to partition the original array $a$.

__3(a)__ How many elements in the array $a$ are guaranteed to be less than the chosen pivot $m$? How many are guaranteed to be greater? Assume all elements in the array $a$ are distinct.











To count the elements that are guaranteed to be less than or greater than $m$, imagine the $\frac{n}{3}$ groups arranged in increasing order of their medians:

$M_1 < M_2 < \ldots < M_{n/6} < \ldots < M_{n/3}$

The pivot $m$ is the middle median, $M_{n/6}$. Each group has 3 elements, so every median has one element of its group that is smaller than it and one that is larger. About $\frac{n}{6}$ of the medians are less than or equal to $m$, and each of them brings along the smaller element of its group, which must also be less than $m$. This gives $2 \cdot \frac{n}{6} = \frac{n}{3}$ elements that are guaranteed to be less than $m$ (up to a small additive constant, since $m$ itself is not counted). The same argument applies on the other side: about $\frac{n}{6}$ medians are greater than or equal to $m$, and each brings along the larger element of its group, giving $2 \cdot \frac{n}{6} = \frac{n}{3}$ elements that are guaranteed to be greater than $m$. The remaining elements (roughly $\frac{n}{3}$ of them) can fall on either side.

So, about $\frac{n}{3}$ elements are guaranteed to be less than the chosen pivot $m$ and about $\frac{n}{3}$ elements are guaranteed to be greater than the chosen pivot $m$.
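
The count can be sanity-checked empirically. The following is a minimal sketch, not the algorithm from class: it computes the median of the group medians by sorting instead of recursing, and it assumes $n$ is a multiple of 3 with an odd number of groups so that the middle median is unambiguous. The function name `pivot_counts` is made up for this illustration.

```python
import random

def pivot_counts(a):
    """Return (#elements < m, #elements > m) for the median of 3 medians m."""
    groups = [sorted(a[i:i + 3]) for i in range(0, len(a), 3)]
    medians = sorted(g[1] for g in groups)  # sorting stands in for the recursive call
    m = medians[len(medians) // 2]
    return sum(x < m for x in a), sum(x > m for x in a)

random.seed(1)
n = 3 * 21                                  # 21 groups: odd, so the median is unique
for _ in range(1000):
    a = random.sample(range(10 * n), n)     # distinct elements
    less, greater = pivot_counts(a)
    assert less >= n // 3 and greater >= n // 3
```

With an odd number of groups the bound is exactly $\frac{n}{3}$ on each side: half of the other medians contribute 2 elements each, and the pivot's own group contributes 1.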

 __3(b)__ If $m$ computed using the median of 3 medians were used to partition the array $a$ for a *quickselect* algorithm that is used to find the median of an array $a$, write down the recurrence for $T(n)$, the time taken to find the median of an array of size $n$ using the quick select algorithm with the median of 3 medians trick.


From 3(a), at least $\frac{n}{3}$ elements are smaller than the pivot and at least $\frac{n}{3}$ are larger, so after partitioning, each side contains at most $\frac{2n}{3}$ elements. The recurrence is therefore:

$T(n) = T(\frac{n}{3}) + T(\frac{2n}{3}) + \Theta(n)$

Where:
- $T(\frac{n}{3})$ is the time to recursively find the median of the array $\hat{a}$ of $\frac{n}{3}$ group medians
- $T(\frac{2n}{3})$ is the time for the recursive quickselect call on the side that contains the median. In the worst case, that side has $\frac{2n}{3}$ elements, so $\frac{2n}{3}$ is used.
- $\Theta(n)$ is the time taken to compute the median of each group of 3 and to partition the array around $m$

 __3(c)__ The celebrated "Akra-Bazzi" method shows that the recurrence $S(n) = S(\alpha n) + S( (1-\alpha)n) + \Theta(n)$ with base case $S(1) = \Theta(1)$ has solution $S(n) = \Theta(n \log (n) )$. Use this to show that median of 3 medians trick fails to achieve a linear time algorithm for quickselect. (**Note** However, as we saw in the lecture, median of 5 medians works to provide $\Theta(n)$ deterministic selection algorithm or $\Theta(n \log(n))$ quicksort that does not depend on randomization in any way).

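The recurrence from 3(b), $T(n) = T(\frac{n}{3}) + T(\frac{2n}{3}) + \Theta(n)$, has exactly the form $S(n) = S(\alpha n) + S((1-\alpha)n) + \Theta(n)$ with $\alpha = \frac{1}{3}$ and $1 - \alpha = \frac{2}{3}$. By the stated Akra-Bazzi result, its solution is $T(n) = \Theta(n \log(n))$. Intuitively, the two subproblem sizes add up to $\frac{1}{3}n + \frac{2}{3}n = n$, so every level of the recursion tree still does $\Theta(n)$ work and there are $\Theta(\log(n))$ levels. A linear bound would need the subproblem sizes to add up to a constant fraction of $n$ that is strictly less than 1, as happens with median of 5 medians, where the sizes are $\frac{n}{5}$ and $\frac{7n}{10}$. Therefore the median of 3 medians trick gives a worst case of $\Theta(n \log(n))$ for quickselect and fails to achieve linear time.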
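
A numerical check makes the growth rate visible. This sketch evaluates the recurrence with integer rounding, taking the $\Theta(n)$ term to be exactly $n$ and the base case to be $T(n) = n$ for $n \le 2$; both choices are assumptions made only so the recurrence can be computed.

```python
from functools import lru_cache
import math

@lru_cache(maxsize=None)
def T(n):
    """T(n) = T(n/3) + T(2n/3) + n with integer rounding; T(n) = n for n <= 2."""
    if n <= 2:
        return n
    third = n // 3
    return T(third) + T(n - third) + n

# Columns: n, T(n)/n, T(n)/(n log2 n)
for n in (10**3, 10**4, 10**5, 10**6):
    print(n, round(T(n) / n, 2), round(T(n) / (n * math.log2(n)), 3))
```

If the algorithm were linear, the second column would level off. Instead it keeps growing with $n$, while the third column changes far more slowly, which is consistent with $\Theta(n \log(n))$.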

---

## Question 4: Detective Work on Pre-Order Traversal of a BST


 A BST with integer keys in each node is traversed using pre-order traversal and the keys in each node are presented in the order
they are visited as an array $a$ of $n$ elements -- $a: [a[1], \ldots, a[n]]$. Assume that the elements of this array are all distinct.



 __4(a)__ Describe an algorithm to reconstruct the tree in pseudocode. What is the complexity of your algorithm? 
 
 **Hint:** First identify the root of the tree. Next, how do we identify which elements of the array belong to the left subtree of the root, and which elements to the right subtree? Once that is done, can you recursively perform the reconstruction?

 Note that you will learn how to build trees properly in your CSPB 2270 class. Here, assume a pseudocode function called `build_tree(n, T1, T2)` that builds a tree with root node n and subtrees T1, T2 and returns it.



``` 
construct_preorder(a):
    if length(a) == 0:
        return NIL // empty subtree
    root := a[1] // the first key in pre-order is the root
    T1 := [] // keys of the left subtree, in pre-order
    T2 := [] // keys of the right subtree, in pre-order
    for idx = 2 to length(a): // start at 2 so the root itself is skipped
        if a[idx] < root:
            T1.append(a[idx]) // append to left because it is smaller
        else:
            T2.append(a[idx]) // append to right because it is bigger
    left := construct_preorder(T1)
    right := construct_preorder(T2)
    return build_tree(root, left, right)
```

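The first key of a pre-order traversal is the root. Every key smaller than the root belongs to the left subtree and every key larger than the root belongs to the right subtree, so one pass over the rest of the array splits it into T1 and T2. The pass keeps the keys in their original relative order, so T1 and T2 are themselves the pre-order traversals of the left and right subtrees. As these subtrees are BSTs themselves, the function is called recursively on each of them.

The running time satisfies $T(n) = T(k) + T(n-1-k) + \Theta(n)$, where $k$ is the size of the left subtree and the $\Theta(n)$ term is the for loop. In the worst case the tree is a chain (for example, when the array is sorted), so $k = 0$ at every level and $T(n) = T(n-1) + \Theta(n) = \Theta(n^2)$. For a balanced tree the same recurrence gives $\Theta(n \log(n))$.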
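
A runnable version of this reconstruction might look as follows. It is a sketch under one assumption: `build_tree` is not specified by the assignment beyond its signature, so here it simply returns a `(key, left, right)` tuple, with `None` for an empty subtree.

```python
def build_tree(key, left, right):
    """Stand-in for the assignment's build_tree: a node is a (key, left, right) tuple."""
    return (key, left, right)

def construct_preorder(a):
    """Rebuild a BST from its pre-order key list (distinct keys assumed)."""
    if not a:
        return None                                  # empty subtree
    root = a[0]                                      # first key in pre-order is the root
    left_keys = [x for x in a[1:] if x < root]       # still in pre-order
    right_keys = [x for x in a[1:] if x > root]      # still in pre-order
    return build_tree(root, construct_preorder(left_keys), construct_preorder(right_keys))

def preorder(t):
    """Pre-order traversal of a tuple tree, used to check the round trip."""
    return [] if t is None else [t[0]] + preorder(t[1]) + preorder(t[2])

pre = [8, 3, 1, 6, 4, 7, 10, 14, 13]
tree = construct_preorder(pre)
assert preorder(tree) == pre
```

Traversing the rebuilt tree in pre-order returns the input array, which is a quick way to confirm the reconstruction.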


 __4(b)__ Describe an algorithm that converts the array obtained using the pre-order traversal of a BST into an array representing the post-order traversal without reconstructing the tree. **Hint:** Use the previous part but now instead of reconstructing the tree, think of how pre and post order traversals differ.

The difference between post-order and pre-order is that post-order visits the left subtree, then the right subtree, then the node, whereas pre-order visits the node first.

1) Take the pre-order traversal array as input.
2) If the array is empty, return an empty array (base case).
3) The first element of the array is the root, so store it in a root variable.
4) Scan from the second element to find the index of the first key that is greater than the root. Every key in the left subtree is less than the root, and pre-order lists the whole left subtree before the right subtree, so this index is where the right subtree starts.
5) Left subtree: the elements after the root and before the index found in step 4. Right subtree: the elements from the index found in step 4 to the final index of the array. If no key is greater than the root, the right subtree is empty.
6) Call the function on the left subtree (recursive step).
7) Call the function on the right subtree (recursive step). Calling this on both subtrees generates the post-order traversal for each.
8) Return the left result, followed by the right result, followed by the root. This is the post-order array.
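
The steps above can be sketched in Python. The function name `pre_to_post` is chosen here for illustration, and slicing is used for brevity, which copies sub-arrays; an index-based version would avoid the copies.

```python
def pre_to_post(a):
    """Convert the pre-order key list of a BST to post-order without building the tree."""
    if not a:
        return []                                    # base case: empty subtree
    root = a[0]
    split = 1
    while split < len(a) and a[split] < root:        # find where the right subtree starts
        split += 1
    left = pre_to_post(a[1:split])                   # keys smaller than the root
    right = pre_to_post(a[split:])                   # keys larger than the root
    return left + right + [root]                     # post-order: left, right, node

print(pre_to_post([8, 3, 1, 6, 4, 7, 10, 14, 13]))
# prints [1, 4, 7, 6, 3, 13, 14, 10, 8]
```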

## Testing your solutions -- Do not edit code beyond this point

In [64]:
import random

def unequalArrays(a, b):
    n = min(len(a), len(b))
    for j in range(n):
        if a[j] != b[j]:
            return True
    if len(a) != len(b):
        return True
    return False

def test_move_negatives(a):
    b = [e for e in a if e < 0]
    j0 = len(b)
    j = move_negatives_to_left(a)
    res = True
    if j != j0:
        print('Failed: input =', a)
        print('Failed: expected value j = ', j0, ' Your code obtained j = ', j)
        res = False
    if unequalArrays(b, a[0:j]):
        if res:
            print('Failed: input =', a)
        print('Failed: the LHS portion should be = ', b)
        print('\t Your code returned: ', a[0:j])
        res = False
    return res

def createRandomArray(n):
    a = []
    for i in range(n):
        j = random.randint(-1000,1000)
        if j == 0: 
            j = 1
        a.append(j)
    return a

nPassed = 0
nTests = 10000
for i in range(0, nTests):
    a = createRandomArray(9)
    res = test_move_negatives(a)
    if res: 
        nPassed = nPassed + 1
print('Num Tests = ', nTests, ' Passed = ', nPassed)

Num Tests =  10000  Passed =  10000
