# Reflection poll

**Briefly discuss the time complexity of quicksort under the following two scenarios using examples from class today and knowing that the input array is already sorted: a) we use an O(n) randomization strategy, and b) we combine both randomization and the median of 3.**

In case a), randomizing the array lets us avoid the worst case of deterministic quicksort. On an already sorted input, a deterministic pivot such as the first element leaves all other elements on one side of the pivot at every step, so the recursion tree has height O(n) and the overall complexity is O(n^2). With the O(n) shuffle, the expected runtime is O(n log n) + O(n) = O(n log n). In case b), adding median of 3 on top of randomization does not change the expected O(n log n) bound, because after the shuffle the three sampled positions hold random elements and the pivot is still random. It does make very uneven splits less likely, but that only improves the constant factor.

# Pre-class

In [38]:
import timeit
import random

eps = 1e-16
N = 10
locations = [0.0, 0.5, 1.0 - eps]


def median(x1, x2, x3):
    for a in range(7):
        if x1 <= x2 <= x3:
            return x2
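        # Every loop I'm shufflin: the swaps below step through all six
        # orderings of (x1, x2, x3), so one of the checks above must succeed.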
        (x1, x2, x3) = (x2, x1, x3)
        if a % 2:
            (x1, x2, x3) = (x3, x1, x2)


def qsort(lst):
    indices = [(0, len(lst))]

    while indices:
        (frm, to) = indices.pop()
        if frm == to:
            continue

        # Find the partition:
        N = to - frm
        inds = [frm + int(N * n) for n in locations]
        values = [lst[ind] for ind in inds]
        partition = median(*values)

        # Split into lists:
        lower = [a for a in lst[frm:to] if a < partition]
        upper = [a for a in lst[frm:to] if a > partition]
        counts = sum([1 for a in lst[frm:to] if a == partition])

        ind1 = frm + len(lower)
        ind2 = ind1 + counts

        # Push back into correct place:
        lst[frm:ind1] = lower
        lst[ind1:ind2] = [partition] * counts
        lst[ind2:to] = upper

        # Enqueue other locations
        indices.append((frm, ind1))
        indices.append((ind2, to))
    return lst


def randomized_quicksort():
    lst = list(range(N))
    random.shuffle(lst)
    return qsort(lst)


def test_quicksort():
    lst = randomized_quicksort()
    return lst == list(range(N))


# Is our algorithm correct?
print(test_quicksort())

# How fast is our algorithm?
print(timeit.timeit(randomized_quicksort, number=1))

True
0.0001643149998926674


## Task 1
**Change the quicksort algorithm in a way that you don’t separate the items that are equal to the partition.**<br/>
Q. What is the time complexity of the new quicksort when dealing with a list of duplicates?<br/>
A. O(n). I just ignore the elements equal to the partition, so if all elements are duplicates, the lower and upper arrays are empty and nothing is left to sort after the first pass. That pass still scans all n elements.

Q. What was the time complexity before your modifications?<br/>
A. O(n) as well: the first iteration scans the list once, and after it we only have empty subarrays to work with.
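A quick way to check the duplicate case is to count partition passes directly. The sketch below is a simplified three-way quicksort, not the notebook's `qsort`: the helper name `three_way_passes` is hypothetical, and it assumes the middle element of each range as the pivot instead of a median of three.

```python
def three_way_passes(values):
    """Return (sorted_list, passes, elements_scanned) for a three-way quicksort.

    Assumption: the pivot is the middle element of the range, which keeps
    the sketch short; the notebook's version samples three positions.
    """
    lst = list(values)
    passes = 0
    scanned = 0
    stack = [(0, len(lst))]
    while stack:
        frm, to = stack.pop()
        if frm == to:
            continue
        pivot = lst[(frm + to) // 2]
        chunk = lst[frm:to]
        passes += 1
        scanned += len(chunk)
        lower = [a for a in chunk if a < pivot]
        upper = [a for a in chunk if a > pivot]
        equal = len(chunk) - len(lower) - len(upper)
        # Elements equal to the pivot are placed once and never revisited.
        lst[frm:to] = lower + [pivot] * equal + upper
        stack.append((frm, frm + len(lower)))
        stack.append((to - len(upper), to))
    return lst, passes, scanned


result, passes, scanned = three_way_passes([7] * 1000)
print(passes, scanned)  # 1 1000
```

One pass over 1000 elements and nothing left to sort: linear work rather than constant.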

In [116]:
def qsort(lst):
    indices = [(0, len(lst))]

    while indices:
        (frm, to) = indices.pop()
        if frm == to:
            continue

        # Find the partition:
        N = to - frm
        inds = [frm + int(N * n) for n in locations]
        values = [lst[ind] for ind in inds]
        partition = median(*values)

        # Split into lists:
        lower = [a for a in lst[frm:to] if a < partition]
        upper = [a for a in lst[frm:to] if a > partition]

        ind1 = frm + len(lower)
        ind2 = to - len(upper)

        # Push back into correct place:
        lst[frm:ind1] = lower
        lst[ind1:ind2] = [partition] * (ind2 - ind1)
        lst[ind2:to] = upper
    

        # Enqueue other locations
        indices.append((frm, ind1))
        indices.append((ind2, to))
     
    return lst

print(test_quicksort())
print(timeit.timeit(randomized_quicksort, number=1))

True
0.00013740499980485765


## Task 2
**Remove the median-of-3 partitioning, and just use the first element in the array.**<br/>
Q. Does this modification change the time complexity? Explain your answer.<br/>
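A. No. The worst case is still O(n^2) and the expected case on a shuffled list is still O(n log n). What changes is the probability of the worst-case scenario: assuming that elements are all different, a first-element pivot is the smallest or largest element with probability 2/n at each step, whereas the median of three different elements can never be the smallest or largest.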

Q. Will this change the practical performance? Why or why not?<br/>
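A. Yes. With median-of-3 we get more even splits on average, so we loop fewer times and perform fewer steps. With only the first element, the splits are less even, but each step skips the sampling and the call to `median`. On a list as small as N = 10, that saved overhead can outweigh the extra steps.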
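The effect on split quality can be computed exactly for a small list. The sketch below assumes a shuffled list of `n = 21` distinct elements (a size picked only so that every choice can be enumerated) and compares a single uniformly random pivot with the median of three distinct sampled elements. `larger_side` is a hypothetical helper.

```python
from itertools import combinations

n = 21  # small enough to enumerate every possible choice
ranks = range(n)


def larger_side(rank):
    """Size of the bigger sublist when the pivot has this 0-based rank."""
    return max(rank, n - 1 - rank)


# Single pivot: on a shuffled list every rank is equally likely.
single = sum(larger_side(r) for r in ranks) / n

# Median of three distinct elements: enumerate all C(n, 3) triples.
triples = list(combinations(ranks, 3))
median3 = sum(larger_side(sorted(t)[1]) for t in triples) / len(triples)

# Probability that the pivot is the smallest or largest element.
extreme_single = sum(1 for r in ranks if r in (0, n - 1)) / n
extreme_median3 = sum(
    1 for t in triples if sorted(t)[1] in (0, n - 1)
) / len(triples)

print(single, median3)
print(extreme_single, extreme_median3)
```

The average size of the larger sublist is smaller with the median of three, and the chance of an extreme pivot drops from 2/n to zero.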

In [86]:
def qsort(lst):
    indices = [(0, len(lst))]

    while indices:
        (frm, to) = indices.pop()
        if frm == to:
            continue

        # Find the partition: just use the first element of the range
        partition = lst[frm]


        # Split into lists:
        lower = [a for a in lst[frm:to] if a < partition]
        upper = [a for a in lst[frm:to] if a > partition]
        counts = sum([1 for a in lst[frm:to] if a == partition])

        ind1 = frm + len(lower)
        ind2 = ind1 + counts

        # Push back into correct place:
        lst[frm:ind1] = lower
        lst[ind1:ind2] = [partition] * counts
        lst[ind2:to] = upper

        # Enqueue other locations
        indices.append((frm, ind1))
        indices.append((ind2, to))
        
    return lst

print(test_quicksort())
print(timeit.timeit(randomized_quicksort, number=1))

True
6.123099956312217e-05


## Task 3
**Implement a recursive version of qsort. Given the limitation of Python so that it can only make 500 recursive calls, estimate the maximum size of the list that can be sorted by Python. Explicitly state all assumptions you make in getting to an answer.**

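Assumptions: every partition splits its range exactly in half, and the limit of 500 applies to the total number of recursive calls. At each partitioning, we make two recursive calls, so the total number of recursive calls is $2^1 + 2^2 + 2^3 + \dots + 2^{\log_2 n}$. Then $\sum_{i=1}^{\log_2 n} 2^i = 2n - 2 \leq 500 \Rightarrow n \leq 251$. If the limit instead bounds the depth of nested calls, even splits give a depth of only about $\log_2 n$, and it is the worst case (one-sided splits, depth about $n$) that restricts the list to roughly 500 elements.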
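Python's own recursion limit counts the depth of nested calls, not the total number of calls, so it may be worth checking how deep a recursive quicksort actually nests. The sketch below is an assumption-laden illustration: `max_depth` is a hypothetical helper that uses a first-element pivot, and `n = 400` is chosen to stay under the interpreter's limit even in the worst case.

```python
import random
import sys


def max_depth(arr, begin=0, end=None, depth=1):
    """Recursive quicksort (first-element pivot) that sorts arr in place
    and returns the deepest level of nesting it reached."""
    if end is None:
        end = len(arr)
    if end - begin < 2:
        return depth
    pivot = arr[begin]
    rest = arr[begin + 1:end]
    lower = [a for a in rest if a < pivot]
    upper = [a for a in rest if a >= pivot]
    arr[begin:end] = lower + [pivot] + upper
    mid = begin + len(lower)
    return max(max_depth(arr, begin, mid, depth + 1),
               max_depth(arr, mid + 1, end, depth + 1))


n = 400  # kept small so the sorted case does not hit the recursion limit
already_sorted = list(range(n))
shuffled = already_sorted[:]
random.Random(1).shuffle(shuffled)

sorted_depth = max_depth(already_sorted[:])
shuffled_depth = max_depth(shuffled[:])

print(sys.getrecursionlimit())  # the interpreter's current depth limit
print(sorted_depth)             # 400: one nesting level per element
print(shuffled_depth)
```

On the sorted input the depth equals the list length, so under the depth reading it is the worst case that caps the list size. On the shuffled input the depth stays far below n.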

In [108]:
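def partition(arr, begin, end):
    # Pick a random pivot in arr[begin:end] and park it at the end of the range
    idx = random.randint(begin, end - 1)
    pivot_elem = arr[idx]
    arr[idx], arr[end - 1] = arr[end - 1], arr[idx]
    # Lomuto scan: arr[begin:i] holds the elements smaller than the pivot
    i = begin
    for j in range(begin, end - 1):
        if arr[j] < pivot_elem:
            arr[i], arr[j] = arr[j], arr[i]
            i += 1
    # Move the pivot into its final position and return that index
    arr[i], arr[end - 1] = arr[end - 1], arr[i]
    return i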

def _quicksort(arr, begin, end):
    if begin + 1 < end:
        pivot = partition(arr, begin, end)
        _quicksort(arr, begin, pivot)
        _quicksort(arr, pivot + 1, end)
        
    return arr

def quicksort(arr):
    copy_arr = arr[:]
    return _quicksort(copy_arr, 0, len(copy_arr))

quicksort([1, 4, 5, 2, 3, 4])

[1, 2, 3, 4, 4, 5]

# Learning gist
**Generate a short, concise synthesis of the learning (~300 words) from the class session that the student will share with classmates and submit via the form to the instructor.**

In today's class, we compared deterministic and randomized quicksort. The comparison showed why randomized quicksort gives better practical results, even though it takes an additional O(n) to randomize the input. Even if the input is deliberately given as a sorted array (which costs O(n^2) for deterministic quicksort with a fixed first- or last-element pivot), the probabilistic analysis still gives an expected running time of O(n log n) for the randomized version. Instead of randomizing, we could also use the median-of-3 approach to deal with the sorted-array case, since it is more likely to select a pivot that gives an even split.