# CSPB-3104 Programming Assignment 3



1) (10 points) Implement quicksort

Your function must sort the array in place and return the number of direct comparisons (is a[i] < a[j]?) made.

In [1]:
def quicksort(a): # sorts the list a in place and returns the number of element comparisons made
    # partition a[low..high] around a pivot and return the pivot's final index
    def partition(low, high, comparison_count): # low = first index of the subarray, high = last index of the subarray
        # comparison_count is a one-element list so that every recursive call updates the same counter.
        # If it were a plain int, the update would be comparison_count += 1, which only rebinds the local name
        # inside this call; the caller's value would never change, so even after all the recursive
        # calls the total would still be 0.
        
        pivot = a[high] # choose the last element of the subarray as the pivot
        # (for the first call on the example array in the notes below, this is a[8] = 8, and j runs from 0 to 7)
        i = low - 1 # index of the last element known to be smaller than the pivot
        
        # scan the subarray, moving elements smaller than the pivot to the front
        for j in range(low, high): # high is excluded because a[high] is the pivot
            comparison_count[0] += 1 # count one comparison of a[j] against the pivot
            if a[j] < pivot: # a[j] belongs in the "smaller" region, so grow that region by one
                i += 1
                a[i], a[j] = a[j], a[i] # and swap a[j] into it
        
        # put the pivot in its final place, right after the "smaller" region (at the front of region 2)
        a[i+1], a[high] = a[high], a[i+1]
        
        return i+1 # final index of the pivot
    
    # recursive helper that sorts a[low..high]
    def recursion_helper(low, high, comparison_count): 
        if low < high: # base case: a subarray with fewer than two elements is already sorted
            pivot_index = partition(low, high, comparison_count) # final index of the pivot (the i+1 returned above)
            recursion_helper(low, pivot_index - 1, comparison_count) # sort region 1 (elements smaller than the pivot)
            recursion_helper(pivot_index + 1, high, comparison_count) # sort region 2 (elements not smaller than the pivot)
            
    comparison_count = [0] # shared comparison counter; a one-element list so the nested calls can update it
    recursion_helper(0, len(a) - 1, comparison_count) # sort the whole array with recursive quicksort
    
    return comparison_count[0] 

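The comments in `partition` explain why the counter is kept in a one-element list. The following is a minimal sketch of that point only; the helper names `bump_int` and `bump_list` are hypothetical and not part of the assignment.

```python
def bump_int(count):
    # Rebinds the local name only; the caller's integer is unchanged.
    count += 1


def bump_list(count):
    # Mutates the shared list object, so the caller sees the update.
    count[0] += 1


plain_counter = 0
list_counter = [0]

for _ in range(3):
    bump_int(plain_counter)
    bump_list(list_counter)

print(plain_counter)    # 0
print(list_counter[0])  # 3
```

Another option would be to declare the counter `nonlocal` inside the nested functions; the list approach used above simply avoids that declaration.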

Notes for review (how the data is travelling):

example array: [3, 1, 4, 6, 2, 5, 9, 7, 8]

initial call:
- recursion_helper(0, 8, comparison_count)
    - low = 0, high = 8, comparison_count = [0]

first partition call: 
- partition(0, 8, comparison_count):
    - pivot chosen: a[8] = 8
    - partitioning:
        - iterate through the subarray with j from 0 to 7
        - make comparisons: a[0] < 8, a[1] < 8, ..., a[7] < 8 (8 comparisons)
        - swap elements less than the pivot to the left
        - the array becomes [3, 1, 4, 6, 2, 5, 7, 8, 9], with pivot 8 at index 7

    - returns: pivot index = 7; comparison_count is now [8]
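The first partition step can be reproduced on its own. This is a sketch under the assumption that the partition is the last-element-pivot scheme used in the code above (often called Lomuto partitioning); `lomuto_partition` is a name chosen here for illustration, and it returns the comparison count instead of updating a shared counter.

```python
def lomuto_partition(a, low, high):
    """Partition a[low..high] around a[high]; return (pivot_index, comparisons)."""
    pivot = a[high]
    i = low - 1          # last index of the "smaller than pivot" region
    comparisons = 0
    for j in range(low, high):
        comparisons += 1
        if a[j] < pivot:
            i += 1
            a[i], a[j] = a[j], a[i]
    # Move the pivot just after the "smaller" region.
    a[i + 1], a[high] = a[high], a[i + 1]
    return i + 1, comparisons


example = [3, 1, 4, 6, 2, 5, 9, 7, 8]
pivot_index, comparisons = lomuto_partition(example, 0, len(example) - 1)
print(example)      # [3, 1, 4, 6, 2, 5, 7, 8, 9]
print(pivot_index)  # 7
print(comparisons)  # 8
```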
    
recursive calls after first partition:
   - left subarray (region 1): recursion_helper(0, 6, comparison_count):
        - low = 0, high = 6
   - right subarray (region 2): recursion_helper(8, 8, comparison_count):
        - low = 8, high = 8 (base case: no action taken)
        
second partition call for left subarray:
- partition(0, 6, comparison_count):
    - pivot chosen: a[6] = 7
    - partitioning:
        - iterate through j from 0 to 5
        - make 6 comparisons; every element is less than 7, so each swap leaves the element where it is
        - the array is still [3, 1, 4, 6, 2, 5, 7, 8, 9], with pivot 7 at index 6
    - returns: pivot index = 6; comparison_count is now [14] (8 + 6)

continue recursive calls for smaller subarrays:
- keep partitioning smaller and smaller subarrays recursively until each subarray has one or zero elements

wrapping it up:
- when all recursive calls are finished, the array is fully sorted in place: [1, 2, 3, 4, 5, 6, 7, 8, 9].
- the total number of comparisons is stored in comparison_count[0] (23 for this example: 8 + 6 + 5 + 3 + 1)
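To check the walkthrough above end to end, here is a small instrumented sketch that records every partition call on the example array. The function `quicksort_with_trace` and its tuple layout are assumptions made for this illustration; it uses a `nonlocal` counter rather than the one-element list, but follows the same last-element-pivot logic.

```python
def quicksort_with_trace(a):
    """Sort a in place; return (low, high, pivot_index, running_total) per partition call."""
    trace = []
    total = 0

    def sort(low, high):
        nonlocal total
        if low >= high:          # zero or one element: nothing to do
            return
        pivot = a[high]
        i = low - 1
        for j in range(low, high):
            total += 1           # one comparison of a[j] against the pivot
            if a[j] < pivot:
                i += 1
                a[i], a[j] = a[j], a[i]
        p = i + 1
        a[p], a[high] = a[high], a[p]
        trace.append((low, high, p, total))
        sort(low, p - 1)
        sort(p + 1, high)

    sort(0, len(a) - 1)
    return trace


example = [3, 1, 4, 6, 2, 5, 9, 7, 8]
for step in quicksort_with_trace(example):
    print(step)
# (0, 8, 7, 8)
# (0, 6, 6, 14)
# (0, 5, 4, 19)
# (0, 3, 1, 22)
# (2, 3, 2, 23)
print(example)  # [1, 2, 3, 4, 5, 6, 7, 8, 9]
```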


2) (15 points) Quickselect with Median-of-7-Medians

You must implement quickselect with m-o-7-m.  

Recall quickselect(a, j) will find the jth largest element in the list a. (So quickselect(a, n//2) will be the median of the list a, where length(a) == n. j starts counting from 1, so quickselect(a, 1) is the largest element of the list.)
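To make the "jth largest, counting from 1" convention concrete, here is a brute-force reference that sorts the whole list. It is not the required algorithm, only a sketch for checking answers; the name `kth_largest_reference` is hypothetical.

```python
def kth_largest_reference(a, j):
    """Brute-force reference: the j-th largest element of a, with j counted from 1."""
    return sorted(a, reverse=True)[j - 1]


print(kth_largest_reference([9, 8, 7, 6, 5, 4, 3, 2, 1, 0], 2))   # 8
print(kth_largest_reference([0, 1, 2, 3, 4, 5, 6, 7, 8, 9], 10))  # 0
print(kth_largest_reference([0, 0, 0, 0, 0, 0, -1], 1))           # 0
print(kth_largest_reference([10, -10, 9, -9, 8, -8, 7, -7], 5))   # -7
```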
The median-of-7-medians trick goes like this:

On a call to quickselect(a,j):
1. First, split a into n//7 lists of length 7 and sort them.
2. Next, make a list of their medians (the middle element of each list).
3. Next, use quickselect on this list of medians to find *its* median.
4. Partition the original list using this median as pivot.
5. Recurse with quickselect on the side containing the sought-for element.
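Steps 1 to 3 can be sketched as follows. Two assumptions are made here for illustration: leftover elements form a shorter final group instead of being dropped, and the median of the medians is found with `sorted` as a stand-in for the recursive quickselect call. The name `group_medians` is hypothetical.

```python
def group_medians(a, group_size=7):
    """Sort consecutive chunks of a and return the middle element of each chunk."""
    groups = [sorted(a[i:i + group_size]) for i in range(0, len(a), group_size)]
    return [g[len(g) // 2] for g in groups]


data = [15, 3, 9, 21, 7, 1, 12, 30, 4, 18, 6, 27, 2, 24, 11, 8]
medians = group_medians(data)
print(medians)  # [9, 18, 11]

# Stand-in for step 3: the real algorithm would call quickselect on medians.
pivot = sorted(medians)[len(medians) // 2]
print(pivot)    # 11
```

For this input, 7 elements are greater than the pivot and 8 are smaller, so either side of the partition in step 4 is roughly half the list.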


In [2]:
def quickselect(a, j): # returns the j-th largest element of the list a; j counts from 1, so j = 1 is the largest
    
    # helper that splits arr around a given pivot value
    def partition(arr, pivot):
        left = [x for x in arr if x > pivot] # new list of all elements greater than the pivot
        # larger elements go on the left because we are looking for the j-th largest element 
        right = [x for x in arr if x < pivot] # new list of all elements less than the pivot
        pivot_count = len(arr) - len(left) - len(right) # number of elements equal to the pivot (handles duplicates)
        
        return left, right, pivot_count # elements greater than the pivot, elements less than it, and count of elements equal to it
    
    def median_of_sevens(arr): # returns the median of the medians of the groups of 7
        groups = [sorted(arr[i:i+7]) for i in range(0, len(arr), 7)] # split arr into chunks of up to 7 elements (the last may be shorter)
        # each chunk is sorted so that its median is simply its middle element
        medians = [group[len(group)//2] for group in groups] 
        # list containing the middle element of each sorted group; 
        # this list is used to find the median of medians 
        
        return quickselect(medians, len(medians)//2 + 1) # recursively select the median of the medians list;
        # rank len(medians)//2 + 1 is its middle position. The result becomes the pivot for the outer quickselect call,
        # and the medians list is only about 1/7 the size of arr 
    
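    if not 1 <= j <= len(a):
        # guard: an empty list would otherwise recurse without end, and an out-of-range j would return a wrong value
        raise ValueError("j must satisfy 1 <= j <= len(a)")
    
    if len(a) == 1: 
        # base case: a single-element list is its own answer (j must be 1 here) 
        return a[0]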
    
    pivot = median_of_sevens(a)
    # the median of the group medians is the pivot; it is guaranteed to have a constant fraction of the elements on
    # each side, so the partition is never badly unbalanced (this avoids the quadratic worst case)
    
    left, right, pivot_count = partition(a, pivot) # split a around the pivot value; pivot_count is the number of elements
    # equal to the pivot. Only one of the parts needs further recursion 
    
    if j <= len(left): # the j-th largest element is among the elements greater than the pivot 
        return quickselect(left, j) # so recurse on the left part with the same rank 
    
    elif j <= len(left) + pivot_count: # rank j falls among the elements equal to the pivot
        return pivot # so the pivot itself is the j-th largest element 
    
    else:
        # otherwise the j-th largest element is among the elements less than the pivot 
        return quickselect(right, j - len(left) - pivot_count) # recurse on the right part, reducing j by the number of 
        # elements skipped: those greater than the pivot plus those equal to it 
    
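The three-way split above matters when the list contains duplicates. Here is a small sketch of how a rank j maps onto the three parts, using one of the test inputs below; `three_way_split` and `side_for_rank` are hypothetical names introduced for this illustration.

```python
def three_way_split(arr, pivot):
    """Return (elements > pivot, elements < pivot, number of elements == pivot)."""
    left = [x for x in arr if x > pivot]
    right = [x for x in arr if x < pivot]
    pivot_count = len(arr) - len(left) - len(right)
    return left, right, pivot_count


def side_for_rank(arr, pivot, j):
    """Report which part holds the j-th largest element (j counted from 1)."""
    left, right, pivot_count = three_way_split(arr, pivot)
    if j <= len(left):
        return "left"
    if j <= len(left) + pivot_count:
        return "pivot"
    return "right"


data = [0, 0, 0, 0, 0, 0, -1]
print(three_way_split(data, 0))   # ([], [-1], 6)
print(side_for_rank(data, 0, 1))  # pivot
print(side_for_rank(data, 0, 6))  # pivot
print(side_for_rank(data, 0, 7))  # right
```

Because the six zeros are counted rather than copied into either side, ranks 1 through 6 are answered immediately without recursing on a list of identical elements.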


Testing below

----

In [3]:
## DO NOT EDIT TESTING CODE FOR YOUR ANSWER ABOVE
# Press shift enter to test your code. Ensure that your code has been saved first by pressing shift+enter on the previous cell.
from IPython.display import display, HTML
def quicksort_test():
    failed = False
    test_cases = [ 
        [9, 8, 7, 6, 5, 4, 3, 2, 1, 0],
        [0, 1, 2, 3, 4, 5, 6, 7, 8, 9],
        [0, 0, 0, 0, 0, 0, -1],
        [10, -10, 9, -9, 8, -8, 7, -7]
    ]
    for test_array in test_cases:
        original = test_array.copy()
        expected_output = sorted(test_array)
        compare_count = quicksort(test_array)
        if (test_array != expected_output):
            s1 = '<font color=\"red\"> Failed - test case: Inputs: a=' + str(original) + "<br>"
            s2 = '  <b> Expected Output: </b> ' + str(expected_output) + ' Your code output: ' + str(test_array)+ "<br>"
            display(HTML(s1+s2))
            failed = True
            
    if failed:
        display(HTML('<font color="red"> One or more tests failed. </font>'))
    else:
        display(HTML('<font color="green"> All tests succeeded! </font>'))
quicksort_test()

In [4]:
def quickselect_test():
    failed = False
    test_cases = [ 
        ([9, 8, 7, 6, 5, 4, 3, 2, 1, 0], 2, 8) ,
        ([0, 1, 2, 3, 4, 5, 6, 7, 8, 9], 10, 0),
        ([0, 0, 0, 0, 0, 0, -1], 1, 0),
        ([10, -10, 9, -9, 8, -8, 7, -7], 5, -7)
    ]
    for (test_array, j, expected_output) in test_cases:
        output =  quickselect(test_array, j)
        if (output != expected_output):
            s1 = '<font color=\"red\"> Failed - test case: Inputs: a=' + str(test_array) + ' j = ' + str(j) +"<br>"
            s2 = '  <b> Expected Output: </b> ' + str(expected_output) + ' Your code output: ' + str(output)+ "<br>"
            display(HTML(s1+s2))
            failed = True
            
    if failed:
        display(HTML('<font color="red"> One or more tests failed. </font>'))
    else:
        display(HTML('<font color="green"> All tests succeeded! </font>'))
quickselect_test()