# QuickSort

 Your task is to compute the total number of comparisons used to sort the given input file by QuickSort.  As you know, the number of comparisons depends on which elements are chosen as pivots, so we'll ask you to explore three different pivoting rules.

You should not count comparisons one-by-one.  Rather, when there is a recursive call on a subarray of length $m$, you should simply add $m-1$ to your running total of comparisons. (This is because the pivot element is compared to each of the other $m-1$ elements in the subarray in this recursive call.)
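
As a small illustration of this counting rule, the sketch below records the length $m$ of every recursive call on a subarray with at least two elements and adds $m-1$ for each one. It assumes the first element is always the pivot; the helper name `count_comparisons` and the use of a plain Python list are assumptions made for this example only.

```python
def count_comparisons(values):
    """Sort a copy of `values` with first-element-pivot quicksort.

    Returns (total, lengths), where `lengths` lists the size m of every
    recursive call on a subarray with at least two elements.
    """
    arr = list(values)
    lengths = []

    def sort(low, high):
        if low >= high:
            return 0  # subarrays of length 0 or 1 cost nothing
        m = high - low + 1
        lengths.append(m)
        pivot = arr[low]
        i = low + 1
        for j in range(low + 1, high + 1):
            if arr[j] < pivot:
                arr[i], arr[j] = arr[j], arr[i]
                i += 1
        # Put the pivot in its final position, index i - 1
        arr[low], arr[i - 1] = arr[i - 1], arr[low]
        # Add m - 1 for this call, then recurse on both sides of the pivot
        return (m - 1) + sort(low, i - 2) + sort(i, high)

    total = sort(0, len(arr) - 1)
    return total, lengths

total, lengths = count_comparisons([3, 7, 8, 9, 1, 5])
print(lengths)  # [6, 4, 2]
print(total)    # 9 = 5 + 3 + 1
```

On an already sorted input of length $n$ the same rule gives $(n-1) + (n-2) + \dots + 1 = n(n-1)/2$, because every call splits off only the pivot.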

In [123]:
import numpy as np

def choosePivot(arr, low, high):
    '''
    Choose the pivot for the quicksort algorithm.

    This first version only fixes the interface; it is redefined
    below, once for each pivot rule.

    Every version returns the index of the pivot element.
    '''
    raise NotImplementedError('choosePivot is redefined below for each pivot rule')

def partition(arr, low, high):
    '''Partition the array around the pivot element.
    
    Arguments:
        arr -- the array to be sorted
        low -- the starting index of the subarray
        high -- the ending index of the subarray
    
    Returns:
        A tuple with the index of the pivot element after partitioning
        and the number of comparisons charged to this call (high - low).
    '''
    #Choose pivot using the function defined above
    pivot_index = choosePivot(arr, low, high)
    pivot = arr[pivot_index]
    #Move the pivot to the beginning of the subarray
    #(to keep the same structure as the original code)
    if pivot_index != low:
        arr[pivot_index], arr[low] = arr[low], arr[pivot_index]
    #Partition in place: elements smaller than the pivot are swapped to the front
    i = low + 1
    for j in range(low+1, high+1):
        if arr[j] < pivot:
            arr[i], arr[j] = arr[j], arr[i]
            i += 1
    #Swap the pivot element with the element at index i-1
    arr[i - 1], arr[low] = arr[low], arr[i - 1]
    #The pivot is compared with each of the other m-1 = high-low elements of the subarray
    count_comparisons = high - low
    #Return the index of the pivot element and the number of comparisons made
    return i - 1, count_comparisons

def quickSort(arr, low, high):
    '''Sort the array using the quicksort algorithm.
    
    Arguments:
        arr -- the array to be sorted
        low -- the starting index of the subarray
        high -- the ending index of the subarray
    
    Returns:
        The number of comparisons made during the sorting process.
    '''
    if low < high:
        #Partition the array and get the pivot index
        pi, count = partition(arr, low, high)
        #Recursively sort the subarrays
        cl = quickSort(arr, low, pi - 1)
        cr = quickSort(arr, pi + 1, high)
        return count + cl + cr
    return 0

Now we load the data and set up a small example to check the implementation.

In [124]:
#The text file must be in the same directory as this script
#The file must be named QuickSort.txt
#The file must contain integers separated by new lines
#The file is the one provided by the course; it is not included in this repository
#The file is small enough to be read into memory
def loadFile():
    filename = 'QuickSort.txt'
    with open(filename, 'r') as file:
        lines = file.readlines()
    A = [int(line.strip()) for line in lines]
    return np.array(A)

A = loadFile()
print(f'Data: {A}')
print(f'Length: {len(A)}')

Data: [2148 9058 7742 ... 7266 5792 9269]
Length: 10000


In [125]:
arr = np.array([3, 7, 8, 9, 1, 5])
print(f'Data: {arr}')
print(f'Length: {len(arr)}')

Data: [3 7 8 9 1 5]
Length: 6


For the first part of the programming assignment, you should always use the first element of the array as the pivot element.

In [126]:
def choosePivot(arr, low, high):
    '''
    Choose the pivot for the quicksort algorithm using the first subarray element.
    
    Arguments:
        arr -- the array to be sorted
        low -- the starting index of the subarray
        high -- the ending index of the subarray
    
    Returns:
        The index of the pivot element.
    '''
    return low

In [127]:
a = arr.copy()
c = quickSort(a, 0, len(a) - 1)

print(f'Sorted: {a}')
print(f'Comparisons: {c}')

Sorted: [1 3 5 7 8 9]
Comparisons: 9


In [128]:
a = loadFile()
c = quickSort(a, 0, len(a) - 1)

print(f'Sorted: {a}')
print(f'Comparisons: {c}')

Sorted: [    1     2     3 ...  9998  9999 10000]
Comparisons: 162085


Compute the number of comparisons (as in Problem 1), always using the final element of the given array as the pivot element. 

In [129]:
def choosePivot(arr, low, high):
    '''
    Choose the pivot for the quicksort algorithm using the last subarray element.
    
    Arguments:
        arr -- the array to be sorted
        low -- the starting index of the subarray
        high -- the ending index of the subarray
    
    Returns:
        The index of the pivot element.
    '''
    return high

In [130]:
a = arr.copy()
c = quickSort(a, 0, len(a) - 1)

print(f'Sorted: {a}')
print(f'Comparisons: {c}')

Sorted: [1 3 5 7 8 9]
Comparisons: 8


In [131]:
a = loadFile()
c = quickSort(a, 0, len(a) - 1)

print(f'Sorted: {a}')
print(f'Comparisons: {c}')

Sorted: [    1     2     3 ...  9998  9999 10000]
Comparisons: 164123


Compute the number of comparisons (as in Problem 1), using the "median-of-three" pivot rule. [The primary motivation behind this rule is to do a little bit of extra work to get much better performance on input arrays that are nearly sorted or reverse sorted.] In more detail, you should choose the pivot as follows. Consider the first, middle, and final elements of the given array. (If the array has odd length it should be clear what the "middle" element is; for an array with even length $2k$ use the $k^{\text{th}}$ element as the "middle" element. So for the array 4 5 6 7, the "middle" element is the second one: 5 and not 6!) Identify which of these three elements is the median (i.e., the one whose value is in between the other two), and use this as your pivot.
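
To make the "middle" convention concrete, here is a minimal sketch of the rule with 0-based indices. For a subarray `arr[low..high]` of even length $2k$, the floor division `(low + high) // 2` gives `low + k - 1`, i.e. the $k^{\text{th}}$ element. The helper names `middle_index` and `median_of_three_index` are assumptions made for this illustration.

```python
def middle_index(low, high):
    # Floor division picks the k-th element when the length is 2k
    return (low + high) // 2

def median_of_three_index(arr, low, high):
    mid = middle_index(low, high)
    # Pair each candidate value with its index and sort by value
    candidates = sorted(
        [(arr[low], low), (arr[mid], mid), (arr[high], high)],
        key=lambda pair: pair[0],
    )
    # The pair in the middle holds the median value and its index
    return candidates[1][1]

even = [4, 5, 6, 7]
print(even[middle_index(0, len(even) - 1)])         # 5, not 6

odd = [8, 2, 4, 5, 7]
print(median_of_three_index(odd, 0, len(odd) - 1))  # 4: candidates are 8, 4, 7 and the median is 7
```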

In [132]:
def choosePivot(arr, low, high):
    '''
    Choose the pivot for the quicksort algorithm using the median-of-three rule.
    
    Arguments:
        arr -- the array to be sorted
        low -- the starting index of the subarray
        high -- the ending index of the subarray
    
    Returns:
        The index of the pivot element.
    '''
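    #For an even length 2k this floor division selects the k-th element, as required
    mid = (low + high) // 2
    #Pair each candidate value with its index, sort by value and take the middle pair
    trio = [(arr[low], low), (arr[mid], mid), (arr[high], high)]
    trio_sorted = sorted(trio, key=lambda x: x[0])
    return trio_sorted[1][1]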

In [133]:
a = arr.copy()
c = quickSort(a, 0, len(a) - 1)

print(f'Sorted: {a}')
print(f'Comparisons: {c}')

Sorted: [1 3 5 7 8 9]
Comparisons: 8


In [134]:
a = loadFile()
c = quickSort(a, 0, len(a) - 1)

print(f'Sorted: {a}')
print(f'Comparisons: {c}')

Sorted: [    1     2     3 ...  9998  9999 10000]
Comparisons: 138382
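
Since the notebook redefines `choosePivot` before each run, a possible alternative is to pass the pivot rule as an argument, so that the three rules can be compared side by side. The sketch below is one way to do that, not the structure used above: the names `first`, `last`, `median_of_three` and `quick_sort_count` are hypothetical, and it runs on a plain Python list holding the small example.

```python
def first(arr, low, high):
    return low

def last(arr, low, high):
    return high

def median_of_three(arr, low, high):
    mid = (low + high) // 2
    trio = sorted(
        [(arr[low], low), (arr[mid], mid), (arr[high], high)],
        key=lambda pair: pair[0],
    )
    return trio[1][1]

def quick_sort_count(arr, low, high, choose_pivot):
    """Sort arr[low..high] in place and return the comparison count."""
    if low >= high:
        return 0
    # Move the chosen pivot to the front, then partition as before
    p = choose_pivot(arr, low, high)
    arr[p], arr[low] = arr[low], arr[p]
    pivot = arr[low]
    i = low + 1
    for j in range(low + 1, high + 1):
        if arr[j] < pivot:
            arr[i], arr[j] = arr[j], arr[i]
            i += 1
    arr[i - 1], arr[low] = arr[low], arr[i - 1]
    # high - low comparisons for this call, plus both recursive calls
    return ((high - low)
            + quick_sort_count(arr, low, i - 2, choose_pivot)
            + quick_sort_count(arr, i, high, choose_pivot))

for rule in (first, last, median_of_three):
    data = [3, 7, 8, 9, 1, 5]
    count = quick_sort_count(data, 0, len(data) - 1, rule)
    print(rule.__name__, count, data)
# first 9 [1, 3, 5, 7, 8, 9]
# last 8 [1, 3, 5, 7, 8, 9]
# median_of_three 8 [1, 3, 5, 7, 8, 9]
```

The counts for the small example agree with the three runs above (9, 8 and 8).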
