The file contains all of the integers between 1 and 10,000 (inclusive, with no repeats) in unsorted order. The integer in the $i^{th}$ row of the file gives you the $i^{th}$ entry of an input array.

Your task is to compute the total number of comparisons used to sort the given input file by QuickSort. As you know, the number of comparisons depends on which elements are chosen as pivots, so we'll ask you to explore three different pivoting rules.

You should not count comparisons one-by-one. Rather, when there is a recursive call on a subarray of length $m$, you should simply add $m-1$ to your running total of comparisons. (This is because the pivot element is compared to each of the other $m-1$ elements in the subarray in this recursive call.)
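
The counting rule above can be written as a recurrence. As a sketch, let $C(m)$ (notation introduced here, not in the assignment text) denote the running total for a call on a subarray of length $m$ whose pivot ends up with $m_L$ elements to its left and $m_R$ elements to its right:

$$
C(m) = (m-1) + C(m_L) + C(m_R), \qquad m_L + m_R = m - 1, \qquad C(0) = C(1) = 0.
$$

For example, if every pivot happened to be the smallest element of its subarray, then $m_L = 0$ and $m_R = m-1$ at each level, giving $C(n) = (n-1) + (n-2) + \dots + 1 = n(n-1)/2$, which is $49{,}995{,}000$ for $n = 10{,}000$.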

In [1]:
import numpy as np

### Problem 1. 
For the first part of the programming assignment, you should always use the first element of the array as the pivot element.

In [2]:
def choose_pivot(A, l):
    
    # choose first element as the pivot
    return A[l]

In [3]:
# partition subarray A in place around a pivot
def partition(A, l, r):
    
    # pivot
    p = choose_pivot(A, l)    
    i = l+1
    
    for j in range(l+1, r):
        if A[j] < p:
            # swap 
            tmp = A[i]
            A[i] = A[j]
            A[j] = tmp
            
            # increment i
            i += 1
            
    # swap A[l] - pivot and A[i-1] 
    tmp = A[l]
    A[l] = A[i-1]
    A[i-1] = tmp
    
    # return the partitioned array and the final index of the pivot
    return A, i-1

Check partition function:

In [4]:
a = np.fromfile('QuickSort.txt', sep='\n', dtype=int)
a = a[:10]
print('A: ', a)
partition(a, 0, 10)

A:  [2148 9058 7742 3153 6324  609 7628 5469 7017  504]


(array([ 504,  609, 2148, 3153, 6324, 9058, 7628, 5469, 7017, 7742]), 2)
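
The output above can also be checked programmatically instead of by eye. The helper below is a minimal sketch (the name `is_partitioned` is made up for this check and is not part of the assignment): it confirms that everything to the left of the pivot is smaller and everything to the right is larger.

```python
def is_partitioned(A, pivot_index):
    """Return True if A is correctly partitioned around A[pivot_index]."""
    pivot = A[pivot_index]
    left_ok = all(x < pivot for x in A[:pivot_index])
    right_ok = all(x > pivot for x in A[pivot_index + 1:])
    return left_ok and right_ok


# the array and pivot index printed by the cell above
result = [504, 609, 2148, 3153, 6324, 9058, 7628, 5469, 7017, 7742]
print(is_partitioned(result, 2))  # True
```

Note that the check only requires the two sides to be separated by the pivot, not that either side is sorted.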

In [5]:
# count comparisons quicksort makes when we always choose the first element as the pivot
def count_comparisons(A):
    n = len(A)
    
    if n < 2:
        # a subarray of length 0 or 1 is already sorted, so it adds nothing to the comparison total
        return 0
    else:
        
        # partition A around p and return the index of pivot
        A, i = partition(A, 0, n)
        
        # recursively sort 1st part to the left of the pivot
        le_count = count_comparisons(A[ : i]) 

        # recursively sort 2nd part to the right of the pivot
        ri_count = count_comparisons(A[i + 1 : ])

        return le_count + ri_count + n-1

Example 1:


In [6]:
a = np.fromfile('QuickSort.txt', sep='\n', dtype=int)
a = a[:10]
print('A: ', a)
count_comparisons(a)

A:  [2148 9058 7742 3153 6324  609 7628 5469 7017  504]


25

Example 2:

In [7]:
# for submission (correct answer is 162085)
a = np.fromfile('QuickSort.txt', sep='\n', dtype=int)
count_comparisons(a)

162085

### Problem 2.
Compute the number of comparisons (as in Problem 1), always using the final element of the given array as the pivot element. Recall that, just before the main `Partition` subroutine, you should exchange the pivot element (i.e., the last element) with the first element.

In [8]:
# partition subarray A in place around a pivot
def partition(A, l, r):
    # swap last and first elements
    tmp = A[l]
    A[l] = A[r-1]
    A[r-1] = tmp
    
    # pivot
    p = choose_pivot(A, l)    
    i = l+1
    
    for j in range(l+1, r):
        if A[j] < p:
            # swap 
            tmp = A[i]
            A[i] = A[j]
            A[j] = tmp
            
            # increment i
            i += 1
            
    # swap A[l] - pivot and A[i-1] 
    tmp = A[l]
    A[l] = A[i-1]
    A[i-1] = tmp
    
    # return the partitioned array and the final index of the pivot
    return A, i-1

In [9]:
# count comparisons quicksort makes when we always choose the last element as the pivot
def count_comparisons(A):
    n = len(A)
    
    if n < 2:
        # a subarray of length 0 or 1 is already sorted, so it adds nothing to the comparison total
        return 0
    else:
                
        # partition A around p and return the index of pivot
        A, i = partition(A, 0, n)
        
        # recursively sort 1st part to the left of the pivot
        le_count = count_comparisons(A[ : i]) 

        # recursively sort 2nd part to the right of the pivot
        ri_count = count_comparisons(A[i + 1 : ])

        return le_count + ri_count + n-1

Example 1:


In [10]:
a = np.fromfile('QuickSort.txt', sep='\n', dtype=int)
a = a[:10]
print('A: ', a)
count_comparisons(a)

A:  [2148 9058 7742 3153 6324  609 7628 5469 7017  504]


31

Example 2:

In [11]:
# for submission (correct answer is 164123)
a = np.fromfile('QuickSort.txt', sep='\n', dtype=int)
count_comparisons(a)

164123

### Problem 3.
Compute the number of comparisons (as in Problem 1), using the "median-of-three" pivot rule. [The primary motivation behind this rule is to do a little bit of extra work to get much better performance on input arrays that are nearly sorted or reverse sorted.] In more detail, you should choose the pivot as follows. Consider the first, middle, and final elements of the given array. (If the array has odd length it should be clear what the "middle" element is; for an array with even length $2k$, use the $k^{th}$ element as the "middle" element. So for the array 4 5 6 7, the "middle" element is the second one - 5 and not 6!) Identify which of these three elements is the median (i.e., the one whose value is in between the other two), and use this as your pivot. 
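
The pivot choice described above can be sketched on its own, separately from partitioning. The function name `median_of_three_index` and the half-open bounds `l`, `r` are assumptions made for this illustration; the key detail is that `l + (r - l - 1) // 2` gives the $k^{th}$ element for an even length $2k$ and the true middle for an odd length.

```python
def median_of_three_index(A, l, r):
    """Index of the median of the first, middle, and last elements of A[l:r]."""
    first, last = l, r - 1
    # for even length 2k this picks the k-th element (offset k-1), as the text requires
    middle = l + (r - l - 1) // 2
    # sort the three candidate indices by their values and take the one in the middle
    return sorted([first, middle, last], key=lambda idx: A[idx])[1]


print(median_of_three_index([4, 5, 6, 7], 0, 4))  # 1, i.e. the value 5
```

For the array 4 5 6 7 the candidates are 4, 5, and 7, so the median is 5 at index 1, matching the example in the text.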

In [12]:
# partition subarray A in place around a pivot
def partition(A, l, r):
    
    # indices of the first, middle, and last elements of A[l:r];
    # for even length 2k the middle is the k-th element (offset k-1 from l)
    first, middle, last = l, l + (r - l - 1) // 2, r - 1
    
    # pivot is the median of the three candidate values:
    # sort the three indices by value and take the one in the middle
    candidates = sorted([first, middle, last], key=lambda idx: A[idx])
    p_index = candidates[1]
    
    # swap pivot and first elements
    tmp = A[l]
    A[l] = A[p_index]
    A[p_index] = tmp
    
    # pivot
    p = choose_pivot(A, l)    
    i = l+1
    
    for j in range(l+1, r):
        if A[j] < p:
            # swap 
            tmp = A[i]
            A[i] = A[j]
            A[j] = tmp
            
            # increment i
            i += 1
            
    # swap A[l] - pivot and A[i-1] 
    tmp = A[l]
    A[l] = A[i-1]
    A[i-1] = tmp
    
    # return the partitioned array and the final index of the pivot
    return A, i-1

In [13]:
# count comparisons quicksort makes when we implement the median-of-three pivot rule
def count_comparisons(A):
    n = len(A)
    
    if n < 2:
        # a subarray of length 0 or 1 is already sorted, so it adds nothing to the comparison total
        return 0
    else:
                
        # partition A around p and return the index of pivot
        A, i = partition(A, 0, n)
        
        # recursively sort 1st part to the left of the pivot
        le_count = count_comparisons(A[ : i]) 

        # recursively sort 2nd part to the right of the pivot
        ri_count = count_comparisons(A[i + 1 : ])

        return le_count + ri_count + n-1

Example 1:


In [14]:
a = np.fromfile('QuickSort.txt', sep='\n', dtype=int)
a = a[:10]
print('A: ', a)
count_comparisons(a)

A:  [2148 9058 7742 3153 6324  609 7628 5469 7017  504]


21

Example 2:

In [15]:
# for submission (correct answer is 138382)
a = np.fromfile('QuickSort.txt', sep='\n', dtype=int)
count_comparisons(a)

138382
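
The three problems differ only in which element is moved to the front before partitioning, so the whole exercise can be sketched as one function with the pivot rule as a parameter. This is an alternative arrangement, not the notebook's own code: the name `count_comparisons_by_rule` and the rule labels `"first"`, `"last"`, and `"median3"` are made up for this sketch, and it works on a copy of a plain Python list with explicit index bounds rather than on NumPy slices.

```python
def count_comparisons_by_rule(values, rule):
    """Count QuickSort comparisons on a copy of `values` for a given pivot rule.

    `rule` is one of "first", "last", or "median3" (labels chosen for this sketch).
    """
    A = list(values)  # work on a copy so the caller's data is untouched

    def pivot_index(l, r):
        if rule == "first":
            return l
        if rule == "last":
            return r - 1
        if rule == "median3":
            middle = l + (r - l - 1) // 2
            return sorted([l, middle, r - 1], key=lambda idx: A[idx])[1]
        raise ValueError(f"unknown pivot rule: {rule}")

    def sort(l, r):
        m = r - l
        if m < 2:
            return 0
        # move the chosen pivot to the front, then partition A[l:r] around it
        p = pivot_index(l, r)
        A[l], A[p] = A[p], A[l]
        pivot = A[l]
        i = l + 1
        for j in range(l + 1, r):
            if A[j] < pivot:
                A[i], A[j] = A[j], A[i]
                i += 1
        A[l], A[i - 1] = A[i - 1], A[l]
        # m - 1 comparisons for this call, plus both recursive calls
        return (m - 1) + sort(l, i - 1) + sort(i, r)

    return sort(0, len(A))


sample = [2148, 9058, 7742, 3153, 6324, 609, 7628, 5469, 7017, 504]
for rule in ("first", "last", "median3"):
    print(rule, count_comparisons_by_rule(sample, rule))
```

On the ten-element sample this prints 25, 31, and 21, the same totals as the Example 1 cells in Problems 1, 2, and 3.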