### Linear Time Selection

In this notebook we look at two selection algorithms: RSelect, the randomized selection algorithm, and a deterministic version called DSelect. But first, what is selection? The selection problem asks for the $i^{th}$ smallest number (the $i^{th}$ order statistic) among $n$ given numbers. The simplest approach is to sort the input array and return the element at position $i$, which takes $O(n \log n)$ time. This is not optimal, however: selection is an easier problem than sorting, so we should be able to solve it in less time. To select from $n$ numbers we must look at every number at least once, so linear time, $\Omega(n)$, is a lower bound, while sorting gives an $O(n \log n)$ upper bound for our problem.
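As a reference point, the sort-then-index approach can be sketched in a few lines. This is only a baseline under simple assumptions: the helper name `select_by_sorting` is hypothetical (it is not part of the notebook), and `i` is taken to be 1-indexed, as in the rest of this section.

```python
def select_by_sorting(numbers, i):
    """Return the i-th smallest element (1-indexed) by sorting a copy first."""
    if not 1 <= i <= len(numbers):
        raise ValueError("i must be between 1 and len(numbers)")
    # sorted() returns a new list, so the caller's input is left untouched.
    return sorted(numbers)[i - 1]


print(select_by_sorting([3, 8, 2, 5, 1, 4, 7, 6], 3))  # prints 3
```

The cost is dominated by the sort, so this runs in $O(n \log n)$ time. It is also a handy oracle for checking a faster selection routine.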

Can we perform the selection in $O(n)$ time? Let us implement selection using the partition routine we used in quicksort.



In [47]:
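import random


def swap(in_array, ix1, ix2):
    # Exchange the elements at positions ix1 and ix2 in place.
    in_array[ix1], in_array[ix2] = in_array[ix2], in_array[ix1]


def partition(in_array, start_ix, end_ix):
    # Partition in_array[start_ix:end_ix] around its first element (the pivot).
    # Afterwards, elements <= pivot sit to its left and larger ones to its right.
    pivot = in_array[start_ix]
    i = start_ix  # index of the last element known to be <= pivot
    for idx in range(start_ix + 1, end_ix):
        if in_array[idx] <= pivot:
            i += 1
            if idx != i:
                swap(in_array, i, idx)

    # Move the pivot into its final sorted position and report that index.
    swap(in_array, start_ix, i)
    return i


def RSelect(array, i, start_ix=0, end_ix=-1):
    # Return the i-th order statistic (1-indexed) of array[start_ix:end_ix].
    # Note: the array is rearranged in place.
    end_ix = len(array) if end_ix == -1 else end_ix
    # Pick the pivot uniformly at random and move it to the front,
    # since partition() always pivots on the first element.
    swap(array, start_ix, random.randrange(start_ix, end_ix))
    pivot_idx = partition(array, start_ix, end_ix)
    stat = pivot_idx - start_ix + 1  # rank of the pivot within the current range
    if i == stat:
        return array[pivot_idx]
    elif i < stat:
        # The answer lies to the left of the pivot.
        return RSelect(array, i, start_ix, pivot_idx)
    else:
        # The answer lies to the right; discount the stat elements skipped.
        return RSelect(array, i - stat, pivot_idx + 1, end_ix)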

In [49]:
print('1st Order stat is', RSelect([3, 8, 2, 5, 1, 4, 7, 6], 1))
print('2nd Order stat is', RSelect([3, 8, 2, 5, 1, 4, 7, 6], 2))
print('3rd Order stat is', RSelect([3, 8, 2, 5, 1, 4, 7, 6], 3))
print('4th Order stat is', RSelect([3, 8, 2, 5, 1, 4, 7, 6], 4))
print('5th Order stat is', RSelect([3, 8, 2, 5, 1, 4, 7, 6], 5))
print('6th Order stat is', RSelect([3, 8, 2, 5, 1, 4, 7, 6], 6))
print('7th Order stat is', RSelect([3, 8, 2, 5, 1, 4, 7, 6], 7))
print('8th Order stat is', RSelect([3, 8, 2, 5, 1, 4, 7, 6], 8))

1st Order stat is 1
2nd Order stat is 2
3rd Order stat is 3
4th Order stat is 4
5th Order stat is 5
6th Order stat is 6
7th Order stat is 7
8th Order stat is 8
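
To see why a partition-based selection can beat sorting, here is a rough sketch of the running-time recurrence, not a full proof. It assumes one partition pass over $n$ elements costs at most $cn$ for some constant $c$ (a symbol introduced here for illustration), and it looks only at the two extreme cases for the pivot.

$$
\begin{aligned}
\text{Balanced pivots:} \quad T(n) &\le T(n/2) + cn \le cn\left(1 + \tfrac{1}{2} + \tfrac{1}{4} + \cdots\right) \le 2cn = O(n) \\
\text{Worst-case pivots:} \quad T(n) &\le T(n-1) + cn = O(n^2)
\end{aligned}
$$

Unlike quicksort, selection recurses into only one side of the partition, which is why the balanced case sums to a geometric series instead of $n \log n$. The worst case arises when every pivot happens to be the minimum or maximum of its subarray. Choosing the pivot uniformly at random makes reasonably balanced splits the typical case, which is where the expected $O(n)$ bound for randomized selection comes from.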
