# Basic Sorting Algorithms

Problem: given an array `arr` of length $n$, sort its values.

- Insertion sort
- Bubble sort
- Selection sort
- Merge sort
- Heap sort
- Quick sort
- Count sort
- Bucket sort
- Radix sort

## Insertion sort

Algorithm: iterate over every element of the array except the first one, i.e. $i = 1, 2, \dots, n-1$. At step $i$ the subarray $arr[:i]$ is already sorted; scan it backwards, $j = i-1, i-2, \dots, 0$, shifting every element bigger than $arr[i]$ one position to the right, and place $arr[i]$ in the gap that opens up.

In [1]:
def insertion_sort(arr: list) -> list:
    # loop through the array i = 1, ..., n
    for i in range(1, len(arr)):
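        # key is the element to insert into the sorted subarray arr[0:i]
        key = arr[i]
        # scan the sorted subarray arr[0:i] from right to left
        j = i-1
        # test j >= 0 first, so arr[j] is never read with j = -1
        # (in Python that would silently read the last element)
        while j >= 0 and key < arr[j]: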
            # shift elements bigger than key one position to the right
            arr[j+1] = arr[j]
            j-=1
        # place key in its correct position
        arr[j+1] = key
    
    return arr

In [3]:
ar = [2,6,5,7,9,10,-2]
print(insertion_sort(ar))

[-2, 2, 5, 6, 7, 9, 10]


- Average Complexity: $\mathcal{O}(n^2)$; on average step $i$ performs about $i/2$ shifts.
- Worst Complexity: $\mathcal{O}(n^2)$; the array is in reverse order, so step $i$ performs $i$ shifts.
- Best Complexity: $\mathcal{O}(n)$; the array is already sorted, so every step performs 0 shifts.
- Auxiliary Space: $\mathcal{O}(1)$
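
The shift counts behind these bounds can be checked directly. The sketch below is an instrumented copy of the insertion sort loop; the helper name `count_shifts` is hypothetical and only serves this illustration.

```python
def count_shifts(values: list) -> int:
    """Insertion-sort a copy of `values` and return the number of shifts performed."""
    arr = list(values)  # work on a copy, the input stays untouched
    shifts = 0
    for i in range(1, len(arr)):
        key = arr[i]
        j = i - 1
        while j >= 0 and key < arr[j]:
            arr[j + 1] = arr[j]  # one shift to the right
            shifts += 1
            j -= 1
        arr[j + 1] = key
    return shifts


n = 6
print(count_shifts(list(range(n))))         # already sorted: 0 shifts
print(count_shifts(list(range(n, 0, -1))))  # reversed: 1 + 2 + ... + (n-1) = 15 shifts
```

For a reversed array of length $n$ the count is $n(n-1)/2$, which is where the quadratic worst case comes from.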

## Selection Sort
Algorithm: 
- start at index $i=0$
- iterate through the sub-array $A[i:]$, find the smallest element and swap it into position $i$, then increase $i$ by one
- at any moment the sub-array $A[:i]$ is sorted
- terminate when $i=\text{len}(A)-1$

In [22]:
def selection_sort_recursion(arr: list, index: int = 0):
    '''Recursive implementation of the selection sort algorithm
    
    Return the sorted array.
    '''

    # terminate when at most one element remains to be placed
    if index >= len(arr)-1:
        return arr
    
    smallest_elt, smallest_elt_index = arr[index], index
    for i, elt in enumerate(arr[index:], start=index):
        if elt < smallest_elt:
            # remember both the value and the position of the new minimum
            smallest_elt, smallest_elt_index = elt, i
    
    # exchange 
    arr[index] , arr[smallest_elt_index] = arr[smallest_elt_index] , arr[index]
    return selection_sort_recursion(arr, index+1)  

In [23]:
def selection_sort_iteration(arr: list):
    '''Iterative implementation of the selection sort Algorithm.

    Returns the sorted array
    '''

    n = len(arr)
    # loop through all elements
    for i in range(n):
        min_index = i 
        # find the min-element in the subarray arr[i:] 
        for j in range(i+1, n):
            # compare against the current minimum, not against arr[i]
            if arr[j] < arr[min_index]:
                min_index = j
        # place the min-element at arr[i]
        arr[min_index], arr[i] = arr[i], arr[min_index]
    
    return arr


In [25]:
ar = [2,6,5,7,9,10,-2]
print(selection_sort_recursion(ar))
ar = [2,6,5,7,9,10,-2]
print(selection_sort_iteration(ar))

[-2, 2, 5, 6, 7, 9, 10]
[-2, 2, 5, 6, 7, 9, 10]


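- time complexity: $\mathcal{O}(n^2)$ in every case, since each step scans the whole unsorted remainder
- space complexity: $\mathcal{O}(n)$ for the recursive implementation (recursion depth $n$), $\mathcal{O}(1)$ for the iterative implementation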

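## Bubble sort
- scan the first $n$ elements, comparing each element with its right neighbour and swapping the two when they are out of order; after one pass the maximum has bubbled up to the last position
- iteratively do the above on the remaining unsorted prefix, so that the second maximum, third maximum, etc. reach their final positions
- stop early when a pass performs no swap, since the array is then sorted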

In [144]:
def bubble_sort(arr: list):
    '''Bubble sort the array and return it
    '''

    n = len(arr)

    # after the pass for a given i, the largest element of arr[0:i+2] sits at index i+1
    for i in range(n-2,-1, -1):
        swapped = False
        for j in range(0, i+1):
            if arr[j] > arr[j+1]:
                arr[j+1], arr[j] = arr[j], arr[j+1]
                swapped = True
        
        # a pass without swaps means the array is already sorted
        if not swapped:
            return arr

    return arr

In [145]:
arr = [3,7, 3, 9, 12, 11]
print(f'array= {arr}')
print(f'sorted array= {bubble_sort(arr)}')

array= [3, 7, 3, 9, 12, 11]
sorted array= [3, 3, 7, 9, 11, 12]


- time complexity: 
  - best $\mathcal{O}(n)$; when the array is already sorted, the swap flag ends the loop after a single pass
  - average, worst: $\mathcal{O}(n^2)$
- auxiliary space: $\mathcal{O}(1)$
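
To make the best case concrete, the following sketch counts how many passes bubble sort with an early-exit flag needs. The function name `bubble_sort_passes` is hypothetical; it works on a copy and mirrors the loop structure of the implementation above.

```python
def bubble_sort_passes(values: list) -> tuple[list, int]:
    """Bubble-sort a copy of `values`; return the sorted copy and the number of passes."""
    arr = list(values)
    passes = 0
    # `end` is the last index that can still be out of place
    for end in range(len(arr) - 1, 0, -1):
        passes += 1
        swapped = False
        for j in range(end):
            if arr[j] > arr[j + 1]:
                arr[j], arr[j + 1] = arr[j + 1], arr[j]
                swapped = True
        if not swapped:  # no swap in this pass: the array is sorted
            break
    return arr, passes


print(bubble_sort_passes([1, 2, 3, 4, 5]))  # sorted input: 1 pass
print(bubble_sort_passes([5, 4, 3, 2, 1]))  # reversed input: 4 passes
```

A sorted input costs one pass of $n-1$ comparisons, while a reversed input needs all $n-1$ passes.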

## Merge-Sort Algorithm

Algorithm: follow the divide-and-conquer paradigm. Recursively divide the array into two subarrays, sort each one and combine them into one sorted array.

In [16]:
def merge_sort(arr):
    print(f'sorting array {arr}') 
    if len(arr) > 1:
        # divide: the slices create two new lists, the left and right subarrays,
        # and each recursive call returns its slice sorted
        m = len(arr)//2
        left_arr = merge_sort(arr[:m])
        right_arr = merge_sort(arr[m:])

        # combine the left and right arrays; linearly scan through each one
        # compare left_arr[i] with right_arr[j], store the minimum to arr
        # and move one position to the right to the subarray which had the min element
        i,j  = 0,0
        while i<len(left_arr) and j<len(right_arr):
            if left_arr[i]<=right_arr[j]:
                arr[i+j] = left_arr[i]
                i+=1
            else:
                arr[i+j] = right_arr[j]
                j+=1
            
        # the previous loop finishes when all elements of one of the two subarrays were consumed,
        # so only one of the two subarrays may still have elements to be placed in arr;
        # they are already in the correct sorted order 
        
        while i < len(left_arr):
            arr[i+j] = left_arr[i]
            i+=1
        while j < len(right_arr):
            arr[i+j] = right_arr[j]
            j+=1

        print(f'the resulted sorted arr is {arr}')

    return arr

In [17]:
ar = [2,6,5,7]
print(merge_sort(ar))

sorting array [2, 6, 5, 7]
sorting array [2, 6]
sorting array [2]
sorting array [6]
the resulted sorted arr is [2, 6]
sorting array [5, 7]
sorting array [5]
sorting array [7]
the resulted sorted arr is [5, 7]
the resulted sorted arr is [2, 5, 6, 7]
[2, 5, 6, 7]


- Average, Best, Worst Complexity: $\mathcal{O}(n\log_2(n))$ 
- Auxiliary Space: $\mathcal{O}(n)$
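
The time bound follows from the recurrence of the algorithm. As a short sketch, assume for simplicity that $n$ is a power of two and that merging two halves of total length $n$ costs at most $cn$ for some constant $c$ (both are simplifying assumptions, not part of the implementation above). Unrolling the recurrence gives:

$$ T(n) = 2T(n/2) + cn = 4T(n/4) + 2cn = \dots = 2^k\,T(n/2^k) + k\,cn $$

Setting $k = \log_2 n$, so that the subproblems have size one:

$$ T(n) = n\,T(1) + cn\log_2 n = \mathcal{O}(n\log_2 n) $$

The split is always in the middle regardless of the values, which is why the best, average and worst cases coincide.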

## Quick Sort Algorithm
Quick sort is a divide-and-conquer sorting algorithm that also adds some stochasticity through the random choice of a pivot. The algorithm works as follows:
- at any moment partition the sub-array $arr[low:high+1]$ around a pivot element $p$ as:
  - randomly choose a pivot element $p$ from the sub-array $arr[low:high+1]$ and place it at the end, i.e. at position $high$.
  - maintain two sets in the sub-array: $arr[low:i+1]$ contains the elements smaller than or equal to $p$ and $arr[i+1:j]$ contains the elements bigger than $p$, where $j$ is the next element to examine.
  - initialize $i$ to $low-1$, so that the first set is empty.
  - iteratively, for $j=low, \dots, high-1$, compare each element $arr[j]$ with $p$.
  - if it is smaller than or equal to $p$, increase $i$ by one, so that it temporarily points at the first element bigger than $p$, and exchange $arr[i]$ with $arr[j]$.
  - elements equal to $p$ are grouped with the smaller ones; the pivot itself is excluded from both recursive calls, so every call works on a strictly smaller sub-array and the recursion terminates.
  - after each step $i$ points to the position of the last element not bigger than $p$, and $i-low+1$ is the number of such elements.
  - after iterating through all elements in the sub-array, exchange $p$ with the first element bigger than $p$, i.e. $arr[high]$ with $arr[i+1]$.
  - the sub-array now has the form $[s_{low},s_{low+1},\dots,s_{i},p, b_{i+2},\dots, b_{high}]$.
- $p$ is now in its final sorted position $i+1$.
- recursively call `quick_sort` on the sub-arrays $arr[low:i+1]$ and $arr[i+2:high+1]$.
- stop the recursion when the sub-array has at most one element, i.e. $low\ge high$.

In [3]:
from random import randint
from typing import Optional

def quick_sort(arr, low: int = 0, high: Optional[int] = None):
    '''Quick sort the array in place and return it.
    '''

    # None means "sort up to the last element"; a sentinel of -1 would clash with
    # the recursive call quick_sort(arr, 0, -1) made when the pivot lands at index 0
    if high is None:
        high = len(arr)-1
    
    # stop recursion if the subarray contains at most one element
    if low<high:
        # partition around a random pivot, pivot will be placed in its final position
        pivot_final_index = partition(arr, low, high)
        # quick sort the left and right subarray of the pivot's final position
        quick_sort(arr, low, pivot_final_index-1)
        quick_sort(arr, pivot_final_index+1, high)

    return arr


def partition(arr: list, low: int, high: int):
    '''partition the subarray arr[low:high+1] around a random pivot element
    and returns the position of the pivot
    '''
    
    # randomly choose a pivot from low to high, including both ends
    pivot_index = randint(low, high)

    # place the pivot at the end of the subarray
    arr[high], arr[pivot_index] = arr[pivot_index], arr[high]
    
    # initialize i such that there are no elements smaller than p
    i = low-1
    # iterate through all elements except the pivot, which is last now
    for j in range(low, high):
        # compare each element with the pivot
        if arr[j]<=arr[high]:
            # increase the position of the last, smaller than p, element
            i+=1
            # exchange with the previous element, that is the first, bigger than p, element
            arr[i], arr[j] = arr[j], arr[i]
    
    # exchange the pivot with the first bigger than pivot element
    arr[i+1], arr[high] = arr[high], arr[i+1]

    # return the final position of the pivot
    return i+1


In [132]:
a = [9,8,7,1,2,3,6,5,4]
print(f'array={a}')
quick_sort(a)
print(f'sorted array={a}')

array=[9, 8, 7, 1, 2, 3, 6, 5, 4]
sorted array=[1, 2, 3, 4, 5, 6, 7, 8, 9]
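
Because the pivot is random, a single partition step is hard to follow by hand. The sketch below fixes the pivot to the last element of the sub-array (an assumption made only to get a reproducible trace; `partition_last` is a hypothetical name) and otherwise applies the same scan.

```python
def partition_last(arr: list, low: int, high: int) -> int:
    """Partition arr[low:high+1] around arr[high]; return the pivot's final index."""
    pivot = arr[high]
    i = low - 1  # arr[low:i+1] holds the elements <= pivot found so far
    for j in range(low, high):
        if arr[j] <= pivot:
            i += 1
            arr[i], arr[j] = arr[j], arr[i]
    # move the pivot between the two groups
    arr[i + 1], arr[high] = arr[high], arr[i + 1]
    return i + 1


a = [9, 8, 7, 1, 2, 3, 6, 5, 4]
p = partition_last(a, 0, len(a) - 1)
print(p, a)  # 3 [1, 2, 3, 4, 8, 7, 6, 5, 9]
```

The pivot 4 ends at index 3, everything to its left is not bigger and everything to its right is bigger. Neither side is sorted yet; that is left to the recursive calls.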


In [133]:
from typing import Optional

def quick_select(arr: list, k: int, low: int = 0, high: Optional[int] = None):
    '''Given an array arr, find the k-th smallest element.
    '''
    # check k against the size of the array
    if not 0<k<=len(arr):
        return 'there are fewer elements'
    
    # initially consider the whole array
    if high is None:
        high = len(arr)-1

    # find the final position of a random pivot
    pivot_index = partition(arr, low, high)
    
    # if it is the desired position stop; else recurse into the lower/upper subarray
    if pivot_index == k-1:
        ordinals = {1:'first', 2:'second', 3:'third', 4:'fourth'}
        word = ordinals[k] if k in ordinals else f'{k}-th'
        return f'the {word} smallest element is {arr[pivot_index]}'
    
    elif pivot_index>k-1:
        # the k-th smallest element is in the left sub-array
        return quick_select(arr, k, low, pivot_index-1)
    else:
        # the k-th smallest element is in the right sub-array;
        # low and high are absolute indices into arr, so k is passed unchanged
        return quick_select(arr, k, pivot_index+1, high)


In [134]:
a = [-1,0,-2,1,7,4]
quick_select(a, 4)

'the fourth smallest element is 1'

- a partition takes $\mathcal{O}(n)$ operations, since it scans once through the elements of the sub-array
- for quick sort the best and average scenario is when the pivot separates the array into two sub-arrays of roughly the same size. The recursion then has depth $\log_2(n)$, and the partitions at each depth take $\mathcal{O}(n)$ in total, giving $\mathcal{O}(n\log_2 n)$
- the worst case is when at each recursion we choose the smallest or the biggest remaining element, leading to a recursion depth of $n-1$ and $\mathcal{O}(n^2)$ time.
- for the space, the recursion stack has depth $\log_2 n$ in the average and best case, and $n$ in the worst case.
- for the quick select algorithm, in the worst case we have to make $n$ recursive calls, but in the best and average case we only recurse into the sub-array that contains the $k$-th smallest number, so:
  - at each recursion we reduce the size on average by half $(b=2)$ 
  - but we only have one sub-problem to solve $(a=1)$
  - and the partition step costs $\Theta(n^c)$ with $c=1$. 
  - so from the master method we have:
  $$ T(n) = T(n/2)+\Theta(n) \,\, , \,\, \log_b a=0<c \,\, \rightarrow \,\, T(n) = \Theta(n^c) = \Theta(n) $$
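
The linear bound for quick select can also be seen without the master method by unrolling the recurrence. As a sketch, assume each partition of a sub-array of size $m$ costs at most $cm$ and that every pivot halves the sub-array (an idealized version of the average case):

$$ T(n) \le cn + c\frac{n}{2} + c\frac{n}{4} + \dots \le cn\sum_{k=0}^{\infty}2^{-k} = 2cn = \mathcal{O}(n) $$

The work shrinks geometrically because only one side is visited, whereas quick sort visits both sides and pays $\mathcal{O}(n)$ at every one of the $\log_2 n$ levels.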

|operation | Best time | Average time | Worst time | Best/average space | Worst space|
|----------|------|------|------|------|------|
|partition |$\mathcal{O}(n)$|$\mathcal{O}(n)$|$\mathcal{O}(n)$|$\mathcal{O}(1)$|$\mathcal{O}(1)$|
|quicksort |$\mathcal{O}(n\log_2n)$|$\mathcal{O}(n\log_2n)$|$\mathcal{O}(n^2)$|$\mathcal{O}(\log_2n)$|$\mathcal{O}(n)$|
|quickselect |$\mathcal{O}(n)$|$\mathcal{O}(n)$|$\mathcal{O}(n^2)$|$\mathcal{O}(\log_2n)$|$\mathcal{O}(n)$|

## Counting, Radix and Bucket sort
- All the previous algorithms are based on comparing pairs of elements; we call them *comparison sorts*. 
- Every comparison sort must make $\Omega(n\log n)$ comparisons in the worst case. This can be proven by considering the decision tree of comparisons: each of the $n!$ permutations of the input must appear as a leaf, and a binary tree of height $h$ has at most $2^h$ leaves, so $$ n!\le 2^h \rightarrow h\ge \log_2(n!) = \Omega(n\log n) $$
- Counting, Radix and Bucket sort do not compare elements; under assumptions on the input (e.g. integers from a known, bounded range) they run in linear time
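
The bound $h \ge \log_2(n!)$ can be evaluated for small $n$ to get a feeling for it. The helper name `min_comparisons` below is hypothetical; it computes only the information-theoretic lower bound, not the comparison count of any particular algorithm.

```python
import math


def min_comparisons(n: int) -> int:
    """Lower bound ceil(log2(n!)) on the worst-case comparisons needed to sort n distinct keys."""
    return math.ceil(math.log2(math.factorial(n)))


for n in (2, 3, 4, 5, 10):
    # compare the exact bound with n*log2(n)
    print(n, min_comparisons(n), round(n * math.log2(n), 1))
```

For example, sorting 3 distinct keys needs at least 3 comparisons in the worst case, and sorting 10 keys needs at least 22.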

### Counting sort
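Given an array of non-negative integers whose range is known, i.e. the maximum is $k$, we can sort them in linear time by counting the number of appearances of each value and storing it in an array of size $k+1$.
- initialize an auxiliary array `count` of size $k+1$ with zeros and an output array of size $n$
- scan through each integer $x$ in `arr` and increase `count[x]` by one
- scan through `count` in increasing order of value; the number of integers smaller than a value (the sum of the preceding counts) is the position where that value starts in the output, so write the value `count[value]` times starting there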
 
The complexity of counting sort algorithm is:
- $\mathcal{O}(n+k)$ for the time complexity
- $\mathcal{O}(n+k)$ for the space complexity

In [10]:
def counting_sort(arr: list[int] , k: int)-> list[int]:
    '''Counting sort of non-negative integers with maximum value at most k'''
    n = len(arr)
    arr_sorted = [None]*n
    # auxiliary array counting the number of appearances of each number in arr
    count = [0]*(k+1)
    
    for number in arr:
        count[number]+=1
        
    position = 0 
    for number, times in enumerate(count):
        for p in range(position, position + times):
            arr_sorted[p] = number
            
        position+=times
    
    return arr_sorted

In [9]:
a = [9,8,7,1,2,3,6,5,4,4,5,6,6,7]
counting_sort(a, 9)

[1, 2, 3, 4, 4, 5, 5, 6, 6, 6, 7, 7, 8, 9]
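
A common textbook variant of counting sort turns the counts into cumulative counts and then places each input element individually. This makes the sort stable, which matters once the integers are keys attached to other data (e.g. when counting sort is used as a subroutine of radix sort). The sketch below is one way to write it; the name `counting_sort_stable` is hypothetical and the input is assumed to contain only integers in $0..k$.

```python
def counting_sort_stable(arr: list[int], k: int) -> list[int]:
    """Stable counting sort for integers in the range 0..k (inclusive)."""
    count = [0] * (k + 1)
    for number in arr:
        count[number] += 1

    # cumulative counts: count[v] becomes the number of elements <= v
    for value in range(1, k + 1):
        count[value] += count[value - 1]

    result = [None] * len(arr)
    # walk the input backwards so that equal keys keep their relative order
    for number in reversed(arr):
        count[number] -= 1
        result[count[number]] = number
    return result


a = [9, 8, 7, 1, 2, 3, 6, 5, 4, 4, 5, 6, 6, 7]
print(counting_sort_stable(a, 9) == sorted(a))
```

Time and space remain $\mathcal{O}(n+k)$: two passes over the input and one pass over the count array.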