# Sorting algorithms

**A sorting algorithm** is an algorithm that puts elements of a list in order.

- numerical order 

0, 2, 7, 8, 1, 3, 17, 10, 12 -> 0, 1, 2, 3, 7, 8, 10, 12, 17

- lexicographical order

a, c, n, b, b, c, a -> a, a, b, b, c, c, n 

An array can be sorted in place or using additional memory.

**Stable** sorting algorithms maintain the relative order of records with equal keys.

![Title](img/stable_sort.png)
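
As a small illustration of stability, the sketch below sorts `(key, label)` pairs by key only. The sample records are made up for this example; it relies on Python's built-in `sorted`, which is documented to be stable.

```python
# Records are (key, label) pairs; only the key takes part in the comparison.
records = [(2, "a"), (1, "b"), (2, "c"), (1, "d")]

# sorted() is stable, so records with equal keys keep their input order.
stable = sorted(records, key=lambda rec: rec[0])
print(stable)  # [(1, 'b'), (1, 'd'), (2, 'a'), (2, 'c')]
```

Within each key the labels appear in the same order as in the input, which is exactly the stability property.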



## Selection sort

The array is divided into 2 parts: the left one is sorted, and the right one is not.

At each step:

- look for the minimum in the right part.

- swap the minimum element with the first element of the right part.

- shift the boundary by 1 to the right.


![Title](https://upload.wikimedia.org/wikipedia/commons/9/94/Selection-Sort-Animation.gif)

Worst-case: $O(n^{2})$ comparisons, $O(n)$ swaps

Best-case: $O(n^{2})$

Average: $O(n^{2})$

In-place sorting, not stable

In [50]:
def selection_sort(a):
    n = len(a)
    for i in range(n):
        min_index = i
        for j in range(i+1, n):
            if a[j] < a[min_index]:
                min_index = j
        a[i], a[min_index] = a[min_index], a[i]
        print(f'Step: {a}')
        
a = [8,5,2,6,9,3,1,4,0,7]
selection_sort(a)
a

Step: [0, 5, 2, 6, 9, 3, 1, 4, 8, 7]
Step: [0, 1, 2, 6, 9, 3, 5, 4, 8, 7]
Step: [0, 1, 2, 6, 9, 3, 5, 4, 8, 7]
Step: [0, 1, 2, 3, 9, 6, 5, 4, 8, 7]
Step: [0, 1, 2, 3, 4, 6, 5, 9, 8, 7]
Step: [0, 1, 2, 3, 4, 5, 6, 9, 8, 7]
Step: [0, 1, 2, 3, 4, 5, 6, 9, 8, 7]
Step: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
Step: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
Step: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]


[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
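
To see why selection sort is not stable, here is a minimal sketch that sorts `(key, label)` pairs by key. The helper name `selection_sort_by_key` and the sample data are assumptions made for this illustration; the algorithm is the same as above, only the comparison looks at the key.

```python
def selection_sort_by_key(records):
    """Selection sort on (key, label) pairs, comparing keys only."""
    a = list(records)  # work on a copy so the input stays untouched
    n = len(a)
    for i in range(n):
        min_index = i
        for j in range(i + 1, n):
            if a[j][0] < a[min_index][0]:
                min_index = j
        # This long-distance swap can move a record past an equal key.
        a[i], a[min_index] = a[min_index], a[i]
    return a

unstable = selection_sort_by_key([(2, "a"), (2, "b"), (1, "c")])
print(unstable)  # [(1, 'c'), (2, 'b'), (2, 'a')]
```

The record `(2, 'a')` is swapped to the end in the first step, so it ends up after `(2, 'b')` even though it came first in the input.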

## Insertion sort


The array is divided into 2 parts: the left one is sorted, and the right one is not.

At each step:

- take the next element in the right part.

- place it in the correct position in the left part.


![Title](https://upload.wikimedia.org/wikipedia/commons/8/81/Dsa_ins_sort.png)



In [51]:
def insertion_sort(a):
    n = len(a)
    for i in range(1,n):
        curr_elem = a[i]
        j = i - 1
        while j >= 0 and curr_elem < a[j]:
            a[j+1] = a[j]
            j=j-1
        a[j+1] = curr_elem
        print(f'Step: {a}')
        
a = [8,5,2,6,9,3,1,4,0,7]
insertion_sort(a)
a

Step: [5, 8, 2, 6, 9, 3, 1, 4, 0, 7]
Step: [2, 5, 8, 6, 9, 3, 1, 4, 0, 7]
Step: [2, 5, 6, 8, 9, 3, 1, 4, 0, 7]
Step: [2, 5, 6, 8, 9, 3, 1, 4, 0, 7]
Step: [2, 3, 5, 6, 8, 9, 1, 4, 0, 7]
Step: [1, 2, 3, 5, 6, 8, 9, 4, 0, 7]
Step: [1, 2, 3, 4, 5, 6, 8, 9, 0, 7]
Step: [0, 1, 2, 3, 4, 5, 6, 8, 9, 7]
Step: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]


[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

Worst-case: $O(n^{2})$

Best-case: $O(n)$

Average: $O(n^{2})$

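We can use binary search in the left part to find the insertion position. This reduces the number of comparisons to $O(n \log n)$, but shifting the elements still takes $O(n^{2})$ time in the worst case, and for large arrays binary search causes more cache misses than linear search.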
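
A possible sketch of insertion sort with binary search, using `bisect.bisect_right` from the standard library. The function name `binary_insertion_sort` is an assumption for this example, not part of the original notebook.

```python
import bisect

def binary_insertion_sort(a):
    """Insertion sort that finds the insertion position with binary search."""
    for i in range(1, len(a)):
        curr_elem = a[i]
        # Search only the sorted left part a[0:i]; bisect_right places the
        # element after any equal ones, which keeps the sort stable.
        pos = bisect.bisect_right(a, curr_elem, 0, i)
        # Shift a[pos:i] one position to the right, then insert the element.
        a[pos + 1:i + 1] = a[pos:i]
        a[pos] = curr_elem
    return a

print(binary_insertion_sort([8, 5, 2, 6, 9, 3, 1, 4, 0, 7]))
```

Only the search becomes logarithmic; the shift is still linear per element.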

In-place sorting, stable.

Any sorting algorithm that uses comparisons needs $\Omega(n \log n)$ comparisons in the worst case.

## Heap sort

- Build a heap on the input array

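- N-1 times, take the maximum element from the heap (the root) and swap it with the last element of the heap; it becomes the first element of the sorted right part of the array

- Sift down (`SiftDown`) the new element in the root node over the reduced heap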

![Title](img/heap_sort.png)

Building a heap: $O(n)$

Sorting: $O(n \log n)$

No faster best case: the running time is $O(n \log n)$ regardless of the input order. Unstable, in-place.


In [52]:
class BinMinHeap:
    def SiftDown(self, i):
        left = 2*i + 1
        right = 2*i + 2
        smallest = i  # index of the smallest among node i and its children
        if left < len(self.heap_array) and self.heap_array[left] < self.heap_array[i]:
            smallest = left
        if right < len(self.heap_array) and self.heap_array[right] < self.heap_array[smallest]:
            smallest = right
        if smallest != i:
            self.heap_array[i], self.heap_array[smallest] = self.heap_array[smallest], self.heap_array[i]
            self.SiftDown(smallest)
            
    def SiftUp(self, i):
        while i > 0:
            parent = (i-1)//2
            if self.heap_array[i] >= self.heap_array[parent]:
                return
            self.heap_array[i], self.heap_array[parent] = self.heap_array[parent], self.heap_array[i]
            i = parent
        
    def __init__(self, array):
        self.heap_array = array
        for i in reversed(range(len(self.heap_array)//2)):
            self.SiftDown(i)
            
    def add(self, element):
        self.heap_array.append(element)
        self.SiftUp(len(self.heap_array) - 1)
        
    def get_min(self):
        min_elem = self.heap_array[0]
        leaf_elem = self.heap_array.pop()
        if self.heap_array:
            self.heap_array[0] = leaf_elem 
            self.SiftDown(0)
        return(min_elem)

a = [9,8,7,6,5,4,3,2,1,0]

bin_heap = BinMinHeap(a)
print(bin_heap.heap_array)

for i in range(len(a)):
    print(bin_heap.get_min())

[0, 1, 3, 2, 5, 4, 7, 9, 6, 8]
0
1
2
3
4
5
6
7
8
9
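
The class above is a min-heap that returns elements one by one. The in-place procedure described in the steps of this section uses a max-heap stored in the array itself. A minimal sketch under that assumption follows; the names `sift_down` and `heap_sort` are chosen for this example.

```python
def sift_down(a, i, size):
    """Restore the max-heap property for the subtree rooted at i in a[0:size]."""
    while True:
        left, right, largest = 2 * i + 1, 2 * i + 2, i
        if left < size and a[left] > a[largest]:
            largest = left
        if right < size and a[right] > a[largest]:
            largest = right
        if largest == i:
            return
        a[i], a[largest] = a[largest], a[i]
        i = largest

def heap_sort(a):
    n = len(a)
    # Build a max-heap in O(n) by sifting down every internal node.
    for i in reversed(range(n // 2)):
        sift_down(a, i, n)
    # Move the maximum to the end of the heap, shrink the heap, repair the root.
    for end in range(n - 1, 0, -1):
        a[0], a[end] = a[end], a[0]
        sift_down(a, 0, end)
    return a

print(heap_sort([8, 5, 2, 6, 9, 3, 1, 4, 0, 7]))
```

No extra array is allocated: the sorted right part grows inside the input array as the heap shrinks.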


## Merge sort

1. Split the array into two parts

2. Sort each part recursively

3. Merge the sorted parts into one

![Title](https://upload.wikimedia.org/wikipedia/commons/thumb/e/e6/Merge_sort_algorithm_diagram.svg/800px-Merge_sort_algorithm_diagram.svg.png)

Stable.

Merge sort can be implemented without recursion (bottom-up): treat the input array as sorted subarrays of length 1, then repeatedly merge adjacent subarrays, doubling their length on each pass.
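
A minimal sketch of a non-recursive (bottom-up) merge sort, assuming runs of length 1, 2, 4, and so on are merged pairwise. The names `merge_runs` and `merge_sort_bottom_up` are hypothetical helpers written for this example.

```python
def merge_runs(left, right):
    """Merge two sorted lists into a new sorted list (stable)."""
    result = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            result.append(left[i])
            i += 1
        else:
            result.append(right[j])
            j += 1
    # At most one of these two tails is non-empty.
    result.extend(left[i:])
    result.extend(right[j:])
    return result

def merge_sort_bottom_up(a):
    n = len(a)
    width = 1  # current length of the already sorted runs
    while width < n:
        for lo in range(0, n, 2 * width):
            mid = min(lo + width, n)
            hi = min(lo + 2 * width, n)
            a[lo:hi] = merge_runs(a[lo:mid], a[mid:hi])
        width *= 2
    return a

print(merge_sort_bottom_up([9, 8, 7, 6, 5, 4, 3, 2, 1, 0]))
```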

### Merging

1. Select the array whose first element is smaller

2. Move this element to the end of the result array

3. Continue until one of the arrays is empty

4. Copy the rest of the other array to the end of the result array

Worst case: $O(n+m)$ comparisons

Best case: $O(\min(n,m))$ comparisons


![Title](img/merging.png)


In [53]:
def merge(a, l, m, r):
    i = 0
    j = 0
    k = 0
    L = a[l:m]
    length_left = len(L)
    R = a[m:r]
    length_right = len(R)
    print(f"merging L: {L} and R:{R}")
    while i < length_left and j < length_right: 
        if L[i] <= R[j]:
            a[l+k] = L[i]
            i, k = i+1, k+1
        else:
            a[l+k] = R[j]
            j, k = j+1, k+1
            
    while i < length_left:
        a[l+k] = L[i]
        i, k = i+1, k+1
    while j < length_right:
        a[l+k] = R[j]
        j, k = j+1, k+1
        
def merge_sort(a, l, r):
    if l >= r-1:
        return
    m = l + (r-l)//2
    merge_sort(a, l, m)
    merge_sort(a, m, r)
    merge(a, l, m, r) 

a = [9,8,7,6,5,4,3,2,1,0]
merge_sort(a, 0, len(a))
a


merging L: [9] and R:[8]
merging L: [6] and R:[5]
merging L: [7] and R:[5, 6]
merging L: [8, 9] and R:[5, 6, 7]
merging L: [4] and R:[3]
merging L: [1] and R:[0]
merging L: [2] and R:[0, 1]
merging L: [3, 4] and R:[0, 1, 2]
merging L: [5, 6, 7, 8, 9] and R:[0, 1, 2, 3, 4]


[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

Merge sort complexity

$T(n) \leq 2T(\frac{n}{2}) + c \cdot n  \leq 4T(\frac{n}{4}) + 2c \cdot n \leq ... \leq  2^{k} \cdot T(1) + k \cdot c \cdot n$,

where $k = \log_2 n$

$T(n) = O(n \log n)$

$M = O(n)$, the size of the allocated memory is equal to the sum of the lengths of the merged arrays.

## QuickSort

- Divide the array into 2 parts using the partition function:

{elements from the left part} $\leq$ {elements from the right part}

- Apply this procedure recursively to the left and right parts.

### Partition function

0. In the input array A, select a pivot element

1. Put the pivot in position n-1 of A

2. Set 2 pointers: i to the first element of the array and j to the element before the pivot element

Repeat while $i<j$:

    3. Move i to the right while $A[i] < pivot$

    4. Move j to the left while $A[j] \geq pivot$

    5. Swap $A[i]$ and $A[j]$ if i<j
    
6. Swap $A[i]$ and $A[n-1]$ (pivot)
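
The partition steps can be sketched as follows. This is one possible reading of the procedure, not the notebook's own implementation: the function name `partition`, the `pivot_index` parameter, and the extra `i <= j` bounds checks inside the scans are assumptions added so that the pointers never leave the array.

```python
def partition(a, pivot_index):
    """Partition the whole list a around a[pivot_index].

    Returns the final index p of the pivot: a[:p] < pivot <= a[p+1:].
    """
    n = len(a)
    # Step 1: park the pivot at position n-1.
    a[pivot_index], a[n - 1] = a[n - 1], a[pivot_index]
    pivot = a[n - 1]
    # Step 2: i starts at the first element, j just before the pivot.
    i, j = 0, n - 2
    while i <= j:
        # Step 3: skip elements that are already on the correct (left) side.
        while i <= j and a[i] < pivot:
            i += 1
        # Step 4: skip elements that are already on the correct (right) side.
        while i <= j and a[j] >= pivot:
            j -= 1
        # Step 5: a[i] >= pivot and a[j] < pivot are both misplaced.
        if i < j:
            a[i], a[j] = a[j], a[i]
            i, j = i + 1, j - 1
    # Step 6: put the pivot between the two parts.
    a[i], a[n - 1] = a[n - 1], a[i]
    return i

data = [8, 5, 2, 6, 9, 3, 1, 4, 0, 7]
p = partition(data, 3)  # pivot value is 6
print(p, data)
```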

![Title](img/partition.png)

Best case: $O(n \log n)$

The number of elements smaller than the pivot equals the number of elements larger than it, i.e. the pivot is the median.

$T(n) \leq 2T(\frac{n}{2}) + c \cdot n$




Worst case, e.g. the array is already sorted and the first or last element is chosen as the pivot: $O(n^{2})$

![Title](img/qsort_worst.png)

Average: $O(n \log n)$

### How to choose pivot element?

- First element

- Last element

- Middle element

- Random index

- Median(first, middle, last)

- Median of three random

- Median of the whole array ($O(n)$)

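Quicksort is unstable and in-place. The implementation below picks the pivot as the median of the first, middle, and last elements using `statistics.median`.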

In [1]:
import statistics
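def qsort(a, l, r):
    # Sorts a[l:r] in place; r is exclusive.
    if l >= r-1:
        return
    i, j = l, r-1
    m = l + (r-l)//2
    # Median of three: the pivot is always a value present in a[l:r].
    pivot = statistics.median([a[l], a[m], a[r-1]])
    while i <= j:
        while a[i] < pivot: i += 1
        while a[j] > pivot: j -= 1
        if i <= j:
            a[i], a[j] = a[j], a[i]
            i, j = i + 1, j - 1
    # a[l:j+1] holds elements <= pivot and a[i:r] holds elements >= pivot.
    # r is exclusive, so the left call must end at j+1 to include a[j].
    qsort(a, l, j+1)
    qsort(a, i, r)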
    
a = [9,8,7,6,5,4,3,2,1,0]
qsort(a, 0, len(a))
a

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

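Lower bound for all comparison-based sorts: $\Omega(n \log n)$ comparisons in the worst case.


A comparison-based sorting algorithm can be modeled as a binary decision tree: each internal node is a comparison, its two branches are the possible outcomes, and each leaf is a resulting ordering.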




![Title](img/sorting_naive.png)



n! possible orderings of an array with n elements -> n! leaves in the decision tree

Tree height is at least $\log(n!)$

$n! = \sqrt{2 \pi n} \cdot (\frac{n}{e})^n \cdot (1+\Theta(\frac{1}{n}))$

$\log(n!) = \log( \sqrt{2 \pi n} \cdot (\frac{n}{e})^n \cdot (1+\Theta(\frac{1}{n}))) = \frac{1}{2}\log(2\pi n) + n \log(\frac{n}{e}) + \log(1+\Theta(\frac{1}{n})) \geq n \log n - n \log e$
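
The bound can be checked numerically. The sketch below is only a sanity check, not a proof; it uses `math.lgamma`, for which `lgamma(n + 1)` equals $\ln(n!)$, and the helper name `log2_factorial` is an assumption for this example.

```python
import math

def log2_factorial(n):
    """log2(n!) computed via the log-gamma function: lgamma(n + 1) == ln(n!)."""
    return math.lgamma(n + 1) / math.log(2)

for n in (10, 100, 1000):
    lower_bound = n * math.log2(n) - n * math.log2(math.e)
    # The decision tree height log2(n!) should never fall below the bound.
    print(n, round(log2_factorial(n), 1), round(lower_bound, 1))
```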


## Counting sort

Non-comparison-based, integer sorting algorithm

It is particularly efficient when the range of input values is small compared to the number of elements to be sorted.

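$O(n + k)$, where $k$ is the range of input values

Stable sort

1) Get the $max$ element

2) Allocate a counting array of length $max + 1$

3) Count the occurrences of each possible element in the array

4) Write the elements to the output in increasing order according to their counts (using prefix sums of the counts keeps the sort stable)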

![Title](img/counting_sort.png)
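
A minimal sketch of counting sort, assuming the input contains only non-negative integers. The function name `counting_sort` is chosen for this example; the prefix-sum pass and the right-to-left placement are the usual way to keep the sort stable.

```python
def counting_sort(a):
    """Return a sorted copy of a list of non-negative integers."""
    if not a:
        return []
    max_value = max(a)
    # One counter per possible value 0..max_value.
    counts = [0] * (max_value + 1)
    for x in a:
        counts[x] += 1
    # Prefix sums: counts[v] becomes the number of elements <= v.
    for v in range(1, max_value + 1):
        counts[v] += counts[v - 1]
    result = [None] * len(a)
    # Walking the input from right to left keeps equal elements in order.
    for x in reversed(a):
        counts[x] -= 1
        result[counts[x]] = x
    return result

print(counting_sort([4, 1, 3, 4, 0, 1, 2]))  # [0, 1, 1, 2, 3, 4, 4]
```

The counting array has `max_value + 1` entries, so the approach only pays off when the value range is small relative to the number of elements.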



## Radix Sort

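Digits can be processed in two orders:

- least significant digit first (LSD)

- most significant digit first (MSD)

Use a stable sort such as counting sort to sort by each digit

$T(n) = O(k \cdot n)$, where $k$ is the number of digits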

![Title](https://media.geeksforgeeks.org/wp-content/uploads/20230307194541/Radix-Sort-in-C-1-768.png)
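
A possible sketch of the least-significant-digit variant, assuming non-negative integers and a stable counting sort on each digit. The function name `radix_sort_lsd` and the `base` parameter are assumptions for this example.

```python
def radix_sort_lsd(a, base=10):
    """Return a sorted copy of a list of non-negative integers (LSD radix sort)."""
    if not a:
        return []
    result = list(a)
    max_value = max(result)
    exp = 1  # base**d for the digit d currently being processed
    while max_value // exp > 0:
        # Stable counting sort keyed on the current digit.
        counts = [0] * base
        for x in result:
            counts[(x // exp) % base] += 1
        for d in range(1, base):
            counts[d] += counts[d - 1]
        output = [0] * len(result)
        for x in reversed(result):
            d = (x // exp) % base
            counts[d] -= 1
            output[counts[d]] = x
        result = output
        exp *= base
    return result

print(radix_sort_lsd([170, 45, 75, 90, 802, 24, 2, 66]))
# [2, 24, 45, 66, 75, 90, 170, 802]
```

Stability of the per-digit pass is what makes the result correct: the order established by lower digits is preserved when sorting by a higher digit.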

