# Sorting

## Merge sort

Merge sort runs in $O(N \log(N))$ time. The merge step walks through two sorted lists and returns a single sorted list. Applying it recursively to the two halves of the array yields a fully sorted array.
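The bound can be recovered from a recurrence for the running time. As a sketch, assume $N$ is a power of two and that merging two lists of total length $N$ costs $cN$ for some constant $c$ (both are simplifying assumptions, not stated above):

$$
T(N) = 2\,T(N/2) + cN, \qquad T(1) = c
$$

Unrolling gives $\log_2 N$ levels of recursion, each doing $cN$ work in total, plus $cN$ for the $N$ base cases:

$$
T(N) = cN \log_2 N + cN = O(N \log(N))
$$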

## Quick sort

Worst-case complexity is $O(N^2)$, but the average running time is $O(N \log(N))$. On most datasets it runs slightly faster than merge sort. It picks a pivot, moves all elements smaller than the pivot before it and all larger elements after it, then recurses on both parts.
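The gap between the worst and the average case can be made concrete by counting comparisons. The sketch below is an illustration only: `count_comparisons` is a hypothetical helper, and it always takes the first element as pivot, which is an assumption chosen to make the worst case easy to trigger with already sorted input.

```python
from random import Random

def count_comparisons(arr):
    """Count pivot comparisons made by a quick sort that always uses
    the first element as pivot (an assumption of this sketch)."""
    if len(arr) <= 1:
        return 0
    pivot, rest = arr[0], arr[1:]
    lesser = [x for x in rest if x <= pivot]
    greater = [x for x in rest if x > pivot]
    # every remaining element is compared against the pivot once,
    # then both partitions are processed recursively
    return len(rest) + count_comparisons(lesser) + count_comparisons(greater)

n = 100
sorted_input = list(range(n))
shuffled_input = sorted_input[:]
Random(0).shuffle(shuffled_input)  # fixed seed so the run is repeatable

# sorted input: one partition is always empty, so N * (N - 1) / 2 comparisons
worst = count_comparisons(sorted_input)
# shuffled input: partitions are usually more balanced, so far fewer comparisons
typical = count_comparisons(shuffled_input)
print(worst, typical)
```

On sorted input every pivot is the minimum, so the recursion peels off one element at a time. Picking the pivot at random makes this pattern unlikely for any particular input.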

# Merge sort

In [1]:
%pprint
from random import randint, choice

Pretty printing has been turned OFF


In [2]:
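def merge(arr1, arr2):
    """Merge two sorted lists into one sorted list."""
    i, j = 0, 0
    res = []
    while i < len(arr1) and j < len(arr2):
        # <= takes from arr1 on ties, which keeps the sort stable
        if arr1[i] <= arr2[j]:
            res.append(arr1[i])
            i += 1
        else:
            res.append(arr2[j])
            j += 1

    # at most one of the two remaining slices is non-empty
    return res + arr1[i:] + arr2[j:]

def merge_sort(arr):
    """Return the elements of arr as a sorted list; arr is not modified."""
    # <= 1 also covers the empty list, which == 1 would recurse on forever
    if len(arr) <= 1:
        return arr

    mid_point = len(arr) // 2
    half1 = merge_sort(arr[:mid_point])
    half2 = merge_sort(arr[mid_point:])
    return merge(half1, half2)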

lst = [randint(1, 1000) for _ in range(20)]
merge_sort(lst)

[29, 63, 237, 239, 262, 284, 318, 378, 392, 446, 446, 553, 577, 622, 718, 804, 866, 882, 916, 980]

In [3]:
def test_mergesort():
    for _ in range(10):
        lst = [randint(0, 1000) for _ in range(randint(1, 30))]
        res = merge_sort(lst)
        exp = sorted(lst)
        assert res == exp, f"{res} was not {exp}"

test_mergesort()

## Observations

- Would be faster if done in place: many temporary lists are created needlessly
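One way to act on this observation is sketched below. It is not strictly in place: it still uses $O(N)$ extra memory, but as a single buffer allocated once instead of new lists at every level of recursion. The name `merge_sort_buffered` and the index-based helper `sort` are hypothetical, introduced only for this illustration.

```python
def merge_sort_buffered(arr):
    """Sort arr in place using one reusable buffer (hypothetical sketch)."""
    buf = arr[:]  # the only auxiliary list, allocated once

    def sort(lo, hi):
        # sorts the range arr[lo:hi] without slicing
        if hi - lo <= 1:
            return
        mid = (lo + hi) // 2
        sort(lo, mid)
        sort(mid, hi)

        # merge the sorted ranges arr[lo:mid] and arr[mid:hi] into buf[lo:hi]
        i, j, k = lo, mid, lo
        while i < mid and j < hi:
            if arr[i] <= arr[j]:
                buf[k] = arr[i]
                i += 1
            else:
                buf[k] = arr[j]
                j += 1
            k += 1
        # copy whatever is left in the half that was not exhausted
        while i < mid:
            buf[k] = arr[i]
            i += 1
            k += 1
        while j < hi:
            buf[k] = arr[j]
            j += 1
            k += 1

        # write the merged range back into arr
        for t in range(lo, hi):
            arr[t] = buf[t]

    sort(0, len(arr))
    return arr
```

Unlike `merge_sort` above, this variant modifies its argument and returns the same list object, so callers that need the original order should pass a copy.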

# Quick sort

In [4]:
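def reshuffle(arr, pivot):
    """Move elements smaller than or equal to the pivot before the pivot
    and larger elements after it.

    Note: removes one occurrence of the pivot from arr in place.
    """
    arr.remove(pivot)
    return (
        [x for x in arr if x <= pivot]
        + [pivot]
        + [x for x in arr if x > pivot]
    )

def quick_sort(arr):
    """Return the elements of arr as a sorted list.

    Note: arr itself loses its pivot element (see arr.remove below).
    """
    if len(arr) <= 1:
        return arr
    if len(arr) == 2:
        return [min(arr), max(arr)]
    pivot = choice(arr)  # a random pivot makes the O(N^2) worst case unlikely
    arr.remove(pivot)    # mutates the caller's list
    lesser = [x for x in arr if x <= pivot]
    greater = [x for x in arr if x > pivot]
    return quick_sort(lesser) + [pivot] + quick_sort(greater)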

lst = [randint(0, 99) for _ in range(20)]
print(lst)
print(quick_sort(lst))

[72, 59, 83, 88, 59, 97, 69, 60, 22, 48, 44, 15, 74, 95, 97, 29, 36, 61, 3, 20]
[3, 15, 20, 22, 29, 36, 44, 48, 59, 59, 60, 61, 69, 72, 74, 83, 88, 95, 97, 97]


In [5]:
def test_quicksort():
    for _ in range(10):
        lst = [randint(0, 1000) for _ in range(randint(1, 30))]
        exp = sorted(lst)  # computed first: quick_sort removes the pivot from lst in place
        res = quick_sort(lst)
        assert res == exp, f"\n{res}\nwas not\n{exp}"

test_quicksort()