# Testing

In [5]:
import unittest

class TestSortingAlgo(unittest.TestCase):

    def test_sort(self, sorting_algorithm):
        self.assertEqual(sorting_algorithm([]), [], "Empty list")
        self.assertEqual(sorting_algorithm([1]), [1], "One-element list")
        self.assertEqual(sorting_algorithm([1, 1, 1]), [1, 1, 1], "List of duplicates")
        self.assertEqual(sorting_algorithm([1, 2, 3]), [1, 2, 3], "Sorted list")
        self.assertEqual(sorting_algorithm([1, 1, 1, 2, 2, 2, 2, 3]), [1, 1, 1, 2, 2, 2, 2, 3], "Sorted list with duplicates")
        self.assertEqual(sorting_algorithm([7, 2, 5, 6, 1, 9]), [1, 2, 5, 6, 7, 9], "Unsorted list")
        self.assertEqual(sorting_algorithm([7, 2, 1, 6, 1, 9, 5, 2]), [1, 1, 2, 2, 5, 6, 7, 9], "Unsorted list with duplicates")

# Selection-sort

In [38]:
def selection_sort(array):
    L = len(array)
    for i in range(L-1):
        current_min_idx = i
        for j in range(i+1, L):
            if array[j] < array[current_min_idx]:
                current_min_idx = j
        array[i], array[current_min_idx] = array[current_min_idx], array[i]
    return array

TestSortingAlgo().test_sort(sorting_algorithm = selection_sort)

## Selection-sort time complexity

Selection-sort has an O(n<sup>2</sup>) time complexity. Counting the comparisons made by the inner loop on each pass of the outer loop:
* 1st pass does n-1 comparisons
* 2nd pass does n-2 comparisons
* ...
* (n-1)th pass does 1 comparison

(n-1)+(n-2)+...+1 = n(n-1)/2 = O(n<sup>2</sup>)
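
As a quick sanity check of this count, the sketch below instruments a copy of selection-sort with a comparison counter. The helper name `count_selection_comparisons` is hypothetical and only used for illustration; it works on a copy so the caller's list is left untouched.

```python
def count_selection_comparisons(array):
    """Selection-sort a copy of `array` and return the number of comparisons made."""
    array = list(array)
    n = len(array)
    comparisons = 0
    for i in range(n - 1):
        current_min_idx = i
        for j in range(i + 1, n):
            comparisons += 1  # one comparison per inner-loop iteration
            if array[j] < array[current_min_idx]:
                current_min_idx = j
        array[i], array[current_min_idx] = array[current_min_idx], array[i]
    return comparisons

for n in (1, 2, 5, 10, 100):
    # The count depends only on n, not on the initial order of the input
    assert count_selection_comparisons(range(n, 0, -1)) == n * (n - 1) // 2
    assert count_selection_comparisons(range(n)) == n * (n - 1) // 2
```

The count is the same for sorted and reversed inputs, which suggests that selection-sort has no favourable best case: it always performs n(n-1)/2 comparisons.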

# Insertion-sort

In [7]:
def insertion_sort(array):
    for i in range(1, len(array)):
        j = i - 1
        while j >= 0 and array[j] > array[j+1]:
            array[j], array[j + 1] = array[j + 1], array[j]
            j -= 1
    return array

TestSortingAlgo().test_sort(sorting_algorithm = insertion_sort)

## Insertion-sort time complexity

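The reasoning is very similar to selection-sort, with two nested loops. In the worst case (input sorted in reverse order), the inner loop moves the i<sup>th</sup> element all the way to the front, so the number of swaps is 1+2+...+(n-1) = n(n-1)/2 = O(n<sup>2</sup>). Unlike selection-sort, the inner loop stops as soon as the element is in place, so the best case (already sorted input) is O(n).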
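
To make the worst case concrete, the sketch below counts the swaps performed by a copy of the insertion-sort above. The helper name `count_insertion_swaps` is an assumption introduced for this example.

```python
def count_insertion_swaps(array):
    """Insertion-sort a copy of `array` and return the number of swaps made."""
    array = list(array)
    swaps = 0
    for i in range(1, len(array)):
        j = i - 1
        while j >= 0 and array[j] > array[j + 1]:
            array[j], array[j + 1] = array[j + 1], array[j]
            swaps += 1
            j -= 1
    return swaps

n = 10
# Reversed input: every element travels to the front of the array
assert count_insertion_swaps(range(n, 0, -1)) == n * (n - 1) // 2
# Sorted input: the inner loop never runs
assert count_insertion_swaps(range(n)) == 0
```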

# Heap sort

```
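import heapq

def heap_sort(array):
    L = len(array)
    H = []  # Binary min-heap; the standard heapq module stands in for a Heap class
    # Phase 1: move every element from the array into the heap
    for _ in range(L):
        heapq.heappush(H, array.pop())
    # Phase 2: repeatedly remove the minimum to refill the array in sorted order
    for _ in range(L):
        array.append(heapq.heappop(H))
    return array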
```

In the first loop, the i<sup>th</sup> add operation has O(log(i)) time complexity because the heap has i entries. The same holds for the second loop (with the heap sizes in reverse order), so the **overall time complexity is O[log(2) + ... + log(n)] = O(log(n!)) = O(nlog(n))**

*Proof:*
* log(n!) <= log(n<sup>n</sup>) = nlog(n)
* log(n!) = log(2) + ... + log(n) >= log(n/2) + ... + log(n) >= (n/2) log(n/2), keeping only the terms from n/2 upward: there are at least n/2 of them, each at least log(n/2)

So (n/2) log(n/2) <= log(n!) <= nlog(n), i.e. log(n!) = Θ(nlog(n))
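
The bounds on log(n!) can also be checked numerically. The sketch below verifies that log(n!) lies between (n/2) log(n/2) and nlog(n) for a few values of n. It relies on `math.lgamma(n + 1)`, which returns the natural logarithm of n!; the helper name `log_factorial_bounds` is hypothetical.

```python
import math

def log_factorial_bounds(n):
    """Return (lower, log(n!), upper) with lower = (n/2) log(n/2) and upper = n log(n)."""
    lower = (n / 2) * math.log(n / 2)
    log_factorial = math.lgamma(n + 1)  # lgamma(n + 1) is the natural log of n!
    upper = n * math.log(n)
    return lower, log_factorial, upper

for n in (2, 10, 100, 10_000):
    lower, log_factorial, upper = log_factorial_bounds(n)
    assert lower <= log_factorial <= upper
```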

# Merge sort

A divide-and-conquer algorithm.

In [61]:
def merge_sort(array):
    L = len(array)
    # Base case
    if L <= 1:
        return array
    # Recursive calls
    a1, a2 = merge_sort(array[:L//2]), merge_sort(array[L//2:])
    # Merge sorted arrays
    i, j, l1, l2 = 0, 0, len(a1), len(a2)
    while i + j < L:
        if j == l2 or (i < l1 and a1[i] < a2[j]):
            array[i+j] = a1[i]
            i += 1
        else:
            array[i+j] = a2[j]
            j += 1
    return array

TestSortingAlgo().test_sort(sorting_algorithm = merge_sort)

## Merge-sort time complexity

The merge step has an O(l1+l2) time complexity. The most intuitive way to understand the merge-sort time complexity is to look at the merge-sort tree. At depth i, there are 2<sup>i</sup> subsequences of about n/2<sup>i</sup> elements each, and each one takes O(n/2<sup>i</sup>) time to merge. So each level amounts to O(n) processing time, and there are about log(n) levels. Hence merge-sort is O(nlog(n)).
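
The level-by-level argument can be illustrated by counting the element writes done by the merge steps. The sketch below is an instrumented variant of the merge-sort above; the name `merge_sort_writes` is an assumption made for this example, and the function sorts its argument in place while returning the count.

```python
def merge_sort_writes(array):
    """Merge-sort `array` in place and return the number of element writes made by all merges."""
    L = len(array)
    # Base case: nothing to merge
    if L <= 1:
        return 0
    # Recursive calls on copies of both halves (each copy is sorted in place)
    a1, a2 = array[:L // 2], array[L // 2:]
    writes = merge_sort_writes(a1) + merge_sort_writes(a2)
    # Merge the two sorted halves back into `array`
    i, j, l1, l2 = 0, 0, len(a1), len(a2)
    while i + j < L:
        if j == l2 or (i < l1 and a1[i] < a2[j]):
            array[i + j] = a1[i]
            i += 1
        else:
            array[i + j] = a2[j]
            j += 1
        writes += 1
    return writes

for k in range(1, 11):
    n = 2 ** k
    data = list(range(n, 0, -1))
    # n writes per level, k = log2(n) levels
    assert merge_sort_writes(data) == n * k
    assert data == sorted(data)
```

When n is a power of two, every level of the tree writes exactly n elements, so the total is n log<sub>2</sub>(n).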

# Quick-sort

## 1) Common implementation

In [3]:
def quick_sort_common(array, start, end):
    # Base case: an array of 0 or 1 elements is already sorted
    if start >= end:
        return None
    
    # Partition input array
    left = start
    pivot = array[end]
    for right in range(start, end):
        if array[right] < pivot:
            array[left], array[right] = array[right], array[left]
            left += 1
    array[left], array[end] = array[end], array[left]

    quick_sort_common(array, start, left - 1)
    quick_sort_common(array, left + 1, end)

# Test
def quick_sort_test(array):
    quick_sort_common(array, 0, len(array)-1)
    return array

TestSortingAlgo().test_sort(sorting_algorithm = quick_sort_test)

In [4]:
quick_sort_test([3, 2, 1, 4])

[1, 2, 3, 4]

## Quick-sort time complexity

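Quick-sort runs in O(n<sup>2</sup>) worst-case time, but its best-case and average-case complexities are O(nlog(n)). The worst case occurs when every pivot is the smallest or largest element of its partition, which is what happens with the implementation above (last element as pivot) on an already sorted array.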
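
The gap between the worst case and a typical case can be observed by counting pivot comparisons. The sketch below is an instrumented copy of the common implementation; the name `count_quick_sort_comparisons`, the input size and the fixed random seed are all arbitrary choices made for this example.

```python
import random

def count_quick_sort_comparisons(array, start, end):
    """Quick-sort array[start..end] in place (last element as pivot); return the number of pivot comparisons."""
    if start >= end:
        return 0
    left = start
    pivot = array[end]
    for right in range(start, end):
        if array[right] < pivot:
            array[left], array[right] = array[right], array[left]
            left += 1
    array[left], array[end] = array[end], array[left]
    comparisons = end - start  # one comparison per element other than the pivot
    comparisons += count_quick_sort_comparisons(array, start, left - 1)
    comparisons += count_quick_sort_comparisons(array, left + 1, end)
    return comparisons

n = 200
# Already sorted input: the pivot is always the largest element of its partition
worst = count_quick_sort_comparisons(list(range(n)), 0, n - 1)
assert worst == n * (n - 1) // 2

# Shuffled input: far fewer comparisons than the worst case
shuffled = list(range(n))
random.Random(0).shuffle(shuffled)
typical = count_quick_sort_comparisons(shuffled, 0, n - 1)
assert typical < worst
assert shuffled == list(range(n))
```

Note that the recursion depth also grows linearly in the worst case, so much larger sorted inputs would hit Python's default recursion limit.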

## 2) Unintuitive implementation from *Data Structures & Algorithms* (Goodrich, Tamassia, Goldwasser)

In [5]:
def quick_sort_variant(array, start, end):
  
  # Base case: arrays with 0 or 1 element are already sorted
  if start >= end:
    return None

  # Partitioning (often enclosed in helper function)
  # 1) Put elements <= pivot on its left / elements >= pivot on its right
  # 2) Get pivot's index to know where to recursively quick-sort next
  i, j, pivot = start, end-1, array[end]
  while i <= j:
    while i <= j and array[i] < pivot:
      i += 1
    while i <= j and  pivot < array[j]:
      j -= 1
    if i <= j:
      array[i], array[j] = array[j], array[i]
      i += 1
      j -= 1

  # Finally, put the pivot back at the right index
  array[i], array[end] = array[end], array[i]
  
  # Recursively quick-sort both remaining partitions
  quick_sort_variant(array, start, i-1)
  quick_sort_variant(array, i+1, end)

# Test
def quick_sort_test_2(array):
    quick_sort_variant(array, 0, len(array)-1)
    return array

TestSortingAlgo().test_sort(sorting_algorithm = quick_sort_test_2)

### Details of above implementation

1) Line 11 (`while i <= j`): the inequality must be non-strict

This ensures we enter the outer loop at least once, so that *i* can advance all the way to the pivot's index when needed and no spurious swap is made at line 22. If the inequality is strict, arrays like [1,2] become [2,1] at line 22.

2) Lines 12 and 14 (inner `while` loops): the inequality must be non-strict when comparing *i* and *j*

Otherwise the condition at line 16 can be satisfied with *i* equal to *j*, which causes index *i* to advance for no reason: try [2,2,1].

Note: regarding the comparisons with the pivot, the inequalities can be either strict or non-strict.

3) Line 16 (`if i <= j`): the inequality must be non-strict

Otherwise some arrays, e.g. [1,1], trigger an infinite loop.
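
Edge cases like these are easy to miss with hand-picked tests. One way to gain confidence is to check the function exhaustively on every small array, as in the sketch below. The function body is a copy of the implementation above so that the check is self-contained; the value set {1, 2, 3} and the maximum length of 5 are arbitrary choices.

```python
from itertools import product

def quick_sort_variant(array, start, end):
    # Copy of the implementation above, repeated so the check is self-contained
    if start >= end:
        return
    i, j, pivot = start, end - 1, array[end]
    while i <= j:
        while i <= j and array[i] < pivot:
            i += 1
        while i <= j and pivot < array[j]:
            j -= 1
        if i <= j:
            array[i], array[j] = array[j], array[i]
            i += 1
            j -= 1
    array[i], array[end] = array[end], array[i]
    quick_sort_variant(array, start, i - 1)
    quick_sort_variant(array, i + 1, end)

# Try every array of length 0 to 5 with values in {1, 2, 3}
for length in range(6):
    for values in product((1, 2, 3), repeat=length):
        array = list(values)
        quick_sort_variant(array, 0, length - 1)
        assert array == sorted(values), values
```

Changing any of the three inequalities discussed above to a strict one should make this check fail, or loop forever in the case of line 16.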

# Bubble-sort

In [82]:
def bubble_sort(array):
    L=len(array)
    for i in range(L-1):
        swapped = False
        for j in range(L-1-i):
            if array[j] > array[j+1]:
                array[j], array[j+1] = array[j+1], array[j]
                swapped = True
        if not swapped: # All elements are ordered. No need to proceed with next value of i
            break
    return array

TestSortingAlgo().test_sort(sorting_algorithm = bubble_sort)

## Time complexity

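Bubble-sort has O(n<sup>2</sup>) worst-case and average-case time complexity. Thanks to the `swapped` flag, the best case (already sorted input) is O(n). In practice it tends to perform poorly even among the O(n<sup>2</sup>) family (insertion, selection).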
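
The effect of the early exit can be seen by counting comparisons. The sketch below instruments a copy of the bubble-sort above; the helper name `count_bubble_comparisons` is hypothetical.

```python
def count_bubble_comparisons(array):
    """Bubble-sort a copy of `array` and return the number of comparisons made."""
    array = list(array)
    n = len(array)
    comparisons = 0
    for i in range(n - 1):
        swapped = False
        for j in range(n - 1 - i):
            comparisons += 1
            if array[j] > array[j + 1]:
                array[j], array[j + 1] = array[j + 1], array[j]
                swapped = True
        if not swapped:  # No swap during this pass: the array is sorted
            break
    return comparisons

n = 10
# Sorted input: a single pass of n-1 comparisons, then early exit
assert count_bubble_comparisons(range(n)) == n - 1
# Reversed input: every pass swaps, so all n-1 passes run
assert count_bubble_comparisons(range(n, 0, -1)) == n * (n - 1) // 2
```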

# Bucket sort

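Applies when the elements are integers in a finite range, e.g. [0, N]. Create a bucket for each possible value and count the number of times that value appears in the original array. Finally, build the sorted array from the counts. This one-bucket-per-value variant is often called counting sort, and it runs in O(n + N) time.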

In [24]:
def bucket_sort(array, max_value):
    # Count how many times each value appears in the array
    counts = [0]*(max_value+1)
    for value in array:
        counts[value] += 1
    
    # Build the sorted array
    i = 0
    for value, count in enumerate(counts):
        for _ in range(count):
            array[i] = value
            i += 1
    return array

bucket_sort([2,2,0,1,0,2], 3)

[0, 0, 1, 2, 2, 2]
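
Since `bucket_sort` takes an extra `max_value` argument, it does not plug directly into the `test_sort` helper used for the other algorithms. As a rough alternative, the sketch below cross-checks it against Python's built-in `sorted` on random inputs. The function body is a copy of the implementation above so that the check is self-contained; the seed, the number of trials and the size limits are arbitrary choices.

```python
import random

def bucket_sort(array, max_value):
    # Copy of the implementation above, repeated so the check is self-contained
    counts = [0] * (max_value + 1)
    for value in array:
        counts[value] += 1
    i = 0
    for value, count in enumerate(counts):
        for _ in range(count):
            array[i] = value
            i += 1
    return array

rng = random.Random(0)  # Fixed seed (arbitrary) for reproducibility
for _ in range(100):
    max_value = rng.randint(0, 20)
    data = [rng.randint(0, max_value) for _ in range(rng.randint(0, 30))]
    assert bucket_sort(list(data), max_value) == sorted(data)
```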