# 3. Insertion Sort, Merge Sort

- Why sorting?
    - Typical applications of sorting (phone book)
    - Makes solving other problems easier
        - Median
        - Binary search --> paradigm of divide and conquer
            - look only at a particular half of a data set depending on what you're looking for
        - Compression
        - Rendering computer graphics
- **Insertion Sort**
    - Pseudocode:
        - For i = 1, 2, ..., n-1 --> start at index 1 because the single element at i = 0 is trivially sorted
            - insert A[i] into its correct position in the *sorted (by assumption)* subarray A[0 : i-1] (all elements to the left of A[i])
            - done by pairwise swaps that move the key leftward from position i until it reaches its correct position
    - Example:
        - [5 2 4 6 1 3]
        - start with the second element, 2, called the "key". 2 < 5, so we must perform a swap
        - [2 5 4 6 1 3] looking at 4, we must swap
        - [2 4 5 6 1 3] looking at 6, no swap required, new key = 1
        - [2 4 5 6 1 3] looking at 1, we must swap 1 repeatedly (4 swaps)
        - [1 2 4 5 6 3] looking at 3, we must swap 3 repeatedly (3 swaps)
        - [1 2 3 4 5 6] final sorted
    - Analysis:
        - O(n) steps, one for each key position
        - Each step takes O(n) compares and swaps in the worst case
        - **Complexity** *= O(n^2) where n is the size of the array*
            - O(n^2) compares and O(n^2) swaps
        - This assumes that a compare and a swap are equal in cost (true when dealing with numbers)
            - Note that if you're sorting objects (e.g. records) with a custom comparison function, this assumption fails. Compares may be more expensive than swaps
            - Let's consider the case where compares ARE more expensive than swaps
                - the number of compares can be reduced by using *binary search*, instead of pairwise compares, to find the insertion position in the sorted prefix
                - the compare part of insertion sort would then be O(nlog(n))
                - but the swap part would still be O(n^2), since elements must still be shifted to make room
                - n^2 grows faster than nlog(n), so the overall complexity is still O(n^2)
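
The binary search idea from the analysis above can be sketched as follows. This is a minimal illustration, not code from the lecture: the name `binary_insertion_sort` is hypothetical, and it relies on the standard library's `bisect.bisect_right` to find the insertion point. Each key needs only O(log n) compares, but the slice shift still moves O(n) elements, so the total stays O(n^2).

```python
from bisect import bisect_right

def binary_insertion_sort(items):
    """Sort a list in place, locating each insertion point by binary search."""
    for i in range(1, len(items)):
        key = items[i]
        # O(log i) compares: search only the sorted prefix items[0:i].
        # bisect_right places key after equal elements, keeping the sort stable.
        pos = bisect_right(items, key, 0, i)
        # Still O(i) element moves: shift items[pos:i] one slot to the right.
        items[pos + 1:i + 1] = items[pos:i]
        items[pos] = key
    return items

print(binary_insertion_sort([5, 2, 4, 6, 1, 3]))  # [1, 2, 3, 4, 5, 6]
```
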
- **Merge Sort** --> Recursive divide and conquer
    - Take an array A and split it into L and R (both are size n/2)
        - L --> L' (sorted by recursively applying merge sort) (size n/2)
        - R --> R' (sorted by recursively applying merge sort) (size n/2)
        - end up with a sorted array A' as a result of merging L' and R'
    - **Merge** is the key subroutine
    - Invariant for **merge** routine: two sorted arrays as input
        - Example:
            - L' = 20 13 7 2 (written largest to smallest; the merge reads from the right end)
            - R' = 12 11 9 1
            - look at 1 and 2, 1 is smaller so we put 1 down --> [1]
            - look at 9 and 2, 2 is smaller so we put 2 down --> [1 2]
            - look at 9 and 7, 7 is smaller so we put 7 down --> [1 2 7]
            - look at 9 and 13, 9 is smaller so we put 9 down --> [1 2 7 9]
            - look at 11 and 13, 11 is smaller --> [1 2 7 9 11]
            - look at 12 and 13, 12 is smaller --> [1 2 7 9 11 12]
            - R' is now empty, so no compare is needed; put 13 down --> [1 2 7 9 11 12 13]
            - last element is 20 --> [1 2 7 9 11 12 13 20]
        - All of the work is in the **merge** routine
    - Analysis:
        - Merge step has complexity that is O(n)
            - takes two arrays of size n/2 and must do the merge subroutine on them
        - **Complexity** *= O(nlog(n))* 
        - T(n) = c + 2 * T(n/2) + c * n
            - c = constant for dividing
            - 2 * T(n/2) for recursion
            - c * n for merge
        - Expansion of c * n merge subroutine to explain O(nlog(n)) complexity
            - ![image](screenshots/img_1.png)
            - 1 + log(n) levels and n leaves
                - each leaf is a single element and by definition is already sorted
                - each level takes c * n time
                - c * n is O(n), so the constant c disappears in the O(...) bound
                - The recurrence solves to T(n) = (1 + log(n)) * cn
                    - T(n) = (c * n) * (1 + log(n)) --> work * levels
                    - T(n) = O(nlog(n))
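
The recursion-tree argument above can also be written as an unrolled recurrence. This is a sketch under assumptions the notes do not state: n is a power of two, logarithms are base 2, T(1) = c, and the additive constant for dividing is folded into the cn term.

```latex
\begin{aligned}
T(n) &= 2\,T(n/2) + cn \\
     &= 4\,T(n/4) + 2cn \\
     &= 2^{k}\,T(n/2^{k}) + k\,cn \\
     &= n\,T(1) + cn\log_2 n \qquad (k = \log_2 n) \\
     &= cn\,(1 + \log_2 n) = O(n \log n)
\end{aligned}
```

Each line adds one more level of the tree, and every level contributes cn, which matches the "work * levels" product in the notes.
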
- Advantage of insertion sort over merge sort
    - Merge sort needs O(n) extra/auxiliary space
    - Insertion sort works in place and needs only O(1) extra/auxiliary space
    - In short, insertion sort is better on *memory* for large data sets
    - Can manipulate merge sort to be more space efficient
        - in-place merge sort exists --> a paper was written
- **Merge sort in Python**
    - Takes 2.2nlog(n) microseconds
- **Insertion sort in Python**
    - Takes 0.2n^2 microseconds
    - For comparison, C takes 0.01n^2 microseconds because it's compiled not interpreted
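
To get a feel for the two Python timing models quoted above, the sketch below finds where they cross. It assumes the logarithm is base 2 (the notes do not say) and treats the formulas as exact, which real measurements will not be; the function names are illustrative only.

```python
import math

def merge_sort_us(n):
    """Modelled Python merge sort time in microseconds (log base 2 assumed)."""
    return 2.2 * n * math.log2(n)

def insertion_sort_us(n):
    """Modelled Python insertion sort time in microseconds."""
    return 0.2 * n * n

def crossover(limit=10_000):
    """Smallest n >= 2 where the merge sort model is cheaper, or None."""
    for n in range(2, limit + 1):
        if merge_sort_us(n) < insertion_sort_us(n):
            return n
    return None

print(crossover())  # 67
```

Under these assumptions, insertion sort is modelled as faster only for inputs of a few dozen elements. Beyond that the n^2 term dominates quickly.
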
    

```python
# Insertion sort algorithm
# https://www.youtube.com/watch?v=qktBUYMO7o8

def insertion_sort(array):
    """Sort array in place in ascending order and return it."""
    # start from the second element; array[:1] is trivially sorted
    for sorted_length in range(1, len(array)):
        cur_item = array[sorted_length]  # item to insert into the sorted prefix
        insert_index = sorted_length

        # shift larger items one slot right until the slot for cur_item is found
        while insert_index > 0 and cur_item < array[insert_index - 1]:
            array[insert_index] = array[insert_index - 1]
            insert_index -= 1

        # place the item at the correct spot in the array
        array[insert_index] = cur_item

    # the same list object, now sorted
    return array

a = [18, 9, 5, 3, 2, 11, 1]
insertion_sort(a)
print(a)  # [1, 2, 3, 5, 9, 11, 18]
```
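
The worked example from the notes, [5 2 4 6 1 3], can be replayed with a variant that uses literal pairwise swaps and records how many each key needs. The helper name `insertion_sort_swap_counts` is made up for this illustration, and the function sorts a copy so the input is left untouched.

```python
def insertion_sort_swap_counts(items):
    """Sort a copy by pairwise swaps; return (sorted_list, swaps_per_key)."""
    a = list(items)
    swaps_per_key = []
    for i in range(1, len(a)):
        j = i
        swaps = 0
        # Swap the key leftward while its left neighbour is larger.
        while j > 0 and a[j - 1] > a[j]:
            a[j - 1], a[j] = a[j], a[j - 1]
            j -= 1
            swaps += 1
        swaps_per_key.append(swaps)
    return a, swaps_per_key

result, counts = insertion_sort_swap_counts([5, 2, 4, 6, 1, 3])
print(result)  # [1, 2, 3, 4, 5, 6]
print(counts)  # [1, 1, 0, 4, 3]
```

The counts line up with the trace in the notes: one swap each for keys 2 and 4, none for 6, four for 1, and three for 3.
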


```python
# Merge sort algorithm
# https://www.youtube.com/watch?v=3aTfQvs-_hA

def merge(a, b):
    """Merge two sorted lists into a new sorted list."""
    c = []
    a_index, b_index = 0, 0

    # compare the fronts of both lists while neither is exhausted
    while a_index < len(a) and b_index < len(b):
        # <= takes from a on ties, which keeps equal items in their original order (stable)
        if a[a_index] <= b[b_index]:
            c.append(a[a_index])
            a_index += 1
        else:
            c.append(b[b_index])
            b_index += 1

    # one list is exhausted; append the remainder of the other
    if a_index == len(a):
        c.extend(b[b_index:])
    else:
        c.extend(a[a_index:])

    return c

def merge_sort(array):
    """Return a sorted list; the input list is not modified."""
    # a list with zero or one element is already sorted by definition
    if len(array) <= 1:
        return array

    left = merge_sort(array[:len(array)//2])
    right = merge_sort(array[len(array)//2:])

    # after recursing, left and right are sorted
    # now merge the sorted sublists
    return merge(left, right)

a = [18, 9, 5, 3, 2, 11, 1]
print(merge_sort(a))  # [1, 2, 3, 5, 9, 11, 18]
```
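
As a cross-check on the merge walk-through in the notes, the sketch below merges the same L' and R' values (written here in ascending order) and compares the result with the standard library's `heapq.merge`. The function `two_finger_merge` is an illustrative stand-alone version, not the lecture's code.

```python
import heapq

def two_finger_merge(left, right):
    """Merge two ascending lists into one ascending list in O(n) time."""
    merged = []
    i = j = 0
    while i < len(left) and j < len(right):
        # Take the smaller front element; <= prefers the left list on ties.
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    # At most one of these slices is non-empty.
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged

L_sorted = [2, 7, 13, 20]
R_sorted = [1, 9, 11, 12]
merged = two_finger_merge(L_sorted, R_sorted)
print(merged)  # [1, 2, 7, 9, 11, 12, 13, 20]
print(merged == list(heapq.merge(L_sorted, R_sorted)))  # True
```
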
