# Chapter 13 Sorting

Naive sorting algorithms run in O(n^2) time. Heapsort and merge sort run in O(n log n) time, as does quicksort on average. Each has a trade-off:

* heapsort is in place but not stable
* merge sort is stable but not in place
* quicksort runs in O(n^2) time in the worst case

(Stable means that equal items appear in the output in their original order.)
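
To make the stability definition concrete, here is a small sketch using Python's built-in `sorted()`, which is documented as stable. The records and their field layout are made up for illustration.

```python
# Hypothetical records: (name, grade). The names are illustrative only.
records = [("dana", 2), ("alex", 1), ("cody", 2), ("blake", 1)]

# Sort by grade only. Because sorted() is stable, records with equal grades
# keep their original relative order: alex before blake, dana before cody.
by_grade = sorted(records, key=lambda r: r[1])
print(by_grade)
```

An unstable sort would be free to emit cody before dana, since the key alone does not distinguish them.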


**Best Choice for sorting**
* A well-implemented quicksort is usually the best choice for sorting
* For short arrays (10 or fewer elements) insertion sort is easier to code and faster
* For a small number of distinct elements, e.g. integers in [0..255], counting sort is a good choice
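
The counting sort idea from the last bullet can be sketched as follows. This is a minimal version that assumes every input is an integer in 0..255; `counting_sort_bytes` is a hypothetical helper name, not a library function.

```python
def counting_sort_bytes(values):
    """Sort integers assumed to lie in 0..255 (hypothetical helper)."""
    # One counter per possible value.
    counts = [0] * 256
    for v in values:
        counts[v] += 1

    # Emit each value as many times as it was seen, in increasing order.
    result = []
    for value, count in enumerate(counts):
        result.extend([value] * count)
    return result

print(counting_sort_bytes([3, 255, 0, 3, 7]))
```

The running time is O(n + k) for n inputs and k possible values, which is why it only pays off when k is small.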

**Sorting boot camp**
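* Implementing \__lt\__ on a class defines the comparison that library sorts use for its instances
* the time complexity of any reasonable library sort is O(n log n); many are based on quicksort, while Python's built-in sort is Timsort, a stable hybrid of merge sort and insertion sort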
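
As a rough illustration of the first bullet, the sketch below defines `__lt__` on a made-up `Student` class (the class and its fields are assumptions for this example) and also shows the alternative of passing a `key` function.

```python
class Student:
    """Hypothetical class used only to illustrate __lt__."""

    def __init__(self, name, gpa):
        self.name = name
        self.gpa = gpa

    def __lt__(self, other):
        # Default ordering: compare students by name.
        return self.name < other.name


students = [Student("cody", 3.1), Student("alex", 3.7), Student("blake", 2.9)]

# With no key given, sorted() compares elements using __lt__.
by_name = [s.name for s in sorted(students)]

# A key function overrides the default ordering without changing the class.
by_gpa = [s.name for s in sorted(students, key=lambda s: s.gpa, reverse=True)]

print(by_name, by_gpa)
```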

**Top Tips for Sorting**
Sorting problems come in 2 types:

1) use sort to make subsequent steps simpler (use a library)

2) design a custom sorting routine (use a data structure like a BST, heap, or array indexed by values)

* sort() is a list method that sorts the list in place
    * it is stable and returns None
* sorted() sorts any iterable
    * it is also stable; it returns a new list and leaves the original unchanged
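
A short sketch of the difference between the two calls; the variable names are only for illustration.

```python
original = [3, 1, 2]

# sorted() builds a new list and leaves its argument untouched.
new_list = sorted(original)

# list.sort() reorders the list itself and returns None.
in_place = [3, 1, 2]
return_value = in_place.sort()

# sorted() accepts any iterable, not just lists.
letters = sorted("bca")

print(original, new_list, in_place, return_value, letters)
```

A common mistake is writing `x = x.sort()`, which replaces the list with None.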

## 13.1 Compute the Intersection of 2 Sorted Arrays
Write a program which takes as input 2 sorted arrays, and returns a new array containing elements that are present in both of the input arrays. The input arrays may have duplicate entries, but the returned array should be free of duplicates. 

For example, if the input is [2,3,3,5,5,6,7,7,8,12] and [5,5,6,8,8,9,10,10], the output should be [5,6,8].

# Start at the beginning of each list and advance the index of whichever list has the smaller value
# If the values are equal, add that value to the new list and advance both indices
# Compare against the most recently added value so that it is not added again

def intersect_sorted(a,b):
    final_list = []
    ai = 0
    bi = 0
    # Run until either list is exhausted, so the last elements are compared too
    while ai < len(a) and bi < len(b):
        if a[ai] == b[bi]:
            if len(final_list) == 0 or a[ai] != final_list[-1]:
                final_list.append(a[ai])
            ai += 1
            bi += 1
        elif a[ai] < b[bi]:
            ai += 1
        else:
            bi += 1
    return final_list

assert(intersect_sorted([2,3,3,5,5,6,7,7,8,12], [5,5,6,8,8,9,10,10]) == [5,6,8])

## Merge Two Sorted Arrays
Write a program which takes as input two sorted arrays of integers, and updates the first to the combined entries of the 2 arrays in sorted order. Assume the first array has enough empty entries at its end to hold the result.

# Plan: find the last non-None entry of each array, then fill the first array from the back,
# always writing the larger of the two remaining tail values

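def merge_sorted_arrays(a, b):
    # Find the index of the last real (non-None) entry in each array
    ai = len(a) - 1
    while ai >= 0 and a[ai] is None:
        ai -= 1
    bi = len(b) - 1
    while bi >= 0 and b[bi] is None:
        bi -= 1

    # i is the last slot the merged result will occupy in a
    i = ai + bi + 1
    while ai >= 0 and bi >= 0:
        if a[ai] >= b[bi]:
            a[i] = a[ai]
            ai -= 1
        else:
            a[i] = b[bi]
            bi -= 1
        i -= 1
    # Copy whatever is left of b (including a single leftover element);
    # anything left of a is already in place
    if bi >= 0:
        a[:bi + 1] = b[:bi + 1]
    return a
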
        
l = [1,2,4,6,8,None,None,None,None,None]
m = [3,4,5,7]
assert(merge_sorted_arrays(l,m) == [1, 2, 3, 4, 4, 5, 6, 7, 8, None])
assert(merge_sorted_arrays([2,3,5,7,9,None,None,None,None],[0,1,3,6]) == [0, 1, 2, 3, 3, 5, 6, 7, 9])

## 13.5 Render a Calendar

Write a program that takes a set of events and determines the maximum number of events that take place concurrently.

Each event has a start time and an end time.

# setup start and end points in tuples with the time of start and end
# sort by the time
# as you move through add 1 for start subtract 1 for end
# maintain the max value seen so far at each point

import collections
# Events are a tuple (start_time, end_time)
Event = collections.namedtuple('Event', ('start', 'finish'))

# An endpoint is a tuple (time, is_end): is_end is 0 for a start time and 1 for a finish time,
# so after sorting a start comes before a finish at the same time and events that
# touch at an endpoint count as overlapping
Endpoint = collections.namedtuple('Endpoint', ('time', 'is_end'))

def max_calendar_events(A):
    """
    input - A: list of Event tuples
    return - max number of overlapping events (integer)
    """
    all_endpoints = []
    for event in A:
        all_endpoints.append(Endpoint(event.start, 0))
        all_endpoints.append(Endpoint(event.finish, 1))
    all_endpoints.sort()

    # max_active avoids shadowing the built-in max()
    max_active = 0
    active = 0
    for point in all_endpoints:
        if point.is_end:
            active -= 1
        else:
            active += 1
            max_active = max(max_active, active)

    return max_active
    
assert(max_calendar_events([Event(0,1),Event(0,2),Event(5,7)]) == 2)
assert(max_calendar_events([Event(1,5), Event(14,15), Event(2,7), Event(4,5), 
                     Event(6,10), Event(8,9), Event(9,17), Event(11,13), Event(12,15)]) == 3)
assert(max_calendar_events([Event(1,5), Event(14,15), Event(2,7), Event(4,5), Event(15,16), 
                     Event(6,10), Event(8,9), Event(9,17), Event(11,13), Event(12,15)]) == 4)