# Advanced Sorting
### Merge Sort
Merge sort is **`O(NlogN)`** in all cases: the list is halved about `logN` times until every sublist holds a single element, and at each of those `logN` levels the merging step goes through all `N` elements.<br>
<img src="Graphics/merge.png" width="35%" align="left">
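
As a rough illustration of where the `logN` factor comes from, the sketch below counts how many times a list of `N` items has to be halved before every piece has one element. The helper name `count_split_levels` is an assumption made for this example and is not part of the notebook.

```python
def count_split_levels(n):
    """Return how many halving steps are needed until a list of n items is down to single elements."""
    levels = 0
    while n > 1:
        n = (n + 1) // 2   # follow the larger half, which is the deepest branch
        levels += 1
    return levels

for n in (7, 8, 1000):
    levels = count_split_levels(n)
    # Merging touches all n elements once per level, so the total work is about n * levels
    print("N=%d: %d levels, about %d element visits" % (n, levels, n * levels))
```

For the 7-element list used in the demo further down, this gives 3 levels, which matches the three nested "Splitting" lines in the trace.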

In [27]:
def merge_sort(alist):
    size = len(alist)
    if size<=1: return alist  # an empty or single-element list is already sorted
    split = size//2           # floor division keeps the split index an integer
    left=alist[:split]
    right=alist[split:]
    print "Splitting", alist, "to", left, right
    
    left=merge_sort(left)     # Each call stack frame keeps its own left and right lists (also see Recursion notebook)
    right=merge_sort(right)   

    return merge(left, right) # It then merges them to build the result which is passed to the next call stack frame

In [207]:
def merge(alist, blist):
    merged=[]
    i=j=0
    print "Merging:" , alist, blist
    while i<len(alist) and j<len(blist):
        if alist[i]<=blist[j]:   # on ties take from alist first, which keeps the sort stable
            merged.append(alist[i])
            i+=1
        else:
            merged.append(blist[j])
            j+=1
    
    while i<len(alist):          # copy whatever is left of alist
        merged.append(alist[i])
        i+=1

    while j<len(blist):          # copy whatever is left of blist
        merged.append(blist[j])
        j+=1
            
    print "Merged:", merged
    return merged

#### Notes: 
- Each call stack frame keeps its own left and right lists (also see Recursion notebook).
- It then merges them to build the result which is passed to the next call stack frame. 
- Also, note how `merge()` happens after the recursive calls, which makes `merge_sort()` a head recursion.
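
For checking the algorithm rather than tracing it, a print-free variant is handy. The following is a minimal sketch under stated assumptions: the names `merge_quiet` and `merge_sort_quiet` are made up for this example, and the code uses `//` and no `print` statements so that it is not tied to the Python 2 syntax of the cells above. It compares the result against Python's built-in `sorted()` on random inputs, including empty lists and lists with duplicates.

```python
import random

def merge_quiet(alist, blist):
    """Merge two already sorted lists into one sorted list."""
    merged = []
    i = j = 0
    while i < len(alist) and j < len(blist):
        if alist[i] <= blist[j]:
            merged.append(alist[i])
            i += 1
        else:
            merged.append(blist[j])
            j += 1
    merged.extend(alist[i:])   # at most one of these two slices is non-empty
    merged.extend(blist[j:])
    return merged

def merge_sort_quiet(alist):
    """Recursive merge sort: split first, merge after both recursive calls return."""
    if len(alist) <= 1:
        return alist
    split = len(alist) // 2
    left = merge_sort_quiet(alist[:split])
    right = merge_sort_quiet(alist[split:])
    return merge_quiet(left, right)

random.seed(0)
for _ in range(100):
    data = [random.randint(0, 50) for _ in range(random.randint(0, 20))]
    assert merge_sort_quiet(data) == sorted(data)
```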

In [209]:
# demo the merging function
alist=[2,4,5,7]
blist=[1,3,8,9, 12]
merge(alist, blist)

Merging: [2, 4, 5, 7] [1, 3, 8, 9, 12]
Merged: [1, 2, 3, 4, 5, 7, 8, 9, 12]


[1, 2, 3, 4, 5, 7, 8, 9, 12]

In [211]:
# demo merge sort
alist=[12,35,87,26,9,28,7]
merge_sort(alist)

Splitting [12, 35, 87, 26, 9, 28, 7] to [12, 35, 87] [26, 9, 28, 7]
Splitting [12, 35, 87] to [12] [35, 87]
Splitting [35, 87] to [35] [87]
Merging: [35] [87]
Merged: [35, 87]
Merging: [12] [35, 87]
Merged: [12, 35, 87]
Splitting [26, 9, 28, 7] to [26, 9] [28, 7]
Splitting [26, 9] to [26] [9]
Merging: [26] [9]
Merged: [9, 26]
Splitting [28, 7] to [28] [7]
Merging: [28] [7]
Merged: [7, 28]
Merging: [9, 26] [7, 28]
Merged: [7, 9, 26, 28]
Merging: [12, 35, 87] [7, 9, 26, 28]
Merged: [7, 9, 12, 26, 28, 35, 87]


[7, 9, 12, 26, 28, 35, 87]

### Quick Sort
Quick sort is **`O(N^2)`** in the worst case (for example, with the last element as the pivot, when the list is already sorted or in reverse order), but it is **`O(NlogN)`** on average (an improvement over selection and insertion sort's `O(N^2)` average).
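
The gap between the worst and the average case can be made visible by measuring how deep the recursion goes. The sketch below is an illustration only: `quick_sort_depth` is a hypothetical helper that follows the same last-element pivot rule as the implementation in this section but returns the deepest recursion level instead of the sorted list. The list size of 200 is an arbitrary choice that stays well below Python's default recursion limit.

```python
import random

def quick_sort_depth(alist, depth=1):
    """Return the deepest recursion level reached when the last element is the pivot."""
    if len(alist) <= 1:
        return depth
    pivot = alist[-1]
    less = [x for x in alist if x < pivot]
    greater = [x for x in alist if x > pivot]
    return max(quick_sort_depth(less, depth + 1),
               quick_sort_depth(greater, depth + 1))

n = 200
ordered = list(range(n))
reverse_ordered = ordered[::-1]
shuffled = ordered[:]
random.seed(1)
random.shuffle(shuffled)

# Ordered input peels off one element per level, so the depth grows like N.
print("ordered input depth:  %d" % quick_sort_depth(ordered))
print("reversed input depth: %d" % quick_sort_depth(reverse_ordered))
# Shuffled input tends to split more evenly, so the depth stays much smaller.
print("shuffled input depth: %d" % quick_sort_depth(shuffled))
```

With one level per element and `N` elements scanned per level, the ordered inputs lead to the `O(N^2)` behaviour, while a shallow recursion gives the `O(NlogN)` average.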

The sorting principle of quick sort is as follows: Pick a partitioning (aka pivot) element. For example, use the last element of the list, `list[-1]`, as the pivot. Then run through the list and every time you find an element smaller than the pivot, place it in the sublist that holds the smaller items. Every time you find an element larger than the pivot, place it in the sublist with the larger items. Now these two sublists can be sorted independently. Sort them recursively. The main difference between merge sort and quick sort is that merge sort splits independently of the input data: it just halves the list. Quick sort partitions the list based on a selected element.

There are two ways to implement this algorithm. The first, more intuitive version creates separate sublists and then concatenates them. This version is more readable, but it is less space efficient because it allocates additional lists. Note that this implementation is consistent with recursive return pattern 1 (see the Recursion notebook). The recursive calls are not in tail position, though, because their results are still concatenated after they return.

In [1]:
def quick_sort(alist):
    less = []
    equal = []
    greater = []
    size = len(alist)
    
    if size <= 1: return alist
    pivot = alist[-1]
    for x in alist:
        if x < pivot:
            less.append(x)
        elif x == pivot:
            equal.append(x)
        else:
            greater.append(x)
    return quick_sort(less)+equal+quick_sort(greater)  

In [2]:
alist = [2, 4, 5, 7, 1, 3, 8, 9, 12, 6]
alist=quick_sort(alist)
print alist

[1, 2, 3, 4, 5, 6, 7, 8, 9, 12]


#### In-place Quick Sort
A more efficient implementation creates the two sublists in place, so the only extra space is the recursion stack (space complexity `O(logN)` on average). It uses a left index, which starts at `list[0]`, and a right index, which starts at the element just left of the pivot, `list[-2]`. The left index moves to the right and stops at the first element larger than the pivot. The right index moves to the left and stops at the first element smaller than the pivot. The two elements are swapped and the process goes on until the two indices meet. The point at which they meet marks the boundary between the two groups, and the pivot is swapped into that boundary: into the meeting point itself if the element there is larger than the pivot, otherwise into the position just right of it. Now the list consists of two sublists: the one on the left of the pivot contains the elements smaller than the pivot and the one on the right contains the elements larger than the pivot. This partitioning is done here by the helper function `partition()`.
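
To make the two-index scan concrete, here is a hedged, self-contained sketch that performs one partitioning pass and records a snapshot of the list after every swap. The name `partition_trace` and the snapshot list are assumptions introduced for this example. The function expects `start < end`.

```python
def partition_trace(alist, start, end):
    """Partition alist[start..end] around alist[end]; return (pivot_index, snapshots)."""
    pivot_value = alist[end]
    left, right = start, end - 1
    snapshots = []
    while left < right:
        if alist[left] <= pivot_value:
            left += 1                      # already on the correct side
        elif alist[right] >= pivot_value:
            right -= 1                     # already on the correct side
        else:
            # both indices have stopped on misplaced elements: swap them
            alist[left], alist[right] = alist[right], alist[left]
            snapshots.append(list(alist))
    # The indices have met, but the element at the meeting point is still unclassified.
    if alist[left] <= pivot_value:
        left += 1
    alist[left], alist[end] = alist[end], alist[left]   # move the pivot into place
    snapshots.append(list(alist))
    return left, snapshots

data = [2, 4, 5, 7, 1, 3, 8, 9, 12, 6]
pivot_index, snapshots = partition_trace(data, 0, len(data) - 1)
for snapshot in snapshots:
    print(snapshot)
# First snapshot:  [2, 4, 5, 3, 1, 7, 8, 9, 12, 6]   (7 and 3 swapped)
# Second snapshot: [2, 4, 5, 3, 1, 6, 8, 9, 12, 7]   (pivot 6 moved to index 5)
```

After this pass everything left of index 5 is smaller than the pivot 6 and everything right of it is larger, so the two sides can be partitioned again independently.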


In [23]:
def quick_sort_in_place(alist, start=0, end=None):
    if end is None: end = len(alist)-1  # a default value cannot refer to alist, so resolve it at call time
    if start<0 or start>=end or end<1: return
    alist, pivot_index = partition(alist, start, end)
    quick_sort_in_place(alist, start, pivot_index-1)
    quick_sort_in_place(alist, pivot_index+1, end)

In [24]:
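def partition(alist, start, end):
    pivot = end                # the last element of the range is the pivot
    left_index = start
    right_index = end-1
    while left_index<right_index:
        if alist[left_index]<=alist[pivot]:
            left_index+=1
            continue
        if alist[right_index]>=alist[pivot]:
            right_index-=1
            continue
        alist[left_index], alist[right_index] = alist[right_index], alist[left_index]
    # The indices have met, but the element at the meeting point is not classified yet.
    # If it belongs with the smaller items, the pivot goes one position to its right.
    if alist[left_index] <= alist[pivot]: left_index+=1
    alist[left_index], alist[end] = alist[end], alist[left_index]
    return alist, left_index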

#### Note:
In-place Quick Sort does not return a value. The part that changes the list is extracted in this implementation into the helper function `partition()`, which swaps elements of the list in place and reports the pivot's final index to its caller, `quick_sort_in_place()`. Because a Python list is mutable and every call works on the same list object, those swaps are visible to the caller, so `quick_sort_in_place()` only has to manipulate the indices. Also see the Recursion notebook.
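
The reason no return value is needed can be shown with a small sketch. `swap_first_and_last` is a hypothetical helper written for this example. The built-in `list.sort()` and `sorted()` show the same contrast between changing a list in place and returning a new one.

```python
def swap_first_and_last(items):
    """Swap two elements in place; deliberately returns nothing."""
    items[0], items[-1] = items[-1], items[0]

numbers = [6, 4, 5, 1]
result = swap_first_and_last(numbers)
# The caller's list has changed even though the function returned None.
assert numbers == [1, 4, 5, 6]
assert result is None

values = [3, 1, 2]
copy_sorted = sorted(values)      # returns a new list, values is untouched
assert values == [3, 1, 2] and copy_sorted == [1, 2, 3]
assert values.sort() is None      # sorts in place and returns None
assert values == [1, 2, 3]
```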

In [25]:
alist=[2, 4, 5, 7, 1, 3, 8, 9, 12, 6]
quick_sort_in_place(alist)

In [26]:
print alist

[1, 2, 3, 4, 5, 6, 7, 8, 9, 12]
