# Elementary Sorting

The course uses Java.  These notes reimplement the algorithms in Python.

## Introduction: Rules of the Game

A sorting function needs some way to compare two objects in order to decide how to order them.  The mechanism for doing this is called a callback: the sort calls back to a comparison method that the object's definition must provide.

For basic integer/float data types, the comparison mechanism is built in:

In [2]:
print(5  >  5.5)
print(-2 <  5)
print(4  == 4.0)
print(4  == 5)

False
True
True
False


For custom objects, a comparison method must be defined explicitly.  Without one, the comparison fails:

In [3]:
class Custom_Integer:
    
    def __init__(self, x):
        self.x = x

X = Custom_Integer(5)
Y = Custom_Integer(4)

print(X>Y)

TypeError: '>' not supported between instances of 'Custom_Integer' and 'Custom_Integer'

In [8]:
class Revised_Custom_Integer:
    
    def __init__(self, x):
        self.x = x
        
    def __gt__(self, y):
        # The comparison already yields a bool, so return it directly
        return self.x > y.x
    
X = Revised_Custom_Integer(4)
Y = Revised_Custom_Integer(5)

print(X > Y)
print(Y > X)

False
True


The comparison methods, and the operators they implement, are:

    __gt__  (>)
    __ge__  (>=)
    __eq__  (==)
    __le__  (<=)
    __lt__  (<)
    __ne__  (!=)
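
As a minimal sketch of how these methods feed into sorting: Python's built-in `sorted()` and `list.sort()` rely only on `<` comparisons, so defining `__lt__` is enough for them to order custom objects. The class name `Ordered_Integer` below is hypothetical and used only for illustration. The `key=` argument is probably the closest Python analogue to the callback idea, since the sort calls the supplied function on each item.

```python
class Ordered_Integer:
    """Hypothetical wrapper class, for illustration only."""

    def __init__(self, x):
        self.x = x

    def __lt__(self, other):
        return self.x < other.x


items = [Ordered_Integer(v) for v in (5, 3, 9, 1)]

# sorted() only needs '<', which __lt__ provides
by_method = sorted(items)

# Alternative: leave the class alone and pass a key callback instead
by_key = sorted(items, key=lambda item: item.x)

print([item.x for item in by_method])
print([item.x for item in by_key])
```

Both calls should print the values in ascending order. The `key=` form is often the simpler choice when the ordering is needed in only one place.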
    
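With this definition, an object cannot be compared against a different data type, because `__gt__` assumes the other operand has an `x` attribute: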

In [10]:
print( X > 1)

AttributeError: 'int' object has no attribute 'x'
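
One common way to handle this case, sketched here under the assumption that mixed-type comparisons should simply be rejected, is to return `NotImplemented` when the other operand is not of the expected type. Python then tries the reflected operation on the other operand and raises a `TypeError` if that also fails. The class name `Guarded_Custom_Integer` is hypothetical.

```python
class Guarded_Custom_Integer:
    """Hypothetical variant of Revised_Custom_Integer, for illustration only."""

    def __init__(self, x):
        self.x = x

    def __gt__(self, other):
        # Decline comparisons with unrelated types instead of failing on other.x
        if not isinstance(other, Guarded_Custom_Integer):
            return NotImplemented
        return self.x > other.x


X = Guarded_Custom_Integer(4)
Y = Guarded_Custom_Integer(5)
print(Y > X)

try:
    X > 1
except TypeError as err:
    message = str(err)
    print("TypeError:", message)
```

The error is now a `TypeError` describing the unsupported operand types, rather than an `AttributeError` leaking an implementation detail.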

Comparison methods must define a total order; that is, they must satisfy the following properties:

    Antisymmetry: if v ≤ w and w ≤ v, then v = w.
    Transitivity: if v ≤ w and w ≤ x, then v ≤ x.
    Totality: either v ≤ w or w ≤ v or both.
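
The three properties can be checked by brute force on a small sample. The sketch below is only illustrative: `is_total_order` and `rps_le` are hypothetical helper names, and passing the check on a sample does not prove the properties hold in general. Rock-paper-scissors is used as a well-known relation that is not transitive.

```python
from itertools import product


def is_total_order(items, le):
    """Brute-force check of the three properties on a finite sample.

    `le(v, w)` is a caller-supplied function meaning "v <= w".
    """
    for v, w in product(items, repeat=2):
        # Totality: at least one direction must hold
        if not (le(v, w) or le(w, v)):
            return False
        # Antisymmetry: both directions holding implies equality
        if le(v, w) and le(w, v) and v != w:
            return False
    for v, w, x in product(items, repeat=3):
        # Transitivity
        if le(v, w) and le(w, x) and not le(v, x):
            return False
    return True


# Ordinary numeric comparison passes all three checks
numbers_ok = is_total_order([3, 1, 2, 2.5], lambda v, w: v <= w)
print(numbers_ok)

# Rock-paper-scissors: each pair is (winner, loser)
beats = {("rock", "scissors"), ("scissors", "paper"), ("paper", "rock")}


def rps_le(v, w):
    # "v <= w" here means v equals w or w beats v
    return v == w or (w, v) in beats


rps_ok = is_total_order(["rock", "paper", "scissors"], rps_le)
print(rps_ok)
```

A sort given a comparison like `rps_le` has no well-defined result, which is why the properties matter.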

## Selection Sort

In iteration i, find the smallest remaining entry (at index i or to its right) and swap it with the entry at index i, then advance i by one.

Time complexity: about N^2 / 2 compares and N swaps.  The number of compares is quadratic even if the original array is already sorted, because every pass scans the entire unsorted remainder.

In [28]:
def selection_sort(L):
    print(L)
    n = len(L)
    for i in range(n - 1):
        # Find the index of the smallest entry in L[i:]
        min_index = i
        for j in range(i + 1, n):
            if L[j] < L[min_index]:
                min_index = j
        # Move it into position i
        L[i], L[min_index] = L[min_index], L[i]
        print(L)
    return L

test = [5,7,4,5,8,5,3,9]
selection_sort(test)

[5, 7, 4, 5, 8, 5, 3, 9]
[3, 7, 4, 5, 8, 5, 5, 9]
[3, 4, 7, 5, 8, 5, 5, 9]
[3, 4, 5, 7, 8, 5, 5, 9]
[3, 4, 5, 5, 8, 7, 5, 9]
[3, 4, 5, 5, 5, 7, 8, 9]
[3, 4, 5, 5, 5, 7, 8, 9]
[3, 4, 5, 5, 5, 7, 8, 9]


[3, 4, 5, 5, 5, 7, 8, 9]
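
To make the compare and swap counts quoted above concrete, here is a sketch of an instrumented variant. The function name `selection_sort_counts` is hypothetical; it sorts a copy and returns the counts rather than printing each pass. Because the outer loop stops one position short of the end, this version performs N - 1 swaps rather than N.

```python
def selection_sort_counts(L):
    """Selection sort on a copy of L; returns (sorted_list, compares, swaps)."""
    L = list(L)
    n = len(L)
    compares = swaps = 0
    for i in range(n - 1):
        min_index = i
        for j in range(i + 1, n):
            compares += 1
            if L[j] < L[min_index]:
                min_index = j
        L[i], L[min_index] = L[min_index], L[i]
        swaps += 1
    return L, compares, swaps


n = 8
inputs = {
    "sorted": list(range(n)),
    "reversed": list(range(n, 0, -1)),
    "example": [5, 7, 4, 5, 8, 5, 3, 9],
}
results = {}
for name, data in inputs.items():
    results[name] = selection_sort_counts(data)
    print(name, results[name])
```

For every input of length N the compare count is N(N - 1) / 2, which is 28 for N = 8, so the running time does not depend on the initial order.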

## Insertion Sort

In iteration i, compare the entry at index i with the entry to its left and swap them if the left entry is greater.  Repeat, moving left, until reaching index 0 or a left entry that is not greater.

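For a randomly ordered array, insertion sort uses about N^2 / 4 compares and about N^2 / 4 exchanges on average.  The best case is linear: an already sorted array needs N - 1 compares and no exchanges.  The worst case, an array in descending order, needs about N^2 / 2 compares and about N^2 / 2 exchanges.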

In [29]:
def insertion_sort(L):
    print(L)
    for i in range(1, len(L)):
        # Move L[i] left until the entry to its left is not greater
        j = i
        while j > 0 and L[j] < L[j-1]:
            L[j], L[j-1] = L[j-1], L[j]
            j -= 1
        print(L)
    return L

test = [5,7,4,5,8,5,3,9]
insertion_sort(test)

[5, 7, 4, 5, 8, 5, 3, 9]
[5, 7, 4, 5, 8, 5, 3, 9]
[4, 5, 7, 5, 8, 5, 3, 9]
[4, 5, 5, 7, 8, 5, 3, 9]
[4, 5, 5, 7, 8, 5, 3, 9]
[4, 5, 5, 5, 7, 8, 3, 9]
[3, 4, 5, 5, 5, 7, 8, 9]
[3, 4, 5, 5, 5, 7, 8, 9]


[3, 4, 5, 5, 5, 7, 8, 9]

### Partially Sorted Arrays

An array is partially sorted if the number of inversions is less than some constant times N, where an inversion is a pair of entries that are out of order.  Examples include a large array with a small number of inversions, or a sorted array with a small number of unsorted elements appended.  Insertion sort runs in linear time on partially sorted arrays, because the number of exchanges equals the number of inversions.
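
The link between inversions and insertion sort can be checked directly. The sketch below uses hypothetical helper names (`count_inversions`, `insertion_sort_counts`) and a brute-force inversion count that is itself quadratic, so it is meant only for small experiments.

```python
def count_inversions(L):
    """Number of pairs (i, j) with i < j and L[i] > L[j]; brute force."""
    n = len(L)
    return sum(1 for i in range(n) for j in range(i + 1, n) if L[i] > L[j])


def insertion_sort_counts(L):
    """Insertion sort on a copy of L; returns (sorted_list, compares, exchanges)."""
    L = list(L)
    compares = exchanges = 0
    for i in range(1, len(L)):
        j = i
        while j > 0:
            compares += 1
            if not L[j] < L[j - 1]:
                break
            L[j], L[j - 1] = L[j - 1], L[j]
            exchanges += 1
            j -= 1
    return L, compares, exchanges


# A sorted array with a few unsorted elements appended
data = list(range(0, 40, 2)) + [7, 3, 21]
n = len(data)
inversions = count_inversions(data)
result, compares, exchanges = insertion_sort_counts(data)

print("inversions:", inversions)
print("exchanges: ", exchanges)
print("compares:  ", compares)
```

Each exchange removes exactly one inversion, so the two counts match, and the number of compares exceeds the number of exchanges by at most N - 1.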

## Shell Sort

Shell sort moves entries more than one position at a time by h-sorting the array.  An h-sorted array consists of h interleaved sorted subsequences, each made up of every h-th entry.  Repeating the h-sort with a decreasing sequence of h values that ends in 1 produces a completely sorted array.  With large h values, each subsequence is short, so there is little to sort.  With small h values, the array is already partially sorted, so insertion sort is fast.  The best sequence of h values is difficult to determine, but a common one that works fairly well is the 3x + 1 sequence: 1, 4, 13, 40, ...

With the 3x + 1 increments, the worst-case number of compares is proportional to N^(3/2).  The average case is lower, but no definitive answer is known yet.

Benefits: very fast unless the array is huge, a small and fixed code footprint, and use as a hardware sort prototype.

In [43]:
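def shell_sort(L):
    N = len(L)
    # Largest starting increment from the 3x + 1 sequence (1, 4, 13, 40, ...)
    h = 1
    while h < N / 3:
        h = 3 * h + 1
    while h >= 1:
        print('h = ', h)
        # h-sort the array: insertion sort with stride h
        for i in range(h, N):
            j = i
            while j >= h and L[j] < L[j-h]:
                L[j], L[j-h] = L[j-h], L[j]
                j -= h
            print(L)
        # Move to the next smaller increment
        h //= 3
    return L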

test = [6,4,5,8,7,3,19,2,33,5,46,99,32,1,25,76,43,4,56,78,98]
shell_sort(test)

h =  13
[1, 4, 5, 8, 7, 3, 19, 2, 33, 5, 46, 99, 32, 6, 25, 76, 43, 4, 56, 78, 98]
[1, 4, 5, 8, 7, 3, 19, 2, 33, 5, 46, 99, 32, 6, 25, 76, 43, 4, 56, 78, 98]
[1, 4, 5, 8, 7, 3, 19, 2, 33, 5, 46, 99, 32, 6, 25, 76, 43, 4, 56, 78, 98]
[1, 4, 5, 8, 7, 3, 19, 2, 33, 5, 46, 99, 32, 6, 25, 76, 43, 4, 56, 78, 98]
[1, 4, 5, 8, 4, 3, 19, 2, 33, 5, 46, 99, 32, 6, 25, 76, 43, 7, 56, 78, 98]
[1, 4, 5, 8, 4, 3, 19, 2, 33, 5, 46, 99, 32, 6, 25, 76, 43, 7, 56, 78, 98]
[1, 4, 5, 8, 4, 3, 19, 2, 33, 5, 46, 99, 32, 6, 25, 76, 43, 7, 56, 78, 98]
[1, 4, 5, 8, 4, 3, 19, 2, 33, 5, 46, 99, 32, 6, 25, 76, 43, 7, 56, 78, 98]
h =  4
[1, 4, 5, 8, 4, 3, 19, 2, 33, 5, 46, 99, 32, 6, 25, 76, 43, 7, 56, 78, 98]
[1, 3, 5, 8, 4, 4, 19, 2, 33, 5, 46, 99, 32, 6, 25, 76, 43, 7, 56, 78, 98]
[1, 3, 5, 8, 4, 4, 19, 2, 33, 5, 46, 99, 32, 6, 25, 76, 43, 7, 56, 78, 98]
[1, 3, 5, 2, 4, 4, 19, 8, 33, 5, 46, 99, 32, 6, 25, 76, 43, 7, 56, 78, 98]
[1, 3, 5, 2, 4, 4, 19, 8, 33, 5, 46, 99, 32, 6, 25, 76, 43, 7, 56, 78, 98]
[1, 3, 5, ... (remaining trace output truncated)

[1, 2, 3, 4, 4, 5, 5, 6, 7, 8, 19, 25, 32, 33, 43, 46, 56, 76, 78, 98, 99]
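
To make the notion of an h-sorted array concrete, the sketch below separates the pieces of `shell_sort`: generating the increments, doing a single h-sort pass, and checking the result. The helper names `increments`, `h_sort`, and `is_h_sorted` are hypothetical and not part of the course code.

```python
def increments(n):
    """Increments from the 3x + 1 sequence, largest first.

    Mirrors the loop in shell_sort: grow h while h < n / 3.
    """
    h = 1
    sequence = [h]
    while h < n / 3:
        h = 3 * h + 1
        sequence.append(h)
    return sequence[::-1]


def h_sort(L, h):
    """Insertion sort with stride h, applied in place."""
    for i in range(h, len(L)):
        j = i
        while j >= h and L[j] < L[j - h]:
            L[j], L[j - h] = L[j - h], L[j]
            j -= h


def is_h_sorted(L, h):
    """True if no entry is smaller than the entry h positions before it."""
    return all(L[i - h] <= L[i] for i in range(h, len(L)))


data = [6, 4, 5, 8, 7, 3, 19, 2, 33, 5, 46, 99, 32, 1, 25, 76, 43, 4, 56, 78, 98]
steps = increments(len(data))
print(steps)

for h in steps:
    h_sort(data, h)
    print(h, is_h_sorted(data, h))
```

For this 21-element list the increments are 13, 4, and 1, matching the trace above. An array that is 1-sorted is simply sorted, which is why the sequence must end at 1.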

## Shuffling

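In iteration i, pick a random integer r between 0 and i (inclusive) and swap L[r] and L[i].  Provided r is chosen uniformly at random, every permutation of the array is equally likely.

Time complexity is linear: each entry is visited once and takes part in one swap.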

In [47]:
import random
def shuffle(L):
    for i in range(1,len(L)):
        r = random.randint(0,i)
        L[i], L[r] = L[r], L[i]
        print(L)
    return L

test = [1,2,3,4,5,6,7,8,9,10]
shuffle(test)

[2, 1, 3, 4, 5, 6, 7, 8, 9, 10]
[3, 1, 2, 4, 5, 6, 7, 8, 9, 10]
[4, 1, 2, 3, 5, 6, 7, 8, 9, 10]
[4, 5, 2, 3, 1, 6, 7, 8, 9, 10]
[4, 5, 2, 3, 6, 1, 7, 8, 9, 10]
[4, 7, 2, 3, 6, 1, 5, 8, 9, 10]
[4, 7, 2, 8, 6, 1, 5, 3, 9, 10]
[4, 7, 2, 8, 6, 1, 5, 3, 9, 10]
[4, 7, 10, 8, 6, 1, 5, 3, 9, 2]


[4, 7, 10, 8, 6, 1, 5, 3, 9, 2]
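
A quick empirical check of the shuffle is to run it many times on a three-element list and count how often each of the 6 permutations appears. This is only a sanity check under a fixed seed, not a proof of uniformity. The name `knuth_shuffle` is my label for the algorithm above (it is commonly called the Knuth or Fisher-Yates shuffle), and the `rng` parameter is an addition so the experiment is repeatable.

```python
import random
from collections import Counter


def knuth_shuffle(L, rng):
    """Shuffle L in place: swap each L[i] with L[r] for a random r in 0..i."""
    for i in range(1, len(L)):
        r = rng.randint(0, i)  # randint is inclusive on both ends
        L[i], L[r] = L[r], L[i]
    return L


rng = random.Random(0)  # fixed seed so the experiment is repeatable
trials = 60000
counts = Counter(tuple(knuth_shuffle([1, 2, 3], rng)) for _ in range(trials))

for permutation, count in sorted(counts.items()):
    print(permutation, count)
```

Each permutation should appear roughly `trials / 6` times. Choosing r from the whole range 0 to N - 1 in every iteration, instead of 0 to i, is a well-known mistake that does not give a uniform result.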

## Convex Hull

The convex hull of a set of N points is the convex polygon with the smallest perimeter that encloses all the points.

The output is the sequence of vertices that defines this polygon.

The hull can be traversed by making only counterclockwise turns.  The vertices of the hull appear in increasing order of polar angle with respect to point p, where p is the point with the lowest y-coordinate.

### Graham Scan

1. Choose the point p with the smallest y-coordinate (sorting on the y-coordinate alone).
2. Sort the remaining points by polar angle with respect to p (sorting on the calculated polar angle).
3. Consider the points in that order; discard any point that does not create a counterclockwise (CCW) turn.
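
The three steps can be sketched in Python as follows. This is one possible reading of the outline, not the course's implementation: the names `ccw` and `graham_scan` are hypothetical, points are assumed to be `(x, y)` tuples, and collinear points are dropped from the hull. Instead of computing angles with trigonometry, the sort compares two points by the sign of a cross product, which gives the same polar order around p while avoiding rounding issues for integer coordinates.

```python
from functools import cmp_to_key


def ccw(a, b, c):
    """Twice the signed area of triangle a-b-c; > 0 means a -> b -> c turns CCW."""
    return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])


def graham_scan(points):
    """Return the hull vertices in counterclockwise order, starting at p."""
    points = list(set(points))
    if len(points) < 3:
        return sorted(points)

    # Step 1: lowest y-coordinate (ties broken by lowest x)
    p = min(points, key=lambda q: (q[1], q[0]))

    def dist2(q):
        return (q[0] - p[0]) ** 2 + (q[1] - p[1]) ** 2

    def polar_order(a, b):
        turn = ccw(p, a, b)
        if turn > 0:
            return -1  # a has the smaller polar angle
        if turn < 0:
            return 1
        # Same angle: put the point nearer to p first
        return (dist2(a) > dist2(b)) - (dist2(a) < dist2(b))

    # Step 2: sort the remaining points by polar angle around p
    rest = sorted((q for q in points if q != p), key=cmp_to_key(polar_order))

    # Step 3: keep a stack; pop while the last two points and q fail to turn CCW
    hull = [p]
    for q in rest:
        while len(hull) >= 2 and ccw(hull[-2], hull[-1], q) <= 0:
            hull.pop()
        hull.append(q)
    return hull


square = [(0, 0), (2, 0), (2, 2), (0, 2), (1, 1), (1, 0)]
hull = graham_scan(square)
print(hull)
```

For the sample points, the interior point `(1, 1)` and the edge point `(1, 0)` are discarded, leaving the four corners of the square. The sort dominates the running time, so the whole procedure is N log N.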