# Sorting

## Sorting Algo Rankings

- Naive sorting algos (e.g. bubble sort, selection sort, insertion sort) run in O(n^2) time

Stable: a sorting algo in which equal entries keep their original relative order

In place: uses only O(1) additional space

- Many sorting algos run in O(n log n)
    - heapsort
        - in place, but not stable
    - mergesort
        - stable, but not in place
    - quicksort
        - O(n log n) on average, but O(n^2) in the worst case

A well-implemented quicksort is usually the best choice for general-purpose sorting.

Others are better in *specific* circumstances:
- Short array with ~10 or fewer elements? -> Insertion sort is quick to code and faster than asymptotically superior algorithms
- Elements known to be at most k places from their final location? -> A min-heap gives an O(n log k) algorithm
- Small number of distinct keys, e.g. integers in [0, 255]? -> Counting sort
    - Records, for each element, the # of elements less than it
    - If the largest number is comparable in value to the size of the set being sorted, keep the counts in an array
    - Or use a BST. Keys are the numbers, values are the frequencies. Use a linked list for elements with duplicate keys. An inorder traversal yields the sorted order.
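
As a rough illustration of two of the special cases above, here is a minimal sketch. The function names `sort_almost_sorted` and `counting_sort` and the `num_keys` parameter are made up for this example, and the counting sort assumes every key is an integer in `range(num_keys)`.

```python
import heapq


def sort_almost_sorted(values, k):
    """Sort values where each element is at most k places from its final spot."""
    min_heap = []
    result = []
    for value in values:
        heapq.heappush(min_heap, value)
        # Once the heap holds k + 1 elements, its minimum is the next output:
        # nothing that arrives later can belong before it.
        if len(min_heap) > k:
            result.append(heapq.heappop(min_heap))
    # Drain whatever is left, smallest first
    while min_heap:
        result.append(heapq.heappop(min_heap))
    return result


def counting_sort(items, key, num_keys):
    """Stable counting sort; assumes key(item) is an int in range(num_keys)."""
    counts = [0] * num_keys
    for item in items:
        counts[key(item)] += 1

    # next_slot[k] = number of items whose key is less than k
    next_slot = [0] * num_keys
    for k in range(1, num_keys):
        next_slot[k] = next_slot[k - 1] + counts[k - 1]

    result = [None] * len(items)
    for item in items:
        k = key(item)
        result[next_slot[k]] = item
        next_slot[k] += 1
    return result


print(sort_almost_sorted([3, -1, 2, 6, 4, 5, 8], k=2))
print(counting_sort([3, 0, 255, 3, 1], key=lambda x: x, num_keys=256))
```

The heap never holds more than k + 1 elements, which is where the O(n log k) bound comes from. The counting sort does O(n + num_keys) work and never compares two elements.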

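- Many sorting algos (e.g. heapsort and typical quicksort implementations) aren't stable.
    - One solution is to add each element's original index to its key as an integer rank, so ties are broken by original position.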
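
The index-as-rank trick can be sketched as follows. This is only an illustration: `stable_heap_sort` is a hypothetical helper that runs a heapsort through the standard `heapq` module and decorates each item with `(key, original index)`.

```python
import heapq


def stable_heap_sort(items, key):
    """Heapsort made stable by ranking equal keys by original index."""
    # Each entry is (sort key, original index, item). The index is unique,
    # so comparisons never reach the item itself.
    heap = [(key(item), index, item) for index, item in enumerate(items)]
    heapq.heapify(heap)
    return [heapq.heappop(heap)[2] for _ in range(len(items))]


records = [('E', 3.0), ('C', 3.0), ('B', 2.0), ('D', 3.2)]
by_gpa = stable_heap_sort(records, key=lambda record: record[1])
print(by_gpa)  # E still comes before C among the 3.0 records
```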

## Sorting Objects

```python
class Student:
    def __init__(self, name, grade_point_average):
        self.name = name
        self.grade_point_average = grade_point_average

    # Comparison used by sorted() and .sort(): orders students by name
    def __lt__(self, other):
        return self.name < other.name

students = [
    Student('A', 4.0),
    Student('C', 3.0),
    Student('B', 2.0),
    Student('D', 3.2)
]

# sorted() returns a new sorted list and leaves the students variable unchanged
print('Copy of students sorted variable')
students_sorted_by_name = sorted(students)
for student in students_sorted_by_name:
    print(student.name)

print("\nstudents variable")
for student in students:
    print(student.name)
```

Output:

```
Copy of students sorted variable
A
B
C
D

students variable
A
C
B
D
```


```python
# .sort() sorts the list in place, so the students variable itself changes
students.sort()

print("\nstudents variable post-.sort() method")
for student in students:
    print(student.name)
```

Output:

```
students variable post-.sort() method
A
B
C
D
```


```python
# Reset students
students = [
    Student('A', 4.0),
    Student('E', 3.0),
    Student('C', 3.0),
    Student('B', 2.0),
    Student('D', 3.2)
]

print("\nstudents variable post-lambda on GPA attribute")

# Sort by GPA only. The sort is stable, so E stays ahead of C on the 3.0 tie
students.sort(key=lambda student: student.grade_point_average)
for student in students:
    print(student.name, student.grade_point_average)
```

Output:

```
students variable post-lambda on GPA attribute
B 2.0
E 3.0
C 3.0
D 3.2
A 4.0
```


```python
# Reset students. E and C tie on GPA, so break the tie with a second sort criterion
students = [
    Student('A', 4.0),
    Student('E', 3.0),
    Student('C', 3.0),
    Student('B', 2.0),
    Student('D', 3.2)
]

print("\nstudents variable post-lambda on GPA attribute, tie breaking on name attribute")

# The key is a tuple: students are compared by GPA first, and ties are broken by name
students.sort(key=lambda s: (s.grade_point_average, s.name))
for student in students:
    print(student.name, student.grade_point_average)
```

Output:

```
students variable post-lambda on GPA attribute, tie breaking on name attribute
B 2.0
C 3.0
E 3.0
D 3.2
A 4.0
```
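
The same GPA-then-name ordering can also be written without a lambda. The sketch below uses `operator.attrgetter`, which returns a tuple when given several attribute names. `StudentRecord` is a hypothetical stand-in for the `Student` class, defined here only so the snippet runs on its own.

```python
from dataclasses import dataclass
from operator import attrgetter


@dataclass
class StudentRecord:
    # Hypothetical stand-in for Student, with the same two attributes
    name: str
    grade_point_average: float


records = [
    StudentRecord('A', 4.0),
    StudentRecord('E', 3.0),
    StudentRecord('C', 3.0),
    StudentRecord('B', 2.0),
    StudentRecord('D', 3.2),
]

# attrgetter with two names builds (grade_point_average, name) for each record
by_gpa_then_name = sorted(records, key=attrgetter('grade_point_average', 'name'))
names = [record.name for record in by_gpa_then_name]
print(names)  # ['B', 'C', 'E', 'D', 'A']
```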


## Sorting Libraries

### .sort()
`.sort()` sorts a list in place and returns `None`.

It takes 2 optional keyword args:
- *reverse*=False
    - Defaults to ascending order; setting it to True sorts in descending order
- *key*=None
    - If key is not None, it must be a function that takes a single list element and maps it to a comparable object. Elements are ordered by these mapped values.

```python
a = [1, 2, 4, 3, 5, 0, 11, 21, 100]
# The key maps each integer to its string equivalent
a.sort(key=str)

# Entries are now in lexicographical order of their string representations
print(a)
```

Output:

```
[0, 1, 100, 11, 2, 21, 3, 4, 5]
```

### sorted()

`sorted(iterable)` works on any iterable, not just lists. It returns a new list in ascending order and leaves the input unchanged.

2 optional keyword args, both work just like .sort()
- *reverse*=False
- *key*=None
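
A few quick illustrations of `sorted()` on inputs that are not lists. The data here is made up for the example.

```python
gpas = {'A': 4.0, 'C': 3.0, 'B': 2.0}  # hypothetical name -> GPA mapping

# Iterating a dict yields its keys, so this sorts the names
names = sorted(gpas)
print(names)  # ['A', 'B', 'C']

# key and reverse work as they do for .sort(): names by GPA, highest first
names_by_gpa_desc = sorted(gpas, key=gpas.get, reverse=True)
print(names_by_gpa_desc)  # ['A', 'C', 'B']

# A string is an iterable of characters
letters = sorted("sorting")
print(letters)  # ['g', 'i', 'n', 'o', 'r', 's', 't']

# The result is always a new list; the original tuple is untouched
scores = (3, 1, 2)
sorted_scores = sorted(scores)
print(sorted_scores, scores)  # [1, 2, 3] (3, 1, 2)
```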

## Sorting Problem Types

2 Main Types

- Use sorting to make subsequent steps in an algorithm simpler
    - Fine to use a library sort function, possibly with a custom sort key (`key=lambda ...`)
- Design a custom sorting routine
    - Use a data structure like a BST, heap, or array indexed by values
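
Python 3's library sorts take a `key` function rather than a two-argument comparator. When an ordering is easier to express as a comparison, `functools.cmp_to_key` adapts one. A minimal sketch, where `compare_students` and the `(name, gpa)` tuples are made up for illustration:

```python
from functools import cmp_to_key


def compare_students(a, b):
    """Return a negative number if a sorts before b, positive if after, 0 if tied."""
    gpa_a, gpa_b = a[1], b[1]
    if gpa_a != gpa_b:
        return -1 if gpa_a > gpa_b else 1  # higher GPA first
    return (a[0] > b[0]) - (a[0] < b[0])   # then name, ascending


records = [('A', 4.0), ('E', 3.0), ('C', 3.0), ('B', 2.0), ('D', 3.2)]
ranked = sorted(records, key=cmp_to_key(compare_students))
print(ranked)
```

For an ordering this simple, `key=lambda r: (-r[1], r[0])` gives the same result; a comparator tends to pay off only when the rule cannot be written as a tuple of keys.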

The most natural reason to sort is that the inputs have a natural ordering. Sorting then serves as a *preprocessing* step that speeds up *searching*.
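
As a small sketch of sorting as preprocessing: sort once, then answer each membership query with a binary search from the standard `bisect` module. The helper name `contains` and the sample ids are assumptions for this example.

```python
import bisect


def contains(sorted_values, target):
    """Binary search for target in an already sorted list: O(log n) per query."""
    i = bisect.bisect_left(sorted_values, target)
    return i < len(sorted_values) and sorted_values[i] == target


ids = [42, 7, 19, 3, 88]
ids.sort()  # pay O(n log n) once up front

print(contains(ids, 19), contains(ids, 20))  # True False
```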

For specialized input, e.g. a very small range of values, it's possible to sort in O(n) time rather than O(n log n)!

A sorting-based solution can often be implemented in *less space* than a brute-force approach requires.