# Table of Contents
* [1 Overview](#Overview)
* [2 Sort](#Sort)
    * [2.1 Asymptotic Runtime](#Asymptotic-Runtime)
    * [2.2 Select Sort](#Select-Sort)
    * [2.3 Insert Sort](#Insert-Sort)
    * [2.4 Bubble Sort](#Bubble-Sort)
    * [2.5 Merge Sort](#Merge-Sort)
    * [2.6 Count Sort](#Count-Sort)
    * [2.7 Radix Sort](#Radix-Sort)
    * [2.8 Runtime Comparison](#Runtime-Comparison)
* [3 Conclusion](#Conclusion)

In [1]:
from __future__ import print_function, division
from random import randint # the class might not have numpy, though that would be easier

def rand_list(n_items=50, lower_bound=-100, upper_bound=100):
    """
    :param n_items: number of elements in the list
    :param lower_bound: smallest integer allowed (inclusive)
    :param upper_bound: largest integer allowed (inclusive)
    :returns: a list of length n_items containing random integers in [lower_bound, upper_bound]
    """
    # range works in both Python 2 and 3; xrange exists only in Python 2
    return [randint(lower_bound, upper_bound) for _ in range(n_items)]

# Overview

This notebook provides a lesson plan for my 2017 Spark class: **Intro to Sorting (w/ Python)**.

Find visualizations [here][vis].

Outline:
* What is sorting?
* Have any of you heard of a sorting algorithm before? (an algorithm is like a recipe: it's a set of instructions that tells you how to do something)
    * List any, if so
* Brainstorm with your neighbors, then share ideas with the class: how would you sort a list of numbers?
    * Write down the basics of some ideas
    * (See how they compare to "known" sorts?)
* Go through the sorts below
* Pick a sort to implement in Python (I don't recommend radix sort unless you're comfortable programming. Count sort should be doable, though). You can work solo or with a partner. I'll come around to help. Tips:
    * Print things frequently, so you can see what's happening to your list!
    
Saving time: if it looks like there won't be enough time, cut some of the sorts. Drop them in this order:
1. **Radix sort** doesn't need to be explained; just mention that one can extend count sort to integers with multiple digits.
2. **Bubble sort** can be skipped entirely.
3. **Insert sort** can be dropped if needed.

[vis]: https://visualgo.net/sorting

# Sort

## Asymptotic Runtime

Short description here; don't do it mathematically but intuitively

## Select Sort

Select sort...

In [67]:
def select_sort(items):
    items = items[:]  # copy so that we don't modify the original
    n_items = len(items)
    # stop at n_items - 1: once all other values are sorted, the last one is too
    for next_idx_to_sort in range(n_items - 1):
        # find the smallest value not yet sorted
        min_idx = next_idx_to_sort
        for idx in range(next_idx_to_sort + 1, n_items):
            if items[idx] < items[min_idx]:
                min_idx = idx
        # make the min value sorted by swapping it with the first unsorted element
        items[next_idx_to_sort], items[min_idx] = items[min_idx], items[next_idx_to_sort]
    return items

## Insert Sort

Insert sort...

In [3]:
def insert_sort(items):
    pass
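
The `insert_sort` stub above is left unimplemented. A minimal sketch of the standard insertion sort, assuming it should return a sorted copy the way `select_sort` does, might look like this. The name `insert_sort_sketch` is hypothetical and not part of the notebook.

```python
def insert_sort_sketch(items):
    """Return a new list sorted in ascending order (hypothetical helper)."""
    items = items[:]  # copy so that we don't modify the original
    for idx in range(1, len(items)):
        current = items[idx]
        pos = idx
        # shift larger, already-sorted items one slot right to open a gap
        while pos > 0 and items[pos - 1] > current:
            items[pos] = items[pos - 1]
            pos -= 1
        items[pos] = current  # drop the current item into the gap
    return items


print(insert_sort_sketch([5, 2, 9, 1, 5]))
```

Everything to the left of `idx` is always sorted, so each new item only has to be slid left until it meets something smaller or equal.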

## Bubble Sort

Bubble sort...

In [4]:
def bubble_sort(items):
    pass
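
The `bubble_sort` stub above is also empty. As a rough sketch of the textbook algorithm, assuming the same "return a sorted copy" convention as `select_sort`, something like the following would work. The name `bubble_sort_sketch` and the early-exit flag `swapped` are illustrative choices, not taken from the notebook.

```python
def bubble_sort_sketch(items):
    """Return a new list sorted in ascending order (hypothetical helper)."""
    items = items[:]  # copy so that we don't modify the original
    n_items = len(items)
    for n_sorted in range(n_items - 1):
        swapped = False
        # each pass pushes the largest unsorted item to the end of the list
        for idx in range(n_items - 1 - n_sorted):
            if items[idx] > items[idx + 1]:
                items[idx], items[idx + 1] = items[idx + 1], items[idx]
                swapped = True
        if not swapped:  # a pass with no swaps means the list is sorted
            break
    return items


print(bubble_sort_sketch([4, 3, 2, 1]))
```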

## Merge Sort

TODO: how much list copying is done? Is this still O(n log n)? also, make a good description


Sorting is a very common task, and you may already be familiar with multiple ways to do it. It's a good introductory problem because the desired outcome is simple (and easily visualizable!) and it's relevant. There are a few obvious but inefficient ways to do it. For example: find the smallest item in the original list by scanning the entire list, move it to the end of a new, sorted list, and repeat until the original list is empty. Each scan is $O(n)$ and there are $n$ of them, so this is $O(n^2)$ total. Merge sort is a well-known and simple algorithm that runs in $O(n \log n)$, which is asymptotically as good as possible for any algorithm that sorts by comparing items to each other. That may sound like it means that _no_ sorting algorithm can do better; if you're interested, look up count sort and then radix sort to see a way around comparing items at all. A proof of this lower bound isn't _too_ difficult, but I won't go into any proofs here, although they're an important part of a foundation in algorithms.

Merge sort is a divide-and-conquer (and combine) algorithm. It works by splitting the unsorted list in half repeatedly until a base case is reached: a list of one element. A one-element list is always sorted, so now instead of an unsorted $n$-element list, we have $n$ sorted, 1-element lists. Next those lists are combined with each other, two at a time (to make 2-element lists, then 4-element lists, until an $n$-element list is reached). The key to merge sort is that when one combines two sorted lists, one doesn't have to compare all items of both lists to each other. Suppose we are sorting so that smaller items come first. We know the smallest item of the combined list will be the smaller of the first items of the two starting lists. As an example, if I have the lists $l1=[3,6,8,9]$ and $l2 = [1,2,5,7]$ then I only check $l1[0]$ and $l2[0]$ to find that the smallest item is 1. This can be removed from $l2$ and appended to the combined list, $l12=[1]$. Then we repeat until one of $l1$ or $l2$ is empty, and append whatever remains of the other list.

Taking advantage of the sortedness of the lists lets us combine two lists in time linear in their size. That is, if the two lists each have length $n$, I'll have to use $O(n)$ time to combine them: each element I add to the combined list requires me to access the first element of each list ($O(1)$), compare them ($O(1)$), and then add the smaller one to the combined list ($O(1)$). I don't actually have to remove the elements from the split lists when I add them to the combined list; I can just update an index of which item is to be considered next in that split list so that I don't add the same element twice. Updating the index is also $O(1)$ (access the old value, increment it, store it). In total this is $O(1)$ per element added, and $2n$ elements will be added, so $O(n)$ overall for combining two sorted lists.
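
To make the combine step concrete, here is a small sketch of merging two sorted lists with one index per list, using the example lists $l1$ and $l2$ from above. The function name `merge_sorted` is made up for this illustration.

```python
def merge_sorted(first, second):
    """Combine two ascending lists into one ascending list in linear time."""
    first_idx, second_idx = 0, 0  # next item to consider in each list
    merged = []
    while first_idx < len(first) and second_idx < len(second):
        # only the two "front" items ever need to be compared
        if first[first_idx] <= second[second_idx]:
            merged.append(first[first_idx])
            first_idx += 1
        else:
            merged.append(second[second_idx])
            second_idx += 1
    # one list is used up; whatever remains of the other is already sorted
    merged.extend(first[first_idx:])
    merged.extend(second[second_idx:])
    return merged


l1 = [3, 6, 8, 9]
l2 = [1, 2, 5, 7]
print(merge_sorted(l1, l2))  # [1, 2, 3, 5, 6, 7, 8, 9]
```

Each pass through the loop adds exactly one item to `merged`, which is where the linear running time comes from.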

-----------------------

Words are not nearly as good at describing such an algorithm as a good picture is. [Here][visualization] is a nice visualization that should help. At the top you can select the type of sort you want to see. Compare merge sort with the others given. Make sure to take a look at count sort and then radix sort. They are simple enough that just from watching you may already understand the entire algorithm, yet it's rather amazing: they allow sorting some lists (where the items can be counted in the fashion shown, and the range of possible values isn't much bigger than the list) in $O(n)$ time. I hope you are seeing how simple and elegant many incredibly useful algorithms are. When you've clicked the type of sort you want, you can click create in the bottom left to make a new list (or use the one already generated), and then click Sort and Go to watch. Their implementation of merge sort is iterative rather than recursive, but you should still be able to follow it. The boxes in the bottom right show pseudocode; only worry about it if it helps.

--------------------------

a) Write a recurrence for the runtime of merge sort and solve it.

b) Implement an algorithm to perform merge sort on a given list. I would recommend doing it recursively first, as that may be easier, but you can do it iteratively if you wish.
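
One possible way to approach part (a), as a sketch: assume that merging two sorted halves with $n$ items in total costs at most $cn$ for some constant $c$, and that $n$ is a power of two. Both the constant $c$ and the power-of-two assumption are introduced here only to keep the algebra simple.

$$
\begin{aligned}
T(1) &= c \\
T(n) &= 2\,T(n/2) + cn \\
     &= 4\,T(n/4) + 2cn \\
     &= 2^k\,T(n/2^k) + k\,cn \\
     &= n\,T(1) + cn\log_2 n \qquad (k = \log_2 n) \\
     &= cn(\log_2 n + 1) = O(n \log n)
\end{aligned}
$$

Copying the two halves with slices costs $O(n)$ per call as well, so it only changes the constant $c$ and the total stays $O(n \log n)$.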

[visualization]: https://visualgo.net/sorting

In [69]:
def merge_sort(items):
    """
    
    There are multiple ways that you can go about this. A recursive approach as below is simple, but calling a function many times
    does incur overhead. (The recursion depth limit is not a practical concern here: the depth is only about log2(n).) An iterative
    approach would be better to use at scale. If you didn't implement it that way, think about how you would.
    
    NOTE: sorts the list in _ascending_ order (smaller items first); one can always reverse it if need be
    """
#     items = items[:] # make a copy so that we don't modify the original
    
    # base cases: zero, one, or two items (you can also just do one base case: at most a single item)
    if len(items) <= 1:  # an empty list would otherwise recurse forever
        return items
    elif len(items) == 2:
        if items[0] <= items[1]:
            return items
        else:
            return [items[1], items[0]]
    
    # recursive case
    else:
        mid_point = len(items) // 2  # floor division; same result as int(len(items) / 2)
        first_half = merge_sort(items[:mid_point])
        second_half = merge_sort(items[mid_point:])
        
        # combine sorted lists
        first_idx = 0
        second_idx = 0
        result = []
        while True:
            first_elem = first_half[first_idx]
            second_elem = second_half[second_idx]
            
            if first_elem <= second_elem:  # <= takes ties from first_half, keeping the sort stable
                result.append(first_elem)
                first_idx += 1
                if first_idx == len(first_half): # all items from first_half have already been added to result
                    result.extend(second_half[second_idx:])
                    return result
            else:
                result.append(second_elem)
                second_idx += 1
                if second_idx == len(second_half):
                    result.extend(first_half[first_idx:])
                    return result
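
The docstring above asks how an iterative version might work. One common answer is a "bottom-up" merge sort: start from one-element runs and merge neighbouring runs until a single run is left. The sketch below is one way to write that, not the notebook's implementation; `merge_sort_iterative` and `runs` are names chosen for illustration.

```python
def merge_sort_iterative(items):
    """Bottom-up merge sort sketch; returns a new ascending list."""
    runs = [[item] for item in items]  # n sorted one-element lists
    while len(runs) > 1:
        merged_runs = []
        for idx in range(0, len(runs), 2):
            if idx + 1 == len(runs):
                merged_runs.append(runs[idx])  # odd run out; carry it forward
                continue
            first, second = runs[idx], runs[idx + 1]
            first_idx, second_idx = 0, 0
            merged = []
            # same linear-time combine step as in the recursive version
            while first_idx < len(first) and second_idx < len(second):
                if first[first_idx] <= second[second_idx]:
                    merged.append(first[first_idx])
                    first_idx += 1
                else:
                    merged.append(second[second_idx])
                    second_idx += 1
            merged.extend(first[first_idx:])
            merged.extend(second[second_idx:])
            merged_runs.append(merged)
        runs = merged_runs  # about half as many runs, each about twice as long
    return runs[0] if runs else []


print(merge_sort_iterative([3, 6, 8, 9, 1, 2, 5, 7, 4]))
```

The outer `while` loop runs about $\log_2 n$ times and each pass touches every item once, so the total work is still $O(n \log n)$, with no function-call recursion at all.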

In [39]:
l = [1,2,3,4,5]
z = l[:3]
z[1] = 17
l # unchanged: slicing a list makes a copy. With numpy arrays, z would be a view, so l would become [1, 17, 3, 4, 5]

[1, 2, 3, 4, 5]

## Count Sort

Count sort...

In [6]:
def count_sort(items):
    pass
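
The `count_sort` stub above is empty. As a hedged sketch of the usual counting idea, assuming the items are integers within a modest range (as `rand_list` produces), one could tally how often each value occurs and then write the values back out in order. The name `count_sort_sketch` is hypothetical.

```python
def count_sort_sketch(items):
    """Return a new ascending list of integers (hypothetical helper)."""
    if not items:
        return []
    smallest, largest = min(items), max(items)
    # one counter per possible value; subtracting smallest handles negatives
    counts = [0] * (largest - smallest + 1)
    for item in items:
        counts[item - smallest] += 1
    result = []
    for offset, count in enumerate(counts):
        result.extend([smallest + offset] * count)  # emit each value count times
    return result


print(count_sort_sketch([3, -1, 2, 3, 0, -1]))
```

No two items are ever compared to each other. The cost is $O(n + k)$, where $k$ is the size of the value range, so this only pays off when $k$ is not much larger than $n$.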

## Radix Sort

Radix sort...

In [7]:
def radix_sort(items):
    pass
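
The `radix_sort` stub above is empty as well. A rough sketch of a least-significant-digit radix sort follows, assuming integer items. It shifts all values to be non-negative first, which is one simple way (an assumption of this sketch, not something the notebook specifies) to cope with the negative numbers that `rand_list` can produce. The names `radix_sort_sketch`, `base`, and `place` are illustrative.

```python
def radix_sort_sketch(items, base=10):
    """Return a new ascending list of integers (hypothetical helper)."""
    if not items:
        return []
    smallest = min(items)
    shifted = [item - smallest for item in items]  # all values are now >= 0
    largest = max(shifted)
    place = 1  # 1s digit, then 10s digit, then 100s digit, ...
    while place <= largest:
        buckets = [[] for _ in range(base)]
        for value in shifted:
            digit = (value // place) % base
            buckets[digit].append(value)  # appending keeps earlier order (stable)
        shifted = [value for bucket in buckets for value in bucket]
        place *= base
    return [value + smallest for value in shifted]  # undo the shift


print(radix_sort_sketch([170, -45, 75, 90, -802, 24, 2, 66]))
```

Each pass is a count-sort-like bucketing on a single digit. Because every pass preserves the order established by the previous ones, the list is fully sorted after the most significant digit has been processed.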

## Runtime Comparison

In [70]:
# check that the results are correct
sorts = [select_sort, insert_sort, bubble_sort, merge_sort, count_sort, radix_sort]
n = 1000
for sort in sorts:
    unsorted = rand_list(n)
    if sort(unsorted) == sorted(unsorted):
        print("{} passed.".format(sort.__name__))
    else:
        print("{} failed!".format(sort.__name__))

select_sort passed.
insert_sort failed!
bubble_sort failed!
merge_sort passed.
count_sort failed!
radix_sort failed!


In [71]:
n = 1000
array = rand_list(n)
print("Runtime comparison with a list of length {:,}.".format(n))

sorts = [select_sort, insert_sort, bubble_sort, merge_sort, count_sort, radix_sort, sorted]
for sort in sorts:
    print("\n", sort.__name__)
    %timeit sort(array)

# # none of the sorts can be in-place or else you'll need to make a new list for each loop for the timings

# print("\nNaive sort")
# %timeit naive_sort(array)

# print("\nMerge sort")
# %timeit merge_sort(array)

# print("\nPython built-in sorted")
# %timeit sorted(array) # CPython's sorted uses Timsort (a hybrid of merge sort and insertion sort), not radix sort

Runtime comparison with a list of length 1,000.

 select_sort
10 loops, best of 3: 15.1 ms per loop

 insert_sort
10000000 loops, best of 3: 66.5 ns per loop

 bubble_sort
10000000 loops, best of 3: 63.8 ns per loop

 merge_sort
1000 loops, best of 3: 1.72 ms per loop

 count_sort
The slowest run took 23.36 times longer than the fastest. This could mean that an intermediate result is being cached.
10000000 loops, best of 3: 133 ns per loop

 radix_sort
The slowest run took 30.80 times longer than the fastest. This could mean that an intermediate result is being cached.
10000000 loops, best of 3: 132 ns per loop

 sorted
10000 loops, best of 3: 153 µs per loop


# Conclusion
Say a little bit here to encourage them to pursue computer science / programming if they have an interest: make it clear how much they can do even on their own.