## Welcome! 

### This talk will introduce you to searching, sorting, and sharing using your favorite language and mine, Python!

### Python is a great language to quickly prototype and is backed by a great open-source community.

<img src="python_ecosystem.jpg">

**Note: I adapted a ton of images from <a href="http://interactivepython.org/runestone/static/pythonds/index.html">Interactive Python</a>, check them out if you want to learn more about Python**

### We'll start with searching, and grow outwards from there.

#### We want to start by importing numpy's random module, to generate random integers for our list of values

In [93]:
from IPython.display import display
import numpy.random as random

#### Let's initialize a list ```random_values``` of random values in the range [0, 1000) (that is, 0 through 999) using numpy's ```random.randint``` function (<a href="http://docs.scipy.org/doc/numpy/reference/generated/numpy.random.randint.html">documentation</a>)

In [94]:
# Will hold 1000 elements in the range 0 to 999 (randint's upper bound is exclusive)
random_values = [random.randint(1000) for _ in range(1000)]
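
If you don't have numpy handy, here's a rough stand-in using only the standard library. Note that ```stdlib_values``` and the seed ```42``` are names and values I made up for this sketch; they aren't part of the talk's notebook. Like numpy's ```random.randint(1000)```, ```randrange(1000)``` draws from the half-open range [0, 1000), so 1000 itself never shows up.

```python
import random as stdlib_random  # the standard-library module, not numpy.random

# A fixed (arbitrary) seed so reruns produce the same list
rng = stdlib_random.Random(42)

# 1000 integers, each drawn from 0 through 999
stdlib_values = [rng.randrange(1000) for _ in range(1000)]

print(len(stdlib_values))
print(min(stdlib_values) >= 0 and max(stdlib_values) <= 999)
```

Seeding is optional; it just makes the "is my key in the list?" examples behave the same way every time you rerun them.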

### So how would we go about finding a particular key value in the list?

Well, to be completely sure whether or not our key is in the list, we have to iterate through and ask: <br>

    "Is the value at my current index equal to my key value (the value I'm looking for)?"
    
If **yes**: "Awesome! Return ```True``` and break or whatever." <br>
If **not**: "Lame. Keep on looking, though."

#### Let's print out the contents of our ```random_values``` list

In [95]:
# Let's take a look at our values list
print(random_values) # alternatively, a bare `random_values` on the last line of a cell would also display our list

[76, 654, 82, 890, 608, 124, 636, 685, 163, 907, 208, 484, 372, 605, 538, 434, 708, 428, 57, 847, 261, 259, 249, 125, 567, 474, 514, 830, 695, 646, 855, 94, 182, 228, 402, 622, 155, 608, 992, 992, 610, 794, 40, 992, 353, 130, 829, 557, 983, 52, 475, 800, 423, 279, 645, 695, 70, 209, 925, 412, 730, 807, 364, 354, 913, 26, 493, 32, 863, 0, 110, 565, 319, 896, 45, 459, 595, 401, 38, 428, 692, 931, 396, 658, 857, 776, 154, 830, 983, 398, 22, 971, 286, 181, 750, 755, 924, 226, 432, 462, 502, 555, 721, 273, 211, 484, 436, 116, 561, 585, 263, 175, 785, 106, 636, 828, 582, 583, 563, 775, 287, 42, 856, 406, 598, 699, 952, 234, 87, 381, 870, 617, 538, 775, 327, 373, 962, 110, 147, 226, 231, 108, 43, 887, 217, 42, 812, 652, 476, 703, 588, 283, 582, 411, 761, 743, 142, 952, 670, 169, 799, 955, 688, 228, 784, 45, 95, 564, 486, 389, 304, 375, 96, 986, 975, 700, 548, 118, 484, 427, 702, 493, 458, 894, 419, 16, 60, 599, 509, 346, 820, 67, 396, 173, 256, 676, 665, 829, 938, 275, 198, 32, 465, 207, 362,

## Now we can linearly search three ways: the bad way, the better way, and the Pythonic way.

<img src="linear_search.png">

### The bad way is to hard-code a loop through our values list.
_Thought experiment_: Why is this bad? It works fine, right?

Let's assume we're looking for the value ```516``` within our list. Let's code up our implementation (Note: for some of you, ```516``` won't be found in the list. That's fine!):

In [96]:
key = 516
for value in random_values:
    if value == key: 
        print("Found it!")
        break

Found it!


Here's what it looks like, visually:

<img src="linear_search_done.png">

You might have noticed that this isn't modular. We would have to rewrite each of these lines – key definition and iteration structure – for any possible list we want to iterate over. Kinda tedious.

### Now let's do it the semi-right way and make our linear search into a function

We define a function ```linear_search```. What will the function need? We'll need an iterable object (i.e. a list) and we'll need a key value to search for. <br>
So we say:

In [97]:
def linear_search(iterable, key):
    found = False
    for value in iterable:
        if value == key:
            found = True
            break
            
    return found
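
Sometimes knowing *whether* the key is there isn't enough and we'd like to know *where* it is. Here's a minimal sketch of that variation; ```linear_search_index``` and ```sample``` are hypothetical names for illustration, and the sample values are just the first five numbers from the printed list above.

```python
def linear_search_index(iterable, key):
    """Return the index of the first value equal to key, or -1 if it is absent."""
    for index, value in enumerate(iterable):
        if value == key:
            return index
    return -1


sample = [76, 654, 82, 890, 608]
print(linear_search_index(sample, 82))   # 2
print(linear_search_index(sample, 516))  # -1
```

Returning ```-1``` for "not found" is one convention among several; raising an exception (like ```list.index``` does) would work too.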

### Why is this better than hardcoding our loop? Because this way, we can do more complex tasks, say:

    Given all numbers in the range 0 to 999, check if each number is in the random_values list.
    If it is, print "Found x" (where x is the number)

Let's do just that!

In [98]:
# Create our normal range from 0 to 999
normal_values = list(range(1000))

# For each item in our normal_values list
for i in normal_values:
    # If our linear search returns True, we print the number
    if linear_search(random_values, i):
        print("Found {}!".format(i))

Found 0!
Found 1!
Found 4!
Found 5!
Found 7!
Found 10!
Found 11!
Found 13!
Found 16!
Found 18!
Found 20!
Found 21!
Found 22!
Found 23!
Found 26!
Found 27!
Found 28!
Found 29!
Found 32!
Found 33!
Found 34!
Found 35!
Found 37!
Found 38!
Found 40!
Found 41!
Found 42!
Found 43!
Found 44!
Found 45!
Found 46!
Found 48!
Found 49!
Found 50!
Found 52!
Found 53!
Found 55!
Found 56!
Found 57!
Found 59!
Found 60!
Found 64!
Found 67!
Found 69!
Found 70!
Found 71!
Found 72!
Found 73!
Found 74!
Found 75!
Found 76!
Found 81!
Found 82!
Found 85!
Found 87!
Found 88!
Found 89!
Found 93!
Found 94!
Found 95!
Found 96!
Found 98!
Found 99!
Found 102!
Found 104!
Found 106!
Found 107!
Found 108!
Found 110!
Found 112!
Found 113!
Found 114!
Found 115!
Found 116!
Found 118!
Found 122!
Found 124!
Found 125!
Found 126!
Found 128!
Found 130!
Found 132!
Found 138!
Found 139!
Found 142!
Found 143!
Found 146!
Found 147!
Found 149!
Found 150!
Found 151!
Found 152!
Found 154!
Found 155!
Found 156!
Found 158!
Found 162!
F

### Now let's do it the Pythonic way
Python has a nifty membership operator called ``in``; it's the same keyword we use all the time in our ``for`` loops. Let's revisit our two previous examples using the ``in`` operator.

In [99]:
key = 516
key in random_values

True

In [100]:
for i in normal_values:
    if i in random_values:
        print("Found {}!".format(i))

Found 0!
Found 1!
Found 4!
Found 5!
Found 7!
Found 10!
Found 11!
Found 13!
Found 16!
Found 18!
Found 20!
Found 21!
Found 22!
Found 23!
Found 26!
Found 27!
Found 28!
Found 29!
Found 32!
Found 33!
Found 34!
Found 35!
Found 37!
Found 38!
Found 40!
Found 41!
Found 42!
Found 43!
Found 44!
Found 45!
Found 46!
Found 48!
Found 49!
Found 50!
Found 52!
Found 53!
Found 55!
Found 56!
Found 57!
Found 59!
Found 60!
Found 64!
Found 67!
Found 69!
Found 70!
Found 71!
Found 72!
Found 73!
Found 74!
Found 75!
Found 76!
Found 81!
Found 82!
Found 85!
Found 87!
Found 88!
Found 89!
Found 93!
Found 94!
Found 95!
Found 96!
Found 98!
Found 99!
Found 102!
Found 104!
Found 106!
Found 107!
Found 108!
Found 110!
Found 112!
Found 113!
Found 114!
Found 115!
Found 116!
Found 118!
Found 122!
Found 124!
Found 125!
Found 126!
Found 128!
Found 130!
Found 132!
Found 138!
Found 139!
Found 142!
Found 143!
Found 146!
Found 147!
Found 149!
Found 150!
Found 151!
Found 152!
Found 154!
Found 155!
Found 156!
Found 158!
Found 162!
F
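
One thing worth knowing: on a list, ``in`` still walks the values one by one, so a loop that runs ``in`` a thousand times against a thousand-element list does a lot of comparisons. If we only care about membership, a ```set``` answers ``in`` in constant time on average. A quick sketch, where ```sample_values``` (the first ten numbers from the printed list) and ```value_set``` are names I'm making up for illustration:

```python
sample_values = [76, 654, 82, 890, 608, 124, 636, 685, 163, 907]

# One-time conversion; after this, each membership test is O(1) on average
value_set = set(sample_values)

found = [i for i in range(100) if i in value_set]
print(found)  # [76, 82]
```

The trade-off is that a set drops duplicates and ordering, so this only fits when "is it in there?" is the whole question.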

## Let's talk about binary search
So linear search is cool and all, but what about something faster? Well, we can improve our searching if we *know* that our collection is in sorted order.

Let's sort our ```random_values``` list:

In [101]:
sorted_values = sorted(random_values)

Now we can take advantage of our sorted values and say: compare my ```key``` against the middle value within my list. From there, we evaluate:

**Is my ```key``` value greater than the list's value at the middle index? Is it less than? Equal to?**

If our ```key``` is found, then we're done. For our purposes, let's say our key is *greater than* the value at the middle of the list. Since our list is sorted, we **know** we won't find it *below* the middle value. Therefore, we can eliminate *half of our search space* and only consider the upper half of our list when re-searching.

<img src="bin_search.png">

Let's implement binary search recursively:

In [102]:
def rec_binary_search(list_of_values, key):
    # Base case: an empty list can't contain the key
    if len(list_of_values) == 0:
        return "{} was not found".format(key)
    else:
        # Compare the key against the middle element
        mid = len(list_of_values) // 2
        if key > list_of_values[mid]:
            # Key can only be in the upper half (everything after mid)
            return rec_binary_search(list_of_values[mid+1:], key)
        elif key < list_of_values[mid]:
            # Key can only be in the lower half (everything before mid)
            return rec_binary_search(list_of_values[:mid], key)
        else:
            return "{} was found".format(key)
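
A small caveat about the recursive version: every slice like ```list_of_values[mid+1:]``` copies part of the list, so the copying alone adds up to linear work. One way around that, sketched here under my own naming (```rec_binary_search_bounds```, ```left```, and ```right``` are not from the original notebook), is to pass index bounds instead of slices. This sketch also returns a boolean rather than a message string.

```python
def rec_binary_search_bounds(values, key, left=0, right=None):
    """Return True if key is in the sorted list `values`, searching values[left..right]."""
    if right is None:
        right = len(values) - 1
    # Base case: the bounds have crossed, so nothing is left to search
    if left > right:
        return False
    mid = (left + right) // 2
    if key > values[mid]:
        return rec_binary_search_bounds(values, key, mid + 1, right)
    if key < values[mid]:
        return rec_binary_search_bounds(values, key, left, mid - 1)
    return True


demo = [17, 20, 26, 31, 44, 54, 55, 65, 77, 93]
print(rec_binary_search_bounds(demo, 44))  # True
print(rec_binary_search_bounds(demo, 18))  # False
```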

Let's implement binary search iteratively:

In [103]:
def iter_binary_search(list_of_values, key):
    # Track the inclusive bounds of the part of the list still being searched
    left_index = 0
    right_index = len(list_of_values) - 1

    while left_index <= right_index:
        mid = (left_index + right_index) // 2
        if key > list_of_values[mid]:
            left_index = mid + 1   # discard the lower half
        elif key < list_of_values[mid]:
            right_index = mid - 1  # discard the upper half
        else:
            return "{} was found".format(key)

    # The bounds crossed, so the key isn't in the list
    return "{} was not found".format(key)
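
To get a feel for how much "eliminate half of the search space" buys us, here's a sketch that counts loop iterations instead of returning a message. ```binary_search_steps```, ```values```, and ```worst_case``` are hypothetical names for this illustration. It uses the same loop as the iterative search, run over a sorted list of 1000 values.

```python
def binary_search_steps(sorted_values, key):
    """Return how many times the loop body runs while searching for key."""
    left, right = 0, len(sorted_values) - 1
    steps = 0
    while left <= right:
        steps += 1
        mid = (left + right) // 2
        if key > sorted_values[mid]:
            left = mid + 1
        elif key < sorted_values[mid]:
            right = mid - 1
        else:
            break
    return steps


values = list(range(1000))
worst_case = max(binary_search_steps(values, key) for key in values)
print(worst_case)  # 10
```

Ten steps in the worst case, versus up to 1000 comparisons for a linear search over the same list.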

In [105]:
list_of_values = [17, 20, 26, 31, 44, 54, 55, 65, 77, 93]
for value in list_of_values:
    print(rec_binary_search(list_of_values, value))
    print(iter_binary_search(list_of_values, value))

17 was found
17 was found
20 was found
20 was found
26 was found
26 was found
31 was found
31 was found
44 was found
44 was found
54 was found
54 was found
55 was found
55 was found
65 was found
65 was found
77 was found
77 was found
93 was found
93 was found
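
Just like linear search has its Pythonic ``in``, binary search has a standard-library helper: the ```bisect``` module. Here's a minimal sketch built on ```bisect_left```; the wrapper name ```bisect_search``` is my own, and the list is assumed to be sorted in ascending order. It also tries a key that *isn't* in the list, which the loop above never does.

```python
from bisect import bisect_left


def bisect_search(sorted_values, key):
    """Return True if key is present in sorted_values (assumed sorted ascending)."""
    # bisect_left gives the leftmost position where key could be inserted
    index = bisect_left(sorted_values, key)
    return index < len(sorted_values) and sorted_values[index] == key


list_of_values = [17, 20, 26, 31, 44, 54, 55, 65, 77, 93]
print(bisect_search(list_of_values, 44))  # True
print(bisect_search(list_of_values, 45))  # False
```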


Cool! So that wraps up searching in Python. Next, we introduce sorting.

# Sorting

Sorting is the (not-so) simple task of taking a collection of values/objects/etc. and arranging them in a cohesive, sorted order (ascending *or* descending). We all (hopefully) know the basic sorting methods, including selection, insertion, and bubble sorts. 

We'll introduce two more advanced sorting methods -- mergesort and quicksort -- that are much quicker than the previously mentioned sorting algorithms. Note that we won't be covering the **very advanced, objectively necessary** <a href="http://rosettacode.org/wiki/Sorting_algorithms/Sleep_sort">sleep sort</a> or <a href="https://en.wikipedia.org/wiki/Bogosort">bogo sort</a> algorithms (sleep sort sorts any list of numbers in O(n) time -- true story)*.

With that, let's approach mergesort.

\* I'm totally kidding.


## Mergesort

Mergesort is a recursive sorting algorithm that splits a list in half until it's dealing with lists of size 1. By definition, a list of size 0 or 1 is considered sorted. Ergo, if the list has a size of 2 or more, we split it in half and invoke mergesort on both halves.

A vital part of mergesort is the *merge* operation, which is done once each half is sorted. Merging is where we take two sorted sub-lists and combine them into a single sorted list.
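
Here's one way the merge step might look; treat it as a sketch rather than the definitive version. The function name ```merge``` and its variable names are my own choices. It walks both sorted inputs with two indices and always takes the smaller front value.

```python
def merge(left, right):
    """Combine two already-sorted lists into one sorted list."""
    merged = []
    i = j = 0
    # Take the smaller front element until one side runs out
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    # At most one of these still has elements; they are already in order
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged


print(merge([10, 12, 34, 43, 58], [19, 25, 32, 49, 61]))
# [10, 12, 19, 25, 32, 34, 43, 49, 58, 61]
```

Using ```<=``` (rather than ```<```) means ties are taken from the left list first, which keeps equal values in their original order.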

Let's see if we can visualize it. Take our list:

                        | 10 | 12 | 34 | 58 | 43 | 25 | 19 | 61 | 49 | 32 |

We note that its length is greater than 1, so (by our definition) we can't assume it's sorted. We break the problem in half by recursively calling mergesort on each half of the list for as long as that half's length is greater than 1. I've put an asterisk * next to the arrays that are considered sorted. Ergo:

                        | 10 | 12 | 34 | 58 | 43 | 25 | 19 | 61 | 49 | 32 |
                                   /                           \
            | 10 | 12 | 34 | 58 | 43 |                       | 25 | 19 | 61 | 49 | 32 |
                     /      \                                         /      \
          | 10 | 12 |        | 34 | 58 | 43 |              | 25 | 19 |        | 61 | 49 | 32 |
              / \                /      \                      / \                /      \
        | 10*|   | 12*|    | 34*|        | 58 | 43 |     | 25*|    | 19*|   | 61*|        | 49 | 32 |
                                             / \                                              / \
                                       | 58*|   | 43*|                                  | 49*|    | 32*|
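
The splitting in the diagram can be sketched in a few lines. To keep this short I'm leaning on the standard library's ```heapq.merge``` for the merge step instead of a hand-written one; that's a shortcut of mine, not necessarily how the talk implements it, and ```merge_sort``` / ```merge_sorted``` are names chosen for this illustration.

```python
from heapq import merge as merge_sorted  # merges already-sorted iterables


def merge_sort(values):
    """Return a new sorted list using the split-then-merge scheme above."""
    # Base case: a list of size 0 or 1 is already sorted
    if len(values) <= 1:
        return list(values)
    # Split in half; for 5 elements this gives halves of size 2 and 3
    mid = len(values) // 2
    left = merge_sort(values[:mid])
    right = merge_sort(values[mid:])
    # Combine the two sorted halves into one sorted list
    return list(merge_sorted(left, right))


print(merge_sort([10, 12, 34, 58, 43, 25, 19, 61, 49, 32]))
# [10, 12, 19, 25, 32, 34, 43, 49, 58, 61]
```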

## Feel free to check out any of the following links for more resources on Python:
<a href="http://interactivepython.org/runestone/static/pythonds/index.html">Interactive Python</a>: A great open source repository for interactive textbooks, including one on problem solving in Python. <br>
<a href="http://www.amazon.com/Python-Cookbook-Alex-Martelli/dp/0596007973/">Python Cookbook</a>: Good collection of problems to solve and projects to undertake using Python <br>
<a href="http://flask.pocoo.org/docs/0.10/">Flask</a>: A microframework for Python web development. <br>

<a href="http://nbviewer.ipython.org/">nbviewer</a>

# Thanks!