* Develop an understanding of the inherent complexity of the problem with which we are faced.
* Think about how to break that problem up into subproblems.
* Relate those subproblems to other problems for which efficient algorithms already exist. 

# 10.1  Search Algorithms

In this section, we will examine two algorithms for **searching a list**. Each meets the specification:
```python
def search(L, e): 
    """Assumes L is a list. 
       Returns True if e is in L and False otherwise"""
```

## 10.1.1  Linear Search and Using Indirection to Access Elements

In [1]:
def search(L, e):
    for i in range(len(L)):
        if L[i] == e:
            return True 
    return False 

## 10.1.2  Binary Search and Exploiting Assumptions

In [2]:
def search(L, e): 
    """Assumes L is a list, the elements of which are in ascending order. 
       Returns True if e is in L and False otherwise""" 
    for i in range(len(L)): 
        if L[i] == e: 
            return True 
        if L[i] > e:  
            return False 
    return False 

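Stopping as soon as an element larger than e is seen improves the average running time, but the worst-case complexity is still at **best linear** in the length of L, since every element may have to be examined. 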

In [3]:
def search(L, e): 
    """Assumes L is a list, the elements of which are in ascending order. 
       Returns True if e is in L and False otherwise""" 
     
    def bSearch(L, e, low, high): 
        #Decrements high - low 
        if high == low: 
            return L[low] == e 
        mid = (low + high)//2 
        if L[mid] == e: 
            return True 
        elif L[mid] > e: 
            if low == mid: #nothing left to search 
                return False 
            else: 
                return bSearch(L, e, low, mid - 1) 
        else: 
            return bSearch(L, e, mid + 1, high) 
         
    if len(L) == 0: 
        return False 
    else: 
        return bSearch(L, e, 0, len(L) - 1) 

The complexity of search is $O(\log(\text{len}(L)))$, because each recursive call roughly halves the value of high - low.
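To make the logarithmic bound concrete, here is a minimal sketch that counts how many elements a binary search examines. It uses an iterative variant rather than the recursive bSearch above, and the name `count_probes` is hypothetical, introduced only for this illustration. Under these assumptions, the number of probes should never exceed $\lfloor \log_2(\text{len}(L)) \rfloor + 1$.

```python
import math

def count_probes(L, e):
    """Assumes L is a list in ascending order.
       Returns (found, probes), where probes is the number of
       elements of L that were compared with e."""
    low, high = 0, len(L) - 1
    probes = 0
    while low <= high:
        mid = (low + high) // 2
        probes += 1
        if L[mid] == e:
            return True, probes
        elif L[mid] > e:
            high = mid - 1  # continue in the left half
        else:
            low = mid + 1   # continue in the right half
    return False, probes

for n in (10, 1000, 1000000):
    L = list(range(n))
    found, probes = count_probes(L, -1)  # -1 is not in L
    bound = math.floor(math.log2(n)) + 1
    print('n =', n, 'probes =', probes, 'bound =', bound)
```

Multiplying the length of the list by 1000 adds only about ten probes, which is what logarithmic growth predicts.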

# 10.2  Sorting Algorithms

We begin with a simple but inefficient algorithm, **selection sort**.  Selection sort works by maintaining the loop invariant that, given a partitioning of the list into a **prefix (L\[0:i\])** and a **suffix (L\[i:len(L)\])**, the prefix is sorted and no element in the prefix is larger than the smallest element in the suffix. 
We use induction to reason about loop invariants. 
* Base case: At the start of the first iteration, the prefix is empty, i.e., the suffix is the entire list.  The invariant is (trivially) true. 
* Induction step: At each step of the algorithm, we move one element from the suffix to the prefix.  We do this by appending a minimum element of the suffix to the end of the prefix.  Because the invariant held before we moved the element, we know that after we append the element the prefix is still sorted.  We also know that since we removed the smallest element in the suffix, no element in the prefix is larger than the smallest element in the suffix. 
* Termination: When the loop is exited, the prefix includes the entire list, and the suffix is empty.  Therefore, the entire list is now sorted in ascending order. 
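The invariant can also be checked mechanically. The following is a minimal sketch, not the chapter's own implementation: `sel_sort_checked` is a hypothetical name, and it selects the minimum with `min` rather than by repeated swapping. It asserts both parts of the invariant at the start of every iteration.

```python
def sel_sort_checked(L):
    """Assumes that L is a list of elements that can be compared.
       Sorts L in ascending order, asserting the loop invariant
       at the start of each iteration."""
    suffix_start = 0
    while suffix_start != len(L):
        prefix, suffix = L[:suffix_start], L[suffix_start:]
        # Invariant, part 1: the prefix is sorted.
        assert all(prefix[k] <= prefix[k + 1]
                   for k in range(len(prefix) - 1))
        # Invariant, part 2: no prefix element is larger than the
        # smallest suffix element (trivially true for an empty prefix).
        assert not prefix or max(prefix) <= min(suffix)
        # Induction step: move a minimum element of the suffix
        # to the end of the prefix.
        m = min(range(suffix_start, len(L)), key=L.__getitem__)
        L[suffix_start], L[m] = L[m], L[suffix_start]
        suffix_start += 1
    return L

print(sel_sort_checked([10, 7, 4, 6, -4, -44]))
```

If either assertion ever failed, the inductive argument would be broken at that iteration.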

In [4]:
def selSort(L): 
    """Assumes that L is a list of elements that can be compared using >. 
       Sorts L in ascending order""" 
    suffixStart = 0 
    while suffixStart != len(L): 
        #look at each element in suffix 
        for i in range(suffixStart, len(L)):
            #print(L) #inspect L after each iteration
            if L[i] < L[suffixStart]: 
                #swap position of elements 
                L[suffixStart], L[i] = L[i], L[suffixStart] 
                #print(L)
        suffixStart += 1 
    return L   
        
#L = [10,7,4,6,-4,-44]
#selSort(L)

The complexity of the entire function is $O(\text{len}(L)^2)$.  That is, it is quadratic in the length of L. 
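One way to see where the quadratic term comes from is to count how often the comparison in the inner loop runs. The sketch below reproduces only the loop structure of selSort. The name `count_comparisons` is hypothetical and the function takes a length rather than a list. For a list of length n, the count is n + (n-1) + ... + 1 = n(n+1)/2.

```python
def count_comparisons(n):
    """Returns the number of times the inner-loop comparison of a
       selSort-style double loop executes on a list of length n."""
    comparisons = 0
    suffix_start = 0
    while suffix_start != n:
        # The inner loop visits every index in the current suffix.
        for i in range(suffix_start, n):
            comparisons += 1
        suffix_start += 1
    return comparisons

for n in (10, 100, 1000):
    print(n, count_comparisons(n), n * (n + 1) // 2)
```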

## 10.2.1  Merge Sort

**Merge sort** is a prototypical **divide-and-conquer** algorithm.  It was invented in 1945 by John von Neumann and is still widely used.  Like many divide-and-conquer algorithms, it is most easily described recursively. 
1.  If the list is of length 0 or 1, it is already sorted. 
2.  If the list has more than one element, split the list into two lists, and use merge sort to sort each of them. 
3.  Merge the results.

Consider, for example, merging the two lists \[1,5,12,18,19,20\] and \[2,3,4,17\]: 


|Left in list 1|        Left in list 2 |   Result|
|:-:|:-:|:-:|
|\[1,5,12,18,19,20\] |  \[2,3,4,17\] |   \[\] |
|\[5,12,18,19,20\]|    \[2,3,4,17\]|    \[1\] |
|\[5,12,18,19,20\]|    \[3,4,17\]|    \[1,2\] |
|\[5,12,18,19,20\]|    \[4,17\] |     \[1,2,3\]|
|\[5,12,18,19,20\]|    \[17\] |     \[1,2,3,4\] |
|\[12,18,19,20\]|     \[17\] |     \[1,2,3,4,5\]| 
|\[18,19,20\] |     \[17\] |     \[1,2,3,4,5,12\]  |
|\[18,19,20\]|      \[\]   |     \[1,2,3,4,5,12,17\] |
|\[\] |         \[\] |         \[1,2,3,4,5,12,17,18,19,20\] |

In [5]:
def merge(left, right, compare): 
    """Assumes left and right are sorted lists and compare defines an ordering on the elements. 
       Returns a new sorted (by compare) list containing the same elements as (left + right) would contain.""" 
     
    result = [] 
    i,j = 0, 0 
    while i < len(left) and j < len(right): 
        if compare(left[i], right[j]): 
            result.append(left[i]) 
            i += 1 
        else: 
            result.append(right[j]) 
            j += 1 
    while (i < len(left)): 
        result.append(left[i]) 
        i += 1 
    while (j < len(right)): 
        result.append(right[j]) 
        j += 1 
    return result 
 
import operator 
 
def mergeSort(L, compare = operator.lt): 
    """Assumes L is a list, compare defines an ordering on elements of L 
       Returns a new sorted list containing the same elements as L""" 
    #print(L)
    if len(L) < 2: 
        return L[:] 
    else: 
        middle = len(L)//2 
        left = mergeSort(L[:middle], compare) 
        right = mergeSort(L[middle:], compare)
        return merge(left, right, compare) 
    
#L = [10,7,4,6,-4,-44]
#mergeSort(L)

## 10.2.2  Exploiting Functions as Parameters

In [6]:
def lastNameFirstName(name1, name2):
    #str.split replaces string.split, which was removed in Python 3
    name1 = name1.split(' ')
    name2 = name2.split(' ')
    if name1[1] != name2[1]:
        return name1[1] < name2[1]
    else: #last names the same, sort by first name
        return name1[0] < name2[0]

def firstNameLastName(name1, name2):
    name1 = name1.split(' ')
    name2 = name2.split(' ')
    if name1[0] != name2[0]:
        return name1[0] < name2[0]
    else: #first names the same, sort by last name
        return name1[1] < name2[1]

L = ['Chris Terman', 'Tom Brady', 'Eric Grimson', 'Gisele Bundchen']
newL = mergeSort(L, lastNameFirstName)
print('Sorted by last name =', newL)
newL = mergeSort(L, firstNameLastName)
print('Sorted by first name =', newL)

## 10.2.3  Sorting in Python
The sorting algorithm used in most Python implementations is called **timsort**. The key idea is to take advantage of the fact that in a lot of data sets the data is already partially sorted. Timsort’s worst-case performance is the same as merge sort’s, but on average it performs considerably better.
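The benefit of partially sorted input can be observed by counting comparisons. The sketch below assumes CPython, where `sorted` compares elements using only `<`. The `Counted` class and `comparisons_used` function are hypothetical helpers written for this illustration, and the exact counts may differ between Python versions.

```python
import random

class Counted(object):
    """Wraps a value and counts every < comparison made on it."""
    comparisons = 0

    def __init__(self, value):
        self.value = value

    def __lt__(self, other):
        Counted.comparisons += 1
        return self.value < other.value

def comparisons_used(values):
    """Returns the number of comparisons sorted() makes on values."""
    Counted.comparisons = 0
    sorted(Counted(v) for v in values)
    return Counted.comparisons

n = 1000
in_order = list(range(n))
shuffled = in_order[:]
random.Random(0).shuffle(shuffled)  # fixed seed for repeatability

print('already sorted:', comparisons_used(in_order))
print('shuffled:', comparisons_used(shuffled))
```

The already sorted input should need far fewer comparisons than the shuffled one, because the algorithm detects the existing run instead of rebuilding it.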

The Python method _list.sort_ takes a list as its first argument and modifies that list.
<br>In contrast, the Python function _sorted_ takes an iterable object (e.g., a list or a dictionary) as its first argument and returns a new sorted list. For a dictionary, the returned list contains the sorted keys. 

In [None]:
L = [3,5,2] 
D = {'a':12, 'c':5, 'b':'dog'} 
print(sorted(L))
print(L)
L.sort() 
print(L)
print(sorted(D)) #sorts the keys of D
D.sort() #raises AttributeError, since dictionaries have no sort method

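Both the list.sort method and the sorted function can have **two additional parameters**. The key parameter supplies a function of one argument that is applied to each element to compute the value used for comparison. The reverse parameter specifies whether the result is in ascending order (the default) or descending order.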

In [None]:
L = [[1,2,3], (3,2,1,0), 'abc'] 
print(sorted(L, key = len, reverse = True))

Both the list.sort method and the sorted function provide **stable sorts**. This means that if two elements are equal with respect to the comparison used in the sort, their relative ordering in the original list (or other iterable object) is preserved in the final list. 
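As a small illustration of stability, consider sorting words by length. The word list below is made up for this example. Words of equal length should appear in the result in the same relative order as in the original list.

```python
words = ['pear', 'fig', 'plum', 'kiwi', 'apple']

# 'pear', 'plum', and 'kiwi' all have length 4, so they are equal
# with respect to key = len and keep their original relative order.
by_length = sorted(words, key = len)
print(by_length)

# list.sort is stable as well; it modifies the list in place.
copy = words[:]
copy.sort(key = len)
print(copy == by_length)
```

Stability is what makes multi-pass sorting work: sorting by a secondary criterion first and then by the primary criterion leaves ties in the primary criterion ordered by the secondary one.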

# 10.3 Hash Tables

In [None]:
class intDict(object): 
    """A dictionary with integer keys""" 
     
    def __init__(self, numBuckets): 
        """Create an empty dictionary""" 
        self.buckets = [] 
        self.numBuckets = numBuckets 
        for i in range(numBuckets): 
            self.buckets.append([]) 
             
    def addEntry(self, dictKey, dictVal): 
        """Assumes dictKey an int.  Adds an entry.""" 
        hashBucket = self.buckets[dictKey%self.numBuckets] 
        for i in range(len(hashBucket)): 
            if hashBucket[i][0] == dictKey: 
                hashBucket[i] = (dictKey, dictVal) 
                return 
        hashBucket.append((dictKey, dictVal)) 
         
    def getValue(self, dictKey): 
        """Assumes dictKey an int.  Returns entry associated 
           with the key dictKey""" 
        hashBucket = self.buckets[dictKey%self.numBuckets] 
        for e in hashBucket: 
            if e[0] == dictKey: 
                return e[1] 
        return None 
     
    def __str__(self): 
        entries = [] 
        for b in self.buckets: 
            for e in b: 
                entries.append(str(e[0]) + ':' + str(e[1])) 
        #join leaves no trailing comma and yields '{}' for an empty dictionary 
        return '{' + ','.join(entries) + '}' 
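To see how keys are spread over buckets, and why collisions are unavoidable when there are more keys than buckets, here is a minimal sketch that applies the same `dictKey % numBuckets` hashing as intDict. The function `bucket_sizes` and the choice of 20 random keys with 17 buckets are assumptions made only for this illustration.

```python
import random

def bucket_sizes(keys, num_buckets):
    """Assumes keys is a list of ints and num_buckets is a positive int.
       Returns a list whose i-th element is the number of keys
       that hash to bucket i under key % num_buckets."""
    sizes = [0] * num_buckets
    for k in keys:
        sizes[k % num_buckets] += 1
    return sizes

rng = random.Random(0)                # fixed seed for repeatability
keys = rng.sample(range(10**5), 20)   # 20 distinct integer keys
sizes = bucket_sizes(keys, 17)

print('bucket sizes:', sizes)
print('largest bucket:', max(sizes))
```

With 20 keys and 17 buckets, at least one bucket must hold two or more entries, so a lookup has to scan that bucket linearly. Using more buckets shortens these scans at the cost of extra space.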