#### 1. Set up a random experiment to test the difference between a sequential search and a binary search on a list of integers.


In [37]:
import timeit

def seqSearch(alist, item):
    for n in alist:
        if n == item:
            return True
    return False

def binSearchR(alist, item):
    if len(alist) == 0:
        return False
    midpoint = len(alist) // 2
    if alist[midpoint] == item:
        return True
    else:
        if alist[midpoint] < item:
            return binSearchR(alist[midpoint+1:], item)
        else:
            return binSearchR(alist[:midpoint], item)

def main():
    sequential = timeit.Timer("seqSearch(alist, item)", "from __main__ import seqSearch, alist, item")
    binary = timeit.Timer("binSearchR(alist, item)", "from __main__ import binSearchR, alist, item")  
    seq_t = sequential.timeit(number=100000)
    bin_t = binary.timeit(number=100000)
    print("Time for 100,000 sequential searches: %.5fs." % seq_t)
    print("Time for 100,000 binary searches: %.5fs." % bin_t)

if __name__ == '__main__':
    alist = [0, 1, 2, 8, 13, 17, 19, 32, 42]
    item = 13
    main()

Time for 100,000 sequential searches: 0.03462s.
Time for 100,000 binary searches: 0.03692s.
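
On a nine-item list the two searches are nearly indistinguishable, so a randomized experiment over larger lists may be more informative. The following is a minimal sketch under stated assumptions: the names `seq_search`, `bin_search`, and `run_experiment`, the list sizes, the number of trials, and the fixed seed are all illustrative choices, and the binary search is written iteratively so that slicing cost does not enter the comparison.

```python
import random
import timeit

def seq_search(alist, item):
    # Scan every element until the item is found
    for n in alist:
        if n == item:
            return True
    return False

def bin_search(alist, item):
    # Iterative binary search on a sorted list, using indices only
    first, last = 0, len(alist) - 1
    while first <= last:
        midpoint = (first + last) // 2
        if alist[midpoint] == item:
            return True
        if alist[midpoint] < item:
            first = midpoint + 1
        else:
            last = midpoint - 1
    return False

def run_experiment(size, trials=200, seed=0):
    # Build a random sorted list and a random set of targets
    # (some present, some absent), then time both searches on them.
    rng = random.Random(seed)
    alist = sorted(rng.sample(range(size * 10), size))
    targets = [rng.randrange(size * 10) for _ in range(trials)]
    seq_t = timeit.timeit(lambda: [seq_search(alist, t) for t in targets], number=1)
    bin_t = timeit.timeit(lambda: [bin_search(alist, t) for t in targets], number=1)
    return seq_t, bin_t

for size in (100, 1000, 10000):
    seq_t, bin_t = run_experiment(size)
    print("n=%6d  sequential: %.5fs  binary: %.5fs" % (size, seq_t, bin_t))
```

The exact timings depend on the machine, so none are quoted here; the point of the sketch is that the same random targets are used for both searches at each list size.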


#### 2. Use the binary search functions given in the text (recursive and iterative). Generate a random, ordered list of integers and do a benchmark analysis for each one. What are your results? Can you explain them?


In [38]:
import timeit

def binSearchR(alist, item):
    if len(alist) == 0:
        return False
    midpoint = len(alist) // 2
    if alist[midpoint] == item:
        return True
    else:
        if alist[midpoint] < item:
            return binSearchR(alist[midpoint+1:], item)
        else:
            return binSearchR(alist[:midpoint], item)

def binSearchI(alist, item):
    first = 0
    last = len(alist)-1
    found = False
    while first <= last and not found:
        midpoint = (first+last)//2
        if alist[midpoint] == item:
            found = True
        else:
            if alist[midpoint] < item:
                first = midpoint + 1
            else:
                last = midpoint - 1
    return found

def main():
    recursive = timeit.Timer("binSearchR(alist, item)", "from __main__ import binSearchR, alist, item")
    iterative = timeit.Timer("binSearchI(alist, item)", "from __main__ import binSearchI, alist, item")
    rec_t = recursive.timeit(number=100000)
    itr_t = iterative.timeit(number=100000)
    print("Time for 100,000 recursive binary searches: %.5fs." % rec_t)
    print("Time for 100,000 iterative binary searches: %.5fs." % itr_t)

if __name__ == '__main__':
    alist = [0, 1, 2, 8, 13, 17, 19, 32, 42]
    item = 17
    main()

Time for 100,000 recursive binary searches: 0.15765s.
Time for 100,000 iterative binary searches: 0.05687s.


The recursive version is slower because each call uses the slice operator to build the left or right half of the list before passing it to the next invocation. The usual O(log n) analysis of binary search assumes that producing the sublist takes constant time, but slicing k items from a Python list is O(k), since the items are copied into a new list. The slice-based binary search therefore does not run in strict logarithmic time. Recursive calls also carry function-call overhead that a loop avoids. The slicing cost can be remedied by passing the list along with the starting and ending indices instead of a copy.
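
The effect of slicing is hard to see on a nine-item list, so the sketch below times both approaches on random sorted lists of increasing size. It is an illustration under assumptions, not the text's benchmark: the function names `bin_search_slice`, `bin_search_index`, and `time_searches`, the sizes, and the repeat count are hypothetical choices. The target is the first item of the list so that both searches run close to their full depth.

```python
import random
import timeit

def bin_search_slice(alist, item):
    # Recursive binary search; each call copies half of the list
    if len(alist) == 0:
        return False
    midpoint = len(alist) // 2
    if alist[midpoint] == item:
        return True
    if alist[midpoint] < item:
        return bin_search_slice(alist[midpoint + 1:], item)
    return bin_search_slice(alist[:midpoint], item)

def bin_search_index(alist, item):
    # Iterative binary search; only the index bounds change
    first, last = 0, len(alist) - 1
    while first <= last:
        midpoint = (first + last) // 2
        if alist[midpoint] == item:
            return True
        if alist[midpoint] < item:
            first = midpoint + 1
        else:
            last = midpoint - 1
    return False

def time_searches(size, repeats=200, seed=0):
    # Random, ordered list of distinct integers
    rng = random.Random(seed)
    alist = sorted(rng.sample(range(size * 10), size))
    item = alist[0]
    slice_t = timeit.timeit(lambda: bin_search_slice(alist, item), number=repeats)
    index_t = timeit.timeit(lambda: bin_search_index(alist, item), number=repeats)
    return slice_t, index_t

for size in (1000, 10000, 100000):
    slice_t, index_t = time_searches(size)
    print("n=%6d  slicing: %.5fs  indices: %.5fs" % (size, slice_t, index_t))
```

If the O(k) cost of slicing dominates, the slicing column should grow roughly in proportion to the list size while the index column grows far more slowly.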

#### 3. Implement the binary search using recursion without the slice operator. Recall that you will need to pass the list along with the starting and ending index values for the sublist. Generate a random, ordered list of integers and do a benchmark analysis.


In [45]:
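def binSearchR2(alist, item, first=0, last=None):
    # Default arguments are evaluated once, when the function is defined,
    # so the upper bound is computed inside the call instead.
    if last is None:
        last = len(alist) - 1
    if first > last:    # empty range: the item is not in the list
        return False
    midpoint = (first+last) // 2
    if alist[midpoint] == item:
        return True
    else:
        if alist[midpoint] < item:
            return binSearchR2(alist, item, midpoint + 1, last)
        else:
            return binSearchR2(alist, item, first, midpoint - 1)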

def main():
    recursive2 = timeit.Timer("binSearchR2(alist, item)", "from __main__ import binSearchR2, alist, item")
    iterative = timeit.Timer("binSearchI(alist, item)", "from __main__ import binSearchI, alist, item")
    rec_t2 = recursive2.timeit(number=100000)
    itr_t = iterative.timeit(number=100000)
    print("Time for 100,000 recursive binary searches: %.5fs." % rec_t2)
    print("Time for 100,000 iterative binary searches: %.5fs." % itr_t)

if __name__ == '__main__':
    alist = [0, 1, 2, 8, 13, 17, 19, 32, 42]
    item = 17
    main()

Time for 100,000 recursive binary searches: 0.09288s.
Time for 100,000 iterative binary searches: 0.05853s.


#### 4. Implement the len method (__len__) for the hash table Map ADT implementation.
#### 5. Implement the in method (__contains__) for the hash table Map ADT implementation.


In [51]:
class HashTable:
    def __init__(self):
        self.size = 11
        self.slots = [None] * self.size
        self.data = [None] * self.size
    
    def put(self, key, data):
        hashvalue = self.hashfunction(key, len(self.slots))
        
        if self.slots[hashvalue] is None:
            # Assignment (=), not comparison (==), stores the pair
            self.slots[hashvalue] = key
            self.data[hashvalue] = data
        else:
            if self.slots[hashvalue] == key:
                self.data[hashvalue] = data    # replace
            else:
                nextslot = self.rehash(hashvalue, len(self.slots))
                while self.slots[nextslot] is not None and self.slots[nextslot] != key:
                    nextslot = self.rehash(nextslot, len(self.slots))

                if self.slots[nextslot] is None:
                    self.slots[nextslot] = key
                    self.data[nextslot] = data
                else:
                    self.data[nextslot] = data    # replace

    def hashfunction(self, key, size):
        return key % size

    def rehash(self, oldhash, size):
        return (oldhash+1)%size

    def get(self, key):
        startslot = self.hashfunction(key, len(self.slots))
        data = None
        stop = False
        found = False
        position = startslot
        while self.slots[position] is not None and not found and not stop:
            if self.slots[position] == key:
                found = True
                data = self.data[position]
            else:
                position = self.rehash(position, len(self.slots))
                if position == startslot:
                    stop = True
        return data

    def __getitem__(self, key):
        return self.get(key)

    def __setitem__(self, key, data):
        self.put(key, data)

    def __len__(self):
        # Number of stored key-value pairs, not the number of slots
        return sum(1 for key in self.slots if key is not None)

    def __contains__(self, key):
        # Same probe sequence as get(), but reports presence only
        startslot = self.hashfunction(key, self.size)
        stop = False
        found = False
        position = startslot
        while self.slots[position] is not None and not found and not stop:
            if self.slots[position] == key:
                found = True
            else:
                position = self.rehash(position, self.size)
                if position == startslot:
                    stop = True
        return found

def main():
    H=HashTable()
    H[54]="cat"
    H[26]="dog"
    H[93]="lion"
    H[17]="tiger"
    H[77]="bird"
    H[31]="cow"
    H[44]="goat"
    H[55]="pig"
    H[20]="chicken"
    print(H.slots)
    print(H.data)

    print(H[20])

    print(H[17])
    H[20]='duck'
    print(H[20])
    print(H[99])

if __name__ == '__main__':
    main()

[77, 44, 55, 20, 26, 93, 17, None, None, 31, 54]
['bird', 'goat', 'pig', 'chicken', 'dog', 'lion', 'tiger', None, None, 'cow', 'cat']
chicken
tiger
duck
None


In [52]:
class HashTable:
    def __init__(self):
        self.size = 11
        self.slots = [None] * self.size
        self.data = [None] * self.size

    def put(self, key, data):
        hashvalue = self.hashfunction(key, len(self.slots))

        if self.slots[hashvalue] == None:
            self.slots[hashvalue] = key
            self.data[hashvalue] = data
        else:
            if self.slots[hashvalue] == key:
                self.data[hashvalue] = data  # replace
            else:
                nextslot = self.rehash(hashvalue, len(self.slots))
                while self.slots[nextslot] != None and \
                        self.slots[nextslot] != key:
                    nextslot = self.rehash(nextslot, len(self.slots))

                if self.slots[nextslot] == None:
                    self.slots[nextslot] = key
                    self.data[nextslot] = data
                else:
                    self.data[nextslot] = data  # replace

    def hashfunction(self, key, size):
        return key % size

    def rehash(self, oldhash, size):
        return (oldhash + 1) % size

    def get(self, key):
        startslot = self.hashfunction(key, len(self.slots))

        data = None
        stop = False
        found = False
        position = startslot
        while self.slots[position] != None and \
                not found and not stop:
            if self.slots[position] == key:
                found = True
                data = self.data[position]
            else:
                position = self.rehash(position, len(self.slots))
                if position == startslot:
                    stop = True
        return data

    def __getitem__(self,key):
        return self.get(key)

    def __setitem__(self,key,data):
        self.put(key,data)

H=HashTable()
H[54]="cat"
H[26]="dog"
H[93]="lion"
H[17]="tiger"
H[77]="bird"
H[31]="cow"
H[44]="goat"
H[55]="pig"
H[20]="chicken"
print(H.slots)
print(H.data)

print(H[20])

print(H[17])
H[20]='duck'
print(H[20])
print(H[99])


[77, 44, 55, 20, 26, 93, 17, None, None, 31, 54]
['bird', 'goat', 'pig', 'chicken', 'dog', 'lion', 'tiger', None, None, 'cow', 'cat']
chicken
tiger
duck
None


#### 6. How can you delete items from a hash table that uses chaining for collision resolution? How about if open addressing is used? What are the special circumstances that must be handled? Implement the del method for the HashTable class.
#### 7. In the hash table map implementation, the hash table size was chosen to be 101. If the table gets full, this needs to be increased. Re-implement the put method so that the table will automatically resize itself when the loading factor reaches a predetermined value (you can decide the value based on your assessment of load versus performance).
#### 8. Implement quadratic probing as a rehash technique.
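
For the quadratic probing exercise, one possible sketch is a small standalone table rather than a change to the `HashTable` class above. The class name `QuadraticHashTable` and the `probe` helper are hypothetical names chosen for illustration. The probe sequence assumed here is h, h+1, h+4, h+9, and so on, taken modulo the table size.

```python
class QuadraticHashTable:
    """Minimal open-addressing map using quadratic probing (illustrative)."""

    def __init__(self, size=11):
        self.size = size
        self.slots = [None] * size
        self.data = [None] * size

    def hashfunction(self, key):
        return key % self.size

    def probe(self, start, attempt):
        # Attempt 0 is the home slot; attempt i is offset by i squared
        return (start + attempt * attempt) % self.size

    def put(self, key, data):
        start = self.hashfunction(key)
        for attempt in range(self.size):
            pos = self.probe(start, attempt)
            if self.slots[pos] is None or self.slots[pos] == key:
                self.slots[pos] = key
                self.data[pos] = data    # insert or replace
                return
        raise RuntimeError("no free slot found along the probe sequence")

    def get(self, key):
        start = self.hashfunction(key)
        for attempt in range(self.size):
            pos = self.probe(start, attempt)
            if self.slots[pos] is None:
                return None              # reached an empty slot: key absent
            if self.slots[pos] == key:
                return self.data[pos]
        return None

table = QuadraticHashTable()
for key, value in [(54, "cat"), (26, "dog"), (93, "lion"), (17, "tiger"),
                   (77, "bird"), (31, "cow"), (44, "goat"), (55, "pig"),
                   (20, "chicken")]:
    table.put(key, value)
print(table.slots)   # [77, 44, 20, 55, 26, 93, 17, None, None, 31, 54]
print(table.get(20)) # chicken
```

With a prime table size, a quadratic sequence reaches only about half of the slots, so `put` can fail before the table is completely full. That is why this sketch raises an error instead of looping forever.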
#### 9. Using a random number generator, create a list of 500 integers. Perform a benchmark analysis using some of the sorting algorithms from this chapter. What is the difference in execution speed?
#### 10. Implement the bubble sort using simultaneous assignment.
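
For the bubble sort exercise, a minimal sketch could look like the following. The function name `bubble_sort` and the sample list are illustrative. The only change from a bubble sort with a temporary variable is that the swap is written as a single tuple assignment.

```python
def bubble_sort(alist):
    """Sort alist in place with bubble sort and return it."""
    for passnum in range(len(alist) - 1, 0, -1):
        for i in range(passnum):
            if alist[i] > alist[i + 1]:
                # Simultaneous assignment swaps without a temporary variable
                alist[i], alist[i + 1] = alist[i + 1], alist[i]
    return alist

values = [54, 26, 93, 17, 77, 31, 44, 55, 20]
bubble_sort(values)
print(values)   # [17, 20, 26, 31, 44, 54, 55, 77, 93]
```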
#### 11. A bubble sort can be modified to “bubble” in both directions. The first pass moves “up” the list, and the second pass moves “down.” This alternating pattern continues until no more passes are necessary. Implement this variation and describe under what circumstances it might be appropriate.
#### 12. Implement the selection sort using simultaneous assignment.
#### 13. Perform a benchmark analysis for a shell sort, using different increment sets on the same list.
#### 14. Implement the mergeSort function without using the slice operator.
#### 15. One way to improve the quick sort is to use an insertion sort on lists that have a small length (call it the “partition limit”). Why does this make sense? Re-implement the quick sort and use it to sort a random list of integers. Perform an analysis using different list sizes for the partition limit.
#### 16. Implement the median-of-three method for selecting a pivot value as a modification to quickSort. Run an experiment to compare the two techniques.
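
For the median-of-three exercise, one way to sketch the pivot selection is shown below. The names `median_of_three`, `partition`, and `quick_sort` are illustrative and this is not necessarily how the text's `quickSort` is organized. The assumption is that the pivot is the median of the first, middle, and last items of the current range, moved to the front before an otherwise ordinary partition. The comparison experiment itself is not included.

```python
import random

def median_of_three(alist, first, last):
    """Return the index of the median of the first, middle, and last items."""
    mid = (first + last) // 2
    candidates = [(alist[first], first), (alist[mid], mid), (alist[last], last)]
    candidates.sort()
    return candidates[1][1]

def partition(alist, first, last):
    # Move the median-of-three item to the front and use it as the pivot
    pivot_index = median_of_three(alist, first, last)
    alist[first], alist[pivot_index] = alist[pivot_index], alist[first]
    pivot = alist[first]
    left, right = first + 1, last
    while True:
        while left <= right and alist[left] <= pivot:
            left += 1
        while right >= left and alist[right] >= pivot:
            right -= 1
        if right < left:
            break
        alist[left], alist[right] = alist[right], alist[left]
    # Place the pivot at its final position
    alist[first], alist[right] = alist[right], alist[first]
    return right

def quick_sort(alist, first=0, last=None):
    """Sort alist in place with quick sort and return it."""
    if last is None:
        last = len(alist) - 1
    if first < last:
        split = partition(alist, first, last)
        quick_sort(alist, first, split - 1)
        quick_sort(alist, split + 1, last)
    return alist

data = random.sample(range(10000), 500)
print(quick_sort(list(data)) == sorted(data))   # True
```

On an already sorted list, taking the first item as the pivot produces maximally unbalanced partitions, whereas the median-of-three choice picks the middle item and splits the range evenly.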