13016213 Data Structures and Algorithms Laboratory


#### Name: Araya Siriadun

#### Student ID: 58090046

Laboratory 9: Maps and Hash Tables
===

## Overview
A **map** is an abstraction that associates *unique* **keys** with **values**. Maps are also commonly known as **dictionaries** or **associative arrays**. Figure 1 illustrates a map from the names of countries to their associated currency units.

<br />
<center>
<img src="figs/fig1.jpg" />
<br />
<b>Figure 1.</b> A map from countries (the keys) to their currency units (the values). <br />Note that the keys are assumed to be unique, but the values are not necessarily unique.
</center>
<br />

Maps use an array-like syntax for indexing, such as `currency['China']` to access the value associated with a given key, or `currency['China'] = 'RMB'` to remap a given key to a new value. Unlike an array, the indices of a map need not be consecutive or even numeric. The map ADT has numerous applications, including the following.

* The Domain Name System (DNS) maps a host name, such as www.ic.kmitl.ac.th, to an Internet Protocol (IP) address, such as 161.246.94.250.

* A social media site typically uses a (non-numeric) username as a key that can be efficiently mapped to a particular user's associated information.

* Python uses a dictionary to map an identifier (variable name), such as `'pi'`, to an associated object, such as `3.14159`.

In this laboratory, we introduce the map ADT, its implementation variants, and applications. 

<hr />

## The Map ADT

Consistent with Python's built-in `dict` class, a ***map M*** supports the following five key behaviors:

* **`M[k]`**: Return the value `v` associated with key `k` in map `M`, if one exists; otherwise raise a `KeyError`.

* **`M[k] = v`**: Associate value `v` with key `k` in map `M`, replacing the existing value if the map already contains an item with key equal to `k`.

* **`del M[k]`**: Remove from map `M` the item with key equal to `k`; if `M` has no such item, then raise a `KeyError`.

* **`len(M)`**: Return the number of items in map `M`. 

* **`iter(M)`**: The default iteration for a map generates a sequence of the keys in the map.
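
Because Python's built-in `dict` already provides these five behaviors, a short session with it can serve as a reference for what any map implementation in this laboratory is expected to do. The variable name `currency` and its contents are illustrative only.

```python
# Illustrative only: the built-in dict exhibits the five map ADT behaviors.
currency = {}                       # an initially empty map

currency['Japan'] = 'Yen'           # M[k] = v : insert a new item
currency['Japan'] = 'JPY'           # M[k] = v : overwrite the existing value
currency['India'] = 'Rupee'

print(currency['India'])            # M[k]     : look up a value by key
print(len(currency))                # len(M)   : number of items

del currency['Japan']               # del M[k] : remove an item
print(list(iter(currency)))         # iter(M)  : iterates over the keys

try:
    currency['Japan']               # the key is gone, so this raises KeyError
except KeyError as err:
    print('KeyError:', err)
```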


### Question 1 [2 marks].
Show the effect of a series of operations on an initially empty map storing items with *country name* as keys and *currency unit* as values. 

|No. |   Operation           |      Return Value      |           Map          |
|:---|:----------------------|:----------------------:|:-----------------------|
|ex. | `len(M)`              |          `0`           |           { }          |
|ex. | `M['Thailand'] = 'Baht'`|          `-`           | {'Thailand': 'Baht' }  |
|1.  | `M['Germany'] = 'Euro'` |         ...            |   ...      |
|2.  | `M['France'] = 'Euro'`  |         ...            |   ...   |
|3.  | `M['Thailand']`       |         ...            |   ...   |
|4.  | `M['China']`          |         ...            |   ...   |
|5.  | `len(M)`              |         ...            |   ...   |
|6.  | `del M['France']`     |         ...            |   ...   |
|7.  | `del M['Sweden']`     |         ...            |   ...   |
|8.  | `for k in M: print(k, M[k])`|         ...            |   ...   |


#### Answer.

|No. |   Operation           |      Return Value      |          Map           |
|:---|:----------------------|:----------------------:|:-----------------------|
|1.  | `M['Germany'] = Euro` |          `-`           | {'Thailand': 'Bath', 'Germany': 'Euro'} |
|2.  | `M['France'] = Euro`  |          `-`           | {'Thailand': 'Bath', 'Germany': 'Euro', 'France': 'Euro'} |
|3.  | `M['Thailand']`       |         'Bath'         | {'Thailand': 'Bath', 'Germany': 'Euro', 'France': 'Euro'} |
|4.  | `M['China']`          |   KeyError: 'China'    | {'Thailand': 'Bath', 'Germany': 'Euro', 'France': 'Euro'} |
|5.  | `len(M)`              |           3            | {'Thailand': 'Bath', 'Germany': 'Euro', 'France': 'Euro'} |
|6.  | `del M['France']`     |          `-`           | {'Thailand': 'Bath', 'Germany': 'Euro'} |
|7.  | `del M['Sweden']`     |   KeyError: 'Sweden'   | {'Thailand': 'Bath', 'Germany': 'Euro'} |
|8.  | `for k in M: print(k, M[k])`|Thailand Bath <br> Germany Euro| {'Thailand': 'Bath', 'Germany': 'Euro'} |

<hr />

## Implementations of Map ADT
To illustrate the trade-offs among a variety of data structures, we will study four different implementations of the map ADT in the remainder of this laboratory. Figure 2 provides an overview of the concrete classes for the map ADT.

<br />
<center>
<img src="figs/fig2.jpg" />
<br />
<b>Figure 2.</b> Our class hierarchy of map types.
</center>
<br />



### MapBase Class

Our `MapBase` class extends the `MutableMapping` abstract base class from Python's `collections.abc` module, and so inherits the many useful methods that `MutableMapping` provides. Here, we define a nonpublic nested `_Item` class, whose instances store a key-value pair. To support equality tests and comparisons based on an item's key, we also implement the `__eq__`, `__ne__`, and `__lt__` special methods. The definition of our `MapBase` class is provided in the following code snippet.

In [1]:
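from collections.abc import MutableMapping   # the `collections.MutableMapping` alias was removed in Python 3.10

class MapBase(MutableMapping):
    """An abstract base class extending MutableMapping."""
    class _Item:
        """Lightweight composite to store key-value pairs as map items."""
        __slots__ = '_key', '_value'         # note the trailing underscores; `__slots` has no effect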
        
        def __init__(self, k, v):
            self._key = k
            self._value = v
            
        def __eq__(self, other):
            return self._key == other._key
        
        def __ne__(self, other):
            return not (self == other)
        
        def __lt__(self, other):
            return self._key < other._key
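
To see what inheriting from `MutableMapping` provides, the following minimal sketch defines only the five required special methods on a small stand-alone class and then calls methods that were never written by hand. The class name `TinyMap` and its list-of-pairs storage are hypothetical and are not part of this laboratory's class hierarchy.

```python
from collections.abc import MutableMapping

class TinyMap(MutableMapping):
    """Hypothetical minimal map backed by a list of [key, value] pairs."""

    def __init__(self):
        self._pairs = []

    def __getitem__(self, k):
        for key, value in self._pairs:
            if key == k:
                return value
        raise KeyError(k)

    def __setitem__(self, k, v):
        for pair in self._pairs:
            if pair[0] == k:
                pair[1] = v                  # overwrite the existing value
                return
        self._pairs.append([k, v])           # key was not present

    def __delitem__(self, k):
        for j, pair in enumerate(self._pairs):
            if pair[0] == k:
                del self._pairs[j]
                return
        raise KeyError(k)

    def __len__(self):
        return len(self._pairs)

    def __iter__(self):
        for key, _ in self._pairs:
            yield key

m = TinyMap()
m['a'] = 1
m['b'] = 2
# None of the following methods were written above; all are inherited.
print('a' in m)            # __contains__, built on __getitem__
print(m.get('z', 0))       # get, with a default for a missing key
print(list(m.items()))     # items, built on __iter__ and __getitem__
```

The same inherited methods become available on every concrete map class in this laboratory once its five special methods are defined.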



### UnsortedTableMap Class

An `UnsortedTableMap` class is the simplest concrete implementation of the map ADT. It relies on ***storing key-value pairs in arbitrary order within a Python list***. The `UnsortedTableMap` class extends our `MapBase` class by providing an initializer method and the five key methods of the map ADT.

In [2]:
class UnsortedTableMap(MapBase):
    """Map implementation using an unsorted list."""
    
    def __init__(self):
        """Create an empty map."""
        self._table = []
        
    def __getitem__(self, k):
        """Return value associated with key k.
        
        Raise KeyError if not found.
        """
        for item in self._table:
            if k == item._key:
                return item._value
        raise KeyError('KeyError: ' + repr(k))
    
    def __setitem__(self, k, v):
        """Assign value v to key k, overwriting existing value if present."""
        for item in self._table:
            if k == item._key:
                item._value = v
                return
        # did not find match for key
        self._table.append(self._Item(k, v))
        
    def __delitem__(self, k):
        """Remove item associated with key k. 
        
        Raise KeyError if not found.
        """
        for j in range(len(self._table)):
            if k == self._table[j]._key:
                self._table.pop(j)
                return
        raise KeyError('KeyError: ' + repr(k))
        
    def __len__(self):
        """Return number of items in the map."""
        return len(self._table)
    
    def __iter__(self):
        """Generate iteration of the map's keys."""
        for item in self._table:
            yield item._key



### Question 2 [2 marks].

Use the `UnsortedTableMap` class to check your answer for Question 1. 

In [3]:
### TODO.Q2

def Map(M):
    d = []
    for k in M: d += ["{}: {}".format(k,M[k])]
    return '{'+str(d)[1:-1]+'}'
def Testcase(M):
    print(type(M).__name__.center(109))
    print("Operation                  |        Return Value        | Map")
    print("len(M)                     |             {}              | {}".format(len(M), Map(M)))
    print("M['Thailand'] = 'Bath'     |            {}            | {}".format(M.__setitem__('Thailand', 'Bath'), Map(M)))
    print("M['Germany']  = 'Euro'     |            {}            | {}".format(M.__setitem__('Germany', 'Euro'), Map(M)))
    print("M['France'] = 'Euro'       |            {}            | {}".format(M.__setitem__('France', 'Euro'), Map(M)))
    print("M['Thailand']              |            {}            | {}".format(M.__getitem__('Thailand'), Map(M)))
    try: M['China']
    except KeyError: print("M['China']                 |      {}     | {}".format("KeyError: 'China'", Map(M)))
    print("len(M)                     |             {}              | {}".format(len(M), Map(M)))
    print("del M['France']            |            {}            | {}".format(M.__delitem__('France'), Map(M)))
    try: del M['Sweden']
    except KeyError: print("del M['Sweden']            |      {}    | {}".format("KeyError: 'Sweden'", Map(M)))
    print("for k in M: print(k, M[k]) |", end = ' ')
    for k in M: print(k, M[k], end = ' ')
    print("| {}\n".format(Map(M)))
Testcase(UnsortedTableMap())

                                               UnsortedTableMap                                              
Operation                  |        Return Value        | Map
len(M)                     |             0              | {}
M['Thailand'] = 'Bath'     |            None            | {'Thailand: Bath'}
M['Germany']  = 'Euro'     |            None            | {'Thailand: Bath', 'Germany: Euro'}
M['France'] = 'Euro'       |            None            | {'Thailand: Bath', 'Germany: Euro', 'France: Euro'}
M['Thailand']              |            Bath            | {'Thailand: Bath', 'Germany: Euro', 'France: Euro'}
M['China']                 |      KeyError: 'China'     | {'Thailand: Bath', 'Germany: Euro', 'France: Euro'}
len(M)                     |             3              | {'Thailand: Bath', 'Germany: Euro', 'France: Euro'}
del M['France']            |            None            | {'Thailand: Bath', 'Germany: Euro'}
del M['Sweden']            |      KeyError: 'Sweden'    | {'Th

<hr />

### Question 3 [2 marks].

What is the worst-case running time of each of the following methods of the `UnsortedTableMap` class?

|  Methods |   Worst-case runtime    | 
|:---------|:------------------------|
|  `__len__` | ....            |
|  `__getitem__` | ....            |
|  `__setitem__` | ....            |
|  `__delitem__` | ....            |


#### Answer.

|  Methods |   Worst-case runtime    | 
|:---------|:------------------------|
|  `__len__` |<center>O(1)</center>|
|  `__getitem__` |<center>O(n)</center>|
|  `__setitem__` |<center>O(n)</center>|
|  `__delitem__` |<center>O(n)</center>|

<hr />

### SortedTableMap Class
The second concrete implementation of our map ADT stores the map's items in an array-based sequence so that they are in increasing order of their keys, assuming the keys have a naturally defined order. As we will see, the advantage of keeping the map's items sorted in an array is that it allows us to use the ***binary search*** algorithm for a variety of fundamental operations of the map ADT.

#### Binary Search
A binary search algorithm is a search algorithm that finds the position of a target value within a sorted array.
It compares the target value to the middle element of the array; if they are unequal, the half in which the target cannot lie is eliminated and the search continues on the remaining half until it is successful (reference: https://en.wikipedia.org/wiki/Binary_search_algorithm).


In [4]:
def binary_search(A, k, low, high):
    """Binary search algorithm.
    
    Return index of the item in A with key equal to k
    
    Return None if no such item qualifies
    
    """
    mid = (low+high)//2
    if low > high:
        return None
    
    if k == A[mid]:
        return mid
    elif k < A[mid]:
        return binary_search(A, k, low, mid-1)
    else:
        return binary_search(A, k, mid+1, high)

In [5]:
def test_binary_search():
    """testing the binary search algorithm."""
    A = [101, 103, 105, 107, 116, 118, 133, 134]
    
    assert binary_search(A, 105, 0, len(A)-1) == 2
    assert binary_search(A, 101, 0, len(A)-1) == 0
    assert binary_search(A, 134, 0, len(A)-1) == len(A)-1
    assert binary_search(A, 118, 0, len(A)-1) == 5
    assert binary_search(A, -9, 0, len(A)-1) is None
    assert binary_search(A, 0, 0, len(A)-1) is None
    
    print(binary_search(A, 116, 0, len(A)-1))
    print(A[binary_search(A, 116, 0, len(A)-1)])
    
test_binary_search()

4
116


<hr />

#### Implementation of SortedTableMap Class

A partial implementation of the `SortedTableMap` class is provided below. The key feature of this class is the `_find_index` utility method, which uses binary search to find the position of an item in the ordered list of the map. By convention, `_find_index` returns the index of the leftmost item in the search interval `[low, high]` whose key is greater than or equal to `k`. Therefore, if the key is present, the method returns the index of the item with that key. If the key is missing, it returns the index of the item in the search interval that lies just beyond where the key would have been located. The method returns index `high + 1` to indicate that no item in the interval has a key greater than or equal to `k`. 

The body of each of the `__getitem__`, `__setitem__`, and `__delitem__` methods begins with a call to `_find_index` to determine a candidate index at which a matching key might be found.
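
The return convention described above matches that of `bisect.bisect_left` from Python's standard library, so that function can be used to check expected indices on a plain sorted list of keys. The list `keys` is made-up sample data, and this sketch is not a substitute for the recursive implementation asked for in Question 4.

```python
from bisect import bisect_left

keys = [101, 103, 105, 107, 116]     # sample sorted keys (illustrative)

# Key present: the index of that key is returned.
print(bisect_left(keys, 105))        # 2
# Key missing: the index of the first key greater than it is returned.
print(bisect_left(keys, 104))        # 2
# Key larger than every key: high + 1, i.e. len(keys), is returned.
print(bisect_left(keys, 200))        # 5
# Key smaller than every key: index 0 is returned.
print(bisect_left(keys, 50))         # 0
```

In each case the caller still has to compare the key at the returned index with `k`, because the index alone does not say whether the key was found.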

In [4]:
class SortedTableMap(MapBase):
    """Map implementation using a sorted table."""
    
    def _find_index(self, k, low, high):
        """Return index of the leftmost item with key greater than or equal to k.
        
        Return high + 1 if no such item qualifies.
        """
        #
        ### TODO.Q4
        # implementation of _find_index method.
        if low > high:
            return high + 1
        mid = (low + high) // 2
        if k == self._table[mid]._key:
            return mid
        elif k < self._table[mid]._key:
            return self._find_index(k, low, mid - 1)
        return self._find_index(k, mid + 1, high)
            
    def __init__(self):
        """Create an empty map."""
        self._table = []
        
    def __len__(self):
        """Return the number of items in the map."""
        return len(self._table)
    
    def __getitem__(self, k):
        """Return value associated with key k
        
        Raise KeyError if not found.
        """        
        j = self._find_index(k, 0, len(self._table)-1)
        
        if j == len(self._table) or self._table[j]._key != k:
            raise KeyError('KeyError: ' + repr(k))
        return self._table[j]._value
    
    def __setitem__(self, k, v):
        """Assign value v to key k, overwriting existing value if present."""
        
        j = self._find_index(k, 0, len(self._table) - 1)
        
        if j < len(self._table) and self._table[j]._key == k:    # overwrite existing item            
            self._table[j]._value = v
            
        else:     # append/insert new item
            if len(self._table) <= j:                    
                self._table.append(self._Item(k, v))
            else:                                        
                self._table.insert(j, self._Item(k, v))

    def __delitem__(self, k):
        """Remove item associated with key k. 
        
        Raise KeyError if not found.
        """
        j = self._find_index(k, 0, len(self._table) - 1)
        if j == len(self._table) or self._table[j]._key != k:
            raise KeyError('KeyError: ' + repr(k))
        self._table.pop(j)
        
    def __iter__(self):
        """Generate iteration of the map's keys. Ordered from minimum to maximum."""
        for item in self._table:
            yield item._key


### Question 4. [2 marks]

Complete the implementation of the `_find_index` method of the `SortedTableMap` class

#### Answer.  See above


<hr />

### Question 5 [2 marks].

Use the `SortedTableMap` class to check your answer for Question 1. 

In [5]:
### TODO.Q5

Testcase(SortedTableMap())

                                                SortedTableMap                                               
Operation                  |        Return Value        | Map
len(M)                     |             0              | {}
M['Thailand'] = 'Bath'     |            None            | {'Thailand: Bath'}
M['Germany']  = 'Euro'     |            None            | {'Germany: Euro', 'Thailand: Bath'}
M['France'] = 'Euro'       |            None            | {'France: Euro', 'Germany: Euro', 'Thailand: Bath'}
M['Thailand']              |            Bath            | {'France: Euro', 'Germany: Euro', 'Thailand: Bath'}
M['China']                 |      KeyError: 'China'     | {'France: Euro', 'Germany: Euro', 'Thailand: Bath'}
len(M)                     |             3              | {'France: Euro', 'Germany: Euro', 'Thailand: Bath'}
del M['France']            |            None            | {'Germany: Euro', 'Thailand: Bath'}
del M['Sweden']            |      KeyError: 'Sweden'    | {'Ge

<hr />

### Question 6 [2 marks].

What is the running time of each of the following methods of the `SortedTableMap` class?

|  Methods |   Running Time    | 
|:---------|:------------------------|
|  `__len__` | ....            |
|  `__getitem__` | ....            |
|  `__setitem__` | ....            |
|  `__delitem__` | ....            |


#### Answer.

|  Methods |   Running Time    | 
|:---------|:------------------------|
|  `__len__` |<center>O(1)</center>|
|  `__getitem__` |<center>O(log n)</center>|
|  `__setitem__` |<center>O(n)</center>|
|  `__delitem__` |<center>O(n)</center>|

<hr />
## Hashing

Hashing is the process of mapping a search key to a limited range of array indices with the goal of providing direct access to items in a collection. The items are stored in an array-based sequence called a ***hash table***. A ***hash function***, associated with the hash table, converts the search keys to specific entries in the hash table.

For example, suppose we have the following set of items (denoted by integer keys):

`765, 431, 96, 142, 579, 226, 903, 388`

and a hash table `T`, containing `M = 13` elements. We can define a simple hash function $h(.)$ that maps the keys to entries in the hash table:

`h(key) = key % M`

To add keys to the hash table, first we apply the hash function to determine the entry in which the given key should be stored. 

```
h(765) = 765 % 13 = 11
h(431) = 431 % 13 = 2
h(96)  = 96  % 13 = 5
h(142) = 142 % 13 = 12
h(579) = 579 % 13 = 7
```

Figure 3 illustrates the insertion of the first five keys into the hash table.

<br />
<center>
<img src="figs/fig3.png" />
<br />
<b>Figure 3.</b> Adding the first five keys to the hash table.
</center>
<br />
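
The five insertions above can be reproduced in a few lines. This is a minimal sketch that assumes no collisions occur, which holds for these five keys only; the names `T`, `M`, and `h` follow the text.

```python
M = 13                              # table size from the example above
T = [None] * M                      # an empty hash table

def h(key):
    """Simple modular hash function h(key) = key % M."""
    return key % M

for key in [765, 431, 96, 142, 579]:
    T[h(key)] = key                 # no collisions occur for these five keys

print(T)
```

Entries `2`, `5`, `7`, `11`, and `12` are filled, and every other entry remains `None`.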



## Collision Resolution

The first five keys were easily added to the table because the resulting indices were unique and the corresponding table entries were empty. Consider what happens when we attempt to add key `226` to the hash table.

```
h(226) = 226 % 13 = 5
```

The hash function maps this key to the entry with index `5`, but that entry is already occupied by the item with key `96`, as shown in Figure 4. The result is a ***collision***, which occurs when two or more keys map to the same hash location. 


<br />
<center>
<img src="figs/fig4.png" />
<br />
<b>Figure 4.</b> A collision occurs when adding key `226`.
</center>
<br />

There are several methods for resolving collisions, such as separate chaining, linear probing, quadratic probing, and double hashing. Here, we will explore two of them: ***separate chaining*** and ***linear probing***.


### Separate Chaining

In ***separate chaining***, the hash table is constructed as an array of secondary containers (e.g., linked lists or maps). Each key is mapped to an index in the usual way, but instead of being stored directly in the array element, the key is inserted into the secondary container referenced from the corresponding entry. 

Figure 5 illustrates a hash table of size 13, storing 10 items with collision resolved by separate chaining.

<br />
<center>
<img src="figs/fig5.jpg" />
<br />
<b>Figure 5.</b> A hash table of size 13, storing 10 items with integer keys, with collisions resolved by separate chaining. The hash function is `h(k) = k mod 13`.
</center>
<br />
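
As a rough sketch of separate chaining, the following code uses a plain Python list as each secondary container and inserts the eight keys from the Hashing section (not the ten keys of Figure 5) into a table of size 13. The function name `chain_insert_all` is hypothetical.

```python
def chain_insert_all(keys, size):
    """Return a table of buckets after inserting every key (separate chaining)."""
    table = [[] for _ in range(size)]     # one empty bucket per table entry
    for key in keys:
        table[key % size].append(key)     # colliding keys share a bucket
    return table

table = chain_insert_all([765, 431, 96, 142, 579, 226, 903, 388], 13)
for index, bucket in enumerate(table):
    if bucket:                            # show only non-empty buckets
        print(index, bucket)
```

Keys `96` and `226` end up in the same bucket at index `5`, and keys `765` and `388` share the bucket at index `11`.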

### Linear Probing

The major disadvantage of separate chaining is that it requires an auxiliary data structure (a map or a list) to hold items with colliding keys. If memory space is limited, we can instead store each item directly in a hash table entry. Schemes that take this approach are collectively referred to as ***open addressing***. Open addressing saves space because no auxiliary data structures are employed, but it requires somewhat more complex handling of collisions. We will explore the simplest variant of open addressing, called ***linear probing***.

In linear probing, if a collision occurs, we examine the table entries in sequential order, starting with the entry immediately following the original hash location. As an example, for key `226` in Figure 4, the linear probe finds slot `6` available, so the key can be stored at that location (see Figure 6).

<br />
<center>
<img src="figs/fig6.png" />
<br />
<b>Figure 6.</b> Resolving a collision for key `226` by linear probing.
</center>
<br />

When key `903` is added, the hash function maps the key to index `6` (`h(903)= 903 % 13 = 6`), but we just added key `226` to this entry. To resolve the collision, we sequentially probe the hash table until we find an empty slot at entry indexed `8` (see Figure 7).


<br />
<center>
<img src="figs/fig7.png" />
<br />
<b>Figure 7.</b> Resolving a collision for key `903` by linear probing: (a) perform a linear probe; (b) the hash table after adding `903`.
</center>
<br />

If the end of the hash table is reached during the linear probe, we wrap around to the first entry and continue until either an available slot is found or all entries have been examined. For example, if we add key `388` to the hash table, the key is mapped to slot `11` (`h(388) = 388 % 13 = 11`), which is occupied by key `765`. The linear probe, as illustrated in Figure 8, requires wrapping around to the beginning of the hash table.


<br />
<center>
<img src="figs/fig8.png" />
<br />
<b>Figure 8.</b> Adding key `388` to the hash table: (a) linear probe requires wrapping around to the beginning of the hash table; (b)  the hash table after adding `388`.
</center>
<br />
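
The probing sequence walked through in Figures 6 to 8 can be sketched as follows. The function name `probe_insert` is hypothetical, and the sketch covers insertion only (no lookups or deletions).

```python
def probe_insert(table, key):
    """Insert key using linear probing; return the index where it was stored."""
    size = len(table)
    start = key % size
    for step in range(size):
        j = (start + step) % size         # wrap around past the last entry
        if table[j] is None:
            table[j] = key
            return j
    raise RuntimeError('hash table is full')

table = [None] * 13
for key in [765, 431, 96, 142, 579, 226, 903, 388]:
    print(key, '->', probe_insert(table, key))
```

Key `226` lands in slot `6`, key `903` in slot `8`, and key `388` wraps around to slot `0`, matching the figures.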

<hr />

### Question 7 [4 marks].

Suppose we have the following set of items (denoted by integer keys):

`26 13 5 37 16 21 15 19 39 48`

and a hash table `T`, containing `M = 11` elements. Assume that the hash function associated with `T` is defined as:

`h(key) = key % M`

**Programmatically** depict the hash table `T` after adding all items.

(a) using linear probing <br />
(b) using separate chaining


In [4]:
### TODO.Q7

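def linear_probing(s, M=1):
    """Return the table after inserting the keys in s with linear probing."""
    assert len(s) <= M, "number of input keys exceeds the length of the hash table"
    T = [None] * M
    for key in s:
        for i in range(M):
            j = (key + i) % M                 # probe sequence wraps around the table
            if T[j] is None:                  # `is None`, so a stored key 0 counts as occupied
                T[j] = key
                break
    return T

def separate_chaining(s, M=1):
    """Return the table after inserting the keys in s with separate chaining."""
    T = [None] * M
    for key in s:
        j = key % M
        if T[j] is None:                      # empty entry: store the key itself
            T[j] = key
        elif isinstance(T[j], int):           # first collision: promote the entry to a list
            T[j] = [T[j], key]
        else:                                 # later collisions: extend the existing list
            T[j].append(key)
    return T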

print("(a)", linear_probing([26, 13, 5, 37, 16, 21, 15, 19, 39, 48], M = 11))
print("(b)", separate_chaining([26, 13, 5, 37, 16, 21, 15, 19, 39, 48], M = 11))

(a) [39, 48, 13, None, 26, 5, 37, 16, 15, 19, 21]
(b) [None, None, 13, None, [26, 37, 15, 48], [5, 16], 39, None, 19, None, 21]


<hr />

## Implementation of Hash based Maps

In this section, we present two implementations of a hash table: one using separate chaining and one using linear probing. 

Because these two implementations have a good deal of functionality in common, we extend the `MapBase` class to define a new `HashMapBase` class that holds the shared code. The `HashMapBase` class presumes the abstract methods `_bucket_getitem(j, k)`, `_bucket_setitem(j, k, v)`, `_bucket_delitem(j, k)`, and `__iter__`, which must be implemented by each concrete subclass (`ChainHashMap` and `ProbeHashMap`).

A full implementation of the `HashMapBase`, `ChainHashMap`, and `ProbeHashMap` is provided below.

In [8]:
from random import randrange

class HashMapBase(MapBase):
    """
    Abstract base class for map using hash-table with 
    MAD (Multiply-Add-and-Divide) compression.
    """
    
    def __init__(self, cap=11, p=109345121):
        """Create an empty hash-table map."""
        self._table = cap * [None]
        self._n = 0                            # number of entries in the map
        self._prime = p                        # prime for MAD compression
        self._scale = 1 + randrange(p - 1)     # scale from 1 to p-1 for MAD
        self._shift = randrange(p)             # shift from 0 to p-1 for MAD
        
    def _hash_function(self, k):
        """Compute hash for key k.
        
        Our hash function consists of two parts: a hash code and a compression function.
        
        - Hash code function (Python built-in hash function):
        The first part of the hash function maps keys k to well distributed integer values.
        
        - Compression function (MAD compression method):
        The second part of the hash function maps integer values i = hash(k) to:
            ((a * i + b) mod p) mod N
            
            where, 
            N is the size of the hash table.                       # self._table
            p is a prime number larger than N                      # self._prime
            a is an integer randomly chosen from integer 1..(p-1)  # self._scale
            b is an integer randomly chosen from integer 0..(p-1)  # self._shift           
            
        """
        return (hash(k) * self._scale + self._shift) % self._prime % len(self._table)
    
    def __len__(self):
        return self._n
    
    def __getitem__(self, k):
        j = self._hash_function(k)
        return self._bucket_getitem(j, k)      # may raise KeyError
    
    def __setitem__(self, k, v):        
        j = self._hash_function(k)        
        self._bucket_setitem(j, k, v)          # subroutine maintains self._n
        if self._n > len(self._table) // 2:    # keep load factor <= 0.5
            self._resize(2 * len(self._table) - 1)
            
    def __delitem__(self, k):
        j = self._hash_function(k)
        self._bucket_delitem(j, k)             # may raise KeyError
        self._n -= 1
        
    def _resize(self, c):                      # resize bucket array to capacity c
        old = list(self.items())
        self._table = c * [None]
        self._n = 0
        for (k, v) in old:
            self[k] = v


class ChainHashMap(HashMapBase):
    """Hash map implemented with separate chaining for collision resolution."""
    
    def _bucket_getitem(self, j, k):
        bucket = self._table[j]
        if bucket is None:
            raise KeyError('KeyError: ' + repr(k)) 
        return bucket[k]
    
    def _bucket_setitem(self, j, k, v):
        if self._table[j] is None:
            self._table[j] = UnsortedTableMap()
        oldsize = len(self._table[j])
        self._table[j][k] = v
        if len(self._table[j]) > oldsize:            # key was new to the table
            self._n += 1                             # increase overall map size            
            
    def _bucket_delitem(self, j, k):
        bucket = self._table[j]
        if bucket is None:
            raise KeyError('KeyError: ' + repr(k))
        del bucket[k]
        
    def __iter__(self):
        for bucket in self._table:
            if bucket is not None:
                for key in bucket:
                    yield key


class ProbeHashMap(HashMapBase):
    """Hash map implemented with linear probing for collision resolution."""
    
    _AVAIL = object()         # sentinel marks locations of previous deletions
    
    def _is_available(self, j):
        """Return True if index j is available in table."""
        return self._table[j] is None or self._table[j] is ProbeHashMap._AVAIL
    
    def _probe_index(self, hashed_k):
        """Return the next index to be probed."""
        return (hashed_k + 1) % len(self._table)
    
    def _find_slot(self, j, k):
        """Search for key k in bucket at index j.
        
        Return (success, index) tuple, described as follows:
        if match was found, success is True and index denotes its location.
        if no match found, success is False and index denotes first available slot.
        """
        
        firstAvail = None
        while True:
            if self._is_available(j):
                if firstAvail is None:
                    firstAvail = j
                if self._table[j] is None:
                    return (False, firstAvail)
                
            elif k == self._table[j]._key:
                return (True, j)
            
            j = self._probe_index(j)
            
    def _bucket_getitem(self, j, k):
        found, s = self._find_slot(j, k)
        if not found:
            raise KeyError('KeyError: ' + repr(k))
        return self._table[s]._value
    
    def _bucket_setitem(self, j, k, v):
        found, s = self._find_slot(j, k)
        if not found:                            # insert new item
            self._table[s] = self._Item(k, v)
            self._n += 1
        else:                                    # overwrite existing
            self._table[s]._value = v
            
    def _bucket_delitem(self, j, k):
        found, s = self._find_slot(j, k)
        if not found:
            raise KeyError('KeyError: ' + repr(k))
        self._table[s] = ProbeHashMap._AVAIL
        
    def __iter__(self):
        for j in range(len(self._table)):
            if not self._is_available(j):
                yield self._table[j]._key
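
The MAD compression step described in the `_hash_function` docstring can also be tried in isolation. In this minimal sketch the function name `mad_compress` is hypothetical, and `a` and `b` are fixed to small values purely for reproducibility, whereas `HashMapBase` draws them at random.

```python
def mad_compress(i, a, b, p, n):
    """Map integer hash code i into range(n) via ((a*i + b) mod p) mod n."""
    return ((a * i + b) % p) % n

p = 109345121       # the default prime used by HashMapBase
n = 11              # the default table capacity used by HashMapBase
a, b = 3, 7         # fixed here for reproducibility; the class draws them at random

# Hand check: (3*42 + 7) % p = 133, and 133 % 11 = 1.
print(mad_compress(42, a, b, p, n))

# Every compressed value is a valid table index, even for negative hash codes.
indices = [mad_compress(hash(k), a, b, p, n) for k in range(-500, 500)]
print(min(indices) >= 0 and max(indices) < n)
```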
                



<hr />

### Question 8 [2 marks].
Use the `ChainHashMap` and `ProbeHashMap` classes to check your answer for Question 1. 

In [10]:
### TODO.Q8
Testcase(ChainHashMap())
Testcase(ProbeHashMap())

<hr />

### Question 9 [2 marks].

What are the running times of the following methods of the `ChainHashMap` and `ProbeHashMap` classes?

|  Methods |   Expected Running Time of <br />`ChainHashMap` and `ProbeHashMap` | Worst case Running Time of <br />`ChainHashMap` and `ProbeHashMap` |
|:---------|:----------------------------------------------------|:----------------------------------------------------|
|  `__len__` |   ...       |   ...       |
|  `__getitem__` | ...     |   ...       |
|  `__setitem__` | ...     |   ...       |
|  `__delitem__` | ...     |   ...       |

#### Answer.

|  Methods |   Expected Running Time of <br />`ChainHashMap` and `ProbeHashMap` | Worst case Running Time of <br />`ChainHashMap` and `ProbeHashMap` |
|:---------|:----------------------------------------------------|:----------------------------------------------------|
|  `__len__` |<center>O(1)</center>|<center>O(1)</center>|
|  `__getitem__` |<center>O(1)</center>|<center>O(n)</center>|
|  `__setitem__` |<center>O(1)</center>|<center>O(n)</center>|
|  `__delitem__` |<center>O(1)</center>|<center>O(n)</center>|

<hr />

## Programming Quiz [6 marks]

Design and implement a simple English-to-Thai dictionary program that translates an English word into Thai. 


* The English-to-Thai dictionary data are provided in a plain-text file named `eng2thai/en2th.dict.txt`, encoded in UTF-8. Each row in the file is a tab-delimited English-to-Thai translation. For example, the first five lines of the data file read

```
safe-conduct   เอกสารอนุญาตให้ผ่านได้โดยเฉพาะช่วงสงคราม
untruthful	 ที่พูดโกหก
lacrimal	   เกี่ยวกับต่อมน้ำตา
mineralogical  เกี่ยวกับการศึกษาแร่
towage	     การพ่วงหรือลาก
```

* The dictionary data file is fairly large (`54492` entries). Make sure that your main data structure can handle dictionary data of this size.

* Here is a sample run of the program:

<br />
<center>
<img src="figs/fig9.jpg" style="border: solid black 1px;"/>
<br />
<b>Figure 9.</b> A sample run of the English-to-Thai dictionary program.
</center>
<br />

In [None]:
### a simple English-to-Thai dictionary program

if __name__ == "__main__":
    M = ProbeHashMap()
    dictionary = "eng2thai/en2th.dict.txt"
    with open(dictionary, encoding="utf8") as file:
        for row in file:
            # each row holds an English word, a tab, and its Thai translation
            fields = row.rstrip('\n').split('\t')
            M[fields[0]] = fields[1]
    print("Loaded dictionary from {} [# of entries = {}]\n\n".format(dictionary, len(M)))
    while True:
        word = input("Please input an English word: ")
        if word == '/':                      # a single '/' ends the session
            break
        try:
            print("\n\t{} => {}\n".format(word, M[word]))
        except KeyError:
            print("\tNo entries found.\n")

Loaded dictionary from eng2thai/en2th.dict.txt [# of entries = 54492]


Please input an English word: jazzy

	jazzy => น่าดึงดูดใจ

Please input an English word: english
	No entries found.

Please input an English word: thailand
	No entries found.

Please input an English word: pick
	No entries found.

Please input an English word: plate

	plate => ชุบ

Please input an English word: serene

	serene => ปลอดโปร่ง

Please input an English word: clean

	clean => ไม่มีสิ่งผิดกฎหมาย เช่น ยาเสพย์ติดหรืออาวุธ

