#### Hash Tables

A hash table is a data structure that maps keys to values for highly efficient lookup. In a good implementation, the average lookup time is O(1). In the worst case, when many keys collide and end up in the same bucket, lookup degrades to O(N).
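To make the gap between the two cases concrete, here is a minimal sketch that counts how many stored keys a lookup has to compare against. The helper names (`build_buckets`, `comparisons_to_find`, `spread_hash`, `constant_hash`) are made up for this illustration, and plain Python lists stand in for the buckets.

```python
def build_buckets(keys, hash_fn, num_buckets=64):
    """Distribute keys into num_buckets lists using hash_fn (illustrative helper)."""
    buckets = [[] for _ in range(num_buckets)]
    for key in keys:
        buckets[hash_fn(key) % num_buckets].append(key)
    return buckets


def comparisons_to_find(buckets, hash_fn, key):
    """Count how many stored keys are compared before key is found in its bucket."""
    bucket = buckets[hash_fn(key) % len(buckets)]
    return bucket.index(key) + 1


def spread_hash(key):
    # Toy hash that spreads keys out: the numeric suffix of "keyN"
    return int(key[3:])


def constant_hash(key):
    # Degenerate hash: every key gets the same hash code, so everything collides
    return 0


keys = [f"key{i}" for i in range(64)]
good = build_buckets(keys, spread_hash)
bad = build_buckets(keys, constant_hash)

print(comparisons_to_find(good, spread_hash, "key63"))   # 1
print(comparisons_to_find(bad, constant_hash, "key63"))  # 64
```

With the spreading hash each bucket holds a single key, so the lookup needs one comparison. With the constant hash all 64 keys share one bucket, and finding the last one means scanning all of them.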

A simple implementation of a hash table can involve an array of linked lists and a hash code function, as follows:

1. Compute the key's hash code (which will usually be an int or a long) using the **hash function**. Note that different keys could have the same hash code, because there are infinitely many possible keys but only finitely many ints.
2. Map the hash code to an index in the array. This could be done with something like `hash(key) % array_max_length`. Different hash codes could map to the same index.
3. At this index, there is a linked list of keys and values. We use a linked list so that all keys that collide at this index can be stored together.
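The three steps above can be sketched in a few lines. This is only an illustration under assumptions: `hash_code`, `bucket_index`, and `ARRAY_LENGTH` are names chosen for the sketch, the hash is a toy sum of character codes, and a plain Python list stands in for each linked list.

```python
ARRAY_LENGTH = 10  # assumed size of the bucket array for this sketch


def hash_code(key: str) -> int:
    # Step 1: toy hash code, the sum of the characters' code points
    return sum(ord(ch) for ch in key)


def bucket_index(key: str) -> int:
    # Step 2: map the hash code to an index in the array
    return hash_code(key) % ARRAY_LENGTH


# Different keys can share a hash code
print(hash_code("ab"), hash_code("ba"))      # 195 195

# Different hash codes can map to the same index
print(hash_code("a"), hash_code("k"))        # 97 107
print(bucket_index("a"), bucket_index("k"))  # 7 7

# Step 3: colliding keys are stored together in the bucket at that index
buckets = [[] for _ in range(ARRAY_LENGTH)]
for key, val in [("a", 1), ("k", 2)]:
    buckets[bucket_index(key)].append((key, val))

print(buckets[7])  # [('a', 1), ('k', 2)]
```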

<img src="assets/hash-table.png" width="400">

An implementation of a hash table is shown below.

In [77]:
class HashTable:
    def __init__(self):
        self.ARRAY_LENGTH = 10
        self.array = [None] * self.ARRAY_LENGTH

    def add(self, key: str, val: float):
        idx = self.__hash_function(key)
        if self.array[idx] is None:
            self.array[idx] = LinkedList()
        self.array[idx].insert(key, val) 
    
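    def get(self, key: str):
        idx = self.__hash_function(key)
        bucket = self.array[idx]
        # An empty bucket (None) means no key has been stored at this index yet
        val = None if bucket is None else bucket.find(key)

        if val is None:
            raise KeyError("Key was not found!")
        return val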
    
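    def __hash_function(self, string: str) -> int:
        hash_val = 0
        for i, ch in enumerate(string):
            # Three-argument pow reduces modulo ARRAY_LENGTH while exponentiating,
            # which gives the same final index without building enormous integers
            hash_val += pow(i + len(string), ord(ch), self.ARRAY_LENGTH)
        # Perform modulus to stay in range of the array length
        return hash_val % self.ARRAY_LENGTH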
    
    def __str__(self):
        s = ""
        for i in range(len(self.array)):
            s += f"{i}: {self.array[i]}\n"
        return s

class LinkedList:
    def __init__(self):
        self.head = None
        self.curr = None
    
    def insert(self, key: str, val: float):
        if self.head is None:
            self.head = self.Node(key, val)
            self.curr = self.head
        else:
            self.curr.next = self.Node(key, val)
            self.curr = self.curr.next
    
    def find(self, key: str):
        # Walk the list; return the matching value, or None if the key is absent
        curr = self.head
        while curr is not None:
            if curr.key == key:
                return curr.val
            curr = curr.next
        return None
    
    def __str__(self):
        s = ""
        if self.head is None:
            return s
        
        curr = self.head
        while True:
            s += f"({curr.key}, {curr.val})"
            
            if curr.next is None:
                return s
            
            s += ", "
            curr = curr.next
    
    class Node: 
        def __init__(self, key: str, val: float):
            self.key = key
            self.val = val
            self.next: "LinkedList.Node | None" = None

In [84]:
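import requests

hash_table = HashTable()

# Download a public word list; each line of the response is one word (as bytes)
response = requests.get("https://www.mit.edu/~ecprice/wordlist.10000")
response.raise_for_status()  # stop early if the download failed
words = response.content.splitlines()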

for i, word in enumerate(words[:100]):
    # Convert from byte to string and store in hash table
    hash_table.add(word.decode("utf-8"), i)

print(hash_table)
print(hash_table.get("accountability"))

0: (ability, 9), (absolute, 20), (accommodate, 48), (accounts, 61), (achievement, 73), (acquisition, 85), (acrylic, 91)
1: (a, 0), (ab, 4), (abandoned, 5), (aboriginal, 11), (accent, 32), (accept, 33), (access, 39), (accessible, 42), (accessory, 45), (accidents, 47), (accommodation, 49), (accomplish, 53), (accounting, 60), (accurately, 66)
2: (aaa, 2), (able, 10), (absence, 18), (academic, 28), (acc, 31), (accepts, 38), (accessed, 40), (accompanied, 51), (accordingly, 57), (accused, 67), (ace, 69), (acm, 80), (acne, 81), (acre, 87), (act, 92), (actions, 95)
3: (aaron, 3), (about, 13), (absorption, 22), (abstracts, 24), (acceptable, 34), (accepting, 37), (accessibility, 41), (accordance, 55), (accredited, 63), (acdbentity, 68), (acids, 77), (across, 90)
4: (abc, 6), (abs, 17), (abu, 25), (academy, 30), (accepted, 36), (accomplished, 54), (account, 58), (accuracy, 64), (acer, 70), (acknowledge, 78), (acrobat, 89)
5: (aa, 1), (absolutely, 21), (achieving, 75), (acting, 93), (action, 94), 

In [14]:
class Test:
    def __init__(self):
        self.x = 10

In [16]:
test = Test()
test.x = 12
test.x

12