# Hash Tables
A data structure that stores unordered items by mapping each item to a location in an array (or vector); each location is known as a bucket.\
A hash function maps an item's key to a bucket index. A simple one applies the modulo operator to the key and the size of the table. The key should be unique to that item.\
Hash Function: key % table_size = bucket index
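As a small illustration of the formula above, here is a minimal sketch for integer keys. The function name `modulo_hash` and the table size of 11 are assumptions chosen for the example.

```python
def modulo_hash(key: int, table_size: int) -> int:
    """Map an integer key to a bucket index in the range [0, table_size)."""
    return key % table_size


table_size = 11  # assumed size for illustration
for key in (8, 19, 54):
    # 8 and 19 land in the same bucket (a collision); 54 lands in bucket 10
    print(key, "->", modulo_hash(key, table_size))
```

Keys 8 and 19 both map to bucket 8 here, which is the kind of collision the later sections deal with.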
## Allocation
Hash tables are often allocated with a prime number of buckets. When resizing, the new size is typically the next prime number >= N * 2, where N is the current table size. A new array is allocated and all items from the old array are re-inserted into the new array. Resize complexity = O(n)
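The resize rule can be sketched as follows. This is an illustration only: the helper names (`is_prime`, `next_table_size`, `resize`) are hypothetical, the buckets are plain Python lists of integer keys, and the hash is the simple modulo function.

```python
def is_prime(n: int) -> bool:
    """Trial-division primality check; fine for small table sizes."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True


def next_table_size(current_size: int) -> int:
    """Return the next prime >= current_size * 2."""
    candidate = current_size * 2
    while not is_prime(candidate):
        candidate += 1
    return candidate


def resize(old_table: list) -> list:
    """Allocate a larger table and re-insert every key: O(n) overall."""
    new_size = next_table_size(len(old_table))
    new_table = [[] for _ in range(new_size)]
    for bucket in old_table:
        for key in bucket:
            # Bucket indices change because table_size changed
            new_table[key % new_size].append(key)
    return new_table
```

Every key has to be rehashed because the bucket index depends on the table size, which is why resizing is O(n).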
### Load Factor
The load factor is (num items / table_size).
### When to resize
- Load factor: the table is resized when the load factor exceeds a chosen threshold.
- Open Addressing: a threshold number of collisions during an insertion is reached
- Chaining: the size of a bucket's linked list passes a threshold
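A minimal sketch of the load-factor check follows. The threshold of 0.75 and the names `load_factor` and `needs_resize` are assumptions for illustration; the notes do not prescribe a specific value.

```python
MAX_LOAD_FACTOR = 0.75  # assumed threshold, not taken from the notes


def load_factor(num_items: int, table_size: int) -> float:
    """Load factor = number of stored items / number of buckets."""
    return num_items / table_size


def needs_resize(num_items: int, table_size: int,
                 max_load_factor: float = MAX_LOAD_FACTOR) -> bool:
    """True when the table is fuller than the chosen threshold."""
    return load_factor(num_items, table_size) > max_load_factor


print(needs_resize(8, 11))  # 8/11 is about 0.73, below the threshold
print(needs_resize(9, 11))  # 9/11 is about 0.82, above the threshold
```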

## Collisions
Occur when an item is being inserted into a bucket that is already in use. Two common techniques for handling them are Chaining and Open Addressing. When searching for an item, the keys are compared to be sure the right item within the bucket has been found.
### Chaining
Each bucket holds a list of items
### Open Addressing
Each bucket holds at most one item. On a collision, the item is stored in *another* empty bucket found by probing the table.
### Linear Probing
When a collision occurs, go to the *next* available bucket, wrapping around at the end of the table. A search probes the key's bucket index first and then every bucket after it until the item is found, an empty bucket is reached, or all N buckets have been probed.
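A minimal sketch of linear probing for integer keys, assuming a modulo hash and no removals (supporting removal would need an extra "empty after removal" marker, which is left out here). The names `lp_insert` and `lp_search` are hypothetical.

```python
EMPTY = None  # marker for a bucket that has never held a key


def lp_insert(table: list, key: int) -> int:
    """Insert key; return the bucket index used, or -1 if the table is full."""
    n = len(table)
    start = key % n
    for i in range(n):
        idx = (start + i) % n  # next bucket, wrapping around
        if table[idx] is EMPTY or table[idx] == key:
            table[idx] = key
            return idx
    return -1


def lp_search(table: list, key: int) -> int:
    """Return the bucket index holding key, or -1 if it is absent."""
    n = len(table)
    start = key % n
    for i in range(n):
        idx = (start + i) % n
        if table[idx] is EMPTY:
            return -1  # an empty bucket ends the probe sequence
        if table[idx] == key:
            return idx
    return -1  # probed all N buckets


table = [EMPTY] * 11
for key in (8, 19, 30, 41):  # all four keys hash to bucket 8
    print(key, "stored at", lp_insert(table, key))
```

With a table size of 11, the four keys end up in buckets 8, 9, 10 and 0, which shows the wrap-around at the end of the array.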
### Quadratic Probing
Similar to Linear Probing, only rather than *every* subsequent bucket, the probe order is determined quadratically. The formula: (H + c1 * i + c2 * i^2) % (table_size) determines the bucket index.\
H = hash(item.key) | c1 & c2 = user-defined constants | i = probe number: starts at 0 and increases by 1 each time the probed bucket is occupied
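To make the formula concrete, this sketch lists the first few buckets that would be probed for one key. The constants c1 = 1 and c2 = 1, the modulo hash, and the function name are assumptions for illustration.

```python
def quadratic_probe_sequence(key: int, table_size: int, num_probes: int,
                             c1: int = 1, c2: int = 1) -> list:
    """Bucket indices visited for i = 0, 1, ..., num_probes - 1."""
    h = key % table_size  # H: the key's initial hash
    return [(h + c1 * i + c2 * i * i) % table_size
            for i in range(num_probes)]


print(quadratic_probe_sequence(8, 11, 4))  # [8, 10, 3, 9]
```

The gaps between probed buckets grow with each step, which spreads colliding keys further apart than linear probing does.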
### Double Hashing
An Open Addressing technique that uses 2 different hash functions to find the bucket index. The formula: (h1(key) + i * h2(key)) % (table_size).\
h1 & h2 are different hash functions | i = probe number: starts at 0 and increases by 1 each time the probed bucket is occupied
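A sketch of the probe sequence under double hashing. The two hash functions below are assumptions chosen for the example: `h1` is the modulo hash and `h2` is `7 - (key % 7)`, which never returns 0 and so always moves to a different bucket.

```python
def h1(key: int, table_size: int) -> int:
    """Primary hash: picks the starting bucket."""
    return key % table_size


def h2(key: int) -> int:
    """Secondary hash (assumed): step size in 1..7, never 0."""
    return 7 - (key % 7)


def double_hash_sequence(key: int, table_size: int, num_probes: int) -> list:
    """Bucket indices visited for i = 0, 1, ..., num_probes - 1."""
    return [(h1(key, table_size) + i * h2(key)) % table_size
            for i in range(num_probes)]


print(double_hash_sequence(8, 11, 4))   # [8, 3, 9, 4]
print(double_hash_sequence(19, 11, 4))  # [8, 10, 1, 3]
```

Keys 8 and 19 start at the same bucket but follow different sequences afterwards, because the step size depends on the key.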

## Common Hash Functions
### Modulo
item_key % table_size
### Mid-square base 10
Squares the key, extracts R digits from the middle of the squared value, and returns the extracted digits % table_size. For a table of N buckets, R must be >= ceil(log10(N)).
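A minimal sketch of the base-10 variant, working on the decimal string of the squared key. The function name and the choice of which digits count as "the middle" when the split is uneven are assumptions.

```python
def mid_square_base10(key: int, table_size: int, r: int) -> int:
    """Square the key, take r middle decimal digits, reduce modulo table_size."""
    squared = str(key * key)
    # Index of the first middle digit; 0 if the square has fewer than r digits
    start = max((len(squared) - r) // 2, 0)
    middle = squared[start:start + r]
    return int(middle) % table_size


# 453 * 453 = 205209; the middle two digits are "52"
print(mid_square_base10(453, 100, 2))  # 52
```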
### Mid-square base 2
Squares the key, converts the squared value to binary, extracts R bits from the middle, and returns the extracted bits % table_size. For a table of N buckets, R must be >= ceil(log2(N)).
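The base-2 variant can be sketched with shifts and masks instead of string slicing. This version centres the extraction on the actual bit length of the squared key, which is an assumption; implementations that assume a fixed word size (for example 32 bits) will pick different bits.

```python
def mid_square_base2(key: int, table_size: int, r: int) -> int:
    """Square the key, take r middle bits, reduce modulo table_size."""
    squared = key * key
    # Number of low-order bits to discard so the r kept bits sit in the middle
    low_bits = max((squared.bit_length() - r) // 2, 0)
    middle = (squared >> low_bits) & ((1 << r) - 1)
    return middle % table_size


# 453 * 453 = 205209 = 0b110010000110011001 (18 bits).
# Dropping the 7 lowest bits and keeping the next 4 gives 0b0011 = 3.
print(mid_square_base2(453, 11, 4))  # 3
```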
### Multiplicative String
Repeatedly multiplies hash value and adds ASCII value of each char in the string. Multiplier is often a prime number. Finally returns result % table_size.\
Loops by string size: stringHash = (stringHash * multiplier) + strChar | Then: stringHash % table_size
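The loop described above might look like the following sketch. The initial value of 0 and the multiplier of 31 are assumptions for illustration; other implementations choose different constants.

```python
def multiplicative_string_hash(text: str, table_size: int,
                               initial_value: int = 0,
                               multiplier: int = 31) -> int:
    """Multiply-and-add over each character, then reduce modulo table_size."""
    string_hash = initial_value
    for ch in text:
        # ord() gives the character's code point (its ASCII value for ASCII text)
        string_hash = (string_hash * multiplier) + ord(ch)
    return string_hash % table_size


# "ab": (0 * 31 + 97) = 97, then (97 * 31 + 98) = 3105, and 3105 % 11 = 3
print(multiplicative_string_hash("ab", 11))  # 3
```

Because each step multiplies the running value before adding the next character, strings with the same letters in a different order generally hash differently.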

```
HashInsert(hashTable, item) {
    if HashSearch(hashTable, item.key) is null
        // Key not present yet: append a new node to the key's bucket list
        bucketList = hashTable[Hash(item.key)]
        node = allocate a new linked list node
        node.data = item
        ListAppend(bucketList, node)
    else append the i
}
```

In [12]:
test_int = 54
test_str = "Random Word"
test_float = 3.1459
test_tuple = (1, 2, 3, 4)
test_list = [5, 6, 7, 8] # hash() raises TypeError: lists are mutable, so unhashable
test_dict = {"item1": "value1", "item2": "value2"} # hash() raises TypeError: dicts are mutable

In [2]:
def printHash(item):
    print(f"Hash value for {item} is: " + str(hash(item)))

In [3]:
printHash(test_int)

Hash value for 54 is: 54


In [4]:
printHash(test_str)

Hash value for Random Word is: 4944782876873321693


In [5]:
printHash(test_float)

Hash value for 3.1459 is: 336422495044278275


In [7]:
printHash(test_tuple)

Hash value for (1, 2, 3, 4) is: 590899387183067792


In [8]:
printHash(test_list)

TypeError: unhashable type: 'list'

In [11]:
printHash(test_dict)

TypeError: unhashable type: 'dict'

User-defined types can override the `__hash__()` method and the equality comparison operator (`__eq__()`).

In [21]:
class Employee:
    def __init__(self, employee_name, eid):
        self.employee_name = employee_name
        self.eid = eid

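    def __eq__(self, other):
        # Comparing against a non-Employee defers to Python's default handling
        if not isinstance(other, Employee):
            return NotImplemented
        return self.employee_name == other.employee_name and self.eid == other.eid

    def __hash__(self):
        # Hash a tuple of the same fields that __eq__ compares
        return hash((self.employee_name, self.eid))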

In [22]:
e = Employee("John", 1)

In [26]:
e2 = Employee("Jane", 2)

In [27]:
e3 = Employee("VeryLongEmployeeName", 999)

In [23]:
printHash(e)

Hash value for <__main__.Employee object at 0x0000025D91673620> is: -5810975169737240520


In [25]:
printHash(e2)

Hash value for <__main__.Employee object at 0x0000025D922D4050> is: -7500842263962692335


In [28]:
printHash(e3)

Hash value for <__main__.Employee object at 0x0000025D9166F490> is: 8272547450903156520
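The reason for overriding both methods together shows up when objects are used as dictionary keys or set members. The sketch below repeats a compact version of the `Employee` class so it runs on its own; the `roster` dictionary and the department string are made up for the example.

```python
class Employee:
    def __init__(self, employee_name, eid):
        self.employee_name = employee_name
        self.eid = eid

    def __eq__(self, other):
        if not isinstance(other, Employee):
            return NotImplemented
        return (self.employee_name, self.eid) == (other.employee_name, other.eid)

    def __hash__(self):
        return hash((self.employee_name, self.eid))


a = Employee("John", 1)
b = Employee("John", 1)  # a distinct object with the same field values

print(a == b)              # True: the fields match
print(hash(a) == hash(b))  # True: equal objects produce equal hashes

roster = {a: "Engineering"}  # hypothetical data
print(roster[b])             # lookup with the equal object finds the entry
print(len({a, b}))           # 1: the set treats them as the same member
```

Objects that compare equal must also hash equal; otherwise a lookup with `b` could search a different bucket and miss the entry stored under `a`.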


# Hash Table

In [32]:
### BASE CLASS ###
class HashTable:
    # Return non-negative hash code
    def hashKey(self, key):
        return abs(hash(key))

    # Inserts key/value pair.
    # If key exists, the corresponding value is updated.
    # Returns True if the pair was inserted or updated, else False.
    def insert(self, key, value):
        pass

    # Searches for key. If found, removes the item and returns True, else returns False
    def remove(self, key):
        pass

    # Searches for key, returns key's value if found, else None
    def search(self, key):
        pass

## Chaining Table

In [31]:
class ChainingHashTableItem:
    
    def __init__(self, key, value, next_node=None):
        self.key = key
        self.value = value
        self.next_node = next_node

In [114]:
class ChainingHashTable(HashTable):

    def __init__(self, table_size):
        self.table = [None] * table_size
        self.table_size = table_size
    
    def insert(self, key, value):
        bucketIndex = self.hashKey(key) % len(self.table)
        if self.table[bucketIndex] is None:
            # Empty bucket: start a new list holding this single node
            self.table[bucketIndex] = [ChainingHashTableItem(key, value)]
            return True

        bucketList = self.table[bucketIndex]
        for element in bucketList:
            if element.key == key:
                # Key already present: update its value in place
                element.value = value
                return True

        # Key not found: link a new node after the current tail and append it
        node = ChainingHashTableItem(key, value)
        bucketList[-1].next_node = node
        bucketList.append(node)
        return True
    
    def remove(self, key):
        pass

    def search(self, key):
        pass
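The `remove` and `search` methods are left as stubs above. One way they could work is sketched below with a separate, stand-alone class so the example runs on its own. `SimpleChainingTable` is hypothetical and simplifies the bucket to a Python list of `[key, value]` pairs rather than linked `ChainingHashTableItem` nodes.

```python
class SimpleChainingTable:
    """Stand-alone illustration; not the ChainingHashTable defined above."""

    def __init__(self, table_size=11):
        # One empty list per bucket
        self.table = [[] for _ in range(table_size)]

    def _bucket(self, key):
        # Same hashing scheme as the notes: non-negative hash modulo table size
        return self.table[abs(hash(key)) % len(self.table)]

    def insert(self, key, value):
        bucket = self._bucket(key)
        for pair in bucket:
            if pair[0] == key:
                pair[1] = value  # key exists: update the value
                return True
        bucket.append([key, value])
        return True

    def search(self, key):
        # Compare keys within the bucket to find the right item
        for stored_key, stored_value in self._bucket(key):
            if stored_key == key:
                return stored_value
        return None

    def remove(self, key):
        bucket = self._bucket(key)
        for i, (stored_key, _) in enumerate(bucket):
            if stored_key == key:
                del bucket[i]
                return True
        return False


demo = SimpleChainingTable(11)
demo.insert(8, "value5")
demo.insert(19, "value6")  # collides with key 8 in bucket 8
print(demo.search(19))
print(demo.remove(8), demo.search(8))
```

Both operations only scan the one bucket the key hashes to, so their cost depends on the length of that bucket's list rather than on the total number of items.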

In [115]:
tempCHT = ChainingHashTable(11)

In [116]:
print(tempCHT.table)

[None, None, None, None, None, None, None, None, None, None, None]


In [117]:
tempCHT.insert("key1", "value7")

True

In [118]:
tempCHT.insert(8, "value5")

True

In [120]:
tempCHT.insert(19, "value5")

True

In [127]:
for item in tempCHT.table[8]:
    print("key: "+ str(item.key) + "\nvalue: " + str(item.value))
    print()

key: key1
value: value7

key: 8
value: value5

key: 19
value: value5

