# Implementing a Hash Table

### A basic hash function

For each character in the key, we add up its `ord` value (the integer Unicode code point of that character).<br>
We then take that sum modulo the size of the list (100 in this example), which gives us an index into the list.

In [3]:
def get_hash(key):
    h = 0
    for char in key:
        h += ord(char)
    
    return h % 100  # 100 = size of list


tests = ["march 9", "march 10", "march 11"]

for test in tests:
    print(get_hash(test))

12
52
53


### Building a Hash Table class

In [21]:
class HashTable:
    def __init__(self):
        self.MAX = 100
        self.arr = [None for _ in range(self.MAX)]

    def get_hash(self, key):
        h = 0

        for char in key:
            h += ord(char)
        
        return h % self.MAX
    
    def __setitem__(self, key, val):
        h = self.get_hash(key)
        self.arr[h] = val
    
    def __getitem__(self, key):
        h = self.get_hash(key)

        return self.arr[h]
    
    def __delitem__(self, key):
        h = self.get_hash(key)
        self.arr[h] = None  # assign (=), not compare (==), so the slot is actually cleared

In [22]:
t = HashTable()

t["march 6"] = 130
t["march 7"] = 140
t["dec 17"] = 10000
t["june 9"] = 22

t["dec 17"]

10000

Delete an item

In [23]:
del t["dec 17"]

# Collision

When two or more keys hash to the same index, we have a collision.
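To make the problem concrete, here is a minimal sketch of a collision. It assumes a standalone `get_hash` helper with a `size` parameter (not part of the class above) and a list of 10 slots, so that two of the date keys land on the same index.

```python
def get_hash(key, size=10):
    """Sum the code points of the key and wrap to the table size (assumed helper)."""
    return sum(ord(char) for char in key) % size


# Naive table: one value per index, no collision handling.
arr = [None] * 10

arr[get_hash("march 6")] = 120
arr[get_hash("march 17")] = 459  # same index as "march 6", so it overwrites

print(get_hash("march 6"), get_hash("march 17"))  # 9 9
print(arr[get_hash("march 6")])  # 459: the value stored for "march 6" is lost
```

Without any collision handling, the second assignment silently replaces the first.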

There are two common approaches to solving this issue:

The first approach is called *'Chaining'*: instead of storing the value directly, we store a list or linked list at each location. Each of these lists is called a bucket.

*Time Complexity*:<br>
- A normal lookup of a key is *O(1)*, constant time.
- When many keys share the same index, the bucket has to be searched element by element, so a lookup can degrade to *O(n)*.
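As a rough illustration of the worst case, the sketch below uses the same character-sum hash (written here as an assumed standalone `get_hash` helper) with 10 buckets. Keys made of the same letters in a different order always have the same sum, so they all pile up in one bucket, and a lookup has to walk that whole bucket.

```python
from itertools import permutations


def get_hash(key, size=10):
    """Sum the code points of the key and wrap to the table size (assumed helper)."""
    return sum(ord(char) for char in key) % size


buckets = [[] for _ in range(10)]

# Every rearrangement of "abc" has the same character sum,
# so all six keys are appended to the same bucket.
for value, letters in enumerate(permutations("abc")):
    key = "".join(letters)
    buckets[get_hash(key)].append((key, value))

longest = max(len(bucket) for bucket in buckets)
print(longest)  # 6
```

With all six keys in one bucket, finding the last of them takes six comparisons instead of one.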


The second approach is called *'Linear Probing'*: if there is already a value at the index where we are trying to store our new value, we move on to the next available location. It is called *Linear Probing* because we search through the locations in a linear way, looking for the next available space. If we reach the end of the list, we wrap around and continue from the beginning.
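The probing steps described above can be sketched as follows. This is only an illustration under some assumptions: the class name `LinearProbingTable`, the `_probe` helper, and the choice to raise `OverflowError` on a full table are all hypothetical, and deletion is left out because removing an entry in a probing table needs extra care (a cleared slot would cut off lookups for keys stored further along).

```python
class LinearProbingTable:
    """Hypothetical sketch of a hash table that resolves collisions by linear probing."""

    def __init__(self, size=10):
        self.MAX = size
        self.arr = [None] * self.MAX  # each slot holds None or a (key, value) tuple

    def get_hash(self, key):
        return sum(ord(char) for char in key) % self.MAX

    def _probe(self, key):
        """Yield each index once, starting at the key's hash and wrapping around."""
        start = self.get_hash(key)
        for offset in range(self.MAX):
            yield (start + offset) % self.MAX

    def __setitem__(self, key, val):
        for idx in self._probe(key):
            slot = self.arr[idx]
            # Use the first empty slot, or overwrite the slot that already holds this key.
            if slot is None or slot[0] == key:
                self.arr[idx] = (key, val)
                return
        raise OverflowError("hash table is full")

    def __getitem__(self, key):
        for idx in self._probe(key):
            slot = self.arr[idx]
            if slot is None:  # an empty slot means the key was never stored
                break
            if slot[0] == key:
                return slot[1]
        raise KeyError(key)


t = LinearProbingTable()
t["march 6"] = 120
t["march 17"] = 459  # hashes to index 9 as well, so it wraps around to index 0

print(t.arr[9])       # ('march 6', 120)
print(t.arr[0])       # ('march 17', 459)
print(t["march 17"])  # 459
```

Both keys hash to index 9 here, so the second one is placed in the next free slot, which is index 0 after wrapping around the end of the list.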

### Implementing Chaining

A hash table that handles collisions using the chaining method:

In [42]:
class HashTable:
    def __init__(self):
        self.MAX = 10
        self.arr = [[] for _ in range(self.MAX)]

    def get_hash(self, key):
        h = 0

        for char in key:
            h += ord(char)
        
        return h % self.MAX
    
    def __setitem__(self, key, val):
        h = self.get_hash(key)
        found = False
        
        for idx, element in enumerate(self.arr[h]):
            if len(element) == 2 and element[0] == key:
                self.arr[h][idx] = (key, val)
                found = True
                break
        
        if not found:
            self.arr[h].append((key, val))
    
    def __getitem__(self, key):
        h = self.get_hash(key)
        
        for element in self.arr[h]:
            if element[0] == key:
                return element[1]
    
    def __delitem__(self, key):
        h = self.get_hash(key)

        for idx, key_value in enumerate(self.arr[h]):
            if key_value[0] == key:
                del self.arr[h][idx]
                return  # stop here: keys are unique, and the list must not be mutated mid-iteration


In [43]:
t = HashTable()

In [44]:
t["march 6"] = 120
t["march 6"] = 78
t["march 8"] = 67
t["march 9"] = 4
t["march 17"] = 459

In [45]:
t['march 6']

78

In [46]:
t.arr

[[],
 [('march 8', 67)],
 [('march 9', 4)],
 [],
 [],
 [],
 [],
 [],
 [],
 [('march 6', 78), ('march 17', 459)]]

In [47]:
del t['march 17']

In [48]:
t.arr

[[],
 [('march 8', 67)],
 [('march 9', 4)],
 [],
 [],
 [],
 [],
 [],
 [],
 [('march 6', 78)]]