# Hash Table
Hash Tables are often preferred over arrays and linked lists because three operations remain fast, even for large amounts of data:

1. searching for data,
2. adding data, and
3. deleting data.

---
## Analogy

> In a Linked List, finding a person "Bob" takes time because we would have to go from one node to the next, checking each node, until the node with "Bob" is found.
> 
> Finding "Bob" in a list/array can be fast if we know the index, but when we only know the name "Bob", we need to compare against each element, and that takes time.
>
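> Hash tables use **hash functions** to compute where "Bob" is stored directly from the name, so the lookup does not have to compare against every element.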
>
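
To make the analogy concrete, here is a minimal sketch that counts how many names a linear scan inspects, next to a lookup in Python's built-in `dict` (which is itself a hash table). The `people` list and the `linear_search` helper are made up for illustration.

```python
people = ["Jones", "Lisa", "Bob", "Siri", "Pete", "Stuart"]


def linear_search(items, target):
    """Return (index, comparisons) for target, scanning left to right."""
    comparisons = 0
    for index, item in enumerate(items):
        comparisons += 1
        if item == target:
            return index, comparisons
    return -1, comparisons


# Scanning: every name stored before "Bob" has to be compared first.
index, comparisons = linear_search(people, "Bob")
print(index, comparisons)  # 2 3

# Hashing: the dict derives the storage location from the name itself.
position_by_name = {name: i for i, name in enumerate(people)}
print(position_by_name["Bob"])  # 2
```

The scan needs more comparisons the further back "Bob" sits, while the dictionary lookup does not depend on his position.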

## Uses of Hash Tables

Hash Tables are great for:

- Checking if something is in a collection (like finding a book in a library).
- Storing unique items and quickly finding them (like storing phone numbers).
- Connecting values to keys (like linking names to phone numbers).
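
The three uses above can be sketched with Python's built-in `set` and `dict`, both of which are hash tables. The book titles and phone numbers are invented sample data.

```python
# 1. Membership check: is a title in the library?
library = {"Dune", "Emma", "Ulysses"}
print("Emma" in library)    # True
print("Hamlet" in library)  # False

# 2. Unique items: adding the same phone number twice stores it once.
phone_numbers = set()
phone_numbers.add("555-0100")
phone_numbers.add("555-0100")
print(len(phone_numbers))   # 1

# 3. Connecting values to keys: look up a number by name.
phone_book = {"Lisa": "555-0101", "Bob": "555-0102"}
print(phone_book["Bob"])    # 555-0102
```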

---
The most important reason why Hash Tables are great for these things is that they are very fast compared to Arrays and Linked Lists, especially for large sets.

Arrays and Linked Lists have time complexity O(n) for search and delete, while Hash Tables have just O(1) on average.
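
A rough way to see the difference is to count how many elements each approach inspects as the collection grows. The sketch below is an illustration under simplifying assumptions: the keys are consecutive integers, the hash function is just `key % capacity`, and the table is kept at most half full, so no two keys share a bucket. With many collisions a bucket can grow and a lookup degrades toward O(n).

```python
def linear_steps(items, target):
    """Count the elements inspected by a left-to-right scan."""
    for steps, item in enumerate(items, start=1):
        if item == target:
            return steps
    return len(items)


def bucket_steps(buckets, target):
    """Count the elements inspected inside the one bucket the key maps to."""
    bucket = buckets[target % len(buckets)]
    for steps, item in enumerate(bucket, start=1):
        if item == target:
            return steps
    return len(bucket)


for n in (10, 1_000, 100_000):
    items = list(range(n))
    capacity = 2 * n  # assumption: table kept at most half full
    buckets = [[] for _ in range(capacity)]
    for item in items:
        buckets[item % capacity].append(item)

    target = n - 1  # worst case for the scan: the last element
    print(n, linear_steps(items, target), bucket_steps(buckets, target))
# 10 10 1
# 1000 1000 1
# 100000 100000 1
```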

## Terminology

1. **Bucket**: a storage container for elements in a hash table.
2. **Hash function**: takes the key of an element and generates a *hash code*.
3. **Hash code**: determines which bucket an element belongs to. To modify, delete, or check whether an element exists, go straight to the bucket given by its hash code.
4. **Collision**: two or more elements have the same hash code and therefore belong to the same *bucket*.
5. **Chaining**: resolving collisions by storing a list in each bucket, so that several elements can share one bucket.
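
The terms above can be seen together in a tiny sketch. The hash function `toy_hash` is a deliberately weak, made-up example (it sums character codes), chosen because it makes a collision easy to produce.

```python
def toy_hash(key, num_buckets):
    """Sum of character codes, reduced to a bucket index (the hash code)."""
    return sum(ord(ch) for ch in key) % num_buckets


num_buckets = 4
buckets = [[] for _ in range(num_buckets)]  # each bucket is a list (a chain)

for name in ["ab", "ba", "ac"]:
    code = toy_hash(name, num_buckets)
    buckets[code].append(name)  # chaining: append to whatever is already there
    print(name, "->", code)
# ab -> 3
# ba -> 3   (collision: same hash code as "ab")
# ac -> 0

print(buckets)  # [['ac'], [], [], ['ab', 'ba']]
```

"ab" and "ba" contain the same characters, so their sums are equal and both land in bucket 3; chaining keeps both of them in that bucket's list.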



In [None]:
class HashMap:
    def __init__(self, capacity=8):
        self.capacity = capacity
        self.size = 0
        self.buckets = [[] for _ in range(capacity)]
        self.load_factor_threshold = 0.75

    def _hash(self, key):
        """
        Generate a hash index for the given key.
        """
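        hash_code = 0
        # Polynomial rolling hash over the key's string form; reducing modulo
        # the capacity at every step keeps the result a valid bucket index.
        for char in str(key):
            hash_code = (hash_code * 31 + ord(char)) % self.capacity
        return hash_code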

    def _resize(self):
        """
        Resize the hash table when load factor exceeds threshold.
        """
        old_buckets = self.buckets
        self.capacity *= 2
        self.buckets = [[] for _ in range(self.capacity)]
        self.size = 0

        for bucket in old_buckets:
            for key, value in bucket:
                self.put(key, value)

    def put(self, key, value):
        """
        Insert or update a key-value pair.
        """
        index = self._hash(key)
        bucket = self.buckets[index]

        for i, (existing_key, _) in enumerate(bucket):
            if existing_key == key:
                bucket[i] = (key, value)
                return

        bucket.append((key, value))
        self.size += 1

        if self.size / self.capacity >= self.load_factor_threshold:
            self._resize()

    def get(self, key):
        """
        Retrieve value associated with key.
        Raises KeyError if key does not exist.
        """
        index = self._hash(key)
        bucket = self.buckets[index]

        for existing_key, value in bucket:
            if existing_key == key:
                return value

        raise KeyError(f"Key '{key}' not found")

    def contains(self, key):
        """
        Check if key exists in the hash map.
        """
        index = self._hash(key)
        bucket = self.buckets[index]

        for existing_key, _ in bucket:
            if existing_key == key:
                return True
        return False

    def remove(self, key):
        """
        Remove key-value pair from the hash map.
        Raises KeyError if key does not exist.
        """
        index = self._hash(key)
        bucket = self.buckets[index]

        for i, (existing_key, _) in enumerate(bucket):
            if existing_key == key:
                del bucket[i]
                self.size -= 1
                return

        raise KeyError(f"Key '{key}' not found")


In [27]:
my_names = HashMap(1)
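# Note: key 0 appears twice in this literal, so Python keeps only the last
# value ("New Jones") before anything is inserted into the hash map.
names = {0:'Jones', 1:"Lisa", 2:"Bob", 3:"Siri", 4:"Pete", 5:"Stuart", 0: "New Jones"}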

# add elements
for idx, name in names.items():
    my_names.put(idx, name)

print(my_names.buckets)

[[(0, 'New Jones')], [(1, 'Lisa')], [(2, 'Bob')], [(3, 'Siri')], [(4, 'Pete')], [(5, 'Stuart')], [], [], [], [], [], [], [], [], [], []]
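
Each key above ends up in the bucket with the same number because of how the hashing loop treats single-digit keys. The sketch below copies that loop into a standalone function, named `poly_hash` here purely for illustration, so it can be tried without the class.

```python
def poly_hash(key, capacity):
    """Standalone copy of the hashing loop used by HashMap._hash."""
    hash_code = 0
    for char in str(key):
        hash_code = (hash_code * 31 + ord(char)) % capacity
    return hash_code


# Keys 0..5 become the strings "0".."5", whose character codes are 48..53.
# 48 is a multiple of 16, so 48 % 16 == 0, 49 % 16 == 1, and so on.
print([poly_hash(k, 16) for k in range(6)])  # [0, 1, 2, 3, 4, 5]

# A longer key mixes all of its characters into the result.
print(poly_hash("Bob", 16))  # 5
```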


In [28]:
print(len(my_names.buckets))

16
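
The table started with a capacity of 1 and now has 16 buckets because it doubles whenever the load factor (size divided by capacity) reaches 0.75. The following is a simplified simulation of that rule only: it tracks counts, not the stored items, and the function name `capacity_history` is made up for this sketch.

```python
def capacity_history(n_items, capacity=1, threshold=0.75):
    """Return the sequence of capacities seen while inserting n_items keys."""
    size = 0
    history = [capacity]
    for _ in range(n_items):
        size += 1
        # Same check as in put(): double once the load factor hits the threshold.
        if size / capacity >= threshold:
            capacity *= 2
            history.append(capacity)
    return history


print(capacity_history(6))  # [1, 2, 4, 8, 16]
```

The last doubling happens on the sixth insert, when 6 / 8 reaches the 0.75 threshold.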


In [19]:
print(my_names.size)

6


In [24]:
my_names.get(0)

'New Jones'