# Hash table
A hash table is an array-based data structure that stores key-value pairs. In the average case, it allows faster access to data than most other data structures: lookup, insertion, and deletion take O(1) expected time.

Terminology:
1. Hash table: an array that stores key-value pairs
2. Key: the value supplied by the user, used to look up an entry
3. Hash function: a function that takes a key and outputs a value that corresponds to one entry in the hash table array.
    - A good hash function should map different keys to different entries as often as possible and spread hash values uniformly across the table. Collisions cannot be ruled out entirely once there are more possible keys than entries.
4. Hash value: the output of the hash function, which is the index in the array where the key will be stored
5. Collision: when two keys are mapped to the same index
<img src="https://media.geeksforgeeks.org/wp-content/uploads/20240508162721/Components-of-Hashing.webp" width=500>
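As a small illustration of these terms, the sketch below uses a modulo hash function on integer keys. The helper name `hash_index` and the table size of 10 are assumptions chosen for the example, not part of any library.

```python
# Hypothetical helper: maps an integer key to an index in a table
# with `table_size` entries, using the modulo operation.
def hash_index(key, table_size):
    return key % table_size

TABLE_SIZE = 10  # assumed table size for this example

for key in (10, 20, 11):
    print(key, "->", hash_index(key, TABLE_SIZE))

# 10 and 20 both map to index 0, which is a collision; 11 maps to index 1.
```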

## Collision resolution
1. Hashing with chaining
Each entry of the hash table stores a linked list. If two keys are mapped to the same index in the table, they are stored together in that entry's linked list.
    - Pros: easy collision handling, easy deletion
    - Cons: the linked lists require extra space for pointers, and performance degrades as the lists grow long

<img src="https://media.geeksforgeeks.org/wp-content/uploads/chain-hashing-1.png" width=400>

2. Linear and quadratic probing
* Linear: when a collision occurs at an index, check the following indices one at a time until an empty one is found
* Quadratic: when a collision occurs at an index, check indices at quadratically increasing offsets from it (1, 4, 9, ...) until an empty one is found
    - Pros: compact storage, easy traversal
    - Cons: deletion is harder to implement, and keys tend to cluster

<img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1682756944474/7ae20656-36eb-4c86-b160-db11ca328f21.png?auto=compress,format&format=webp" width=500>
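To make the difference between the two probe orders concrete, here is a minimal sketch that only lists the indices each strategy would visit. The function names are hypothetical, and the quadratic variant assumes the common offset of `i * i` from the home index.

```python
def linear_probe_sequence(home, table_size, steps):
    # Visit home, home + 1, home + 2, ... wrapping around the table.
    return [(home + i) % table_size for i in range(steps)]

def quadratic_probe_sequence(home, table_size, steps):
    # Visit home, home + 1, home + 4, home + 9, ... wrapping around the table.
    return [(home + i * i) % table_size for i in range(steps)]

print(linear_probe_sequence(3, 10, 5))     # [3, 4, 5, 6, 7]
print(quadratic_probe_sequence(3, 10, 5))  # [3, 4, 7, 2, 9]
```

The quadratic sequence spreads out quickly, which reduces the runs of adjacent occupied entries that linear probing tends to build up.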

3. Double hashing
Double hashing uses two hash functions, a primary and a secondary one. If there is no collision, insert normally using the primary hash function. If a collision occurs, apply the formula $index = (h_1 + (i * h_2)) \% table\_size$ for $i = 1, 2, 3 ...$, where $h_1$ and $h_2$ are the primary and secondary hash values of the key, until an empty index is found.
    - Pros: less clustering
    - Cons: more computation, and a good secondary hash function is required
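The formula above can be sketched as follows. The two hash functions, the prime table size of 7, and the name `double_hash_insert` are all assumptions made for this example; in particular, the secondary hash is chosen so that it never returns 0, since a step of 0 would probe the same index forever.

```python
TABLE_SIZE = 7  # assumed prime, so any step from 1 to 6 eventually visits every index

def h1(key):
    # Assumed primary hash function
    return key % TABLE_SIZE

def h2(key):
    # Assumed secondary hash function; the +1 keeps the step from being 0
    return 1 + (key % (TABLE_SIZE - 2))

def double_hash_insert(table, key):
    """Store key in the first free index of its probe sequence.

    Returns the index used, or None if no free index was found.
    """
    for i in range(TABLE_SIZE):
        idx = (h1(key) + i * h2(key)) % TABLE_SIZE  # i = 0 is the primary index
        if table[idx] is None:
            table[idx] = key
            return idx
    return None

table = [None] * TABLE_SIZE
print(double_hash_insert(table, 10))  # 3: no collision
print(double_hash_insert(table, 17))  # 6: collides at 3, step h2 = 3
print(double_hash_insert(table, 24))  # 1: collides at 3, step h2 = 5
```

Keys 17 and 24 share the same primary index but use different steps, so they do not follow each other along the same probe path.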

## Dynamic resizing
When the hash table becomes full (in the case of probing) or the number of elements exceeds twice the table size (in the case of chaining), it is necessary to resize the table. Resizing involves creating a new array with double the current table size and migrating all existing elements into the new array.

Since the index of each element depends on the table size, resizing requires recalculating the index for every element using the updated table size. This ensures that all elements are correctly positioned in the new table before inserting the next element.

Dynamic resizing helps prevent clustering and slow retrieval of data.
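A minimal sketch of the resizing step for a chained table is shown below. To keep it short, it assumes each bucket is a plain Python list rather than a linked list, and the function name `resize` is hypothetical.

```python
def resize(buckets):
    """Return a new bucket list twice as large, with every key re-hashed."""
    new_size = 2 * len(buckets)
    new_buckets = [[] for _ in range(new_size)]
    for bucket in buckets:
        for key in bucket:
            # The index depends on the table size, so it must be recomputed.
            new_buckets[key % new_size].append(key)
    return new_buckets

# Build a small table of size 4 where 4, 8 and 12 all collide at index 0.
old = [[] for _ in range(4)]
for key in (4, 8, 12, 5):
    old[key % 4].append(key)

print(old)          # [[4, 8, 12], [5], [], []]
print(resize(old))  # [[8], [], [], [], [4, 12], [5], [], []]
```

After doubling, the three keys that shared index 0 are split across indices 0 and 4, which shortens the longest chain.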

## Hashing with chaining

In [11]:
# Linked List implementation
class Node:
    def __init__(self, val):
        self.val = val
        self.next = None
        
class List:
    def __init__(self):
        self.head = None
    
    def insert(self, val):
        if self.head == None:
            self.head = Node(val)
            return
        
        cur = self.head
        while(cur.next != None):
            cur = cur.next
        
        cur.next = Node(val)
    
    def delete(self, val):
        if self.head == None:
            return
        
        if self.head.val == val:
            self.head = self.head.next
            return
            
        cur = self.head
        while (cur.next != None and cur.next.val != val):
            cur = cur.next
            
        if cur.next != None:
            cur.next = cur.next.next
            
    def exist(self, val):
        if self.head == None:
            return False
        
        cur = self.head
        while (cur != None):
            if (cur.val == val):
                return True
            cur = cur.next
        return False

In [16]:
TB_SIZE = 10

class Chain_Hashing:
    def __init__(self):
        self.table = [None] * TB_SIZE
    
    
    def hashFunc(self, k):
        return (k % TB_SIZE)
    
    
    def insert(self, k):
        idx = self.hashFunc(k) # Get index
        
        if self.table[idx] == None: # If there's no List at the given index, create one
            self.table[idx] = List() # Create a List()
        
        self.table[idx].insert(k) # Insert the key
    
    
    def delete(self, k):
        idx = self.hashFunc(k) # Get index
        
        if self.table[idx] != None:
            self.table[idx].delete(k)
            
    def exist(self, k):
        idx = self.hashFunc(k) # Get index
        
        if self.table[idx] == None: # No List at this index, so the key cannot be present
            return False
        return self.table[idx].exist(k)

In [17]:
# Test insertion
hash_table = Chain_Hashing()

# Insert elements
hash_table.insert(10)
hash_table.insert(20)
hash_table.insert(30)
hash_table.insert(11)

# Display before deletion
print("Before deletion:")
for i in range(TB_SIZE):
    print(f"Index {i}:", end=" ")
    if hash_table.table[i] is not None:
        cur = hash_table.table[i].head
        while cur:
            print(cur.val, end=" -> ")
            cur = cur.next
    print("None")
    
# Exist
print("Exists")
print(hash_table.exist(10))
print(hash_table.exist(30))
print(hash_table.exist(1))

# Delete an element
hash_table.delete(20)
hash_table.delete(10)
hash_table.delete(11)

# Display after deletion
print("\nAfter deleting:")
for i in range(TB_SIZE):
    print(f"Index {i}:", end=" ")
    if hash_table.table[i] is not None:
        cur = hash_table.table[i].head
        while cur:
            print(cur.val, end=" -> ")
            cur = cur.next
    print("None")


Before deletion:
Index 0: 10 -> 20 -> 30 -> None
Index 1: 11 -> None
Index 2: None
Index 3: None
Index 4: None
Index 5: None
Index 6: None
Index 7: None
Index 8: None
Index 9: None
Exists
True
True
False

After deleting:
Index 0: 30 -> None
Index 1: None
Index 2: None
Index 3: None
Index 4: None
Index 5: None
Index 6: None
Index 7: None
Index 8: None
Index 9: None


## Hashing with probing

In [18]:
TB_SIZE = 10

class Element:
    def __init__(self, val):
        self.val = val
        self.deleted = False

class Probe_Table:
    def __init__(self):
        self.table = [None] * TB_SIZE
        self.length = 0
        
    
    def hash_func(self, k):
        return k % TB_SIZE
    
    def insert(self, k):
        if self.length == TB_SIZE: # Check for full table
            return
        
        idx = self.hash_func(k)
        
        for i in range(TB_SIZE):
            if self.table[idx] == None: # For empty entry
                self.table[idx] = Element(k)
                self.length += 1 # Count the new element before returning
                return
            
            elif self.table[idx].deleted == True: # For deleted entry
                self.table[idx].val = k
                self.table[idx].deleted = False
                self.length += 1 # Count the new element before returning
                return
                
            idx = (idx + 1) % TB_SIZE # Probe
    
    
    def search(self, k):
        i = 0 # Track how many entries are checked, end the loop when i == TB_SIZE
        idx = self.hash_func(k)
        
        while self.table[idx] != None and i < TB_SIZE:
            if self.table[idx].val == k and self.table[idx].deleted == False:
                return True
            
            idx = (idx + 1) % TB_SIZE
            i += 1
        
        return False
    
    
    def delete(self, k):
        if self.search(k):
            idx = self.hash_func(k)
            i = 0
            
            while self.table[idx] != None and i < TB_SIZE:
                if self.table[idx].val == k and self.table[idx].deleted == False:
                    self.table[idx].deleted = True
                    self.length -= 1
                    return
                
                idx = (idx + 1) % TB_SIZE
                i += 1

In [20]:
# Define the Probe_Table and Element classes here

# Test insertion
hash_table = Probe_Table()

# Insert elements
hash_table.insert(10)
hash_table.insert(20)
hash_table.insert(30)
hash_table.insert(11)

# Display before deletion
print("Before deletion:")
for i in range(TB_SIZE):
    print(f"Index {i}:", end=" ")
    if hash_table.table[i] is not None:
        print(f"Value={hash_table.table[i].val}, Deleted={hash_table.table[i].deleted}", end="")
    print()

# Search for elements
print("\nSearch results:")
print(f"Exists (10): {hash_table.search(10)}")  # Should print True
print(f"Exists (30): {hash_table.search(30)}")  # Should print True
print(f"Exists (1): {hash_table.search(1)}")    # Should print False

# Delete some elements
hash_table.delete(20)
hash_table.delete(10)
hash_table.delete(11)

# Display after deletion
print("\nAfter deletion:")
for i in range(TB_SIZE):
    print(f"Index {i}:", end=" ")
    if hash_table.table[i] is not None:
        print(f"Value={hash_table.table[i].val}, Deleted={hash_table.table[i].deleted}", end="")
    print()

Before deletion:
Index 0: Value=10, Deleted=False
Index 1: Value=20, Deleted=False
Index 2: Value=30, Deleted=False
Index 3: Value=11, Deleted=False
Index 4: 
Index 5: 
Index 6: 
Index 7: 
Index 8: 
Index 9: 

Search results:
Exists (10): True
Exists (30): True
Exists (1): False

After deletion:
Index 0: Value=10, Deleted=True
Index 1: Value=20, Deleted=True
Index 2: Value=30, Deleted=False
Index 3: Value=11, Deleted=True
Index 4: 
Index 5: 
Index 6: 
Index 7: 
Index 8: 
Index 9: 
