# HASH TABLES
---
Hash tables are a fundamental data structure used in many computer science applications.



## Definition
A hash table, also known as a hash map, is a data structure that maps keys to values. It uses a hash function to compute a hash code for each key and reduces that code to an index into an array of buckets (or slots), where the corresponding value is stored and can later be found.

## Hash Function
A hash function takes an input (the key) and returns a fixed-size value, typically an integer. That integer, usually reduced modulo the array length, serves as an index into the array. A good hash function distributes keys uniformly across the array, which reduces the likelihood of collisions.
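
As a rough illustration of the idea, here is a minimal sketch of a string hash function. The name `polynomial_hash` and the base `33` are assumptions chosen for this example, not something prescribed by the text above.

```python
def polynomial_hash(key: str, n: int, base: int = 33) -> int:
    """Map a string to an index in range(n) with a simple polynomial hash."""
    h = 0
    for ch in key:
        # Mix in one character at a time and keep the value inside range(n)
        h = (h * base + ord(ch)) % n
    return h


# The same key always maps to the same index, and the order of the
# characters matters: "RS" and "SR" land in different slots here.
print(polynomial_hash("RS", 10))
print(polynomial_hash("SR", 10))
```

Because each character is multiplied by a different power of the base, reordering the characters usually changes the result, which helps spread similar keys across the array.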

## Collisions
A collision occurs when two different keys hash to the same index. Hash tables handle collisions using various methods:
   - **Chaining**: Each bucket contains a list of all elements that hash to the same index. When a collision occurs, the new element is added to the list.
   - **Open Addressing**: When a collision occurs, the algorithm probes the array to find the next empty slot. Techniques include linear probing, quadratic probing, and double hashing.
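
The three open-addressing techniques differ only in the order in which they visit slots. The sketch below lists that order for each one; the function names and the `step` parameter (standing in for the value of a second hash function) are assumptions made for illustration.

```python
def linear_probe(home: int, n: int) -> list[int]:
    # Visit home, home + 1, home + 2, ... wrapping around the array
    return [(home + i) % n for i in range(n)]


def quadratic_probe(home: int, n: int) -> list[int]:
    # Visit home, home + 1, home + 4, home + 9, ...
    # Depending on n, this may not reach every slot.
    return [(home + i * i) % n for i in range(n)]


def double_hash_probe(home: int, step: int, n: int) -> list[int]:
    # `step` would come from a second hash of the key; it should be
    # non-zero and share no common factor with n to reach every slot.
    return [(home + i * step) % n for i in range(n)]


print(linear_probe(5, 10))
print(quadratic_probe(5, 10)[:4])
print(double_hash_probe(5, 3, 10)[:4])
```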

## Python's Dictionary Implementation
In Python, dictionaries (`dict`) are implemented using hash tables. The key features include:
   - **Fast Lookups**: Average $O(1)$ time complexity for lookups, insertions, and deletions.
   - **Dynamic Resizing**: When the dictionary becomes too full (i.e., when the load factor exceeds a certain threshold), it is resized to reduce collisions. This involves creating a new larger array and rehashing all the existing keys.
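   - **Hashable Keys**: Keys must be hashable: they must implement `__hash__()` and `__eq__()`, and keys that compare equal must have the same hash. Immutable built-in types such as strings and numbers are hashable by default; a tuple is hashable only if all of its elements are.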
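
To make the hashability requirement concrete, here is a small sketch. The `StateCode` class is a hypothetical example type, not part of Python or of this notebook.

```python
class StateCode:
    """Hypothetical key type that defines equality and hashing consistently."""

    def __init__(self, code: str) -> None:
        self.code = code

    def __eq__(self, other) -> bool:
        return isinstance(other, StateCode) and self.code == other.code

    def __hash__(self) -> int:
        # Equal objects must produce equal hashes
        return hash(self.code)


names = {StateCode("RS"): "Rio Grande do Sul"}
# A different but equal object finds the same entry
print(names[StateCode("RS")])

# Lists are mutable and therefore unhashable, even inside a tuple
try:
    {("RS", ["extra"]): "Rio Grande do Sul"}
except TypeError as exc:
    print("Unhashable key:", exc)
```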

## Operations
Key operations supported by hash tables in Python include:
   - **Insertion**: Adding a new key-value pair.
   - **Deletion**: Removing a key-value pair.
   - **Lookup**: Retrieving the value associated with a given key.
   - **Update**: Changing the value associated with an existing key.
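
For reference, the four operations look like this with the built-in `dict`. The `states` variable is just an example name.

```python
states = {}

# Insertion: add new key-value pairs
states["RS"] = "Rio Grande do Sul"
states["SC"] = "Santa Catarina"

# Lookup: retrieve a value by key; get() returns a default for missing keys
print(states["RS"])
print(states.get("SP", "not found"))

# Update: assigning to an existing key replaces its value
states["SC"] = "Santa Catarina (updated)"

# Deletion: remove a key-value pair
del states["RS"]

print(states)
```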

## Advantages
   - **Efficiency**: $O(1)$ average time complexity for basic operations.
   - **Simplicity**: Easy to use and understand.
   - **Versatility**: Suitable for a wide range of applications.

## Disadvantages
   - **Memory Usage**: Can use more memory than other data structures due to the need for an array and possibly linked lists for chaining.
   - **Worst-Case Performance**: In the worst case (many collisions), operations can degrade to $O(n)$ time complexity, though this is rare with a good hash function and an appropriate load factor.
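
The worst case can be provoked on purpose. In the sketch below, `BadKey` is a hypothetical key type whose hash is constant, so every key collides and the dictionary has to fall back on equality comparisons. The exact comparison counts depend on the interpreter, so none are stated here.

```python
class BadKey:
    """Hypothetical key type with a constant hash: every key collides."""

    eq_calls = 0  # counts how often two keys had to be compared

    def __init__(self, name) -> None:
        self.name = name

    def __hash__(self) -> int:
        return 42  # deliberately terrible: the same hash for every key

    def __eq__(self, other) -> bool:
        BadKey.eq_calls += 1
        return isinstance(other, BadKey) and self.name == other.name


table = {BadKey(i): i for i in range(200)}
insert_comparisons = BadKey.eq_calls

BadKey.eq_calls = 0
value = table[BadKey(199)]
lookup_comparisons = BadKey.eq_calls

print("comparisons while inserting:", insert_comparisons)
print("comparisons for one lookup:", lookup_comparisons)
```

With a well-distributed hash, a lookup normally needs only a handful of comparisons regardless of how many keys are stored.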

## Implementation
It would be too easy to just use Python's built-in `dict` type, so instead we will write a simple hash function and build our own simple hash table!

In [2]:
# Creating the hash function

# Hash function for integer keys
def hashFunction(k, n):
    return k % n

# Simple hash function: adds the ASCII codes of the first two letters and reduces the sum modulo n.
def hashTwoLetters(k: str, n: int) -> int:
    return (ord(k[0]) + ord(k[1])) % n

# In this example we will use the Brazilian state acronyms as `keys`

n = 10 # Our modulus (the number of slots)
hashTable = [None] * n # Our Hash Table

index = hashTwoLetters('RS', n) # Applying the hash function to the state of Rio Grande do Sul (RS)
hashTable[index] = 'RS' # Inserting the `RS` acronym into the list

print(hashTable)

[None, None, None, None, None, 'RS', None, None, None, None]
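
Because this hash function only adds the two character codes, any two acronyms with the same sum end up in the same slot. The short check below repeats the `hashTwoLetters` definition so that it runs on its own.

```python
def hashTwoLetters(k: str, n: int) -> int:
    return (ord(k[0]) + ord(k[1])) % n


n = 10
for key in ("AP", "PA", "RS"):
    # 'AP' and 'PA' use the same letters, so their sums are identical (145);
    # 'RS' sums to 165, which also ends in 5.
    print(key, "->", hashTwoLetters(key, n))
```

All three keys map to index 5, so a table without collision handling can hold only one of them.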


Let's add a check so that a key is only inserted when the slot its hash points to is still empty.

In [5]:
n = 10
hashTable = [None] * n

opt = 0
while opt != 4:
    print("Select an Option:")
    print("1 \t Insert item into the Hash Table")
    print("2 \t Delete item from the Hash Table")
    print("3 \t List the Hash Table")
    print("4 \t Exit")
    opt = int(input(">>> "))

    if opt == 1:
        print("\nType a Brazilian State Acronym: ")
        key = str(input(">>> "))
        index = hashTwoLetters(key, n)
        if hashTable[index] is None:
            hashTable[index] = key
            print("Value added!", end="\n\n")
        else:
            print("There is already a value here!!")
    elif opt == 2:
        print("\nType the value to be deleted:")
        key = str(input(">>> "))
        index = hashTwoLetters(key, n)
        # Only delete when the slot really holds this key:
        # a different key may hash to the same index.
        if hashTable[index] == key:
            hashTable[index] = None
            print("Value deleted!", end="\n\n")
        else:
            print("Value not found!", end="\n\n")
    elif opt == 3:
        print("\nPrinting the Hash Table:")
        print(hashTable)
    else:
        opt = 4

Select an Option:
1 	 Insert item into the Hash Table
2 	 Delete item from the Hash Table
3 	 List the Hash Table
4 	 Exit

Type a Brazilian State Acronym: 
Value added!

Select an Option:
1 	 Insert item into the Hash Table
2 	 Delete item from the Hash Table
3 	 List the Hash Table
4 	 Exit

Type a Brazilian State Acronym: 
Value added!

Select an Option:
1 	 Insert item into the Hash Table
2 	 Delete item from the Hash Table
3 	 List the Hash Table
4 	 Exit

Type a Brazilian State Acronym: 
There is already a value here!!
Select an Option:
1 	 Insert item into the Hash Table
2 	 Delete item from the Hash Table
3 	 List the Hash Table
4 	 Exit

Type a Brazilian State Acronym: 
Value added!

Select an Option:
1 	 Insert item into the Hash Table
2 	 Delete item from the Hash Table
3 	 List the Hash Table
4 	 Exit

Printing the Hash Table:
['SC', None, 'PR', None, None, 'RS', None, None, None, None]
Select an Option:
1 	 Insert item into the Hash Table
2 	 Delete item from the Hash 

Now let's implement a function that handles collisions using linear probing:

In [11]:
def colisionSolver(hashtable, index, n):
    # Linear probing: starting at `index`, walk forward (wrapping around)
    # until an empty slot is found. Returns -1 when the table is full.
    position = index
    while hashtable[position] is not None:
        position += 1
        if position >= n:
            position = 0
        if position == index:
            return -1
    return position

n = 10
hashTable = [None] * n

opt = 0
while opt != 4:
    print("Select an Option:")
    print("1 \t Insert item into the Hash Table")
    print("2 \t Delete item from the Hash Table")
    print("3 \t List the Hash Table")
    print("4 \t Exit")
    opt = int(input(">>> "))

    if opt == 1:
        print("\nType a Brazilian State Acronym: ")
        key = str(input(">>> "))
        position = hashTwoLetters(key, n)
        index = colisionSolver(hashTable, position, n)
        if index == -1:
            print("Hash Table full, impossible to add new values!")
        else:
            hashTable[index] = key
            print("Value added!", end="\n\n")

    elif opt == 2:
        print("\nType the value to be deleted:")
        key = str(input(">>> "))
        start = hashTwoLetters(key, n)
        # Probe every slot from the home position: the key may have been moved
        # by linear probing, and earlier deletions may have left gaps.
        for step in range(n):
            index = (start + step) % n
            if hashTable[index] == key:
                hashTable[index] = None
                print("Value deleted!", end="\n\n")
                break
        else:
            print("Value not found!", end="\n\n")
    elif opt == 3:
        print("\nPrinting the Hash Table:")
        print(hashTable)
    else:
        opt = 4

Select an Option:
1 	 Insert item into the Hash Table
2 	 Delete item from the Hash Table
3 	 List the Hash Table
4 	 Exit


In [12]:
n = 10
hashTable = [None] * 10

for _ in range(11):
    key = "RS"
    position = hashTwoLetters(key, n)
    index = colisionSolver(hashTable, position, n)
    if index == -1:
        print("Hash Table full, impossible to add new values!")
    else:
        hashTable[index] = key
        print("Value added!", end="\n\n")
    
    print(hashTable)

Value added!

[None, None, None, None, None, 'RS', None, None, None, None]
Value added!

[None, None, None, None, None, 'RS', 'RS', None, None, None]
Value added!

[None, None, None, None, None, 'RS', 'RS', 'RS', None, None]
Value added!

[None, None, None, None, None, 'RS', 'RS', 'RS', 'RS', None]
Value added!

[None, None, None, None, None, 'RS', 'RS', 'RS', 'RS', 'RS']
Value added!

['RS', None, None, None, None, 'RS', 'RS', 'RS', 'RS', 'RS']
Value added!

['RS', 'RS', None, None, None, 'RS', 'RS', 'RS', 'RS', 'RS']
Value added!

['RS', 'RS', 'RS', None, None, 'RS', 'RS', 'RS', 'RS', 'RS']
Value added!

['RS', 'RS', 'RS', 'RS', None, 'RS', 'RS', 'RS', 'RS', 'RS']
Value added!

['RS', 'RS', 'RS', 'RS', 'RS', 'RS', 'RS', 'RS', 'RS', 'RS']
Hash Table full, impossible to add new values!
['RS', 'RS', 'RS', 'RS', 'RS', 'RS', 'RS', 'RS', 'RS', 'RS']
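
The cells above only insert keys. As a sketch of how lookup and deletion could work with linear probing, the block below uses a tombstone marker so that deleting a key does not cut off keys stored further along the same run. All names here (`DELETED`, `probe_insert`, `probe_find`, `probe_delete`, `hash_two_letters`) are assumptions introduced for this example.

```python
DELETED = object()  # hypothetical tombstone marking a slot whose key was removed


def hash_two_letters(key: str, n: int) -> int:
    return (ord(key[0]) + ord(key[1])) % n


def probe_insert(table: list, key: str) -> int:
    """Store key in the first free slot (empty or tombstone); return its index, or -1 if full.

    Like the cells above, this does not check for duplicate keys.
    """
    n = len(table)
    home = hash_two_letters(key, n)
    for step in range(n):
        index = (home + step) % n
        if table[index] is None or table[index] is DELETED:
            table[index] = key
            return index
    return -1


def probe_find(table: list, key: str) -> int:
    """Return the index holding key, or -1 if it is absent."""
    n = len(table)
    home = hash_two_letters(key, n)
    for step in range(n):
        index = (home + step) % n
        if table[index] is None:
            return -1  # a never-used slot ends the run: the key is not stored
        if table[index] is not DELETED and table[index] == key:
            return index
    return -1


def probe_delete(table: list, key: str) -> bool:
    """Replace key with a tombstone; return True if something was removed."""
    index = probe_find(table, key)
    if index == -1:
        return False
    table[index] = DELETED
    return True


table = [None] * 10
for code in ("AP", "PA", "RS"):  # all three hash to index 5
    probe_insert(table, code)

probe_delete(table, "PA")
print(probe_find(table, "RS"))  # still reachable past the tombstone
print(probe_find(table, "PA"))  # -1: the key is gone
```

If the deleted slot were simply reset to `None`, the search for `RS` would stop at that slot and wrongly report the key as missing.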


### Using Chaining (Linked Lists) with Hash Tables

For this implementation we will use a simple linked list and a hash table together!

In [48]:
# Creating a class for the node:
class Node:
    def __init__(self, key, value) -> None:
        self.key = key
        self.value = value
        self.next = None

# Creating the class for linked list
class LinkedList:
    def __init__(self) -> None:
        self.head = None
    
    # Method for appending a node at the end of the list
    def append_node(self, key, data):
        current_node = Node(key, data)
        if not self.head:
            self.head = current_node
            return
        last_node = self.head
        while last_node.next is not None:
            last_node = last_node.next
        last_node.next = current_node
        return
    
    # Method for prepending a node at the start of the list
    def prepend_node(self, key, data):
        current_node = Node(key, data)
        # The old head (or None for an empty list) becomes the second node
        current_node.next = self.head
        self.head = current_node

    # Method for printing all nodes
    def print_all_nodes(self):
        current_node = self.head
        if self.head:
            while current_node.next:
                print(current_node.key, end=" -> ")
                current_node = current_node.next
            print(current_node.key, end= " -> None\n")
        else:
            print("None")

    # Method for deleting every node whose key matches the given one
    def delete_node(self, data):
        # Drop matching nodes at the head of the list (also covers an empty list)
        while self.head and self.head.key == data:
            self.head = self.head.next
        current_node = self.head
        # Unlink matching nodes from the rest of the list
        while current_node and current_node.next:
            if current_node.next.key == data:
                current_node.next = current_node.next.next
            else:
                current_node = current_node.next

# Creating the class for the Hash Table
class HashTable:
    def __init__(self, n) -> None:
        self.length = n
        self.hashtable = [LinkedList() for _ in range(self.length)]
    
    def hashFunc(self, k):
        # Add the ASCII codes of the first two letters, modulo the table length
        return (ord(k[0]) + ord(k[1])) % self.length
    
    def insert(self, key, value):
        index = self.hashFunc(key)
        self.hashtable[index].append_node(key, value)
    
    def delete(self, key):
        index = self.hashFunc(key)
        self.hashtable[index].delete_node(key)


In [52]:
teste = HashTable(10)

# Inserting Brazilian state names and acronyms
teste.insert('AC', 'Acre')
teste.insert('AL', 'Alagoas')
teste.insert('AP', 'Amapá')
teste.insert('BA', 'Bahia')
teste.insert('CE', 'Ceará')
teste.insert('DF', 'Distrito Federal')
teste.insert('ES', 'Espírito Santo')
teste.insert('GO', 'Goiás')
teste.insert('MA', 'Maranhão')
teste.insert('MT', 'Mato Grosso')
teste.insert('MS', 'Mato Grosso do Sul')
teste.insert('MG', 'Minas Gerais')
teste.insert('PA', 'Pará')
teste.insert('PB', 'Paraíba')
teste.insert('PR', 'Paraná')
teste.insert('PE', 'Pernambuco')
teste.insert('PI', 'Piauí')
teste.insert('RJ', 'Rio de Janeiro')
teste.insert('RN', 'Rio Grande do Norte')
teste.insert('RS', 'Rio Grande do Sul')
teste.insert('RO', 'Rondônia')
teste.insert('RR', 'Roraima')
teste.insert('SC', 'Santa Catarina')
teste.insert('SP', 'São Paulo')
teste.insert('SE', 'Sergipe')
teste.insert('TO', 'Tocantins')

for idx, item in enumerate(teste.hashtable):
    print('Keys at index: ', idx)
    item.print_all_nodes()
    print()

Keys at index:  0
GO -> MS -> RN -> SC -> None

Keys at index:  1
AL -> BA -> MT -> RO -> None

Keys at index:  2
AC -> ES -> MA -> PR -> SE -> None

Keys at index:  3
PI -> SP -> TO -> None

Keys at index:  4
RR -> None

Keys at index:  5
AP -> PA -> RS -> None

Keys at index:  6
CE -> PB -> RJ -> None

Keys at index:  7
None

Keys at index:  8
DF -> MG -> None

Keys at index:  9
PE -> None
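
The `HashTable` class above can insert and delete, but it has no way to read a value back. The sketch below shows one way lookup and update could be added to a chained table. It is a standalone variant that uses Python lists as buckets instead of the `LinkedList` class, and the names `ChainedHashTable`, `put`, `get`, and `remove` are assumptions for this example.

```python
class ChainedHashTable:
    """Minimal chained hash table keyed by two-letter strings."""

    def __init__(self, n: int = 10) -> None:
        # One list of (key, value) pairs per bucket
        self.buckets = [[] for _ in range(n)]

    def _index(self, key: str) -> int:
        # Same two-letter hash as in the cells above
        return (ord(key[0]) + ord(key[1])) % len(self.buckets)

    def put(self, key: str, value) -> None:
        bucket = self.buckets[self._index(key)]
        for i, (existing_key, _) in enumerate(bucket):
            if existing_key == key:
                bucket[i] = (key, value)  # update instead of storing a duplicate
                return
        bucket.append((key, value))

    def get(self, key: str, default=None):
        for existing_key, value in self.buckets[self._index(key)]:
            if existing_key == key:
                return value
        return default

    def remove(self, key: str) -> bool:
        bucket = self.buckets[self._index(key)]
        for i, (existing_key, _) in enumerate(bucket):
            if existing_key == key:
                del bucket[i]
                return True
        return False


states = ChainedHashTable(10)
states.put("AP", "Amapá")
states.put("PA", "Pará")
states.put("RS", "Rio Grande do Sul")

# All three keys share bucket 5, yet each one is found by comparing keys
print(states.get("PA"))
print(states.get("SP", "not found"))
```

Only the keys inside one bucket are compared during a lookup, so the cost depends on the length of that chain rather than on the total number of entries.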

