# HashTables

## Associative Arrays
- associative arrays (also called maps or dictionaries) are abstract data types
- they are composed of a collection of key-value pairs in which each key appears at most once
- most of the time we implement associative arrays with hashtables, but binary search trees can be used as well
- the aim is to reach O(1) average-case running time for the main operations (insert, lookup, remove)

## HashTable
![image-6.png](attachment:image-6.png)

## Hash Functions
- The h(x) hash-function maps keys to indexes of the underlying array, so that we can use random access (indexing) and achieve O(1) running time.
- The h(x) hash-function defines the relationship between the keys and the array indexes.
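
As a rough illustration of the mapping described above, the sketch below sums the character codes of a string key and folds the sum into the index range with the modulo operator. The name `simple_hash` and the capacity of 10 are assumptions made for this example; a production hash function would mix the bits of the key far more thoroughly.

```python
def simple_hash(key: str, capacity: int) -> int:
    """Sum the character codes of the key and fold the sum into [0, capacity)."""
    return sum(ord(ch) for ch in key) % capacity


# assumed capacity of 10, chosen only for illustration
for name in ("Adam", "Kevin", "Daniel"):
    print(name, "->", simple_hash(name, 10))
```

With these three keys, 'Kevin' and 'Daniel' end up at the same index, which is exactly the situation the next section deals with.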

## Hash Collision
Collisions occur when the h(x) hash-function maps two different keys to the same array slot (bucket).

There are 2 common solutions:
	- Chaining --> colliding items stay in the same bucket (same index), stored in a linked list
	- Open addressing --> the colliding item is placed in a different bucket (different index), found by
		- linear probing
		- quadratic probing
		- rehashing
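
Chaining could be sketched roughly as follows. This is a minimal sketch, not a definitive implementation: the class name `ChainedHashTable` is hypothetical, and each bucket is a plain Python list of `(key, value)` pairs standing in for the linked list mentioned above. It assumes string keys and the same character-sum hash as the simple examples in these notes.

```python
class ChainedHashTable:
    """Hash table that resolves collisions by chaining (illustrative sketch)."""

    def __init__(self, capacity=10):
        self.capacity = capacity
        # one bucket per slot; a Python list stands in for a linked list
        self.buckets = [[] for _ in range(capacity)]

    def _index(self, key):
        # character-sum hash, folded into the valid index range
        return sum(ord(ch) for ch in key) % self.capacity

    def insert(self, key, value):
        bucket = self.buckets[self._index(key)]
        for i, (existing_key, _) in enumerate(bucket):
            if existing_key == key:
                # key already present: overwrite its value
                bucket[i] = (key, value)
                return
        # new key: append to the chain, even if the bucket is not empty
        bucket.append((key, value))

    def get(self, key):
        # only the one bucket the key hashes to has to be searched
        for existing_key, value in self.buckets[self._index(key)]:
            if existing_key == key:
                return value
        return None


table = ChainedHashTable()
table.insert('Kevin', 45)
table.insert('Daniel', 34)   # hashes to the same bucket as 'Kevin'
print(table.get('Kevin'), table.get('Daniel'))
```

The trade-off is that a lookup has to walk the chain, so it degrades towards O(N) if many keys land in the same bucket.
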
## Load factor
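The probability p(x) of a collision is not constant: it grows as the table fills up. The load factor measures how full the table is:
	- n/m = no. of items in the array / size of the array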
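
As a small worked example of the ratio above, the helper below (the name `load_factor` is an assumption for this sketch) computes n/m for a table of 10 slots at two fill levels.

```python
def load_factor(num_items: int, capacity: int) -> float:
    """Return n/m: the number of stored items divided by the array size."""
    return num_items / capacity


print(load_factor(3, 10))   # 0.3
print(load_factor(9, 10))   # 0.9
```

The closer the value gets to 1, the fewer empty slots remain and the longer the probe sequences of an open-addressing table tend to become.
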
## Dynamic resizing based on load factor
- when the load factor gets too high (or too low), it is better to resize the underlying array data structure
- the problem is that the hash values depend on the size of the underlying array data structure
- so we have to take every item in the old hashtable and insert it into the new one with the h(x) hash-function
- this takes O(N) linear running time, which may make dynamically resized hash tables inappropriate for real-time applications (there we can use a balanced binary search tree instead of an array, which guarantees O(logN) running time per operation)
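
The re-insertion step described above might look roughly like this for a linear-probing table stored as two parallel lists. The function names `hash_index` and `resized` are hypothetical, and string keys with a character-sum hash are assumed.

```python
def hash_index(key, capacity):
    # character-sum hash; the result depends on the capacity
    return sum(ord(ch) for ch in key) % capacity


def resized(keys, values, new_capacity):
    """Return new (keys, values) lists of size new_capacity with every item re-inserted."""
    item_count = sum(1 for key in keys if key is not None)
    if new_capacity < item_count:
        raise ValueError('new capacity is too small for the stored items')

    new_keys = [None] * new_capacity
    new_values = [None] * new_capacity

    # O(N): every stored item has to be hashed again with the new capacity
    for key, value in zip(keys, values):
        if key is None:
            continue
        index = hash_index(key, new_capacity)
        # linear probing in the new array
        while new_keys[index] is not None:
            index = (index + 1) % new_capacity
        new_keys[index] = key
        new_values[index] = value

    return new_keys, new_values


old_keys = ['Daniel', 'Adam'] + [None] * 7 + ['Kevin']
old_values = [33, 23] + [None] * 7 + [45]
new_keys, new_values = resized(old_keys, old_values, 20)
print(new_keys.index('Adam'))   # position of 'Adam' in the larger array
```

Note that an item generally does not keep its old index, because the modulo is taken with the new capacity.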

In [1]:

class HashTable:

    def __init__(self):
        # the capacity is fixed in this simple version; a full implementation
        # would resize the underlying lists based on the load factor
        self.capacity = 10
        self.keys = [None]*self.capacity
        self.values = [None] * self.capacity

    def insert(self, key, data):

        # we have to find a valid location for the value (data)
        index = self.hash_function(key)

        # linear probing: visit at most `capacity` slots so that a full
        # table cannot cause an infinite loop
        for _ in range(self.capacity):
            # empty slot: store the new key-value pair here
            if self.keys[index] is None:
                self.keys[index] = key
                self.values[index] = data
                return

            # the key is already present, so we update its value
            if self.keys[index] == key:
                self.values[index] = data
                return

            # collision: try the next slot; the modulo wraps the index around
            # so that we stay inside the range of the underlying list
            index = (index + 1) % self.capacity

        # every slot is occupied by a different key (no resizing implemented)
        raise OverflowError('hash table is full')

    def get(self, key):

        # start searching at the slot the key hashes to
        index = self.hash_function(key)

        # probe at most `capacity` slots so that a full table cannot loop forever
        for _ in range(self.capacity):
            # an empty slot ends the probe sequence: the key is not stored
            if self.keys[index] is None:
                return None

            # this is when we find the item we are looking for
            if self.keys[index] == key:
                return self.values[index]

            index = (index + 1) % self.capacity

        # the given key does not exist in the hashtable
        return None

    # hash value (index of the array) based on the key
    def hash_function(self, key):

        hash_sum = 0

        for letter in key:
            hash_sum = hash_sum + ord(letter)

        return hash_sum % self.capacity


In [2]:
if __name__ == '__main__':

    table = HashTable()
    table.insert('Adam', 23)
    table.insert('Kevin', 45)
    table.insert('Daniel', 34)
    table.insert('Daniel', 33)

    print(table.get('Daniel'))

33


## Python's built-in dictionaries

In [3]:
## Python dictionaries
d = {'name': 'Kevin', 'age': 34, 'gender': 'male'}
print(d.items())
for key, value in d.items():
    print(key, value)

# # clear content of dictionary
# d.clear()
# # remove dictionary itself
# del d

dict_items([('name', 'Kevin'), ('age', 34), ('gender', 'male')])
name Kevin
age 34
gender male
