<h1 align="center">Hash Tables and Python Dictionaries</h1>

### Creating Dictionaries

---

Suppose we create a dictionary that stores names and their phone numbers.

In [1]:
phone_book = {
    'vivek': '9049872675',
    'madhu': '8736484938',
    'alpana': '9376449374',
    'anu' : '7473874389',
}

Printing a dictionary

In [2]:
print(phone_book)

{'vivek': '9049872675', 'madhu': '8736484938', 'alpana': '9376449374', 'anu': '7473874389'}


Getting the value from the key

In [3]:
phone_book['vivek']

'9049872675'
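
Indexing with a key that is not in the dictionary raises a `KeyError`. The sketch below contrasts that with the built-in `dict.get` method, which returns a default instead. The name `'rahul'` and the two-entry `phone_book` are illustrative assumptions, not part of the notebook's data.

```python
# Hypothetical two-entry phone book, mirroring the example above.
phone_book = {'vivek': '9049872675', 'madhu': '8736484938'}

# Square-bracket lookup of a missing key raises KeyError.
try:
    phone_book['rahul']
except KeyError:
    missing_key_raised = True
else:
    missing_key_raised = False

# dict.get returns None (or a supplied default) instead of raising.
number = phone_book.get('rahul')
fallback = phone_book.get('rahul', 'unknown')

print(missing_key_raised, number, fallback)
```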

You can store new phone numbers, or update existing ones

In [4]:
phone_book['himanshu'] = '9373762649'
phone_book['anu'] = '8888888888'
phone_book

{'vivek': '9049872675',
 'madhu': '8736484938',
 'alpana': '9376449374',
 'anu': '8888888888',
 'himanshu': '9373762649'}

Names and phone numbers stored in `phone_book` can be accessed using a `for` loop.

In [5]:
for name in phone_book:
    print(f"{name}'s Phone Number is {phone_book[name]}")

vivek's Phone Number is 9049872675
madhu's Phone Number is 8736484938
alpana's Phone Number is 9376449374
anu's Phone Number is 8888888888
himanshu's Phone Number is 9373762649
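
The loop above looks up each value with `phone_book[name]`. A common alternative, sketched here on a smaller illustrative dictionary, is to iterate over `phone_book.items()`, which yields each key together with its value.

```python
# Illustrative subset of the phone book used above.
phone_book = {'vivek': '9049872675', 'anu': '8888888888'}

# items() yields (key, value) pairs, so no second lookup is needed.
lines = [
    f"{name}'s Phone Number is {number}"
    for name, number in phone_book.items()
]

for line in lines:
    print(line)
```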


### Creating a Hash Table in Python

In [6]:
class HashTable:
    def insert(self, key, value):
        pass

    def find(self, key):
        pass

    def update(self, key, value):
        pass

    def list_all(self):
        pass

Create a Python list which will hold all the key-value pairs. We'll start by creating a list of a fixed size.

In [7]:
MAX_HASH_TABLE_SIZE = 4096

**QUESTION 1: Create a Python list of size `MAX_HASH_TABLE_SIZE`, with all the values set to `None`.**

In [8]:
data_list = [None] * MAX_HASH_TABLE_SIZE

In [9]:
len(data_list)

4096

In [10]:
for item in data_list:
    assert item is None, "Not all items are None"

### Hashing Function

A _hashing function_ is used to convert strings and other non-numeric data types into numbers, which can then be used as list indices. For instance, if a hashing function converts the string `"Aakash"` into the number `4`, then the key-value pair `('Aakash', '7878787878')` will be stored at the position `4` within the data list.

Here's a simple algorithm for hashing, which can convert strings into numeric list indices.

1. Iterate over the string, character by character
2. Convert each character to a number using Python's built-in `ord` function.
3. Add the numbers for each character to obtain the hash for the entire string 
4. Take the remainder of the result with the size of the data list


In [11]:
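def get_index(data_list, a_string):
    # Running total of the character codes
    result = 0
    for a_character in a_string:
        # Convert the character to its Unicode code point
        a_number = ord(a_character)
        result += a_number
    # Take the remainder with the list's own size, so the index is always in range
    list_index = result % len(data_list)
    return list_index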


In [12]:
get_index(data_list, "Aakash")

585
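
The value `585` can be checked by hand. The sketch below lists the character codes of `"Aakash"`, sums them, and takes the remainder for two table sizes. The size `512` is an illustrative assumption, chosen only to show the remainder wrapping the sum into a smaller range.

```python
# Character codes for each letter of the key.
codes = [ord(ch) for ch in "Aakash"]
total = sum(codes)

print(codes)         # [65, 97, 107, 97, 115, 104]
print(total)         # 585
print(total % 4096)  # 585 (the sum is already smaller than the table size)
print(total % 512)   # 73  (a smaller table wraps the sum around)
```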

#### Insert

To insert a key-value pair into a hash table, we can simply get the hash of the key, and store the pair at that index in the data list.

In [13]:
key, value = 'Aakash', '7878787878'

In [14]:
idx = get_index(data_list, key)
idx

585

In [15]:
data_list[idx] = (key, value)

Here's the same operation expressed in a single line of code.

In [16]:
data_list[get_index(data_list, 'Hemanth')] = ('Hemanth', '9595949494')

#### Find

To retrieve the value associated with a key, we can get the hash of the key and look up that index in the data list.

In [17]:
idx = get_index(data_list, 'Aakash')
idx

585

In [18]:
key, value = data_list[idx]
key, value

('Aakash', '7878787878')

#### List

To get the list of keys, we can use a simple [list comprehension](https://www.w3schools.com/python/python_lists_comprehension.asp).

Listing all Keys

In [19]:
keys = [kv[0] for kv in data_list if kv is not None]

In [20]:
keys

['Aakash', 'Hemanth']

Listing all Values

In [21]:
values = [kv[1] for kv in data_list if kv is not None]

In [22]:
values

['7878787878', '9595949494']

### Basic Hash Table Implementation

We can now use the hashing function defined above to implement a basic hash table in Python.

In [23]:
class BasicHashTable:
    def __init__(self, max_size = MAX_HASH_TABLE_SIZE):
        self.data_list = [None] * max_size
    
    def insert(self, key, value):
        idx = get_index(self.data_list, key)
        self.data_list[idx] = key, value
    
    def find(self, key):
        idx = get_index(self.data_list, key)
        kv = self.data_list[idx]
        
        if kv is None:
            return None
        else:
            key, value = kv
            return value

    def update(self, key, value):
        idx = get_index(self.data_list, key)
        self.data_list[idx] = key, value

    def list_all(self):
        return [(kv[0], kv[1]) for kv in self.data_list if kv is not None]

In [24]:
basic_table = BasicHashTable(max_size = 1024)

Check length

In [25]:
len(basic_table.data_list) == 1024

True

Insert Values

In [26]:
basic_table.insert('viv', 8755667986)
basic_table.insert('vivek', 9998887776)
basic_table.insert('vivka', 9991111555)

Find a value using its key

In [27]:
basic_table.find("viv")

8755667986

Update the value of a key

In [28]:
basic_table.update("viv", "9999333666")

List all key-value pairs

In [29]:
basic_table.list_all()

[('viv', '9999333666'), ('vivka', 9991111555), ('vivek', 9998887776)]

### Handling Collisions with Linear Probing

Multiple keys can have the same hash. For instance, the keys `"listen"` and `"silent"` contain the same characters in a different order, so a hash that adds up character codes gives both the same result. This is referred to as a _collision_. Data stored against one key may overwrite the data stored against another if they have the same hash.
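
A quick way to see why these two keys collide, assuming the character-sum hashing described earlier: addition ignores order, so any two anagrams produce the same total. The helper name `char_sum` is hypothetical and used only for this sketch.

```python
def char_sum(a_string):
    # Sum of the character codes; the order of characters does not matter.
    return sum(ord(ch) for ch in a_string)

listen_sum = char_sum("listen")
silent_sum = char_sum("silent")

print(listen_sum, silent_sum)                # 655 655
print(sorted("listen") == sorted("silent"))  # True: same letters, different order
```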


In [30]:
basic_table.insert('listen', 99)

In [31]:
basic_table.insert('silent', 200)

In [32]:
basic_table.find('listen')

200

To handle collisions we'll use a technique called linear probing. Here's how it works: 

1. While inserting a new key-value pair, if the target index for a key is occupied by another key, we try the next index, then the one after it, and so on until we find the closest empty location.

2. While finding a key-value pair, we apply the same strategy, but instead of searching for an empty location, we look for a location which contains a key-value pair with the matching key.

3. While updating a key-value pair, we apply the same strategy, but instead of searching for an empty location, we look for a location which contains a key-value pair with the matching key, and update its value.


In [33]:
def get_valid_index(data_list, key):
    # Start with the index returned by get_index
    idx = get_index(data_list, key)
    
    while True:
        # Get the key-value pair stored at idx
        kv = data_list[idx]
        
        # If it is None, return the index
        if kv is None:
            return idx
        
        # If the stored key matches the given key, return the index
        k, v = kv
        if key == k:
            return idx
        
        # Move to the next index
        idx += 1
        
        # Go back to the start if you have reached the end of the array
        if idx == len(data_list):
            idx = 0
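
Note that the `while True` loop above never terminates if every slot is occupied and none of them holds the given key. The following is a minimal sketch of a bounded variant, not a replacement for the function above. The names `char_sum_index` and `probe_index` are hypothetical, and raising `RuntimeError` on a full table is an assumed design choice.

```python
def char_sum_index(data_list, key):
    # Same character-sum hashing idea, reduced to the list's own length.
    return sum(ord(ch) for ch in key) % len(data_list)

def probe_index(data_list, key):
    start = char_sum_index(data_list, key)
    # Visit each slot at most once, wrapping around the end of the list.
    for offset in range(len(data_list)):
        idx = (start + offset) % len(data_list)
        kv = data_list[idx]
        # An empty slot or a slot holding the same key is a valid target.
        if kv is None or kv[0] == key:
            return idx
    # Every slot holds a different key.
    raise RuntimeError("hash table is full")

slots = [None] * 1024
first = probe_index(slots, 'listen')
slots[first] = ('listen', 99)
second = probe_index(slots, 'silent')
slots[second] = ('silent', 200)

print(first, second)  # 655 656
```

Here `'silent'` hashes to the slot already taken by `'listen'`, so the probe moves one position forward.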

### Hash Table with Linear Probing

We can now implement a hash table with linear probing.

In [34]:
class ProbingHashTable:
    def __init__(self, max_size=MAX_HASH_TABLE_SIZE):
        # 1. Create a list of size `max_size` with all values None
        self.data_list = [None] * max_size
     
    
    def insert(self, key, value):
        # 1. Find the index for the key using get_valid_index
        idx = get_valid_index(self.data_list, key)
        
        # 2. Store the key-value pair at the right index
        self.data_list[idx] = key, value
    
    
    def find(self, key):
        # 1. Find the index for the key using get_valid_index
        idx = get_valid_index(self.data_list, key)
        
        # 2. Retrieve the data stored at the index
        kv = self.data_list[idx]
        
        # 3. Return the value if found, else return None
        return None if kv is None else kv[1]
    
    
    def update(self, key, value):
        # 1. Find the index for the key using get_valid_index
        idx = get_valid_index(self.data_list, key)
        
        # 2. Store the new key-value pair at the right index
        self.data_list[idx] = key, value

    
    def list_all(self):
        # 1. Extract the key from each key-value pair 
        return [kv[0] for kv in self.data_list if kv is not None]

If the `ProbingHashTable` class was defined correctly, the following cells should output `True`.

In [35]:
# Create a new hash table
probing_table = ProbingHashTable()

# Insert a value
probing_table.insert('listen', 99)

# Check the value
probing_table.find('listen') == 99

True

In [36]:
# Insert a colliding key
probing_table.insert('silent', 200)

# Check the new and old keys
probing_table.find('listen') == 99 and probing_table.find('silent') == 200

True

In [37]:
# Update the value of an existing key
probing_table.update('listen', 101)

# Check the value
probing_table.find('listen') == 101

True

In [38]:
probing_table.list_all() == ['listen', 'silent']

True