# String Key Hash Table

### Problem Statement
In this quiz, you'll write your own hash table and a hash function for string keys. The table stores strings in buckets. The bucket index is calculated from the first two letters of the string, according to the formula below:

    Hash Value = (ASCII Value of First Letter * 100) + ASCII Value of Second Letter

The generated hash value is used directly as the bucket index.


**Example**: For the string "UDACITY", the ASCII values of the letters 'U' and 'D' are 85 and 68 respectively. The hash value is `(85 * 100) + 68 = 8568`.

You can use the Python function `ord()` to get the ASCII value of a letter, and `chr()` to get the letter associated with an ASCII value. 
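
As a quick illustration of the two built-ins, the sketch below applies them to the "UDACITY" example and then evaluates the formula from the problem statement. The variable names (`first`, `second`, `hash_value`) are chosen here for illustration only.

```python
# ord() maps a character to its integer code; chr() is the inverse.
first, second = "UDACITY"[0], "UDACITY"[1]

print(ord(first), ord(second))   # 85 68
print(chr(85), chr(68))          # U D

# Apply the formula from the problem statement.
hash_value = ord(first) * 100 + ord(second)
print(hash_value)                # 8568
```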

**Assumptions**
1. The string will have at least two letters.
2. The first two characters are uppercase letters (ASCII values from 65 to 90). 
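
Under these assumptions the hash value is bounded, which is why a table of 10,000 slots (the size used in the starter code) is enough. A small check of the two extremes, with the illustrative names `lowest` and `highest`:

```python
# Smallest and largest hash values when both letters are in 'A'..'Z'.
lowest = ord("A") * 100 + ord("A")    # 65 * 100 + 65
highest = ord("Z") * 100 + ord("Z")   # 90 * 100 + 90

print(lowest, highest)                # 6565 9090
```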


**Rules**
- Do not use a Python dictionary—only lists! 
- Store `lists` at each bucket, and not just the string itself. For example, you can store "UDACITY" at index 8568 as ["UDACITY"].
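
To make the second rule concrete, here is a minimal sketch of list-valued buckets. It is not the full `HashTable` class; `TABLE_SIZE`, `bucket_index`, and the module-level `store` are hypothetical names used only for this illustration, and the table size assumes the first two characters are uppercase letters.

```python
TABLE_SIZE = 10000  # assumption: large enough for any two uppercase letters

def bucket_index(string):
    """Hash value from the first two characters, per the formula above."""
    return ord(string[0]) * 100 + ord(string[1])

table = [None] * TABLE_SIZE

def store(string):
    """Append the string to the list at its bucket, creating the list if needed."""
    index = bucket_index(string)
    if table[index] is None:
        table[index] = [string]
    elif string not in table[index]:
        table[index].append(string)

store("UDACITY")
store("UDACIOUS")   # same first two letters, so same bucket
print(table[8568])  # ['UDACITY', 'UDACIOUS']
```

Because each bucket holds a list, two strings that share their first two letters can coexist at the same index instead of overwriting each other.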

### Instructions
Create a `HashTable` class, with the following functions:
 - `store()` - a function that takes a string as input, and stores it into the hash table.
 - `lookup()` - a function that checks if a string is already available in the hash table. If yes, return the hash value, else return -1.
 - `calculate_hash_value()` - a helper function to calculate a hash value of a given string. 

### Exercise - Try building a string hash table!

In [1]:
class Node():
    def __init__(self, key=None, value=None, next_node=None):
        self.key   = key
        self.value = value
        self.next  = next_node


        
class HashMap():
    #Class constructor initialises the num_buckets (default is 10). Creates a list of objects initialised to None    
    def __init__(self, initial_num_buckets = 10):
        self.bucket_array = [None for _ in range(initial_num_buckets)]
        self.prime_coeff  = 31
        self.num_entries  = 0
        self.get          = self.search
    
    def get_load_factor(self):
        num_buckets= len(self.bucket_array)
        load_factor = self.num_entries/num_buckets
        return load_factor
    
        
    def get_hash_code(self, key):
        #convert the key to a string
        key       = str(key)
        hash_code = 0
        
        for char_count, character in enumerate(key):
            multiplicative_factor = pow(self.prime_coeff,char_count)
            hash_code += ord(character)*multiplicative_factor
        
        return hash_code
    
    def get_hash_index(self, key):
        num_buckets = len(self.bucket_array)
        hash_code = self.get_hash_code(key)
        
        #Compression factor: Divide the hash_code by the num_buckets and return the remainder
        hash_index = hash_code %num_buckets
        
        return hash_index
    
    def search(self, key):
        #obtain the hash_index of where the key-value pair will reside if it already exists
        bucket_arr_idx = self.get_hash_index(key)  
        #if the array index is empty, return search=False
        if(self.bucket_array[bucket_arr_idx] == None):
            return_index = [-1,-1]
            search_flag  = False
            value        = None
        #If array is not empty search through the open chain where the index is
        else:
            current = self.bucket_array[bucket_arr_idx]
            horizontal_chain_idx = 0
            search_flag = False
            while(current !=None):
                if(current.key == key):
                    search_flag  = True
                    return_index = [bucket_arr_idx,horizontal_chain_idx]
                    value        = current.value
                    break
                else:
                    horizontal_chain_idx +=1   
                    current = current.next
            
            #At the end of the while if key still not found assign return_idx =[-1,-1]
            if(search_flag == False):
                return_index = [-1,-1]
                value        = None
        
        return return_index,value,search_flag
            
            
    
    def put(self, key, value=None, allow_rehash = True):         
        #obtain the hash_index of where the key-value pair has to go
        arr_index = self.get_hash_index(key)            
        
        #if there are no other nodes at this arr_index, the new key-value becomes head
        if(self.bucket_array[arr_index] == None):
            new_node                     = Node(key, value)
            self.bucket_array[arr_index] = new_node
            self.num_entries             +=1
            put_success_flag             = True
        #if there are other nodes at this index. 
        else:
            search_index, search_value, key_exists = self.search(key)
            #Key already exists
            if(key_exists == True):
                print("Exiting without any insertions !!!. Key:", key ," already exists @index=", search_index)
                put_success_flag = False
            #Key does not exist. Insert it now as the new-head at the beginning of existing linked-list
            else:
                current_head                 = self.bucket_array[arr_index]
                new_node                     = Node(key, value, current_head)
                self.bucket_array[arr_index] = new_node
                self.num_entries +=1
                put_success_flag = True
        
        
        #The allow_rehash flag prevents infinite recursion between self.put and self.rehash
        if(put_success_flag == True and allow_rehash== True and self.get_load_factor() > 0.7):
            print("Added (key={}:value={}). Load factor exceeded ! Going to Rehash".format(key,value))
            print("HashMap Before Rehashing")
            print(self)
            
            self.rehash()
            
            print("HashMap After Rehashing")
            print(self)
        
        return put_success_flag
    
      
    def delete(self,key):        
        delete_index,value,search_flag = self.search(key)
        #Key does not exist
        if(search_flag == False):
            print("No Deletions: Key does not exist")
            delete_success_flag = False
        #Key exists
        else:
            bucket_idx           = delete_index[0]
            horizontal_chain_idx = delete_index[1]
            
            #Key to delete is at the head: the next node (or None) becomes the new head
            if(horizontal_chain_idx==0):
                node2_delete  = self.bucket_array[bucket_idx]
                key_2delete   = node2_delete.key 
                value_2delete = node2_delete.value
                self.bucket_array[bucket_idx] = node2_delete.next
                self.num_entries              -= 1
                delete_success_flag           = True
                print("Key:{}, Value:{}, deleted@index:{}".format(key_2delete,value_2delete,delete_index))
                
            #Key to delete is in the middle or at the end of the chain
            else:
                #Walk to the node just before node2_delete
                previous_node = self.bucket_array[bucket_idx]
                count   = 0
                while(count < horizontal_chain_idx -1):
                    previous_node = previous_node.next
                    count+=1
                
                node2_delete  = previous_node.next
                key_2delete   = node2_delete.key 
                value_2delete = node2_delete.value
                
                #Link the previous_node(of node2_delete) to the next_node(of node2_delete)
                previous_node.next = node2_delete.next
                self.num_entries    -= 1
                delete_success_flag = True
                print("Key:{}, Value:{}, deleted@index:{}".format(key_2delete,value_2delete,delete_index))
        
        return delete_index, value, delete_success_flag
 

    def rehash(self):
        print("Rehashing !!! ")
        old_bucket_array  = self.bucket_array
        self.bucket_array = [None for _ in range(2*len(old_bucket_array))]
        self.num_entries  = 0
        for bucket_idx in range(len(old_bucket_array)):
            current_node = old_bucket_array[bucket_idx]
            while(current_node !=None):
                key   = current_node.key
                value = current_node.value
                #Now use the put function. 
                #Set allow_rehash = False,to avoid an infinite loop between the put-rehash methods
                self.put(key,value, allow_rehash = False)
                current_node = current_node.next
                
        return
            
    
    def __repr__(self):
        output = 'Below is HashMap:'
        for bucket_idx in range(len(self.bucket_array)):
            output +="\n[{}] ".format(bucket_idx)
            current = self.bucket_array[bucket_idx]
            while(current !=None):
                key    = current.key
                value  = current.value
                output += "({}:{}), ".format(key,value)
                current = current.next
                         
        output +="\n-----------------------------\n"
        
        return output
    
    
    
    
print("---------------------")

---------------------


"""Write a HashTable class that stores strings
in a hash table, where keys are calculated
using the first two letters of the string."""

class HashTable(object):
    def __init__(self):
        self.table = [None]*10000

    def store(self, string):
        """TODO: Input a string that's stored in 
        the table."""
        pass

    def lookup(self, string):
        """TODO: Return the hash value if the
        string is already in the table.
        Return -1 otherwise."""
        return -1

    def calculate_hash_value(self, string):
        """TODO: Helper function to calculate a
        hash value from a string."""
        return -1


In [2]:
ASCII_RANGE = 255
#Largest possible index is ASCII_RANGE*100 + ASCII_RANGE, so one extra bucket is needed
NUM_BUCKETS = ASCII_RANGE*100 + ASCII_RANGE + 1

        
        
class HashTable(HashMap):
    #Class constructor initialises the num_buckets (default is NUM_BUCKETS). 
    #Creates a list of objects initialised to None    
    def __init__(self, initial_num_buckets = NUM_BUCKETS):
        self.bucket_array = [None for _ in range(initial_num_buckets)]
        self.num_entries  = 0
        self.calculate_hash_value = self.get_hash_index
    
    
    def get_hash_index(self, key):
        #convert the key to a string
        key       = str(key)
        hash_index = ord(key[0])*100 + ord(key[1])        
        return hash_index
    
    def lookup(self, string):
        return_index,value,search_flag = self.search(key=string)
        hash_index = return_index[0]
        return hash_index
    
    def store(self,string):
        self.put(key=string, allow_rehash = False)
    

    
    
    
    
    
    
print("---------------------")

---------------------


### Test Cases - Let's test your function

In [3]:
# Setup
hash_table = HashTable()

# Test calculate_hash_value
print (hash_table.calculate_hash_value('UDACITY'))    # Should be 8568

# Test lookup edge case
print (hash_table.lookup('UDACITY'))                  # Should be -1

# Test store
hash_table.store('UDACITY')
print (hash_table.lookup('UDACITY'))                  # Should be 8568

# Test store edge case
hash_table.store('UDACIOUS')
print (hash_table.lookup('UDACIOUS'))                 # Should be 8568

8568
-1
8568
8568

