# Hash Table

* Implementation of Hash Table (= Python Dictionary)
* Example of Hash Function
* Example of Python Dictionary

## Implementation of Hash Table

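The idea of a dictionary used as a hash table to store and retrieve items by key is often referred to as a mapping. The map abstract data type defines the operations listed below. The class that follows implements the constructor, put, and get (plus index syntax); del, len, and in are not implemented there.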

* HashTable(size)  -  Create a new, empty map with room for size key-value pairs.
* put(key,val)  -  Add a new key-value pair to the map. If the key is already in the map, replace the old value with the new value.
* get(key)  -  Given a key, return the value stored in the map, or None if the key is not present.
* del  -  Delete the key-value pair from the map using a statement of the form del map[key].
* len()  -  Return the number of key-value pairs stored in the map.
* in  -  Return True for a statement of the form key in map if the given key is in the map, False otherwise.
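
Python's built-in dict already supports each of these operations, so it can serve as a reference for the behavior a hand-written map should match. A minimal sketch, where the variable name m and the sample keys are only illustrative:

```python
# Reference behavior of the map operations, using the built-in dict
m = {}                      # create a new, empty map
m[54] = 'cat'               # put: add a key-value pair
m[54] = 'dog'               # put with an existing key replaces the old value

print(m.get(54))            # get: value stored under the key
print(m.get(99))            # get: None when the key is missing
print(len(m))               # len: number of key-value pairs
print(54 in m)              # in: membership test on keys

del m[54]                   # del: remove the key-value pair
print(54 in m)
```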

In [3]:
class HashTable(object):
    
    def __init__(self,size):
        self.size = size
        self.slots = [None] * self.size
        self.data = [None] * self.size
        
    def put(self,key,data):
        # Note: we only use integer keys so the remainder hash function works
        
        # Get the hash value
        startslot = self.hashfunction(key,len(self.slots))
        position = startslot

        # Probe until we reach an empty slot or the slot already holding this key
        while self.slots[position] is not None and self.slots[position] != key:
            position = self.rehash(position,len(self.slots))
            
            # Back at the start: every slot is taken by a different key
            if position == startslot:
                raise RuntimeError('HashTable is full')

        # Store the new pair, or replace the old value for an existing key
        self.slots[position] = key
        self.data[position] = data

    def hashfunction(self,key,size):
        # Remainder Method
        return key%size

    def rehash(self,oldhash,size):
        # For finding next possible positions
        return (oldhash+1)%size
    
    def get(self,key):    
        # Getting items given a key
        
        # Set up variables for our search
        startslot = self.hashfunction(key,len(self.slots))
        data = None
        stop = False
        found = False
        position = startslot
        
        # Keep probing until we hit an empty slot, find the key, or wrap around
        while self.slots[position] is not None and not found and not stop:
            
            if self.slots[position] == key:
                found = True
                data = self.data[position]
                
            else:
                position=self.rehash(position,len(self.slots))
                if position == startslot:
                    
                    stop = True
        return data

    # Special Methods for use with Python indexing
    def __getitem__(self,key):
        return self.get(key)

    def __setitem__(self,key,data):
        self.put(key,data)

In [4]:
h = HashTable(5)
h[1] = 'one'
h[2] = 'two'
h[1] #'one'

'one'
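
With the remainder method, keys that differ by a multiple of the table size collide. The sketch below replays the same linear-probing rule outside the class to show where colliding keys end up. probe_insert is a hypothetical helper written for this illustration, not part of the class above.

```python
def probe_insert(slots, key):
    """Place key in slots using the remainder method and linear probing.

    Returns the index where the key was stored (or already lived).
    Hypothetical helper for illustration; assumes integer keys.
    """
    size = len(slots)
    start = key % size
    position = start
    while slots[position] is not None and slots[position] != key:
        position = (position + 1) % size      # same rule as rehash()
        if position == start:
            raise RuntimeError('table is full')
    slots[position] = key
    return position

slots = [None] * 5
placed = [probe_insert(slots, k) for k in (1, 6, 11)]
print(placed)   # [1, 2, 3]
print(slots)    # [None, 1, 6, 11, None]
```

Keys 1, 6, and 11 all hash to slot 1 in a table of size 5, so the second and third keys slide forward to the next free slots.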

## Example of Hash Function

* Hash table -  A data structure that maps keys to values for highly efficient lookup
* Hash function - maps a key (a big number or string) to a small integer, which indicates the index in the array
* Collision - When two different keys hash to the same index

#### Collision handling
* Chaining: Each cell of the hash table points to a linked list of records whose keys have the same hash value. Chaining is simple, but requires additional memory outside the table.
* Open Addressing: All elements are stored in the hash table itself. Each table entry contains either a record or NIL. When searching for an element, we examine table slots one by one until the desired element is found or it is clear that the element is not in the table.
* Using a binary search tree: Implement the hash table with a binary search tree. 
    This guarantees O(log n) lookup time as long as the tree is kept balanced.

Reference
http://www.geeksforgeeks.org/hashing-set-1-introduction/
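
As a rough illustration of chaining, the sketch below keeps a Python list of (key, value) pairs in each cell in place of a linked list. The class name ChainedMap and its default size are assumptions made for this example.

```python
class ChainedMap(object):
    """Hash map with chaining; each bucket is a list of (key, value) pairs."""

    def __init__(self, size=11):
        self.buckets = [[] for _ in range(size)]

    def _bucket(self, key):
        # Built-in hash() accepts strings and numbers; % keeps the index in range
        return self.buckets[hash(key) % len(self.buckets)]

    def put(self, key, value):
        bucket = self._bucket(key)
        for i, (k, _) in enumerate(bucket):
            if k == key:
                bucket[i] = (key, value)   # key exists: replace the old value
                return
        bucket.append((key, value))        # new key: extend the chain

    def get(self, key):
        for k, v in self._bucket(key):
            if k == key:
                return v
        return None

cm = ChainedMap(size=5)
cm.put(1, 'one')
cm.put(6, 'six')      # small ints hash to themselves in CPython, so 1 and 6 share a bucket
cm.put(1, 'uno')
print(cm.get(1))      # uno
print(cm.get(6))      # six
print(cm.get(3))      # None
```

Because colliding keys simply extend a bucket, the table never fills up, at the cost of the extra memory noted above.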

In [None]:
# An implementation of a HashTable class that stores strings in a hash table, 
# where hash values are calculated from the first two letters of the string.
# Strings that collide are chained together in a list.

class HashTable(object):
    def __init__(self):
        self.table = [None]*100000
        
    def hashFunction(self, string):
        # Needs two characters; assumes ASCII so the value fits in the table
        if len(string) < 2:
            return -1
        value = ord(string[0])*100 + ord(string[1])
        return value  

    def store(self, string):
        hv = self.hashFunction(string)
        if hv != -1:
            if self.table[hv] is not None:
                self.table[hv].append(string)
            else:
                self.table[hv] = [string]

    def lookup(self, string):
        hv = self.hashFunction(string)
        if hv != -1:
            if self.table[hv] is not None:
                if string in self.table[hv]:
                    return hv
        return -1

ht = HashTable()
ht.store("coding")
print(ht.lookup("coding")) #10011

ht.store("code")
print(ht.lookup("code")) #10011

ht.store("hello")
print(ht.lookup("hello")) #10501

print(ht.table[10011]) #['coding', 'code']
print(ht.table[10501]) #['hello']

## Example of Python Dictionary

In Python, the map concept appears as a built-in data type called a dictionary. 
A dictionary contains key-value pairs.

In [None]:
# locations = {'North America': {'USA': ['Mountain View', 'Atlanta']}, 
#                 'Asia': {'India': ['Bangalore'], 'China':['Shanghai']},
#                 'Africa' : {'Egypt': ['Cairo']}}
            
locations = {'North America': {'USA': ['Mountain View']}}
locations['North America']['USA'].append('Atlanta')
locations['Asia'] = {'India': ['Bangalore']}
locations['Asia']['China'] = ['Shanghai']
locations['Africa'] = {'Egypt': ['Cairo']}

usa_sorted = sorted(locations['North America']['USA'])

for city in usa_sorted:
    print(city)

asia_cities = []

for countries, cities in locations['Asia'].items():
    city_country = cities[0] + " - " + countries 
    asia_cities.append(city_country)
asia_sorted = sorted(asia_cities)

for city in asia_sorted:
    print(city)