# Hash Table

Reference video:

https://www.youtube.com/watch?v=KyUTuwz_b7Q 

### Hashing algorithm
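- For numeric keys, take key % len(arr)
- For alphanumeric keys, convert each character to its ASCII code with ord(), combine the codes (e.g. sum them), then take the result % len(arr)
- Folding method for long numbers: split the digits into groups and add them, e.g. 012345 -> (01 + 23 + 45) % len(arr)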
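
The three approaches above can be sketched as follows. This is only an illustration: the function names, the `size` parameter and the choice to sum the character codes are assumptions, not part of the original notes.

```python
def hash_numeric(key, size):
    # Numeric key: remainder after dividing by the table size
    return key % size

def hash_string(key, size):
    # Alphanumeric key: sum the character codes, then take the remainder
    return sum(ord(ch) for ch in key) % size

def hash_folding(digits, size, chunk=2):
    # Folding: split the digit string into chunks, add them, take the remainder
    parts = [int(digits[i:i + chunk]) for i in range(0, len(digits), chunk)]
    return sum(parts) % size

print(hash_numeric(27, 5))        # 27 % 5 = 2
print(hash_string('ab', 5))       # (97 + 98) % 5 = 0
print(hash_folding('012345', 5))  # (1 + 23 + 45) % 5 = 4
```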




1. Initialise the hash table  
2. Insert each key by hashing it and storing it at the resulting index

In [19]:
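# hash function - the exact formula depends on the question,
# but it usually ends with a modulo by the length of the hash table
# note: this name shadows Python's built-in hash()

def hash(string):
    total = 0
    # weight each character code by its 1-based position
    for i, ele in enumerate(string):
        total += ord(ele) * (i+1)
    total %= 5  # table size is hard-coded to 5 to match the 5-element examples below
    return total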

In [1]:
# initialise hash table

def init_table(n):
    return ['' for i in range(n)] # or [''] * n

## No collisions

We assume that every element produces a unique hash, so no two elements compete for the same index  
  
  
### Time complexity 
Create: O(n)  
Search: O(1)

In [6]:
def hash_table(lst):
    table = init_table(len(lst))
    for ele in lst:
        i = hash(ele)
        table[i] = ele
    return table

In [7]:
# search for key

def search(table, key):
    i = hash(key)
    return table[i] == key

In [12]:
lst = ['WONG CHU HENG TIM', 'KATHRYN CHAN HUI', 'PATEL KRISH KADAMB', 'MANSIB MIRAJ', 'SEAN NG JING HAO']
data_table = hash_table(lst)

print(search(data_table, 'SEAN NG JING HAO'))  # True
print(search(data_table, 'TANG HAOYANG'))      # False

True
False


## Collision

Sometimes, some strings/elements we want to add have the same hash.  
So how do we overcome this problem?
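
To make the problem concrete, here is a small sketch of a collision. `toy_hash` is a hypothetical helper (sum of character codes modulo the table size), chosen because any two anagrams are guaranteed to collide under it.

```python
def toy_hash(key, size):
    # hypothetical hash: sum of character codes modulo the table size
    return sum(ord(ch) for ch in key) % size

size = 5
table = [''] * size

a = toy_hash('ab', size)
b = toy_hash('ba', size)
print(a, b)           # 0 0 - both keys map to the same index

table[a] = 'ab'
table[b] = 'ba'       # overwrites 'ab' because the index is the same
print('ab' in table)  # False - the first key has been lost
```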

### Separate Chaining
 
When two elements hash to the same index, we turn that slot into a list (making the table 2D) and store both elements in it  
  
e.g  
['a', '']  
new ele: 'hello'  
[['a', 'hello'], '']
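
A common variant, sketched here as an alternative rather than as the approach used in these notes, makes every slot a list from the start, so insert and search never need to check the slot's type. The names `bucket_hash`, `build_chained` and `search_chained` are assumptions for illustration.

```python
def bucket_hash(key, size):
    # hypothetical hash: sum of character codes modulo the table size
    return sum(ord(ch) for ch in key) % size

def build_chained(keys, size):
    # one independent empty list per slot
    # ([[]] * size would repeat the same list object in every slot)
    table = [[] for _ in range(size)]
    for key in keys:
        table[bucket_hash(key, size)].append(key)
    return table

def search_chained(table, key):
    # only the chain at the hashed index needs to be scanned
    return key in table[bucket_hash(key, len(table))]

table = build_chained(['a', 'f', 'b'], 5)
print(table)                       # [[], [], ['a', 'f'], ['b'], []]
print(search_chained(table, 'f'))  # True
print(search_chained(table, 'z'))  # False
```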


### Time complexity
Create: O(n)  
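Search: O(n) in the worst case (every element lands in the same chain), close to O(1) when the chains stay short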

In [13]:
def hash_table2(lst):
    table = init_table(len(lst))
    for ele in lst:
        i = hash(ele)
        
        # if current index is empty
        if table[i] == '':
            table[i] = ele
        
        # current index not empty
        else:
            
            # only one element in current index
            # wrap it in a new list together with the new element
            if not isinstance(table[i], list):
                table[i] = [table[i], ele]
            
            # already a list - append to it
            else:
                table[i].append(ele)
    return table

In [14]:
def search2(table, key):
    i = hash(key)
    
    # current index is key
    if table[i] == key:
        return True
    
    # current index is a list - search list
    elif isinstance(table[i], list):
        return key in table[i]
    
    else:
        return False

In [15]:
lst2 = ['WONG CHU HENG TIM', 'KATHRYN CHAN HUI', 'PATEL KRISH KADAMB', 'MANSIB MIRAJ', 'TANG HAOYANG']
data_table2 = hash_table2(lst2)
"""

['WONG CHU HENG TIM', 
'PATEL KRISH KADAMB', 
['KATHRYN CHAN HUI', 'TANG HAOYANG'],
'',
'MANSIB MIRAJ']

"""

print(search2(data_table2, 'TANG HAOYANG'))     # True
print(search2(data_table2, 'KATHRYN CHAN HUI')) # True

True
True


### Open Addressing (Linear Probing)

When the hashed index is already taken, we step forward through the hash table (wrapping around at the end) until we find the next empty index  
  
e.g  
['a', '']  
new ele: 'hello'  
['a', 'hello']  
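
The probing step can be traced with a small numeric example. This is a sketch under assumptions: `insert_linear` is a hypothetical helper, keys are hashed with key % len(table), and None marks an empty slot.

```python
def insert_linear(table, key):
    # start at the hashed index and step forward until an empty slot is found
    i = key % len(table)
    steps = 0
    while table[i] is not None:
        i = (i + 1) % len(table)  # wrap around at the end of the table
        steps += 1
        if steps == len(table):
            raise ValueError('table is full')
    table[i] = key
    return i

table = [None] * 5
# 3, 8 and 13 all hash to index 3; 4 hashes to index 4
slots = [insert_linear(table, k) for k in [3, 8, 13, 4]]
print(slots)  # [3, 4, 0, 1]
print(table)  # [13, 4, None, 3, 8]
```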

### Time complexity
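Create: O(n) when collisions are rare, O(n^2) in the worst case (every insert has to probe past all earlier elements)  
Search: O(n) in the worst case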

In [19]:
def hash_table3(lst):
    table = init_table(len(lst))
    for ele in lst:
        i = hash(ele)
        
        # if index is empty
        if table[i] == '':
            table[i] = ele
        
        # if index is not empty - probe forward to the next empty slot
        else:
            while table[i] != '':
                i = (i+1) % len(table)
            table[i] = ele
    
    return table

In [20]:
def search3(table, key):
    i = hash(key)
    
    # linear probe: start at the hashed index and check at most len(table) slots
    for _ in range(len(table)):
        if table[i] == key:
            return True
        # an empty slot ends the probe sequence - the key was never inserted
        if table[i] == '':
            return False
        i = (i+1) % len(table)
    return False

In [21]:
lst3 = ['WONG CHU HENG TIM', 'KATHRYN CHAN HUI', 'PATEL KRISH KADAMB', 'MANSIB MIRAJ', 'TANG HAOYANG']
data_table = hash_table3(lst3)
"""
['WONG CHU HENG TIM', 
'PATEL KRISH KADAMB', 
'KATHRYN CHAN HUI', 
'TANG HAOYANG', 
'MANSIB MIRAJ']
"""

print(search3(data_table, 'TANG HAOYANG'))       # True
print(search3(data_table, 'NG JUN JIE, KEITH'))  # False

True
False


# Binary Search
  
visualisation: https://www.cs.usfca.edu/~galles/visualization/Search.html

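**Binary search only works on a sorted list/array**  
It works by comparing the middle element of the list with the key: if they match, the search is done; if the key is smaller, the search continues in the left half; otherwise it continues in the right half  
The middle element is arr[mid], where mid = (high + low) // 2, high is the last index and low is the first index of the current search range  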
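
The narrowing of the search range can be seen by recording low, mid and high at every step. `trace_binary_search` is a hypothetical helper written for this illustration only.

```python
def trace_binary_search(arr, key):
    low, high = 0, len(arr) - 1
    steps = []  # (low, mid, high) for every comparison made
    while low <= high:
        mid = (low + high) // 2
        steps.append((low, mid, high))
        if arr[mid] == key:
            return True, steps
        elif key < arr[mid]:
            high = mid - 1  # key can only be in the left half
        else:
            low = mid + 1   # key can only be in the right half
    return False, steps

found, steps = trace_binary_search([1, 3, 5, 7, 9, 11, 13], 11)
print(found)  # True
print(steps)  # [(0, 3, 6), (4, 5, 6)]
```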


### Time Complexity
Best case: O(1)  
Average case: O(log n)  
Worst case: O(log n)
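
The O(log n) worst case can be checked by counting loop iterations. In this sketch, `count_iterations` is a hypothetical helper, and the key searched for is larger than every element, so the loop always runs until the range is empty. The count is compared against floor(log2(n)) + 1.

```python
import math

def count_iterations(arr, key):
    # same loop as an iterative binary search, but returns the number of iterations
    low, high = 0, len(arr) - 1
    count = 0
    while low <= high:
        count += 1
        mid = (low + high) // 2
        if arr[mid] == key:
            break
        elif key < arr[mid]:
            high = mid - 1
        else:
            low = mid + 1
    return count

for n in (10, 1000, 1_000_000):
    worst = count_iterations(range(n), n)  # n is not in range(n)
    print(n, worst, math.floor(math.log2(n)) + 1)
# 10 4 4
# 1000 10 10
# 1000000 20 20
```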

In [3]:
# iterative method

def binary_search(arr, key):
    h = len(arr) - 1
    l = 0
    while l <= h:
        m = (l+h) // 2
        if arr[m] == key:
            return True
        elif key < arr[m]:
            h = m - 1
        else:
            l = m + 1
    return False

lst = list(range(10))
binary_search(lst, 0)

True

In [None]:
# recursive method

def find(arr, key, l, h):
    if l > h:
        return False
    else:
        mid = (l+h) // 2
        if key == arr[mid]:
            return True
        elif key < arr[mid]:
            return find(arr, key, l, mid-1)
        else:
            return find(arr, key, mid+1, h)
        
def binary_search(arr, key):
    l = 0
    h = len(arr) - 1
    return find(arr, key, l, h)
    