### Example of a hash table for strings

In [1]:
import numpy as np

In [37]:
data = ['Tom', 'Nick', 'Barry', 'Barrold', 'Barrold the great', 'Barrold the wise']
# ord() converts a character into its integer code point (the ASCII code for these names); chr() converts a code point back into a character

# Pre-allocate the hash table
hashTable = ['']*len(data)

In [38]:
# This loop has complexity of order O(d*s) for a data array of length d and strings of length up to s
for name in data:
    
    # Loop over the characters in the name and sum up their ascii codes (This is the hashing function)
    asciiSum = 0
    for char in name:
        asciiSum += ord(char)
    hashLocation = asciiSum % len(data)  # Take remainder after int division of ascii sum with length of array

    # Put the data in the appropriate place
    print(name, asciiSum, hashLocation)
    hashTable[hashLocation] = name
    
print(hashTable)

Tom 304 4
Nick 389 5
Barry 512 2
Barrold 710 2
Barrold the great 1626 0
Barrold the wise 1535 5
['Barrold the great', '', 'Barrold', '', 'Tom', 'Barrold the wise']


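### There have been collisions here: 'Barry' and 'Barrold' both hash to location 2, and 'Nick' and 'Barrold the wise' both hash to location 5, so the later name overwrote the earlier one and two slots stayed empty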

### Common fixes include open and closed addressing. Open addressing moves the colliding item to the next open slot in the table; closed addressing (also called separate chaining) appends the item to a linked list stored at the original hashLocation
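
### A minimal sketch of closed addressing for comparison, assuming a plain Python list per slot in place of a linked list. The names ascii_sum_hash and chained_table are illustrative and do not appear in the cells above

```python
data = ['Tom', 'Nick', 'Barry', 'Barrold', 'Barrold the great', 'Barrold the wise']

def ascii_sum_hash(name, table_size):
    """Sum the character codes of name and wrap to the table size (same hashing function as above)."""
    return sum(ord(char) for char in name) % table_size

# One bucket per slot; a Python list stands in for the linked list (an assumption of this sketch)
chained_table = [[] for _ in range(len(data))]

for name in data:
    # Colliding names are appended to the same bucket instead of overwriting each other
    chained_table[ascii_sum_hash(name, len(data))].append(name)

print(chained_table)
```

### With this data, 'Barry' and 'Barrold' share bucket 2, and 'Nick' and 'Barrold the wise' share bucket 5, so nothing is lost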

### Open addressing

In [35]:
hashTable = [None]*len(data)

for name in data:
    
    # Loop over the characters in the name and sum up their ascii codes (This is the hashing function)
    asciiSum = 0
    for char in name:
        asciiSum += ord(char)
    hashLocation = asciiSum % len(data)  # Take remainder after int division of ascii sum with length of array
    print(name, asciiSum)
    
    # If the desired location is empty, put the data there
    if hashTable[hashLocation] is None:
        hashTable[hashLocation] = name
        print('Open addressing hash location: {:}'.format(hashLocation))
    else:
        locationOccupied = True  # Otherwise, the location is occupied, so do a linear probe of the subsequent locations
        while locationOccupied:
            hashLocation += 1  # Increment the query location by one
            if hashTable[hashLocation%len(hashTable)] is None:  # % wraps around so probing continues from the beginning of the table
                locationOccupied = False
                hashTable[hashLocation%len(hashTable)] = name
                print('Open addressing hash location: {:}'.format(hashLocation%len(hashTable)))
    
print(hashTable)

Tom 304
Open addressing hash location: 4
Nick 389
Open addressing hash location: 5
Barry 512
Open addressing hash location: 2
Barrold 710
Open addressing hash location: 3
Barrold the great 1626
Open addressing hash location: 0
Barrold the wise 1535
Open addressing hash location: 1
['Barrold the great', 'Barrold the wise', 'Barry', 'Barrold', 'Tom', 'Nick']


### Now the hash table is fully occupied without overwrites because of the open addressing implemented above.

### The basic premise is to check whether a location is occupied and, if it is, increment the destination location by one repeatedly until you find an empty one. We wrap around to the front of the table rather than appending new entries to the end, because the table and the data array are the same size
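
### Looking a name up follows the same probe sequence as inserting it. A minimal sketch, assuming nothing is ever deleted from the table (so an empty slot means the name is absent); probe_insert and probe_find are illustrative helper names, not part of the cells above

```python
def ascii_sum_hash(name, table_size):
    """Sum the character codes of name and wrap to the table size."""
    return sum(ord(char) for char in name) % table_size

def probe_insert(table, name):
    """Place name at its hash location or the next free slot; return the slot used."""
    start = ascii_sum_hash(name, len(table))
    for offset in range(len(table)):
        slot = (start + offset) % len(table)  # wrap around to the front of the table
        if table[slot] is None:
            table[slot] = name
            return slot
    raise ValueError('hash table is full')  # avoids probing forever on a full table

def probe_find(table, name):
    """Return the slot holding name, or None if it is not in the table."""
    start = ascii_sum_hash(name, len(table))
    for offset in range(len(table)):
        slot = (start + offset) % len(table)
        if table[slot] is None:  # an empty slot ends the probe sequence (no deletions assumed)
            return None
        if table[slot] == name:
            return slot
    return None  # every slot was checked without a match

data = ['Tom', 'Nick', 'Barry', 'Barrold', 'Barrold the great', 'Barrold the wise']
table = [None]*len(data)
for name in data:
    probe_insert(table, name)

print(probe_find(table, 'Barrold'))  # hashes to 2, found one slot later
print(probe_find(table, 'Harold'))   # not in the table
```

### Bounding the probe at len(table) steps is a design choice of this sketch: the while loop above relies on the table always having a free slot, whereas these helpers stop after one full pass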