In [1]:
ord('f')

102

In [5]:
sum(map(ord, 'hello world'))

1116

In [8]:
sum(map(ord, 'gello xorld'))

1116

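Ideally, each string we hash should get a different value. Two different strings producing the same hash value is called a collision, and a plain sum of character codes collides for any two strings whose codes add up to the same total, as seen above. To reduce collisions, we can weight each character by a multiplier that grows with its position: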

In [12]:
def myhash(string):
    multiplier = 1
    hashval = 0
    for character in string:
        hashval += multiplier * ord(character)
        multiplier += 1
    return hashval

In [13]:
#To test the above function:
for sentence in ('hello world', 'world hello', 'gello xorld'):
    print("{}: {}".format(sentence, myhash(sentence)))

hello world: 6736
world hello: 6616
gello xorld: 6742


However, when we try this hashing function with other strings, it still produces collisions, so it is not yet perfect. See below:

In [14]:
print("{}: {}".format('ad', myhash('ad')))
print("{}: {}".format('ga', myhash('ga')))

ad: 297
ga: 297
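
The collision between 'ad' and 'ga' can be checked by hand by writing out each weighted term. The following is a small sketch of that check; `weighted_terms` is a hypothetical helper introduced only for this illustration.

```python
def weighted_terms(string):
    """Return the position * ord(character) terms that myhash adds up."""
    # Positions start at 1, matching the multiplier used in myhash
    return [position * ord(character)
            for position, character in enumerate(string, start=1)]

for word in ('ad', 'ga'):
    terms = weighted_terms(word)
    print("{}: {} -> {}".format(word, terms, sum(terms)))
```

For 'ad' the terms are 97 and 200, for 'ga' they are 103 and 194, and both pairs add up to 297. The position weights make collisions rarer, but different combinations of terms can still reach the same total.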


We will come back to strategies for resolving collisions like the one above later. Next, we are going to implement hash tables.

A hash table is a form of list where elements are accessed by a keyword rather than an
index number. At least, this is how the client code will see it. Internally, it will use a slightly
modified version of our hashing function in order to find the index position in which the
element should be inserted. This gives us fast lookups, since we are using an index number
which corresponds to the hash value of the key.
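
As a rough sketch of that idea, the position-weighted hash from earlier can be reduced modulo the table size to give a slot index. The name `slot_index` and the default size of 256 are assumptions made for this illustration.

```python
def myhash(string):
    # Same position-weighted sum as the myhash function defined earlier
    hashval = 0
    for multiplier, character in enumerate(string, start=1):
        hashval += multiplier * ord(character)
    return hashval

def slot_index(key, size=256):
    # The remainder is always in the range 0 .. size - 1,
    # so it can be used directly as a list index
    return myhash(key) % size

for key in ('hello world', 'ad'):
    print("{}: hash {} -> slot {}".format(key, myhash(key), slot_index(key)))
```

With 256 slots, 'hello world' (hash 6736) lands in slot 80 and 'ad' (hash 297) lands in slot 41.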

In [15]:
#We start by implementing a HashItem class that represents a single entry in the hash table

class HashItem:
    def __init__(self, key, value):
        self.key = key
        self.value = value
    #Each item in the hash table has a key and a value, as implemented above.
    #This gives us a very simple way to store items. Next, we start working on
    #the hash table class itself.

In [25]:
class HashTable:
    #Our hash table has a fixed size of 256 slots (buckets)
    def __init__(self):
        self.size = 256
        self.slots = [None] * self.size
        #count tracks the number of filled slots, i.e. the number of
        #key-value pairs stored in the hash table
        self.count = 0

    def _myhash(self, key):
        #Hash function that converts a string key into a slot index.
        #Different keys can still produce the same index (a collision).
        #The leading underscore marks the method as intended for internal use only.
        mult = 1
        hashval = 0
        for ch in key:
            hashval += mult * ord(ch)
            mult += 1
        #Take the remainder modulo the table size so that the result is
        #always a valid index between 0 and 255
        return hashval % self.size

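    def put(self, key, value):
        item = HashItem(key, value)
        hashedKey = self._myhash(key)
        #Linear probing: step forward until we reach an empty slot or the
        #slot that already holds this key.
        #Note: if all slots are full and the key is absent, this loop never ends.
        while self.slots[hashedKey] is not None:
            #Compare keys by equality (==), not by identity (is)
            if self.slots[hashedKey].key == key:
                break
            hashedKey = (hashedKey + 1) % self.size
        #Only increase the count when a previously empty slot gets filled
        if self.slots[hashedKey] is None:
            self.count += 1
        self.slots[hashedKey] = item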


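    def get(self, key):
        hashedKey = self._myhash(key)
        #Follow the same probe sequence that put() uses
        while self.slots[hashedKey] is not None:
            #Compare keys by equality (==), not by identity (is)
            if self.slots[hashedKey].key == key:
                return self.slots[hashedKey].value
            hashedKey = (hashedKey + 1) % self.size
        #Reaching an empty slot means the key is not in the table
        return None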

    #Since we want our hash table to behave like a list, putting and getting
    #should not only work as ht.put("key", value) and ht.get("key"), but also
    #with square brackets, e.g. ht["key"]. The two methods below make that possible:
    def __setitem__(self, key, value):
        self.put(key, value)

    def __getitem__(self, key):
        return self.get(key)

In [23]:
ht = HashTable()
ht.put("Teenager", "Celestine")
ht.put("Babygirl", "VarcyMarie")
ht.put("FirstBorn", "Michael")
ht.put("Mother", "Margaret")
ht.put("BornFirst", "CollisionTest")

In [24]:
for key in ("Teenager", "Babygirl", "Mother", "FirstBorn", "BornFirst", "Cousin"):
    value = ht.get(key)
    print(value)

Celestine
VarcyMarie
Margaret
Michael
CollisionTest
None
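
The way put and get handle two keys that hash to the same slot is called linear probing: the search moves to the next slot until it finds a free one or a matching key. Below is a minimal standalone sketch of that step, assuming a deliberately tiny table of 8 slots so the effect is easy to see. `probe_insert` and the tuple-based slots are hypothetical simplifications, not the HashTable class above.

```python
def myhash(key, size):
    # Position-weighted sum of character codes, reduced to a slot index
    hashval = 0
    for multiplier, ch in enumerate(key, start=1):
        hashval += multiplier * ord(ch)
    return hashval % size

def probe_insert(slots, key, value):
    """Store (key, value) using linear probing and return the slot index used."""
    size = len(slots)
    index = myhash(key, size)
    # Try each slot at most once so a full table cannot cause an endless loop
    for _ in range(size):
        if slots[index] is None or slots[index][0] == key:
            slots[index] = (key, value)
            return index
        index = (index + 1) % size
    raise RuntimeError("hash table is full")

slots = [None] * 8
first = probe_insert(slots, "ad", "first value")
second = probe_insert(slots, "ga", "second value")
print(first, second)
```

Both 'ad' and 'ga' hash to 297, which is slot 1 in a table of 8. The first key takes slot 1, and the second is pushed along to slot 2.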


After adding the __setitem__ and __getitem__ methods, our test code looks like this:

In [26]:
ht1 = HashTable()
ht1["Father"] = "Dennis"
ht1["EldestSon"] = "Michael"
ht1["SonEldest"] = "CollisionTest"
ht1["YoungestSon"] = "Celestine"
ht1["Mother"] = "Margaret"

In [27]:
for key in ("Father", "EldestSon", "SonEldest", "YoungestSon", "Mother", "Sister"):
    value = ht1[key]
    print(value)
print("The number of elements is: {}".format(ht1.count))

Dennis
Michael
CollisionTest
Celestine
Margaret
None
The number of elements is: 5
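
One detail worth checking in any lookup like this is how keys are compared. Two strings with the same characters are equal, but they are not guaranteed to be the same object in memory, so key comparison should rely on `==` rather than `is`. The short sketch below builds an equal key at runtime to illustrate the difference; the variable names are made up for this example.

```python
literal_key = "Mother"
# Build an equal string at runtime, as would happen with user input or file data
built_key = "".join(["Mo", "ther"])

# Equality compares the characters of the two strings
print(literal_key == built_key)
# Identity asks whether both names refer to one and the same object;
# the result depends on the interpreter and should not be relied on
print(literal_key is built_key)
```

The first comparison is always True. The second depends on how the interpreter stores strings, which is why a hash table that compares keys by identity may fail to find a key that was not written as a literal.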
