# Hash table

A hash table is a data structure that maps keys to values. Keys are typically integers or strings, and values can be data of any type. The mapping is made possible by a <b>hash function</b>, which converts a key into an array index.

Python already has a built-in hash table, the <b>dictionary</b>, which is widely used.

Thanks to https://www.henleypassportindex.com/global-ranking, we have data on how powerful certain passports are.

Suppose we want to look up a country and see how many visa-free destinations its passport allows. A hash table maps a given key (the country) to a given value (the number of destinations). With a dictionary, it would look like this:

In [1]:
passport = {}

# adding entries
passport['Japan'] = 193
passport['Singapore'] = 192
passport['South Korea'] = 191
passport['Germany'] = 191
passport['Italy'] = 190
passport['Finland'] = 190
passport['Spain'] = 190
passport['Luxembourg'] = 190

# output
print('Passport dictionary---')
print(passport)
print()

# select entry
print(f"Finland: {passport['Finland']}")
print()

# remove entry
print('Removing Finland---')
passport.pop('Finland')
print(passport)

Passport dictionary---
{'Japan': 193, 'Singapore': 192, 'South Korea': 191, 'Germany': 191, 'Italy': 190, 'Finland': 190, 'Spain': 190, 'Luxembourg': 190}

Finland: 190

Removing Finland---
{'Japan': 193, 'Singapore': 192, 'South Korea': 191, 'Germany': 191, 'Italy': 190, 'Spain': 190, 'Luxembourg': 190}


Now let's code a hash table from scratch.

The centrepiece of a hash table is the <b>hash function</b>. The one used here takes the character code of each character in the key, adds the codes up, and takes the sum modulo the table size. For example, the character codes of "Finland" are 70, 105, 110, 108, 97, 110, and 100. Their sum is 700, and 700 modulo 511 gives the index value 189.

- $700 \equiv 189 \pmod{511}$

The index value is then used as the location where the data is stored.
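As a quick check of the arithmetic, the following minimal sketch recomputes the character codes, their sum, and the resulting index for "Finland". It assumes the same additive hash and the table size of 511 used in this walkthrough; the variable names are illustrative only.

```python
key = "Finland"
size = 511  # table size assumed from this walkthrough

# ord() returns the integer code of a single character
codes = [ord(char) for char in key]
total = sum(codes)
index = total % size

print(codes)  # [70, 105, 110, 108, 97, 110, 100]
print(total)  # 700
print(index)  # 189
```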

But what happens if two keys map to the same index? This is called a <b>hash collision</b>. One way to resolve a collision is to insert the entry into the next available slot. This is called <b>open addressing</b>. The other way is to keep a list (classically a linked list) at each index and chain the colliding entries together. This is called <b>closed addressing</b>, also known as separate chaining. The implementation below uses closed addressing, with a Python list as each chain.
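For comparison, here is a minimal sketch of open addressing using linear probing, where a colliding entry moves to the next free slot. The class name `LinearProbingTable` and the key "Pains" are made up for illustration; "Pains" is an anagram of "Spain", so the additive hash sends both keys to the same index. Deletion is left out because it needs extra bookkeeping (such as tombstone markers) to keep probe sequences intact.

```python
class LinearProbingTable:
    """Hypothetical open-addressing table (illustrative, not a library API)."""

    def __init__(self, size=511):
        self.size = size
        self.slots = [None] * size  # each slot is None or one (key, value) pair

    def _get_hash(self, key):
        # same additive hash as in this walkthrough
        return sum(ord(char) for char in str(key)) % self.size

    def add(self, key, value):
        start = self._get_hash(key)
        for step in range(self.size):
            index = (start + step) % self.size  # wrap around at the end
            slot = self.slots[index]
            if slot is None or slot[0] == key:
                self.slots[index] = (key, value)  # insert or update
                return True
        return False  # every slot is taken

    def get(self, key):
        start = self._get_hash(key)
        for step in range(self.size):
            index = (start + step) % self.size
            slot = self.slots[index]
            if slot is None:
                return None  # an empty slot ends the probe sequence
            if slot[0] == key:
                return slot[1]
        return None


table = LinearProbingTable()
table.add("Spain", 190)
table.add("Pains", 0)  # made-up key that collides with "Spain"

print(table.get("Spain"))  # 190
print(table.get("Pains"))  # 0
```

Both keys hash to index 507, so "Pains" ends up in slot 508. The trade-off is that lookups may have to scan several neighbouring slots when the table fills up, whereas chaining keeps colliding entries together in one bucket.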

In [2]:
class hashTable:
    def __init__(self):
        self.size = 511
        self.map = [None] * self.size

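    def _get_hash(self, key):
        # sum the character codes of the key, then wrap into the table size
        total = 0  # avoids shadowing the built-in hash()
        for char in str(key):
            total += ord(char)
        return total % self.size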

    def add(self, key, value):
        key_hash = self._get_hash(key)
        key_value = [key, value]

        if self.map[key_hash] is None:
            # first entry at this index: start a new chain
            self.map[key_hash] = [key_value]
            return True
        else:
            for pair in self.map[key_hash]:
                if pair[0] == key:
                    pair[1] = value
                    return True
            self.map[key_hash].append(key_value)
            return True

    def get(self, key):
        key_hash = self._get_hash(key)
        if self.map[key_hash] is not None:
            for pair in self.map[key_hash]:
                if pair[0] == key:
                    return pair[1]
        return None

    def delete(self, key):
        key_hash = self._get_hash(key)

        if self.map[key_hash] is None:
            return False

        for i in range(len(self.map[key_hash])):
            if self.map[key_hash][i][0] == key:
                self.map[key_hash].pop(i)
                return True  # key found and removed
        return False  # key was not in this chain

    def print(self):
        for item in self.map:
            if item is not None:
                print(str(item))

if __name__ == "__main__":
    hashtable = hashTable()

    print('Adding entries---')
    hashtable.add('Japan', 193)
    hashtable.add('Singapore', 192)
    hashtable.add('South Korea', 191)
    hashtable.add('Germany', 191)
    hashtable.add('Italy', 190)
    hashtable.add('Finland', 190)
    hashtable.add('Spain', 190)
    hashtable.add('Luxembourg', 190)

    hashtable.print()
    print()

    print(f"Finland: {hashtable.get('Finland')}")
    print()

    print('Deleting Finland---')
    hashtable.delete('Finland')

    hashtable.print()
    print()

    print(f"Finland: {hashtable.get('Finland')}")

Adding entries---
[['Italy', 190]]
[['South Korea', 191]]
[['Luxembourg', 190]]
[['Finland', 190]]
[['Germany', 191]]
[['Singapore', 192]]
[['Japan', 193]]
[['Spain', 190]]

Finland: 190

Deleting Finland---
[['Italy', 190]]
[['South Korea', 191]]
[['Luxembourg', 190]]
[]
[['Germany', 191]]
[['Singapore', 192]]
[['Japan', 193]]
[['Spain', 190]]

Finland: None


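Hash table lookups, insertions, and deletions take O(1) time on average and O(n) in the worst case, when many keys collide at the same index. For more runtime considerations, please also see: https://bigocheatsheet.io/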