# Hash Tables

## Contents
1. Introduction
2. The Concept
3. Applications
4. Python Implementation - The Dictionary
    - Bonus - **kwargs
5. Additional Resources

******

## Introduction

This post is a continuation of a series on Data Structures and their implementation in Python. If you haven't already, I recommend checking out the other posts [here](https://thefmlblog.blogspot.com/). <br>
Hash Tables are something you have probably heard about a lot but never really paid much attention to. If you are a new Pythonista, you have probably been using them without even knowing, as dictionaries!<br>
In this post, we'll go over what hash tables really are, what a "hash" is in the first place, why to use them, and how to use them in Python.<br> If you are really interested, the Additional Resources section has a link that takes you through creating your **own** Hash Table class in Python and implementing its various functionalities from scratch. Highly recommended!<br>
Let's get started.

******

## The Concept

A Hash Table is a data structure used to store information (*values*). But we already have arrays and all sorts of other ways of storing our information.<br>
So why come up with a new one? 

### Why Hash Tables?

Say you have an array, and you need to access an element in said array. What do you do?<br>
You either need to know the exact index of the item you want, or you have to search through the array until you reach the desired element, which can mean looking at every element. Pretty expensive!<br>
A Hash Table provides nearly constant-time access, O(1) on average! How? <br>
Imagine your data consisted of *key*-*value* pairs, like a lot of real-world data, kind of like -  

In [1]:
[("Name","Thor"),("Age",2000)]

[('Name', 'Thor'), ('Age', 2000)]

Here your ***values*** (Thor, 2000) pertain to certain aspects, or ***keys*** (Name, Age), and say you have thousands of records like these. <br>
Wouldn't it be convenient if we could access our *values* directly from their *keys* and skip the bookkeeping of indices altogether?<br>
But how would you do it?<br>
What if we could convert our *keys* into the indices of the *values* themselves? That way, we can straight away access a *value* using its *key*, voila.<br>
And how would we do that? Using a brilliant idea called a **Hash Function**.

### The Hash function

A Hash function is a function that we will use to convert our ***key*** (a string, in our example) into an integer index.<br>This integer index can then be used to access the corresponding ***value***. <br>
In general, a hash function is any function that can be used to map data of arbitrary size to fixed-size values.<br>
The output of the hash function, called a hash value or simply a hash, is always the same for the same input and thus can be used for indexing. It is not guaranteed to be unique, though: two different keys can produce the same hash (a *collision*), so a hash table also needs a strategy for handling collisions.
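To make the idea concrete, here is a minimal sketch of a toy hash function. The name toy_hash and the table size of 8 are assumptions made up for illustration; this is not how Python hashes strings internally. It adds up the character codes of the key and uses the remainder to fold the sum into a valid index.

```python
def toy_hash(key: str, table_size: int = 8) -> int:
    """Map a string key to an index in range(table_size)."""
    # Sum the Unicode code points of the characters in the key.
    total = sum(ord(ch) for ch in key)
    # The modulo keeps the result inside the bounds of the table.
    return total % table_size


print(toy_hash("Name"))  # 1
print(toy_hash("Age"))   # 5

# Different keys can land on the same index. Anagrams always do here,
# because they have the same character sum.
print(toy_hash("listen") == toy_hash("silent"))  # True
```

The last line shows why a real hash table cannot rely on the index alone and has to deal with keys that share a slot.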

![hash_function.png](attachment:d328f163-6570-42b7-8dc2-50b326fb955d.png)
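Putting the pieces together, a hash table can be sketched in a few lines. This is only an illustrative toy, not the implementation Python uses for dictionaries: the class name TinyHashTable and its put/get methods are made up for this post, it leans on Python's built-in hash() function, and it stores keys that share an index in a small list (a strategy usually called chaining).

```python
class TinyHashTable:
    """A toy hash table that handles collisions by chaining."""

    def __init__(self, size=8):
        # Each bucket is a list of (key, value) pairs.
        self.buckets = [[] for _ in range(size)]

    def _index(self, key):
        # hash() returns an integer; the modulo folds it into a bucket index.
        return hash(key) % len(self.buckets)

    def put(self, key, value):
        bucket = self.buckets[self._index(key)]
        for i, (existing_key, _) in enumerate(bucket):
            if existing_key == key:
                bucket[i] = (key, value)  # overwrite an existing key
                return
        bucket.append((key, value))

    def get(self, key, default=None):
        bucket = self.buckets[self._index(key)]
        for existing_key, value in bucket:
            if existing_key == key:
                return value
        return default


table = TinyHashTable()
table.put("Name", "Thor")
table.put("Age", 2000)
table.put("Age", 2001)       # same key again, so the value is replaced
print(table.get("Name"))     # Thor
print(table.get("Age"))      # 2001
print(table.get("Weapon"))   # None
```

Lookups only ever scan one small bucket instead of the whole table, which is where the near constant-time access comes from.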

********

## Applications

Hash Tables work very well in scenarios where you have a large amount of data and need to quickly search and retrieve information.<br>You can probably imagine several use-cases where such functionality is required.<br>
Some important behind-the-scenes uses are - 
- For compiler symbol tables. The compiler uses a symbol table to keep track of the user-defined symbols in a program (a C++ program, for example). This allows the compiler to quickly look up attributes associated with symbols (for example, variable names).
- For password lookup in systems with multiple users. Hash Tables allow for fast retrieval of the stored password record which corresponds to a given username.

And some where you can probably guess why - 
- Driver's license records. With a hash table, you could quickly get information about the driver (i.e. name, address, age) given the license number.
- Telephone book databases. You could make use of a hash table implementation to quickly look up John Smith's telephone number.
- Electronic library catalogs. Hash Table implementations allow for a fast search among the millions of materials stored in the library.


And there's lots more where that came from.
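Taking the telephone book as an example, the difference between the two approaches looks roughly like this. The names and numbers are made up, and find_in_list is a hypothetical helper written just for this comparison.

```python
# The same records stored two ways (made-up sample data).
phone_list = [("John Smith", "555-0100"), ("Jane Doe", "555-0199")]
phone_book = dict(phone_list)


def find_in_list(name):
    # Linear search: may have to look at every entry before finding a match.
    for entry_name, number in phone_list:
        if entry_name == name:
            return number
    return None


print(find_in_list("John Smith"))  # 555-0100
print(phone_book["John Smith"])    # 555-0100, found via the key's hash
```

With two entries the difference is invisible, but the list search grows with the number of records while the dictionary lookup stays roughly constant.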

**********

## Python Implementation - The Dictionary

You have almost certainly used Hash Tables in Python. Maybe even unknowingly! <br>
In Python, dictionaries are implemented internally as Hash Tables.<br>
Below is a refresher on dictionaries and various operations on them, plus one **Amazing** way of using them.

- A dictionary is a collection which is **changeable** and **indexed** by its keys. Since Python 3.7, dictionaries also preserve insertion order; in older versions they were **unordered**. In Python, dictionaries are written with curly brackets, and they have keys and values.

In [2]:
# A normal dictionary
# Elements are key:value pairs, separated by commas.
normal_dictionary = {"first_name":"steve","last_name":"rogers", "hero_name":"Captain America" }
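Because a dictionary is a hash table underneath, its keys have to be hashable. A quick sketch of what that means in practice (the point_names dictionary is just a made-up example): immutable objects like strings, numbers and tuples work as keys, while a mutable list is rejected.

```python
# A tuple is immutable and hashable, so it is allowed as a key.
point_names = {(0, 0): "origin"}

try:
    # A list is mutable and has no hash, so this raises TypeError.
    point_names[[1, 2]] = "somewhere"
except TypeError as error:
    print("Unhashable key:", error)

# The same key hashes to the same value every time within one run.
print(hash("steve") == hash("steve"))  # True
```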

We can even have a whole other dictionary as an element of a dictionary!

In [3]:
#nested dict
'''
"avengers" will be our main dictionary and each of its elements (ironman, cap america, etc.) will be a dictionary of its own,
containing fields like name and year.
P.S. A lot of the values I put in here are going to be arbitrary, so Marvel nerds, Back Off!
'''
avengers = {
  "ironman" : {
    "name" : "tony",
    "year" : 2004
  },
  "cap america" : {
    "name" : "steve",
    "year" : 2007
  },
  "hulk" : {
    "name" : "bruce",
    "year" : 2011
  }
}

### Accessing the elements 

In [4]:
avengers["cap america"] #Method 1

{'name': 'steve', 'year': 2007}

In [5]:
avengers.get("hulk") # Method 2

{'name': 'bruce', 'year': 2011}

In [6]:
avengers["ironman"]["name"] # Accessing inner dictionary using method 1

'tony'

In [7]:
avengers.get("hulk")["name"]  #Accessing the inner dictionary using method 2

'bruce'
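The two methods behave differently when the key does not exist, which is worth knowing before picking one. A small sketch, using a separate made-up heroes dictionary so it does not disturb the avengers example:

```python
heroes = {"hulk": {"name": "bruce", "year": 2011}}

# Method 1 (square brackets) raises KeyError for a missing key.
try:
    heroes["thor"]
except KeyError as error:
    print("Missing key:", error)

# Method 2 (.get) returns None, or a default that you supply.
print(heroes.get("thor"))                             # None
print(heroes.get("thor", {}).get("name", "unknown"))  # unknown

# The "in" operator checks for a key without fetching its value.
print("thor" in heroes)  # False
```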

### Looping through the dictionary

In [8]:
for x in avengers:
    print(x) # prints all KEY names in dict
    print(avengers[x]) #prints inner dict with key x
    print(avengers[x]["name"]) #accessing level 2 VALUES

ironman
{'name': 'tony', 'year': 2004}
tony
cap america
{'name': 'steve', 'year': 2007}
steve
hulk
{'name': 'bruce', 'year': 2011}
bruce


In [9]:
for x in avengers.values(): #directly accesses nested dict, since nested dicts are values of level 1 dict.
    print(x)
    print(x["name"])

{'name': 'tony', 'year': 2004}
tony
{'name': 'steve', 'year': 2007}
steve
{'name': 'bruce', 'year': 2011}
bruce


In [10]:
for x,y in avengers.items(): # x = key    y = value 
    print(x,y)
    print(y["name"])

ironman {'name': 'tony', 'year': 2004}
tony
cap america {'name': 'steve', 'year': 2007}
steve
hulk {'name': 'bruce', 'year': 2011}
bruce


### Adding and Removing items

In [11]:
avengers["thor"] = {'name': 'thor odinson', 'year': 1004} # adding a value
avengers["cap marvel"] = {'name': 'carol', 'year': 1980} 
avengers

{'ironman': {'name': 'tony', 'year': 2004},
 'cap america': {'name': 'steve', 'year': 2007},
 'hulk': {'name': 'bruce', 'year': 2011},
 'thor': {'name': 'thor odinson', 'year': 1004},
 'cap marvel': {'name': 'carol', 'year': 1980}}

In [12]:
avengers.pop("thor") # removing "thor"
avengers

{'ironman': {'name': 'tony', 'year': 2004},
 'cap america': {'name': 'steve', 'year': 2007},
 'hulk': {'name': 'bruce', 'year': 2011},
 'cap marvel': {'name': 'carol', 'year': 1980}}

In [13]:
del avengers["cap marvel"]["year"] # another method for deleting; here it removes only the "year" key of the nested dict
avengers

{'ironman': {'name': 'tony', 'year': 2004},
 'cap america': {'name': 'steve', 'year': 2007},
 'hulk': {'name': 'bruce', 'year': 2011},
 'cap marvel': {'name': 'carol'}}

Use >>>  dict.clear() to empty the dictionary

Note: Use >>> dict.copy() to copy the dictionary.<br>
dict2 = dict1 does not create a copy; it just makes dict2 another reference to the same dictionary, so a change made through either name **will** show up in the other. Also keep in mind that dict.copy() is a shallow copy, so nested dictionaries are still shared with the original.
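Here is a short sketch of the difference between a plain assignment, a shallow copy and a deep copy. The variable names are made up for illustration; copy.deepcopy comes from Python's standard copy module.

```python
import copy

original = {"hulk": {"name": "bruce"}}
alias = original                 # same dictionary, just a second name
shallow = original.copy()        # new outer dict, shared inner dicts
deep = copy.deepcopy(original)   # fully independent copy

alias["thor"] = {"name": "thor odinson"}
print("thor" in original)  # True, alias and original are one object
print("thor" in shallow)   # False, the copy has its own set of keys

original["hulk"]["name"] = "banner"
print(shallow["hulk"]["name"])  # banner, the nested dict is shared
print(deep["hulk"]["name"])     # bruce, the deep copy is untouched
```

For nested dictionaries like avengers, a deep copy is the safer choice when you want a truly separate copy.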

Now for the Amazing application.<br>
Function keyword arguments!<br>
Functions are often called with named arguments of the form *function_name(argument_name=argument_value)*<br> 
Because dictionary elements are *Key:Value* pairs, we can pass an entire dictionary to a function as an arbitrary number of keyword arguments, and Python will unpack it, mapping the *keys* to argument names and their respective *values* to argument values. The keys must be strings that match the function's parameter names.<br>
Let us see an example below - 

In [14]:
def test_func(a = 4, b = 5): 
    print("The value of a is : " , a) 
    print("The value of b is : " , b) 
test_dict = {'a' : 1, 'b' : 2} 
print("The original dictionary is : " + str(test_dict)) 
print("The default function call yields : ") 
test_func()
print("The function values with unpacking : ") 
test_func(**test_dict) # "**" unpacks the dictionary into keyword arguments (informally called the "double splat")

The original dictionary is : {'a': 1, 'b': 2}
The default function call yields : 
The value of a is :  4
The value of b is :  5
The function values with unpacking : 
The value of a is :  1
The value of b is :  2
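The same "**" also works in the other direction, inside a function definition, which is where the name **kwargs comes from. A minimal sketch (describe_hero is a made-up function for illustration): any keyword arguments the caller passes are collected into an ordinary dictionary.

```python
def describe_hero(**kwargs):
    # kwargs is a plain dict holding whatever keyword arguments were passed.
    return ", ".join(f"{key}={value}" for key, value in kwargs.items())


print(describe_hero(name="steve", year=2007))  # name=steve, year=2007

# Packing and unpacking combine naturally.
hero = {"name": "tony", "year": 2004}
print(describe_hero(**hero))  # name=tony, year=2004
```

The name kwargs is only a convention; it is the "**" in front of the parameter that does the work.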


******

More articles in the series soon!
Upcoming-
<br>
* Trees
* Graphs, and more

<br>
Any feedback or comments are highly appreciated.
<br>
<br>
Cheers!

***

## Additional Resources
- [Hash Function](https://en.wikipedia.org/wiki/Hash_function)
- [Hash Tables from scratch](https://coderbook.com/@marcus/how-to-create-a-hash-table-from-scratch-in-python/)
- [Dictionary Methods](https://www.w3schools.com/python/python_ref_dictionary.asp)