# Complexity 
For C++:
* For `unordered_map`/`unordered_set`, both `insert()` and `find()` have O(1) average complexity (O(N) in the worst case, when many keys collide).
* For `std::map`/`std::set`, both `insert()` and `find()` have O(logN) complexity. The extra cost buys ordered keys.
* `multiset` and `multimap` allow repeated keys. Also note that `set`/`multiset` live in the `<set>` header and `map`/`multimap` in the `<map>` header, while `unordered_map` and `unordered_set` live in separate headers with the `unordered_` prefix (`<unordered_map>` and `<unordered_set>`).
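
The differences listed above can be seen in a few lines. This is a minimal sketch; the helper names `smallestKey`, `copies` and `contains` are made up for illustration and are not part of the standard library.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <set>
#include <string>
#include <unordered_map>

// std::map keeps its keys sorted, so begin() always points at the smallest key.
int smallestKey(const std::map<int, std::string>& m) {
    assert(!m.empty());
    return m.begin()->first;
}

// std::multiset stores repeated values; count() reports how many copies exist.
std::size_t copies(const std::multiset<int>& ms, int value) {
    return ms.count(value);
}

// std::unordered_map lookup: O(1) on average, no ordering guarantee.
bool contains(const std::unordered_map<int, std::string>& um, int key) {
    return um.find(key) != um.end();
}
```

Inserting the keys 3, 1, 2 into a `std::map` and calling `smallestKey` returns 1 regardless of insertion order, which is the property the O(logN) cost pays for.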

# Application
* The key point is that a hash map gives O(1) average access, compared with O(N) for linear search and O(logN) for binary search. However, this is only the starting point for using the hash-map idea.  

* Use array elements as keys and array indices as values. See the example of finding the distance between duplicate elements.

* Sometimes a vector or another sequential data structure can serve as the hash table. In the Anagram problem, for example, we use `vector(128,0)` indexed by character code for the lookups. This can be even faster because no hash function has to be computed.

* Use array elements as keys, and check whether `umap/uset.find(arrayelement - shift)` finds an entry.
    * Find whether an element is the start of a consecutive sequence.
    * Find the two sum and three sum.
    * Borrowing from dynamic programming, pre-calculate some quantity such as the sum up to index i and then use a hash map, as in the subarray-sum-k problem.

* Although a one-pass hash table (inserting into the map while looping) is often preferable, it is not always possible. For example, in finding the longest subsequence, we need to insert all the elements into a map first.

* Note that a set is just a special form of map where key = value, so `unordered_set` is also a hash table. We can use a set to collect distinct numbers.
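
Three of the patterns above, sketched in C++. The function names are hypothetical, "distance between duplicate elements" is taken here to mean the smallest index gap between two equal elements, and the anagram check assumes 7-bit ASCII input.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>
#include <vector>

// Element -> last index seen. Returns the smallest index gap between two
// equal elements, or -1 if the array has no duplicates.
int minDuplicateDistance(const std::vector<int>& a) {
    std::unordered_map<int, int> lastIndex;
    int best = -1;
    for (int i = 0; i < static_cast<int>(a.size()); ++i) {
        auto it = lastIndex.find(a[i]);
        if (it != lastIndex.end()) {
            int d = i - it->second;
            if (best == -1 || d < best) best = d;
        }
        lastIndex[a[i]] = i;  // one-pass: insert while looping
    }
    return best;
}

// A vector indexed by character code plays the role of the hash table,
// so no hash function is needed.
bool isAnagram(const std::string& s, const std::string& t) {
    if (s.size() != t.size()) return false;
    std::vector<int> count(128, 0);
    for (unsigned char c : s) {
        assert(c < 128);
        ++count[c];
    }
    for (unsigned char c : t) {
        assert(c < 128);
        if (--count[c] < 0) return false;
    }
    return true;
}

// Prefix sums plus a hash map: counts contiguous subarrays whose sum is k.
// For each prefix sum p, every earlier prefix equal to p - k ends a match.
int countSubarraysWithSum(const std::vector<int>& a, int k) {
    std::unordered_map<long long, int> seen;
    seen[0] = 1;  // the empty prefix
    long long prefix = 0;
    int count = 0;
    for (int x : a) {
        prefix += x;
        auto it = seen.find(prefix - k);
        if (it != seen.end()) count += it->second;
        ++seen[prefix];
    }
    return count;
}
```

In `countSubarraysWithSum` the lookup `seen.find(prefix - k)` is the `find(arrayelement - shift)` idea applied to prefix sums instead of raw elements.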


# Two intuitive examples
As will be explained later, a hash function is normally customized and optimized for a specific situation; it cannot be fully general. The following two examples from my old projects are really naïve hash tables. 

* Declare a very big vector whose size is much bigger than the maximum orderId, and store each reqId at the index given by its orderId. When an orderId is returned, it is just an index into the vector, so I can access my reqId with O(1) complexity.

* If I have a vector of strings and need to search for one of them, another naïve hash idea applies. I can convert, e.g., ABCD to four digits: 'A' - 65 = 0, 'B' is 1, etc. Then we use the resulting number as the array index. 
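
Both ideas can be sketched as follows. All names here (`OrderTable`, `kNoReq`, `lettersToIndex`) are illustrative, and combining the four digits as a base-26 number is an assumption about how they become a single index.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Marker for an empty slot (assumes real reqIds are never negative).
constexpr int kNoReq = -1;

// Direct-address table: the orderId itself is the index, the slot holds the reqId.
class OrderTable {
public:
    explicit OrderTable(std::size_t maxOrderId)
        : reqIds_(maxOrderId + 1, kNoReq) {}

    void insert(std::size_t orderId, int reqId) {
        assert(orderId < reqIds_.size());
        reqIds_[orderId] = reqId;
    }

    // O(1) lookup; returns kNoReq for unknown or out-of-range orderIds.
    int find(std::size_t orderId) const {
        return orderId < reqIds_.size() ? reqIds_[orderId] : kNoReq;
    }

private:
    std::vector<int> reqIds_;
};

// Reads a string of uppercase letters as a base-26 number:
// "ABCD" -> 0*26^3 + 1*26^2 + 2*26 + 3.
std::size_t lettersToIndex(const std::string& s) {
    std::size_t index = 0;
    for (char c : s) {
        assert(c >= 'A' && c <= 'Z');
        index = index * 26 + static_cast<std::size_t>(c - 'A');
    }
    return index;
}
```

The price of both tricks is memory: the table must be as large as the biggest possible index, whether or not most slots are ever used.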

# Hash Tables
### How do Hash Tables work?
When it comes to how hash tables work, there are three main things to talk about:  
* Hash Function - The hash function is what makes accessing and searching data in the hash table fast. It can be any algorithm customized and optimized for a particular scenario. It takes the key as input and gives out the position in the table where the corresponding value is stored. The same input key must produce the same output every time.

* Key-Value Pair - The key-value pair is the basic unit of the hash table data structure. In the situations it suits, it makes searching really efficient.

* Collisions - A collision occurs when two different keys are mapped to the same position in the table. A hash table cannot guarantee that every position holds just one entry, because it generally has a fixed number of positions; once there are more entries than positions, some of them must share, causing collisions. A good hash table tries to avoid collisions, handles the ones that do occur efficiently, and at the same time keeps the table small. This is a tradeoff. It also brings out another important feature of a good hash function: a uniform distribution of keys over the table.
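
One common way to handle collisions is separate chaining, where each position holds a small list of key-value pairs. The text above does not prescribe a strategy, so the following is only a minimal sketch of that one technique; `ChainedTable` and its methods are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <list>
#include <string>
#include <utility>
#include <vector>

// Minimal separate-chaining table from string keys to int values.
class ChainedTable {
public:
    explicit ChainedTable(std::size_t bucketCount) : buckets_(bucketCount) {
        assert(bucketCount > 0);
    }

    // Inserts the pair, or overwrites the value if the key already exists.
    void put(const std::string& key, int value) {
        auto& chain = buckets_[slot(key)];
        for (auto& kv : chain) {
            if (kv.first == key) {
                kv.second = value;
                return;
            }
        }
        chain.emplace_back(key, value);
    }

    // Returns a pointer to the stored value, or nullptr if the key is absent.
    const int* get(const std::string& key) const {
        const auto& chain = buckets_[slot(key)];
        for (const auto& kv : chain) {
            if (kv.first == key) return &kv.second;
        }
        return nullptr;
    }

private:
    // Same key, same slot, every time; different keys may share a slot.
    std::size_t slot(const std::string& key) const {
        return std::hash<std::string>{}(key) % buckets_.size();
    }

    std::vector<std::list<std::pair<std::string, int>>> buckets_;
};
```

With `ChainedTable t(1)` every key lands in the same position, yet `put` and `get` still work; lookups simply degrade to a linear scan of the chain, which is why a uniform distribution matters.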

### Hash Tables Implementation in C
Here I have tried to implement a simple hash table in C. The scenario is a telephone directory storing the STD code for each state. Each state is used as a key and its STD code as the value. Here is the general depiction of our hash table:

    [PB] -> 176
    [JK] -> 172
    [HP] -> 177
    ...

The underlying data structure is an array of STD codes. Each state is identified by a two-character code, and a hash function computes from that code the index at which the state's STD code is stored in the array.

To start with, here is our basic data structure, an array of fixed size:

    int code[TOTAL];

We need a hash function to compute an index from the state. The algorithm used is as follows:

* Each state is a key.
* Each key is stored as a two-character uppercase string code.
* Compute a value for each character by subtracting 65 (the ASCII value of 'A') from its ASCII value. In this article, this value is called the character number.
* Now multiply the character number of the second character by 26 and add it to the character number of the first character. The formula is:  
`val = ((state[1] - 65) * 26) + (state[0] - 65)`  
For example, "PB" gives 1 * 26 + 15 = 41. Note that 26 is a magic number chosen to avoid collisions: with 26 letters in the alphabet, no two valid two-letter keys map to the same index. The trade-off is that it increases the size of the hash table.
* Now this value is the index in our STD code array at which the STD code for that particular state is stored.

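Here is what our hash function looks like:

    #include <stdio.h>
    #include <string.h>

    #define ASCII_A  65   /* ASCII value of 'A' */
    #define MAGICNUM 26   /* number of letters in the alphabet */

    /* Maps a two-letter uppercase state code to an index in [0, 26*26).
       Returns -1 for invalid input. */
    int hashForKey(const char *state)
    {
        int index;
        /* state string has to be exactly 2 chars; an empty string is
           rejected too, otherwise it would share index 0 with "AA" */
        if (state == NULL || strlen(state) != 2)
        {
            printf("Invalid State string\n");
            return -1;
        }
        /* both characters must be uppercase letters, otherwise the
           index would fall outside the table */
        if (state[0] < 'A' || state[0] > 'Z' || state[1] < 'A' || state[1] > 'Z')
        {
            printf("Error in hash function\n");
            return -1;
        }
        index = (state[0] - ASCII_A) + ((state[1] - ASCII_A) * MAGICNUM);
        return index;
    }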

Storing and retrieving can now happen with the help of this hash function. The maximum number of entries this hash table can store is 26*26, a limit set by two-letter keys over the English alphabet.
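
Storing and retrieving might look like the sketch below, written in C++ but kept close to the C code. `storeCode` and `lookupCode` are hypothetical helpers, `TOTAL` is assumed to be 26*26, a stored value of 0 is assumed to mean "no entry", and `hashForKey` is restated compactly (without the `printf` diagnostics) so the block stands on its own.

```cpp
#include <cassert>
#include <cstring>

const int ASCII_A = 65;
const int MAGICNUM = 26;
const int TOTAL = MAGICNUM * MAGICNUM;  // 676 possible two-letter keys

int code[TOTAL];  // zero-initialised global; 0 is treated as "no entry"

// Compact restatement of the hash function: -1 signals an invalid key.
int hashForKey(const char* state) {
    if (state == nullptr || std::strlen(state) != 2) return -1;
    if (state[0] < 'A' || state[0] > 'Z' || state[1] < 'A' || state[1] > 'Z')
        return -1;
    return (state[0] - ASCII_A) + (state[1] - ASCII_A) * MAGICNUM;
}

// Stores stdCode under `state`; returns false if the key is invalid.
bool storeCode(const char* state, int stdCode) {
    int index = hashForKey(state);
    if (index < 0) return false;
    assert(index < TOTAL);
    code[index] = stdCode;
    return true;
}

// Returns the stored code, or 0 if the key is invalid or has no entry.
int lookupCode(const char* state) {
    int index = hashForKey(state);
    return index < 0 ? 0 : code[index];
}
```

After `storeCode("PB", 176)`, the value sits at `code[41]` and `lookupCode("PB")` reads it back with a single array access.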



This program is just a simple implementation of a hash table to give a general idea to a beginner. There is huge scope for improvement and optimization. Once you understand it, a few limitations that can be worked on are:

* A better magic number, which could reduce the fixed size of the hash table considerably.
* Handling collisions, if any.
* A better algorithm for turning the states into keys in a more optimized way.