# Hashing Technique

+ Chaining

## Chaining

___


Chaining is one of the methods used to resolve collisions, and it falls under open hashing.


Suppose we have a set of keys to store, and the hash table has indexes 0 to 9.

So the hashing function is as follows:

$$ h(x) = x\%{10}$$

Using the above function, we will insert the following keys into the hash table:

$$16,12,25,39,6,122,5,68,75$$


They need to go into this table:

```

                           Hashtable
                             +---+
                             |   |0
                             +---+
                             |   |1
                             +---+
                             |   |2
                             +---+
                             |   |3
                             +---+
                             |   |4
                             +---+
                             |   |5
                             +---+
                             |   |6
                             +---+
                             |   |7
                             +---+
                             |   |8
                             +---+
                             |   |9
                             +---+

```
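As a quick check before filling the table, the slot for each key can be computed directly. The following is only a minimal sketch: the names `hashIndex` and `slotsFor` are illustrative and are not part of the code later in these notes, and keys are assumed to be non-negative.

```cpp
#include <vector>

// h(x) = x % 10, assuming non-negative keys
int hashIndex(int key) {
    return key % 10;
}

// Returns the slot index of every key, in the same order as the input
std::vector<int> slotsFor(const std::vector<int>& keys) {
    std::vector<int> slots;
    for (int k : keys)
        slots.push_back(hashIndex(k));
    return slots;
}
```

For the keys 16, 12, 25, 39, 6, 122, 5, 68, 75 this gives the slots 6, 2, 5, 9, 6, 2, 5, 8, 5, so indexes 2, 5 and 6 each receive more than one key.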

Remember that each key is inserted into a Linked List (LL), just to make sure we understand. That is, the hash table is an array of pointers, and each pointer is the head of a Linked List of nodes.


We get the following Hashtable array of pointers to LL:

```

              Analysis             Hash Table

              +---+---+              +---+
              | 16| 6 |             0|   |
              +---+---+              +---+
              | 12| 2 |             1|   |
              +---+---+              +---+     +---+---+    +---+---+
              | 25| 5 |             2| +-----> | 12| +----> |122| / |
              +---+---+              +---+     +---+---+    +---+---+
              | 39| 9 |             3|   |
              +---+---+              +---+
              | 6 | 6 |             4|   |
              +---+---+              +---+     +---+---+    +---+---+    +---+---+
              |122| 2 |             5| +-----> | 5 | +----> | 25| +----> | 75| / |
              +---+---+              +---+     +---+---+    +---+---+    +---+---+
              | 5 | 5 |             6| +-----> | 6 | +----> | 16| / |
              +---+---+              +---+     +---+---+    +---+---+
              | 68| 8 |             7|   |
              +---+---+              +---+     +---+---+
              | 75| 5 |             8| +-----> | 68| / |
              +---+---+              +---+     +---+---+
              |   |   |             9| +-----> | 39| / |
              +---+---+              +---+     +---+---+



```

From the above it is clear that when more than one key maps to the same index, we insert the keys into that chain in sorted order.

From the above we can see that the keys are inserted in chains.

So with only the array indexes 0-9, each LL can grow to any size. Also, as we saw with LL, inserting a node only requires relinking pointers, so no existing keys have to be shifted.

But how do we search now? Let's say we need to find key = 12. How do we go about getting it?


Procedure

1. Use the hashing function on 12.
2. This gives 2.
3. Go to that index in the hashing table.
4. Then search the LL chain to get the value 12.

So 12 is found in just one comparison.

Also, suppose you need to find a key that is not there, e.g. key = 15.

1. Apply the mod hashing function.
2. We need to go to index 5.
3. We search the chain, pass 5, and get to 25. This is where we stop, since the LL is sorted and 25 is already larger than 15.
4. We thus know we do not need to go further (i.e. the advantage of a sorted list).
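The early stop in the last procedure can be sketched as follows. This is an illustration under assumptions, not the search routine used later in these notes: `ChainNode` and `searchSorted` are hypothetical names, and the optional `comparisons` counter is added only to make the number of inspected nodes visible.

```cpp
#include <cstddef>

// Hypothetical node type mirroring a singly linked chain
struct ChainNode {
    int data;
    ChainNode* next;
};

// Searches an ascending chain for key and stops as soon as a larger value is seen.
// If comparisons is non-null, it receives the number of nodes inspected.
ChainNode* searchSorted(ChainNode* p, int key, int* comparisons = nullptr) {
    int count = 0;
    while (p != nullptr) {
        ++count;
        if (p->data == key)
            break;                 // found the key
        if (p->data > key) {
            p = nullptr;           // passed the place where key would be
            break;
        }
        p = p->next;
    }
    if (comparisons != nullptr)
        *comparisons = count;
    return p;                      // node holding key, or nullptr if absent
}
```

On the chain 5, 25, 75 a search for 15 inspects two nodes (5 and 25) and returns `nullptr` without ever looking at 75.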


So both of these are easy:

1. Inserting
2. Searching










## Analysis

____

Let's say we have 100 keys, and the hash function is mod 10. That means we have 10 slots, with indexes 0-9.


That is

$$n = 100$$
$$size = 10$$

But here our analysis will use different terms.

The first one is the number of elements divided by the table size, and that is called the loading factor:

$$\text{Loading Factor} = \lambda = \frac{n}{size}$$

$$\text{where size} = 10 \text{, the number of slots produced by } x\%{10} \text{, indexed } 0 - 9$$


The loading factor, which is the number of keys divided by the size of the table, is very important in hashing.

Remember that when we looked at time complexity previously, we only looked at the number of elements, n.

But here we also need to look at the size of the hashing table, and that needs to be worked into our time complexity calculation as well. That is why we have the loading factor.


So the analysis of hashing is always based on these two properties:

1. Number of elements (n)
2. Size of the hashing table

Their ratio is what we call the loading factor.

So using our example, what is the loading factor:


$$\lambda = \frac{n}{size}$$

$$\lambda = \frac{100}{10}=10$$


But what does it mean? It means that each index should hold a chain (LL) of 10 keys on average, in a perfect scenario.

So the analysis assumes that the keys are uniformly distributed across the hashing table, with about $\lambda$ keys per chain.
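To make the loading factor concrete, here is a small sketch that computes $\lambda$ and counts how many keys land in each slot. The function names `loadingFactor` and `chainLengths` are illustrative, and the keys 0 to 99 used in the note below are hypothetical data chosen because they spread perfectly evenly under mod 10.

```cpp
#include <vector>

// Loading factor: number of keys divided by the number of slots
double loadingFactor(int n, int size) {
    return static_cast<double>(n) / size;
}

// Counts how many keys fall into each slot under h(x) = x % size,
// assuming non-negative keys
std::vector<int> chainLengths(const std::vector<int>& keys, int size) {
    std::vector<int> lengths(size, 0);
    for (int k : keys)
        ++lengths[k % size];
    return lengths;
}
```

With the 100 keys 0 to 99 and a table of size 10, `loadingFactor(100, 10)` is 10 and every chain length is exactly 10, which is the perfect scenario described above. Real key sets will usually be less even.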



## Time Taken for Searching

___


Time taken for a successful search:

Applying the hash function gives us the index in constant time: $t = 1$.

Once we have this index, we need to search the LL for that particular key. On average the key is found halfway down a chain of $\lambda$ keys, so that takes lambda divided by 2: $\frac{\lambda}{2}$


So the average time for a **successful search**:

$$\text{Ave Time Succ.} = t = 1 + \frac{\lambda}{2}$$


What, then, is the time for looking up a key that is not in the table, e.g. 99? Again it takes constant time to get to the index: $t = 1$.

Then we need to check the entire chain at that index, so this has to be the maximum time for that chain. How many keys does a chain hold? On average $\lambda$, which is 10 in our example (not all 100 keys). So we take the full lambda: $\lambda$

So the average time for an **unsuccessful search**:

$$\text{Ave Time Unsucc.} = t = 1 + \lambda$$
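
As a worked example, plugging the loading factor $\lambda = 10$ from the analysis section into both formulas gives:

$$\text{Ave Time Succ.} = 1 + \frac{10}{2} = 6$$

$$\text{Ave Time Unsucc.} = 1 + 10 = 11$$

These are step counts (one hash computation plus chain comparisons) and they hold only under the assumption that the keys are uniformly distributed.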




## Time Taken for Deleting a Key

___

Let's say we need to delete key = 12.


Procedure

1. Apply the hash function to 12.
2. This takes us to index 2 in the hashing table.
3. Then search for the key in the LL at that index, and delete it.
4. This is the same as deleting a key from an LL (so we can reuse that procedure).
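
The delete procedure can be sketched as below. This is a minimal illustration, not code from the LL topic: `ChainNode`, `deleteFromChain` and `hashTableDelete` are hypothetical names, the table is assumed to have 10 slots with $h(x) = x\%10$, and nodes are assumed to have been allocated with `new`.

```cpp
#include <cstddef>

// Hypothetical node type mirroring a singly linked chain
struct ChainNode {
    int data;
    ChainNode* next;
};

// Removes the first node holding key from the chain at *head.
// Returns true if a node was removed, false if key was not present.
bool deleteFromChain(ChainNode** head, int key) {
    ChainNode* prev = nullptr;
    ChainNode* cur = *head;
    while (cur != nullptr && cur->data != key) {
        prev = cur;
        cur = cur->next;
    }
    if (cur == nullptr)
        return false;              // key is not in this chain
    if (prev == nullptr)
        *head = cur->next;         // removing the head node
    else
        prev->next = cur->next;    // unlink a middle or tail node
    delete cur;                    // assumes the node came from new
    return true;
}

// Deletes key from a 10-slot table: hash first, then delete from that chain
bool hashTableDelete(ChainNode* table[], int key) {
    return deleteFromChain(&table[key % 10], key);
}
```

Passing the address of the slot (`&table[index]`) lets the function update the head pointer when the first node of a chain is the one being removed.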





## Further Analysis

___


The mod function does not have to be mod 10.

Remember that by taking mod 10 we are looking at the last digit.

What if we want to look at the second last digit? Then we can first drop the last digit with an integer division by 10, and modify the hashing function as:

$$ h(x) = (x/10)\%{10}$$

But the reason why we use mod is that it limits the table to 10 entries (0-9). So limiting the table is one reason why we use mod 10.


But here is a drawback of mod 10. Suppose we have the keys:

$$5,35,95,145,175,265,345,635,15,25,55$$

We see that all the keys end in 5, so using mod 10 we get a cluster of keys at index 5, and the rest of the hashing table stays empty.

So the assumption behind the loading factor, that the keys are uniformly distributed, is false here.

The choice of hash function controls this distribution. So it is our responsibility to select a suitable hash function in order to keep the keys uniformly distributed.
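
The effect of the hash function on the distribution can be checked with a short sketch. The names `countPerSlot`, `lastDigit` and `secondLastDigit` are illustrative, and the second function assumes that the second-to-last digit is obtained by integer division by 10 followed by mod 10, for non-negative keys.

```cpp
#include <vector>

// Last decimal digit of a non-negative key
int lastDigit(int key) {
    return key % 10;
}

// Second-to-last decimal digit: drop the last digit, then take mod 10
int secondLastDigit(int key) {
    return (key / 10) % 10;
}

// Counts how many keys each of the 10 slots receives under the given hash
std::vector<int> countPerSlot(const std::vector<int>& keys, int (*hash)(int)) {
    std::vector<int> counts(10, 0);
    for (int k : keys)
        ++counts[hash(k)];
    return counts;
}
```

For the 11 keys listed above, `lastDigit` puts all 11 into slot 5, while `secondLastDigit` gives the per-slot counts 1, 1, 1, 2, 2, 1, 1, 1, 0, 1 for slots 0 to 9. Neither function is good in general; the point is only that the same keys spread very differently depending on which hash function is chosen.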




## Let's Code

___

We need code for LL
+ Inserting 
+ Deleting 
+ Searching

The above was done when covering LL topic.


In [None]:
#include <iostream>
#include <climits>
#include <math.h>
#define INSERTION_OPERATOR operator<<
#define EXTRACTION_OPERATOR operator>>
#define ADDITION_OPERATOR operator+
using namespace std;

In [None]:
class Node{
public:
    int data;
    Node *next;
}*f=NULL;

In [None]:
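void InsertSorted(Node **H, int x){
    Node *t=NULL,*q=NULL,*p=*H;
    
    //create the new node
    t = new Node;
    t->data = x;
    t->next=NULL;
    
    //walk to the first node that is not smaller than x; q trails one node behind p
    while(p && p->data < x){
        q=p;
        p=p->next;
    }
    t->next=p;
    
    //q is still NULL if the chain is empty or x is smaller than the head,
    //so in both cases the new node becomes the head
    if (q==NULL)
        *H=t;
    else
        q->next=t;
}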

In [None]:
Node *SearchLL(Node *p, int key){
    while(p){
        if(p->data==key)
            return p;
        p = p->next;    
    }
   return NULL; 
}

In [None]:
//Hashing function
int HashFunc(int key){
    return key%10;
 
}

In [6]:
//Insert into an array H[] of pointers
void HashTableInsert(Node *H[], int key){
    
    int index = HashFunc(key);
    
    //f= H[index];
        
    InsertSorted(&H[index],key);
    
}

In [7]:
Node *HT[10];

int i; 

//Initialize HT,using for Loop
for(i=0; i < 10; i++)
    HT[i]=NULL;


HashTableInsert(HT,12);
HashTableInsert(HT,22);
HashTableInsert(HT,42);

HashTableInsert(HT,13);
HashTableInsert(HT,23);
HashTableInsert(HT,43);


HashTableInsert(HT,16);
HashTableInsert(HT,26);
HashTableInsert(HT,46);


//display
for(int i=0; i < 10; i++){
   
    if (HT[i]){
        cout<<"["<<i<<"]:";
        Node *p= HT[i];
        while(p){
            cout<<p->data<<" ";
            p=p->next;
        }
        cout<<endl;
    }    
    else
    {
        cout<<"["<<i<<"]:"<<HT[i]<< " "<<endl; 
    }
}    


[0]:0 
[1]:0 
[2]:12 22 42 
[3]:13 23 43 
[4]:0 
[5]:0 
[6]:16 26 46 
[7]:0 
[8]:0 
[9]:0 


In [8]:
Node *temp;

int key = 22;

temp=SearchLL(HT[HashFunc(key)], key);
cout<<"Found Node: "<<temp->data<<endl;

Found Node: 22


In [9]:
// dereferencing temp2 fails here: key 99 is not in the table, so SearchLL returns NULL
Node *temp2;

int key2 = 99;

temp2=SearchLL(HT[HashFunc(key2)], key2);
cout<<"Found Node: "<<temp2->data<<endl;




Found Node: 

cout<<"Found Node: "<<temp2->data<<endl;
                      ^~~~~

Interpreter Exception: 
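
The exception above comes from dereferencing a null pointer. One way to avoid it is to check the search result before using it. The following is a self-contained sketch under assumptions: `ChainNode`, `searchChain` and `reportSearch` are hypothetical names that mirror `Node`, `SearchLL` and the search cells above, and the table is assumed to have 10 slots with `key % 10` as the hash.

```cpp
#include <iostream>

// Hypothetical node type mirroring the Node class above
struct ChainNode {
    int data;
    ChainNode* next;
};

// Linear search of one chain; returns the node or nullptr if key is absent
ChainNode* searchChain(ChainNode* p, int key) {
    while (p != nullptr) {
        if (p->data == key)
            return p;
        p = p->next;
    }
    return nullptr;
}

// Looks key up in a 10-slot table and prints the outcome.
// Returns true if the key was found.
bool reportSearch(ChainNode* table[], int key) {
    ChainNode* hit = searchChain(table[key % 10], key);
    if (hit == nullptr) {
        // Only report; never dereference a null result
        std::cout << "Key " << key << " not found" << std::endl;
        return false;
    }
    std::cout << "Found Node: " << hit->data << std::endl;
    return true;
}
```

With a table whose slot 2 holds the chain 12, 22, 42, calling `reportSearch(table, 22)` prints the found node, while `reportSearch(table, 99)` prints a "not found" message instead of raising an exception.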