# Hashing

Hashing is a technique for mapping keys (and their associated values) to slots of a hash table by means of a hash function. It is used for fast access to elements. The efficiency of the mapping depends on the quality of the hash function used.

* Search
* Insert
* Delete
> With hashing, search, insert and delete all run in O(1) time on average
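
As a small illustration of how a hash function maps keys to slots, here is a minimal sketch using the division method (`key % num_buckets`). The helper name `bucket_index` and the sample keys are assumptions made for this example only.

```python
def bucket_index(key, num_buckets):
    """Map an integer key to a slot index using the division method."""
    return key % num_buckets


keys = [70, 71, 9, 56, 72]
slots = {key: bucket_index(key, 7) for key in keys}
print(slots)  # {70: 0, 71: 1, 9: 2, 56: 0, 72: 2}
```

Keys 70 and 56 land in the same slot, and so do 9 and 72. Such collisions are what the collision handling strategies in these notes deal with.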

*Not Useful for*
1. Finding the closest value (use an AVL tree instead)
2. Keeping data sorted (use an AVL tree instead)
3. Prefix searching (use a trie instead)

*Collision Handling*

1. Chaining
    * [Implementation of Chaining](#Implementation-of-Chaining)
2. Open Addressing
    1. Linear Probing
    2. Quadratic Probing
    3. Double Hashing
    
    
**Problems**

1. [Frequencies of Array Elements](#Frequencies-of-Array-Elements)

# Collision Handling

## Chaining


* All keys that hash to the same index are stored in a chain (for example a linked list) attached to that slot of the array.

#### Implementation of Chaining

In [1]:
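class MyHash:
    def __init__(self, b):
        self.BUCKET = b                      # number of buckets
        self.table = [[] for _ in range(b)]  # one chain (list) per bucket

    def hashFunction(self, x):
        # Division method: map key x to a bucket index in [0, BUCKET)
        return x % self.BUCKET

    def insert(self, x):
        # Append x to the chain of its bucket (duplicates are not checked)
        i = self.hashFunction(x)
        self.table[i].append(x)
        print(self.table)

    def remove(self, x):
        # Delete x from its chain if present, otherwise report that it is missing
        i = self.hashFunction(x)
        if x in self.table[i]:
            self.table[i].remove(x)
            print(self.table)
        else:
            print("Item is not present in Hash Table")

    def search(self, x):
        # Only the chain of x's bucket has to be scanned
        i = self.hashFunction(x)
        return x in self.table[i]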

In [2]:
h = MyHash(7)

In [3]:
h.insert(70)

[[70], [], [], [], [], [], []]


In [4]:
h.insert(71)

[[70], [71], [], [], [], [], []]


In [5]:
h.insert(9)

[[70], [71], [9], [], [], [], []]


In [6]:
h.insert(56)

[[70, 56], [71], [9], [], [], [], []]


In [7]:
h.insert(72)

[[70, 56], [71], [9, 72], [], [], [], []]


In [8]:
h.remove(23)

Item is not present in Hash Table


In [9]:
h.remove(56)

[[70], [71], [9, 72], [], [], [], []]


In [10]:
h.search(72)

True

In [11]:
h.remove(56)

Item is not present in Hash Table


In [12]:
h.remove(70)

[[], [71], [9, 72], [], [], [], []]


## Open Addressing

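* The number of slots in the hash table must be >= the number of keys to be inserted, because every key is stored in the table array itself.
* Cache friendly: all keys live in one contiguous array.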

### Linear Probing

* On a collision, search linearly for the next empty slot. For a table of size 7, probe number `i` (starting from `i = 0`) checks slot `hash(key,i)`:
```python
hash(key) = key % 7
hash(key,i) = ((hash(key) + i) % 7)
```
* Insert operation:
```text
if no collision occurs at hash(key):
    store the key at hash(key)
else:
    probe hash(key,1), hash(key,2), ... and store the key in the first empty slot
```
* The search function stops as soon as one of three cases occurs:
    1. An empty slot is reached (the key is absent).
    2. The key is found.
    3. The entire hash table has been traversed.
* Delete operation:
    * Simply emptying the slot breaks later searches: a search stops at the first empty slot and never reaches keys that were placed beyond it.
    * To avoid this problem, mark the slot as `deleted` instead of emptying it. Search skips over `deleted` slots, and insert may reuse them.
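
The insert, search and delete rules for linear probing can be sketched as follows. This is a minimal illustration, not a definitive implementation: the class name `LinearProbingTable`, the sentinels `EMPTY` and `DELETED`, and the choice of integer keys with `key % m` as the hash are all assumptions made for this example.

```python
EMPTY = object()    # marker for a slot that has never held a key
DELETED = object()  # tombstone left behind by remove()


class LinearProbingTable:
    def __init__(self, m=7):
        self.m = m
        self.slots = [EMPTY] * m

    def _probe(self, key):
        """Yield the slots (key % m + i) % m for i = 0 .. m-1."""
        home = key % self.m
        for i in range(self.m):
            yield (home + i) % self.m

    def search(self, key):
        for idx in self._probe(key):
            if self.slots[idx] is EMPTY:  # case 1: empty slot, key is absent
                return False
            if self.slots[idx] == key:    # case 2: key found
                return True
        return False                      # case 3: whole table traversed

    def insert(self, key):
        if self.search(key):
            return True                   # already present, nothing to do
        for idx in self._probe(key):
            if self.slots[idx] is EMPTY or self.slots[idx] is DELETED:
                self.slots[idx] = key     # tombstones are reused
                return True
        return False                      # table is full

    def remove(self, key):
        for idx in self._probe(key):
            if self.slots[idx] is EMPTY:
                return False
            if self.slots[idx] == key:
                self.slots[idx] = DELETED  # mark, do not empty
                return True
        return False


t = LinearProbingTable(7)
for k in (70, 56, 63):   # all three keys hash to slot 0
    t.insert(k)
t.remove(56)             # slot 1 becomes a tombstone
print(t.search(63))      # True: the search skips the tombstone in slot 1
```

If `remove` set the slot back to `EMPTY` instead, the search for 63 would stop at slot 1 and wrongly report the key as missing.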
        
**Problem with Linear Probing**

1. Clustering: occupied slots form long contiguous runs, so later inserts and searches have to probe further.

To handle the **clustering problem of linear probing**, we use quadratic probing.


### Quadratic Probing

*The function of quadratic probing will be* :  
```python
hash(key,i) = ((hash(key) + i**2) % m)
```

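*Condition for quadratic probing to give best results*

* If the load factor is below 0.5 and `m` is prime, quadratic probing is guaranteed to find an empty slot for a new key.
* Otherwise the probe sequence may keep revisiting the same few slots and miss free ones.

**Problems with Quadratic Probing**:
1. Secondary Clustering: keys with the same initial hash value follow the same probe sequence.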
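
To see why the table size matters for quadratic probing, the sketch below lists the distinct slots that `(home + i**2) % m` can reach. The helper name `quadratic_probe_slots` is an assumption made for this illustration.

```python
def quadratic_probe_slots(home, m):
    """Return the distinct slots (home + i*i) % m reached for i = 0 .. m-1."""
    return sorted({(home + i * i) % m for i in range(m)})


print(quadratic_probe_slots(0, 7))  # [0, 1, 2, 4]
print(quadratic_probe_slots(0, 8))  # [0, 1, 4]
```

With the prime size 7, the probes reach 4 of the 7 slots, which is more than half of the table, so a free slot is always found while the load factor stays below 0.5. With size 8 only 3 slots are reachable, so an insert can fail even though most of the table is free.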

To handle the **secondary clustering problem of quadratic probing**, we use double hashing.

### Double Hashing

1. Double hashing is a generalized version of linear probing: the fixed step of 1 is replaced by a key-dependent step `hash2(key)`.

2. *The function of double hashing will be* :  
```python
hash1(key) = key % m
hash2(key) = (m-1) - (key % (m-1))
hash(key,i) = ((hash1(key) + i*hash2(key)) % m)
```
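3. If `hash2(key)` is relatively prime to `m`, the probe sequence visits every slot, so it always finds a free slot if there is one.
4. It distributes keys more uniformly than linear probing and quadratic probing.
5. It largely avoids **clustering**, because keys that share an initial slot usually have different step sizes.

6. Algorithm for double hashing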

```python
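def DoubleHashing(table, key):
    # Assumes integer keys, empty slots marked with None, and a prime table size m
    m = len(table)
    if None not in table:
        raise OverflowError("hash table is full")
    probe = key % m                      # hash1(key)
    offset = (m - 1) - (key % (m - 1))   # hash2(key); in linear probing offset = 1
    while table[probe] is not None:      # slot occupied, jump ahead by offset
        probe = (probe + offset) % m
    table[probe] = key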

# Problems

#### Frequencies of Array Elements

In [22]:
def FreqArrElementUsingHash(arr):
    freq = {}  # maps each element to its number of occurrences
    for x in arr:
        freq[x] = freq.get(x, 0) + 1
    for x in freq:
        print(x, freq[x])

In [23]:
arr1 = [10,12,10,15,10,20,12,12]

In [24]:
FreqArrElementUsingHash(arr1)

10 3
12 3
15 1
20 1


In [25]:
arr2 = [10,10,10,10]

In [26]:
FreqArrElementUsingHash(arr2)

10 4


In [27]:
arr3 = [10,20]

In [28]:
FreqArrElementUsingHash(arr3)

10 1
20 1
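
The same frequency count can also be sketched with `collections.Counter` from the standard library, which is a dictionary subclass built for counting. The function name `freq_with_counter` is an assumption made for this example.

```python
from collections import Counter


def freq_with_counter(arr):
    """Print each distinct element with its frequency and return the counts."""
    counts = Counter(arr)  # one pass over arr, O(n) on average
    for value, count in counts.items():
        print(value, count)
    return counts


result = freq_with_counter([10, 12, 10, 15, 10, 20, 12, 12])
```

Elements are printed in order of first appearance, as in the dictionary version above.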


In [35]:
def convertinotlink(s):
    print(s)
    print(f"[{s}](#{s.replace(' ','-')})")

In [36]:
convertinotlink('Frequencies of Array Elements')

Frequencies of Array Elements
[Frequencies of Array Elements](#Frequencies-of-Array-Elements)
