# Hashing

Hashing maps keys (and their associated values) to slots of a hash table by means of a hash function, so that elements can be accessed quickly. How evenly the keys spread over the table depends on the quality of the hash function used.

* Search
* Insert
* Delete
> Search, insert and delete all run in O(1) time on average in hashing
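
As a quick illustration of the three operations, here is a minimal sketch using Python's built-in `dict`, which is a hash table. The variable names and sample data are made up for the example.

```python
# A dict stores key -> value pairs in a hash table.
ages = {}
ages["asha"] = 31          # insert
ages["ravi"] = 27          # insert
found = "asha" in ages     # search, O(1) on average
del ages["ravi"]           # delete, O(1) on average

print(found)               # True
print(ages)                # {'asha': 31}
```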

*Not useful for*
1. Finding the closest value (use an AVL tree instead)
2. Keeping data sorted (use an AVL tree instead)
3. Prefix searching (use a trie instead)
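
To see why order matters for the first item, here is a rough sketch of a closest-value lookup on sorted data. It uses a sorted list with the standard `bisect` module as a stand-in for a balanced tree; the function name `closest_value` is hypothetical. A hash table keeps no order, so it would have to scan every key to answer the same question.

```python
import bisect

def closest_value(sorted_values, target):
    """Return the element of a non-empty sorted list nearest to target."""
    pos = bisect.bisect_left(sorted_values, target)
    # Only the neighbours on either side of the insertion point can be closest.
    candidates = sorted_values[max(pos - 1, 0):pos + 1]
    return min(candidates, key=lambda v: abs(v - target))

print(closest_value([10, 12, 15, 20], 13))  # 12
```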

*Collision Handling*

1. Chaining
2. Open Addressing
    1. Linear Probing
    2. Quadratic Probing
    3. Double Hashing

**Problems**

#### To make the links below work, open the notebook in nbviewer: [Click Me](https://nbviewer.org/github/ChandrashekharRobbi/GFG-DSA/blob/main/Hashing.ipynb?flush_cache=true)

1. [Frequencies of Array Elements](#Frequencies-of-Array-Elements)
2. [Implementation of Chaining](#Implementation-of-Chaining)
3. [Hashing for pair - 1](#Hashing-for-pair---1)
4. [Hashing for pair - 2](#Hashing-for-pair---2)
5. [Linear Probing in Hashing](#Linear-Probing-in-Hashing)
6. [Implementation of Linear Probing](#Implementation-of-Linear-Probing)
7. [Quadratic Probing in Hashing](#Quadratic-Probing-in-Hashing) 
8. [Separate chaining in Hashing](#Separate-chaining-in-Hashing)
9. [Count Non-Repeated Elements](#Count-Non-Repeated-Elements)
10. [Print Non-Repeated Elements](#Print-Non-Repeated-Elements)
11. [Non Repeating Character](#Non-Repeating-Character)
12. [Winner of an election](#Winner-of-an-election)
13. [First Repeating Element](#First-Repeating-Element)
14. [Intersection of Two Arrays](#Intersection-of-Two-Arrays)
15. [Count of distinct elements](#count-of-dictinct-elements-)
16. [Number Containing 1,2,3](#Number-Containing-1,2,3)
17. [Subarray with 0 sum](#Subarray-with-0-sum)

In [1]:
%run imp_personal.py
z = MyFunction()

Note that after executing new function your next cell will automatically converted to markdown cell :)


# Collision Handling

## Chaining


* Each slot of the array holds a chain (a list) of all the keys that hash to that index

## Implementation of Chaining

In [2]:
class MyHash:
    def __init__(self,b):
        self.BUCKET = b
        self.table = [[] for x in range(b)]
        
    def hashFunction(self,x):
        i = x % self.BUCKET
        return i
    
    def insert(self, x):
        i = self.hashFunction(x)
        self.table[i].append(x)
        print(self.table)
        
    def remove(self,x):
        i = self.hashFunction(x)
        
        if x in self.table[i]:
            self.table[i].remove(x)
            print(self.table)
        else:
            print("Item is not present in Hash Table")
            
    def search(self,x):
        i = self.hashFunction(x)
        return x in self.table[i]

In [3]:
h = MyHash(7)

In [4]:
h.insert(70)

[[70], [], [], [], [], [], []]


In [5]:
h.insert(71)

[[70], [71], [], [], [], [], []]


In [6]:
h.insert(9)

[[70], [71], [9], [], [], [], []]


In [7]:
h.insert(56)

[[70, 56], [71], [9], [], [], [], []]


In [8]:
h.insert(72)

[[70, 56], [71], [9, 72], [], [], [], []]


In [9]:
h.remove(23)

Item is not present in Hash Table


In [10]:
h.remove(56)

[[70], [71], [9, 72], [], [], [], []]


In [11]:
h.search(72)

True

In [12]:
h.remove(56)

Item is not present in Hash Table


In [13]:
h.remove(70)

[[], [71], [9, 72], [], [], [], []]


## Open Addressing

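* The number of slots in the hash table must be >= the number of values to be inserted, because every key is stored in the table itself
* Cache friendly, since all keys live in one contiguous array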

### Linear Probing

* Linearly search for next empty slots when there is a collision.
```python
hash(key) = key % 7
hash(key,i) = ((hash(key) + i) % 7)
```
* Insert operation:
```python
if no collision occurs:
    store the item in its home slot
else:
    step forward one index at a time and store the item in the next empty slot
```
* In the search function we stop searching when one of three cases occurs:
    1. An empty slot is reached
    2. The key is found
    3. The entire hash table has been traversed
* In the delete operation:
    * Simply making the slot empty breaks later searches: a search stops at the first empty slot, so it never reaches keys stored beyond it.
    * To avoid this problem, mark the slot as `deleted` instead of emptying it.
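
The deletion problem can be reproduced with a small sketch. The markers `EMPTY` and `DELETED` and the helper `probe_search` are hypothetical names chosen for this example, and the table is filled by hand instead of through an insert function.

```python
EMPTY, DELETED = None, "<deleted>"   # hypothetical slot markers

def probe_search(table, key):
    """Linear-probing search that stops at the first empty slot."""
    m = len(table)
    home = key % m
    for step in range(m):
        slot = table[(home + step) % m]
        if slot is EMPTY:
            return False             # an empty slot ends the probe sequence
        if slot == key:
            return True
    return False                     # traversed the entire table

# 49, 56 and 63 all hash to slot 0 (mod 7), so they occupy slots 0, 1, 2.
table = [49, 56, 63, EMPTY, EMPTY, EMPTY, EMPTY]

emptied = list(table)
emptied[1] = EMPTY                   # delete 56 by emptying its slot
marked = list(table)
marked[1] = DELETED                  # delete 56 by marking its slot

print(probe_search(emptied, 63))     # False: the search stops at slot 1
print(probe_search(marked, 63))      # True: the search continues past slot 1
```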
        
**Problem with Linear Probing**

1. Clustering

To handle the **clustering problem with linear probing** we use:

1. Quadratic Probing


## Implementation of Linear Probing

In [14]:
class MyHashLinearProbig:
    def __init__(self, c):
        self.cap = c
        self.table = [-1]*c
        self.size = 0
        
    def hash(self, x):
        return x % self.cap
    
    def insert(self,key):
        if self.size == self.cap:
            return False           # when table is full
        if self.search(key) == True:
            return False    # when key is already present
        t = self.table
        i = self.hash(key)
        while t[i] not in (-1,-2):
            i = (i + 1) % self.cap
        t[i] = key
        self.size += 1
        return True
                
    def search(self, key):
        h = self.hash(key)
        t = self.table
        i = h
        while t[i] != -1:
            if t[i] == key:
                return True     # key found
            i = (i + 1) % self.cap
            if i == h:
                return False    # Traversed entire table
        return False  # Empty Slot
    
    def remove(self,key):
        h = self.hash(key)
        t = self.table
        i = h
        while t[i] != -1:
            if t[i] == key:
                t[i] = -2        # we mark as deleted by inserting -2
                self.size -= 1   # keep the element count in sync
                return True
            i = (i + 1) % self.cap
            if i == h:
                return False     # Traversed entire table
        return False             # Key not found

> Note: to store the numbers `-1` and `-2` themselves as keys, replace the two markers with `None` and a `dummy reference` respectively

In [15]:
a = MyHashLinearProbig(7)

In [16]:
a.insert(49)

True

In [17]:
a.insert(50)

True

In [18]:
a.insert(51)

True

In [19]:
a.search(50)

True

In [20]:
a.search(69)

False

In [21]:
a.remove(69)

False

In [22]:
a.remove(50)

True

In [23]:
a.insert(123)

True

### Quadratic Probing

*The function of quadratic probing will be* :  
```python
hash(key,i) = ((hash(key) + i**2) % m)
```

*Condition for quadratic probing to give best results*

```python
if loadFactor < 0.5 and m is prime:
    quadratic probing is guaranteed to find an empty slot
else:
    the probe sequence may miss empty slots
```
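
The effect of the table size can be checked by listing which slots a quadratic probe sequence actually reaches. This is a small sketch; the helper name `reachable_slots` is made up for the example.

```python
def reachable_slots(home, m):
    """Distinct slots visited by the probe sequence (home + i**2) % m."""
    return sorted({(home + i * i) % m for i in range(m)})

print(reachable_slots(0, 7))  # [0, 1, 2, 4]
print(reachable_slots(0, 8))  # [0, 1, 4]
```

With the prime size 7 the sequence reaches 4 of the 7 slots, which is just over half the table, so an insert succeeds whenever less than half the table is full. With size 8 it reaches only 3 slots.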
**Problems with Quadratic Probing**:
1. Secondary Clustering

To handle the **secondary clustering problem with quadratic probing** we use:

1. Double Hashing

In [24]:

class MyHashQuadraticProbing:
    def __init__(self, c):
        self.cap = c
        self.table = [-1]*c
        self.size = 0

    def hash(self, x):
        return x % self.cap

    def insert(self,key):
        if self.size == self.cap:
            return False           # when table is full
        if self.search(key) == True:
            return False    # when key is already present
        t = self.table
        h = self.hash(key)
        for power in range(self.cap):      # probe h, h+1, h+4, h+9, ...
            i = (h + power**2) % self.cap
            if t[i] in (-1,-2):
                t[i] = key
                self.size += 1
                return True
        return False    # no free slot on this probe sequence

    def search(self, key):
        h = self.hash(key)
        t = self.table
        for power in range(self.cap):
            i = (h + power**2) % self.cap
            if t[i] == -1:
                return False  # Empty Slot
            if t[i] == key:
                return True     # key found
        return False    # probe sequence exhausted

    def remove(self,key):
        h = self.hash(key)
        t = self.table
        for power in range(self.cap):
            i = (h + power**2) % self.cap
            if t[i] == -1:
                return False     # Empty slot, key not found
            if t[i] == key:
                t[i] = -2        # we mark as deleted by inserting -2
                self.size -= 1
                return True
        return False             # Key not found

### Double Hashing

1. Double hashing generalizes linear probing: the step between probes is `hash2(key)` instead of always 1

2. *The function of double hashing will be* :  
```python
hash1(key) = key % m
hash2(key) = (m-1) - (key % (m-1))
hash(key,i) = ((hash1(key) + i*hash2(key)) % m)
```
3. If `hash2(key)` is relatively prime to `m`, the probe sequence always finds a free slot if there is one.
4. It distributes keys more uniformly than linear probing and quadratic probing.
5. It largely avoids the clustering seen with linear and quadratic probing.

6. Algorithm for double hashing

```python
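def DoubleHashing(table, key):
    # Assumes empty slots hold None and m = len(table) is prime
    m = len(table)
    if None not in table:
        raise OverflowError("hash table is full")
    probe = key % m                      # hash1(key)
    offset = (m - 1) - (key % (m - 1))   # hash2(key); in linear probing offset = 1
    while table[probe] is not None:      # slot is occupied
        probe = (probe + offset) % m
    table[probe] = key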
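
To make the probe formula concrete, the sketch below lists the full probe sequence for one key. The helper name `double_hash_probes` is hypothetical, and the values `key = 10`, `m = 7` are chosen only for illustration.

```python
def double_hash_probes(key, m):
    """Probe sequence (hash1(key) + i*hash2(key)) % m for i = 0 .. m-1."""
    h1 = key % m
    h2 = (m - 1) - (key % (m - 1))
    return [(h1 + i * h2) % m for i in range(m)]

print(double_hash_probes(10, 7))  # [3, 5, 0, 2, 4, 6, 1]
```

Here `hash2(10) = 2` is relatively prime to 7, so the sequence visits every slot exactly once.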

# Problems

## Frequencies of Array Elements

In [25]:
def FreqArrElementUsingHash(arr):
    hash = {}
    for i in arr:
        if i not in hash.keys():
            hash[i] = 1
        else:
            hash[i] += 1
    for i in hash.keys():
        print(i,hash[i])

In [26]:
arr1 = [10,12,10,15,10,20,12,12]

In [27]:
FreqArrElementUsingHash(arr1)

10 3
12 3
15 1
20 1


In [28]:
arr2 = [10,10,10,10]

In [29]:
FreqArrElementUsingHash(arr2)

10 4


In [30]:
arr3 = [10,20]

In [31]:
FreqArrElementUsingHash(arr3)

10 1
20 1
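
The same counting can also be sketched with `collections.Counter` from the standard library, which is a dictionary subclass built for tallies. The function name `freq_with_counter` is made up for this example.

```python
from collections import Counter

def freq_with_counter(arr):
    """Print each distinct element with its frequency, in first-seen order."""
    counts = Counter(arr)            # one pass over arr builds the tally
    for value, count in counts.items():
        print(value, count)
    return counts

freq_with_counter([10, 12, 10, 15, 10, 20, 12, 12])
```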


## Hashing for pair - 1

You are given an array of distinct integers and a sum. Check if there's a pair with the given sum in the array.

In [32]:
def sumExists(arr, N, sum):
    #Your code here
    values = set(arr)          # hash set gives O(1) average membership tests
    for i in arr:
        d = sum - i
        if (d in values) and (d != i):
            return 1
        
    return 0

## Hashing for pair - 2

You are given an array of integers and an integer sum. Determine whether two numbers in the array add up to the given sum.

In [33]:
def sumExists(arr, N, sum):
    #Your code here
    seen = set()               # values met so far
    for x in arr:
        if (sum - x) in seen:  # complement appeared at an earlier index
            return 1
        seen.add(x)
    return 0

## Linear Probing in Hashing

Given an array of integers and a hash table size. Fill the array elements into a hash table using Linear Probing to handle collisions.

In [34]:
def linearProbing(self,hashSize, arr, sizeOfArray):
        #Your code here
        class MyHashLinearProing:
            def __init__(self,c):
                self.cap = c
                self.table = [-1]*c
                self.size = 0
            def hash(self,key):
                return key % self.cap
            def search(self,key):
                h = self.hash(key)
                t = self.table
                i = h
                while t[i] != -1:
                    if t[i] == key:
                        return True
                    i = (i + 1) % self.cap
                    if i == h:
                        return False
                return False
            def insert(self, key):
                if self.size == self.cap:
                    return False
                if self.search(key) == True:
                    return False
                t = self.table
                i = self.hash(key)
                while t[i] not in (-1,-2):
                    i = (i + 1) % self.cap
                t[i] = key
                self.size += 1
                return True
        m = MyHashLinearProing(hashSize)
        for i in arr:
            m.insert(i)
        return m.table

## Quadratic Probing in Hashing

Given an array of integers and a Hash table. Fill the elements of the array into the hash table by using Quadratic Probing in case of collisions.

In [35]:
def QuadraticProbing(self,hash, hashSize, arr, N):
        #Your code here
        class MyHashQuadraticProbing:
            def __init__(self, c,hash):
                self.cap = c
                self.table = hash
                self.size = 0
                
            def hash(self, x):
                return x % self.cap
            
            def insert(self,key):
                if self.size == self.cap:
                    return False           # when table is full
                if self.search(key) == True:
                    return False    # when key is already present
                t = self.table
                i = self.hash(key)
                power = 1
                while t[i] not in (-1,-2):
                    i = (key + power**2) % self.cap
                    power += 1
                t[i] = key
                self.size += 1
                return True
                        
            def search(self, key):
                t = self.table
                i = self.hash(key)
                power = 1
                while t[i] != -1:
                    if t[i] == key:
                        return True     # key found
                    if power > self.cap:
                        return False    # probe sequence exhausted
                    i = (key + power**2) % self.cap
                    power += 1
                return False  # Empty Slot

            def final(self):
                return self.table
        b = MyHashQuadraticProbing(hashSize, hash)
        for i in arr:
            b.insert(i)
        return b.final()

In [36]:
def QuadraticProbing(self,hash, hashSize, arr, N):
        #Your code here
        for value in arr:
            i = 0 
            p = lambda x: (value + x**2) % hashSize
            while hash[p(i)] not in [-1, value]: 
                i += 1
            hash[p(i)] = value 
            
        return hash

## Separate chaining in Hashing

In [37]:
def separateChaining(self, hashSize, arr, sizeOfArray):
    #Your code here
    #return hashtable
    hash = [[] for x in range(hashSize)]
    for value in arr:
        i = value % hashSize
        hash[i].append(value)
    return hash

## Count Non-Repeated Elements

You are given an array of integers. You need to print the count of non-repeated elements in the array.

In [38]:
def countNonRepeated(self,arr,n):
    #Your code here
    count = 0
    hash = {}
    for i in arr:
        if i not in hash:
            hash[i] = 1
        else:
            hash[i] += 1
    for i in hash.keys():
        if hash[i] == 1:
            count += 1
    return count

## Print Non-Repeated Elements

You are given an array of integers. You need to print the non-repeated elements as they appear in the array.

In [39]:
def printNonRepeated(self,arr,n):
    #Your code here
    hash = {}
    lst = []
    for i in arr:
        if i not in hash:
            hash[i] = 1
        else:
            hash[i] += 1

    for x in arr:
        if hash[x] == 1:
            lst.append(x)
    return lst

## Non Repeating Character

Given a string S consisting of lowercase Latin Letters. Return the first non-repeating character in S. If there is no non-repeating character, return '$'.

In [40]:
def nonrepeatingCharacter(self,s):
    #code here
    hash = {}
    for i in s:
        if i not in hash:
            hash[i] = 1
        else:
            hash[i] += 1
    for i in s:
        if hash[i] == 1:
            return i

    return '$'

## Winner of an election

Given an array of names (consisting of lowercase characters) of candidates in an election. Each candidate name in the array represents a vote cast for that candidate. Print the name of the candidate who received the most votes. If there is a tie, print the lexicographically smaller name.

In [41]:
def winner(self,arr,n):
    # Your code here
    # return the name of the winning candidate and the votes they received
    hash = {}
    for i in arr:
        if i not in hash:
            hash[i] = 1
        else:
            hash[i] += 1
    max_votes = max(hash.values())
    max_arr = []
    for x in hash.keys():
        if hash[x] == max_votes:
            max_arr.append(x)
    max_arr.sort()
    return max_arr[0], max_votes

## First Repeating Element

Given an array arr[] of size n, find the first repeating element. The element should occur more than once and the index of its first occurrence should be the smallest.

In [42]:
def firstRepeated(self,arr, n):

    #arr : given array
    #n : size of the array
    hash = {}
    repeat = -1
    for i in arr:
        if i not in hash:
            hash[i] = 1
        else:
            hash[i] += 1

    for i in range(n):
        if hash[arr[i]] > 1:
            repeat = i + 1
            break
    return repeat

In [54]:
def firstRepeated(self,arr,n):
    repeat = -1
    for i in range(n):
        if arr.count(arr[i]) > 1:
            repeat = i + 1
            break
    return repeat

## Intersection of Two Arrays

In [56]:
class Solution:
    def NumberofElementsInIntersection(self,a, b, n, m):
        #return: expected length of the intersection array.
        values = set()
        for i in a:
            values.add(i)
        count = 0
        for i in b:
            if i in values:
                values.remove(i)
                count += 1
        return count

## count of dictinct elements 

In [58]:
class Solution:    
    #Function to return the count of number of elements in union of two arrays.
    def doUnion(self,a,n,b,m):
        values = set()
        for i in a:
            values.add(i)
        for j in b:
            values.add(j)
        return len(values)


## Number Containing 1,2,3

In [60]:
#Function to find all the numbers with only 1,2 and 3 in their digits.
def findAll():
    # Assumption: mp is a module-level dict (e.g. mp = {}) created before findAll() is called
    set_num = {'1','2','3'}
    for i in range(1,1000001):
        s = set(str(i))
        if s.issubset(set_num):
            mp[i] = 1

## Subarray with 0 sum

In [62]:
# Naive solution: O(n^3) time, since each of the O(n^2) subarrays is summed in O(n)
def subArrayExists(self,arr,n):
    for i in range(n):
        for j in range(i+1, n+1):
            if sum(arr[i:j]) == 0:
                return True
    return False

In [None]:
class Solution:
    
    #Function to check whether there is a subarray present with 0-sum or not.
    def subArrayExists(self,arr,n):
        h1 = set()
        prefix_sum = 0
        for i in range(n):
            prefix_sum += arr[i]
            if prefix_sum == 0 or prefix_sum in h1:
                return True
            h1.add(prefix_sum)
        return False
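
The prefix-sum idea can be extended to report where the zero-sum subarray lies: if the same prefix sum occurs at two positions, the elements between them add up to 0. The sketch below assumes a hypothetical function name `zero_sum_subarray` and stores the first index of each prefix sum in a dictionary instead of a set.

```python
def zero_sum_subarray(arr):
    """Return (start, end) indices of a zero-sum subarray, or None."""
    first_seen = {0: -1}              # prefix sum 0 occurs before index 0
    prefix_sum = 0
    for i, value in enumerate(arr):
        prefix_sum += value
        if prefix_sum in first_seen:
            # arr[first_seen[prefix_sum] + 1 .. i] sums to 0
            return first_seen[prefix_sum] + 1, i
        first_seen[prefix_sum] = i
    return None

print(zero_sum_subarray([4, 2, -3, 1, 6]))  # (1, 3)
```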