## Hashing
Consider the following array: [1,2,1,3,4]  
To count the number of times an element appears in an array, we loop through the array and compare each entry with that element.  

In [1]:
def counter(arr, number):
    # Count how many entries of arr are equal to number
    c = 0
    for value in arr:
        if value == number:
            c += 1
    return c

print(counter([1,2,1,3,4],1))

2


The above code has a time complexity of O(n), so counting the occurrences of k different elements takes O(k x n). For large arrays and many queries this becomes expensive. To solve this problem we have ***hashing***.  
Hashing refers to pre-storing something and fetching it later.  
***Prestoring***: suppose the numbers stored in an array have values between 1 and 12. We create another array called the **hash array**, filled with zeros, with a length of 12+1 = 13, so that indices 0 to 12 exist and every value can be used directly as an index. In the tables below, the top row holds the stored counts and the bottom row holds the indices.  
<table>
<tr>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
</tr>
<tr>
<td>0</td>
<td>1</td>
<td>2</td>
<td>3</td>
<td>4</td>
<td>5</td>
<td>6</td>
<td>7</td>
<td>8</td>
<td>9</td>
<td>10</td>
<td>11</td>
<td>12</td>
</tr>
</table>

We go through each element of the array and add 1 to the hash array at the index equal to that element's value. For [1,2,1,3,4] the hash array becomes:
<table>
<tr>
<td>0</td>
<td>2</td>
<td>1</td>
<td>1</td>
<td>1</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
</tr>
<tr>
<td>0</td>
<td>1</td>
<td>2</td>
<td>3</td>
<td>4</td>
<td>5</td>
<td>6</td>
<td>7</td>
<td>8</td>
<td>9</td>
<td>10</td>
<td>11</td>
<td>12</td>
</tr>
</table>

***Fetching***: To find how many times an element has appeared in the array, we look at the index of the hash array that equals the element; the value at that position is our answer.  
Ex: 1 has appeared 2 times.

In [2]:
max_number = 12
hasharr = [0]*(max_number+1)
arr = [1,2,1,3,4]
# Pre-storing
for i in arr:
    hasharr[i] +=1
# Fetching
number = 1
count_number = hasharr[number]
print(count_number)

2


Building the hash array takes O(n) time (ignoring the cost of allocating the hash array itself), and each fetch takes O(1). Finding the counts of k elements therefore takes O(n+k); if k is less than or equal to n, the total time complexity is O(n).
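
As a minimal sketch of the O(n+k) case, the helper below builds the hash array once and then answers each query with a single lookup. The name `count_queries` and its parameters are hypothetical (they are not part of the notebook), and the sketch assumes every value in `arr` lies between 0 and `max_number`.

```python
def count_queries(arr, queries, max_number):
    """Count how often each query value occurs in arr, using a hash array."""
    # Pre-storing: one pass over arr, O(n)
    hasharr = [0] * (max_number + 1)
    for value in arr:
        hasharr[value] += 1
    # Fetching: one O(1) lookup per query, O(k) in total.
    # A query outside 0..max_number cannot occur in arr, so its count is 0.
    return [hasharr[q] if 0 <= q <= max_number else 0 for q in queries]


print(count_queries([1, 2, 1, 3, 4], [1, 3, 7], 12))  # [2, 1, 0]
```

The range check on each query is a design choice for this sketch: without it, a negative query would silently index from the end of the list.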

### Character hashing
Consider the following string: "abcdabejc". We have to count the number of times each character appears. Using the standard looping approach for k characters, the time complexity is O(k x n).  
So far integers have served as the indexes, so we need a way to turn characters into indexes. Assume that all characters in the string are lowercase letters. There are 26 lowercase letters, so the hash array has size 26, indexed from 0 to 25. To use characters as indexes, we map 'a' - 'z' to 0 - 25.  
Ex: a mapped to 0, b mapped to 1,........z mapped to 25.  

For the mapping we use the ASCII values of the characters. The ASCII value of 'a' is 97.  
The index of a character in the hash array is  
index = ord(character) - ord('a')  
Ex:  
index for 'b' : ord('b') - ord('a') = 98 - 97 = 1  
index for 'f' : ord('f') - ord('a') = 102 - 97 = 5  
...
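
The mapping can be checked directly with a small sketch. The helper name `char_index` is hypothetical, and it assumes its argument is a single lowercase letter from 'a' to 'z'.

```python
def char_index(character):
    """Map a lowercase letter 'a'..'z' to an index 0..25."""
    return ord(character) - ord('a')


# Print the index for a few sample letters
for ch in "abfz":
    print(ch, "->", char_index(ch))
```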

In [3]:
hasharr = [0]*26
s = "abcdabejc"
# pre storing
for i in s:
    hasharr[ord(i)-ord('a')] +=1
a = 'a'
# Fetching
print(hasharr[ord(a)-ord('a')])

2


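Standard ASCII defines 128 characters (codes 0 to 127), covering uppercase and lowercase letters, digits and special characters; 8-bit extended character sets go up to 256 codes. A hash array of size 256 therefore covers every such character, using ord(character) directly as the index.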

In [4]:
hasharr = [0]*256
s = "Asjjoak%##"
# Pre storing
for i in s:
    hasharr[ord(i)] += 1
# Fetching
print(hasharr[ord('j')])

2
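
One caveat: Python strings can hold any Unicode character, and `ord` can return values of 256 or more, which would raise an `IndexError` with a 256-slot hash array. The sketch below uses a hypothetical helper, `count_chars_256`, that simply skips such characters. That policy is an assumption made for illustration, not something the notebook prescribes.

```python
def count_chars_256(s):
    """Count characters with code points below 256; report how many were skipped."""
    hasharr = [0] * 256
    skipped = 0
    for ch in s:
        code = ord(ch)
        if code < 256:
            hasharr[code] += 1
        else:
            # Code point does not fit in the 256-slot hash array
            skipped += 1
    return hasharr, skipped


# "\u20ac" is the euro sign, whose code point (8364) is outside the array
hasharr, skipped = count_chars_256("Asjjoak%##\u20ac")
print(hasharr[ord('#')], skipped)  # 2 1
```

For text that is not limited to 8-bit characters, a dictionary is usually the simpler choice.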


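In Python, dictionaries are built-in hash tables, so they can be used in place of a hash array. The keys are the elements and the values are their counts.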

In [5]:
hash_dict = {}
arr = [1,2,1,3,4]
for i in arr:
    if i not in hash_dict:
        hash_dict[i] = 1
    else:
        hash_dict[i] += 1
print(hash_dict)

{1: 2, 2: 1, 3: 1, 4: 1}


In [6]:
hash_dict = {}
s = "abcdedjkc"
for i in s:
    if i not in hash_dict:
        hash_dict[i] = 1
    else:
        hash_dict[i] += 1
print(hash_dict)

{'a': 1, 'b': 1, 'c': 2, 'd': 2, 'e': 1, 'j': 1, 'k': 1}
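
The same counting pattern is also available in the standard library as `collections.Counter`, a dictionary subclass built for tallying. The sketch below is an alternative to the hand-written loops, not the notebook's own method; the variable names `arr_counts` and `char_counts` are chosen here for illustration.

```python
from collections import Counter

arr = [1, 2, 1, 3, 4]
s = "abcdedjkc"

# Pre-storing: Counter tallies every element in one pass
arr_counts = Counter(arr)
char_counts = Counter(s)

# Fetching: lookups work like a dictionary
print(arr_counts[1])     # 2
print(char_counts['d'])  # 2
print(char_counts['z'])  # 0, missing keys count as zero instead of raising KeyError
```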


### Division method in hashing
Consider the array [2,5,16,28,139]. With the approach above you end up creating a hash array of size 140, because the highest value in the array is 139. But let's say you are not allowed to create a hash array of size more than 10. This is where the division method is used.  
The division method says that the index for an element in the hash array is element % 10 when the maximum size allowed is 10.  
Ex:  
index for 2 -> 2%10 = 2  
index for 5 -> 5%10 = 5  
index for 16 -> 16%10 = 6

In [7]:
arr = [2,5,16,28,139]
hasharr = [0]*10
for i in arr:
    hasharr[i%10] += 1
print(hasharr)

[0, 0, 1, 0, 0, 1, 1, 0, 1, 1]


### Collision
Now let's consider the following array: [2,5,16,28,139,38,48,28,18]  
Here i % 10 gives the same result (8) for the elements 28, 38, 48 and 18, so they all map to the same index. This is called a collision.  
To solve this we use linear chaining (also known as separate chaining): a ***Linked-List*** is stored at index 8, and each colliding element is stored in that linked list.
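
A minimal sketch of chaining for the array above follows. The names `Node`, `build_chained_hash` and `fetch_count` are hypothetical, and storing a count inside each node is an assumption made here so that repeated values such as 28 can be counted.

```python
class Node:
    """One entry in a bucket's linked list: a value, its count, and the next node."""

    def __init__(self, value, next_node=None):
        self.value = value
        self.count = 1
        self.next = next_node


def build_chained_hash(arr, size=10):
    """Pre-storing: place each value in the chain at index value % size."""
    buckets = [None] * size
    for value in arr:
        index = value % size
        node = buckets[index]
        # Walk the chain looking for a node that already holds this value
        while node is not None and node.value != value:
            node = node.next
        if node is not None:
            node.count += 1
        else:
            # Not found: insert a new node at the head of the chain
            buckets[index] = Node(value, buckets[index])
    return buckets


def fetch_count(buckets, value):
    """Fetching: search only the chain stored at value % size."""
    node = buckets[value % len(buckets)]
    while node is not None:
        if node.value == value:
            return node.count
        node = node.next
    return 0


buckets = build_chained_hash([2, 5, 16, 28, 139, 38, 48, 28, 18])
print(fetch_count(buckets, 28))  # 2
print(fetch_count(buckets, 38))  # 1
print(fetch_count(buckets, 8))   # 0, index 8 is occupied but 8 itself is not stored
```

A fetch is no longer strictly O(1): it takes time proportional to the length of the chain at that index, which in the worst case (every element colliding) is O(n).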