# Hashlib 
hashlib is a built-in Python module for hashing messages into fixed-size binary hash values. This module implements a common interface to many different secure hash and message digest algorithms.

It includes the FIPS secure hash algorithms SHA1, SHA224, SHA256, SHA384 and SHA512, as well as RSA's MD5 algorithm.

### Hash algorithms
There is one constructor method named for each type of hash. All return a hash object with the same simple interface. For example, use `sha256()` or `md5()` to create a SHA-256 or MD5 hash object.

After creating a hash object, we can feed it a bytes-like object (in simple words, a byte string such as `b'string'`) using the `update()` method.

After feeding the data, we ask the hash object for the digest (the hash value of all the data fed so far) by using the `digest()` or `hexdigest()` methods.

To use hashlib, we first need to import it.

In [127]:
import hashlib
import os

First of all, we check all the objects and functions available in the hashlib module by using the `__all__` attribute.

In [5]:
hashlib.__all__

('md5',
 'sha1',
 'sha224',
 'sha256',
 'sha384',
 'sha512',
 'blake2b',
 'blake2s',
 'sha3_224',
 'sha3_256',
 'sha3_384',
 'sha3_512',
 'shake_128',
 'shake_256',
 'new',
 'algorithms_guaranteed',
 'algorithms_available',
 'pbkdf2_hmac')

The above list contains the hash algorithms 'md5', 'sha1', 'sha224', 'sha256', 'sha384', 'sha512', 'blake2b', 'blake2s', 'sha3_224', 'sha3_256', 'sha3_384', 'sha3_512', 'shake_128' and 'shake_256'.

`hashlib.new()` is a generic constructor that takes the string name of the desired algorithm as its first parameter. It also exists to allow access to the above listed hashes as well as any other algorithms that your OpenSSL library may offer.

`hashlib.algorithms_guaranteed`: A set containing the names of the hash algorithms guaranteed to be supported by this module on all platforms.

`hashlib.algorithms_available`: A set containing the names of the hash algorithms that are available in the running Python interpreter. These names will be recognized when passed to `new()`. _algorithms_guaranteed_ will always be a subset of it.
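As a small illustration of the generic constructor, the sketch below builds a SHA-256 object through `hashlib.new()` and compares it with the named constructor. The message bytes and the variable names are arbitrary choices made for this example.

```python
import hashlib

message = b"Hello world"  # arbitrary example input

# The generic constructor takes the algorithm name as a string.
generic = hashlib.new("sha256")
generic.update(message)

# The named constructor builds the same kind of object directly.
named = hashlib.sha256()
named.update(message)

same_digest = generic.hexdigest() == named.hexdigest()
print("same digest:", same_digest)

# Every guaranteed algorithm should also be reported as available.
is_subset = hashlib.algorithms_guaranteed <= hashlib.algorithms_available
print("guaranteed is a subset of available:", is_subset)
```

The named constructors are usually the simpler choice; `new()` is mainly useful when the algorithm name is only known at run time.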
 

**Now we will look at the MD5 hash algorithm**

#### MD5 algorithm:

In [7]:
# to make an md5 hash object we use the md5() constructor from the hashlib module
md5_hash = hashlib.md5()

Now we have created a hash object using the MD5 algorithm. All the constructors can also be given initial data as a bytes object, like *md5_hash = hashlib.md5(b"Hello world")*.
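To make the two ways of supplying data concrete, here is a minimal sketch that checks they give the same digest. The variable names are chosen only for this example.

```python
import hashlib

# Pass the data straight to the constructor...
direct = hashlib.md5(b"Hello world")

# ...or create an empty object and feed it afterwards.
stepwise = hashlib.md5()
stepwise.update(b"Hello world")

constructors_agree = direct.hexdigest() == stepwise.hexdigest()
print("same digest either way:", constructors_agree)
```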

After creating a hash object, we feed it some bytes by using the update() method.

In [43]:
md5_hash.update(b"Hello world")

After feeding some bytes with the update() method, we get the hash value of all the bytes fed so far by using the _digest()_ or _hexdigest()_ methods.

In [44]:
# digesting by using the digest method.
digest1 = md5_hash.digest()
print(digest1)

b'e\xaf\xd4;\xba\x1aZ~\xd7\x96\x1b\xb6\x9e\x02\xect'


In [45]:
# digesting by using the hexdigest method. It returns a hexadecimal formatted string.
digest2 = md5_hash.hexdigest()
print(digest2)

65afd43bba1a5a7ed7961bb69e02ec74


If we hash a different message, the hash value will be different.

In [46]:
# hash a different message with a new md5_hash object.
md5_hash = hashlib.md5() # here we create a new hash object for the new string.

'''
Note: if we don't create a new hash object here, the new bytes are appended to the b"Hello world"
already fed to the old object, and the hashed data becomes b"Hello worldHello World! Good Morning."

To avoid this we have created a new hash object.
''' 
md5_hash.update(b"Hello World! Good Morning.")

# digesting with hexdigest
digest2 =  md5_hash.hexdigest()
print(digest2)

cd240beb1569adcdf21f652157c25871


Now you can see that the two strings have different hash values. Next we will look into some attributes of a hash object.

Every hash object has some attributes. We will use dir() to see all of them.

Note: dir() returns all the attributes and methods of the given object, including a set of attributes that are common to every Python object. We will filter out those common attributes.

In [47]:
common_attribute = dir(object())

Now we will remove all the attributes which are common to every object.

In [30]:
print([i for i in dir(md5_hash) if i not in common_attribute])

['block_size', 'copy', 'digest', 'digest_size', 'hexdigest', 'name', 'update']


As you can see, we get some attributes as well as methods. We have already seen update, digest and hexdigest, so now we will look at the remaining ones.
```ruby
block_size: 'The block_size is the internal block size of the hash algorithm in bytes.'
digest_size: 'The digest_size is the size of the resulting hash in bytes.'
name: 'This returns the name of the hash algorithm which is used to create that hash object.'
copy: 'This is a method to make a copy of the hash object; the copy can be stored in another variable.'
```
Now we will look into each attribute one by one.

##### Hash object attributes and methods

In [48]:
# first of all we will look into the digest_size
md5_hash.digest_size

16

Here 16 tells us that the MD5 digest is always 16 bytes (128 bits) long, whatever the size of the input.
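The relation between `digest_size` and the length of the two digest forms can be checked with a short sketch. The algorithms and the input below are arbitrary picks for illustration; each byte of the digest becomes two hexadecimal characters.

```python
import hashlib

# Compare digest_size with the lengths of digest() and hexdigest()
# for a few algorithms; the input is an arbitrary example.
sizes = {}
for name in ("md5", "sha1", "sha256", "sha512"):
    h = hashlib.new(name, b"Hello world")
    sizes[name] = (h.digest_size, len(h.digest()), len(h.hexdigest()))
    print(name, sizes[name])
```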

In [49]:
# block size of the hash object.
md5_hash.block_size

64

In [50]:
# name of the algorithm used in that hash object.
md5_hash.name

'md5'

Now we will make a copy of the hash object (md5_hash).

In [51]:
new_hash = md5_hash.copy()

Now we have copied our hash object into another object. We will repeat all the above operations on this copy.

In [53]:
# working with b"Hello World! Good Morning."
print(new_hash.hexdigest()+"\n")


# now we will look at each attribute of the new_hash object
print("digest size: ",new_hash.digest_size)
print("block size: ",new_hash.block_size)
print("Name of the hash algorithm: ", new_hash.name)


cd240beb1569adcdf21f652157c25871

digest size:  16
block size:  64
Name of the hash algorithm:  md5


In [55]:
print(new_hash.hexdigest()+"\n")
print("len of hex string: ", len(new_hash.hexdigest()))

cd240beb1569adcdf21f652157c25871

len of hex string:  32


The new_hash object returns the same hash value as long as we don't feed new bytes to it. If we call the _update()_ method again, the new bytes are appended to the bytes already fed, and the object returns a new hash value for the combined bytes. To see this, we will use two different hash objects that share the same hash algorithm.

In [59]:
# Now we will create two hash objects 

# We will use this string: Hello World! how are you.
single = hashlib.md5() 
'''
We will feed this single hash object the complete string in one update() call, and we will compare
its hash value with the one from the other object.
'''

multi = hashlib.md5()
'''
We will feed this object the string in pieces with several update() calls, and after the last piece we will compare
the final hash value with the hash value of the single hash object.
'''

# feeding to single at once.
single.update(b"Hello World! how are you.")

single_hash_value = single.hexdigest()
print("hash value from single hash object: ",single_hash_value)


# feeding multi in two steps
multi.update(b"Hello World!")
print("first hash value from multi hash object: ",multi.hexdigest())

# feeding again
multi.update(b" how are you.") # note that the leading space is part of the data.
print("final hash value from multi hash object: ",multi.hexdigest())
      
multi_hash_value = multi.hexdigest()

hash value from single hash object:  1cbb0a57f31511b426e2b9f89d52d9d9
first hash value from multi hash object:  ed076287532e86365e841e92bfc50d8c
final hash value from multi hash object:  1cbb0a57f31511b426e2b9f89d52d9d9


Now we will compare both hash values (_single_hash_value_ and _multi_hash_value_).

In [60]:
print("single_hash_value and multi_hash_value are the same:", single_hash_value == multi_hash_value)

single_hash_value and multi_hash_value are the same: True


As we can see, they are the same: after concatenation both objects have received the same bytes, and the same bytes always give the same hash value.
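A common use of this property is hashing data that is too large to hold in memory at once. The sketch below hashes a temporary file in small chunks and compares the result with hashing the same bytes in one call. The helper `md5_of_file`, the chunk size and the file content are assumptions made for this example, not part of hashlib.

```python
import hashlib
import os
import tempfile

def md5_of_file(path, chunk_size=8192):
    """Hash a file piece by piece (hypothetical helper, not part of hashlib)."""
    file_hash = hashlib.md5()
    with open(path, "rb") as handle:
        # iter() with a sentinel keeps reading until read() returns b"".
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            file_hash.update(chunk)
    return file_hash.hexdigest()

data = b"Hello World! how are you." * 1000  # example content

# Write the example content to a temporary file.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(data)
    tmp_path = tmp.name

try:
    chunked = md5_of_file(tmp_path, chunk_size=64)
finally:
    os.remove(tmp_path)

at_once = hashlib.md5(data).hexdigest()
print("chunked == at once:", chunked == at_once)
```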

##### Putting all steps in single function.

In [80]:
def md5_single_hash(message):
    # accept str for convenience and encode it to bytes (UTF-8 by default)
    if isinstance(message, str):
        message = message.encode()
    elif not isinstance(message, bytes):
        # raise instead of printing, so the caller never gets None back silently
        raise TypeError(f"{message.__class__} is not supported in hashing, please pass a str or bytes object.")

    # make a new hash object on every call so earlier input is not mixed in
    Hash = hashlib.md5()
    Hash.update(message)
    return Hash.hexdigest()

In [82]:
md5_single_hash("Manish") # passing a str, not a bytes object.

'c6cbb5c7a518d7e4152f77e90227a83f'

In [83]:
md5_single_hash(b"Manish") # passing a bytes object.

'c6cbb5c7a518d7e4152f77e90227a83f'

## Key Derivation
Key derivation and key stretching algorithms are designed for secure password hashing. Naive algorithms such as sha1(password) are not resistant against brute-force attacks. A good password hashing function must be tunable, slow, and include a salt.

In this section we will see two key derivation functions.
1. pbkdf2_hmac()
2. scrypt()

To know more about these functions, see the Python documentation or Wikipedia.

### 1. pbkdf2_hmac()

This function returns the derived key as a bytes object. It takes five arguments:
```ruby
hash_name: "The name of the hash algorithm to use, such as 'sha256'."
password: "The password, as a bytes-like object."
salt: "Extra bytes, ideally random, that make the derived key unique."
iterations: "The number of hashing iterations; higher values make the derivation slower."
dklen: "The length of the derived key in bytes; if omitted, the digest size of the hash algorithm is used."
```
**Making  a hex key using key derivation method**

In [92]:
# deriving a key with PBKDF2 using the md5 hash algorithm
Hash = hashlib.pbkdf2_hmac(hash_name="md5", password=b"manish", salt=b"Kumar", iterations=100100, dklen=64)

Now if we print this.

In [95]:
# printing the hash value
print(Hash)

b'\xde\xf6\xc3!\xee\x1b\\(I\xa2Y\xfe\xfd\xac\xd9\xfd\xb9\x923\x83Z\xc9U\x08m\x8ahOp\xcc#G\x85\xaeK\\ x\xb9\xd3E\x00\xde#\xeb\x12;\x9b$\x86\xe5\xe7{7\xfc-\xda\xd1\xa4,J\x1b\xad4'


In [96]:
# printing the hash value in hex format
Hash.hex()

'def6c321ee1b5c2849a259fefdacd9fdb99233835ac955086d8a684f70cc234785ae4b5c2078b9d34500de23eb123b9b2486e5e77b37fc2ddad1a42c4a1bad34'
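In practice the salt is random and is stored next to the derived key, so that the key can be re-derived and compared when the password is checked. The following is only a sketch of that idea: the helper names `hash_password` and `check_password`, the SHA-256 choice, the 16-byte salt and the iteration count are all assumptions for illustration.

```python
import hashlib
import os
from hmac import compare_digest

ITERATIONS = 100_000  # assumed work factor for this sketch; tune it for your hardware

def hash_password(password, salt=None):
    """Return (salt, key). Hypothetical helper for illustration."""
    if salt is None:
        salt = os.urandom(16)  # fresh random salt for every password
    key = hashlib.pbkdf2_hmac("sha256", password, salt, ITERATIONS)
    return salt, key

def check_password(password, salt, expected_key):
    """Re-derive the key with the stored salt and compare in constant time."""
    _, key = hash_password(password, salt)
    return compare_digest(key, expected_key)

salt, stored_key = hash_password(b"manish")
print("correct password:", check_password(b"manish", salt, stored_key))
print("wrong password:", check_password(b"wrong password", salt, stored_key))
```

Because `dklen` is left out here, the key length equals the digest size of the chosen hash, which is 32 bytes for SHA-256.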

### 2. scrypt()

This function provides the scrypt password-based key derivation function as defined in RFC 7914. It takes the password as a positional argument plus six keyword-only parameters.
```ruby
scrypt(password, *, salt=None, n=None, r=None, p=None, maxmem=0, dklen=64)

password: "The bytes string from which the key is derived."
salt: "Extra bytes, ideally random, that make the derived key unique."
n: "The CPU/memory cost factor; it must be a power of 2 greater than 1."
r: "The block size parameter."
p: "The parallelization factor."
maxmem: "The memory limit in bytes; 0 uses the OpenSSL default."
dklen: "The length of the derived key in bytes."
```

**Now creating a new key using the scrypt key derivation function**

In [102]:
Hash = hashlib.scrypt(password = b"manish", salt=b"kumar", dklen = 64 , n=4, r=32, p=1, maxmem=1024 )
# printing the hash key
print(Hash)

ValueError: Invalid parameter combination for n, r, p, maxmem.

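The call fails because `maxmem=1024` limits scrypt to 1024 bytes of memory, while these values of `n` and `r` need roughly 128 * n * r = 16,384 bytes. Raising `maxmem`, or leaving it at 0 to use the OpenSSL default limit, lets the call succeed.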
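For comparison, here is a sketch of a call that stays within the default memory limit. The cost values `n=2**14`, `r=8`, `p=1` and the 16-byte random salt are assumptions of this example rather than a recommendation, and `hashlib.scrypt` is only present when Python is built against an OpenSSL version that provides scrypt.

```python
import hashlib
import os

salt = os.urandom(16)  # random salt, to be stored alongside the derived key

# n must be a power of two greater than 1. maxmem is left at its default (0),
# so the OpenSSL default memory limit applies.
key = hashlib.scrypt(b"manish", salt=salt, n=2**14, r=8, p=1, dklen=64)
print(key.hex())

# The same password, salt and cost parameters always give the same key.
same_key = hashlib.scrypt(b"manish", salt=salt, n=2**14, r=8, p=1, dklen=64)
print("keys match:", key == same_key)
```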

----
## BLAKE2
BLAKE2 is a cryptographic hash function defined in RFC 7693 that comes in two flavours:

`BLAKE2b:` optimized for 64-bit platforms and produces digests of any size between 1 and 64 bytes.

`BLAKE2s:` optimized for 32-bit platforms and produces digests of any size between 1 and 32 bytes.

BLAKE2 supports keyed mode, salted hashing, personalization and tree hashing.

#### Creating hash objects
```ruby
blake2b(data=b'', /, *, digest_size=64, key=b'', salt=b'', person=b'', fanout=1, depth=1, leaf_size=0, node_offset=0, node_depth=0, inner_size=0, last_node=False)
```
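`data:` initial chunk of data to hash, which must be a bytes-like object. It can be passed only as a positional argument.

`digest_size:` size of the output digest in bytes.

`key:` key for keyed hashing (up to 64 bytes for BLAKE2b, up to 32 bytes for BLAKE2s).

`salt:` salt for randomized hashing (up to 16 bytes for BLAKE2b, up to 8 bytes for BLAKE2s).

`person:` personalization string (up to 16 bytes for BLAKE2b, up to 8 bytes for BLAKE2s).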
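The size limits of these parameters are exposed as class constants, so they can be read instead of memorized. A minimal sketch that collects them for both flavours (the `limits` dictionary is just a name used in this example):

```python
import hashlib

# Size limits exposed as class constants on blake2b and blake2s.
limits = {}
for name, cls in (("blake2b", hashlib.blake2b), ("blake2s", hashlib.blake2s)):
    limits[name] = {
        "digest": cls.MAX_DIGEST_SIZE,
        "key": cls.MAX_KEY_SIZE,
        "salt": cls.SALT_SIZE,
        "person": cls.PERSON_SIZE,
    }
    print(name, limits[name])
```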

In [104]:
# description of the blake2b object using the help function
help(hashlib.blake2b)

Help on class blake2b in module _blake2:

class blake2b(builtins.object)
 |  blake2b(data=b'', /, *, digest_size=64, key=b'', salt=b'', person=b'', fanout=1, depth=1, leaf_size=0, node_offset=0, node_depth=0, inner_size=0, last_node=False)
 |  
 |  Return a new BLAKE2b hash object.
 |  
 |  Methods defined here:
 |  
 |  copy(self, /)
 |      Return a copy of the hash object.
 |  
 |  digest(self, /)
 |      Return the digest value as a bytes object.
 |  
 |  hexdigest(self, /)
 |      Return the digest value as a string of hexadecimal digits.
 |  
 |  update(self, data, /)
 |      Update this hash object's state with the provided bytes-like object.
 |  
 |  ----------------------------------------------------------------------
 |  Static methods defined here:
 |  
 |  __new__(*args, **kwargs) from builtins.type
 |      Create and return a new object.  See help(type) for accurate signature.
 |  
 |  ----------------------------------------------------------------------
 |  Data descrip

---
#### 1. Simple Hashing
Simple hashing works the same way as in the MD5 section above: we don't pass any additional parameters to the blake2b function.

In [106]:
# simple hashing needs no additional parameters.
# using blake2b, the flavour optimized for 64-bit platforms
Hash = hashlib.blake2b()
Hash.update(b"Hello World")

print(Hash.hexdigest())
print("digest size: ",Hash.digest_size)

4386a08a265111c9896f56456e2cb61a64239115c4784cf438e36cc851221972da3fb0115f73cd02486254001f878ab1fd126aac69844ef1c1ca152379d0a9bd
digest size:  64


**We can also use a different digest size.**

In [107]:
Hash = hashlib.blake2b(digest_size=30)
Hash.update(b"Hello World")

print(Hash.hexdigest())
print("digest size: ",Hash.digest_size)

080b13e1d4ef82de77f15ec360e0eee3e3b51cf974f512d9f8ce6b974b6d
digest size:  30


`Note:` The digest size is an important factor in making a hash secure. A larger digest gives more possible hash values and reduces the risk of collisions.
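One detail worth seeing in code: a shorter BLAKE2 digest is not simply the longer digest cut short, which agrees with the two outputs above. The sketch below checks this for the same message; the variable names are chosen for this example only.

```python
import hashlib

message = b"Hello World"

full = hashlib.blake2b(message).hexdigest()                   # 64-byte digest
short = hashlib.blake2b(message, digest_size=30).hexdigest()  # 30-byte digest

# The shorter digest is computed independently, not cut from the longer one.
is_prefix = full.startswith(short)
print("short digest is a prefix of the full one:", is_prefix)
```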

---
#### 2. Keyed Hashing
Keyed hashing can be used for authentication as a faster and simpler replacement for hash-based message authentication code (HMAC). BLAKE2 can be securely used in prefix-MAC mode.

To use keyed hashing we just need to pass a key to the blake2b function.

In [108]:
# making a keyed hashing.
Hash = hashlib.blake2b(key = b"Manish", digest_size=30)
Hash.update(b"Hello World")

print(Hash.hexdigest())
print("digest size: ",Hash.digest_size)

db158dca338ce0a4b92c547b5471363e17a8488430e8f18fe3b6647f10c6
digest size:  30


In [109]:
# making a keyed hashing.
Hash = hashlib.blake2b(key = b"Kumar", digest_size=30)
Hash.update(b"Hello World")

print(Hash.hexdigest())
print("digest size: ",Hash.digest_size)

4dd0db4006738a2139d26b0ea76cb92eaa82eb0ad79ba485b3ad6442c707
digest size:  30


As you can see, different keys give different hash values for the same message. By using keyed hashing we can make a simple user verification program.

In [110]:
from hmac import compare_digest

In [118]:
# to use keyed hashing we need two things: a key and a digest size.
KEY = b"Hello this is security key"
SIZE = 30 # 30 bytes

# Now we will make a sign function which will hash the user information.
def user_sign(user_info):
    # making a hash object
    Hash = hashlib.blake2b(key = KEY , digest_size=SIZE)
    # feed the message to the object
    Hash.update(user_info.encode())
    return Hash.hexdigest()


# now we will make a function that verifies the user information.
def varify_information(user_info, stored_info):
    """
    stored_info: the signed information which was stored in the system during sign up.
    user_info: the information given at login; it is signed and matched against the stored information.
                if they match, the user is verified.
                
                we use the compare_digest() function from the hmac module for the comparison.
    """
    # using the user_sign to hash the user info.
    result = user_sign(user_info)
    
    return compare_digest(result, stored_info)
    
    

Now we have the `user_sign` function to sign the user information and the `varify_information` function to verify it.

In [121]:
signin = user_sign("Manish Kumar")

Now we have signed up. If we log in as a different user, the function returns False because the user information is wrong.

In [124]:
login = varify_information("Rahul Kumar", signin)
print("login status: ",login)

login status:  False


Now if we login with the signed up user.

In [125]:
login = varify_information("Manish Kumar", signin)
print("login status: ",login)

login status:  True


---
#### 3. Randomized Hashing
By setting the salt parameter, users can introduce randomization to the hash function. Randomized hashing is useful for protecting against collision attacks on the hash function used in digital signatures.

In [126]:
hashlib.blake2b.SALT_SIZE

16

In [128]:
os.urandom(hashlib.blake2b.SALT_SIZE)

b'\t\x998\x995\xa18Eh8\x14\x11)\xf6\xaaG'

In [130]:
# making a randomized hash using a salt.
# here we hash one message with two different random salts
salt1 = os.urandom(hashlib.blake2b.SALT_SIZE)
salt2 = os.urandom(hashlib.blake2b.SALT_SIZE)

message = b"Hello every one how are you."

# hashing with first salt.
Hash1 = hashlib.blake2b(salt =salt1 ,digest_size=30)
Hash1.update(message)

# hashing with second salt.
Hash2 = hashlib.blake2b(salt =salt2 ,digest_size=30)
Hash2.update(message)

print("Hash with salt1: ",Hash1.hexdigest())
print("Hash with salt2: ",Hash2.hexdigest())
print("Are both the same: ",compare_digest(Hash1.hexdigest(),Hash2.hexdigest()))

Hash with salt1:  e8d9bd2e61442718691cac8a8e2b1b42ca19aa541b9045fc3251f337d12c
Hash with salt2:  048113583a2c63fd7985d9877caa6b928bc3e55d0918749db87e4a049369
Are both the same:  False
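Since a random salt changes the digest, the salt has to be kept together with the digest if the hash is ever to be recomputed. A minimal sketch of that round trip, with variable names chosen for this example:

```python
import hashlib
import os
from hmac import compare_digest

message = b"Hello every one how are you."

# Hash with a fresh random salt and keep the salt next to the digest.
salt = os.urandom(hashlib.blake2b.SALT_SIZE)
stored_digest = hashlib.blake2b(message, salt=salt, digest_size=30).hexdigest()

# Later: reuse the stored salt to recompute the digest and compare.
recomputed = hashlib.blake2b(message, salt=salt, digest_size=30).hexdigest()
matches = compare_digest(stored_digest, recomputed)
print("digests match with the stored salt:", matches)
```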


---
#### 4. Personalization Hashing
Sometimes it is useful to force hash function to produce different digests for the same input for different purposes.

In [131]:
# now we will use the same message but with two different personalization strings.
Message = b"Hello every one how are you."
Person1 = b"Manish Kuamr"
Person2 = b"Rahul Kumar"

# hashing with the first personalization string.
Hash1 = hashlib.blake2b(person=Person1 ,digest_size=30)
Hash1.update(Message)

# hashing with the second personalization string.
Hash2 = hashlib.blake2b(person=Person2 ,digest_size=30)
Hash2.update(Message)

print("Hash with person1: ",Hash1.hexdigest())
print("Hash with person2: ",Hash2.hexdigest())
print("Are both the same: ",compare_digest(Hash1.hexdigest(),Hash2.hexdigest()))

Hash with person1:  10c249724ce7a0f83200d0f5e5d0eee479182e10f1d4ebd3e908f52ce802
Hash with person2:  1422220c19571721c300c1a90a2a943167d243609bdf344e82dd963b70a2
Are both the same:  False
