# Hashing

- cryptographic package: https://www.pycryptodome.org/en/latest/src/hash/hash.html

In [1]:
from Crypto.Hash import MD5

In [2]:
# MD5.new() accepts only bytes-like data; the b prefix makes 'password' a bytes literal instead of a str
easy_password_hash = MD5.new(b'password')
easy_password_hash.hexdigest()

'5f4dcc3b5aa765d61d8327deb882cf99'

### Avalanche effect

In [3]:
print('The MD5 hash of \'{}\' is {}'.format('password', MD5.new(b'password').hexdigest()))
print('The MD5 hash of \'{}\' is {}'.format('passwore', MD5.new(b'passwore').hexdigest()))

for i in range(10):
    plaintext_password = 'password{}'.format(i)
    print('The MD5 hash of \'{}\' is {}'.format(plaintext_password, MD5.new(plaintext_password.encode('ascii')).hexdigest()))

The MD5 hash of 'password' is 5f4dcc3b5aa765d61d8327deb882cf99
The MD5 hash of 'passwore' is a826176c6495c5116189db91770e20ce
The MD5 hash of 'password0' is 305e4f55ce823e111a46a9d500bcb86c
The MD5 hash of 'password1' is 7c6a180b36896a0a8c02787eeafb0e4c
The MD5 hash of 'password2' is 6cb75f652a9b52798eb6cf2201057c73
The MD5 hash of 'password3' is 819b0643d6b89dc9b579fdfc9094f28e
The MD5 hash of 'password4' is 34cc93ece0ba9e3f6f235d4af979b16c
The MD5 hash of 'password5' is db0edd04aaac4506f7edab03ac855d56
The MD5 hash of 'password6' is 218dd27aebeccecae69ad8408d9a36bf
The MD5 hash of 'password7' is 00cdb7bb942cf6b290ceb97d6aca64a3
The MD5 hash of 'password8' is b25ef06be3b6948c0bc431da46c2c738
The MD5 hash of 'password9' is 5d69dd95ac183c9643780ed7027d128a


In cryptography, the avalanche effect is the desirable property of cryptographic algorithms wherein if an input is changed slightly (for example, flipping a single bit), the output changes significantly (e.g., half the output bits flip).

This property provides the following benefits:
- It prevents people from making predictions about the input when given only the output.
- It makes any loss of data integrity obvious, even when only a single bit has changed.
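The effect can be measured directly. In ASCII, 'd' is 0x64 and 'e' is 0x65, so 'password' and 'passwore' differ in exactly one input bit. The sketch below counts how many of the 128 output bits differ between the two digests. It uses the standard library's `hashlib` in place of `Crypto.Hash.MD5` (both compute the same MD5 digest), and `bit_difference` is a helper name made up for this illustration.

```python
import hashlib


def bit_difference(a: bytes, b: bytes) -> int:
    """Count the bit positions at which the MD5 digests of a and b differ."""
    digest_a = int.from_bytes(hashlib.md5(a).digest(), "big")
    digest_b = int.from_bytes(hashlib.md5(b).digest(), "big")
    # XOR leaves a 1 in every position where the two digests disagree
    return bin(digest_a ^ digest_b).count("1")


differing = bit_difference(b"password", b"passwore")
print("{} of 128 output bits differ".format(differing))
```

A count close to 64 (half of the output bits) is what a hash with a good avalanche effect is expected to produce.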



### Reverse lookup hashes
Try searching for '5f4dcc3b5aa765d61d8327deb882cf99' on the internet. Can you recover the original text from the hash?

Rainbow table attacks use precomputed tables of hashes to quickly recover a password from its hash.
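The idea of a precomputed table can be sketched with a plain dictionary that maps each hash back to its password. This is a simplification: a real rainbow table stores hash chains to save space, while the sketch stores every pair. The word list is hypothetical, and `hashlib` stands in for `Crypto.Hash.MD5`.

```python
import hashlib

# Hypothetical list of common passwords; real attack lists hold millions of entries
common_passwords = ["123456", "password", "qwerty", "letmein"]

# Precompute once: hash -> original password
lookup = {
    hashlib.md5(p.encode("ascii")).hexdigest(): p
    for p in common_passwords
}

# Reversing a leaked hash is now a single dictionary lookup
leaked_hash = "5f4dcc3b5aa765d61d8327deb882cf99"
print(lookup.get(leaked_hash))  # prints: password
```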

To defend against such attacks, add a salt before hashing. An attacker would then need precomputed hashes for every possible salt value, which significantly increases the size of the table needed for a reverse lookup. The longer the salt, the larger the required precomputed hash table.
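A rough way to see how quickly the table grows: with a 62-character alphanumeric set, every extra salt character multiplies the number of possible salts by 62, and a table covering all of them has to grow by the same factor. The helper `salt_space` below is a name invented for this illustration.

```python
def salt_space(char_set_size: int, length: int) -> int:
    """Number of distinct salts of the given length over the character set."""
    return char_set_size ** length


# 10 digits + 26 lowercase + 26 uppercase letters = 62 characters
for length in (1, 2, 4, 8, 16):
    count = salt_space(62, length)
    print("salt length {:>2}: {:,} possible salts".format(length, count))
```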

### Salt generation

A salt is a string of randomly generated characters that is appended to the original string before hashing. It makes dictionary attacks harder to execute, because the dictionary of known password hashes would have to be many times bigger than a dictionary for unsalted passwords.

## Exercise

1. Write a function that returns a random string of a given length drawn from a given character set. If no character set is provided, use the following default:

```
   char_set = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"
```

In [4]:
import random

def generate_salt(char_set: str = None, length: int = 16)-> str:
    """Generates a random string of alphanumeric characters as salt
    
    Args:
        char_set: string containing the characters used for salt generation
        length: the length of salt required
    Return:
        randomly generated salt based on char_set and length
    
    """
    ##################
    # YOUR CODE HERE #
    ##################
    
    if char_set is None:
        char_set = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ" # fill in the default character set here
    
    # Pick `length` characters independently at random from char_set
    return "".join(random.choice(char_set) for _ in range(length))
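Python's documentation notes that the `random` module should not be used for security purposes and points to the `secrets` module instead. A possible variant of the function above, assuming the same defaults, is sketched here; the name `generate_salt_secure` is made up for this example.

```python
import secrets
import string

# Same 62 characters as the default char_set in the exercise
DEFAULT_CHAR_SET = string.digits + string.ascii_lowercase + string.ascii_uppercase


def generate_salt_secure(char_set: str = DEFAULT_CHAR_SET, length: int = 16) -> str:
    """Return `length` characters drawn from char_set using the OS random source."""
    return "".join(secrets.choice(char_set) for _ in range(length))


print(generate_salt_secure())
print(generate_salt_secure("ABCDEF"))
```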


2. Generate and print the salt using the function above.

In [5]:
salt = generate_salt()
print(salt)

VhEV1XOFu6UDzvcG


3. Generate a salt that contains only the letters "ABCDEF".

In [6]:
salt = generate_salt("ABCDEF")
print(salt)

DECFACEFEBADFCBB


4. Hash 'password123' using MD5 and retrieve the hexdigest().

In [7]:
hashed_data = MD5.new(b'password123')
hashed_data.hexdigest()

'482c811da5d5b4bc6d497ffa98491e38'

With the retrieved value from the hash, go to https://md5.gromweb.com/ and try to reverse the hash.

Using the salt that you randomly generated, update the hash object so that the hash includes the salt.

In [8]:
# update() feeds more data into the same hash object, so the digest now covers 'password123' followed by the salt
hashed_data.update(salt.encode('ascii'))
hashed_data.hexdigest()

'58ac22dfe729c40fa2414d212a3b8f65'
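Calling `update()` on an existing hash object is equivalent to hashing the concatenation of everything fed in so far. The sketch below checks this with the standard library's `hashlib` (swapped in for `Crypto.Hash.MD5`), using the salt value printed earlier purely as an example; your own salt will differ.

```python
import hashlib

password = b"password123"
salt = "DECFACEFEBADFCBB".encode("ascii")  # example value; use your own generated salt

# Incremental: hash the password first, then feed in the salt
incremental = hashlib.md5(password)
incremental.update(salt)

# One shot: hash the password and salt as a single byte string
one_shot = hashlib.md5(password + salt)

print(incremental.hexdigest() == one_shot.hexdigest())  # prints: True
```

Because the salt is part of the hashed input, it has to be stored next to the hash so that the same computation can be repeated when a password is checked later.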

With the newly retrieved value from the hash, you can try to reverse the hash again at https://md5.gromweb.com/

The lookup will very likely fail this time, because the salted input is unlikely to appear in any precomputed table.

So remember to add salt to your hashes!

## End of Exercise