# Encryption and Hashing with Python

We've experimented with creating simple ciphers for encryption, but for real encryption we should rely on well-established protocols and algorithms, because the simple ciphers we've created can be broken easily with techniques such as frequency analysis. Let's explore two concepts in cryptography: hashing and encryption. We will use real libraries for this instead of writing our own methods.

## Hashing

Hashing refers to the process of passing information through a hash function. A hash function is any function that can be used to map data of arbitrary size to data of fixed size. The values returned by a hash function are called hash values, hash codes, digests, or simply hashes.

In the context of security, hashing refers to taking a piece of data, such as a document or password, and passing it through a cryptographically secure hash function that is close to impossible to reverse. This allows for secure storage of passwords. For example, when you type a password into Gmail, the password is passed through a hash function that turns it into a hashed version of your password, which can be thought of as a mix of letters and numbers. Gmail then checks that the hash of what you typed matches the hash they have on file; if it does, they let you log in. This is secure because even if someone "hacks" Gmail's servers and can see everything, they will only have access to the hashed versions of the passwords, which are close to impossible to reverse.
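The login check described above can be sketched in a few lines. This is a minimal illustration, not how any particular service works: the function names hash_password and check_login are hypothetical, and a single pass of SHA3-256 is used only to mirror the description. Production systems typically add a per-user salt and use a deliberately slow function such as PBKDF2, scrypt, or Argon2.

```python
import hashlib
import hmac


def hash_password(password: str) -> str:
    """Return the SHA3-256 hex digest of a password (illustrative only)."""
    return hashlib.sha3_256(password.encode("utf-8")).hexdigest()


def check_login(attempt: str, stored_hash: str) -> bool:
    """Hash the attempt and compare it with the hash kept on file."""
    # compare_digest runs in constant time, which avoids timing leaks
    return hmac.compare_digest(hash_password(attempt), stored_hash)


# At sign-up the service keeps only the hash, never the password itself
stored = hash_password("correct horse battery staple")

print(check_login("correct horse battery staple", stored))
print(check_login("wrong guess", stored))
```

The server never needs the original password after sign-up; it only needs to recompute the hash and compare.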

There are [many hashing algorithms](https://en.wikipedia.org/wiki/Cryptographic_hash_function). Some older ones, such as MD5, have been deemed too insecure to still be used. Some of the most popular cryptographic hash functions, such as the SHA-1 and SHA-2 families, were designed by the NSA.

The ideal cryptographic hash function has five main properties:

* it is deterministic so the same message always results in the same hash
* it is quick to compute the hash value for any given message
* it is infeasible to generate a message from its hash value except by trying all possible messages
* a small change to a message should change the hash value so extensively that the new hash value appears uncorrelated with the old hash value
* it is infeasible to find two different messages with the same hash value
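Some of these properties can be observed directly. The sketch below checks determinism, the fixed output size, and how many output bits change when one letter of the input changes case. The helper names sha3_bytes and differing_bits are hypothetical, introduced only for this illustration.

```python
import hashlib


def sha3_bytes(message: bytes) -> bytes:
    """Return the raw 32-byte SHA3-256 digest of a message."""
    return hashlib.sha3_256(message).digest()


def differing_bits(a: bytes, b: bytes) -> int:
    """Count the bit positions in which two equal-length byte strings differ."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))


d1 = sha3_bytes(b"hello")
d2 = sha3_bytes(b"hello")
d3 = sha3_bytes(b"Hello")
long_digest = sha3_bytes(b"hello" * 10_000)

print(d1 == d2)                    # same message, same hash
print(len(d1), len(long_digest))   # digest size does not depend on input size
print(differing_bits(d1, d3), "of", len(d1) * 8, "bits differ")
```

For a well-behaved hash function, roughly half of the output bits are expected to differ after even a one-character change.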

Let's explore hashing with Python's built-in hashlib module.

In [1]:
import hashlib

As you can see, there are many algorithms available. Most of them have their own dedicated documentation, which you can find with a quick web search.

In [4]:
hashlib.algorithms_available
# Some are not recommended, e.g. MD4, MD5
# Others are more recent, e.g. BLAKE2, SHA-3
# More algorithms are available from third-party packages as well

{'blake2b',
 'blake2b512',
 'blake2s',
 'blake2s256',
 'md4',
 'md5',
 'md5-sha1',
 'mdc2',
 'ripemd160',
 'sha1',
 'sha224',
 'sha256',
 'sha3-224',
 'sha3-256',
 'sha3-384',
 'sha3-512',
 'sha384',
 'sha3_224',
 'sha3_256',
 'sha3_384',
 'sha3_512',
 'sha512',
 'sha512-224',
 'sha512-256',
 'shake128',
 'shake256',
 'shake_128',
 'shake_256',
 'sm3',
 'whirlpool'}
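The set above varies between machines. As a small sketch, hashlib also exposes algorithms_guaranteed, the subset present on every platform, and hashlib.new(), which builds a hash object from an algorithm name given as a string:

```python
import hashlib

# algorithms_guaranteed lists the names that exist on every platform
print(sorted(hashlib.algorithms_guaranteed))

# hashlib.new() selects an algorithm by name at run time
by_name = hashlib.new("sha256", b"hello").hexdigest()
by_constructor = hashlib.sha256(b"hello").hexdigest()

# Both routes produce the same digest
print(by_name == by_constructor)
```

Selecting by name is handy when the algorithm comes from configuration, while the named constructors are the more direct choice otherwise.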

We can create a hash object that lets us use an algorithm to produce a hexdigest of the bytes provided. Remember that hashes are one-way only; for two-way operations, wait until we discuss encryption and decryption.

In [6]:
hash_obj = hashlib.sha3_256()
# SHA-3 was selected through an open competition run by NIST

In [7]:
hash_obj.update(b'hello') # byte string

In [9]:
hash_obj.hexdigest() # the hashed output of the byte string

'3338be694f50c5f338814986cdf0686453a888b84f424d792af4b9202398f392'

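Now let's change the input and see how much the hash changes. Note that update() appends to the data the object has already consumed, so the call below hashes b'helloHello' rather than b'Hello' on its own; to hash a new message by itself, create a fresh hash object.

In [10]:
hash_obj.update(b'Hello')  # appended to the earlier b'hello'

In [11]:
hash_obj.hexdigest() # hashed (not encrypted) output is completely different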

'dbf2373d7f1f49c6bc4ded308a7d22426b376d7c028619fc03d8a0daa14d307d'
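Because update() appends to whatever the object has already consumed, calling it twice is equivalent to hashing the concatenated input. A short sketch makes this visible and shows how to hash a second message on its own with a fresh object:

```python
import hashlib

incremental = hashlib.sha3_256()
incremental.update(b"hello")
incremental.update(b"Hello")  # appended to the earlier data

# Hashing the concatenation in one call gives the same digest
one_shot = hashlib.sha3_256(b"helloHello")
print(incremental.hexdigest() == one_shot.hexdigest())

# To hash b'Hello' alone, start from a fresh object
fresh = hashlib.sha3_256(b"Hello")
print(fresh.hexdigest() != incremental.hexdigest())
```

This appending behavior is what allows large files to be hashed chunk by chunk without loading them into memory at once.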

## Encryption

Now that we understand how hashing works with the built-in hashlib library in Python, let's explore encryption. For this we will need to install another open source library, simply known as cryptography. Install it by running the following at your command line:

    pip install cryptography
    
Encryption is a hard problem because it has to work both ways: you encrypt a message so that others can't read it, but someone with the proper access must still be able to read it. You also need to resist techniques such as [frequency analysis](https://en.wikipedia.org/wiki/Frequency_analysis) and other methods of breaking ciphers. [Cryptography itself is heavily rooted in mathematics and interested parties should definitely read about the amazing history of the topic.](https://en.wikipedia.org/wiki/Cryptography#Modern_cryptography)

For our use cases we will use simple symmetric (secret-key) cryptography. In this scheme we generate a "secret" key that is used both to encrypt and to decrypt messages. We then use this key to create a cipher object that can encrypt or decrypt messages with that key.

Let's see an example:

In [12]:
from cryptography.fernet import Fernet

In [13]:
key = Fernet.generate_key()

In [14]:
key

b'9TKSwUSH5fIxxIIzZ2Py1FSgVGkcIvrOwWgVAAWrZHU='

In [15]:
cipher = Fernet(key)

Now let's encrypt a simple message.

In [16]:
cipher.encrypt(b'Hello are you there?')

b'gAAAAABegtESrTtgH_26yZfkilnCIz_BAjsb0mEnscVjSuJXeY5HUuU4wexOVFFJ3Gy057bFX3cUQtKLdOxhZz0zOQMhkCrMTrRvV9UBB-3nXLZJTLjl75s='

In [17]:
text = b'Hello are you there?' # Note the b'' in front of the string

In [18]:
encrypted_text = cipher.encrypt(text)

In [19]:
encrypted_text

b'gAAAAABegtFnVxMziEpphskPBSVmN6GZqwtXKjILGgwxboGw77xG7Im_Zc67fUwi1hqGxI28fXXF0m-LsqqfC6IlWc4XPqmIPii030G9zLIhIlIooHawCX4='

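Notice that encrypting the same text twice produced two different tokens; Fernet includes a timestamp and a random initialization vector in every token. It is now very difficult for someone to work out what is in this message without the secret key. With the key, however, we can easily decrypt it: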

In [20]:
other_cipher = Fernet(key) # The other user could create their own cipher

In [22]:
decrypted_text = other_cipher.decrypt(encrypted_text)

In [23]:
decrypted_text

b'Hello are you there?'
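The other half of this claim is that a different key does not work. In the cryptography library, Fernet.decrypt raises InvalidToken when a token does not verify under the key in use. The sketch below, which assumes the library is installed as described earlier, wraps that in a hypothetical helper named try_decrypt:

```python
from cryptography.fernet import Fernet, InvalidToken

right_key = Fernet.generate_key()
wrong_key = Fernet.generate_key()

token = Fernet(right_key).encrypt(b"Hello are you there?")


def try_decrypt(key: bytes, token: bytes):
    """Return the plaintext, or None if the token does not verify under this key."""
    try:
        return Fernet(key).decrypt(token)
    except InvalidToken:
        return None


print(try_decrypt(right_key, token))  # the original message
print(try_decrypt(wrong_key, token))  # None: the wrong key is rejected
```

Fernet authenticates the token as well as encrypting it, so a wrong key is rejected outright rather than producing garbage output.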

### Creating Custom Key

So far we've only generated keys with Fernet.generate_key(). A Fernet key is 32 random bytes encoded as URL-safe base64. If we wanted to derive our own key from a simple string, we could pass that string through a hash function with a 256-bit (32-byte) output, such as SHA3-256, and then encode the digest with the [base64 library](https://docs.python.org/3/library/base64.html) to use it as the Fernet key.

**Please note that, in general, creating a custom key from a string this way is not recommended, as it leads to keys that are not as secure as the ones generated by the cryptography library. [Full Discussion Link](https://github.com/pyca/cryptography/issues/1333).**

Let's look at an example:

In [25]:
keyword = b'my custom string' # Create the keyword string password

In [26]:
# Use SHA3 256 bit to Hash the keyword string
key = hashlib.sha3_256(keyword)

In [27]:
key

<_sha3.sha3_256 at 0x2a2cf433530>

In [30]:
# Get the digest - this is not a format that Fernet can accept; it needs to be encoded to base64
key.digest()

b'\x86\x04<\x12\xb1\xbc\x97\xa2\xc3hd\x1b\xdf\xcb\xa6\xa6\x88\x1bSU<kE\xd2\x13\xdf\x7f\xca\xb7\x91,\x81'

In [31]:
key.hexdigest()

'86043c12b1bc97a2c368641bdfcba6a6881b53553c6b45d213df7fcab7912c81'

Now we need to encode the digest to base64 to have it in a format that Fernet will accept:

In [32]:
import base64

In [33]:
# The bytes digest
key_bytes = key.digest()

# Encode the bytes digest
fernet_key = base64.urlsafe_b64encode(key_bytes)

In [34]:
fernet_key

b'hgQ8ErG8l6LDaGQb38umpogbU1U8a0XSE99_yreRLIE='

In [35]:
custom_cipher = Fernet(fernet_key)

In [37]:
encrypted_msg = custom_cipher.encrypt(b'this is a secret message')

In [38]:
encrypted_msg

b'gAAAAABegtQEm5_7Zc5zu5TFc1ARWlvBnD58UawnIwiuaH87NZRZcLmzkudgi-gKYe5SV0oPqKQN1FO1S9sT7MdN7i_WGOBmyfu9lV6LFkTqNXf1qZmgMDE='

In [39]:
custom_cipher.decrypt(encrypted_msg)

b'this is a secret message'
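When a key really must come from a passphrase, a key derivation function is a common alternative to a single hash pass. The following is a minimal sketch using PBKDF2 from the standard library. The function name derive_fernet_key, the 16-byte salt, and the iteration count are illustrative assumptions rather than values taken from this notebook.

```python
import base64
import hashlib
import os


def derive_fernet_key(passphrase: bytes, salt: bytes, iterations: int = 600_000) -> bytes:
    """Stretch a passphrase into 32 bytes and encode them in Fernet's key format."""
    raw_key = hashlib.pbkdf2_hmac("sha256", passphrase, salt, iterations, dklen=32)
    return base64.urlsafe_b64encode(raw_key)


# The salt is not secret, but it must be stored to re-derive the same key later
salt = os.urandom(16)
fernet_key = derive_fernet_key(b"my custom string", salt)

print(len(fernet_key))  # 44 base64 characters encode the 32 raw bytes
```

The resulting bytes have the same shape as the output of Fernet.generate_key(), so they can be passed to Fernet() in the same way as fernet_key above. The many iterations make each guess at the passphrase slower for an attacker.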

_____
Excellent work, agent! It's time to apply this in the field!