# Encryption and Hashing with Python

We've experimented with creating simple ciphers for encryption, but for real encryption we should rely on well-established protocols and algorithms, because the simple ciphers we've created can be easily broken with basic techniques such as frequency analysis. Let's explore two concepts in cryptography: hashing and encryption. We will use real libraries for this instead of writing our own methods.

## Hashing

Hashing refers to the process of passing information through a hash function. A hash function is any function that can be used to map data of arbitrary size to data of fixed size. The values returned by a hash function are called hash values, hash codes, digests, or simply hashes.
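To make the "arbitrary size in, fixed size out" idea concrete, here is a minimal sketch using SHA-256 from the standard library. The input sizes are arbitrary choices for illustration.

```python
import hashlib

# One byte of input and one million bytes of input.
short_digest = hashlib.sha256(b"a").digest()
long_digest = hashlib.sha256(b"a" * 1_000_000).digest()

# SHA-256 always returns 32 bytes (256 bits), whatever the input size.
print(len(short_digest), len(long_digest))
```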

In the context of security, hashing refers to taking a piece of data, such as a document or password, and passing it through a cryptographically secure hash function that is close to impossible to reverse. This allows for secure storage of passwords. For example, when you type a password into Gmail, the password is passed through a hash function that turns it into a hashed version of your password, which looks like a seemingly random mix of letters and numbers. Gmail then checks that the hash of what you typed matches the hash they have on file; if it does, they let you log in. This is more secure than storing the password itself, because even if someone "hacks" Gmail's servers and can see everything, they will only have access to the hashed versions of the passwords, which are close to impossible to reverse.
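The login check described above can be sketched as follows. This is an illustration under stated assumptions, not how Gmail or any particular service actually works: the function names `hash_password` and `verify_password` are hypothetical, and instead of a single plain hash it uses `hashlib.pbkdf2_hmac` with a random salt, a common way to make stored password hashes harder to guess in bulk. The iteration count is an illustrative value only.

```python
import hashlib
import hmac
import os

ITERATIONS = 100_000  # illustrative work factor, not a recommendation


def hash_password(password, salt=None):
    """Return (salt, digest) for a password; a new random salt is made if none is given."""
    if salt is None:
        salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, digest


def verify_password(password, salt, stored_digest):
    """Re-hash the attempt with the stored salt and compare in constant time."""
    _, candidate = hash_password(password, salt)
    return hmac.compare_digest(candidate, stored_digest)


# The server keeps only the salt and the digest, never the password itself.
salt, stored = hash_password("correct horse")
print(verify_password("correct horse", salt, stored))
print(verify_password("wrong guess", salt, stored))
```

Only the salt and digest need to be stored; the password is hashed again at each login attempt and the two digests are compared.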

There are [many hashing algorithms](https://en.wikipedia.org/wiki/Cryptographic_hash_function); some older ones, such as MD5, have been deemed too insecure to still be used. Some of the most popular cryptographic hash functions were created by the NSA.

The ideal cryptographic hash function has five main properties:

* it is deterministic so the same message always results in the same hash
* it is quick to compute the hash value for any given message
* it is infeasible to generate a message from its hash value except by trying all possible messages
* a small change to a message should change the hash value so extensively that the new hash value appears uncorrelated with the old hash value
* it is infeasible to find two different messages with the same hash value
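Two of these properties, determinism and the "small change, very different hash" behavior, are easy to observe directly. The sketch below uses SHA3-256; the helper names `sha3_hex` and `differing_bits` are made up for this example.

```python
import hashlib


def sha3_hex(message):
    """Hash a bytes message with SHA3-256 and return the hex digest."""
    return hashlib.sha3_256(message).hexdigest()


def differing_bits(hex_a, hex_b):
    """Count the bit positions at which two equal-length hex digests differ."""
    return bin(int(hex_a, 16) ^ int(hex_b, 16)).count("1")


first = sha3_hex(b"Hello")
again = sha3_hex(b"Hello")    # same message, hashed a second time
changed = sha3_hex(b"hello")  # only the case of the first letter differs

print(first == again)  # deterministic: identical digests
print(differing_bits(first, changed), "of 256 bits differ")
```

For a good hash function, roughly half of the output bits should be expected to flip after even a one-character change.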

Let's explore hashing with Python's built-in hashlib module:

In [1]:
import hashlib

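As the output below shows, there are many algorithms available. The exact set depends on your Python version and the OpenSSL build it is linked against, so your list may differ. Most of these algorithms have their own dedicated documentation, which you can find with a quick Google search.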

In [3]:
hashlib.algorithms_available

{'DSA',
 'DSA-SHA',
 'MD4',
 'MD5',
 'RIPEMD160',
 'SHA',
 'SHA1',
 'SHA224',
 'SHA256',
 'SHA384',
 'SHA512',
 'blake2b',
 'blake2s',
 'dsaEncryption',
 'dsaWithSHA',
 'ecdsa-with-SHA1',
 'md4',
 'md5',
 'ripemd160',
 'sha',
 'sha1',
 'sha224',
 'sha256',
 'sha384',
 'sha3_224',
 'sha3_256',
 'sha3_384',
 'sha3_512',
 'sha512',
 'shake_128',
 'shake_256',
 'whirlpool'}
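Because the available set varies between installations, code meant to run everywhere may prefer names from `hashlib.algorithms_guaranteed`. The short sketch below also shows `hashlib.new()`, which selects an algorithm by its string name; the choice of `"sha256"` is just an example.

```python
import hashlib

name = "sha256"

# algorithms_guaranteed lists the names present on every platform.
print(name in hashlib.algorithms_guaranteed)

# hashlib.new() looks the algorithm up by name; the named constructor is equivalent.
by_name = hashlib.new(name, b"Hello").hexdigest()
direct = hashlib.sha256(b"Hello").hexdigest()
print(by_name == direct)
```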

We can create a hash object that will allow us to use an algorithm to create a hexdigest version of the bytes provided. Remember that hashes are one-way only; for two-way transformations, wait until we discuss encryption and decryption.

In [4]:
hash_obj = hashlib.sha3_256()

In [5]:
hash_obj.update(b"Hello")

In [6]:
hash_obj.hexdigest()

'8ca66ee6b2fe4bb928a8e3cd2f508de4119c0895f22e011117e22cf9b13de7ef'

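Now let's make a minimal change to the string and see how much the hash changes. Keep in mind that `update()` appends to whatever the hash object has already been fed, so calling it on the same `hash_obj` produces the digest of `b"Hello"` followed by `b"hello"`, not of `b"hello"` alone. To hash a new message by itself, create a fresh hash object first.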

In [7]:
hash_obj.update(b'hello')

In [9]:
hash_obj.hexdigest()

'e52212c71ea57784000b60cae4d0d6a8ab08e17ad72902525a2cbe7e87f77ab6'
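A minimal sketch of how `update()` accumulates data: feeding a hash object in two pieces gives the same digest as hashing the concatenated bytes in one call, and a different digest from hashing the second piece alone. The variable names here are chosen for illustration.

```python
import hashlib

# Feed the same object twice, as in the cells above.
incremental = hashlib.sha3_256()
incremental.update(b"Hello")
incremental.update(b"hello")

# Hash the concatenation in one go, and the second message by itself.
one_shot = hashlib.sha3_256(b"Hellohello")
fresh = hashlib.sha3_256(b"hello")

print(incremental.hexdigest() == one_shot.hexdigest())  # same data, same digest
print(incremental.hexdigest() == fresh.hexdigest())     # different data, different digest
```

This incremental behavior is useful for hashing large files chunk by chunk without loading them into memory all at once.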

## Encryption

Now that we understand how hashing works with Python's built-in hashlib library, let's explore encryption. For this we will need to install another open source library, simply known as cryptography. Install it by running the following at your command line:

    pip install cryptography
    
Encryption is a hard problem because it has to work in both directions: you must encrypt a message so that others can't read it, while still letting someone with the proper access read it. You also need to guard against techniques such as [frequency analysis](https://en.wikipedia.org/wiki/Frequency_analysis) and other methods of breaking the encryption. [Cryptography itself is heavily rooted in mathematics and interested parties should definitely read about the amazing history of the topic.](https://en.wikipedia.org/wiki/Cryptography#Modern_cryptography)

For our use cases we will use simple secret-key (symmetric) cryptography. In this situation we generate a "secret" key, and the same key is used both to encrypt and to decrypt a message. We will then use this key to create a cipher object that can encrypt or decrypt messages with that secret key.

Let's see an example:

In [27]:
from cryptography.fernet import Fernet

In [28]:
key = Fernet.generate_key()

In [29]:
# Note the b'' in front of the string
key

b'X9yqpLviaoNgsA6lQzmb5exX_AnWkoHvt7nCqZ-gb6I='

In [30]:
cipher = Fernet(key)

Now let's encrypt a simple message.

In [31]:
text = b"Hello. Are you there?"

In [32]:
encrypted_text = cipher.encrypt(text)

In [34]:
encrypted_text 

b'gAAAAABaQXsCt_lr6ISA_Zn4VVKMGyksSaaNkKJf2ZEX-Rhw5IPwCHfd_6EM4vDh9vmZ2QeU-CFVzHoZS6SAKdJzbU3HI_ggCV3hgzU5VpQ92tK5vaacriE='

It is now very difficult for someone to work out what is in this message without the secret key. With the key, however, we can easily decrypt it:

In [35]:
# The other user could create their own cipher
other_cipher = Fernet(b'X9yqpLviaoNgsA6lQzmb5exX_AnWkoHvt7nCqZ-gb6I=')

In [36]:
decrypted_text = other_cipher.decrypt(encrypted_text)

In [37]:
decrypted_text

b'Hello. Are you there?'
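To see what happens without the right key, the sketch below tries to decrypt a token with a freshly generated, unrelated key. It assumes the `cryptography` package is installed as described above; `Fernet.decrypt` raises `InvalidToken` when the key does not match or the token has been altered. The variable names are chosen for illustration.

```python
from cryptography.fernet import Fernet, InvalidToken

key = Fernet.generate_key()
token = Fernet(key).encrypt(b"Hello. Are you there?")

# A cipher built from a different key cannot read the token.
wrong_cipher = Fernet(Fernet.generate_key())
try:
    wrong_cipher.decrypt(token)
    wrong_key_rejected = False
except InvalidToken:
    wrong_key_rejected = True

print(wrong_key_rejected)

# The original key still decrypts it. Fernet works on bytes, so decode to get a str.
message = Fernet(key).decrypt(token).decode("utf-8")
print(message)
```

Since the key is the only thing protecting the message, it should be shared with the recipient over a channel you trust and never stored next to the encrypted data.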

Excellent work, agent! It's time to apply this in the field!