# Hashing
Message -> Hash Function -> Hash Value (number)

variable-length input -> fixed-length output

## Hash function properties
- deterministic behaviour - a given input always produces the same output
- fixed-length hash values - the output size does not depend on the input size
- avalanche effect - a small change in the message results in a large change in the hash value

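E.g. the built-in `hash()` function is used to hash keys in Python dicts. For `str` and `bytes` its values are salted per process, so they are repeatable only within one interpreter run unless `PYTHONHASHSEED` is set. Cryptographic hash functions must meet additional properties.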


## Cryptographic hash function properties 
- One-way function property (preimage resistance) - it must be infeasible to recover the input from a given output
- Weak collision resistance (second-preimage resistance) - given one message, it is infeasible to find a second message that hashes to the same value
- Strong collision resistance - it is infeasible to find any two messages that hash to the same value
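
Collision resistance depends heavily on the output length. The sketch below illustrates this by deliberately weakening SHA-256 to its first 2 bytes and brute-forcing a collision. The helper names `truncated_sha256` and `find_collision` are made up for this illustration; only `hashlib.sha256` comes from the standard library.

```python
import hashlib


def truncated_sha256(data: bytes, n_bytes: int = 2) -> bytes:
    """Return only the first n_bytes of the SHA-256 digest (a deliberately weak hash)."""
    return hashlib.sha256(data).digest()[:n_bytes]


def find_collision(n_bytes: int = 2) -> tuple[bytes, bytes]:
    """Brute-force two different messages with the same truncated hash."""
    seen = {}  # truncated hash -> first message that produced it
    counter = 0
    while True:
        message = str(counter).encode()
        short = truncated_sha256(message, n_bytes)
        if short in seen:
            return seen[short], message
        seen[short] = message
        counter += 1


first, second = find_collision()
print(first, second, truncated_sha256(first).hex())
```

With only 16 bits of output a collision must appear within 65537 attempts. The full 256-bit digest makes the same search infeasible.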


In [None]:
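# Same input, same value (within one interpreter run; str hashes are salted per process)
print(hash("dupa"))
print(hash("dupa"))
# A slightly different input gives an unrelated value
print(hash("dupa2"))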

## Cryptographic hashing in Python

In [None]:
import hashlib

# list of all hash algorithms
print(hashlib.algorithms_available)

# list of hash algorithms available on all platforms
print(hashlib.algorithms_guaranteed)

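`MD5` and `SHA1` are broken with respect to collision resistance and should no longer be used for data integrity or any other security purpose.

Use `SHA2` (the current standard), `SHA3` (the newest standard) or `BLAKE2` (fast). The most common choice is `SHA256`.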
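
As a quick orientation, the recommended algorithms can be instantiated by name with `hashlib.new()`. The sketch below prints the digest size of a few of them. The selection of names and the `sizes` dict are choices made for this illustration.

```python
import hashlib

data = b'message'
sizes = {}  # algorithm name -> digest size in bytes

# All four names are part of hashlib.algorithms_guaranteed
for name in ('sha256', 'sha512', 'sha3_256', 'blake2b'):
    h = hashlib.new(name, data)
    sizes[name] = h.digest_size
    print(f'{name:10} {h.digest_size:3} bytes  {h.hexdigest()[:16]}...')
```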


In [None]:
import hashlib

# Python 3 strings are sequences of Unicode code points, not bytes
# Hash functions take bytes, so a str must be encoded first (encode() defaults to UTF-8)
hash1 = hashlib.sha256(b'duuupa')
hash2 = hashlib.sha256('duuupa'.encode())
print(hash1.digest_size, 'bytes')

# Hash value as a hexadecimal str
print(hash1.hexdigest())
print(hash2.hexdigest())

# Hash value in bytes
print(hash1.digest())
print(hash2.digest())
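
The avalanche effect listed among the hash function properties can be observed directly on SHA-256. The sketch below changes a single input bit and counts how many output bits flip. `bit_difference` is a helper written for this illustration.

```python
import hashlib


def bit_difference(a: bytes, b: bytes) -> int:
    """Count the bits that differ between two equal-length byte strings."""
    return sum(bin(x ^ y).count('1') for x, y in zip(a, b))


# 'e' (0x65) and 'd' (0x64) differ in exactly one bit
d1 = hashlib.sha256(b'message').digest()
d2 = hashlib.sha256(b'messagd').digest()

changed = bit_difference(d1, d2)
print(f'{changed} of {len(d1) * 8} output bits differ')
```

For a good hash function roughly half of the output bits are expected to change.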


In [None]:
from hashlib import sha256

# Chunked hash generation using update()
many = sha256()
many.update(b'm')
many.update(b'e')
many.update(b's')
many.update(b's')
print(many.hexdigest())

print(sha256(b'mess').hexdigest())
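
Chunked hashing with `update()` is mostly useful for data that should not be loaded into memory at once, such as large files. A minimal sketch, assuming a 64 KiB chunk size and a throwaway temporary file (`sha256_file` and `sample.bin` are names invented for this example):

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_file(path: Path, chunk_size: int = 64 * 1024) -> str:
    """Hash a file chunk by chunk so it never has to fit in memory."""
    h = hashlib.sha256()
    with path.open('rb') as f:
        # iter() with a sentinel keeps calling f.read() until it returns b''
        for chunk in iter(lambda: f.read(chunk_size), b''):
            h.update(chunk)
    return h.hexdigest()


# Demo on a throwaway file
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / 'sample.bin'
    payload = b'mess' * 100_000
    sample.write_bytes(payload)
    file_digest = sha256_file(sample)

# The chunked digest matches the one computed in a single call
print(file_digest == hashlib.sha256(payload).hexdigest())
```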

## Checksum functions
Checksums (e.g. CRC, Adler-32) are fast but have no meaningful collision resistance: they can detect accidental errors, not deliberate tampering.

Hash functions (SHA2 family, SHA3 family) are slower but collision resistant, so they can be used to verify data integrity.

In [None]:
import zlib

# CRC-32 collision: two different inputs, same checksum
print(zlib.crc32(b'gnu'))
print(zlib.crc32(b'codding'))

# Adler-32 gives different checksums for the same pair of inputs
print(zlib.adler32(b'gnu'))
print(zlib.adler32(b'codding'))
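
Error detection, the intended use of checksums, can be sketched as follows: the sender transmits the checksum with the data and the receiver recomputes it. The payload and the simulated bit flip are made up for this illustration.

```python
import zlib

payload = b'from Bob to Alice'
checksum = zlib.crc32(payload)  # sent alongside the payload

# Simulate a transmission error: flip the lowest bit of the first byte
corrupted = bytearray(payload)
corrupted[0] ^= 0b00000001
corrupted = bytes(corrupted)

intact_ok = zlib.crc32(payload) == checksum
corrupted_ok = zlib.crc32(corrupted) == checksum
print(intact_ok)     # the unmodified payload passes the check
print(corrupted_ok)  # the single-bit error is detected
```

This protects only against accidental corruption. Anyone who modifies the data on purpose can simply recompute the checksum.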


# Keyed hashing
## Data authentication
Data authentication answers the question "who authored the change?". It requires a __key__ and a __keyed hash function__.

### Key generation
A key can take the form of:
- random number - sequence of random numbers
- passphrase - sequence of random words

### Random number
Keys that are hard to remember

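Use the `secrets` module or `os.urandom()`; both draw from the operating system's cryptographically secure random source. Do not use the `random` module: its generator is predictable and not meant for security purposes.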

In [None]:
import os

# random secret generation - 16 bytes
print(os.urandom(16))

In [None]:

from secrets import token_bytes, token_hex, token_urlsafe

# random secret generation - 16 bytes
print(token_bytes(16))
print(token_hex(16))
print(token_urlsafe(16))

### Passphrases
Keys that are easy to remember

In [None]:
import secrets
from pathlib import Path

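# wordlist.txt is expected to contain one word per line
words = Path('wordlist.txt').read_text().splitlines()
# secrets.choice() picks each word using a cryptographically secure source
passphrase = ' '.join(secrets.choice(words) for _ in range(4))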
print(passphrase)

### Keyed hashing
__Keyed hash functions__ take a _key and a message_ and produce a _hash value_. The same message hashed with a different key yields a different hash value. Only some hash functions, such as `blake2b`, support a key natively.

In [None]:
from hashlib import blake2b

m = b'message'
x = b'key x'
y = b'key y'

print(blake2b(m, key=x).hexdigest())
print(blake2b(m, key=y).hexdigest())

## HMAC functions
Hash-based Message Authentication Code.

An HMAC function turns any ordinary hash function into a keyed hash function. It takes 3 inputs:
- message
- key
- hash function

In [None]:
import hashlib
import hmac

xx = hmac.new(key=b'key', msg=b'message', digestmod=hashlib.sha3_256)
print(xx.name)  # algorithm name prefixed with 'hmac-'
print(xx.hexdigest())

### Data authentication between parties

In [None]:
import hashlib
import hmac
import json
hmac_sha256 = hmac.new(b'shared_key', digestmod=hashlib.sha256)
message = b'from Bob to Alice'
hmac_sha256.update(message)
hash_value = hmac_sha256.hexdigest()
authenticated_msg = {
    'message': list(message),
    'hash_value': hash_value,
}
outbound_msg_to_alice = json.dumps(authenticated_msg)

In [None]:
import hashlib
import hmac
import json
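# In a real exchange this arrives over the network; here Bob's output is reused
inbound_msg_from_bob = outbound_msg_to_alice
authenticated_msg = json.loads(inbound_msg_from_bob)
message = bytes(authenticated_msg['message'])
hmac_sha256 = hmac.new(b'shared_key', digestmod=hashlib.sha256)
hmac_sha256.update(message)
hash_value = hmac_sha256.hexdigest()
# Constant-time comparison instead of ==, see "Timing attacks" below
if hmac.compare_digest(hash_value, authenticated_msg['hash_value']):
    print('trust message')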
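
To see what the receiving side actually protects against, the exchange above can be folded into two functions and fed a manipulated message. This is a sketch under the same assumptions as the cells above (placeholder key `b'shared_key'`, JSON transport). The names `sign`, `verify` and `SHARED_KEY` are invented for the illustration.

```python
import hashlib
import hmac
import json

SHARED_KEY = b'shared_key'  # same placeholder key as in the cells above


def sign(message: bytes, key: bytes = SHARED_KEY) -> str:
    """Serialize the message together with its HMAC-SHA256 hash value."""
    tag = hmac.new(key, message, hashlib.sha256).hexdigest()
    return json.dumps({'message': list(message), 'hash_value': tag})


def verify(serialized: str, key: bytes = SHARED_KEY) -> bool:
    """Recompute the hash value and compare it with the received one."""
    received = json.loads(serialized)
    message = bytes(received['message'])
    expected = hmac.new(key, message, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, received['hash_value'])


outbound = sign(b'from Bob to Alice')
print(verify(outbound))  # untouched message is accepted

# Someone without the key changes the message but cannot fix the hash value
tampered = json.loads(outbound)
tampered['message'] = list(b'from Eve to Alice')
print(verify(json.dumps(tampered)))  # modified message is rejected
```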

## Timing attacks
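An ordinary string comparison of 2 hash values stops at the first differing character, so it evaluates to False faster the earlier the values diverge. An attacker can measure the response time to invalid hashes and use it to guess the valid hash value piece by piece (_side channel attack_).

To mitigate this problem, make the comparison time depend only on the length of the values (length-constant time), or randomize it. Always use `compare_digest()` for hashes.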


In [None]:
from hmac import compare_digest

compare_digest('abc', 'abcd')
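
The leak can be made visible without measuring time by counting how many bytes an early-exit comparison inspects before it gives up. `naive_equal` and the sample values below are hypothetical and exist only for this illustration; `compare_digest` is the real function from `hmac`.

```python
from hmac import compare_digest


def naive_equal(a: bytes, b: bytes) -> tuple[bool, int]:
    """Early-exit comparison; also reports how many bytes were inspected."""
    if len(a) != len(b):
        return False, 0
    for inspected, (x, y) in enumerate(zip(a, b), start=1):
        if x != y:
            return False, inspected  # stops at the first mismatch
    return True, len(a)


secret = b'5f4dcc3b5aa765d6'

# The better the guess, the more work the naive comparison does
for guess in (b'0000000000000000', b'5f4d000000000000', b'5f4dcc3b5aa70000'):
    print(guess, naive_equal(secret, guess), compare_digest(secret, guess))
```

The growing number of inspected bytes is what an attacker observes as growing response time. `compare_digest()` is designed so that its running time does not reveal where the mismatch occurred.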

# Symmetric encryption
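Encryption - the process of turning plaintext into ciphertext using a cipher (encryption algorithm) together with a key. Encryption ensures confidentiality.

Decryption - the reverse process: recovering the plaintext from the ciphertext. In symmetric encryption the same key is used for both.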

Fernet guarantees that a message encrypted using it cannot be manipulated or read without the key. Fernet is an implementation of symmetric (also known as “secret key”) authenticated cryptography. Fernet also has support for implementing key rotation via MultiFernet.

In [None]:
from cryptography.fernet import Fernet
key = Fernet.generate_key()  # random 32-byte key, URL-safe base64-encoded, as bytes
print(key)
fernet = Fernet(key)  # general purpose class initialized with key

# token - ciphertext combined with an HMAC computed over it (confidentiality + message authenticity)
token = fernet.encrypt(b'duuupa')
token
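
The reverse direction, and the guarantee that a token cannot be read or manipulated without the key, can be sketched as follows. Like the cell above, it needs the third-party `cryptography` package. The plaintext and the position of the swapped character are arbitrary choices for this illustration.

```python
from cryptography.fernet import Fernet, InvalidToken

key = Fernet.generate_key()
fernet = Fernet(key)
token = fernet.encrypt(b'secret message')

# Decryption with the right key restores the plaintext
print(fernet.decrypt(token))

# A different key fails authentication, so the token is rejected
try:
    Fernet(Fernet.generate_key()).decrypt(token)
except InvalidToken:
    print('wrong key: token rejected')

# So is a token modified in transit (here: one character swapped)
replacement = b'A' if token[20:21] != b'A' else b'B'
tampered = token[:20] + replacement + token[21:]
try:
    fernet.decrypt(tampered)
except InvalidToken:
    print('tampered token rejected')
```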


## Key rotation
Key rotation - retiring one key and replacing it with another. It means decrypting every ciphertext with the old key and re-encrypting it with the new key.

In [12]:
from cryptography.fernet import Fernet, MultiFernet

# Encrypting with old key
old_key = Fernet.generate_key()
old_fernet = Fernet(old_key)
old_token1 = old_fernet.encrypt(b'dupa')
old_token2 = old_fernet.encrypt(b'krowa')

# Creating new key
new_key = Fernet.generate_key()
new_fernet = Fernet(new_key)

# The first key in the list is used for encryption; every key is tried for decryption
multi_fernet = MultiFernet([new_fernet, old_fernet])
# List of tokens encrypted with old key
old_tokens = [old_token1, old_token2]
print(old_tokens)
# Decrypting old tokens and reencrypting them with new key
new_tokens = [multi_fernet.rotate(t) for t in old_tokens]
print(new_tokens)

#replace_old_tokens(new_tokens)
#replace_old_key_with_new_key(new_key)
#del old_key

# Decrypt after rotation
for new_token in new_tokens:
    plaintext = new_fernet.decrypt(new_token)
    print(plaintext)

[b'gAAAAABhZf2m8fsWiJSZZ-nr1O7zz303Y_hh4cvc4I3BKru7foFj5LkEiY0mvS-y6QUOPbYJUMe3RHjx2mahgykAnpH0_zmxHw==', b'gAAAAABhZf2msgsd5HBERRvlJLkS_ZwBqFr4EzKEmvv56Yyg9vlWHu1-XS9Lv2c6rZ73A9gH37p6Vg1Apa3KiF7p58oDa9c_sA==']
[b'gAAAAABhZf2mOlIZSZWO584VDsvhWALqC_ibYnZUXR9D7A1GGDAaTtJpQSgK2sSowtUZuhLrmDq4rPiNXIJMtf7alpGFQtNKFg==', b'gAAAAABhZf2m-vtRV8LyCt4JFqEnnP4WaheX6OyL4hQK34SSCuLb2pqq25hPHZkHErMg8ty3LXNMOwWLfK5vc2ytKOYBG2m5HA==']
b'dupa'
b'krowa'


# Asymmetric encryption

# Transport Layer Security