# Table of contents:

* [Introduction to block ciphers](#intro-block)
* [Padding a message](#message-padding)
* [The Advanced Encryption Standard (AES)](#AES)
* [Modes of operation of block ciphers](#mode)
* [Size of the output ciphertext on AES](#size)
* [Bonus: Fernet cipher](#fernet)
    
Author: [Sebastià Agramunt Puig](https://github.com/sebastiaagramunt) for [OpenMined](https://www.openmined.org/) Privacy ML Series course.



## Block Ciphers <a class="anchor" id="intro-block"></a>

Block ciphers, as opposed to stream ciphers, take a block of plaintext (a fixed number of bytes) and encrypt it into a block of the same size. In this section we will use the Advanced Encryption Standard (AES) to understand block ciphers. The diagram below shows how an original message of arbitrary length $N$ bytes is converted into a ciphertext made of blocks of $K$ bytes each. The ciphertext size is always a multiple of $K$ bytes.

<img src="img/block_cipher.png" style="width:1100px"/>

## Padding a message <a class="anchor" id="message-padding"></a>

Most of the time the length of the message is not a multiple of the block size, so we need to "pad" the message to the required length. A common padding scheme is [PKCS7](https://en.wikipedia.org/wiki/Padding_(cryptography)). PKCS7 appends $n$ bytes, each with the value $n$, where $n$ is the number of bytes needed to complete the block. If the message length is already a multiple of the block size, a full block of padding is appended, so the padding can always be removed unambiguously.


We will use PKCS7 in the next example:

In [1]:
from crypto import bytes_to_bin, bytes_to_hex

message = b"Cryptography is a complex subject after all..."
bin_repr = bytes_to_bin(message, pre="")
hex_repr = bytes_to_hex(message, pre="")

print(f"message:\n\t{message}")
print(f"(bin) \n\t{bin_repr}")
print(f"(hex) \n\t{hex_repr}")

message:
	b'Cryptography is a complex subject after all...'
(bin) 
	01000011011100100111100101110000011101000110111101100111011100100110000101110000011010000111100100100000011010010111001100100000011000010010000001100011011011110110110101110000011011000110010101111000001000000111001101110101011000100110101001100101011000110111010000100000011000010110011001110100011001010111001000100000011000010110110001101100001011100010111000101110
(hex) 
	43727970746f677261706879206973206120636f6d706c6578207375626a65637420616674657220616c6c2e2e2e


In [2]:
print(f"message is {len(message)} bytes or {len(bin_repr)} bits")

message is 46 bytes or 368 bits


In [3]:
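def PKCS7(m: bytes, block_size_bytes: int = 16) -> bytes:
    # number of padding bytes: between 1 and block_size_bytes (a full block if m is already aligned)
    n_bytes = block_size_bytes - len(m) % block_size_bytes
    # every padding byte carries the value n_bytes
    pad = bytes([n_bytes] * n_bytes)
    return m + pad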

padded_message = PKCS7(message)
print(padded_message)

b'Cryptography is a complex subject after all...\x02\x02'
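
Removing the padding is the mirror operation: read the last byte, check that it is a plausible count, and strip that many bytes. The following is a minimal sketch of that idea; the names `pkcs7_pad` and `pkcs7_unpad` are made up for this illustration and are not part of any library.

```python
def pkcs7_pad(m: bytes, block_size_bytes: int = 16) -> bytes:
    # Same rule as the PKCS7 function above: n bytes, each of value n.
    n_bytes = block_size_bytes - len(m) % block_size_bytes
    return m + bytes([n_bytes]) * n_bytes


def pkcs7_unpad(padded: bytes, block_size_bytes: int = 16) -> bytes:
    # A padded message is never empty and is always block aligned.
    if len(padded) == 0 or len(padded) % block_size_bytes != 0:
        raise ValueError("length must be a positive multiple of the block size")
    n_bytes = padded[-1]
    # The last byte says how many padding bytes were appended.
    if not 1 <= n_bytes <= block_size_bytes:
        raise ValueError("invalid padding length byte")
    if padded[-n_bytes:] != bytes([n_bytes]) * n_bytes:
        raise ValueError("padding bytes are inconsistent")
    return padded[:-n_bytes]


message = b"Cryptography is a complex subject after all..."
padded = pkcs7_pad(message)
print(padded[-2:])                     # b'\x02\x02'
print(pkcs7_unpad(padded) == message)  # True
print(len(pkcs7_pad(b"a" * 16)))       # 32: an aligned message gets a full extra block
```

The last line shows why an aligned message still receives padding: without it, the receiver could not tell a message ending in `\x01` from a padded one.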


In [4]:
from cryptography.hazmat.primitives import padding

block_size_bits = 128

padder = padding.PKCS7(block_size_bits).padder()
padded_message = padder.update(message) + padder.finalize()

print(f"message:\n\t{message}")
print(f"\npadded_data: \n\t{padded_message}\n")

print(f"bytes per block: {block_size_bits // 8}")
print(f"bits per block: {block_size_bits}")
print(f"message length: {len(message)}")
print(f"padded_message length: {len(padded_message)}")

message:
	b'Cryptography is a complex subject after all...'

padded_data: 
	b'Cryptography is a complex subject after all...\x02\x02'

bytes per block: 16
bits per block: 128
message length: 46
padded_message length: 48


## Encrypting using AES (Advanced Encryption Standard) <a class="anchor" id="AES"></a>

AES is a block cipher that NIST established as a standard in 2001, following a public call in 1997 to replace the DES encryption algorithm. AES is a subset of the Rijndael block cipher, developed by Vincent Rijmen and Joan Daemen and submitted to NIST during the [AES selection process](https://en.wikipedia.org/wiki/Advanced_Encryption_Standard_process). AES operates on 128-bit (16-byte) blocks and accepts keys of 128, 192 or 256 bits.


We will not go into the details of the exact implementation; readers are referred to Chapter 6, Section 2 of the book by [Katz and Lindell](http://www.cs.umd.edu/~jkatz/imc.html). Mike Pound also explains AES in this [video](https://www.youtube.com/watch?v=O4xNJsjtN6E&t=524s&ab_channel=Computerphile), check it out!

In [5]:
from cryptography.hazmat.primitives.ciphers import Cipher, algorithms, modes
from cryptography.hazmat.backends import default_backend
import os

secret_key = os.urandom(32)

cipher = Cipher(algorithms.AES(secret_key), modes.ECB(), backend=default_backend())

encryptor = cipher.encryptor()
decryptor = cipher.decryptor()

In [6]:
ctx = encryptor.update(padded_message) + encryptor.finalize()
plx = decryptor.update(ctx) + decryptor.finalize()

print(f"ciphertext:\n\t{ctx}")
print(f"plaintext:\n\t{plx}")

ciphertext:
	b' \x86\xbf\x9b\xbb\x13\xe4\xb0\x1d\xba\x99\xe7\xae\xbe\x9d!\xbc\xe3\x07\xeb\x01r!\xf1\xce^\x08\xe8J\x9f\x8a\xf7\xff\xd9\x9b\xfcx\x03dJ\xce:K\x80\xa0~!\x18'
plaintext:
	b'Cryptography is a complex subject after all...\x02\x02'
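
The decrypted plaintext above still carries the two PKCS7 padding bytes. To recover the original message, the padding has to be removed after decryption. A minimal sketch using the unpadder from the same ```cryptography``` padding module, with the decrypted bytes copied in by hand so that the snippet runs on its own:

```python
from cryptography.hazmat.primitives import padding

# Decrypted bytes as printed above, padding included.
decrypted = b"Cryptography is a complex subject after all...\x02\x02"

# PKCS7 takes the block size in bits, as in the padding example.
unpadder = padding.PKCS7(128).unpadder()
recovered = unpadder.update(decrypted) + unpadder.finalize()

print(recovered)  # b'Cryptography is a complex subject after all...'
```

`finalize()` raises a `ValueError` if the trailing bytes are not valid PKCS7 padding.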


## Modes of operation of block ciphers <a class="anchor" id="mode"></a>

A block cipher by itself is only suitable for the secure cryptographic transformation (encryption or decryption) of one fixed-length group of bits called a block. A mode of operation describes how to repeatedly apply a cipher's single-block operation to securely transform amounts of data larger than a block ([Wikipedia](https://en.wikipedia.org/wiki/Block_cipher_mode_of_operation)).

The simplest mode applies the block cipher to each block independently, with the same key and no additional input; this is the Electronic Codebook (ECB) mode. See the figure below (from Wikipedia).

<img src="img/ECB_mode.png" style="width:1100px"/>

Luckily, the ```cryptography``` package implements ECB as the ```cryptography.hazmat.primitives.ciphers.modes.ECB``` class (we have already used it in the previous example!).

In [7]:
secret_key = os.urandom(32)

cipher = Cipher(algorithms.AES(secret_key), modes.ECB(), backend=default_backend())

encryptor = cipher.encryptor()
decryptor = cipher.decryptor()

Now we can encrypt the same message twice and see what we get in the ciphertext:

In [8]:
padded_message

b'Cryptography is a complex subject after all...\x02\x02'

In [9]:
new_message = padded_message+padded_message

ctx = encryptor.update(new_message) + encryptor.finalize()
print(ctx[0: len(padded_message)])
print(ctx[len(padded_message):])

b'\xeer^\x84\xe1\xae#\x01\xbcQ19\x17\xd6%H\x11\x13Uh\xfa2\xa7\xf4\x9d" \xfe\tj\xf7.#)cQ\xe0SF\xdf\xd1z\x1e\xfdAML\xf6'
b'\xeer^\x84\xe1\xae#\x01\xbcQ19\x17\xd6%H\x11\x13Uh\xfa2\xa7\xf4\x9d" \xfe\tj\xf7.#)cQ\xe0SF\xdf\xd1z\x1e\xfdAML\xf6'
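
The two printed halves are identical. The repetition is also visible at a finer grain, one 16-byte block at a time. The helper below is a sketch of how such repeats could be counted; `repeated_blocks` is a name made up for this illustration, and the bytes fed to it are stand-ins rather than real ciphertext.

```python
from collections import Counter


def repeated_blocks(ciphertext: bytes, block_size: int = 16) -> dict:
    # Split the ciphertext into consecutive blocks and keep those seen more than once.
    blocks = [ciphertext[i:i + block_size] for i in range(0, len(ciphertext), block_size)]
    counts = Counter(blocks)
    return {block: n for block, n in counts.items() if n > 1}


# Stand-in bytes (not real AES output): block A, block B, then block A again.
block_a = bytes(range(16))
block_b = bytes(range(16, 32))

print(len(repeated_blocks(block_a + block_b + block_a)))  # 1
print(repeated_blocks(block_a + block_b))                 # {}
```

With ECB, any plaintext block that occurs twice at block-aligned positions produces such a repeat in the ciphertext.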


This is not a desirable outcome. If I send the same message twice, I do not want to send the same ciphertext twice. What if all my communications start with "Dear..." and the attacker knows it? A better mode is Cipher Block Chaining (CBC):

<img src="img/CBC_mode.png" style="width:1100px"/>

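In this mode we take a random initialization vector (IV), XOR it with the first block of plaintext, and feed the result into the block cipher to obtain the first block of ciphertext. That ciphertext block then plays the role of the initialization vector for the next block, and so on. With a fresh random IV for every message, encrypting the same message twice yields different ciphertexts. The IV does not need to be secret, but the receiver needs it to decrypt.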

In [10]:
secret_key = os.urandom(32)
iv = os.urandom(16)

cipher = Cipher(algorithms.AES(secret_key), modes.CBC(iv), backend=default_backend())

encryptor = cipher.encryptor()
decryptor = cipher.decryptor()

In [11]:
ctx = encryptor.update(padded_message+padded_message) + encryptor.finalize()
print(ctx[0: len(padded_message)])
print(ctx[len(padded_message):])

b'\xa4\x11\xe0\xbcH\xb7\x13\xb9\xe7\x8d\x80\x8a\x8a9\xfa\x0fJ\xcd\xd6\xb7\xef\x10\xf6\xadd\xb7\xdc\x85F\xe4,\x00\x91\x86\x11~\x92\x86\xd9\x189\xae\xdb\xe6=\x97\x1f\x1a'
b'\x93\x9dV]\xbe\xba2\x1b\xe2\\\xaf\xf4 \xf1Y\xbc\x0eNo\xe1\x0e\xaaX&#\xcb\xc2\x8a\xb6\x89\x08\xa8\rx\xf68E\t3\xff)\x0c\x82\xa1\x86\x1b\xe4\xe2'
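
To make the chaining concrete, CBC encryption can be rebuilt by hand from single-block AES calls and compared with the library result. This is only a sketch for illustration: the helper names are made up, single-block AES is obtained here by running ECB on exactly one block, and the `backend` argument is omitted on the assumption of a recent ```cryptography``` version where it is optional.

```python
import os
from cryptography.hazmat.primitives.ciphers import Cipher, algorithms, modes

BLOCK_SIZE = 16  # AES block size in bytes


def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def aes_encrypt_block(key: bytes, block: bytes) -> bytes:
    # ECB over a single block is just the raw block cipher.
    encryptor = Cipher(algorithms.AES(key), modes.ECB()).encryptor()
    return encryptor.update(block) + encryptor.finalize()


def cbc_encrypt_by_hand(key: bytes, iv: bytes, padded: bytes) -> bytes:
    if len(padded) % BLOCK_SIZE != 0:
        raise ValueError("input must already be padded to the block size")
    previous = iv  # the IV plays the role of the "previous ciphertext block"
    out = []
    for start in range(0, len(padded), BLOCK_SIZE):
        block = padded[start:start + BLOCK_SIZE]
        previous = aes_encrypt_block(key, xor_bytes(block, previous))
        out.append(previous)
    return b"".join(out)


key = os.urandom(32)
iv = os.urandom(16)
padded = b"Cryptography is a complex subject after all...\x02\x02" * 2

library = Cipher(algorithms.AES(key), modes.CBC(iv)).encryptor()
expected = library.update(padded) + library.finalize()
by_hand = cbc_encrypt_by_hand(key, iv, padded)

print(by_hand == expected)  # True
```

Because each block is mixed with the previous ciphertext block before encryption, the two identical halves of the input no longer produce identical halves of ciphertext.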


## Size of ciphertext <a class="anchor" id="size"></a>

In [12]:
secret_key = os.urandom(32)
iv = os.urandom(16)

cipher = Cipher(algorithms.AES(secret_key), modes.CBC(iv), backend=default_backend())
block_size = 16

for message_len in range(128):
    m = str.encode("a"*message_len)
    padder = padding.PKCS7(8*block_size).padder()
    m_padded = padder.update(m) + padder.finalize()
    encryptor = cipher.encryptor()
    
    ctx = encryptor.update(m_padded) + encryptor.finalize()
    #print(f"message_len={message_len}, padded_m_len={len(m_padded)}, ctx_len={len(ctx)}")
    print(f"message={m}, message_len={message_len}, padded_m={m_padded}")

message=b'', message_len=0, padded_m=b'\x10\x10\x10\x10\x10\x10\x10\x10\x10\x10\x10\x10\x10\x10\x10\x10'
message=b'a', message_len=1, padded_m=b'a\x0f\x0f\x0f\x0f\x0f\x0f\x0f\x0f\x0f\x0f\x0f\x0f\x0f\x0f\x0f'
message=b'aa', message_len=2, padded_m=b'aa\x0e\x0e\x0e\x0e\x0e\x0e\x0e\x0e\x0e\x0e\x0e\x0e\x0e\x0e'
message=b'aaa', message_len=3, padded_m=b'aaa\r\r\r\r\r\r\r\r\r\r\r\r\r'
message=b'aaaa', message_len=4, padded_m=b'aaaa\x0c\x0c\x0c\x0c\x0c\x0c\x0c\x0c\x0c\x0c\x0c\x0c'
message=b'aaaaa', message_len=5, padded_m=b'aaaaa\x0b\x0b\x0b\x0b\x0b\x0b\x0b\x0b\x0b\x0b\x0b'
message=b'aaaaaa', message_len=6, padded_m=b'aaaaaa\n\n\n\n\n\n\n\n\n\n'
message=b'aaaaaaa', message_len=7, padded_m=b'aaaaaaa\t\t\t\t\t\t\t\t\t'
message=b'aaaaaaaa', message_len=8, padded_m=b'aaaaaaaa\x08\x08\x08\x08\x08\x08\x08\x08'
message=b'aaaaaaaaa', message_len=9, padded_m=b'aaaaaaaaa\x07\x07\x07\x07\x07\x07\x07'
message=b'aaaaaaaaaa', message_len=10, padded_m=b'aaaaaaaaaa
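
The pattern in this output can be summarised in one line: PKCS7 always adds between 1 and 16 bytes, so the padded length is the next multiple of the block size strictly greater than the message length. In ECB and CBC the ciphertext returned by the encryptor has the same length as the padded input. A small sketch of that rule, where `pkcs7_padded_len` is a name made up for this illustration:

```python
def pkcs7_padded_len(message_len: int, block_size: int = 16) -> int:
    # Next multiple of block_size strictly greater than message_len.
    return (message_len // block_size + 1) * block_size


for n in (0, 1, 15, 16, 17, 32):
    print(n, pkcs7_padded_len(n))
# prints: 0 16, 1 16, 15 16, 16 32, 17 32, 32 48 (one pair per line)
```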

## Bonus: Fernet <a class="anchor" id="fernet"></a>

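Another symmetric encryption scheme implemented in the ```cryptography``` package is [Fernet](https://asecuritysite.com/encryption/fernet). Fernet is not a block cipher itself but a higher-level recipe built on one: it encrypts with AES-128 in CBC mode with PKCS7 padding and authenticates the result with HMAC-SHA256.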
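
The length of a Fernet token can be predicted from the token layout in the Fernet specification: a 1-byte version, an 8-byte timestamp, a 16-byte IV, the PKCS7-padded AES-CBC ciphertext and a 32-byte HMAC, all encoded with URL-safe base64. The sketch below does only this arithmetic and does not call the library; `fernet_token_len` is a name made up for this illustration.

```python
def fernet_token_len(message_len: int) -> int:
    block_size = 16
    # PKCS7 always pads to the next multiple of the block size.
    padded = (message_len // block_size + 1) * block_size
    # version (1) + timestamp (8) + IV (16) + ciphertext + HMAC-SHA256 tag (32)
    raw = 1 + 8 + 16 + padded + 32
    # base64 turns every 3 bytes (rounded up) into 4 characters.
    return 4 * ((raw + 2) // 3)


print(fernet_token_len(1), fernet_token_len(15), fernet_token_len(16))  # 100 100 120
```

The ciphertext length therefore grows in steps: it stays constant while the message fits in the same number of 16-byte blocks and jumps when one more block is needed.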

In [13]:
from cryptography.fernet import Fernet

# secret key generation
secret_key = Fernet.generate_key()
box = Fernet(secret_key)

max_len = 100
for n in range(1, max_len):
    # generate messages of n a's
    message = "".join(["a" for _ in range(n)])
    message = str.encode(message)
    
    ciphertext = box.encrypt(message)
    print(f"len_message: {len(message)}, len_ciphertext: {len(ciphertext)}")


len_message: 1, len_ciphertext: 100
len_message: 2, len_ciphertext: 100
len_message: 3, len_ciphertext: 100
len_message: 4, len_ciphertext: 100
len_message: 5, len_ciphertext: 100
len_message: 6, len_ciphertext: 100
len_message: 7, len_ciphertext: 100
len_message: 8, len_ciphertext: 100
len_message: 9, len_ciphertext: 100
len_message: 10, len_ciphertext: 100
len_message: 11, len_ciphertext: 100
len_message: 12, len_ciphertext: 100
len_message: 13, len_ciphertext: 100
len_message: 14, len_ciphertext: 100
len_message: 15, len_ciphertext: 100
len_message: 16, len_ciphertext: 120
len_message: 17, len_ciphertext: 120
len_message: 18, len_ciphertext: 120
len_message: 19, len_ciphertext: 120
len_message: 20, len_ciphertext: 120
len_message: 21, len_ciphertext: 120
len_message: 22, len_ciphertext: 120
len_message: 23, len_ciphertext: 120
len_message: 24, len_ciphertext: 120
len_message: 25, len_ciphertext: 120
len_message: 26, len_ciphertext: 120
len_message: 27, len_ciphertext: 120
len_messag