In [None]:
# Initialize Otter
import otter
grader = otter.Notebook("lab03.ipynb")

In [None]:
%%capture
import random
import os
import sys
import numpy as np
!{sys.executable} -m pip install pycryptodome
from Crypto.Cipher import AES
from Crypto.Hash import SHA256

# Lab 3: Block Ciphers and Differential Cryptanalysis

# Block Ciphers

Up to this point we have primarily been concerned with elementary ciphers or asymmetric encryption. Real-world symmetric encryption is most often accomplished by **block ciphers**, which operate on fixed-length blocks of data, much like our Merkle-Damgard hash functions did. We can model a block cipher as a function $f(k, d) = c$ that takes a key $k$ and a block of data $d$ and outputs a block of ciphertext $c$.

### Advanced Encryption Standard

The most popular symmetric cryptosystem in the world, AES (aka Rijndael), was first published in 1998. Since then, it has become the premier block cipher, used even by the NSA for top-secret data. To date, there are no known practical attacks against correctly implemented AES.

We will explore how to use AES in this lab!

### Modes of Operation

AES itself only specifies $f(k, d)$ for a single 16-byte block, meaning we still have to figure out how to encrypt a message made of many data blocks $P_1, P_2, \ldots, P_n$ into ciphertext. The different ways of doing this are called **modes of operation**.
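
Before any mode of operation can be applied, the message has to be cut into blocks. The following is a minimal sketch of that step; the helper name `split_into_blocks` and the constant `BLOCK_SIZE` are hypothetical and not part of this lab's provided code.

```python
BLOCK_SIZE = 16  # AES block size in bytes

def split_into_blocks(message, block_size=BLOCK_SIZE):
    """Split a bytestring into consecutive chunks of at most block_size bytes."""
    return [message[i:i + block_size] for i in range(0, len(message), block_size)]

blocks = split_into_blocks(b'attack at dawn, retreat at dusk!!')
for block in blocks:
    print(len(block), block)
```

The final block may be shorter than 16 bytes, which is why a padding step is needed before it can be encrypted.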


## AES-ECB (Electronic Code Book)

The most straightforward way to encrypt using AES is to set $C_i = \text{Enc}(k, P_i)$, so that each ciphertext block is the direct encryption of its corresponding plaintext block. This can be visualized by the following (credit to the [CS 161 Textbook](https://textbook.cs161.org/crypto/symmetric.html)):

![AES-ECB Encryption](https://textbook.cs161.org/assets/images/crypto/symmetric/ECB_encryption.png)
![AES-ECB Decryption](https://textbook.cs161.org/assets/images/crypto/symmetric/ECB_decryption.png)

We've implemented a more friendly version of AES-ECB here to use later.

In [None]:
def padBytestring(string, length):
    bas = bytearray(string)
    return bytes(bas + bytearray(b'A'*(length-len(bas))))

In [None]:
def blockEncrypt(key, data): # Takes in 16-byte key and data and returns their encryption.
    cipher = AES.new(padBytestring(key, 16), AES.MODE_ECB) # Ensure consistent behavior on one-block encryptions
    
    return cipher.encrypt(padBytestring(data, 16))

def blockDecrypt(key, data): # Takes in 16-byte key and data and returns their decryption.
    cipher = AES.new(padBytestring(key, 16), AES.MODE_ECB) # Ensure consistent behavior on one-block encryptions
    
    return cipher.decrypt(padBytestring(data, 16))

In [None]:
def AES_ECB_Encrypt(key, dataBlocks): # Key is a 16 byte bytestring, dataBlocks is an array of 16 byte bytestrings.
    return [blockEncrypt(key, block) for block in dataBlocks]

def AES_ECB_Decrypt(key, cipherBlocks): # Key is a 16 byte bytestring, cipherBlocks is an array of 16 byte bytestrings.
    return [blockDecrypt(key, block) for block in cipherBlocks]

Let's look at an example of why AES-ECB **is bad and you should never use it**. Suppose we've intercepted the medical records of two patients:

In [None]:
hospital_key = os.urandom(16)

patientOneData = AES_ECB_Encrypt(hospital_key, [b'flu', b'pneumonia'])
patientTwoData = AES_ECB_Encrypt(hospital_key, [b'pneumonia', b'arthritis'])

print('Patient One Encrypted Data:','\n', patientOneData[0], '\n', patientOneData[1])
print('--')
print('Patient Two Encrypted Data:','\n', patientTwoData[0], '\n', patientTwoData[1])

We see that the 2nd block of patient one and the 1st block of patient two are the same! ECB is deterministic: under the same key, equal plaintext blocks always encrypt to equal ciphertext blocks. This constitutes a severe breach of security, since if we knew the medical details of patient one, we would immediately know that patient two also has pneumonia.
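
An eavesdropper can spot this kind of leak mechanically, without knowing the key. The sketch below assumes a hypothetical helper `shared_blocks` and uses made-up stand-in bytes rather than real AES output, so it runs on its own.

```python
def shared_blocks(blocks_a, blocks_b):
    """Return (i, j) index pairs where blocks_a[i] == blocks_b[j]."""
    return [(i, j)
            for i, a in enumerate(blocks_a)
            for j, b in enumerate(blocks_b)
            if a == b]

# Stand-in ciphertext blocks (made-up bytes, not real AES output).
record_one = [b'\x01' * 16, b'\x02' * 16]
record_two = [b'\x02' * 16, b'\x03' * 16]

print(shared_blocks(record_one, record_two))  # [(1, 0)]
```

In this notebook, calling `shared_blocks(patientOneData, patientTwoData)` should report the same index pair, since the `pneumonia` block sits at position 1 for patient one and position 0 for patient two.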

## Differential Cryptanalysis

Another technique for block-cipher cryptanalysis is **differential cryptanalysis**. Instead of linear equations, we focus on **differentials**: pairs $(\Delta_x, \Delta_y)$ made of an XOR difference between two inputs of a substitution box and the XOR difference between the corresponding outputs:
$$\Delta_y = SBOX(X \oplus \Delta_x) \oplus SBOX(X)$$

For example, the differential (5, 7) describes two inputs that differ by an XOR of 5 and whose outputs differ by an XOR of 7; for a given SBOX this holds only for some inputs. In a one-sbox cipher, we can narrow down the possibilities for the round keys by finding a **good pair**: two plaintext/ciphertext pairs that exhibit the desired differential. We then consult a lookup table of the SBOX inputs that allow such a differential, and conclude that the internal SBOX input (the plaintext after the round-1 key XOR) must be one of them. From there, we can try each of these possible inputs and derive the corresponding round keys from the plaintext/ciphertext pair.
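
The reason a good pair can be recognized from the outside is that both round keys cancel out of the XOR differences. The sketch below checks this exhaustively for the 4-bit SBOX used later in this lab and for a two-round-key cipher of the form `sbox[P ^ rk1] ^ rk2`; the helper name `differences` is hypothetical.

```python
sbox = [3, 14, 1, 10, 4, 9, 5, 6, 8, 11, 15, 2, 13, 12, 0, 7]

def differences(p, delta, rk1, rk2):
    """Return (difference entering the S-box, difference between the ciphertexts)."""
    m, m_x = p ^ rk1, (p ^ delta) ^ rk1        # S-box inputs for the two plaintexts
    c, c_x = sbox[m] ^ rk2, sbox[m_x] ^ rk2    # the two ciphertexts
    return m ^ m_x, c ^ c_x

for rk1 in range(16):
    for rk2 in range(16):
        for p in range(16):
            d_in, d_out = differences(p, 4, rk1, rk2)
            m = p ^ rk1
            assert d_in == 4                        # rk1 cancels: the input difference survives
            assert d_out == sbox[m] ^ sbox[m ^ 4]   # rk2 cancels: only the S-box matters

print('Both round keys cancel out of the XOR differences.')
```

So the ciphertext difference depends only on the SBOX input, which is what lets an observed differential narrow down that input.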

We will explore this property in the next section!

In [None]:
sbox = [3, 14, 1, 10, 4, 9, 5, 6, 8, 11, 15, 2, 13, 12, 0, 7]

This list represents our SBOX as a one-to-one mapping: the input, an integer in $[0, 15]$, is the index, and the output is the value stored at that index. For example:

In [None]:
sbox[15]
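
A one-to-one mapping on 0..15 must hit every output exactly once, which also means it can be inverted. The following sketch checks that property and builds the inverse table; the name `inverse_sbox` is hypothetical and is not used elsewhere in this lab.

```python
sbox = [3, 14, 1, 10, 4, 9, 5, 6, 8, 11, 15, 2, 13, 12, 0, 7]

# A bijection on 0..15 contains each value 0..15 exactly once.
assert sorted(sbox) == list(range(16))

# Invert the table so that inverse_sbox[sbox[x]] == x for every x.
inverse_sbox = [0] * 16
for x, y in enumerate(sbox):
    inverse_sbox[y] = x

print(inverse_sbox)
```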

Our first step is to analyze the differential characteristics present in our substitution box. We check every combination of input and input difference, and tally the resulting output differences:

In [None]:
differentials = [list() for _ in range(16)]
inputs = [list() for _ in range(16)]

for x in range(16):
    for delta in range(16):
        output_one = sbox[x ^ delta]
        output_two = sbox[x]
        
        inputs[delta] += [x]
        differentials[delta] += [output_one ^ output_two]

In [None]:
differentials

We want to find a row in which one output difference repeats many times. Row 4 is a good candidate: with an input difference of 4, the output difference 7 occurs for 6 of the 16 inputs.
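
Rather than scanning the table by eye, the repeats in each row can be counted. This is a minimal sketch using `collections.Counter`; the helper name `most_common_difference` is hypothetical, and the SBOX is repeated so the snippet runs on its own.

```python
from collections import Counter

sbox = [3, 14, 1, 10, 4, 9, 5, 6, 8, 11, 15, 2, 13, 12, 0, 7]

def most_common_difference(delta):
    """Return (output difference, count) for the most frequent entry in row delta."""
    counts = Counter(sbox[x] ^ sbox[x ^ delta] for x in range(16))
    return counts.most_common(1)[0]

# delta = 0 is skipped: identical inputs always give output difference 0.
for delta in range(1, 16):
    out_diff, count = most_common_difference(delta)
    print(f'input difference {delta:2d}: output difference {out_diff:2d} occurs {count} times')
```

For `delta = 4` this reports output difference 7 occurring 6 times, matching the row inspected in the next cells.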

In [None]:
differentials[4]

In [None]:
[(i, differentials[4][i]) for i in range(16) if differentials[4][i] == 7]

In [None]:
[(sbox[i] ^ sbox[i^4]) for i in range(16)]

Therefore, if we observe that $SBOX(x) \oplus SBOX(x \oplus 4) = 7$, then $x \in \{0, 1, 4, 5, 9, 13\}$.
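
The same set can be recovered for any differential with a small lookup helper. This is a sketch only; `inputs_for_differential` is a hypothetical name, and the SBOX is repeated so the snippet is self-contained.

```python
sbox = [3, 14, 1, 10, 4, 9, 5, 6, 8, 11, 15, 2, 13, 12, 0, 7]

def inputs_for_differential(delta_in, delta_out):
    """List every x with sbox[x] ^ sbox[x ^ delta_in] == delta_out."""
    return [x for x in range(16) if sbox[x] ^ sbox[x ^ delta_in] == delta_out]

print(inputs_for_differential(4, 7))  # [0, 1, 4, 5, 9, 13]
```

Note that the set is closed under XOR with 4 (0 and 4, 1 and 5, 9 and 13), so whichever member of a good pair we look at, its SBOX input lies in this set.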

In [None]:
possible_inputs = [0, 1, 4, 5, 9, 13]

Our cipher is the same as in lecture: two round keys with a substitution box between them.

In [None]:
def encrypt(P, round_keys):
    return sbox[P ^ round_keys[0]] ^ round_keys[1]

![Example Cipher](cipher.png)

In [None]:
# Convince yourself that this attack works for all possible keys!
# The upper bound of np.random.randint is exclusive, so 16 draws each key from 0..15.
secret_keys = [np.random.randint(0, 16), np.random.randint(0, 16)]

In [None]:
i = 0
good_pair = None

while not good_pair:
    p, c = i, encrypt(i, secret_keys)
    p_x, c_x = i^4,  encrypt(i^4, secret_keys)
    
    if c^c_x == 7: 
        print('Found good pair:', (p, c))
        good_pair = (p, c)
        break
        
    i += 1

**Question 1.1**: Write a function to derive the round keys given the plaintext P, midpoint value M, and ciphertext C.

*HINT: The midpoint is the value being input to the substitution box. Write out the steps to derive this input.*

*HINT: The output is derived as a function of the sbox output and round key 2.*

In [None]:
def genRKFromM(P, M, C):
    rk1 = ...
    rk2 = ...
    
    return [rk1, rk2]

In [None]:
grader.check("q1_1")

Next, we generate 10 known plaintext-ciphertext pairs to test candidate keys against.

In [None]:
example_kpc = [(i, encrypt(i, secret_keys)) for i in range(10)]

Finally, we test all possible inputs to see which ones encrypt as expected.

In [None]:
final_keys = None

for possible in possible_inputs: # Test every possible input to the sbox
    round_keys = genRKFromM(good_pair[0], possible, good_pair[1])
    print('Testing possible midpoint', possible, 'with corresponding round keys', round_keys)
    
    correct = True
    
    for p, c in example_kpc: # verify keys successfully encrypt 
        if c != encrypt(p, round_keys):
            correct = False
            break
        
    if correct:
        print('Found round keys!')
        final_keys = round_keys
        break

In [None]:
assert final_keys == round_keys

We've just found our keys by testing only 6 candidate SBOX inputs, each of which determines both round keys, instead of all 16! This advantage becomes much larger with larger key sizes, of course, making this attack a potent weapon for cryptanalysis.

## Submission

Make sure you have run all cells in your notebook in order before running the cell below, so that all images/graphs appear in the output. The cell below will generate a zip file for you to submit. **Please save before exporting!**

Once you have generated the zip file, go to the Gradescope page for this assignment to submit.

In [None]:
# Save your notebook first, then run this cell to export your submission.
grader.export(pdf=False, run_tests=True)