UW user id: `g66xu`

# Problem 1
A number of helper functions were implemented to support the solutions (and writing them was fun anyway). They are included at the end of the write-up for Problem 1.

## a)
The bias of the linear relationship over my inputs is approximately 0.00205.

In [1]:
import os
import sys

sys.path.insert(0, "/Users/ganyuxu/opensource/linear-cryptanalysis")

from cryptanalysis.heys import HeysCipher
from cryptanalysis.attack import read_inputs, get_bias, check_sec34_linear_approx

plaintexts = read_inputs("inputs/a2q1plaintexts.txt")
ciphertexts = read_inputs("inputs/a2q1ciphertexts.txt")
n_pairs = len(plaintexts)

round_key_guess = 0b0000011100000110
guess = HeysCipher([0, 0, 0, 0, round_key_guess])

print(
    get_bias(
        plaintexts, ciphertexts, guess, check_sec34_linear_approx
    )
)

0.0020499999999999963
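
The helpers `get_bias` and `check_sec34_linear_approx` are only listed at the end of the write-up, so here is a minimal standalone sketch of what this kind of computation might look like. It assumes the S-box from the Heys tutorial and the section 3.4 approximation $U_{4,6} \oplus U_{4,8} \oplus U_{4,14} \oplus U_{4,16} \oplus P_5 \oplus P_7 \oplus P_8 = 0$. The names `partial_decrypt`, `sec34_holds`, and `estimate_bias` are hypothetical, and reporting the magnitude of the bias is an assumption, not necessarily how the actual helpers behave.

```python
# S-box from the Heys tutorial (assumed); index = input nibble, value = output nibble.
SBOX = [0xE, 0x4, 0xD, 0x1, 0x2, 0xF, 0xB, 0x8,
        0x3, 0xA, 0x6, 0xC, 0x5, 0x9, 0x0, 0x7]
INV_SBOX = [SBOX.index(y) for y in range(16)]


def bit(word, i):
    """Return bit i of a 16-bit word, numbered 1..16 from the most significant bit."""
    return (word >> (16 - i)) & 1


def partial_decrypt(ciphertext, k5):
    """Undo the final key mixing and the round-4 S-boxes to recover U4."""
    v4 = ciphertext ^ k5
    u4 = 0
    for shift in (12, 8, 4, 0):
        u4 |= INV_SBOX[(v4 >> shift) & 0xF] << shift
    return u4


def sec34_holds(plaintext, ciphertext, k5):
    """Check whether the section 3.4 approximation evaluates to 0 for one pair."""
    u4 = partial_decrypt(ciphertext, k5)
    parity = (bit(plaintext, 5) ^ bit(plaintext, 7) ^ bit(plaintext, 8)
              ^ bit(u4, 6) ^ bit(u4, 8) ^ bit(u4, 14) ^ bit(u4, 16))
    return parity == 0


def estimate_bias(pairs, k5):
    """Magnitude of (fraction of pairs where the approximation holds) - 1/2."""
    hits = sum(sec34_holds(p, c, k5) for p, c in pairs)
    return abs(hits / len(pairs) - 0.5)
```

Only bits 5-8 and 13-16 of `k5` affect the result, because the approximation touches only the second and fourth S-boxes of round 4.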


## b)

We iterate through all $2^8$ possible values of these 8 bits and compute the bias of the "section 3.4" approximation for each. The value that yields the largest bias is the most likely candidate.

The most likely values are as follows:

|key bits (bits 5-8 and 13-16 of $K_5$)|bias|
|:---|:---|
|0,0,0,0,0,1,0,0|0.026800|
|0,0,0,0,1,0,0,1|0.026800|
|0,0,1,1,0,1,0,0|0.026350|
|1,1,0,1,0,1,0,0|0.026000|
|0,0,0,1,0,1,0,0|0.024650|

In [2]:
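rankings = []

# Enumerate all 2^8 candidates for the 8 target bits of K_5.
for bits_5_to_8 in range(0b0000, 0b1111 + 1):
    for bits_13_to_16 in range(0b0000, 0b1111 + 1):
        # Bits are numbered 1..16 from the most significant bit, so bits 5-8
        # sit 8 positions above bits 13-16; all other key bits stay 0.
        round_key = (bits_5_to_8 << 8) + bits_13_to_16
        guess = HeysCipher([0, 0, 0, 0, round_key])
        bias = get_bias(
            plaintexts, ciphertexts, guess, check_sec34_linear_approx
        )
        rankings.append((bias, round_key))

# Rank candidates by bias, largest first, and show the top five.
rankings.sort(key=lambda x: x[0], reverse=True)
for bias, round_key in rankings[:5]:
    print(f"round key: {round_key:016b}, bias: {bias:0.6f}")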

round key: 0000000000000100, bias: 0.026800
round key: 0000000000001001, bias: 0.026800
round key: 0000001100000100, bias: 0.026350
round key: 0000110100000100, bias: 0.026000
round key: 0000000100000100, bias: 0.024650


## c)
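Finding the bias of each individual linear relationship involves computing the input sum and the output sum, where each sum is the 4-bit mask of the bits involved, read as a number with $X_1$ (or $Y_1$) as the most significant bit. We then look up the expected bias in Table 4. The results are as follows: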

|linear relation|input sum|output sum|expected bias|
|:---|:---|:---|:---|
|$S_{11}: X_1 \oplus X_4 \approx Y_1$|9|8|-4/16|
|$S_{13}: X_1 \oplus X_4 \approx Y_1$|9|8|-4/16|
|$S_{21}: X_1 \oplus X_3 \approx Y_2$|10|4|-4/16|
|$S_{32}: X_1 \approx Y_1 \oplus Y_2 \oplus Y_3 \oplus Y_4$|8|15|-6/16|
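
The table lookups above can be reproduced directly from the S-box. The following is a minimal sketch, assuming the S-box from the Heys tutorial; `parity` and `lat_entry` are hypothetical helper names, not part of my helper library. Each entry counts the inputs for which the masked input parity equals the masked output parity, minus 8, so dividing by 16 gives the bias.

```python
# S-box from the Heys tutorial (assumed); index = input nibble, value = output nibble.
SBOX = [0xE, 0x4, 0xD, 0x1, 0x2, 0xF, 0xB, 0x8,
        0x3, 0xA, 0x6, 0xC, 0x5, 0x9, 0x0, 0x7]


def parity(x):
    """XOR of all bits of x."""
    return bin(x).count("1") % 2


def lat_entry(input_sum, output_sum):
    """Number of inputs where the linear relation holds, minus 8."""
    matches = sum(
        parity(x & input_sum) == parity(SBOX[x] & output_sum)
        for x in range(16)
    )
    return matches - 8


for name, input_sum, output_sum in [
    ("S11", 9, 8), ("S13", 9, 8), ("S21", 10, 4), ("S32", 8, 15),
]:
    print(f"{name}: bias = {lat_entry(input_sum, output_sum)}/16")
```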

We can trace through the Heys cipher and write down the linear relationships of the intermediate state before arriving at a final linear approximation:

$$
\begin{aligned}
U_{1,1} &= P_1 \oplus K_{1,1} \\
U_{1,4} &= P_4 \oplus K_{1,4} \\
U_{1,9} &= P_9 \oplus K_{1,9} \\
U_{1,12} &= P_{12} \oplus K_{1,12} \\
V_{1,1} &\approx U_{1,1} \oplus U_{1,4}, \epsilon = -\frac{1}{4} \\
V_{1,9} &\approx U_{1,9} \oplus U_{1,12}, \epsilon = -\frac{1}{4} \\
U_{2,1} &= V_{1,1} \oplus K_{2,1} \\
U_{2,3} &= V_{1,9} \oplus K_{2,3} \\
V_{2,2} &\approx U_{2,1} \oplus U_{2,3}, \epsilon = -\frac{1}{4} \\
U_{3,5} &= V_{2,2} \oplus K_{3,5} \\
V_{3,5} \oplus V_{3,6} \oplus V_{3,7} \oplus V_{3,8} &\approx U_{3,5}, \epsilon = -\frac{3}{8}
\end{aligned}
$$

Therefore:

$$
\begin{aligned}
V_{3,5} \oplus V_{3,6} \oplus V_{3,7} \oplus V_{3,8} &\approx U_{3,5} \\
&\approx V_{2,2} \oplus K_{3,5} \\
&\approx U_{2,1} \oplus U_{2,3} \oplus K_{3,5} \\
&\approx (V_{1,1} \oplus K_{2,1}) \oplus (V_{1,9} \oplus K_{2,3}) \oplus K_{3,5} \\
&\approx U_{1,1} \oplus U_{1,4} \oplus K_{2,1} \oplus U_{1,9} \oplus U_{1,12} \oplus K_{2,3} \oplus K_{3,5} \\
&\approx P_1 \oplus K_{1,1} \oplus P_4 \oplus K_{1,4} \oplus K_{2,1} \oplus P_9 \oplus K_{1,9} \oplus P_{12} \oplus K_{1,12} \oplus K_{2,3} \oplus K_{3,5} \\
\end{aligned}
$$

Combine the approximation above with the following relations, which come from the round-3 permutation and the round-4 key mixing:

$$
\begin{aligned}
U_{4,2} &= V_{3,5} \oplus K_{4,2} \\
U_{4,6} &= V_{3,6} \oplus K_{4,6} \\
U_{4,10} &= V_{3,7} \oplus K_{4,10} \\
U_{4,14} &= V_{3,8} \oplus K_{4,14} \\
\end{aligned}
$$
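
The wiring used in the relations above (for example $V_{1,9}$ feeding $U_{2,3}$, or $V_{3,5}, \dots, V_{3,8}$ feeding $U_{4,2}, U_{4,6}, U_{4,10}, U_{4,14}$) follows the cipher's bit permutation, in which output bit $j$ of S-box $i$ becomes input bit $i$ of S-box $j$. A small sketch to double-check the indices, where `permute` is a hypothetical helper using 1-based bit positions:

```python
def permute(position):
    """Map output bit `position` (1..16) of one round to its input bit in the next.

    Output bit j of S-box i goes to input bit i of S-box j.
    """
    i = position - 1
    return 4 * (i % 4) + i // 4 + 1


# Active bits along the trail: (round, V position) -> U position in the next round.
for rnd, v_pos in [(1, 1), (1, 9), (2, 2), (3, 5), (3, 6), (3, 7), (3, 8)]:
    print(f"V_{{{rnd},{v_pos}}} -> U_{{{rnd + 1},{permute(v_pos)}}}")
```

The permutation is its own inverse, so the same function also maps a $U$ position back to the $V$ position that feeds it.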

Finally, moving everything to one side gives the following approximation:

$$
P_1 \oplus P_4 \oplus P_9 \oplus P_{12} 
\oplus U_{4,2} \oplus U_{4,6} \oplus U_{4,10} \oplus U_{4,14}
\oplus \Sigma_K
\approx 0
$$

where $\Sigma_K$ is the XOR of the key bits involved: $K_{1,1}, K_{1,4}, K_{1,9}, K_{1,12}, K_{2,1}, K_{2,3}, K_{3,5}, K_{4,2}, K_{4,6}, K_{4,10}, K_{4,14}$.

The final bias can be approximated using the piling-up lemma:

$$
\epsilon = 2^3 \cdot \prod_{i=1}^{4}\epsilon_i = 8 \cdot \left(-\frac{1}{4}\right) \cdot \left(-\frac{1}{4}\right) \cdot \left(-\frac{1}{4}\right) \cdot \left(-\frac{3}{8}\right) = \frac{3}{64}
$$
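
As a sanity check on the arithmetic, the piling-up lemma can be evaluated with exact fractions. This is a minimal sketch; `piling_up` is a hypothetical helper name, and it assumes the four S-box approximations are independent, as the lemma requires.

```python
from fractions import Fraction
from math import prod


def piling_up(biases):
    """Combined bias of n independent approximations: 2^(n-1) * product of biases."""
    biases = list(biases)
    return 2 ** (len(biases) - 1) * prod(biases)


# Biases of S11, S13, S21, and S32 from the table in part c).
biases = [Fraction(-1, 4), Fraction(-1, 4), Fraction(-1, 4), Fraction(-3, 8)]
print(piling_up(biases))  # 3/64
```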

Also, a helpful diagram:

<img src="./static/a2q1c.png" width=500></img>

## d)
Fixing the sum of key bits to be $0$, we arrive at the following linear approximation:

$$
P_1 \oplus P_4 \oplus P_9 \oplus P_{12} 
\oplus U_{4,2} \oplus U_{4,6} \oplus U_{4,10} \oplus U_{4,14}
\approx 0
$$

Because $U_{4,2}, U_{4,6}, U_{4,10}, U_{4,14}$ each belong to a different fourth-round S-box, recovering them requires all 16 bits of $K_5$. Without further insight into the construction of the S-boxes, the best approach to finding the fifth round key is to brute-force all $2^{16}$ possible values, which is impractical on a personal computer running Python. So I choose (ii) and manually analyze the first 10 pairs using the round key `0b1100011111100110`.

|plaintext|ciphertext|U4|$P_1 \oplus P_4 \oplus P_9 \oplus P_{12} \oplus U_{4,2} \oplus U_{4,6} \oplus U_{4,10} \oplus U_{4,14}$|
|:---|:---|:---|:---|
|`0000000000000001`|`1111010100100110`|`1000010010111110`|0|
|`0000000000000010`|`0100110011011100`|`0111011010001001`|0|
|`0000000000000011`|`1110101100010001`|`0100101101011111`|1|
|`0000000000000100`|`1011001010101110`|`1111110000010111`|1|
|`0000000000000110`|`1111101111000001`|`1000101101001111`|0|
|`0000000000001000`|`1110010101000000`|`0100010010011010`|0|
|`0000000000001010`|`1011000010100110`|`1111111100011110`|1|
|`0000000000010001`|`1111100100010111`|`1000000001010011`|0|
|`0000000000010101`|`1000010011010011`|`0001100010001100`|0|
|`0000000000011010`|`1101110000001101`|`0011011000000110`|1|

The relationship holds for 6 of the 10 pairs, so the observed bias is $\frac{6}{10} - \frac{1}{2} = \frac{1}{10}$.
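
The table can be re-derived with a short script. This is a sketch under the assumption that the cipher uses the S-box from the Heys tutorial; the helper names `bit`, `partial_decrypt`, and `approximation_value` are hypothetical and separate from my helper library. Each ciphertext is XORed with $K_5$ and passed through the inverse S-boxes to obtain $U_4$, then the approximation is evaluated on the pair.

```python
from fractions import Fraction

# S-box from the Heys tutorial (assumed); index = input nibble, value = output nibble.
SBOX = [0xE, 0x4, 0xD, 0x1, 0x2, 0xF, 0xB, 0x8,
        0x3, 0xA, 0x6, 0xC, 0x5, 0x9, 0x0, 0x7]
INV_SBOX = [SBOX.index(y) for y in range(16)]

K5 = 0b1100011111100110

# (plaintext, ciphertext) pairs copied from the table above.
PAIRS = [
    (0b0000000000000001, 0b1111010100100110),
    (0b0000000000000010, 0b0100110011011100),
    (0b0000000000000011, 0b1110101100010001),
    (0b0000000000000100, 0b1011001010101110),
    (0b0000000000000110, 0b1111101111000001),
    (0b0000000000001000, 0b1110010101000000),
    (0b0000000000001010, 0b1011000010100110),
    (0b0000000000010001, 0b1111100100010111),
    (0b0000000000010101, 0b1000010011010011),
    (0b0000000000011010, 0b1101110000001101),
]


def bit(word, i):
    """Return bit i of a 16-bit word, numbered 1..16 from the most significant bit."""
    return (word >> (16 - i)) & 1


def partial_decrypt(ciphertext, k5):
    """Undo the final key mixing and the round-4 S-boxes to recover U4."""
    v4 = ciphertext ^ k5
    u4 = 0
    for shift in (12, 8, 4, 0):
        u4 |= INV_SBOX[(v4 >> shift) & 0xF] << shift
    return u4


def approximation_value(plaintext, u4):
    """P1 ^ P4 ^ P9 ^ P12 ^ U4,2 ^ U4,6 ^ U4,10 ^ U4,14."""
    value = 0
    for i in (1, 4, 9, 12):
        value ^= bit(plaintext, i)
    for i in (2, 6, 10, 14):
        value ^= bit(u4, i)
    return value


values = [approximation_value(p, partial_decrypt(c, K5)) for p, c in PAIRS]
holds = values.count(0)
print(values)
print(f"holds {holds}/{len(PAIRS)}, bias = {Fraction(holds, len(PAIRS)) - Fraction(1, 2)}")
```

With only 10 pairs the estimate is very noisy, so it should not be expected to match the theoretical bias of $\frac{3}{64}$ closely.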
