# Exercise 2: Performing SIFA and SEFA on real data

Now that we have written our implementation of SIFA and SEFA and tested it on simulated data, we can apply it to real data. In this exercise we will apply SIFA and SEFA to the data you've collected from the Piñata training target.

If you haven't collected data from the Piñata training target yet, do so before continuing with this exercise.

In [None]:
from tqdm.notebook import tqdm
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import secrets
import ipytest

ipytest.autoconfig()

%load_ext autoreload
%autoreload 2

## Load data from Piñata training target

First we will load the data from the Piñata training target. Put the data in the `data` folder of this exercise and write code below to load it.
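If you prefer to work with individual bytes right away, the ciphertexts can also be parsed into an `(n, 16)` matrix of `uint8` values. The sketch below assumes one hex-encoded 16-byte ciphertext per line; the helper name `hex_lines_to_matrix` and the example lines are hypothetical and not part of the exercise code. A side benefit of this layout is that it avoids NumPy's fixed-width bytes dtype, which drops trailing zero bytes when elements are read back.

```python
import numpy as np


def hex_lines_to_matrix(lines, block_size=16):
    """Parse hex-encoded lines into an (n, block_size) uint8 array."""
    rows = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        block = bytes.fromhex(line)
        if len(block) != block_size:
            raise ValueError(f"expected {block_size} bytes, got {len(block)}")
        rows.append(np.frombuffer(block, dtype=np.uint8))
    return np.array(rows, dtype=np.uint8).reshape(-1, block_size)


# Made-up example input, standing in for the lines of the ciphertext file
example_lines = [
    "00112233445566778899aabbccddeeff\n",
    "\n",
    "ffeeddccbbaa99887766554433221100\n",
]
matrix = hex_lines_to_matrix(example_lines)
print(matrix.shape)
```

With real data you could pass the open file object directly, since iterating over it yields one line at a time.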

In [None]:
# TODO Load the data

try:
    with open("./data/ciphertexts.txt") as f:
        ciphertexts = np.array([bytes.fromhex(line.strip()) for line in f.readlines()])
except FileNotFoundError:
    raise FileNotFoundError("File not found, make sure you have placed the data in the 'data' folder of this exercise.")

# Make sure the shape is correct, should be (n,)
print(ciphertexts.shape)

## Implementing SIFA on Piñata data
Use the code from the previous exercise to implement SIFA on the Piñata data.
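
As a reminder of the core computation, the sketch below estimates the distribution $p$ of a set of byte values and its Squared Euclidean Imbalance (SEI) against the uniform distribution, $\sum_x (p(x) - 1/256)^2$. The function names `byte_distribution` and `sei` are hypothetical, and your implementation from the previous exercise may differ, for instance in how it normalises the score.

```python
import numpy as np


def byte_distribution(values):
    """Empirical distribution over the 256 possible byte values.

    Assumes `values` is non-empty; an empty input would divide by zero.
    """
    counts = np.bincount(np.asarray(values, dtype=np.uint8), minlength=256)
    return counts / counts.sum()


def sei(p):
    """Squared Euclidean Imbalance of p relative to the uniform distribution."""
    uniform = 1.0 / p.size
    return float(np.sum((p - uniform) ** 2))


# Illustration on synthetic data: a biased sample should score higher
rng = np.random.default_rng(0)
uniform_sample = rng.integers(0, 256, size=5000, dtype=np.uint8)
biased_sample = rng.integers(0, 64, size=5000, dtype=np.uint8)  # only low values

print(sei(byte_distribution(uniform_sample)))
print(sei(byte_distribution(biased_sample)))
```

For a correct key guess the intermediate bytes should be visibly biased, so the SEI score is expected to stand out from the scores of the random keys.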

In [None]:
correct_key = bytes.fromhex("CAFEBABEDEADBEEF0001020304050607")
keys = np.array([correct_key] + [secrets.token_bytes(16) for _ in range(10)],
                dtype="|S16")

# 1. TODO Calculate the intermediate values from the ciphertexts for the
#         correct key and 10 random keys
#         (one byte per key, ciphertext and byte index)
intermediates = np.zeros((keys.shape[0], ciphertexts.shape[0], 16), dtype=np.uint8)

# 2. TODO Calculate the distribution p of each intermediate byte
#         (one 256-bin histogram per key and byte index)
hists = np.zeros((keys.shape[0], 16, 256), dtype=np.float64)

# 3. TODO Calculate the SEI score of each distribution
#         (one score per key and byte index)
sei_scores = np.zeros((keys.shape[0], 16), dtype=np.float64)

### Bias for each byte

In [None]:
fig, axes = plt.subplots(4,4, dpi=150, sharex=True, sharey=True)
fig.suptitle('Distribution of $p$ per byte index')
fig.text(0.5, 0.04, 'Byte value', ha='center')
fig.text(0, 0.5, '$p$ distribution', va='center', rotation='vertical')

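for byte_index in range(16):
    # Fill the grid column by column: byte 0 top-left, byte 1 directly below it
    ax = axes[byte_index%4][byte_index//4]
    ax.set_title(f"Byte {byte_index}", fontsize=8, y=0.75)
    # hists[0] belongs to the correct key (first entry of `keys`)
    ax.plot(hists[0, byte_index])
    # Reference line: probability of each value under a uniform distribution
    ax.axhline(1/256, color='red')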

### SIFA SEI score per byte

In [None]:
# Plot the SEI scores
fig, axes = plt.subplots(4,4, dpi=150, sharex=True, sharey=True)
fig.suptitle('SIFA: SEI score of all states per byte index')
fig.text(0.5, 0.04, 'Keys', ha='center')
fig.text(0, 0.5, 'SEI score', va='center', rotation='vertical')

for byte_index in range(16):
    ax = axes[byte_index%4][byte_index//4]
    ax.set_title(f"Byte {byte_index}", fontsize=8, y=0.75)
    color = ["red"] + ["blue"] * (sei_scores.shape[0]-1)
    ax.bar(np.arange(sei_scores.shape[0]), sei_scores[:, byte_index], color=color)

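# Add legend with explicit handles, so the colors do not depend on the plotted bars
from matplotlib.patches import Patch
handles = [Patch(color='red', label="Correct key"), Patch(color='blue', label="Random key")]
fig.legend(handles=handles, bbox_to_anchor=(0.95, 0.5), loc='center left', borderaxespad=0)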