## Tutorial: First-Order CPA Attack
This tutorial demonstrates how to perform a first-order Correlation Power Analysis (CPA) attack using the accelerated CPA implementation provided by the ChipWhisperer APIs.

### Overview
In this tutorial, we will use the DPA Contest V2 dataset as the target for the CPA attack. The workflow consists of the following steps:
1. **Download the DPA Contest V2 dataset.**
2. **Convert the dataset to ChipWhisperer Project format.**
3. **Set up a CPA attack and execute it.**
4. **Visualize and analyze the results.**

## Step 1: Download the DPA Contest V2 Dataset
The first step is to download the DPA Contest V2 dataset. You can obtain it from the [DPA Contest V2 website](https://dpacontest.telecom-paris.fr/v2/download.php).

The dataset provides two types of files, but for this tutorial, we will use the *public database*. Follow the instructions on the website to prepare the following files in the appropriate locations:
- `DPA_contest2_public_base_index_file` (text file containing the index of each trace)
- `DPA_contest2_public_base_diff_vcc_a128_2009_12_23` (directory containing traces)

The dataset contains 640,000 traces of AES encryption operations with 32 different keys (20,000 traces per key). Each line of the index file holds the following whitespace-separated fields (this is what the conversion code in Step 2 assumes):
- **Key**: The encryption key used for the operation.
- **Plaintext**: The input plaintext for the AES encryption.
- **Ciphertext**: The resulting ciphertext after encryption.
- **Trace file name**: The file name of the corresponding power trace.
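
The field layout above can be illustrated with a small parser. This is only a sketch: the `IndexEntry` type and `parse_index_line` helper are hypothetical names, the sample line uses made-up plaintext, ciphertext, and file name values, and the parser accepts either commas or whitespace as separators on the assumption that file names contain neither.

```python
import re
from typing import NamedTuple


class IndexEntry(NamedTuple):
    """One record of the index file (hypothetical helper type)."""
    key: bytes
    plaintext: bytes
    ciphertext: bytes
    trace_file: str


def parse_index_line(line: str) -> IndexEntry:
    """Split one index line into its four fields (commas or whitespace)."""
    fields = [f for f in re.split(r"[,\s]+", line.strip()) if f]
    if len(fields) != 4:
        raise ValueError(f"expected 4 fields, got {len(fields)}")
    key, pt, ct, trace_file = fields
    return IndexEntry(bytes.fromhex(key), bytes.fromhex(pt),
                      bytes.fromhex(ct), trace_file)


# Made-up example line; only the key is taken from this tutorial.
sample = ("6a51a0d2d8542f68960fa728ab5133a3 "
          "00112233445566778899aabbccddeeff "
          "ffeeddccbbaa99887766554433221100 "
          "example_trace.csv")
entry = parse_index_line(sample)
print(entry.key.hex(), len(entry.plaintext), entry.trace_file)
```

Converting the hex strings to `bytes` early makes it easy to check that the key, plaintext, and ciphertext are all 16 bytes long before any trace file is opened.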


Run the following code block to select the directory that contains the required files.
A file chooser will be displayed.
Once you select a valid directory, a check mark will be displayed.

In [None]:
from ipyfilechooser import FileChooser
import ipywidgets as widgets
import os
import warnings
dataset_chooser = FileChooser()
dataset_status = widgets.Valid(
    value=False,
)
index_file = None
trace_dir = None
def check_dataset_selection():
    global index_file, trace_dir
    dataset_status.value = False
    if not dataset_chooser.selected:
        return
    selected_path = dataset_chooser.selected
    # check index file
    index_file = os.path.join(selected_path, 'DPA_contest2_public_base_index_file')
    if not os.path.exists(index_file):
        return
    # check trace directory
    trace_dir = os.path.join(selected_path, 'DPA_contest2_public_base_diff_vcc_a128_2009_12_23')
    if not os.path.isdir(trace_dir):
        return
    dataset_status.value = True

dataset_chooser.register_callback(check_dataset_selection)
display(widgets.HBox([widgets.Label("Dataset check"), dataset_status]))
display(dataset_chooser)

## Step 2: Convert the Dataset to ChipWhisperer Project Format
The CPA attack will be performed on traces with a fixed key, so we need to extract the traces corresponding to a specific key from the dataset. In this example, we will use the first key (`6a51a0d2d8542f68960fa728ab5133a3`). If you wish to analyze a different key, simply update the `target_key` variable accordingly.

Examples of other keys:
- `3b8f48986b4bb9afc4bfe81b66282193`
- `e65525f3aa55ab945748986263e81440`
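
To see which keys are available before choosing `target_key`, one option is to tally the first field of every index line. The sketch below assumes the same whitespace splitting used by the conversion cell; `count_traces_per_key` is a hypothetical helper, and the sample lines use placeholder plaintext, ciphertext, and file name fields.

```python
from collections import Counter


def count_traces_per_key(lines):
    """Map each key (first field of a line) to the number of traces recorded for it."""
    counts = Counter()
    for line in lines:
        fields = line.split()
        if fields:  # ignore blank lines
            counts[fields[0]] += 1
    return counts


# Placeholder index lines; only the key values come from this tutorial.
sample_lines = [
    "6a51a0d2d8542f68960fa728ab5133a3 pt0 ct0 trace0.csv",
    "6a51a0d2d8542f68960fa728ab5133a3 pt1 ct1 trace1.csv",
    "3b8f48986b4bb9afc4bfe81b66282193 pt2 ct2 trace2.csv",
    "",
]
counts = count_traces_per_key(sample_lines)
for key, n in counts.items():
    print(key, n)

# With the real dataset, an open file object can be passed directly:
#   with open(index_file) as f:
#       counts = count_traces_per_key(f)
```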



In [None]:
import chipwhisperer as cw
import csv
import numpy as np
from matplotlib import pyplot as plt
from tqdm.notebook import tqdm
from chipwhisperer.common.api import ProjectFormat as Project

target_key = "6a51a0d2d8542f68960fa728ab5133a3"

# Function to convert waveform data in csv format to numpy array
def csv_to_numpy(trace_file):
    with open(trace_file, "r") as f:
        reader = csv.reader(f)
        wave = []
        for row in reader:
            # Skip blank lines and comment lines starting with "#"
            if not row or row[0].startswith("#"):
                continue
            if len(row) > 1:
                print(f"Warning: Row {row} has more than one column, only the first column will be used")
            # Keep the first column as a float sample
            wave.append(float(row[0]))

    # Convert to numpy array
    wave = np.array(wave, dtype=float)

    # Return the wave
    return wave

# Initialize here so the check below also works when no dataset is selected
traces = []
if not dataset_status.value:
    print("Please select a valid dataset directory.")
else:
    # find traces with the specified target key
    with open(index_file, "r") as f:
        for line in f:
            fields = line.split()
            if len(fields) != 4:
                continue  # skip blank or malformed lines
            key, pt, ct, trace_file = fields
            if key == target_key:
                trace_path = os.path.join(trace_dir, trace_file)
                traces.append((pt, ct, trace_path))
if len(traces) > 0:
    print(f"Found {len(traces)} traces for target key {target_key}.")

    # convert traces to ChipWhisperer format
    project = Project.Project()
    # each field (key, plaintext, ciphertext, waveform) must be a NumPy array
    np_key = np.frombuffer(bytes.fromhex(target_key), dtype=np.uint8)
    for pt, ct, trace_path in tqdm(traces, desc="Converting traces"):
        np_pt = np.frombuffer(bytes.fromhex(pt), dtype=np.uint8)
        np_ct = np.frombuffer(bytes.fromhex(ct), dtype=np.uint8)
        wave = csv_to_numpy(trace_path)
        project.traces.append(cw.Trace(wave, np_pt, np_ct, np_key))

    # Display the first 10 traces
    for i in range(min(10, len(project.traces))):
        plt.plot(project.traces[i].wave)
else:
    print(f"No traces found for target key {target_key}. Please check the dataset directory and the target key.")


### Optional step: Save the Project
Saving the converted project lets you skip the conversion step next time.
Run the following code block to save the project. A file chooser will be displayed.
Select the directory where you want to save the project and enter the project name.
After saving, a `project_name.cwp` file and a `project_name_data` directory are created in the selected directory.

In [None]:
from ipywidgets import Label
project_choose = FileChooser()

msg = Label()
def save_project():
    # check if the selected path is directory
    if not project_choose.selected:
        return
    if os.path.isdir(project_choose.selected):
        msg.value = "Please fill project name in output filename field."
        return
    project_path = Project.ensure_cwp_extension(project_choose.selected)
    project.setFilename(project_path)
    project.save()
    msg.value = f"Project saved to {project_path}"
project_choose.register_callback(save_project)
display(msg)
display(project_choose)

## Step 3: Set Up a CPA Attack and Execute It
To begin, run the first code block to display a dropdown menu for selecting the CPA algorithm implementation (OpenMP, CUDA, and so on). Refer to the documentation for a description of each implementation.

Select the desired implementation from the dropdown menu before executing the second code block.
In addition to the implementations from our library, you can select the original ChipWhisperer implementation.
However, the original implementation may take significantly longer to complete the attack (e.g., over 20 minutes).
To reduce execution time, consider restricting the attack to a subset of sample points by adding `attack.set_point_range(0, 1000)` to the code block. Note that this limits the points analyzed in each trace, not the number of traces.

Next, run the second code block to execute the CPA attack. The key in the DPA Contest V2 dataset is known to be recoverable with the Hamming-distance model for the last-round state transition, which is what `cwa.leakage_models.last_round_state_diff` selects.
With a different model, such as `cwa.leakage_models.sbox_output`, the attack will not recover the key.
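
For readers unfamiliar with the last-round Hamming-distance model, the following sketch shows one common formulation for a single key byte: the byte held before the final round is reconstructed from the ciphertext and a round-key guess, then compared with the ciphertext byte that overwrites it. All names here are hypothetical, the S-box is generated from its mathematical definition to keep the block self-contained, and whether `last_round_state_diff` uses exactly this byte mapping internally is an assumption.

```python
def gmul(a, b):
    """Multiply two bytes in GF(2^8) modulo the AES polynomial (0x11B)."""
    r = 0
    for _ in range(8):
        if b & 1:
            r ^= a
        carry = a & 0x80
        a = (a << 1) & 0xFF
        if carry:
            a ^= 0x1B
        b >>= 1
    return r


def rotl8(x, s):
    """Rotate a byte left by s bits."""
    return ((x << s) | (x >> (8 - s))) & 0xFF


def build_sbox():
    """AES S-box: multiplicative inverse in GF(2^8) followed by the affine map."""
    sbox = []
    for a in range(256):
        inv = 0 if a == 0 else next(x for x in range(1, 256) if gmul(a, x) == 1)
        sbox.append(inv ^ rotl8(inv, 1) ^ rotl8(inv, 2)
                    ^ rotl8(inv, 3) ^ rotl8(inv, 4) ^ 0x63)
    return sbox


SBOX = build_sbox()
INV_SBOX = [0] * 256
for i, s in enumerate(SBOX):
    INV_SBOX[s] = i


def shifted_index(bnum):
    """Column-major state index that ShiftRows moves into position bnum."""
    row, col = bnum % 4, bnum // 4
    return row + 4 * ((col + row) % 4)


def last_round_hd(ct, round_key_guess, bnum):
    """Hamming distance between the reconstructed round-9 byte and the
    ciphertext byte stored at the same state position."""
    st9 = INV_SBOX[ct[bnum] ^ round_key_guess]
    st10 = ct[shifted_index(bnum)]
    return bin(st9 ^ st10).count("1")


ct = bytes(range(16))  # made-up ciphertext for illustration
print([last_round_hd(ct, 0x00, b) for b in range(4)])
```

In a CPA attack, this value is computed for all 256 guesses of each round-key byte and correlated with the measured traces; the guess producing the strongest correlation is taken as the key byte.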

During the CPA attack, the Partial Guessing Entropy (PGE) is displayed in the output.
The PGE of a key byte is the rank of the correct value among all 256 candidates for that byte. A PGE of zero means the correct value is ranked first, i.e., the attack identifies that key byte.
Note that the Hamming-distance model for the last round recovers the last round key (the output of the key schedule), so the recovered bytes differ from the corresponding bytes of the original encryption key.
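
As a rough illustration of how a PGE value can be derived, the sketch below ranks the 256 guesses for one key byte by their peak absolute correlation and returns the position of the correct guess. The function name and the synthetic correlation matrix are assumptions made for this example; ChipWhisperer's own ranking and tie-breaking may differ in detail.

```python
import numpy as np


def partial_guessing_entropy(corr, correct_guess):
    """Rank (0 = best) of the correct guess when guesses are sorted by
    their peak absolute correlation.

    corr: array of shape (256, n_points), one row per key-byte guess.
    """
    peaks = np.max(np.abs(corr), axis=1)
    ranking = np.argsort(-peaks, kind="stable")  # best guess first
    return int(np.where(ranking == correct_guess)[0][0])


# Synthetic data: small random correlations plus one clear peak for guess 0x2B.
rng = np.random.default_rng(0)
corr = rng.uniform(-0.05, 0.05, size=(256, 100))
corr[0x2B, 40] = 0.3

print(partial_guessing_entropy(corr, 0x2B))  # 0: the peak ranks this guess first
print(partial_guessing_entropy(corr, 0x2C))  # some rank greater than 0
```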

In [None]:
from cw_plugins.analyzer.attacks.cpa_algorithms.fast_progressive import *
from ipywidgets import Dropdown, HBox, Label
import chipwhisperer.analyzer as cwa
algorithm_select = Dropdown(
    options=[
       ("FastCPAProgressive", FastCPAProgressive),
       ("FastCPAProgressiveTiling", FastCPAProgressiveTiling),
       ("FastCPAProgressiveCuda", FastCPAProgressiveCuda),
       ("FastCPAProgressiveCudaFP32", FastCPAProgressiveCudaFP32),
       ("FastCPAProgressiveOpenCL", FastCPAProgressiveOpenCL),
       ("FastCPAProgressiveOpenCLFP32", FastCPAProgressiveOpenCLFP32),
       ("Original", cwa.attacks.cpa_algorithms.Progressive),
    ],
    value=FastCPAProgressive,  # Default value
)
display(HBox([Label("CPA Algorithm"), algorithm_select]))

In [None]:
import time
# select the leakage model
leakage_model = cwa.leakage_models.last_round_state_diff

# create an attack object
attack = cwa.cpa(project, leakage_model, algorithm_select.value)

# register a callback that records the PGE history and then updates the Jupyter display
pge_history = []
cb = cwa.get_jupyter_callback(attack)
def callback():
    # store a snapshot of the current per-byte PGE values
    pge_history.append(list(attack.results.pge))
    # refresh the standard ChipWhisperer results table
    cb()

# run the attack
start = time.time()
results = attack.run(callback=callback, update_interval=1000)
end = time.time()
print(f"Attack finished in {end - start:.2f} seconds.")

## Step 4: Visualize and Analyze the Results

Lastly, we will visualize the results of the CPA attack.

The first code block displays the Guessing Entropy (GE) and the success rate, showing how both values evolve as the attack processes more traces.

The second code block displays the correlation coefficients for the correct key guess and for the other key guesses across all sample points.
If the attack is successful, you should observe a distinct leakage point in the correlation plot, where the correct guess shows a significantly higher correlation than the other guesses.
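
Why a leakage point appears as a correlation peak can be seen on synthetic data. The sketch below is not the accelerated implementation used in this tutorial: it computes a plain Pearson correlation between one hypothesis vector and every sample point of simulated noise traces, where a single point (`leak_point`, chosen arbitrarily) depends on the hypothesis. All names and parameters are assumptions for illustration.

```python
import numpy as np


def column_correlation(hypothesis, traces):
    """Pearson correlation between a hypothesis vector and each sample point."""
    h = hypothesis - hypothesis.mean()
    t = traces - traces.mean(axis=0)
    numerator = h @ t
    denominator = np.sqrt((h @ h) * (t * t).sum(axis=0))
    return numerator / denominator


rng = np.random.default_rng(1)
n_traces, n_points, leak_point = 2000, 200, 120

# Stand-in for Hamming-distance values (0..8) under the correct key guess.
hyp = rng.integers(0, 9, size=n_traces).astype(float)

# Gaussian noise everywhere, plus a hypothesis-dependent term at one point.
traces = rng.normal(0.0, 1.0, size=(n_traces, n_points))
traces[:, leak_point] += 0.5 * hyp

corr = column_correlation(hyp, traces)
peak = int(np.abs(corr).argmax())
print(peak, round(float(corr[peak]), 3))  # the peak lands at leak_point
```

For a wrong key guess the hypothesis vector is largely unrelated to the leaking point, so its correlation stays near the noise floor at every sample; this is the gray band in the plot produced by the second code block.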


In [None]:
from math import log2, log10
# calculate Guessing Entropy (Averaged PGE)
ge = [sum([pge[b] for b in range(16)])/16 for pge in pge_history]

# calculate success rate (fraction of key bytes with PGE == 0)
success_rate = [sum([pge[b] == 0 for b in range(16)])/16 for pge in pge_history]

scale = round(10 ** int(log10(attack.reporting_interval)))
interval = attack.reporting_interval / scale
x = [interval * i for i in range(1, len(ge)+1)]

fig = plt.figure()
ax1 = fig.add_subplot(211)
ax1.plot(x, ge, marker='o', markersize=3, linewidth=1)
ax1.set_ylabel("GE")

ax2 = fig.add_subplot(212)
ax2.plot(x, success_rate, marker='o', markersize=3, linewidth=1)
ax2.set_ylabel("Success Rate")
ax2.set_ylim(0, 1)
ax2.set_xlabel(f"Number of traces (x{scale})")
plt.show()

In [None]:
wave = project.traces[0].wave
# plot 1st byte of correlation
key_byte = 0
fig = plt.figure(figsize=(15, 4))
wave_ax = fig.add_subplot(2,1,1)
wave_ax.plot(wave, label="Waveform")

correct_key = leakage_model.process_known_key(project.traces[0].key)[key_byte]
corr = results.diffs[key_byte][correct_key]
corr_ax = fig.add_subplot(2,1,2)
# plot every wrong guess (256 candidates per key byte) in gray
for i in range(256):
    if i != correct_key:
        corr_ax.plot(results.diffs[key_byte][i], color='gray')
corr_ax.plot(corr, color="red")
max_pos = np.abs(corr).argmax()
corr_ax.plot(max_pos, corr[max_pos], marker='o', markersize=8, markeredgewidth=4,
                  fillstyle='none', color = "blue")
plt.show()