# EZKL Pipeline Integration for fNIRS Classification

## Introduction

This is the backend for the  **EZKL Pipeline**, the final stage of our project dedicated to processing private Functional Near-Infrared Spectroscopy (fNIRS) data using a publicly available neural network. This pipeline leverages [EZKL](https://github.com/zkonduit/ezkl) to generate and verify cryptographic proofs related to data processing, ensuring data integrity and privacy through secure and verifiable computations.

As part of the Buildathon, three components have been made:


1.   
code for re-training a neural network on fNIRS data, , outputting files compatible with EZKL, and plotting graphs on the performance of the model [the github repository submitted on Dorahacks](https://github.com/rainbowpuffpuff/ez_think2earn/blob/master/fNIRSNET/KFold_Train.py)
2.   code for processing onnx and data files from participant #1, building a circuit and creating a proof with ezkl (the current file, on [github](https://github.com/rainbowpuffpuff/ez_think2earn/tree/master/ZuThailand%20EZKL%20demo.ipynb))
3. a research paper on the available open hardware fNIRS devices that anyone can manufacture, along with an experimental methodology for data collection for the purpose of decoding basic visual imagery based on fNIRS data from participants, which can be found on [the github repository submitted on Dorahacks](https://github.com/rainbowpuffpuff/ez_think2earn/blob/master/Investigating%20Hemodynamic%20Responses_%20%5Bzuthailand%20draft%5D-1.pdf)



## Purpose

The primary objectives of the EZKL pipeline are to:

- **Process Private Data**: Utilize a pre-trained fNIRSNet model in `.onnx` format to analyze fNIRS data.
- **Generate and Verify Proofs**: Produce cryptographic proofs that validate the data processing steps, ensuring the accuracy and security of the computations.

## Requirements

To successfully implement the EZKL pipeline, the following are required:

- **Model File**: An `.onnx` formatted neural network model.
- **Data**: fNIRS data compatible with the model for processing, so the data that the model was trained on.

## Obtaining the Model

The model file is obtained by adapting a GitHub pipeline that trains the **fNIRSNet** model using the Mental Arithmetic (MA) dataset. The MA dataset is selected for its compatibility with near-infrared spectroscopy readings, matching the data format used by upcoming think2earn devices.

## Dataset Details

### Mental Arithmetic (MA) Dataset

In the MA dataset, the classification task involves distinguishing between two conditions:

1. **MA (Mental Arithmetic) Task**:
   - **Description**: Subjects perform mental arithmetic operations, such as repeatedly subtracting a one-digit number from a three-digit number.
   
2. **BL (Baseline) Task**:
   - **Description**: Subjects remain relaxed, focusing on a simple fixation cross, providing a neutral baseline signal.

The data is available to download from [tu-Berlin repository](https://doc.ml.tu-berlin.de/hBCI/contactthanks.php), named NIRS data "NIRS_01-29". For the ZuThailand buildathon, only the first 9 participants are selected for processing.

This dataset is processed using MATLAB following the guidelines provided in the [fNIRSNet repository](https://github.com/wzhlearning/fNIRSNet/), ensuring compatibility with our processing pipeline.

## Training Process

The training process utilizes a **K-fold cross-validation** paradigm to evaluate the model's performance. For each fold, the following steps are performed:

- **Training**: The model is trained on the training subset of the data.
- **Evaluation**: Training accuracy, loss, and a confusion matrix are generated to assess performance across the two classes.

### Key Outputs

For each training run within the K-fold setup, the following outputs are generated:

- **Training Metrics**:
  - **Accuracy**: Percentage of correctly classified instances.
  - **Loss**: Measures the model's prediction error over epochs.
  
- **Performance Evaluation**:
  - **Confusion Matrix**: Visual representation of the model’s performance across the two classes (MA vs. BL), highlighting true positives, false positives, true negatives, and false negatives.
  
- **Model Export**:
  - **ONNX Files**: Each trained model is exported in `.onnx` format, ready for integration with the EZKL pipeline.

## Integration with EZKL

The exported `.onnx` models and the processed fNIRS data are utilized by the EZKL pipeline to generate and verify cryptographic proofs related to the data processing tasks. This step ensures that the computations are both verifiable and secure, maintaining the integrity and privacy of the processed data.

---

## Running the Google colab script

To run the script, click on **Runtime** on the top left (next to File, Edit, View, Insert), and then select the "Run all" option



# EZKL: private data processed with a public network


In [6]:
import os

# Method 1: Direct Download output files after training model

def direct_download():
    print("Starting Method 1: Direct Download Using wget")
    BASE_URL = "https://raw.githubusercontent.com/rainbowpuffpuff/ez_think2earn/master/"
    files = [
        "metrics.txt",
        "model.onnx",
        "model.pt",
        "test_data.npz",
        "train_data.npz",
    ]
    for file in files:
        url = BASE_URL + file
        print(f"Downloading {file} from {url}")
        !wget -q -O /content/{file} {url}
        print(f"Downloaded {file} to /content/{file}")
    print("Method 1 completed.\n")

direct_download()
print("Verifying downloaded files:")
!ls -lh /content/

Starting Method 1: Direct Download Using wget
Downloading metrics.txt from https://raw.githubusercontent.com/rainbowpuffpuff/ez_think2earn/master/metrics.txt
Downloaded metrics.txt to /content/metrics.txt
Downloading model.onnx from https://raw.githubusercontent.com/rainbowpuffpuff/ez_think2earn/master/model.onnx
Downloaded model.onnx to /content/model.onnx
Downloading model.pt from https://raw.githubusercontent.com/rainbowpuffpuff/ez_think2earn/master/model.pt
Downloaded model.pt to /content/model.pt
Downloading test_data.npz from https://raw.githubusercontent.com/rainbowpuffpuff/ez_think2earn/master/test_data.npz
Downloaded test_data.npz to /content/test_data.npz
Downloading train_data.npz from https://raw.githubusercontent.com/rainbowpuffpuff/ez_think2earn/master/train_data.npz
Downloaded train_data.npz to /content/train_data.npz
Method 1 completed.

Verifying downloaded files:
total 324M
-rw-r--r-- 1 root root 882K Dec  9 14:29 calibration.json
-rw-r--r-- 1 root root  45K Dec  9 14

In [7]:
# Check if notebook is in Colab and install required packages
try:
    import google.colab
    import subprocess
    import sys
    print("Running in Google Colab. Installing required packages...")
    subprocess.check_call([sys.executable, "-m", "pip", "install", "ezkl"])
    subprocess.check_call([sys.executable, "-m", "pip", "install", "onnx"])
    print("Packages installed successfully.")
except:
    print("Not running in Google Colab. Assuming ezkl and onnx are already installed.")

# Import necessary libraries
from torch import nn
import ezkl
import os
import json
import torch
import numpy as np

# Define paths based on your specified locations in /content/
model_path = os.path.join('/content', 'model.onnx')
compiled_model_path = os.path.join('/content', 'model.compiled')
pk_path = os.path.join('/content', 'test.pk')
vk_path = os.path.join('/content', 'test.vk')
settings_path = os.path.join('/content', 'settings.json')

witness_path = os.path.join('/content', 'witness.json')
data_path = os.path.join('/content', 'input.json')
cal_path = os.path.join('/content', 'calibration.json')

proof_path = os.path.join('/content', 'test.pf')

# Print all paths for verification
print("Model path:", model_path)
print("Compiled model path:", compiled_model_path)
print("Proving key (pk) path:", pk_path)
print("Verifying key (vk) path:", vk_path)
print("Settings path:", settings_path)
print("Witness path:", witness_path)
print("Input (data) path:", data_path)
print("Calibration path:", cal_path)
print("Proof path:", proof_path)

# ==============================
# Step 1: Prepare input.json
# ==============================

# Load test data from 'test_data.npz'
test_data_npz_path = os.path.join('/content', 'test_data.npz')
if not os.path.isfile(test_data_npz_path):
    raise FileNotFoundError(f"Test data file not found at {test_data_npz_path}")

test_data = np.load(test_data_npz_path)
X_test = test_data['X']
y_test = test_data['y']  # Not used for EZKL, but available

print(f"Loaded test data from {test_data_npz_path}")
print(f"Test data shape: {X_test.shape}")  # Expected shape: (N, 1, 72, 30)

# Select a single test sample to create input.json
# Here, we take the first sample; you can choose any
sample_input = X_test[0]  # Shape: (1, 72, 30)

# Normalize sample
sample_input_normalized = (sample_input - sample_input.mean()) / (sample_input.std() + 1e-5)

# Convert to list and reshape as needed
data_array = sample_input_normalized.reshape([-1]).tolist()

data = {"input_data": [data_array]}

# Serialize data into file:
with open(data_path, 'w') as f:
    json.dump(data, f)

print(f"Serialized test input to {data_path}")
print(f"Input data shape for EZKL: {np.array(data['input_data']).shape}")  # Should be (1, 72*30)

# ==============================
# Step 2: Generate Settings
# ==============================

# Set up EZKL run arguments
py_run_args = ezkl.PyRunArgs()
py_run_args.input_visibility = "private"
py_run_args.output_visibility = "public"
py_run_args.param_visibility = "fixed"  # private by default

# Generate settings
print("Generating settings...")
res = ezkl.gen_settings(model_path, settings_path, py_run_args=py_run_args)
print("gen_settings result:", res)
print("Generated settings file:", settings_path)

assert res == True, "Failed to generate settings."

# ==============================
# Step 3: Prepare Calibration Data
# ==============================

# Load calibration data from 'calibration.json'
# Prepare calibration data using multiple samples from X_test
num_cal_samples = 20  # Number of calibration samples
if X_test.shape[0] < num_cal_samples:
    raise ValueError(f"Not enough test samples ({X_test.shape[0]}) for calibration.")

calibration_samples = X_test[:num_cal_samples]
calibration_samples_normalized = (calibration_samples - calibration_samples.mean(axis=(1,2), keepdims=True)) / (calibration_samples.std(axis=(1,2), keepdims=True) + 1e-5)

cal_data_array = calibration_samples_normalized.reshape(num_cal_samples, -1).tolist()

cal_data = {"input_data": cal_data_array}

# Serialize calibration data into file:
with open(cal_path, 'w') as f:
    json.dump(cal_data, f)

print(f"Serialized calibration data to {cal_path}")
print(f"Calibration data shape for EZKL: {np.array(cal_data['input_data']).shape}")  # Should be (20, 72*30)

# ==============================
# Step 4: Calibrate Settings
# ==============================

print("Calibrating settings...")
await ezkl.calibrate_settings(cal_path, model_path, settings_path, "resources")
print("Calibration completed.")

# ==============================
# Step 5: Compile Circuit
# ==============================

print("Compiling circuit...")
res = ezkl.compile_circuit(model_path, compiled_model_path, settings_path)
print("Circuit compilation result:", res)
assert res == True, "Failed to compile circuit."

# ==============================
# Step 6: Obtain SRS (Structured Reference String)
# ==============================

print("Obtaining SRS...")
res = await ezkl.get_srs(settings_path)
print("SRS obtained:", res)

# ==============================
# Step 7: Generate Witness
# ==============================

print("Generating witness...")
res = await ezkl.gen_witness(data_path, compiled_model_path, witness_path)
print("Witness generation result:", res)
assert os.path.isfile(witness_path), "Witness file was not created."
print("Witness file created at:", witness_path)


Running in Google Colab. Installing required packages...
Packages installed successfully.
Model path: /content/model.onnx
Compiled model path: /content/model.compiled
Proving key (pk) path: /content/test.pk
Verifying key (vk) path: /content/test.vk
Settings path: /content/settings.json
Witness path: /content/witness.json
Input (data) path: /content/input.json
Calibration path: /content/calibration.json
Proof path: /content/test.pf
Loaded test data from /content/test_data.npz
Test data shape: (120, 1, 72, 30)
Serialized test input to /content/input.json
Input data shape for EZKL: (1, 2160)
Generating settings...
gen_settings result: True
Generated settings file: /content/settings.json
Serialized calibration data to /content/calibration.json
Calibration data shape for EZKL: (20, 2160)
Calibrating settings...



 <------------- Numerical Fidelity Report (input_scale: 11, param_scale: 11, scale_input_multiplier: 1) ------------->

+---------------+----------------+----------------+---------------+----------------+------------------+---------------+---------------+--------------------+--------------------+------------------------+
| mean_error    | median_error   | max_error      | min_error     | mean_abs_error | median_abs_error | max_abs_error | min_abs_error | mean_squared_error | mean_percent_error | mean_abs_percent_error |
+---------------+----------------+----------------+---------------+----------------+------------------+---------------+---------------+--------------------+--------------------+------------------------+
| -0.0009571314 | -0.00007581711 | -0.00007581711 | -0.0018384457 | 0.0009571314   | 0.00007581711    | 0.0018384457  | 0.00007581711 | 0.0000016928153    | -0.00079423893     | 0.00086424005          |
+---------------+----------------+----------------+---------------+

Calibration completed.
Compiling circuit...
Circuit compilation result: True
Obtaining SRS...
SRS obtained: True
Generating witness...
Witness generation result: {'inputs': [['45feffef93f5e1439170b97948e833285d588181b64550b829a031e1724e6430', '59feffef93f5e1439170b97948e833285d588181b64550b829a031e1724e6430', '71feffef93f5e1439170b97948e833285d588181b64550b829a031e1724e6430', '8dfeffef93f5e1439170b97948e833285d588181b64550b829a031e1724e6430', 'acfeffef93f5e1439170b97948e833285d588181b64550b829a031e1724e6430', 'd0feffef93f5e1439170b97948e833285d588181b64550b829a031e1724e6430', 'f8feffef93f5e1439170b97948e833285d588181b64550b829a031e1724e6430', '23ffffef93f5e1439170b97948e833285d588181b64550b829a031e1724e6430', '52ffffef93f5e1439170b97948e833285d588181b64550b829a031e1724e6430', '85ffffef93f5e1439170b97948e833285d588181b64550b829a031e1724e6430', 'bcffffef93f5e1439170b97948e833285d588181b64550b829a031e1724e6430', 'f6ffffef93f5e1439170b97948e833285d588181b64550b829a031e1724e6430', '32000000

In [8]:

# ==============================
# Step 8: Setup (Key Generation)
# ==============================

print("Running setup to generate proving and verifying keys...")
res = ezkl.setup(
    compiled_model_path,
    vk_path,
    pk_path,
)
print("Setup result:", res)
assert res == True, "Failed to setup keys."

# Verify that key files exist
assert os.path.isfile(vk_path), f"Verifying key file not found at {vk_path}"
assert os.path.isfile(pk_path), f"Proving key file not found at {pk_path}"
assert os.path.isfile(settings_path), f"Settings file not found at {settings_path}"
print("Verifying key created at:", vk_path)
print("Proving key created at:", pk_path)
print("Settings file exists:", settings_path)


Running setup to generate proving and verifying keys...
Setup result: True
Verifying key created at: /content/test.vk
Proving key created at: /content/test.pk
Settings file exists: /content/settings.json


In [9]:
# ==============================
# Step 9: Generate a Proof
# ==============================

print("Generating proof...")
res = ezkl.prove(
    witness_path,
    compiled_model_path,
    pk_path,
    proof_path,
    "single",
)
print("Proof generation result:", res)
assert os.path.isfile(proof_path), "Proof file was not created."
print("Proof file created at:", proof_path)


Generating proof...
Proof generation result: {'instances': [['5408000000000000000000000000000000000000000000000000000000000000', 'e9f7ffef93f5e1439170b97948e833285d588181b64550b829a031e1724e6430']], 'proof': '0x22b8405f4d268b000b74402e140f4fecd0874ac26ef2d7986c1dc4c92037a59117c53f716aa62b4cd3f96817ab4ab55c6f2c659f0fccae12885fe6424de3aea5176fd1d90f57a04eea8876bbb153fd6599f40206322ac09d19cd31cf23e0602004d690dddd0bc02cf2943f7aec1c912e57b1c5bec9521e9deafcac2f8e66f10f0c649f48dfc56626b0a743222761a4dad10d8ce67368ca09b5024a1ddd68c7f32e9cc6b04518c9ea7354fc8faa2b2b962c62afe02fa35b252e7116f73ea0b22a264cbf056b56b2c42ec0efd590393fb75920fb240247d4fe1424606a2157ce79170c3062a19ae502fcb651e73f83abb7fe4cecf6ae6dda1ae34c008bf2ae605810922e763baeb7fcf9f5a480c8edc1e2293191caa1def78741264289984a19a826e2ece7ae23ae7701f2aa15ccc4865d1044f37e37fef84baea3a396b17b34632e5cea2995b5dbe43503d5d91bc03284b76d2c7fe359023b382201a87d74c138090efec63afec41be3b151e76939320dd96ca03968a20258d078e23297ecc95f1f087252b2d593b4ac2c6

In [10]:

# ==============================
# Step 10: Verify the Proof
# ==============================

print("Verifying proof...")
res = ezkl.verify(
    proof_path,
    settings_path,
    vk_path,
)
assert res == True, "Proof verification failed."
print("Proof verification successful: verified")


Verifying proof...
Proof verification successful: verified
