In [None]:
!pip install tensorflow tensorflow-federated numpy tenseal optuna

Collecting tensorflow-federated
  Downloading tensorflow_federated-0.87.0-py3-none-manylinux_2_31_x86_64.whl.metadata (19 kB)
Collecting tenseal
  Downloading tenseal-0.3.14-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (8.2 kB)
Collecting optuna
  Downloading optuna-4.0.0-py3-none-any.whl.metadata (16 kB)
Collecting attrs~=23.1 (from tensorflow-federated)
  Downloading attrs-23.2.0-py3-none-any.whl.metadata (9.5 kB)
Collecting dp-accounting==0.4.3 (from tensorflow-federated)
  Downloading dp_accounting-0.4.3-py3-none-any.whl.metadata (1.8 kB)
Collecting google-vizier==0.1.11 (from tensorflow-federated)
  Downloading google_vizier-0.1.11-py3-none-any.whl.metadata (10 kB)
Collecting jaxlib==0.4.14 (from tensorflow-federated)
  Downloading jaxlib-0.4.14-cp310-cp310-manylinux2014_x86_64.whl.metadata (2.0 kB)
Collecting jax==0.4.14 (from tensorflow-federated)
  Downloading jax-0.4.14.tar.gz (1.3 MB)

## 1. Importing Libraries

In [None]:
import tensorflow as tf
import numpy as np
import optuna
import tenseal as ts

* **tensorflow**: For building and training deep learning models.
* **numpy**: For numerical operations on arrays and matrices.
* **optuna**: A library for hyperparameter optimization.
* **tenseal**: A library for homomorphic encryption, used here for the CKKS scheme (encrypted computation on real numbers).

## 2. Defining Constants

In [None]:
num_clients = 5
num_rounds = 10

These constants define how many clients participate in the federated learning process and how many training rounds are performed. In federated learning, multiple clients (devices) contribute to the model's training without sharing raw data. Instead, the updates (gradients) are shared, and here, they are also encrypted to preserve privacy.

## 3. Model Creation Function

In [None]:
def create_model(units):
    model = tf.keras.models.Sequential([
        tf.keras.layers.Flatten(input_shape=(28, 28)),
        tf.keras.layers.Dense(units, activation='relu'),
        tf.keras.layers.Dense(10, activation='softmax')
    ])
    return model

### **Mechanism**:
A simple feed-forward neural network is created.
* Flatten layer: Converts each 28x28 pixel MNIST image into a 1D vector of size 784, the input format the fully connected layers expect.
* Dense (hidden) layer: A fully connected layer with units neurons and ReLU activation. The value of units is the hyperparameter that optuna tunes.
* Dense (output) layer: A softmax layer with 10 neurons that outputs a probability for each of the 10 digit classes (0-9).

The number of hidden units and the learning rate are sampled by optuna in the objective function.
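
To get a feel for how the model size depends on the number of hidden units, the following sketch counts the trainable parameters by hand. The helper name `count_dense_params` is hypothetical and not part of the notebook; the formula simply adds the kernel and bias sizes of the two Dense layers.

```python
def count_dense_params(units, num_inputs=28 * 28, num_classes=10):
    """Trainable parameters of Flatten -> Dense(units) -> Dense(num_classes)."""
    hidden = num_inputs * units + units          # kernel + bias of the hidden layer
    output = units * num_classes + num_classes   # kernel + bias of the output layer
    return hidden + output

# Parameter counts at the two ends and the middle of the search range.
for units in (64, 128, 256):
    print(units, count_dense_params(units))
# 64 50890
# 128 101770
# 256 203530
```

Almost all of the parameters sit in the hidden layer's kernel, which is also the largest tensor that has to be encrypted later on.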

## 4. Preprocessing Data

In [None]:
def preprocess_data():
    (x_train, y_train), (_, _) = tf.keras.datasets.mnist.load_data()
    x_train = x_train / 255.0
    return tf.data.Dataset.from_tensor_slices((x_train, y_train)).batch(16)


### **Mechanism**:
* **Normalization**: The MNIST image data is normalized by dividing pixel values by 255 to scale them between 0 and 1, improving training stability.
* **Batching**: The data is converted into a TensorFlow Dataset and batched in sizes of 16, allowing the model to process smaller chunks of data at once for efficiency.
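
As a rough check on the sizes involved, the sketch below reproduces the normalization and batch-count arithmetic in plain Python. It assumes the standard MNIST training split of 60,000 images; the helper names are illustrative only.

```python
import math

def normalize_pixel(value):
    """Map an 8-bit pixel intensity (0-255) into the range [0, 1]."""
    return value / 255.0

def num_batches(num_examples, batch_size):
    """Number of batches when the last partial batch is kept (the default for batch())."""
    return math.ceil(num_examples / batch_size)

print(normalize_pixel(0), normalize_pixel(255))   # 0.0 1.0
print(num_batches(60_000, 16))                    # 3750
```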

## 5. Create Federated Data

In [None]:
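def create_federated_data(num_clients, batches_per_client=50):
    data = preprocess_data()
    # skip() moves past the batches already assigned to earlier clients,
    # so every client gets its own disjoint slice of the dataset.
    return [data.skip(i * batches_per_client).take(batches_per_client)
            for i in range(num_clients)]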


### **Mechanism**:
This step simulates the federated setting by giving each client a small slice of the MNIST training data.
* Each client receives 50 batches of 16 images (800 images).
* Federated Learning Concept: Instead of pooling all data at a central server, each client holds its own data, computes updates locally, and sends them (here, gradients) to the central server.
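
One way to picture a disjoint partition is as a set of non-overlapping batch index ranges, one per client. The sketch below is a plain-Python illustration; `client_batch_ranges` is a hypothetical helper, and with `tf.data` each range would correspond to `skip(start).take(stop - start)`.

```python
def client_batch_ranges(num_clients, batches_per_client):
    """Return (start, stop) batch indices for each client, with no overlap."""
    return [(i * batches_per_client, (i + 1) * batches_per_client)
            for i in range(num_clients)]

ranges = client_batch_ranges(num_clients=5, batches_per_client=50)
images_per_client = 50 * 16  # 50 batches of 16 images each

print(ranges)             # [(0, 50), (50, 100), (100, 150), (150, 200), (200, 250)]
print(images_per_client)  # 800
```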

## 6. Encryption Initialization

In [None]:
def initialize_encryption():
    context = ts.context(
        ts.SCHEME_TYPE.CKKS,
        poly_modulus_degree=8192,
        coeff_mod_bit_sizes=[60, 40, 40, 60]
    )
    context.global_scale = 2 ** 40
    context.generate_galois_keys()
    return context

### **Mechanism**:
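* CKKS Scheme: CKKS is a homomorphic encryption scheme for approximate arithmetic on encrypted real numbers, which makes it a natural fit for encrypting gradients in federated learning.
* Encryption Context: A context is initialized that specifies the encryption parameters:

    * poly_modulus_degree: The degree of the polynomial ring (8192 here). It fixes how many values a single ciphertext can hold (half the degree, so 4096 slots). Larger values give larger, slower ciphertexts but leave room for a bigger coefficient modulus.
    * coeff_mod_bit_sizes: The bit sizes of the primes that make up the coefficient modulus. The number of primes bounds how many multiplications can be chained, and their sizes, together with the scale, determine the precision.
* Global Scale: The factor (2^40 here) by which real numbers are multiplied before being encoded as integers. It sets the fixed-point precision of encrypted values and matches the 40-bit middle primes.
* Galois Keys: Required for slot rotations, which are used by operations such as summing the elements of an encrypted vector or computing dot products. Element-wise addition of ciphertexts does not need them.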
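
The parameters above translate into a few concrete numbers. The sketch below works them out in plain Python, without calling TenSEAL; `ciphertexts_needed` is a hypothetical helper. It also shows that the flattened hidden-layer gradient is far larger than the slot count, which is consistent with the TenSEAL notice in the run log asking for a larger poly_modulus "to fit your input".

```python
import math

poly_modulus_degree = 8192
coeff_mod_bit_sizes = [60, 40, 40, 60]

slots = poly_modulus_degree // 2            # values packed into one CKKS ciphertext
total_modulus_bits = sum(coeff_mod_bit_sizes)

def ciphertexts_needed(num_values, slots=slots):
    """How many slot-sized chunks a flat vector of num_values entries occupies."""
    return math.ceil(num_values / slots)

print(slots)                          # 4096
print(total_modulus_bits)             # 200
print(ciphertexts_needed(784 * 90))   # 18 (hidden-layer kernel for units=90)
```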

## 7. Encryption/Decryption Functions

In [None]:
def encrypt_tensor(context, tensor):
    tensor = np.array(tensor)
    flat_tensor = tensor.reshape(-1)
    encrypted_tensor = ts.ckks_vector(context, flat_tensor)
    return encrypted_tensor

def decrypt_tensor(context, encrypted_tensor):
    if isinstance(encrypted_tensor, list):
        return [decrypt_tensor(context, item) for item in encrypted_tensor]
    elif isinstance(encrypted_tensor, tuple):
        return tuple(decrypt_tensor(context, item) for item in encrypted_tensor)
    elif hasattr(encrypted_tensor, 'decrypt'):
        return np.array(encrypted_tensor.decrypt())
    else:
        return np.array(encrypted_tensor)


* **Encryption**:
    * The tensor (gradient or weight) is flattened into a 1D array, because ts.ckks_vector encrypts a vector of numbers.
    * The CKKS scheme encrypts this array using the encryption context, producing an encrypted vector that can be transmitted to the server without revealing the values to anyone who lacks the secret key.
* **Decryption**:
    * The encrypted data is decrypted into a flat NumPy array using the same context, which holds the secret key. CKKS is approximate, so the result matches the original values up to a small numerical error, and the caller has to restore the original shape.
    * Decryption happens on the server side after the encrypted updates from clients are collected.
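
Because the encrypted vector is flat, the tensor's shape has to be remembered separately and reapplied after decryption. The sketch below mimics that round trip with plain Python lists standing in for `reshape(-1)` and `reshape(shape)`; no encryption is performed, and both helper names are hypothetical.

```python
def flatten(matrix):
    """Row-major flattening of a 2D list, like tensor.reshape(-1)."""
    return [value for row in matrix for value in row]

def restore(flat, shape):
    """Rebuild a 2D list with the given (rows, cols) shape, like reshape(shape)."""
    rows, cols = shape
    if len(flat) != rows * cols:
        raise ValueError("flat vector does not match the requested shape")
    return [flat[r * cols:(r + 1) * cols] for r in range(rows)]

gradient = [[0.1, -0.2, 0.3], [0.0, 0.5, -0.4]]
flat = flatten(gradient)              # what would be handed to the encryption step
restored = restore(flat, (2, 3))      # what the receiver rebuilds after decryption
print(restored == gradient)           # True
```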

## 8. Federated Averaging

In [None]:
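def federated_averaging(model, federated_data, context, learning_rate=1.0):
    # learning_rate scales the averaged gradient; 1.0 gives the plain `w - g` step.
    loss_fn = tf.keras.losses.SparseCategoricalCrossentropy()

    def client_update(model, data):
        # Sum the gradients of every local batch, then average them.
        grad_sums = [np.zeros_like(v.numpy()) for v in model.trainable_variables]
        num_batches = 0
        for x_batch, y_batch in data:
            # Open a fresh tape per batch so each batch contributes its own gradient.
            with tf.GradientTape() as tape:
                y_pred = model(x_batch, training=True)
                loss = loss_fn(y_batch, y_pred)  # scalar mean loss over the batch
            gradients = tape.gradient(loss, model.trainable_variables)
            for total, g in zip(grad_sums, gradients):
                total += g.numpy()
            num_batches += 1
        mean_gradients = [total / max(num_batches, 1) for total in grad_sums]
        # Encrypt each per-variable gradient before it leaves the client.
        return [encrypt_tensor(context, g) for g in mean_gradients]

    def server_aggregate(encrypted_gradients):
        # zip(*...) groups the ciphertexts of the same variable across clients.
        decrypted_gradients = [decrypt_tensor(context, g) for g in zip(*encrypted_gradients)]
        # Average the flat, decrypted vectors across clients.
        avg_gradients = [np.mean(grad, axis=0) for grad in decrypted_gradients]
        return avg_gradients

    initial_weights = model.get_weights()
    for round_num in range(num_rounds):
        encrypted_updates = []
        for data in federated_data:
            # Every client starts the round from the current global weights.
            client_model = create_model(model.layers[1].units)
            client_model.set_weights(initial_weights)
            encrypted_updates.append(client_update(client_model, data))
        aggregated_gradients = server_aggregate(encrypted_updates)

        new_weights = []
        for w, g in zip(initial_weights, aggregated_gradients):
            # Decrypted gradients are flat; restore each variable's shape.
            g = g.reshape(w.shape)
            new_weights.append(w - learning_rate * g)

        model.set_weights(new_weights)
        initial_weights = new_weights

    return model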

* **federated_averaging**: Implements the core of federated learning with encrypted updates.
* Client Update:
    * Each client starts from the current global weights and computes gradients of the loss on its own dataset using TensorFlow’s GradientTape. No local weight updates are performed on the client.
    * These gradients are then encrypted with CKKS before being sent back to the server.
* Server Aggregation:
    * The server decrypts the encrypted gradients from all clients.
    * It then averages the gradients across clients and subtracts the average from the global weights. Because the average is unweighted, every client contributes equally to the update.
    * Averaging is the key mechanism in federated learning: the model improves collaboratively without the raw data ever being centralized.

## 9. Objective Function for Optuna

In [None]:
def objective(trial):
    # Hyperparameters sampled by Optuna for this trial.
    units = trial.suggest_int('units', 64, 256)
    # Sample the learning rate on a log scale.
    learning_rate = trial.suggest_float('learning_rate', 1e-5, 1e-1, log=True)

    model = create_model(units)
    # compile() is required so that evaluate() can report accuracy.
    model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=learning_rate),
                  loss='sparse_categorical_crossentropy',
                  metrics=['accuracy'])

    federated_data = create_federated_data(num_clients)
    context = initialize_encryption()
    trained_model = federated_averaging(model, federated_data, context)

    # Evaluate on the first batch (16 images) returned by preprocess_data().
    test_data = preprocess_data().take(1)
    performance_metric = trained_model.evaluate(test_data, return_dict=True)['accuracy']

    return performance_metric

### **Mechanism**:
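This function defines the objective that optuna maximizes.
* Units and learning rate are suggested by optuna as hyperparameters to tune. The learning rate configures the Adam optimizer passed to compile, but federated_averaging applies the averaged gradients to the weights itself rather than through that optimizer, so the sampled learning rate does not affect training as written.
* The model is compiled and trained using federated averaging.
* After training, the model is evaluated on preprocess_data().take(1), a single batch of 16 images drawn from the training set rather than a held-out test set. The returned accuracy therefore moves in steps of 1/16 = 0.0625.
* This accuracy is the performance metric for the optuna study, which tries different hyperparameter configurations over several trials.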
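
Since the objective evaluates on `preprocess_data().take(1)`, which is one batch of 16 images, the accuracy it returns can only take a handful of values. The short sketch below enumerates them and maps a reported accuracy back to a count of correct predictions; `correct_predictions` is a hypothetical helper.

```python
eval_examples = 16  # one batch of 16 images, as produced by preprocess_data().take(1)

# Every accuracy the evaluation can report: 0/16, 1/16, ..., 16/16.
possible_accuracies = [correct / eval_examples for correct in range(eval_examples + 1)]
print(possible_accuracies[:4])   # [0.0, 0.0625, 0.125, 0.1875]

def correct_predictions(accuracy, n=eval_examples):
    """Recover the number of correct predictions behind a reported accuracy."""
    return round(accuracy * n)

print(correct_predictions(0.0625), correct_predictions(0.125))   # 1 2
```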

## 10. Running Optuna Study

In [None]:
import warnings
warnings.filterwarnings("ignore")
study = optuna.create_study(direction='maximize')
study.optimize(objective, n_trials=10)
print(f"Best hyperparameters: {study.best_params}")
print(f"Best accuracy: {study.best_value}")

[I 2024-09-19 18:15:09,965] A new study created in memory with name: no-name-5e8d42d6-5e0a-4f25-a011-6ac755ca8b5d


The following operations are disabled in this setup: matmul, matmul_plain, enc_matmul_plain, conv2d_im2col.
If you need to use those operations, try increasing the poly_modulus parameter, to fit your input.

[I 2024-09-19 18:15:45,599] Trial 0 finished with value: 0.0625 and parameters: {'units': 221, 'learning_rate': 7.679563994524996e-05}. Best is trial 0 with value: 0.0625.



[I 2024-09-19 18:16:16,584] Trial 1 finished with value: 0.0625 and parameters: {'units': 175, 'learning_rate': 0.08792827304795463}. Best is trial 0 with value: 0.0625.



[I 2024-09-19 18:16:44,559] Trial 2 finished with value: 0.0625 and parameters: {'units': 148, 'learning_rate': 9.308977420429588e-05}. Best is trial 0 with value: 0.0625.



[I 2024-09-19 18:17:08,881] Trial 3 finished with value: 0.0625 and parameters: {'units': 104, 'learning_rate': 0.01767955932618649}. Best is trial 0 with value: 0.0625.



[I 2024-09-19 18:17:36,317] Trial 4 finished with value: 0.0 and parameters: {'units': 139, 'learning_rate': 0.0001603913519285198}. Best is trial 0 with value: 0.0625.



[I 2024-09-19 18:18:00,573] Trial 5 finished with value: 0.0 and parameters: {'units': 108, 'learning_rate': 0.024226323084575008}. Best is trial 0 with value: 0.0625.



[I 2024-09-19 18:18:30,083] Trial 6 finished with value: 0.0625 and parameters: {'units': 147, 'learning_rate': 0.011410007115614124}. Best is trial 0 with value: 0.0625.



[I 2024-09-19 18:18:52,514] Trial 7 finished with value: 0.125 and parameters: {'units': 90, 'learning_rate': 0.00019876801041687414}. Best is trial 7 with value: 0.125.



[I 2024-09-19 18:19:19,133] Trial 8 finished with value: 0.0625 and parameters: {'units': 133, 'learning_rate': 0.040138966288066964}. Best is trial 7 with value: 0.125.



[I 2024-09-19 18:19:53,357] Trial 9 finished with value: 0.0625 and parameters: {'units': 228, 'learning_rate': 0.016890425575250285}. Best is trial 7 with value: 0.125.


Best hyperparameters: {'units': 90, 'learning_rate': 0.00019876801041687414}
Best accuracy: 0.125


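* Optuna creates a study to maximize the accuracy metric.
* It runs 10 trials, each with a different set of hyperparameters (hidden units and learning rate), and reports the combination with the highest accuracy.
* The best value in this run, 0.125, corresponds to 2 correct predictions out of the 16 evaluation images, which is close to the 10% chance level for 10 classes.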
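
To make the search loop concrete, here is a deliberately simplified stand-in written with the standard library only. It uses plain random search, whereas Optuna's default sampler is more sophisticated, and the toy score function is an assumption that replaces the federated training run. All names in the block are hypothetical.

```python
import math
import random

def sample_params(rng):
    """Draw one configuration from the same ranges used in the objective."""
    units = rng.randint(64, 256)                   # inclusive integer range
    learning_rate = 10 ** rng.uniform(-5.0, -1.0)  # log-uniform between 1e-5 and 1e-1
    return {"units": units, "learning_rate": learning_rate}

def random_search(score_fn, n_trials, seed=0):
    """Keep the best-scoring configuration, mirroring direction='maximize'."""
    rng = random.Random(seed)
    best_params, best_value = None, -math.inf
    for _ in range(n_trials):
        params = sample_params(rng)
        value = score_fn(params)
        if value > best_value:
            best_params, best_value = params, value
    return best_params, best_value

def toy_score(params):
    """Toy score standing in for the training run: prefers roughly 128 units."""
    return -abs(params["units"] - 128)

best_params, best_value = random_search(toy_score, n_trials=10)
print(best_params, best_value)
```

Sampling the exponent uniformly, rather than the learning rate itself, gives each order of magnitude between 1e-5 and 1e-1 the same chance of being tried.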

## 11. Testing Encryption

In [None]:
context = initialize_encryption()
sample_tensor = np.array([1.0, 2.0, 3.0])
encrypted_tensor = encrypt_tensor(context, sample_tensor)
decrypted_tensor = decrypt_tensor(context, encrypted_tensor)

print(f"Original: {sample_tensor}")
print(f"Encrypted: {encrypted_tensor}")
print(f"Decrypted: {decrypted_tensor}")


Original: [1. 2. 3.]
Encrypted: <tenseal.tensors.ckksvector.CKKSVector object at 0x7acb6bf30100>
Decrypted: [1. 2. 3.]


### **Mechanism Recap:**
1. Federated Learning: Multiple clients hold local data, compute updates to a shared model locally, and send encrypted updates to a central server.
2. Homomorphic Encryption: Gradients are encrypted with CKKS before being sent to the server, so neither raw client data nor plaintext gradients are exposed in transit. In this notebook the server decrypts each update before averaging; averaging the ciphertexts directly would be needed to keep individual updates hidden from the server.
3. Federated Averaging: The server decrypts the updates, averages the gradients, and updates the global model iteratively over several rounds.
4. Hyperparameter Optimization: Optuna tries different model configurations and keeps the one with the highest accuracy.