# Federated Learning with Differential Privacy

Please make sure you set up a virtual environment and follow the [example root readme](../../README.md) before starting this notebook.
Then, install the requirements.

<div class="alert alert-block alert-info"> <b>NOTE</b> Some of the cells below generate long text output.  We're using <pre>%%capture --no-display --no-stderr cell_output</pre> to suppress this output.  Comment or delete this line in the cells below to restore full output.</div>

In [1]:
%%capture --no-display --no-stderr cell_output
import sys
!{sys.executable} -m pip install -r requirements.txt

### Differential Privacy (DP)
[Differential Privacy (DP)](https://arxiv.org/abs/1910.00962) [7] is a rigorous mathematical framework designed to provide strong privacy guarantees when handling sensitive data. In the context of Federated Learning (FL), DP plays a crucial role in safeguarding user information by introducing randomness into the training process. Specifically, it ensures privacy by adding carefully calibrated noise to the model updates—such as gradients or weights—before they are transmitted from clients to the central server. This obfuscation mechanism makes it statistically difficult to infer whether any individual data point contributed to a particular update, thereby protecting user-specific information.

By integrating DP into FL, even if an adversary gains access to the aggregated updates or models, the added noise prevents them from accurately deducing sensitive details about any individual client's data. Common approaches include:

1. **local differential privacy (LDP)**, where noise is added directly on the client side before updates are sent, and
2. **global differential privacy (GDP)**, where noise is injected after aggregation at the server.

The balance between privacy and model utility is typically managed through a privacy budget (ϵ): a smaller ϵ requires more noise, which gives stronger privacy but usually lowers model accuracy.
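
To make the difference between the two approaches concrete, here is a minimal NumPy sketch that uses the Laplace mechanism as a stand-in for the noise step. It is not the mechanism used by the NVFlare filters in this notebook. The helper names, the toy updates, and the `SENSITIVITY` value are all assumptions chosen for illustration; in practice the sensitivity depends on how the updates are clipped.

```python
import numpy as np


def laplace_scale(sensitivity, epsilon):
    """Noise scale of the Laplace mechanism: grows as epsilon shrinks."""
    return sensitivity / epsilon


def add_laplace_noise(update, sensitivity, epsilon, rng):
    """Return a copy of `update` perturbed with Laplace noise."""
    scale = laplace_scale(sensitivity, epsilon)
    return update + rng.laplace(loc=0.0, scale=scale, size=update.shape)


rng = np.random.default_rng(seed=0)

# Toy model updates from two clients (illustrative values only).
client_updates = [rng.normal(scale=0.01, size=4) for _ in range(2)]
SENSITIVITY = 0.01  # assumed value, not derived from a real clipping rule

# Local DP: every client perturbs its own update before sending it.
noisy_updates = [add_laplace_noise(u, SENSITIVITY, 1.0, rng) for u in client_updates]
local_dp_avg = np.mean(noisy_updates, axis=0)

# Global DP: the server averages the raw updates, then adds noise once.
global_dp_avg = add_laplace_noise(np.mean(client_updates, axis=0), SENSITIVITY, 1.0, rng)

# A smaller epsilon means a larger noise scale.
print(laplace_scale(SENSITIVITY, 0.1) > laplace_scale(SENSITIVITY, 1.0))
```

In the local variant the server never sees an unperturbed update, whereas the global variant requires clients to trust the server with their raw updates.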


This example shows the usage of a CIFAR-10 training code with NVFlare, as well as the usage of **local** DP filters in your FL training. Here, we use the "Sparse Vector Technique", i.e. the [SVTPrivacy](https://nvflare.readthedocs.io/en/main/apidocs/nvflare.app_common.filters.svt_privacy.html) protocol, as utilized in [Li et al. 2019](https://arxiv.org/abs/1910.00962) [7] (see [Lyu et al. 2016](https://arxiv.org/abs/1603.01699) [8] for more information). 
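
The idea behind this kind of filter can be sketched in a few lines of NumPy. The function below is a simplified illustration only: it is not NVFlare's `SVTPrivacy` implementation, and it does not by itself establish a formal DP guarantee. The argument names mirror those passed to the filter later in this notebook, but how each one is used here (including the `tau` threshold and the noise scales) is an assumption.

```python
import numpy as np


def svt_style_filter(delta_w, fraction=0.9, epsilon=0.1, noise_var=0.1,
                     gamma=1e-5, tau=1e-6, seed=0):
    """Share a noisy, clipped subset of a weight-difference vector (sketch)."""
    rng = np.random.default_rng(seed)
    delta_w = np.asarray(delta_w, dtype=np.float64).ravel()
    n_share = int(np.ceil(fraction * delta_w.size))

    # Clip every entry to [-gamma, gamma] to bound its influence.
    clipped = np.clip(delta_w, -gamma, gamma)

    # Compare noisy magnitudes against a noisy threshold until enough
    # entries have been accepted.
    threshold = tau + rng.laplace(scale=2.0 * gamma / epsilon)
    candidates = np.arange(delta_w.size)
    accepted = []
    while len(accepted) < n_share:
        query_noise = rng.laplace(scale=4.0 * gamma / epsilon, size=candidates.size)
        above = np.abs(clipped[candidates]) + query_noise >= threshold
        accepted.extend(candidates[above].tolist())
        candidates = candidates[~above]
    accepted = np.array(accepted[:n_share], dtype=int)

    # Release noisy values for accepted entries; all other entries stay zero.
    out = np.zeros_like(delta_w)
    release_noise = rng.laplace(scale=2.0 * gamma / noise_var, size=accepted.size)
    out[accepted] = np.clip(clipped[accepted] + release_noise, -gamma, gamma)
    return out


delta = np.linspace(-1e-4, 1e-4, 50)  # toy weight differences
filtered = svt_style_filter(delta, fraction=0.5, gamma=1e-5)
print(np.count_nonzero(filtered), "of", filtered.size, "entries shared")
```

In this sketch, `fraction` bounds how many entries are shared, `gamma` bounds the magnitude of every shared value, and `epsilon` and `noise_var` control how much noise enters the selection and the released values.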

DP is added as a filter using the [FedJob API](https://nvflare.readthedocs.io/en/main/programming_guide/fed_job_api.html#fedjob-api) you should have seen in prior chapters.

## Run experiments with FL simulator
The FL simulator is used to simulate FL experiments or debug code, not for real FL deployment.

First, train a model using the FedAvg algorithm with two clients without DP.

#### 0. Download the CIFAR-10 data
We download the CIFAR-10 dataset once up front so that clients do not overwrite each other's local dataset during this simulation.

In [2]:
import torchvision
DATASET_PATH = "/tmp/nvflare/data"
torchvision.datasets.CIFAR10(root=DATASET_PATH, train=True, download=True)
torchvision.datasets.CIFAR10(root=DATASET_PATH, train=False, download=True)

Dataset CIFAR10
    Number of datapoints: 10000
    Root location: /tmp/nvflare/data
    Split: Test

#### 1. Define a FedJob
The `FedJob` is used to define how controllers and executors are placed within a federated job using the `to(object, target)` routine.

Here we use a PyTorch `BaseFedJob`, where we can define the job name and the initial global model.
The `BaseFedJob` automatically configures components for model persistence, model selection, and TensorBoard streaming for convenience.

In [3]:
from src.net import Net

from nvflare.app_common.workflows.fedavg import FedAvg
from nvflare.app_opt.pt.job_config.base_fed_job import BaseFedJob
from nvflare.job_config.script_runner import ScriptRunner

job = BaseFedJob(
    name="cifar10_fedavg",
    initial_model=Net(),
)

#### 2. Define the Controller Workflow
Define the controller workflow and send it to the server. For simplicity, we run the simulation for only a few rounds, but you can increase the number of rounds to let the models converge.

In [4]:
n_clients = 2
num_rounds = 30  # increase for better convergence

controller = FedAvg(
    num_clients=n_clients,
    num_rounds=num_rounds,
)
job.to(controller, "server")

That completes the components that need to be defined on the server.
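
For intuition, the aggregation step of FedAvg can be sketched as a weighted average of the client weights. This is only the arithmetic, not the NVFlare `FedAvg` controller. The function name is hypothetical, and weighting by local sample count is an assumption; the controller's exact weighting may differ.

```python
import numpy as np


def fedavg_aggregate(client_weights, client_sizes):
    """Average per-layer weights, weighted by each client's sample count."""
    total = float(sum(client_sizes))
    return {
        name: sum(w[name] * (n / total) for w, n in zip(client_weights, client_sizes))
        for name in client_weights[0]
    }


# Two toy clients with a single "layer" each (illustrative values).
site_1 = {"w": np.array([1.0, 3.0])}
site_2 = {"w": np.array([3.0, 5.0])}
new_global = fedavg_aggregate([site_1, site_2], client_sizes=[100, 300])
print(new_global["w"])  # [2.5 4.5]
```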

#### 3. Add clients
Next, we can use the `ScriptRunner` and send it to each of the clients to run our training script.

Note that our script could have additional input arguments, such as batch size or data path, but we don't use them here for simplicity.
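
As an illustration of what such arguments could look like, here is a minimal `argparse` sketch for a client training script. The flag names `--batch_size` and `--data_path` and their defaults are hypothetical and are not taken from `src/cifar10_fl.py`. Assuming the installed NVFlare version's `ScriptRunner` accepts a `script_args` string, the flags could be passed that way.

```python
import argparse


def parse_args(argv=None):
    """Parse hypothetical command-line options for a client training script."""
    parser = argparse.ArgumentParser(description="Client training options (sketch)")
    parser.add_argument("--batch_size", type=int, default=32)
    parser.add_argument("--data_path", type=str, default="/tmp/nvflare/data")
    return parser.parse_args(argv)


# Simulate the script being launched with "--batch_size 64".
args = parse_args(["--batch_size", "64"])
print(args.batch_size, args.data_path)  # 64 /tmp/nvflare/data
```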

In [5]:
for i in range(n_clients):
    runner = ScriptRunner(
        script="src/cifar10_fl.py"
    )
    job.to(runner, f"site-{i+1}")

That's it!

#### 4. Optionally export the job
Now, we could export the job and submit it to a real NVFlare deployment using the [Admin client](https://nvflare.readthedocs.io/en/main/real_world_fl/operation.html) or [FLARE API](https://nvflare.readthedocs.io/en/main/real_world_fl/flare_api.html).

In [6]:
job.export_job("job_configs")

#### 5. Run FL Simulation
Finally, we can run our FedJob in simulation using NVFlare's [simulator](https://nvflare.readthedocs.io/en/main/user_guide/nvflare_cli/fl_simulator.html) under the hood. When simulating on a machine with GPUs, the `gpu` argument of `simulator_run` specifies which GPU should be used to run the clients.

The results will be saved in the specified `workdir`.

In [7]:
job.simulator_run(f"/tmp/nvflare/{job.name}")

2025-02-25 19:48:49,373 - IntimeModelSelector - INFO - model selection weights control: {}
2025-02-25 19:48:50,698 - TBAnalyticsReceiver - INFO - Tensorboard records can be found in /tmp/nvflare/cifar10_fedavg/server/simulate_job/tb_events you can view it using `tensorboard --logdir=/tmp/nvflare/cifar10_fedavg/server/simulate_job/tb_events`
2025-02-25 19:48:50,700 - FedAvg - INFO - Initializing BaseModelController workflow.
2025-02-25 19:48:50,700 - FedAvg - INFO - Beginning model controller run.
2025-02-25 19:48:50,701 - FedAvg - INFO - Start FedAvg.
2025-02-25 19:48:50,701 - FedAvg - INFO - loading initial model from persistor
2025-02-25 19:48:50,701 - PTFileModelPersistor - INFO - Both source_ckpt_file_full_name and ckpt_preload_path are not provided. Using the default model weights initialized on the persistor side.
2025-02-25 19:48:50,702 - FedAvg - INFO - Round 0 started.
2025-02-25 19:48:50,703 - FedAvg

#### 6. Run FL Simulation with DP
Run the FL simulator with two clients for federated learning with differential privacy. The key now is to add a filter to each client, using the `job.to()` method, that applies DP before the model updates are sent back to the server.

Let's create a new FedJob with DP added through the [SVTPrivacy](https://nvflare.readthedocs.io/en/main/apidocs/nvflare.app_common.filters.html#nvflare.app_common.filters.SVTPrivacy) filter.

> **Note:** Use `filter_type=FilterType.TASK_RESULT` as we are adding the filter on top of the model updates after local training.

In [12]:
from nvflare import FilterType
from nvflare.client.config import TransferType
from nvflare.app_common.filters import SVTPrivacy

# Create BaseFedJob with the initial model
job = BaseFedJob(
    name="cifar10_fedavg_dp",
    initial_model=Net(),
)

# Define the controller and send to server
controller = FedAvg(
    num_clients=n_clients,
    num_rounds=100,  # instead of the num_rounds value (30) used in the non-DP run
)
job.to_server(controller)

# Add clients
for i in range(n_clients):
    runner = ScriptRunner(
        script="src/cifar10_fl.py",
        params_transfer_type=TransferType.DIFF
    )
    job.to(runner, f"site-{i+1}")

    # Add the privacy filter. Alternative settings that were tried are kept commented out.
    # dp_filter = SVTPrivacy(fraction=0.9, epsilon=0.001, noise_var=1.0, gamma=1e-4)
    # dp_filter = SVTPrivacy(fraction=0.1, epsilon=0.1, noise_var=0.1, gamma=1e-5)  # low performance
    # dp_filter = SVTPrivacy(fraction=0.9, epsilon=0.1, noise_var=1.0, gamma=1e-4)  # works, but not always
    dp_filter = SVTPrivacy(fraction=0.9, epsilon=0.1, noise_var=0.1, gamma=1e-5)  # tentative choice
    # dp_filter = SVTPrivacy(fraction=0.9, epsilon=5.0, noise_var=0.01, gamma=1e-5)
    # dp_filter = SVTPrivacy(fraction=0.95, epsilon=0.1, noise_var=0.01, gamma=1e-5)
    job.to(dp_filter, f"site-{i+1}", tasks=["train"], filter_type=FilterType.TASK_RESULT)

# Optionally export the configuration
job.export_job("job_configs")
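
The job above requests `TransferType.DIFF`, so the filter operates on weight differences rather than full model weights. The arithmetic can be sketched as follows. The helper names and the toy weights are hypothetical, and NVFlare performs these steps internally.

```python
import numpy as np


def compute_diff(global_weights, local_weights):
    """Client side: locally trained weights minus the received global weights."""
    return {k: local_weights[k] - global_weights[k] for k in global_weights}


def apply_diff(global_weights, diff):
    """Server side: add an (aggregated) difference to the global weights."""
    return {k: global_weights[k] + diff[k] for k in global_weights}


global_w = {"fc.weight": np.array([0.5, -0.25])}
local_w = {"fc.weight": np.array([0.75, -0.5])}

diff = compute_diff(global_w, local_w)
print(diff["fc.weight"])  # [ 0.25 -0.25]

# With a single client and no filtering, applying the diff recovers local_w.
updated = apply_diff(global_w, diff)
print(np.allclose(updated["fc.weight"], local_w["fc.weight"]))  # True
```

Differences are typically much smaller in magnitude than the weights themselves, which is relevant when a filter clips values to a small range.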

Next, start the training.

In [None]:
job.simulator_run(f"/tmp/nvflare/{job.name}")

2025-02-25 20:04:32,756 - IntimeModelSelector - INFO - model selection weights control: {}
2025-02-25 20:04:34,079 - TBAnalyticsReceiver - INFO - Tensorboard records can be found in /tmp/nvflare/cifar10_fedavg_dp/server/simulate_job/tb_events you can view it using `tensorboard --logdir=/tmp/nvflare/cifar10_fedavg_dp/server/simulate_job/tb_events`
2025-02-25 20:04:34,081 - FedAvg - INFO - Initializing BaseModelController workflow.
2025-02-25 20:04:34,082 - FedAvg - INFO - Beginning model controller run.
2025-02-25 20:04:34,082 - FedAvg - INFO - Start FedAvg.
2025-02-25 20:04:34,082 - FedAvg - INFO - loading initial model from persistor
2025-02-25 20:04:34,083 - PTFileModelPersistor - INFO - Both source_ckpt_file_full_name and ckpt_preload_path are not provided. Using the default model weights initialized on the persistor side.
2025-02-25 20:04:34,084 - FedAvg - INFO - Round 0 started.
2025-02-25 20:04:34,084 - 

> **Note:** You can also combine this filter with other privacy filters or customize them. For example, use the [PercentilePrivacy](https://nvflare.readthedocs.io/en/main/apidocs/nvflare.app_common.filters.html#nvflare.app_common.filters.PercentilePrivacy) filter based on Shokri and Shmatikov ([Privacy-preserving deep learning, CCS '15](https://dl.acm.org/doi/abs/10.1145/2810103.2813687)) or the [ExcludeVars](https://nvflare.readthedocs.io/en/main/apidocs/nvflare.app_common.filters.html#nvflare.app_common.filters.ExcludeVars) filter to exclude variables that shouldn't be shared with the server.

#### 7. Visualize the results
You can plot the results by running `tensorboard --logdir /tmp/nvflare` in a new terminal.

![TensorBoard Training curve of FedAvg without and with DP](tb_curve.png)