# Differential Privacy in flwr Part 2
## Step by step guide

In this tutorial, we will learn the recommended best practices to build an effective differential privacy setting in federated learning using Flower. 

This document provides a step-by-step guide, code examples, and best practices for implementing user-level differential privacy guarantees in federated learning systems.

You may want to look at the previous tutorial, [part 1](Flower-5-Differential-Privacy-in-flwr-part-1.ipynb), the introductory notebook on Differential Privacy in federated learning (again, using [Flower](https://flower.dev/) and [PyTorch](https://pytorch.org/)).


> [Star Flower on GitHub](https://github.com/adap/flower) ⭐️ and join the Flower community on Slack to connect, ask questions, and get help: [Join Slack](https://flower.dev/join-slack) 🌼 We'd love to hear from you in the `#introductions` channel! And if anything is unclear, head over to the `#questions` channel.


In this notebook, we will demonstrate the recommended best practice for training models with user-level Differential Privacy using Flower.

We will train a model (30 rounds) with a good trade-off between utility and privacy on CIFAR-10, with the training set partitioned into ten smaller datasets, one per client. With more training rounds we could likely reach a higher-accuracy private model, though still not as accurate as a model trained without DP.

Let's get started!

### Installing dependencies

Next, we install the necessary packages for PyTorch (`torch` and `torchvision`) and Flower (`flwr`):

In [2]:
cd ../../..

/workspaces/flower


In [3]:
pip install -e . 

Defaulting to user installation because normal site-packages is not writeable


Obtaining file:///workspaces/flower
  Installing build dependencies ... done
  Checking if build backend supports build_editable ... done
  Getting requirements to build editable ... done
  Preparing editable metadata (pyproject.toml) ... done
Building wheels for collected packages: flwr
  Building editable for flwr (pyproject.toml) ... done
  Created wheel for flwr: filename=flwr-1.5.0-py3-none-any.whl size=9086 sha256=a2fa1e5fbe2225df5f87b0e5a3bb8193854278a115ab68ab70faca6e7f592ffb
  Stored in directory: /tmp/pip-ephem-wheel-cache-v_gt_j84/wheels/d4/ef/3f/b808d09666e16dcbbe3c9cbf7f3d223e3acb30348d118fd5d1
Successfully built flwr
Installing collected packages: flwr
  Attempting uninstall: flwr
    Found existing installation: flwr 1.5.0
    Uninstalling flwr-1.5.0:
      Successfully uninstalled flwr-1.5.0
Successfully installed flwr-1.5.0

In [4]:
# !pip install  flwr[simulation] torch torchvision matplotlib

Now that we have all dependencies installed, we can import everything we need for this tutorial:

In [5]:
from collections import OrderedDict
from typing import Dict, Optional, List, Tuple

import matplotlib.pyplot as plt
import numpy as np
import torch
import torch.nn as nn
import torch.nn.functional as F
import torchvision
import torchvision.transforms as transforms
from torch.utils.data import DataLoader, random_split
from torchvision.datasets import CIFAR10

import flwr as fl
# import tensorflow_privacy as tfp
from flwr.common import Metrics


DEVICE = torch.device("cpu")  # Try "cuda" to train on GPU
print(
    f"Training on {DEVICE} using PyTorch {torch.__version__} and Flower {fl.__version__}"
)

Training on cpu using PyTorch 2.0.1+cu117 and Flower 1.5.0



### Downloading and preprocessing the data

We will train a convolutional neural network (CNN) on the popular CIFAR-10 dataset (10 classes) in a federated learning setting with 10 clients.

In [6]:
#Setting our constants

NUM_CLIENTS = 10
BATCH_SIZE = 32
NUM_ROUNDS = 15


def load_datasets():
    # Download and transform CIFAR-10 (train and test)
    transform = transforms.Compose(
        [transforms.ToTensor(), transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))]
    )
    trainset = CIFAR10("./dataset", train=True, download=True, transform=transform)
    testset = CIFAR10("./dataset", train=False, download=True, transform=transform)

    # Split training set into 10 partitions to simulate the individual dataset
    partition_size = len(trainset) // NUM_CLIENTS
    lengths = [partition_size] * NUM_CLIENTS
    datasets = random_split(trainset, lengths, torch.Generator().manual_seed(42))

    # Split each partition into train/val and create DataLoader
    trainloaders = []
    valloaders = []
    for ds in datasets:
        len_val = len(ds) // 10  # 10 % validation set
        len_train = len(ds) - len_val
        lengths = [len_train, len_val]
        ds_train, ds_val = random_split(ds, lengths, torch.Generator().manual_seed(42))
        trainloaders.append(DataLoader(ds_train, batch_size=BATCH_SIZE, shuffle=True))
        valloaders.append(DataLoader(ds_val, batch_size=BATCH_SIZE))
    testloader = DataLoader(testset, batch_size=BATCH_SIZE)
    return trainloaders, valloaders, testloader


trainloaders, valloaders, testloader = load_datasets()


Files already downloaded and verified
Files already downloaded and verified
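As a quick sanity check on the split above, the following sketch reproduces the integer arithmetic of `load_datasets()`. It assumes the standard CIFAR-10 training set of 50,000 images; the helper name `partition_sizes` is hypothetical and not part of the notebook.

```python
def partition_sizes(num_train: int, num_clients: int, val_divisor: int = 10):
    """Return (train, val) sizes per client, mirroring load_datasets()."""
    partition_size = num_train // num_clients   # samples per client
    len_val = partition_size // val_divisor     # 10 % held out for validation
    len_train = partition_size - len_val        # the rest is used for training
    return len_train, len_val


train_size, val_size = partition_sizes(50_000, 10)
print(train_size, val_size)  # 4500 500
```

Each of the ten simulated clients therefore trains on 4,500 images and validates on 500.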


### Model, Training and Evaluation functions

Let's define our model (including `set_parameters` and `get_parameters`), training and test functions:

In [7]:
class Net(nn.Module):
    def __init__(self) -> None:
        super(Net, self).__init__()
        self.conv1 = nn.Conv2d(3, 6, 5)
        self.pool = nn.MaxPool2d(2, 2)
        self.conv2 = nn.Conv2d(6, 16, 5)
        self.fc1 = nn.Linear(16 * 5 * 5, 120)
        self.fc2 = nn.Linear(120, 84)
        self.fc3 = nn.Linear(84, 10)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = self.pool(F.relu(self.conv1(x)))
        x = self.pool(F.relu(self.conv2(x)))
        x = x.view(-1, 16 * 5 * 5)
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = self.fc3(x)
        return x


def get_parameters(net) -> List[np.ndarray]:
    return [val.cpu().numpy() for _, val in net.state_dict().items()]


def set_parameters(net, parameters: List[np.ndarray]):
    params_dict = zip(net.state_dict().keys(), parameters)
    state_dict = OrderedDict({k: torch.Tensor(v) for k, v in params_dict})
    net.load_state_dict(state_dict, strict=True)


def train(net, trainloader, epochs: int):
    """Train the network on the training set."""
    criterion = torch.nn.CrossEntropyLoss()
    optimizer = torch.optim.Adam(net.parameters())
    net.train()
    for epoch in range(epochs):
        correct, total, epoch_loss = 0, 0, 0.0
        for images, labels in trainloader:
            images, labels = images.to(DEVICE), labels.to(DEVICE)
            optimizer.zero_grad()
            outputs = net(images)
            loss = criterion(outputs, labels)  # reuse the forward pass above
            loss.backward()
            optimizer.step()
            # Metrics
            epoch_loss += loss.item()  # accumulate a float, not a tensor with its graph
            total += labels.size(0)
            correct += (torch.max(outputs.data, 1)[1] == labels).sum().item()
        epoch_loss /= len(trainloader.dataset)
        epoch_acc = correct / total
        print(f"Epoch {epoch+1}: train loss {epoch_loss}, accuracy {epoch_acc}")


def test(net, testloader):
    """Evaluate the network on the entire test set."""
    criterion = torch.nn.CrossEntropyLoss()
    correct, total, loss = 0, 0, 0.0
    net.eval()
    with torch.no_grad():
        for images, labels in testloader:
            images, labels = images.to(DEVICE), labels.to(DEVICE)
            outputs = net(images)
            loss += criterion(outputs, labels).item()
            _, predicted = torch.max(outputs.data, 1)
            total += labels.size(0)
            correct += (predicted == labels).sum().item()
    loss /= len(testloader.dataset)
    accuracy = correct / total
    return loss, accuracy
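The `16 * 5 * 5` input size of `fc1` in `Net` follows from how the convolution and pooling layers shrink a 32x32 CIFAR-10 image. The sketch below walks through that arithmetic; the helper `conv_out` is a hypothetical name used only for illustration, and it assumes no padding and no dilation, as in the model above.

```python
def conv_out(size: int, kernel: int, stride: int = 1, padding: int = 0) -> int:
    """Output width/height of a conv or pooling layer (no dilation)."""
    return (size + 2 * padding - kernel) // stride + 1


size = 32                             # CIFAR-10 images are 32x32
size = conv_out(size, kernel=5)       # conv1 (5x5 kernel): 32 -> 28
size = conv_out(size, 2, stride=2)    # max pool 2x2:       28 -> 14
size = conv_out(size, kernel=5)       # conv2 (5x5 kernel): 14 -> 10
size = conv_out(size, 2, stride=2)    # max pool 2x2:       10 -> 5

flat_features = 16 * size * size      # 16 output channels from conv2
print(flat_features)  # 400
```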

### Flower client

We create a subclass of `flwr.client.DPFedAvgNumPyClient` and implement the three methods `get_parameters`, `fit`, and `evaluate`. We also pass the `cid` to the client and use it to log additional details:

In [8]:
class FlowerClient (fl.client.DPFedAvgNumPyClient):
    def __init__(self, cid, net, trainloader, valloader):
        self.cid = cid
        self.net = net
        self.trainloader = trainloader
        self.valloader = valloader

    def get_parameters(self, config):
        print(f"[Client {self.cid}] get_parameters")
        return get_parameters(self.net)

    def fit(self, parameters, config):
        print(f"[Client {self.cid}] fit, config: {config}")
        set_parameters(self.net, parameters)
        train(self.net, self.trainloader, epochs=1)
        return get_parameters(self.net), len(self.trainloader), {}

    def evaluate(self, parameters, config):
        print(f"[Client {self.cid}] evaluate, config: {config}")
        set_parameters(self.net, parameters)
        loss, accuracy = test(self.net, self.valloader)
        return float(loss), len(self.valloader), {"accuracy": float(accuracy)}
    
def client_fn(cid) -> FlowerClient:
    net = Net().to(DEVICE)
    trainloader = trainloaders[int(cid)]
    valloader = valloaders[int(cid)]
    return FlowerClient(cid, net, trainloader, valloader)

# **Strategy customization**

## Simplifying Assumptions

### DP-FedAvg Modification
DP-FedAvg was originally proposed by McMahan et al. in "Learning Differentially Private Recurrent Language Models" and later extended by [Andrew et al. 2021, Differentially Private Learning with Adaptive Clipping](https://arxiv.org/abs/1905.03871). It is essentially FedAvg with the following modifications, which are needed to guarantee DP at the user level:

 - **Clipping:** Each client's model update must be clipped before transmission to the server, which bounds the maximum influence of any one client. Because the distribution of update norms varies from task to task and evolves as training progresses, we use the adaptive approach of [Andrew et al. 2021, Differentially Private Learning with Adaptive Clipping](#dp-fedavg-modification), which continuously adjusts the clipping threshold to track a prespecified quantile of the update norm distribution.
 - **Noising:** Gaussian noise must be added by either the server or the clients. Added noise can degrade the utility of the model, but we can control its effect through the standard deviation of the Gaussian noise added to the sum and the number of clients sampled in each round.
 >>>> *We provide users with the flexibility to set up the training such that each client independently adds a small amount of noise to the clipped update, with the result that simply aggregating the noisy updates is equivalent to the explicit addition of noise to the non-noisy aggregate at the server.*
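The two modifications above can be sketched with plain NumPy. This is a minimal illustration under stated assumptions, not Flower's implementation: the helper `clip_update` and the random "updates" are hypothetical, and the per-client noise scale assumes the total noise on the sum is split evenly across the sampled clients.

```python
import numpy as np


def clip_update(update, clip_norm):
    """Scale a list of layer arrays so its global L2 norm is at most clip_norm."""
    norm = np.sqrt(sum(np.sum(layer ** 2) for layer in update))
    scale = min(1.0, clip_norm / (norm + 1e-12))
    return [layer * scale for layer in update], norm


rng = np.random.default_rng(0)
clip_norm, noise_multiplier, num_sampled = 0.3, 0.9, 5

# Hypothetical model updates from the sampled clients (two "layers" each)
updates = [[rng.normal(size=(4, 3)), rng.normal(size=3)] for _ in range(num_sampled)]
clipped = [clip_update(u, clip_norm)[0] for u in updates]

# Standard deviation of the noise added to the SUM of clipped updates
server_stddev = noise_multiplier * clip_norm
# Client-side noising: each client adds a smaller, independent share
client_stddev = server_stddev / np.sqrt(num_sampled)

# Variances of independent Gaussians add up, so the two schemes match
print(num_sampled * client_stddev ** 2, server_stddev ** 2)
```

Because the sum of `num_sampled` independent Gaussians with variance `client_stddev ** 2` has variance `server_stddev ** 2`, aggregating the noisy client updates gives the same noise distribution as adding noise once at the server.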


 > *Remarks:* To do this, we first need to determine how much noise the model can tolerate, with a fair loss in utility, for the chosen (relatively small) number of clients per round. We will eventually train the final model with more noise and a proportionally larger number of clients.

### Simplifying assumptions for the training process

To ensure that the training process realises the $(\epsilon, \delta )$-guarantees at the user-level, we made the following assumptions:

 - **Fixed-size subsampling:** Fixed-size subsamples of the clients must be taken at each round, as opposed to variable-sized Poisson subsamples.

 - **Unweighted averaging:** The contributions from all the clients must be weighted equally in the aggregate. This removes the requirement for the server to know in advance the sum of the weights of all clients available for selection.

 - **No client failures:** The set of available clients must stay constant across all rounds of training. In other words, clients cannot drop out or fail.
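The first two assumptions can be illustrated with a short sketch. The helper names (`sample_fixed_size`, `sample_poisson`, `unweighted_average`) are hypothetical and exist only for this example; they are not part of Flower.

```python
import random

import numpy as np


def sample_fixed_size(client_ids, num_sampled, rng):
    """Fixed-size subsampling: exactly num_sampled distinct clients per round."""
    return rng.sample(client_ids, num_sampled)


def sample_poisson(client_ids, q, rng):
    """Poisson subsampling: each client joins independently with probability q."""
    return [cid for cid in client_ids if rng.random() < q]


def unweighted_average(updates):
    """Every client counts equally, regardless of its local dataset size."""
    return np.mean(np.stack(updates), axis=0)


rng = random.Random(42)
client_ids = list(range(10))

fixed = sample_fixed_size(client_ids, 5, rng)
poisson = sample_poisson(client_ids, 0.5, rng)
print(len(fixed))    # always 5
print(len(poisson))  # varies from round to round

avg = unweighted_average([np.array([1.0, 2.0]), np.array([3.0, 4.0])])
print(avg)
```

With a fixed sample size and equal weights, the denominator of the average is known in advance, which is what lets the noise be calibrated to a fixed sensitivity.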


 >*Note:* The dataset should be large enough to support the selected number of clients. Also note that because we use adaptive clipping, and the noise multiplier is the ratio of the noise standard deviation to the clipping norm, the magnitude of the noise will change from round to round.
 >>*The above assumptions are in line with the constraints imposed by [Andrew et al.](#dp-fedavg-modification)*
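To see why the noise magnitude changes between rounds, here is a minimal sketch of the geometric clipping-norm update described by Andrew et al. It is a simplification under stated assumptions: the fraction of clients whose update norm falls below the threshold is passed in directly (in the paper this count is itself noised), and the function name `update_clip_norm` and the example fractions are hypothetical.

```python
import math


def update_clip_norm(clip_norm, unclipped_fraction, target_quantile=0.5, lr=0.2):
    """Move the clip norm towards the target quantile of the update norms.

    unclipped_fraction: share of sampled clients whose update norm was
    already below clip_norm in this round.
    """
    return clip_norm * math.exp(-lr * (unclipped_fraction - target_quantile))


clip_norm = 0.3          # matches init_clip_norm used in this notebook
noise_multiplier = 0.9   # ratio of noise stddev to clip norm

# Hypothetical per-round fractions of clients that did not need clipping
for rnd, fraction in enumerate([0.0, 0.2, 0.4, 1.0], start=1):
    clip_norm = update_clip_norm(clip_norm, fraction)
    noise_stddev = noise_multiplier * clip_norm
    print(f"round {rnd}: clip_norm={clip_norm:.4f}, noise stddev={noise_stddev:.4f}")
```

When too few clients fall below the threshold, the clip norm grows; when too many do, it shrinks. Since the noise standard deviation is the noise multiplier times the clip norm, it follows the same trajectory.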

In [12]:
# Number of clients sampled in each round (fixed-size subsampling)
NUM_SAMPLED_CLIENTS = 5

# Base strategy: sample exactly NUM_SAMPLED_CLIENTS clients for training each round
strategy = fl.server.strategy.FedAvg(
    fraction_fit=NUM_SAMPLED_CLIENTS / NUM_CLIENTS,
    fraction_evaluate=0.3,
    min_fit_clients=NUM_SAMPLED_CLIENTS,
    min_evaluate_clients=3,
    min_available_clients=NUM_CLIENTS,
)

# Wrap the base strategy with DP-FedAvg using adaptive clipping
strategy = fl.server.strategy.DPFedAvgAdaptive(
    strategy=strategy,
    num_sampled_clients=NUM_SAMPLED_CLIENTS,
    init_clip_norm=0.3,
    noise_multiplier=0.9,
    server_side_noising=False,  # each client adds its own share of the noise
    clip_norm_lr=0.2,
    clip_norm_target_quantile=0.5,
    clip_count_stddev=None,
)

# Specify client resources if you need GPU (defaults to 1 CPU and 0 GPU)
client_resources = {"num_cpus": 1, "num_gpus": 0}

# Start simulation
fl.simulation.start_simulation(
    client_fn=client_fn,
    num_clients=NUM_CLIENTS,
    config=fl.server.ServerConfig(num_rounds=3),  # Just three rounds
    strategy=strategy,
    client_resources=client_resources,
)

INFO flwr 2023-08-15 21:52:03,458 | app.py:145 | Starting Flower simulation, config: ServerConfig(num_rounds=3, round_timeout=None)
2023-08-15 21:52:07,575	INFO worker.py:1636 -- Started a local Ray instance.
INFO flwr 2023-08-15 21:52:08,381 | app.py:179 | Flower VCE: Ray initialized with resources: {'CPU': 4.0, 'node:172.16.5.4': 1.0, 'object_store_memory': 4398966374.0, 'memory': 8797932750.0}
INFO flwr 2023-08-15 21:52:08,382 | server.py:89 | Initializing global parameters
INFO flwr 2023-08-15 21:52:08,383 | server.py:276 | Requesting initial parameters from one random client
INFO flwr 2023-08-15 21:52:10,513 | server.py:280 | Received initial parameters from one random client
INFO flwr 2023-08-15 21:52:10,514 | server.py:91 | Evaluating initial parameters
INFO flwr 2023-08-15 21:52:10,514 | server.py:104 | FL starting


(launch_and_get_parameters pid=24041) [Client 9] get_parameters


TypeError: float() argument must be a string or a real number, not 'complex'
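The traceback above is truncated, so the cause cannot be confirmed from the output alone. One plausible explanation, offered here as an assumption, comes from the adaptive clipping method of Andrew et al.: part of the privacy budget is spent on the noisy count of clipped updates, so the noise multiplier $z$ is replaced by $z_\Delta = (z^{-2} - (2\sigma_b)^{-2})^{-1/2}$, where $\sigma_b$ is the standard deviation of the noise on the count (the paper suggests one twentieth of the number of sampled clients). The sketch below evaluates that formula for the values used in this notebook; the function name is hypothetical and this is not Flower's code.

```python
def adjusted_noise_multiplier(z, num_sampled_clients, clip_count_stddev=None):
    """Noise multiplier left for the model update after paying for the noisy count."""
    if clip_count_stddev is None:
        # Assumed default: sigma_b = number of sampled clients / 20
        clip_count_stddev = num_sampled_clients / 20.0
    return (z ** -2 - (2 * clip_count_stddev) ** -2) ** -0.5


# Values from this notebook: noise_multiplier=0.9, 5 sampled clients
bad = adjusted_noise_multiplier(0.9, 5)
print(type(bad).__name__)  # complex

# With more sampled clients the expression under the root is positive
ok = adjusted_noise_multiplier(0.9, 20)
print(type(ok).__name__)   # float
```

Under this assumption the result is real only when $2\sigma_b > z$, that is, when the number of sampled clients exceeds ten times the noise multiplier. With `noise_multiplier = 0.9` that would mean sampling at least 10 clients per round, or passing a larger `clip_count_stddev` explicitly.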

### Setting up the *$(\epsilon, \delta )$-analysis*

In [None]:
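# Requires `import tensorflow_privacy as tfp` (commented out in the imports cell).
# q: fraction of clients sampled per round, z: noise multiplier,
# n: number of training rounds, delta: target delta. Define these before running.
max_order = 32
orders = range(2, max_order + 1)
rdp = tfp.compute_rdp_sample_without_replacement(q, z, n, orders)
# Assumption: get_privacy_spent expects the orders as its first argument
eps, _, _ = tfp.rdp_accountant.get_privacy_spent(orders, rdp, target_delta=delta)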