## Case 1. DP on local dataset entries: local training with DP-SGD

Models are clipped by a constant $C$ and trained on minibatches of size $\mbox{batch_size} = qN$, where $q$ is the sampling rate and $N$ the size of the local dataset. The variance of the perturbation noise is $\frac{(C\sigma)^2}{\sqrt{qN}}$.

In [12]:
from fedbiomed.researcher.privacy.rdp_accountant import compute_rdp, get_privacy_spent
import matplotlib.pyplot as plt
import numpy as np

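max_data_size = 1000

target_delta = .1/max_data_size
max_eps = 20
sigma = 1  # noise multiplier


# RDP orders at which the privacy loss is evaluated
orders = [1 + x / 10. for x in range(1, 100)] + list(range(12, 64))
# RDP of a single step with sampling rate q = 0.1
rdp = compute_rdp(q=0.1,
                  noise_multiplier=sigma,
                  steps=1,
                  orders=orders)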

In [13]:
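rounds = range(1, 2000)
# RDP composes additively, so i steps cost i * rdp; convert each total to epsilon
epsilon_range = np.array([get_privacy_spent(orders, i*rdp, target_delta=target_delta)[0] for i in rounds])

# Number of steps whose cumulative epsilon stays below max_eps
max_training_steps = int(np.sum(epsilon_range < max_eps))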

In [14]:
max_training_steps

714

At this point each client's model is $(\epsilon,\delta)$-differentially private, with $\epsilon<\mbox{max_eps}$ and $\delta = \mbox{target_delta}$.
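To relate this step budget to passes over the local data, the step count can be converted into epochs. The sketch below assumes that `max_data_size` plays the role of $N$ and that each step uses a minibatch of about $qN$ samples. The helper name `steps_to_epochs` is hypothetical and not part of any library.

```python
def steps_to_epochs(steps, q, n_samples):
    """Passes over the data covered by `steps` minibatches of size q * n_samples."""
    batch_size = round(q * n_samples)
    return steps * batch_size / n_samples

# Values from the cells above: 714 steps, q = 0.1, N = 1000
# (N is assumed here to be max_data_size)
epochs = steps_to_epochs(714, q=0.1, n_samples=1000)
print(epochs)
```

Under these assumptions the budget of 714 steps corresponds to roughly 71 epochs on a dataset of 1000 entries.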

## Case 2. DP on clients' models: LDP with perturbation on trained clients' models

Models are trained with clipping by a constant $C$, and the variance of the perturbation noise is $(C\sigma)^2$.
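The perturbation step can be illustrated with a minimal sketch. It assumes that clipping is an L2-norm bound applied to the flattened model update, and that the noise is i.i.d. Gaussian with standard deviation $C\sigma$ per coordinate. The function names `clip_by_norm` and `perturb` are hypothetical and are not taken from Fed-BioMed.

```python
import math
import random

def clip_by_norm(weights, clip_c):
    """Scale `weights` so that its L2 norm is at most `clip_c`."""
    norm = math.sqrt(sum(w * w for w in weights))
    scale = min(1.0, clip_c / norm) if norm > 0 else 1.0
    return [w * scale for w in weights]

def perturb(weights, clip_c, sigma, rng):
    """Add i.i.d. Gaussian noise with standard deviation clip_c * sigma."""
    std = clip_c * sigma
    return [w + rng.gauss(0.0, std) for w in weights]

rng = random.Random(0)       # fixed seed, for reproducibility of the toy example
update = [3.0, 4.0]          # toy model update with L2 norm 5
clipped = clip_by_norm(update, clip_c=1.0)
noisy = perturb(clipped, clip_c=1.0, sigma=1.0, rng=rng)
```

Clipping bounds the contribution of a single client's model, which is what lets the noise scale $C\sigma$ be tied to a privacy guarantee.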

In [27]:
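n_clients = 5

# Is this an acceptable delta?
target_delta = .01/n_clients
max_eps = 20
sigma = 1  # noise multiplier


# RDP orders at which the privacy loss is evaluated
orders = [1 + x / 10. for x in range(1, 100)] + list(range(12, 64))
# RDP of a single round; q=1 means no subsampling
rdp = compute_rdp(q=1,
                  noise_multiplier=sigma,
                  steps=1,
                  orders=orders)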

In [28]:
rounds = range(1,2000)
epsilon_range = np.array([get_privacy_spent(orders, i*rdp, target_delta=target_delta)[0] for i in rounds])

max_training_steps = int(np.sum(epsilon_range<max_eps))

In [29]:
max_training_steps

15

In [37]:
# Lower delta

target_delta = 1e-5/n_clients

orders = [1 + x / 10. for x in range(1, 100)] + list(range(12, 64))
rdp = compute_rdp(q=1,
                  noise_multiplier = sigma,
                  steps=1,
                  orders=orders)

rounds = range(1,2000)
epsilon_range = np.array([get_privacy_spent(orders, i*rdp, target_delta=target_delta)[0] for i in rounds])

max_training_steps = int(np.sum(epsilon_range<max_eps))
max_training_steps

9

At this point the aggregated model is $(\epsilon,\delta)$-differentially private, with $\epsilon<\mbox{max_eps}$ and $\delta = \mbox{target_delta}$.
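For intuition about what the accountant computes when $q=1$, the sketch below uses two textbook results: the Rényi DP of the Gaussian mechanism, $\alpha/(2\sigma^2)$ at order $\alpha$, and the classic conversion $\epsilon = \min_\alpha \left[\mathrm{RDP}(\alpha) + \frac{\log(1/\delta)}{\alpha-1}\right]$. It is not the library's implementation. All function names are hypothetical, and an accountant that uses a tighter conversion will allow more rounds, so the counts may differ from those printed above.

```python
import math

def gaussian_rdp(sigma, orders):
    """RDP of one Gaussian-mechanism release without subsampling: alpha / (2 sigma^2)."""
    return [alpha / (2.0 * sigma ** 2) for alpha in orders]

def rdp_to_epsilon(orders, rdp, delta):
    """Classic RDP -> (eps, delta) conversion, minimised over the given orders."""
    return min(r + math.log(1.0 / delta) / (alpha - 1.0)
               for alpha, r in zip(orders, rdp))

def max_steps(sigma, delta, max_eps, orders, horizon=2000):
    """Count the rounds t in [1, horizon) whose cumulative epsilon is below max_eps."""
    rdp = gaussian_rdp(sigma, orders)
    count = 0
    for t in range(1, horizon):
        # RDP composes additively: t rounds cost t times the single-round RDP
        eps = rdp_to_epsilon(orders, [t * r for r in rdp], delta)
        if eps < max_eps:
            count += 1
    return count

orders = [1 + x / 10. for x in range(1, 100)] + list(range(12, 64))
eps_one_round = rdp_to_epsilon(orders, gaussian_rdp(1.0, orders), delta=0.01 / 5)
budget = max_steps(sigma=1.0, delta=0.01 / 5, max_eps=20, orders=orders)
```

Because the cumulative RDP grows linearly with the number of rounds while the $\log(1/\delta)$ term is fixed, lowering $\delta$ shrinks the number of rounds that fit within `max_eps`, in line with the drop observed above.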

## Case 3. DP on clients' aggregation (McMahan et al.)


Models are clipped by a constant $C$, and the variance of the perturbation noise is $\frac{(C\sigma)^2}{\sqrt{n_{clients}}}$.
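A minimal sketch of this aggregation scheme follows. It assumes L2-norm clipping of each client's update, a plain unweighted average, and Gaussian noise added once to the aggregate. The noise variance is taken as stated above rather than derived. The function names are hypothetical and do not come from McMahan et al. or from Fed-BioMed.

```python
import math
import random

def clip_by_norm(vec, clip_c):
    """Scale `vec` so that its L2 norm is at most `clip_c`."""
    norm = math.sqrt(sum(v * v for v in vec))
    scale = min(1.0, clip_c / norm) if norm > 0 else 1.0
    return [v * scale for v in vec]

def private_average(client_updates, clip_c, sigma, rng):
    """Clip each client's update, average the results, then add Gaussian noise."""
    n_clients = len(client_updates)
    clipped = [clip_by_norm(u, clip_c) for u in client_updates]
    dim = len(clipped[0])
    mean = [sum(u[j] for u in clipped) / n_clients for j in range(dim)]
    # Noise variance as stated in the text: (C * sigma)^2 / sqrt(n_clients)
    std = math.sqrt((clip_c * sigma) ** 2 / math.sqrt(n_clients))
    return [m + rng.gauss(0.0, std) for m in mean]

rng = random.Random(0)                 # fixed seed for the toy example
updates = [[3.0, 4.0], [0.0, 0.5]]     # two toy client updates
aggregate = private_average(updates, clip_c=1.0, sigma=1.0, rng=rng)
```

Unlike Case 2, the noise here is injected once at the aggregation step instead of by every client on its own model.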

In [34]:
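n_clients = 5

# Is this an acceptable delta?
target_delta = .01/n_clients
max_eps = 20
sigma = 1  # noise multiplier


# RDP orders at which the privacy loss is evaluated
orders = [1 + x / 10. for x in range(1, 100)] + list(range(12, 64))
# RDP of a single aggregation round; q=1 means no subsampling
rdp = compute_rdp(q=1,
                  noise_multiplier=sigma,
                  steps=1,
                  orders=orders)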

In [35]:
rounds = range(1,2000)
epsilon_range = np.array([get_privacy_spent(orders, i*rdp, target_delta=target_delta)[0] for i in rounds])

max_training_steps = int(np.sum(epsilon_range<max_eps))

In [36]:
max_training_steps

15