# Federated Unlearning: How to Efficiently Erase a Client in FL?

In this paper, we consider the case where a client (referred to as the target client) wants to opt out of federation after the federated learning process, and as a result wants to remove their contribution from the global model. As an example, consider an FL scenario wherein multiple hospitals from different countries (or geographical regions) collaborate on training a model for tumor detection. After training the model, suppose one of the hospitals decides to opt out of federation due to changes in the privacy regulations in their country. Then, the question is how to unlearn the hospital's data from the trained model. We argue that the hospital that wants to opt out can effectively perform unlearning by *reversing* the learning process.

In an FL training round, each client essentially solves an empirical risk minimization problem, and strives to learn a model that minimizes the empirical loss. We propose to perform unlearning at the target client (to be erased) by training a model to *maximize* the empirical loss. Instead of optimizing arbitrarily over the entire space of model parameters, we formulate the unlearning problem as a constrained maximization problem by restricting to an $\ell_2$-norm ball around a suitably chosen *reference model*. We argue that the average of the other clients' local models (except the local model of the target client) in the last round of FL training is an effective reference model, as it helps to retain some knowledge learnt from the other clients' data. This formulation allows the client to perform the unlearning with the Projected Gradient Descent (PGD) algorithm, applied here in its ascent form since the loss is maximized. To improve the performance of the unlearned model, we continue the FL process for a few rounds without the participation of the target client. Our results on the MNIST dataset show that the proposed method successfully unlearns the contribution of the target client in an efficient way. In fact, our results demonstrate that just one round of the FL process after unlearning is sufficient to yield performance comparable to retraining from scratch for a large number of rounds.

## Design

In this work, we consider the following unlearning scenario in the FL setting. After FL training is performed with $N$ clients for the specified $T$ rounds, a client $i\in[N]$ requests to opt out of federation and wants to remove their contribution from the FL model. We refer to this client as the *target client*. As in centralized machine unlearning, the most legitimate way of erasing a client is to retrain the FL model from the beginning. However, retraining from scratch is often prohibitively expensive. Thus, in this work, we focus on approximate unlearning with the goal of obtaining performance *close to* that of retraining.

### Unlearning Metrics

Unlearning can be formally defined by considering the distribution of all models that a federated learning algorithm can produce and requiring that the distributions of retrained and unlearned models are *close* in some distance metric. However, since computing a distance between distributions is typically expensive, prior works measure the quality of unlearning in terms of distance in the weight or output space of retrained and unlearned models, e.g., $\ell_2$-distance or Kullback-Leibler (KL) divergence. Other metrics include privacy leakage in the differential privacy framework and membership inference attacks. In this paper, we follow the evaluation approach described below.

- **Backdoors to Evaluate Unlearning**: We use backdoor triggers as an effective way to evaluate the performance of unlearning methods. In particular, the target client uses a dataset in which a certain fraction of the images have a backdoor trigger inserted. Thus, the global FL model becomes susceptible to the backdoor trigger. A successful unlearning process should then produce a model that reduces the accuracy on the images with the backdoor trigger, while maintaining good performance on regular (clean) images. Note that we use the backdoor triggers only as a way to evaluate the performance of unlearning methods; we do not consider any malicious client, nor do we apply any scaling/model replacement. As future work, we will explore other metrics, such as the accuracy of membership inference attacks on the unlearnt data of the target client and the $\ell_2$-distance between a model retrained with the naive approach and the one obtained via the proposed approach.
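The backdoor-based evaluation described above can be sketched in a few lines. This is a minimal illustration, not the authors' evaluation code: the trigger shape (a small bright patch in the bottom-right corner), the target label, and the names `add_trigger`, `accuracy`, and `backdoor_accuracy` are all assumptions introduced here, and the two "models" are hand-written stand-ins rather than trained networks.

```python
from typing import Callable, List

Image = List[List[float]]
Predictor = Callable[[Image], int]


def add_trigger(image: Image, size: int = 2, value: float = 1.0) -> Image:
    """Return a copy of `image` with a size x size patch in the bottom-right corner.

    The patch shape and location are hypothetical; the text does not specify them.
    """
    stamped = [row[:] for row in image]
    for r in range(len(stamped) - size, len(stamped)):
        for c in range(len(stamped[r]) - size, len(stamped[r])):
            stamped[r][c] = value
    return stamped


def accuracy(predict: Predictor, images: List[Image], labels: List[int]) -> float:
    """Fraction of images whose predicted label matches the given label."""
    hits = sum(predict(img) == lab for img, lab in zip(images, labels))
    return hits / len(labels)


def backdoor_accuracy(predict: Predictor, clean_images: List[Image], target_label: int) -> float:
    """Fraction of triggered images that the model maps to the backdoor target label."""
    triggered = [add_trigger(img) for img in clean_images]
    return accuracy(predict, triggered, [target_label] * len(triggered))


# Toy stand-ins: all clean images are blank 4x4 images with true label 0.
clean_images = [[[0.0] * 4 for _ in range(4)] for _ in range(5)]
clean_labels = [0] * 5
TARGET_LABEL = 9  # hypothetical backdoor target class


def backdoored_model(img: Image) -> int:
    # Reacts to the trigger pixel, otherwise predicts the clean label.
    return TARGET_LABEL if img[-1][-1] == 1.0 else 0


def unlearned_model(img: Image) -> int:
    # Ignores the trigger entirely.
    return 0


clean_acc_before = accuracy(backdoored_model, clean_images, clean_labels)
backdoor_acc_before = backdoor_accuracy(backdoored_model, clean_images, TARGET_LABEL)
clean_acc_after = accuracy(unlearned_model, clean_images, clean_labels)
backdoor_acc_after = backdoor_accuracy(unlearned_model, clean_images, TARGET_LABEL)
```

Under this reading, successful unlearning shows up as a drop in backdoor accuracy while clean accuracy stays high.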

## Unlearning with Projected Gradient Ascent

During a federated training round $t$, the goal of a client is to learn a local model that *minimizes* the (local) empirical risk, i.e., to solve the following optimization problem: 
$$\textrm{(Train)}\:\: \min_{\mathbf{w}\in\mathbb{R}^d}F_i(\mathbf{w}):=\frac{1}{n_i}\sum_{j\in D_i}L(\mathbf{w};(\mathbf{x}_j,y_j)),$$
where $L(\mathbf{w};(\mathbf{x}_j,y_j))$ is the loss of the prediction on example $(\mathbf{x}_j,y_j)$ made with model parameters $\mathbf{w}$. Each client locally makes several passes of (mini-batch stochastic) *gradient descent* to find a model that has *low* empirical loss. (It is also possible to use other optimization algorithms.)

We argue that a natural idea for unlearning is to *reverse* this learning process. That is, during unlearning, instead of learning model parameters that minimize the empirical loss, the client strives to learn model parameters that *maximize* the loss. To find a model with *large* empirical loss, the client can simply make several local passes of (mini-batch stochastic) *gradient ascent*. However, simply maximizing the loss with gradient ascent can be problematic when the loss is unbounded, which is often the case in practice, e.g., for the cross-entropy loss. With an unbounded loss, each gradient step moves towards a model that increases the loss, and after several steps the procedure is likely to produce an arbitrary model, similar to a random one.

To tackle this issue, we ensure that the unlearned model is *sufficiently close* to a *reference model* that has effectively learned the other clients' data distributions. In particular, we propose to use the average of the other clients' models as a reference model, i.e., $\mathbf{w}_{\textrm{ref}} = \frac{1}{N-1}\sum_{j\ne i}\mathbf{w}^{T-1}_j$. Note that the target client $i$ can compute this reference model locally as $\mathbf{w}_{\textrm{ref}} = \frac{1}{N-1}\left(N\mathbf{w}^T - \mathbf{w}^{T-1}_i\right).$ The client $i$ then optimizes over the model parameters that lie in the $\ell_2$-norm ball of radius $\delta$ around $\mathbf{w}_{\textrm{ref}}$. (The radius $\delta$ will be treated as a hyperparameter in our experiments.) Thus, during unlearning, the client solves the following optimization problem:
$$\textrm{(Unlearn)}\:\: \max_{\mathbf{w}\in\{\mathbf{v}\in\mathbb{R}^d:\lVert\mathbf{v}-\mathbf{w}_{\textrm{ref}} \rVert_2 \leq \delta\}}F_i(\mathbf{w}).$$
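The local computation of the reference model can be checked on a toy example. The sketch below assumes that the global model $\mathbf{w}^T$ is the plain (unweighted) average of the local models $\mathbf{w}^{T-1}_j$, which is what makes the two expressions for $\mathbf{w}_{\textrm{ref}}$ coincide; the function names and the toy numbers are illustrative only.

```python
from typing import List

Vector = List[float]


def average(models: List[Vector]) -> Vector:
    """Coordinate-wise mean of equally sized parameter vectors."""
    n = len(models)
    return [sum(coords) / n for coords in zip(*models)]


def reference_model(global_model: Vector, target_local: Vector, num_clients: int) -> Vector:
    """w_ref = (N * w^T - w_i^{T-1}) / (N - 1), using only what the target client holds."""
    n = num_clients
    return [(n * g - w) / (n - 1) for g, w in zip(global_model, target_local)]


# Hypothetical setup: N = 3 clients, d = 2 parameters, target client at index 0.
local_models = [[1.0, 2.0], [3.0, 4.0], [5.0, 9.0]]
global_model = average(local_models)  # assumption: unweighted averaging at the server

w_ref_local = reference_model(global_model, local_models[0], len(local_models))
w_ref_direct = average(local_models[1:])  # average of the other clients' models
```

Both `w_ref_local` and `w_ref_direct` describe the same vector here, so the target client does not need access to the other clients' individual models. If the server used a weighted average instead, the formula would have to be adjusted accordingly.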

A natural choice for solving the above problem is projected gradient ascent. More specifically, let us denote the $\ell_2$-norm ball of radius $\delta$ around $\mathbf{w}_{\textrm{ref}}$ as $\Omega = \{\mathbf{v}\in\mathbb{R}^d:\lVert\mathbf{v}-\mathbf{w}_{\textrm{ref}} \rVert_2 \leq \delta\}$. Let $\mathcal{P}:\mathbb{R}^d\rightarrow\mathbb{R}^d$ denote the projection operator onto $\Omega$. Then, for a given step-size $\eta_u$, client $i$ iterates the update:
$$\mathbf{w} \leftarrow \mathcal{P}\left(\mathbf{w} + \eta_u\nabla F_i(\mathbf{w})\right).$$
To avoid learning an arbitrary model, we perform early stopping if the validation accuracy drops below a predetermined threshold $\tau$ (which is treated as a hyperparameter). 
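The update rule and the early-stopping check above can be sketched as follows. This is a minimal illustration under several assumptions that are not taken from the text: the model is a linear binary classifier with cross-entropy loss, the iterate is initialized at $\mathbf{w}_{\textrm{ref}}$, full-batch gradients are used, and a small labelled validation set is available at the target client. All function names and toy values are hypothetical.

```python
import math
from typing import List, Tuple

Vector = List[float]
Dataset = List[Tuple[Vector, int]]  # (features, label in {0, 1})


def project_l2_ball(w: Vector, center: Vector, radius: float) -> Vector:
    """Euclidean projection of w onto {v : ||v - center||_2 <= radius}."""
    diff = [wi - ci for wi, ci in zip(w, center)]
    norm = math.sqrt(sum(d * d for d in diff))
    if norm <= radius:
        return list(w)
    scale = radius / norm
    return [ci + scale * d for ci, d in zip(center, diff)]


def sigmoid_score(w: Vector, x: Vector) -> float:
    return 1.0 / (1.0 + math.exp(-sum(wi * xi for wi, xi in zip(w, x))))


def mean_loss(w: Vector, data: Dataset) -> float:
    """Mean binary cross-entropy F_i(w) of a linear model on `data`."""
    total = 0.0
    for x, y in data:
        p = sigmoid_score(w, x)
        total -= math.log(p) if y == 1 else math.log(1.0 - p)
    return total / len(data)


def mean_loss_grad(w: Vector, data: Dataset) -> Vector:
    """Gradient of mean_loss with respect to w."""
    grad = [0.0] * len(w)
    for x, y in data:
        p = sigmoid_score(w, x)
        for k, xk in enumerate(x):
            grad[k] += (p - y) * xk / len(data)
    return grad


def accuracy(w: Vector, data: Dataset) -> float:
    return sum((sigmoid_score(w, x) > 0.5) == (y == 1) for x, y in data) / len(data)


def unlearn(w_ref: Vector, target_data: Dataset, val_data: Dataset,
            delta: float, eta_u: float, tau: float, max_steps: int) -> Vector:
    """Projected gradient ascent on the target client's loss within the ball around w_ref."""
    w = list(w_ref)  # assumption: start from the reference model
    for _ in range(max_steps):
        grad = mean_loss_grad(w, target_data)
        stepped = [wi + eta_u * gi for wi, gi in zip(w, grad)]  # ascent step
        w = project_l2_ball(stepped, w_ref, delta)
        if accuracy(w, val_data) < tau:  # early stopping on validation accuracy
            break
    return w


# Toy data (hypothetical): the reference model fits the target client's data well.
w_ref = [2.0, -2.0]
target_data = [([1.0, 0.0], 1), ([0.0, 1.0], 0)]
val_data = [([2.0, 1.0], 1), ([1.0, 2.0], 0)]
w_unlearned = unlearn(w_ref, target_data, val_data,
                      delta=1.0, eta_u=1.0, tau=0.5, max_steps=50)
```

In this toy run the loss on the target client's data increases while the iterate never leaves the ball of radius $\delta$ around `w_ref`, which is the behaviour the constrained formulation is meant to guarantee.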

## Evaluation

![1](https://drive.google.com/uc?export=view&id=1MOFUC4sJStgkaiQxYoaf6ouO20pF6KyB)

![i](https://drive.google.com/uc?export=view&id=1oQuF7UigL4mK6O0POQMKjBuaMLEkbwZ-)


## References
- A. Halimi, S. Kadhe, A. Rawat, and N. Baracaldo, Federated Unlearning: How to Efficiently Erase a Client in FL? arXiv, 2022. doi: 10.48550/ARXIV.2207.05521. [[Paper](https://arxiv.org/abs/2207.05521)]