# Federated Unlearning: Guarantee the Right of Clients to Forget

## Goals of Federated Unlearning

![2](https://drive.google.com/uc?export=view&id=1n_frVFrmXgVhvtHoZ0K1wtFD9hb4UtqN)

- **Goal 1**: Zero contribution is the fundamental goal of federated unlearning. It requires that the deleted data contribute nothing to the unlearned model, meaning the deleted data do not influence the model parameters. The model's predictions on the deleted data should match those of a model trained without that data.

- **Goal 2**: Accuracy means that the federated unlearning technology should introduce only a small accuracy gap relative to the baseline, regardless of how many points are unlearned. In other words, the accuracy of the unlearned model on the remaining data should not degrade seriously.

- **Goal 3**: Unlearning privacy means that the federated unlearning technology should not result in privacy exposure. For example, attackers should not be able to recover the clients’ deleted data via gradient leakage attacks during unlearning.

- **Goal 4**: Model agnosticism means that the proposed federated unlearning technology should not depend on a particular model. In other words, it should be applicable to FL models of varying nature and complexity.

- **Goal 5**: Unlearning efficiency means that the proposed federated unlearning technology should be as much more efficient than the retraining baseline as possible, no matter how much data needs to be forgotten.

## Possible Research Directions of Federated Unlearning

- **Class Unlearning**: Clients request that specific classes of data be eliminated from the trained model. For example, Clients C and D decide to unlearn all data samples in class nine.

- **Client Unlearning**: A client requests that all of its data be removed from the trained model. For example, Client B decides to unlearn all of its data samples.

- **Sample Unlearning**: A client requests that a subset of its data be removed from the trained model, which is finer-grained and more difficult than client unlearning. For example, Client A decides to unlearn a subset of its data samples.

![3](https://drive.google.com/uc?export=view&id=1nBh29q1Eqaq4yvG5DR1sBc9kZKwOZBOW)
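The three levels differ only in which records make up the unlearning data. A minimal sketch, assuming a hypothetical toy dataset in which each record stores its owning client and class label (the record layout and helper names are illustrative, not from the paper):

```python
from typing import Dict, List, Set

Sample = Dict[str, object]

# Hypothetical toy dataset: each record notes its owning client and class label.
DATA: List[Sample] = [
    {"id": 0, "client": "A", "label": 3},
    {"id": 1, "client": "A", "label": 9},
    {"id": 2, "client": "B", "label": 9},
    {"id": 3, "client": "C", "label": 9},
    {"id": 4, "client": "D", "label": 1},
]

def class_unlearning_set(data: List[Sample], label: int) -> List[Sample]:
    """Class level: every sample of the given class, whoever owns it."""
    return [s for s in data if s["label"] == label]

def client_unlearning_set(data: List[Sample], client: str) -> List[Sample]:
    """Client level: every sample owned by the given client."""
    return [s for s in data if s["client"] == client]

def sample_unlearning_set(data: List[Sample], client: str, ids: Set[int]) -> List[Sample]:
    """Sample level: only the listed samples of the given client."""
    return [s for s in data if s["client"] == client and s["id"] in ids]

class_ids = [s["id"] for s in class_unlearning_set(DATA, 9)]
client_ids = [s["id"] for s in client_unlearning_set(DATA, "B")]
sample_ids = [s["id"] for s in sample_unlearning_set(DATA, "A", {1})]
```

Sample unlearning is the finest-grained case: its selection is always a subset of the corresponding client-level selection.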

## A General SGA-based Federated Unlearning Framework for Different Levels

### Class Unlearning

For class-level federated unlearning, the target is to exclude a specific class from the model's generalization boundary, so that the global model distributed to all clients loses its classification capability on that class. Simple unlearning based on stochastic gradient ascent (SGA) is enough to control the generalization boundary here. In the traditional FL scenario, the server normally holds a testing dataset with labeled data of all classes, and this dataset can be used to implement SGA-based unlearning. The assumption is reasonable: even if such a testing dataset is unavailable, existing work has shown that synthetic data with similar features can be generated on the server side using techniques such as GANs. SGA-based unlearning is then applied to eliminate the global model's classification capability on the specified class, where the unlearning data $x^*$ is the server-side data (possibly synthetic) carrying the specified unlearning label.
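The idea of gradient ascent on the unlearning data can be illustrated with a toy model. The sketch below is not the paper's implementation: it assumes a hypothetical one-dimensional logistic model, made-up server-side samples of the class to forget, and full-batch updates instead of stochastic mini-batches.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def cross_entropy(w, b, data):
    """Mean binary cross-entropy of a 1-D logistic model over (x, y) pairs."""
    eps = 1e-12
    total = 0.0
    for x, y in data:
        p = sigmoid(w * x + b)
        total -= y * math.log(p + eps) + (1 - y) * math.log(1 - p + eps)
    return total / len(data)

def ce_gradient(w, b, data):
    """Gradient of the mean cross-entropy with respect to (w, b)."""
    gw = gb = 0.0
    for x, y in data:
        err = sigmoid(w * x + b) - y
        gw += err * x
        gb += err
    return gw / len(data), gb / len(data)

def sga_unlearn(w, b, forget_data, lr=0.5, steps=20):
    """Gradient ascent: move the parameters uphill on the forget-class loss."""
    for _ in range(steps):
        gw, gb = ce_gradient(w, b, forget_data)
        w, b = w + lr * gw, b + lr * gb  # plus sign: ascent, not descent
    return w, b

# Hypothetical "trained" model that classifies positive x as class 1.
w0, b0 = 2.0, 0.0
forget = [(1.0, 1), (1.5, 1), (2.0, 1)]  # made-up samples of the class to forget

w1, b1 = sga_unlearn(w0, b0, forget)
loss_before = cross_entropy(w0, b0, forget)
loss_after = cross_entropy(w1, b1, forget)
```

After the ascent steps the loss on the forgotten class is higher than before, which is the effect class unlearning aims for; nothing in this sketch protects the other classes.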

### Client Unlearning

For client-level federated unlearning, the target is to eliminate the knowledge that the global model previously learned from the specified client, where this specified client is defined as the unlearning client $C_u$ and all of its local data is defined as the unlearning data $x^*$.

**Why Simple SGA Fails in Client Unlearning**:
In some extreme Non-IID cases, where the unlearning client $C_u$ uniquely owns all data of a particular class, client unlearning degrades to class unlearning, and simple SGA is sufficient for the unlearning purpose. However, data distributions in practical scenarios are not so extremely Non-IID, and a given class of data is normally shared by multiple clients. In this case, simple SGA-based unlearning cannot satisfy the client unlearning requirements for the following reasons:

- The performance of a well-trained global model is based on good generalization, which is not affected by which client the data comes from. In other words, if label 9 data is shared by clients A and B, it is impossible to obtain a model that has poor accuracy on client A’s label 9 data but good accuracy on client B’s. Otherwise, it must be an extremely over-fitted model.

- The data proportion matters. When the unlearning client $C_u$ holds only 10 percent of the label 9 data (just a small portion), directly excluding the unlearning client from the FL training process barely affects the model performance for label 9, which means that the absence of a small portion of a class's data does not affect the model's generalization to that class. This observation comes from comparing the retraining baseline with the SGA-based unlearning approach: the retrained model $M_r$ reaches 94.36 percent accuracy on label 9 data, while the unlearning model $M_u$ reaches only 61.53 percent. Therefore, simple SGA-based unlearning does not work for client unlearning under a normal Non-IID data distribution, and Fig. 4a clearly shows that it overly corrupts the model generalization boundary (accuracy drops to 0 percent in just two steps).

![4](https://drive.google.com/uc?export=view&id=1e7gLLGENRb8RBe9gGVtEfelgDF-JUhIf)

**Previous Memory Protection With Continual Learning**: The above analysis shows that the SGA-based unlearning process needs a strategy to protect the model generalization boundary. We draw inspiration from studies on continual learning, which find that models trained sequentially on different datasets suffer catastrophic forgetting of previously learned datasets. The root cause is that when a model is trained on a new dataset, it adjusts the parameters learned from the old data to fit the new data, so the knowledge learned from the old data is forgotten. SGA-based unlearning is very similar to catastrophic forgetting if the unlearning process is viewed as the model gradually adapting to a new dataset, where the remaining data $x \setminus x^*$ denotes the new dataset.

Therefore, a typical approach from continual learning is introduced: Elastic Weight Consolidation (EWC). Its main idea is to limit the update magnitude of different parameters through regularization terms, so that the more important a parameter is to the previous data, the less it changes during training on new data. In other words, parameters that were important to previous tasks should change as little as possible while the new task is learned.
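EWC commonly measures parameter importance with the diagonal of the Fisher information matrix. The sketch below shows one common empirical estimate, the mean squared per-sample gradient of the log-likelihood; the one-dimensional logistic model and the data are hypothetical, and the paper may estimate the Fisher information differently.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def diagonal_fisher(w, b, data):
    """Empirical diagonal Fisher of a 1-D logistic model p(y=1|x) = sigmoid(w*x + b).

    Each entry is the mean squared per-sample gradient of log p(y|x)
    with respect to one parameter.
    """
    fw = fb = 0.0
    for x, y in data:
        residual = y - sigmoid(w * x + b)  # d log p(y|x) / d logit
        fw += (residual * x) ** 2          # squared gradient w.r.t. w
        fb += residual ** 2                # squared gradient w.r.t. b
    n = len(data)
    return fw / n, fb / n

# Hypothetical data with |x| = 2: the weight then sees 4x the Fisher mass of the bias.
data = [(2.0, 1), (-2.0, 0)]
f_w, f_b = diagonal_fisher(0.5, 0.0, data)
```

A larger entry marks a parameter whose change would alter the model's predictions on that data more strongly, so EWC restricts its updates more tightly.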

**Combined EWC-SGA-Based Unlearning**:
Based on the above insights, we present our EWC-SGA-based unlearning framework. First, we compute the importance factor $F_i$ of each model parameter from the Fisher information matrix, which yields the per-parameter penalty $F_i (\theta_i - \theta_{\text{global},i})^2$. This penalty is then added to the traditional cross-entropy loss as a regularization term that restricts the parameter update magnitude: parameters with a higher importance factor are more difficult to update. This new loss
$$L_u(\theta) = L_{ce}(\theta) + \frac{\lambda}{2} \sum_i F_i(\theta_i - \theta_{\text{global},i})^2$$
is named the unlearning loss, where $\lambda$ is the constraint strength and $L_{ce}(\cdot)$ is the cross-entropy loss.
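As a worked step that the source does not spell out, differentiating the unlearning loss term by term, and treating each $F_i$ as a constant computed once from the global model (the usual EWC assumption), gives the per-parameter gradient:

$$\frac{\partial L_u}{\partial \theta_i} = \frac{\partial L_{ce}}{\partial \theta_i} + \lambda F_i (\theta_i - \theta_{\text{global},i})$$

The second term vanishes at $\theta = \theta_{\text{global}}$ and grows linearly with both the distance from the global model and the importance factor $F_i$, which is why important parameters are harder to move.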

Finally, SGA-based unlearning is applied with the unlearning loss, which controls the model generalization boundary and keeps the remaining clients unaffected.
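One possible reading of this update rule is sketched below. It rests on a labeled assumption: the cross-entropy on the unlearning data is ascended while the gradient of the Fisher-weighted penalty is subtracted, so that the penalty opposes the drift away from the global model. The source does not spell out this sign convention, and the constant cross-entropy gradient is a hypothetical simplification.

```python
def ewc_sga_step(theta, theta_global, fisher, grad_ce, lr=0.1, lam=1.0):
    """One unlearning update per parameter.

    Ascends the cross-entropy on the unlearning data while the Fisher-weighted
    penalty pulls each parameter back toward its global-model value.
    ASSUMPTION: the penalty gradient is subtracted so that it opposes the ascent.
    """
    return [
        t + lr * (g - lam * f * (t - tg))
        for t, tg, f, g in zip(theta, theta_global, fisher, grad_ce)
    ]

theta_global = [1.0, 1.0]
fisher = [5.0, 0.0]   # hypothetical: only the first parameter matters to the remaining data
grad_ce = [1.0, 1.0]  # hypothetical constant cross-entropy gradient

theta = list(theta_global)
for _ in range(50):
    theta = ewc_sga_step(theta, theta_global, fisher, grad_ce)
```

In this toy run the important parameter settles where ascent and penalty balance (a distance of `grad_ce / (lam * fisher)` from its global value), while the unimportant parameter keeps moving freely.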

### Sample Unlearning

For sample-level federated unlearning, the target is to eliminate from the global model the knowledge learned from a part of a client's data, whereas client unlearning covers the client's whole data. Therefore, the EWC-SGA-based unlearning framework can still be applied effectively to sample unlearning. Besides, due to the inherent data privacy protection in FL, the unlearning data $x^*$ at both the client level and the sample level exists only on the local client side, which means the unlearning process is conducted on the client side. Thus, we first download the current global model to the specific unlearning client, and then apply the EWC-SGA-based framework to obtain the unlearning model $M_u$. Finally, we upload the unlearning model back to the server side as the new global model: $M_u \rightarrow M_{\text{global}}^{\text{new}}$.
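The download, unlearn, and upload sequence can be sketched as follows. The `Server` class and the `client_unlearn` helper are hypothetical in-memory stand-ins, and the local unlearning routine is a placeholder for the EWC-SGA procedure.

```python
from typing import Callable, List

Params = List[float]

class Server:
    """Hypothetical in-memory stand-in for the FL server."""

    def __init__(self, global_model: Params) -> None:
        self.global_model = list(global_model)

    def download(self) -> Params:
        return list(self.global_model)

    def upload(self, unlearned: Params) -> None:
        # The unlearning model replaces the global model: M_u -> M_global^new.
        self.global_model = list(unlearned)

def client_unlearn(server: Server, local_unlearn: Callable[[Params], Params]) -> Params:
    """Runs on the unlearning client; the unlearning data never leaves it."""
    theta_global = server.download()   # 1. fetch the current global model
    m_u = local_unlearn(theta_global)  # 2. unlearn locally on x*
    server.upload(m_u)                 # 3. publish the new global model
    return m_u

# Placeholder for the local EWC-SGA routine: it only perturbs the parameters.
server = Server([0.5, -1.0])
m_u = client_unlearn(server, lambda theta: [t + 0.1 for t in theta])
```

Only model parameters cross the client boundary in this flow, which matches the privacy constraint described above.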

## Evaluation

![t](https://drive.google.com/uc?export=view&id=1FbfV03FmfjRUQbijfmp1GkMtLmG5Zhas)

## References
- L. Wu, S. Guo, J. Wang, Z. Hong, J. Zhang, and Y. Ding, “Federated Unlearning: Guarantee the Right of Clients to Forget,” IEEE Network, vol. 36, no. 5, pp. 129–135, 2022, doi: 10.1109/MNET.001.2200198. [[Paper](https://ieeexplore.ieee.org/abstract/document/9964015)]